C207 questions
Why do marketing teams use big data?
"Data mining" assists in targeting the best customers for new products or reaching existing customers for similar products.
Which tool helps us collect and organize the data for a histogram?
Check sheet
_____ tests whether there are significant patterns/differences among frequency data.
Chi-Square
A local grocer datamines to target customers who use gas points so they can market other products. Which tool is this?
Cluster Analysis
Mama Regina's Pizza chain needs to select two among the states of New York, Florida, Georgia and Utah to open new chains. There are 10 regional directors who can vote for the state. What statistical rule can be used to determine the POTENTIAL voting OUTCOMES? (a) Multiplication (b) Intersection (c) Bayes Theorem (d) Combination
Combination
The grocer is running a 10 for $10 special on cans of soup. You can mix and match the 5 different flavors of soup. You decide to purchase 10 cans. What can we use to calculate all the possible outcomes of flavors picked?
Combination
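The two combination cards above can be checked with a short sketch. It assumes the pizza card is asking how many unordered pairs of states are possible, and that "mix and match" in the soup card means flavors may repeat (combinations with repetition). The variable names are illustrative only.

```python
from itertools import combinations
from math import comb

# Pizza card: choose 2 of 4 states, order does not matter, no repeats.
states = ["New York", "Florida", "Georgia", "Utah"]
state_pairs = list(combinations(states, 2))
num_state_pairs = comb(len(states), 2)  # 6

# Soup card (assumption: a flavor may be picked more than once).
# Combinations with repetition: C(n + k - 1, k) for n flavors, k cans.
flavors, cans = 5, 10
num_soup_mixes = comb(flavors + cans - 1, cans)  # 1001
```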
Visually presenting the data assists in which stage of the Davenport-Kim model?
Communicating the results
What is an ANOVA used for?
Comparing three or more means.
You are taking a survey and the question is leading. It seems advantageous to answer a specific way. What type of bias is this?
Conscious Bias
Type of Bias introduced because respondents believe it will be BENEFICIAL if selected is ______.
Conscious Bias.
What are some ways that performance (e.g., fiscal responsibility) can be measured in the Government/ Public Sector?
Cost-benefit analysis Benchmarking Payback period
You are opening your store and deciding which of three suppliers would be OPTIMAL in COSTS. Which analysis tool will best compare the COSTS BASED ON VOLUME? • (A) Regression • (B) Chi-square • (C) Cross-over • (D) Balanced Scorecard
Crossover
A health care researcher plans to determine the number of new cases of Zika virus infections in Florida since Jan 2016. What analytic measure is of interest here? -Cumulative incidence -Incidence rate -Prevalence -Ratio
Cumulative Incidence
Net Promoter Score (NPS) focuses on ____.
Customer Loyalty
What is the focus of Net Promoter Score (NPS)?
Customer loyalty
The process of discovering patterns in large data sets is _____.
Data mining.
A luxury car dealer must decide how many European sports cars to order for the coming year. It costs her $25,000 to keep each unsold car in inventory, and her gain (profit) is $50,000 on each car sold. Further, suppose that the car dealer can order 0, 1, 2, or 3 cars. How will she select potential PAY-OFFS for any given COURSE OF ACTION? • (A) Pareto Chart • (B) Cause and Effect Diagram • (C) Decision Tree • (D) Simulation
Decision Tree
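A decision tree for the car dealer card starts from a payoff for every combination of order quantity and demand. The sketch below builds that payoff table from the card's figures ($50,000 gain per car sold, $25,000 cost per unsold car). The demand levels of 0 to 3 cars are an assumption for illustration, since the card does not state them.

```python
PROFIT_PER_SALE = 50_000
COST_PER_UNSOLD = 25_000

def payoff(ordered, demand):
    """Payoff for ordering `ordered` cars when `demand` cars are wanted."""
    sold = min(ordered, demand)
    unsold = ordered - sold
    return PROFIT_PER_SALE * sold - COST_PER_UNSOLD * unsold

# Rows: cars ordered (0-3). Columns: assumed demand (0-3).
payoff_table = {o: [payoff(o, d) for d in range(4)] for o in range(4)}

for ordered, row in payoff_table.items():
    print(ordered, row)
```

With probabilities attached to each demand level, the expected value of each row would give the branch values of the tree.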
Describes characteristics of a dataset. (Mean, Median, Mode, Variation, Standard deviation, range,....)
Descriptive Analytics
Analyze 5 years of historical statistics between Honda, Ford, and Toyota.
Descriptive Statistics.
The number of appointments per day averaged 7 over the last 6 months. Which form of analytics is this?
Descriptive analytics
Advantages of decision analysis
Identifies the decision with the greatest value. Produces a value under certainty, uncertainty, and risk.
A histogram shows what?
Distribution or frequency
What are the key words to the Addition Rule?
Either, Or
ANOVA uses this test statistic
F-stat. (must be higher than critical value to reject the null)
Balanced Scorecard includes what 4 dimensions?
FLIC (Financial, Learning, Internal, Customer) -Financial -Customer -Operational Efficiency -Employee learning/innovation
True or false: Our local sewing machine shop uses big data to search for customer addresses to send out Christmas cards.
False.
In which stage of the Davenport-Kim model is problem recognition?
Framing the Problem
What are the key words to the Bayes Rule?
Given that
Advantage of cluster analysis
Helps determine target markets
_____ is when the random variables have an unequal spread of variances.
Heteroscedasticity
A regional bank has broadly identified problems with its current banking software and wishes to now determine how these problems are DISTRIBUTED across departments. Which tool is best? (a) Simulation Diagram (b) Histogram (c) Decision Tree Diagram (d) Fishbone Diagram
Histogram
You conduct a quick survey asking friends to rate your new website on a scale of 1 - 10. Which graph will show how the responses are DISTRIBUTED?
Histogram/Bar chart
_____ occurs when all of the random variables have the same general finite variance.
Homoscedasticity
Which of the following statistics measure the SPREAD OF a DATAset? (More than one) (a) IQR (b) Z-score (c) Range (d) Variance
IQR RANGE VARIANCE
If a responder lies on their survey, this creates what type of bias?
Information Bias
Ignoring the purpose of the information collected is _____ bias.
Information bias
You're angry with your boss so you give angry responses to command climate survey. Which bias is this?
Information bias
Non-truthful answers can cause ____ bias.
Information bias.
Name the data type equal distance; temperature, dates
Interval
What makes a KPI Dashboard different from a KPI?
It is a VISUAL representation of multiple KPIs. Can reveal TRENDS over time.
Why do we collect big data?
It is used to encourage buying behavior.
As the manager, we want to VIEW the time it takes to wait on customers along with the sales per day. Which MEASURE is best? • (A) Balanced Score Card • (B) KPI dashboard • (C) Flowchart • (D) Key Performance Indicator
KPI Dashboard
How do you calculate a z-score?
(Value - mean) divided by the standard deviation
You want to know how your height compares to all of your family tree. How do you calculate a z-score?
(Your height - mean of the family tree height) divided by the standard deviation of the family tree height.
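The z-score formula above can be written as a one-line function. This is a minimal sketch; the height figures (170 cm mean, 8 cm standard deviation, 178 cm for you) are hypothetical and not from the course material.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / std_dev

# Hypothetical family-tree figures, in centimeters.
family_mean = 170
family_std_dev = 8
my_height = 178

my_z = z_score(my_height, family_mean, family_std_dev)
print(my_z)
```

A positive result means the value is above the mean, and a negative result means it is below.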
A telecommunication company focuses on the following critical success factors: revenue growth, customer satisfaction, and operational efficiency. Which of the following would be most appropriate for assessing if the company's revenue performance goal is being achieved? -Key Performance Indicator -Balanced Scorecard -Pareto Chart -Net Promoter Score
Key Performance Indicator
Name three performance measurement frameworks.
Key Performance Indicators (KPIs) & KPI Dashboards Balanced Scorecard Net Promoter Score (NPS)
Based on the limited storage tanks and pumps, a gas station determines the right "mix" of gasoline and diesel to carry each month.
Linear Programming
Which is not necessarily an advantage of Key Performance Indicators? -Data-driven results that make it easier to quantify performance -Links operations with company strategy -Allows for internal benchmarking -Can be used across an entire organization.
Links operations with company strategy
______ can be applied when the dependent variable is a categorical, binary variable.
Logistic regression
List the analytics in order of difficulty
Low - descriptive Medium - predictive High - prescriptive
The ______ measure of central tendency alone should not be used because it includes outliers.
Mean
The use of REGRESSION PROVIDES R-SQUARED statistic. What is the definition of R-squared? ☐Measure of causation ☐Measure of goodness of fit ☐Measure of variance from the mean ☐Measure of collinearity ☐ Not Enough Information
Measure of goodness of fit
A random sample prevents ____ bias.
Measurement Bias
An unrepresentative sample would be ____ bias.
Measurement Bias
Not selecting a random sample is what type of bias?
Measurement Bias
Valid data____
Measures what is intended to be measured Does your test score represent your ability?
The best measure of central tendency when making decisions is _____.
Median
A hamburger restaurant wants to analyze the incomes of the residents in a city. Which of the following is the best statistic to use to measure the typical income of the residents? (a) Mode (b) Median (c) Mean (d) Variance
Median
Probability A × Probability B: what rule should be used?
Multiplication Rule
There is a 20% chance of parking close to the door and a 30% chance of getting a shopping cart with a squeaky wheel. How is the probability of both events occurring calculated?
Multiply
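As a quick check of the multiplication rule, the sketch below multiplies the two probabilities from the card. It assumes the two events are independent, which the card implies but does not state.

```python
p_close_parking = 0.20
p_squeaky_cart = 0.30

# Multiplication rule for independent events: P(A and B) = P(A) * P(B)
p_both = p_close_parking * p_squeaky_cart
print(round(p_both, 2))
```

So there is roughly a 6% chance of both happening on the same trip.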
What are the four types of data?
NOIR Nominal Ordinal Interval Ratio
NPS = % of______ - % of______.
NPS= % of PROMOTERS - % of DETRACTORS
As sales increase, estimated delivery times decrease. This relationship is _______
Negative
For R-squared, 0 means ____ fit and 1 means ____ fit.
No, excellent
If your F-stat is 0.026 and your F-critical is 1.96, can you reject the null?
No. If your score (F-stat) is smaller than your cut score (F-critical), you do not pass the test. We can only reject when we pass.
Identify, Group, or Categorize is ______ level of data.
Nominal
Name the data type categorical; gender, hair color
Nominal
The colors of cars is which level of measurement?
Nominal
Which level of measurement cannot be analyzed with most statistical tests?
Nominal
To make an inference about all used car purchases, you look up the last 30 buyers of used cars. What is wrong with this sample?
Non-random Measurement bias
In making an inference about the safety of a convertible, you measure all 2 door coupe vehicles.
Non-representative sample
Measuring the housing market in North Carolina when making an inference about the entire East coast is an example of __________.
Non-representative sample
Which of the following is not a Balanced Scorecard perspective? -Financials -Customers -Suppliers -Internal Business Processes
- Suppliers (Remember FLIC)
A normally distributed sample spreads ____ to ____ standard deviations from the mean.
-3 to +3
A relationship is weakest when r is closer to what number?
0
For NPS, how are the three groups of respondents categorized on an 11-point scale (0-10)?
0-6 = Detractors 7-8 = Passives 9-10 = Promoters
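The NPS formula and the three rating bands can be combined in a short sketch. The list of ratings is made up for illustration.

```python
def net_promoter_score(ratings):
    """NPS = % promoters (9-10) minus % detractors (0-6), on a 0-10 scale."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical survey responses.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
nps = net_promoter_score(ratings)
print(nps)
```

Passives (7-8) count toward the total number of responses but are neither added nor subtracted.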
68.3% of a sample will be within _______ standard deviations of the mean.
1
A relationship is stronger when r is closer to what number?
1 (100%), or -1 for a negative relationship
Name 4 advantages of Balanced Scorecards.
1. Better ORGANIZATIONAL ALIGNMENT. 2. Better internal and external COMMUNICATION 3. LINKS OPERATIONS WITH COMPANY STRATEGY. 4. Emphasizes STRATEGY and organizational RESULTS
Name 3 Disadvantages of KPIs
1. EXPENSIVE and requires ONGOING MAINTENANCE. 2. Shows changes that are NOT STATISTICALLY SIGNIFICANT. 3. May focus on SHORT-TERM, rather than long-term goals.
Name 3 advantages of KPIs.
1. Helps TRACK goals. 2. provides DATA-DRIVEN RESULTS to quantify performance. 3. Can be used ACROSS AN ENTIRE ORGANIZATION.
Name 3 disadvantages of Balanced Scorecards.
1. Requires TIME & EFFORT. 2. Challenges for cross-company ADOPTION. 3. May not encourage DESIRED BEHAVIOR CHANGES.
After twelve long months, we are proud to report monthly mean sales of $25,000. If we want to compute the 95% confidence interval, which Z-SCORE do we use? • (A) 1.69 • (B) 1.96 • (C) 2.75 • (D) 2.58
1.96
There is a 90% chance you will have to wait in line for more than 6 minutes to check out at the grocer. What is the complement?
10%
95.4% of a sample will be within ______ standard deviations of the mean.
2
You score 70 on an exam where the mean is 50 and the standard deviation is 10. Assuming the scores are normally distributed, approximately what percent of students scored better than you did? - 2.5% - 5% - 68% - 95%
2.5%
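The exam-score card can be checked against the normal distribution directly. This sketch uses `statistics.NormalDist` from the Python standard library. The empirical rule gives the rounded 2.5% answer (half of the 5% outside 2 standard deviations), while the exact tail is slightly smaller.

```python
from statistics import NormalDist

exam = NormalDist(mu=50, sigma=10)

# Share of students scoring above 70, which is 2 standard deviations above the mean.
share_above = 1 - exam.cdf(70)
print(f"{share_above:.2%}")
```

Among the answer choices on the card, 2.5% is the closest to this value.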
After 12 long months, we are proud to report monthly mean sales of $25,000. If we want to compute the 99% confidence interval, which z-score do we use? A: 1.69 B: 1.96 C: 2.75 D: 2.58
2.58 (option D; more precisely about 2.576) Keyword: 99% Z-Score
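The critical z-scores for the 95% and 99% confidence-interval cards can be recovered from the standard normal distribution. This is a small sketch using `statistics.NormalDist`, and the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical z-value for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

z_95 = z_critical(0.95)
z_99 = z_critical(0.99)
print(round(z_95, 2), round(z_99, 2))
```

The interval itself would be the mean plus or minus the critical z-value times the standard error, which the cards do not supply.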
99.7% of a sample will be within ______ standard deviations of the mean.
3
In 1968, the price of a Big Mac was $1.60. In 2014, a Big Mac cost $4.80. What is the 2014 price of a Big Mac expressed as an index? - $4.80 - 3 - 100 - 300
300 4.8/1.6 x 100
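The index calculation on the card (current price divided by base price, times 100) can be sketched as follows. The function name is illustrative.

```python
def price_index(current_price, base_price):
    """Price expressed as an index where the base period equals 100."""
    return current_price / base_price * 100

big_mac_index = price_index(4.80, 1.60)
print(round(big_mac_index))
```

An index of about 300 means the 2014 price is three times the base-year price.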
According to The Empirical Rule, approximately _____% of the data points in a dataset will be within 1 standard deviation of the mean.
68.3% will be within 1 standard deviation of the mean
Approximately ______% of the data points in a dataset will be within 2 standard deviations of the mean.
95.4% will be within 2 standard deviations of the mean.
A specialty car mechanic finds out that the amount spent by customers has NORMAL DISTRIBUTION with AVERAGE $500 and STANDARD DEVIATION $100. What is the PROBABILITY that the next customer will spend between $200 and $800? (a) 68.3% (b) 95.5% (c) 99.7% (d) 100%
99.7%
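The mechanic card is the empirical rule in disguise: $200 and $800 are each 3 standard deviations from the $500 mean. A short sketch with `statistics.NormalDist` confirms the probability.

```python
from statistics import NormalDist

spend = NormalDist(mu=500, sigma=100)

# P(200 < X < 800) is the area within 3 standard deviations of the mean.
p_between = spend.cdf(800) - spend.cdf(200)
print(f"{p_between:.1%}")
```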
What are the three probabilities of the bell curve?
99.7% 95.4% 68.3%
What percent of the data points in a dataset will be within 3 standard deviations of the mean?
99.7% will be within 3 standard deviations of the mean.
When you take the objective assessment for this course, your score represents ______. -A true score -A norm-referenced score -A criterion-referenced score - A percentile score
A criterion-referenced score
There is a 20% chance of parking close to the door and a 30% chance of getting a shopping cart with a squeaky wheel. How is the probability of either event occurring calculated?
Addition
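A sketch of the addition rule for this card follows. Because both events can happen on the same trip, the general form subtracts the overlap. The overlap is computed here under the assumption that the two events are independent, which the card does not state.

```python
p_close_parking = 0.20
p_squeaky_cart = 0.30

# Assumed independence: P(A and B) = P(A) * P(B)
p_both = p_close_parking * p_squeaky_cart

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_close_parking + p_squeaky_cart - p_both
print(round(p_either, 2))
```

If the events were mutually exclusive, the overlap would be zero and the two probabilities would simply be added.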
What are the key words to the Combination Rule?
All Possible
What are the key words for the Multiplication Rule?
And, Both, All
A researcher DECLARES that chess improves math skills by noting that children who play chess have higher math scores. Which misuse of statistics would occur if chess attracts children who have high math skills? (a) Small sample size (b) Association and causality (c) Lack of blinding (d) Response Bias
Association and Causality
Disadvantage of time-series analysis
Assumes past data patterns will repeat in future, which may not be true
______ occurs when a given data point on a time series analysis is affected by a previous data point for that series.
Autocorrelation
STRATEGIC teams that are essentially striving for ALIGNMENT TO corporate GOALS by applying management concepts to PERFORMANCE improvement. Would be using what method? • (A) Results Based Management • (B) Six Sigma • (C) Quality Initiative Team • (D) Balanced Scorecard
Balanced Scorecard
50% chance it will rain. There is a 20% chance you park in the garage in the rain and a 60% chance you park there if it is not raining. Given you are parked in the garage, what is the probability it is raining? Which probability technique applies?
Bayes
A pizza chain notes that 30% of customers use coupons. They also know that 50% of customers dine in. The probability that a customer will dine in given that they use a coupon is 10%. The probability that a customer will not use a coupon GIVEN THAT he/she dines out is 0.46. What probability rule could be used to determine this? (a) Multiplication (b) Intersection (c) Bayes Theorem (d) Combination
Bayes Theorem
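Both Bayes cards above can be worked through numerically. The sketch below uses only the probabilities given on the cards. The helper name `bayes` and the variable names are illustrative.

```python
def bayes(prior, likelihood, likelihood_if_not):
    """P(A | B) from P(A), P(B | A), and P(B | not A)."""
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Rain card: A = raining, B = parked in the garage.
p_rain_given_garage = bayes(prior=0.5, likelihood=0.2, likelihood_if_not=0.6)

# Pizza card: A = uses a coupon, B = dines out.
p_coupon = 0.30
p_dine_out = 1 - 0.50               # complement of dining in
p_dine_out_given_coupon = 1 - 0.10  # complement of dining in, given a coupon
p_coupon_given_dine_out = p_dine_out_given_coupon * p_coupon / p_dine_out
p_no_coupon_given_dine_out = 1 - p_coupon_given_dine_out

print(round(p_rain_given_garage, 2), round(p_no_coupon_given_dine_out, 2))
```

The second result reproduces the 0.46 quoted on the pizza card.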
What are the two major issues surrounding research standards?
Best Practices and Ethics
Research best practices eliminate _____.
Bias
Data Mining is most associated with ______ -Results-based management -Big Data -KPI Dashboards -Linear Programming
Big Data
Both structured and unstructured data that is difficult to process using traditional database and software techniques is_____.
Big data
Where is big data stored?
Big data Warehouse
Name the data type order; organizational rank; Likert, yes, maybe, no
Ordinal
Linear regression is often referred to as _____.
Ordinary Least Squares (OLS) Regression
____ orders from highest to lowest frequencies to show importance
Pareto Chart
When does research fail to produce reliable results?
Poor Research validity
What type of analytics? Use the past 10 years of customer satisfaction scores to estimate the upcoming year.
Predictive analytics
If we move to six appointments an hour we can increase revenue 2%. Which form of analytics is this?
Prescriptive Analytics
What type of analytics? Based on previous rebate response rates, increase the rebates by 2% in anticipation of a 5% increase in revenue.
Prescriptive analytics
Goodness of fit is defined by
R-squared How much of the dependent variable can be determined by the independent variable?
Errors that fix themselves are _____.
Random Errors
Continuous data with a unique zero point is ______ level of measurement.
Ratio
Females are 2x more likely to have Alzheimer's disease than males. This statement is an example of _________. -Proportion -Ratio -Prevalence -Cumulative Incidence
Ratio
Name the data type unique zero point (indicates absence); money, height, weight
Ratio
The price of the car is which level of measurement?
Ratio
Which level of measurement can be multiplied and divided?
Ratio
We start an ad campaign and send out BOGO coupons to our loyal customers each week. If we want to see if sales are going up as coupon mailings increase, which technique would we use? (a) Regression (b) t-test (c) Scatter Diagram (d) Chi-square
Regression
_____ takes information from one data set and can predict information for another data set.
Regression
A scatter plot shows what?
Relationship
A question is leading but you feel your boss might want you to answer a specific way. What type of bias is this?
Response Bias
You started a business 5 years ago and you need to grow revenue. Now that your immediate focus of increasing sales is underway, you want to WORK WITH ALL departments and include them in having a LONG-TERM IMPACT. What improvement strategy is this? • (A) Six Sigma • (B) SIPOC • (C) Balanced Scorecard • (D) Results Based Management
Results based management
A nonprofit charity seeks an ongoing strategy to ensure that it is achieving its goals with a system of partnership to other charities. What approach should the company use? -Net promoter Score - Balanced Scorecard -Big Data Analysis -Results-based management
Results-based management
The ______ is the chance that the decision will not produce the intended result or desired outcome.
Risk
Key Performance Indicators (KPIs) often follow ________.
SMART Criteria Specific Measurable Attainable Relevant Time-bound
You have the number of customers that visit your website by clicking on a promotional email. Which graph can show you if sales are CORRELATED to the number of promotional emails each week?
Scatter Diagram
A consultant wants to determine how training course HOURS AFFECT individual PERFORMANCE. Which of the following could be used to determine this? (a) Scatter plot (b) Histogram (c) Standard Deviation (d) Range
Scatter Plot
____ combines data from many different sources WITH NO WEIGHTING
Simple composite Index
KPI is typically a _____ measure.
Single performance
In which stage of the Davenport-Kim model is data collection?
Solving the problem
How do you easily find an out of range error?
Sort
A high standard deviation means that the numbers in the dataset are ______.
Spread out
Structured or unstructured data? Collecting multiple choice survey question responses.
Structured data
Spelling error is what type of error?
Systematic Error
This error repeats itself
Systematic Error - Skewed results
Errors that need correcting are ______.
Systematic errors
T-test uses this test statistic
T-stat. (must be higher than critical value to reject the null)
Which statistical test could be used to determine if there is a significant difference in the average overtime worked in a machine shop vs. the neighboring shop? ☐T-test ☐Chi-square ☐Least Squares Regression ☐Multiple Regression
T-test
Framing the Problem, Solving the Problem and Communicating Results are part of which decision-making model?
The Davenport-Kim three-stage model
In order to be a statistically valid sample, the sample must be:
The appropriate size and Random.
SD is the square root of ______
The variance
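The link between variance and standard deviation can be seen with the standard library's `statistics` module. The dataset is made up, and the population versions (`pvariance`, `pstdev`) are used here as an assumption; the sample versions (`variance`, `stdev`) follow the same square-root relationship.

```python
from math import sqrt
from statistics import pstdev, pvariance

# Hypothetical data with a mean of 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = pvariance(data)
std_dev = pstdev(data)

# The standard deviation is the square root of the variance.
print(variance, std_dev, sqrt(variance))
```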
A disadvantage of cluster analysis is __________________.
Time consuming and expensive.
What is the key reason we study statistics?
To make informed decisions
True or False: Correlation can be a negative number.
True
True or False: A Pareto chart is a histogram.
True
________ introduces an element of risk to decision-making problems.
Uncertainty
Adding two events calculates a _______________
Union
What are the three elements of an experimental study?
Unit Treatment Response
Searching 20 years of medical notes to see if depression is mentioned. Structured or unstructured data?
Unstructured data
Which of the following STATISTICS could help in directly assessing the RISK for a portfolio? (a) mean (b) variance (c) z-score (d) mode
Variance
____ can be used to differentiate between two samples with the same mean.
Variation
A low standard deviation means that most of the numbers are ______.
Very close to the average
A manager runs a t-test on the monthly performance of two machine shops and gets the following: T-stat: 2.89 T-Critical 2.06 p-value 0.02 Is there a significant difference between the two shops? ☐Yes ☐No ☐Maybe ☐Not Enough Information
YES Absolute value of t-stat is greater than t-crit
Your t-stat is 2.06 and your t-critical is 1.65. Can you reject the null?
Yes. If your score (t-stat) is larger than your cut score (t-critical), you pass the test!
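The "pass the cut score" rule used on the t-test and F-test cards can be expressed as a single comparison. This sketch only encodes that decision rule with the numbers from the cards; it does not compute the test statistic itself.

```python
def reject_null(test_stat, critical_value):
    """Reject the null when the absolute test statistic exceeds the critical value."""
    return abs(test_stat) > critical_value

machine_shops = reject_null(2.89, 2.06)  # t-test card with two machine shops
t_card = reject_null(2.06, 1.65)         # t-stat 2.06 vs t-critical 1.65
f_card = reject_null(0.026, 1.96)        # F-stat 0.026 vs F-critical 1.96

print(machine_shops, t_card, f_card)
```

The absolute value matters for t-tests, where the statistic can be negative. An F-statistic is never negative, so the same function still applies.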
____ = the number of standard deviation units a raw score is from its mean
Z
Which measurement can tell us where a single data point is on the bell curve?
Z-Score
Advantage of Regression Analysis
allows sophisticated analysis of cost behavior and sales forecasts.
Reliable data is
consistent and repeatable A measure of the instrument (test)
Best decision based on estimated value
decision tree
Multiplying two events calculates a ______.
intersection
Outliers create this type of error
out of range
z-scores let us measure two samples by _____________.
putting them to the same scale.
Regression determines the ____.
relationship between two data sets
For a test without ___ error, the ___ score is the ___ score plus ___ error -random/observed/true/systematic -random/true/observed/systematic -systematic/observed/true/random -systematic/true/observed/random
systematic/observed/true/random
Which measure lets you measure your height compared to your maternal family tree and your height compared to your paternal family tree?
z-score
Name the 7 quality tools
• Cause-and-effect diagram • Check sheet • Run or Control chart • Histogram • Pareto chart • Scatter diagram • Flowchart