Business Analytics Quiz Questions

If the two-sided p-value of a given sample mean is 0.0020, what is the one-sided p-value for that sample mean? A.) 0.0010 B.) 0.0020 C.) 0.0040 D.) The answer cannot be determined without further information

A.) 0.0010

If the two-sided p-value of a given sample mean is 0.0040, what is the one-sided p-value for that sample mean? A.) 0.0020 B.) 0.0040 C.) 0.0080 D.) The answer cannot be determined without further information

A.) 0.0020 The one-sided p-value is half of the two-sided p-value. Since the two-sided p-value is 0.0040, the one-sided p-value is 0.0040/2=0.0020.
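The halving rule can be checked numerically. The sketch below assumes a z-test and a hypothetical test statistic z = 2.88 (not taken from the quiz), and uses NormalDist from Python's standard library.

```python
from statistics import NormalDist

z = 2.88  # hypothetical test statistic, chosen only for illustration

# One-sided p-value: area in the single tail beyond z.
one_sided = 1 - NormalDist().cdf(z)

# Two-sided p-value: area in both tails, i.e. twice the one-sided area.
two_sided = 2 * one_sided

print(round(one_sided, 4), round(two_sided, 4))
```

Because the normal curve is symmetric, the two tails have equal area, which is why the one-sided value is always half of the two-sided value.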

If an independent variable has a p-value of 0.07, which of the following could represent the Lower 95% and the Upper 95% for that variable? A.) -14.52, -3.25 B.) -14.52, 3.25 C.) 3.25, 14.52 D.) The answer cannot be determined without further information

B.) -14.52, 3.25 The p-value, 0.07, is greater than 0.05 so the independent variable is not significant at the 5% significance level. Therefore, the 95% confidence interval for the coefficient of the independent variable must include zero. The interval between -14.52 and 3.25 contains zero.

If you are performing a hypothesis test based on a 90% confidence level, what are your chances of making a type I error? A.) 90% B.) 10% C.) 5% D.) It is not possible to tell without more information

B.) 10% The probability of a type I error is equal to the significance level, which is 1-confidence level. A 90% confidence level indicates that the significance level is 10%. Therefore there is a 10% chance of making a type I error.

A manager of a factory wants to know if a new quality check protocol has decreased the number of units a worker produces in a day. Before the new protocol, a worker could produce 27 units per day. What null hypothesis should the manager use to test this claim? A.) M = 27 units B.) M is greater than or equal to 27 units C.) M is greater than 27 units D.) M is less than 27 units

B.) M is greater than or equal to 27 units This is the null hypothesis. Remember that the null and alternative hypotheses are opposites.

Which of the formulas would calculate the statistic that is MOST APPROPRIATE for comparing the variability of two data sets with different distributions? A.) Mean / SD B.) SD / Mean C.) Mean - Median D.) Median - Mean E.) Mean / Variance F.) Variance / Mean

B.) SD / Mean This is the formula for the coefficient of variation, the best statistic to compute to compare the variability of two data sets with different distributions. Dividing by the mean provides a measure of the distribution's variation relative to the mean.

Coefficient of variation equation

SD / Mean
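A minimal Python sketch of the coefficient of variation follows. The two data sets are hypothetical and were picked only to show that the statistic is unitless, so it can compare variability across very different scales.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

# Hypothetical data sets measured on very different scales.
salaries = [40_000, 50_000, 60_000]
heights_cm = [160, 170, 180]

cv_salaries = coefficient_of_variation(salaries)
cv_heights = coefficient_of_variation(heights_cm)

print(round(cv_salaries, 3), round(cv_heights, 3))
```

Here the salaries have the larger standard deviation and also the larger coefficient of variation, but the second comparison is the one that accounts for the difference in scale.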

IQ scores are known to be normally distributed. The mean IQ score is 100 and the standard deviation is 15. The top 25% of the population (ranked by IQ score) have IQs above what value?

Since you are only interested in the top 25%, calculate the IQ below which 75% of people fall. The Excel function NORM.INV(probability, mean, standard_dev) returns the inverse of a normal cumulative distribution function. Here, NORM.INV(0.75,B1,B2)=NORM.INV(0.75,100,15)≈110 indicates that 75% of people have IQs lower than about 110. Hence, 25% of people have IQs greater than about 110.
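The same lookup can be reproduced outside Excel. This sketch uses Python's standard-library NormalDist, whose inv_cdf method plays the role of NORM.INV here.

```python
from statistics import NormalDist

# IQ scores: normal with mean 100 and standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Value below which 75% of the population falls (the 75th percentile).
cutoff = iq.inv_cdf(0.75)

print(round(cutoff, 2))  # 110.12
```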

A journalist wants to determine the average annual salary of CEOs in the S&P 1,500. He does not have time to survey all 1,500 CEOs but wants to be 95% confident that his estimate is within $50,000 of the true mean. The journalist takes a preliminary sample and estimates that the standard deviation is approximately $449,300. What is the minimum number of CEOs that the journalist must survey to be within $50,000 of the true average annual salary? Remember that the z-value associated with a 95% confidence interval is 1.96.

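The formula for calculating the minimum required sample size is n ≥ (z·s/M)^2, where z is the z-value, s is the estimated standard deviation, and M is the desired margin of error. Here, (1.96 × 449,300/50,000)^2 ≈ 310.20. Rounding up to the next whole number gives a minimum sample size of 311.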
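The sample size calculation can be sketched in Python as follows. The function name min_sample_size is a hypothetical helper, not part of the course material.

```python
import math

def min_sample_size(z, sd, margin):
    """Smallest whole n satisfying n >= (z * sd / margin) ** 2."""
    return math.ceil((z * sd / margin) ** 2)

n = min_sample_size(z=1.96, sd=449_300, margin=50_000)
print(n)  # 311
```

The result is always rounded up, because rounding down would leave the margin of error slightly wider than the target.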

P(μ-2σ≤x≤μ-σ) percentage

13.5%

P(μ+2σ≤x) percentage

2.5%

P(μ-2σ≤x≤μ) percentage

47.5%

An airport shuttle company forecasts the number of hours its drivers will work based on the distance to be driven (in miles) and the number of jobs (each job requires the pickup and drop-off of one set of passengers) using the following regression equation: Travel time=-0.60+0.05(distance)+0.75(number of jobs) On a given day, Victor and Sofia drive approximately the same distance but Sofia has two more jobs than Victor. If Victor worked for 4 hours, for how long can the company expect Sofia to work?

5.5 The only difference between the workloads of the two drivers is the number of jobs each has; Sofia has two additional jobs. Therefore the company can expect Sofia to work the four hours Victor worked, plus an additional 0.75 hours for each of the two additional jobs, that is, 4+0.75(2)=5.5 hours.
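The reasoning above can be verified with the regression equation itself. In this sketch the distance of 62 miles and Victor's 2 jobs are hypothetical values, chosen only because they give Victor a 4-hour day.

```python
def travel_time(distance_miles, jobs):
    """Forecast hours worked from the quiz's regression equation."""
    return -0.60 + 0.05 * distance_miles + 0.75 * jobs

# Hypothetical inputs that yield 4 hours for Victor.
victor_hours = travel_time(distance_miles=62, jobs=2)

# Sofia drives the same distance but has two more jobs.
sofia_hours = travel_time(distance_miles=62, jobs=4)

print(round(victor_hours, 2), round(sofia_hours, 2))
```

Any distance and job count that give Victor 4 hours lead to the same answer for Sofia, since only the jobs coefficient (0.75 per job) contributes to the difference.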

P(μ-σ≤x≤μ+σ) percentage

68%
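The rule-of-thumb percentages on these cards can be compared with the exact areas under a standard normal curve. This is a small sketch using Python's standard-library NormalDist; the helper name area_between is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_between(lo, hi):
    """Probability that a standard normal value lies between lo and hi."""
    return std_normal.cdf(hi) - std_normal.cdf(lo)

within_one_sd = area_between(-1, 1)            # rule of thumb: 68%
minus_two_to_minus_one = area_between(-2, -1)  # rule of thumb: 13.5%
minus_two_to_mean = area_between(-2, 0)        # rule of thumb: 47.5%
above_plus_two = 1 - std_normal.cdf(2)         # rule of thumb: 2.5%

for area in (within_one_sd, minus_two_to_minus_one,
             minus_two_to_mean, above_plus_two):
    print(f"{area:.1%}")
```

The exact values differ slightly from the rounded figures because the rule of thumb treats two standard deviations as covering exactly 95%.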

A sporting goods store manager wants to forecast annual sneaker revenues based on the type of sport (running, tennis, or walking), color (red, blue, white, black, or violet) and its target audience (men or women). How many independent variables should the manager include in her multiple regression analysis?

7 Sales revenue is the dependent variable. Type of sport, color, and target audience are categorical variables which must be represented using dummy variables. Recall that it is necessary to use one fewer dummy variable than the number of options in a category. Thus, type of sport should be represented by 3-1=2 dummy variables, color should be represented by 5-1=4 dummy variables, and target audience should be represented by 2-1=1 dummy variable, for a total of 2+4+1=7 independent variables.
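The counting rule can be written out in a few lines of Python. The dictionary below simply restates the categories from the question; the variable names are assumptions for illustration.

```python
# Each categorical variable and its possible values, as listed in the question.
categories = {
    "sport": ["running", "tennis", "walking"],
    "color": ["red", "blue", "white", "black", "violet"],
    "audience": ["men", "women"],
}

# One option per category serves as the baseline, so it needs no dummy.
n_dummies = sum(len(levels) - 1 for levels in categories.values())

print(n_dummies)  # 7
```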

How do you calculate the 60th percentile of a set of data?

=PERCENTILE.INC(B2:B76,0.60)
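For readers working outside Excel, the sketch below mirrors the inclusive percentile method that PERCENTILE.INC uses: linear interpolation between the two closest ranks. The function name percentile_inc and the sample data are hypothetical.

```python
def percentile_inc(values, p):
    """Inclusive percentile with linear interpolation, for 0 <= p <= 1."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    data = sorted(values)
    rank = p * (len(data) - 1)   # fractional position in the sorted data
    lower = int(rank)            # index of the value at or below the position
    frac = rank - lower          # how far to move toward the next value
    if lower + 1 >= len(data):
        return float(data[-1])
    return data[lower] + frac * (data[lower + 1] - data[lower])

# Hypothetical data: the 60th percentile lies 40% of the way from 3 to 4.
result = percentile_inc([1, 2, 3, 4, 5], 0.60)
print(round(result, 2))  # 3.4
```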

A movie theater manager wants to determine whether popcorn sales have increased since the theater switched from using "butter-flavored topping" to real butter. Historically the average popcorn revenue per weekend day was approximately $3,500. After the theater started using real butter, the manager randomly sampled 12 weekend days and calculated the sample's summary statistics. The average revenue per weekend day in the sample was approximately $4,200 with a standard deviation of $140. Select the function that would correctly calculate the 90% range of likely sample means. A.) 3500 +/- CONFIDENCE.T(0.10,140,12) B.) 4200 +/- CONFIDENCE.T(0.10,140,12) C.) 3500 +/- CONFIDENCE.NORM(0.10,140,12) D.) 4200 +/- CONFIDENCE.NORM(0.10,140,12)

A.) 3500 +/- CONFIDENCE.T(0.10,140,12) Correct. The range of likely sample means is centered at the historical population mean, in this case $3,500. Because the sample contains fewer than 30 data points, we use CONFIDENCE.T. Excel's CONFIDENCE.T function syntax is CONFIDENCE.T(alpha, standard_dev, size). Because we wish to construct a 90% range of likely sample means, alpha equals 0.10.

A manager of a factory wants to know if the average number of workplace accidents is different for workers who attended an equipment safety training compared to those who did not attend. What alternative hypothesis should the manager use to test this claim? A.) µ_attended does not equal µ_did not attend B.) µ_attended > µ_did not attend C.) µ_attended < µ_did not attend D.) µ_attended = µ_did not attend

A.) µ_attended does not equal µ_did not attend The manager has reason to believe that the training has changed the average number of workplace accidents between the two groups of workers. For a two-sided test, the manager should use the alternative hypothesis Ha: µ_attended ≠ µ_did not attend. This is the claim the manager wishes to substantiate.

Recall that the owner of a local health food store recently started a new ad campaign to attract more business and wants to know if average daily sales have increased. Historically average daily sales were approximately $2,700. The upper bound of the 95% range of likely sample means for this one-sided test is approximately $2,843.44. If the owner took a random sample of forty-five days and found that daily average sales were now $2,984, what can she conclude at the 95% confidence level? A.) Average daily sales have increased B.) Average daily sales have decreased C.) Average daily sales have remained the same D.) Average daily sales have not increased E.) The answer cannot be determined without further information

A.) Average daily sales have increased Since the sample mean, $2,984, falls outside the range of likely sample means (which has an upper bound of $2,843.44), the store owner can reject the null hypothesis at a 95% confidence level. Since she can reject the null hypothesis, she can essentially accept the alternative hypothesis and conclude that average daily sales have increased.

The owner of an ice cream shop wants to determine whether there is a relationship between ice cream sales and temperature. The owner collects data on temperature and sales for a random sample of 30 days and runs a regression to determine if there is a relationship between temperature (in degrees) and ice cream sales. The p-value for the two-sided hypothesis test is 0.04. How would you interpret the p-value? A.) If there is no relationship between temperature and sales, the chance of selecting a sample this extreme would be 4%. B.) If there is a relationship between temperature and sales, the chance of seeing a regression coefficient this large would be 4%. C.) There is a 4% chance that there is a relationship between temperature and revenue. D.) There is a 4% chance that there is no relationship between temperature and revenue.

A.) If there is no relationship between temperature and sales, the chance of selecting a sample this extreme would be 4%. Correct. The null hypothesis is that there is no relationship. The p-value indicates how likely we would be to select a sample this extreme if the null hypothesis is true.

Which of the following statements about multicollinearity is TRUE? SELECT ALL THAT APPLY. A.) Multicollinearity occurs when two or more independent variables are highly correlated B.) Multicollinearity is usually not an issue when the regression model is only being used for forecasting C.) Multicollinearity is usually not an issue when the regression model is only being used to understand net relationships D.) Multicollinearity can typically be reduced by decreasing the sample size E.) Multicollinearity can typically be reduced by adding more independent variables

A.) Multicollinearity occurs when two or more independent variables are highly correlated Multicollinearity means that two or more of the independent variables are collinear, meaning they are highly correlated. One or more of the independent variables may not be significant because the variable with which it is correlated serves as a proxy variable. B.) Multicollinearity is usually not an issue when the regression model is only being used for forecasting Multicollinearity is typically not a problem when the model is being used for forecasting, especially if the predictive power of the model is increased by the additional variable(s).

You report a confidence interval to your boss but she says that she wants a narrower range. SELECT ALL of the ways you can reduce the width of the confidence interval. A.) increase the sample size B.) decrease the sample size C.) increase the confidence level D.) decrease the confidence level E.) increase the mean F.) decrease the mean

A.) increase the sample size Increasing the sample size provides a more accurate representation of the population and therefore reduces the width of the confidence interval. D.) decrease the confidence level Decreasing the confidence level reduces the width of the confidence interval.

An automotive manufacturer has developed a new type of tire that the research team believes to increase fuel efficiency. The manufacturer wants to test if there is an increase in the mean gas mileage of mid-sized sedans that use the new type of tire, compared to 32 miles per gallon, the historic mean gas mileage of mid-sized sedans not using the new tires. The automotive manufacturer should perform a _____________ hypothesis test to _____________. A.) one sided; analyze a change in a single population B.) two sided; analyze a change in a single population C.) one sided; compare two populations D.) two sided; compare two populations

A.) one sided; analyze a change in a single population The manufacturer believes that the new tires change fuel efficiency in a single direction (i.e., that efficiency increases) and thus should use a one-sided hypothesis test. The automotive manufacturer is analyzing the change of a single population mean compared to the known historic population mean of gas mileage in mid-sized sedans.

Which of the following is TRUE about the difference between analyzing residual plots for single variable regression models and analyzing residual plots for multiple regression models? A.) Multiple regression residual plots give insight into multicollinearity across the independent variables, whereas multicollinearity cannot occur in a single variable regression model or its residual plot. B.) Single variable regression plots give insight into the gross relationship between the independent and dependent variable, whereas multiple regression plots give insight into the net relationship, controlling for the other independent variables included in the regression model. C.) Multiple regression plots give insight into both heteroskedasticity and non-linearity, whereas single regression plots only give insight into heteroskedasticity. D.) In single regression plots, residuals are measured as the shortest distance from the regression line, whereas in multiple regression residuals are measured along the vertical axis.

B.) Single variable regression plots give insight into the gross relationship between the independent and dependent variable, whereas multiple regression plots give insight into the net relationship, controlling for the other independent variables included in the regression model. In multiple regression, we see the effect of each independent variable, controlling for all the other variables in the model, or the net effect. This is reflected in the residual plots.

Suppose you actually want to calculate the mean annual healthcare expenditures of the 192 countries. Which of the following Excel functions calculates the mean? SELECT ALL THAT APPLY. A.) =MEAN(B2:B193) B.) =AVERAGE(B2:B193) C.) =MEDIAN(B2:B193) D.) =SUM(B2:B193)/192 E.) =MODE.SNGL(B2:B193)

B.) =AVERAGE(B2:B193) D.) =SUM(B2:B193)/192

If the one-sided p-value of a given sample mean is 0.0150, what is the two-sided p-value for that sample mean? A.) 0.0075 B.) 0.0150 C.) 0.0300 D.) The answer cannot be determined without further information

C.) 0.0300 The two-sided p-value is double the one-sided p-value. Since the one-sided p-value is 0.0150, the two-sided p-value is 0.0150*2=0.0300.

If you are performing a hypothesis test based on a 20% significance level, what are your chances of making a type I error? A.) 80% B.) 10% C.) 20% D.) It is not possible to tell without more information

C.) 20% The probability of a type I error is equal to the significance level, which is 1-confidence level.

Now suppose we take a sample of 25 students, taking the same standardized test, which has a mean score of 500 and a standard deviation of 100, and find that the average score of this sample is 530. Which function would correctly calculate the 95% range of likely sample means under the null hypothesis? A.) 530 +/- CONFIDENCE.NORM(0.05,100,25) B.) 530 +/- CONFIDENCE.T(0.05,100,25) C.) 500 +/- CONFIDENCE.T(0.05,100,25) D.) 500 +/- CONFIDENCE.NORM(0.05,100,25)

C.) 500 +/- CONFIDENCE.T(0.05,100,25) The range of likely sample means is centered at the historical population mean, 500. Because our sample size is less than 30, we cannot assume that the sample means are normally distributed, and so we should use CONFIDENCE.T rather than the CONFIDENCE.NORM function.

A streaming music site changed its format to focus on previously unreleased music from rising artists. The site manager now wants to determine whether the number of unique listeners per day has changed. Before the change in format, the site averaged 131,520 unique listeners per day. Now, beginning three months after the format change, the site manager takes a random sample of 30 days and finds that the site has an average of 124,247 unique listeners per day. The manager finds that the p-value for the hypothesis test is approximately 0.0743. How would you interpret the p-value? A.) The likelihood that the average number of unique daily listeners per day is no longer 131,520 is 7.43%. B.) The likelihood that the manager should reject the null hypothesis is 7.43%. C.) If the average number of unique daily listeners per day is still 131,520, the likelihood of obtaining a sample with a mean at least as extreme as 124,247 is 7.43%. D.) If the average number of unique daily listeners per day is no longer 131,520, the likelihood of obtaining a sample with a mean at least as extreme as 124,247 is 7.43%.

C.) If the average number of unique daily listeners per day is still 131,520, the likelihood of obtaining a sample with a mean at least as extreme as 124,247 is 7.43%. The null hypothesis is that the average number of unique daily listeners per day has not changed, that is, it is still 131,520. Therefore, the p-value of 0.0743 indicates that if the average number of unique daily listeners is still 131,520, the likelihood of obtaining a sample with a mean at least as extreme as 124,247 is 7.43%.

A manager of a factory wants to know if a new quality check protocol has decreased the number of units a worker produces in a day. Before the new protocol, a worker could produce 27 units per day. What alternative hypothesis should the manager use to test this claim? A.) M does not equal 27 units B.) M is less than or equal to 27 units C.) M is less than 27 units D.) M is greater than 27 units

C.) M is less than 27 units The manager wants to know if the new quality check protocol has decreased the average number of units a worker can produce per day. For a one-sided test, the manager should use the alternative hypothesis Ha: μ<27 units. This is the claim the manager wishes to substantiate.

What can be concluded from the fact that the correlation coefficient between the acceptance rate at the top 100 U.S. MBA programs and the percent of students in those programs who are employed upon graduation is -0.32? A.) On average, as the acceptance rate increases, the percent of students employed upon graduation increases. B.) On average, as the acceptance rate decreases, the percent of students employed upon graduation decreases. C.) On average, as the acceptance rate decreases, the percent of students employed upon graduation increases. D.) On average, as the acceptance rate increases, the percent of students employed upon graduation remains the same.

C.) On average, as the acceptance rate decreases, the percent of students employed upon graduation increases. The correlation coefficient is negative (-0.32), so the two variables tend to move in opposite directions: on average, lower acceptance rates are associated with higher employment rates upon graduation.

Consider the four outliers in the 2012 revenue data: companies with revenue of $237 billion, $246 billion, $447 billion, and $453 billion. If we removed these companies from the data set, what would happen to the standard deviation? A.) The SD would remain the same B.) The SD would increase C.) The SD would decrease D.) The answer cannot be determined without further information

C.) The SD would decrease The standard deviation gives more weight to observations that are further from the mean. Therefore, removing the outliers would decrease the standard deviation.

Which of the following is the MOST LIKELY result of using a survey with biased questions? A.) The standard deviation of the sample will be larger than the standard deviation of the population. B.) The standard deviation of the sample will be smaller than the standard deviation of the population. C.) The data in your sample will differ in a systematic way from data based on unbiased random selections from the population. D.) The data in your sample will not follow a normal distribution.

C.) The data in your sample will differ in a systematic way from data based on unbiased random selections from the population. In general, surveys with biased questions may lead to biased data, which differ systematically from what would be seen in an unbiased sample. For example, biased survey questions would lead to systematic differences between the answers given on your surveys and the answers that would be given on a more neutral survey.

An internet marketing firm compiled a data set of the number of seconds website visitors stay on one of its client's homepage before abandoning the site. The firm presented the summary statistics for the data set to the client. The client asked why the mean of the data set is so much larger than the median. Which of the following is most likely true? A.) The distribution of the data is symmetric B.) The distribution of the data is skewed to the left C.) The distribution of the data is skewed to the right D.) The distribution of the data is bimodal

C.) The distribution of the data is skewed to the right When the distribution of data is skewed to the right, the mean is most likely greater than the median. The extreme values in the right tail pull the mean towards them.

A college student is interested in testing whether business majors or liberal arts majors are better at trivia. The student gives a trivia quiz to a random sample of 30 business school majors and finds the sample's average test score is 86. He gives the same quiz to 30 randomly selected liberal arts majors and finds the sample's average quiz score is 89. The student finds that the p-value for the hypothesis test equals approximately 0.0524. What can be concluded at alpha=5%? A.) The student should reject the null hypothesis and conclude that there is insufficient evidence of difference between business and liberal arts majors' knowledge of trivia. B.) The student should reject the null hypothesis and conclude that there is a significant difference between business and liberal arts majors' knowledge of trivia. C.) The student should fail to reject the null hypothesis and conclude that there is insufficient evidence of difference between business and liberal arts majors' knowledge of trivia. D.) The student should fail to reject the null hypothesis and conclude that there is a significant difference between business and liberal arts majors' knowledge of trivia.

C.) The student should fail to reject the null hypothesis and conclude that there is insufficient evidence of difference between business and liberal arts majors' knowledge of trivia. Since the p-value, 0.0524, is greater than the significance level, 0.05, the student should fail to reject the null hypothesis and conclude that there is insufficient evidence of difference between business and liberal arts majors' knowledge of trivia. Because the null hypothesis is that there is no difference between the two types of majors, this answer is correct.

Before beginning a hypothesis test, an analyst specified a significance level of 0.10. Which of the following is true? A.) there is a 90% chance that the alternative hypothesis is true B.) There is a 90% chance that the CI will include the true mean of the population C.) There is a 10% chance of rejecting the null hypothesis when it is actually true D.) There is a 90% chance of rejecting the null hypothesis when it is actually false

C.) There is a 10% chance of rejecting the null hypothesis when it is actually true Correct. The significance level specifies how different the observed sample mean has to be from the mean expected under the null hypothesis before we reject the null hypothesis. A significance level of 0.10 means that the observed sample mean is so different from the mean expected under the null hypothesis that it would only occur 10% of the time if the null hypothesis were true.

Indicate whether we would use time series or cross sectional data A.) To determine if enrollment in higher education is increasing B.) To compare the insect population in a geographic region before and after an insecticide was applied C.) To compare the current price of a gallon across different gas stations in LA D.) To see if there are differences in the average number of calories contained in school lunches on December 1st

Cross-Sectional Data: C.) To compare the current price of a gallon across different gas stations in LA D.) To see if there are differences in the average number of calories contained in school lunches on December 1st Time Series Data: A.) To determine if enrollment in higher education is increasing B.) To compare the insect population in a geographic region before and after an insecticide was applied

If the mean weight of all students in a class is 165 pounds with a variance of 234.09 square pounds, what is the z-value associated with a student whose weight is 140 pounds? A.) 1.63 B.) 0.11 C.) -0.11 D.) -1.63

D.) -1.63 z = (x-m)/SD = (140-165)/15.3 ≈ -1.63. The SD, 15.3, is the square root of the variance, 234.09.
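The z-value calculation can be sketched in Python. The function name z_score is a hypothetical helper; note that the variance must be converted to a standard deviation before dividing.

```python
import math

def z_score(x, mean, variance):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / math.sqrt(variance)

z = z_score(x=140, mean=165, variance=234.09)
print(round(z, 2))  # -1.63
```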

For a standard normal distribution (µ=0, σ=1), the area under the curve less than 1.25 is 0.894. What is the approximate percentage of the area under the curve less than -1.25? A.) 0.894 B.) 0.394 C.) 0.211 D.) 0.106

D.) 0.106 1-0.894=0.106 is the area under the curve for all values greater than 1.25. Since the normal distribution is symmetric, 0.106 is also the area under the curve for all values less than -1.25.
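The symmetry argument can be confirmed numerically with Python's standard-library NormalDist, as a quick sketch.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

area_below_neg = std_normal.cdf(-1.25)     # area to the left of -1.25
area_above_pos = 1 - std_normal.cdf(1.25)  # area to the right of +1.25

print(round(area_below_neg, 3), round(area_above_pos, 3))
```

Both areas come out equal, which is the symmetry property the answer relies on.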

The mean score on a particular standardized test is 500, with a standard deviation of 100. To assess whether a training course has been effective in improving scores on the test, we take a random sample of 100 students from the course and find that the average score of this sample is 550. Which function would correctly calculate the 95% range of likely sample means under the null hypothesis? A.) 550 +/- CONFIDENCE.NORM(0.05,100,100) B.) 550 +/- CONFIDENCE.T(0.05,100,100) C.) 500 +/- CONFIDENCE.T(0.05,100,100) D.) 500 +/- CONFIDENCE.NORM(0.05,100,100)

D.) 500 +/- CONFIDENCE.NORM(0.05,100,100) The range of likely sample means is centered at the historical population mean, 500. Because our sample is larger than 30, we can assume the distribution of sample means is roughly normal, due to the central limit theorem, and use the CONFIDENCE.NORM function.
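As a rough Python counterpart to CONFIDENCE.NORM, the sketch below computes the margin as the z-value times the standard deviation divided by the square root of the sample size. The function name confidence_norm is an assumption that mirrors the Excel argument order; it is not an Excel or Python built-in.

```python
import math
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Margin of error: z * standard_dev / sqrt(size), two-sided."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)

margin = confidence_norm(alpha=0.05, standard_dev=100, size=100)

# The range is centered at the mean assumed under the null hypothesis.
low, high = 500 - margin, 500 + margin

print(round(low, 1), round(high, 1))  # 480.4 519.6
```

The observed sample mean of 550 lies outside this range.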

According to the Central Limit Theorem, the means of random samples from which of the following distributions will be normally distributed, assuming the samples are sufficiently large? A.) The heights of basketball players B.) The sum of two dice C.) The annual income of HBS Online graduates D.) All of the above

D.) All of the above According to the Central Limit Theorem, if we take large enough samples, the distribution of sample means will be normally distributed regardless of the shape of the underlying population.

When performing a hypothesis test based on a 95% confidence level, what are the chances of making a type II error? A.) 95% B.) 2.5% C.) 5% D.) It is not possible to tell without further information

D.) It is not possible to tell without further information A type II error occurs when we fail to reject the null hypothesis when the null hypothesis is actually false. The confidence level does not provide any information about the likelihood of making a type II error. Calculating the chances of making a type II error is quite complex and beyond the scope of this course.

If you are performing a hypothesis test based on a 90% CI, what are your chances of making a type II error? A.) 90% B.) 10% C.) 5% D.) It is not possible to tell without more information

D.) It is not possible to tell without more information The confidence level does not provide any information about the likelihood of making a type II error. Calculating the chances of making a type II error is quite complex and beyond the scope of this course.

A manager of a factory wants to know if the average number of workplace accidents is different for workers who attended an equipment safety training compared to those who did not attend. What null hypothesis should the manager use to test this claim? A.) µ_attended > µ_did not attend B.) µ_attended greater than or equal to µ_did not attend C.) µ_attended less than or equal to µ_did not attend D.) µ_attended = µ_did not attend

D.) µ_attended = µ_did not attend If the manager's alternative hypothesis is that the average number of workplace accidents has changed between the two groups of workers, then the null hypothesis is that the average number of accidents has remained the same.

A streaming music site changed its format to focus on previously unreleased music from rising artists. The site manager now wants to determine whether the number of unique listeners per day has changed. Before the change in format, the site averaged 131,520 unique listeners per day. Now, beginning three months after the format change, the site manager takes a random sample of 30 days and finds that the site now has an average of 124,247 unique listeners per day. The manager finds that the p-value for the hypothesis test is approximately 0.0743. What can be concluded at the 95% confidence level? A.) The manager should reject the null hypothesis; there is sufficient evidence that the number of unique daily listeners has likely changed. B.) The manager should reject the null hypothesis; there is not enough evidence to conclude that the number of unique daily listeners has changed. C.) The manager should fail to reject the null hypothesis; there is sufficient evidence that the number of unique daily listeners has likely changed. D.) The manager should fail to reject the null hypothesis; there is not enough evidence to conclude that the number of unique daily listeners has changed.

D.) The manager should fail to reject the null hypothesis; there is not enough evidence to conclude that the number of unique daily listeners has changed. Since the p-value, 0.0743, is greater than the significance level, 0.05, the manager should fail to reject the null hypothesis.
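The decision rule used in these questions (compare the p-value with the significance level) can be sketched in Python. The function name decide is a hypothetical helper, not part of the course material.

```python
def decide(p_value, alpha):
    """Reject the null hypothesis only when the p-value is below alpha."""
    return "reject" if p_value < alpha else "fail to reject"

# A 95% confidence level corresponds to alpha = 1 - 0.95 = 0.05.
decision = decide(p_value=0.0743, alpha=0.05)
print(decision)  # fail to reject
```

With the same p-value, a 90% confidence level (alpha = 0.10) would lead to rejecting the null hypothesis, which shows why the significance level should be chosen before the test is run.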

A food truck operator has traditionally sold 75 bowls of noodle soup each day. He moves to a new location and after a week sees that he has averaged 85 bowls of noodle soup sales each day. He runs a one-sided hypothesis test to determine if his daily sales at the new location have increased. The p-value of the test is 0.031. How should he interpret the p-value? A.) There is a 3.1% chance that the true mean of soup sales at the new location is 85 bowls a day. B.) There is a 96.9% chance that the true mean of soup sales at the new location is greater than 75 bowls a day. C.) There is a 96.9% chance that the sample mean of soup sales at the new location is 85 bowls a day. D.) There is a 3.1% chance of obtaining a sample with a mean of 85 or higher assuming that the true mean sales at the new location is still equal to or less than 75 bowls a day. E.) There is a 96.9% chance that the true mean of soup sales at the new location is within 3.1 bowls of 85 bowls a day.

D.) There is a 3.1% chance of obtaining a sample with a mean of 85 or higher assuming that the true mean sales at the new location is still equal to or less than 75 bowls a day. The p-value provides us with the likelihood of the sample value equal to or more extreme than the observed sample value if the null hypothesis is true. In this case the p-value of 0.031 tells us that there would be a 3.1% chance of the sample value of 85 or above being observed if the null hypothesis were true.

Which is an example of a hidden variable? A.) Quality of life is a hidden variable because it cannot be measured directly but must be inferred from measurable variables such as wealth, success, and environment. B.) A recent study showed a correlation between a country's chocolate consumption and the number of Nobel prizes won by its scientists. The hidden variable is a strong university system that fosters talented researchers. C.) The correlation between smoking and lung cancer was a hidden variable for a long time because the cigarette lobby paid to keep the relationship hidden. D.) There is a correlation between the number of firefighters who show up at a fire and how much damage the fire causes. The hidden variable is the size of the fire.

D.) There is a correlation between the number of firefighters who show up at a fire and how much damage the fire causes. The hidden variable is the size of the fire. A hidden variable is one that is correlated with each of two variables that are not fundamentally related to each other. In this case, the size of the fire leads to a call for more firefighters, and the size of the fire also generally leads to more damage. The number of firefighters does not lead to a greater amount of fire damage.

An engineer designing a new type of bridge wants to test the stress and load-bearing capabilities of a prototype before beginning construction. Her null hypothesis is that the bridge's stress and load capabilities are safe. Select which type of error would be worse. Make sure that the type of error is matched with the correct definition. A.) Type I, the engineer does not deem the bridge safe even though it is B.) Type I, the engineer deems the bridge safe and moves on to construction even though it is not actually safe C.) Type II, the engineer does not deem the bridge safe even though it is D.) Type II, the engineer deems the bridge safe and moves on to construction even though it is not actually safe

D.) Type II, the engineer deems the bridge safe and moves on to construction even though it is not actually safe. The null hypothesis is that the bridge is safe, so a type II error (failing to reject a false null hypothesis) means the engineer deems the bridge safe and moves on to construction even though it is not actually safe. This would be worse than presuming that a safe bridge is unsafe.

The owner of a local health food store recently started a new ad campaign to attract more business and wants to test whether average daily sales have increased. Historically, average daily sales were approximately $2,700. After the ad campaign, the owner took a random sample of forty-five days and found that daily average sales had increased to $2,984. What is the store owner's null hypothesis? A.) M greater than or equal to 2,984 B.) M less than or equal to 2,984 C.) M = 2,984 D.) M greater than or equal to 2,700 E.) M less than or equal to 2,700 F.) M = 2,700

E.) M less than or equal to 2,700 The null hypothesis is the opposite of the hypothesis you are trying to substantiate. Since the owner wants to test for an increase, the null hypothesis is that M is less than or equal to 2,700. Remember that the null hypothesis is always based on historical information.

Based on the regression model, the expected daily production volume with 112 factory workers is 118,846 units. The human resource department noted that 123,415 units were produced on the most recent day on which there were 112 factory workers. What is the residual of this data point? A.) -4,569 units B.) -2,163 units C.) -41 units D.) 41 units E.) 2,163 units F.) 4,569 units

F.) 4,569 units The residual is equal to the historically observed value minus the regression's predicted value (ε = y - ŷ). 112 factory workers historically produced 123,415 units, whereas the regression model predicts that 112 workers would produce 118,846 units. The residual is the difference between these two values: 123,415 units - 118,846 units = 4,569 units.
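
The subtraction above can be checked with a short Python sketch. The variable names are illustrative assumptions; only the two unit counts come from the question.

```python
# Residual = observed value minus the regression's predicted value.
observed_units = 123_415   # units actually produced with 112 workers
predicted_units = 118_846  # units the regression model predicts for 112 workers

residual = observed_units - predicted_units
print(residual)  # 4569
```

A positive residual means the observed point lies above the regression line.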

How much variation in production volume can be explained by the number of factory workers? What should you look at in the regression output table?

R² (R-squared)

A store owner is interested in opening a second shop. She wants to estimate the true average daily revenue of her current shop to decide whether expanding her business is a good idea. The store owner takes a random sample of 60 days over a six-month period and finds that the mean revenue of those days is 3,472.00 dollars with variance 315,900.20 square dollars. Calculate a 95% confidence interval to estimate the true average daily revenue.

First calculate the sample standard deviation, which is equal to the square root of the variance. The sample standard deviation is $562.05. Then find the margin of error using the Excel function CONFIDENCE.NORM(alpha, standard_dev, size). Here, CONFIDENCE.NORM(0.05,SQRT(B2),60)=CONFIDENCE.NORM(0.05,562.05,60)=$142.22. The lower bound of the 95% confidence interval is the mean minus the margin of error, $3,472.00-$142.22=$3,329.78. The upper bound of the 95% confidence interval is the mean plus the margin of error, $3,472.00+$142.22=$3,614.22.
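
For readers working outside Excel, the calculation above can be sketched in Python. This is a minimal sketch that assumes the same normal (z-based) margin of error that CONFIDENCE.NORM uses; the function name `normal_confidence_interval` is hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_confidence_interval(mean, variance, n, confidence=0.95):
    """Return (lower, upper, margin) for a z-based confidence interval."""
    alpha = 1 - confidence
    # Two-sided critical value, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    std_dev = sqrt(variance)          # sample standard deviation
    margin = z * std_dev / sqrt(n)    # margin of error
    return mean - margin, mean + margin, margin


# Values from the question: mean 3,472.00, variance 315,900.20, n = 60 days.
lower, upper, margin = normal_confidence_interval(3472.00, 315900.20, 60)
print(f"{lower:.2f} to {upper:.2f} (margin of error {margin:.2f})")
```

The result should agree with the Excel answer to the cent: a margin of error of about $142.22 and an interval of roughly $3,329.78 to $3,614.22.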

MAKE SURE YOU GO THROUGH ALL THE MODULES TO UNDERSTAND THE EXCEL AND REGRESSION OUTPUT QUESTIONS


If the mean of a normally distributed population is -10 with a standard deviation of 2, what is the likelihood of obtaining a value less than or equal to -7?

To calculate the likelihood of obtaining a value less than or equal to -7, P(x≤-7), use the Excel function NORM.DIST(x, mean, standard_dev, TRUE). Here, NORM.DIST(-7,B1,B2,TRUE)=NORM.DIST(-7,-10,2,TRUE)=0.93, or 93%. Approximately 93% of the population falls in the area under the curve less than or equal to -7.
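
The same cumulative probability can be reproduced in Python as a cross-check. This sketch uses the standard library's `statistics.NormalDist` in place of Excel's NORM.DIST; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Population from the question: mean -10, standard deviation 2.
population = NormalDist(mu=-10, sigma=2)

# Cumulative probability P(x <= -7), the counterpart of NORM.DIST(..., TRUE).
probability = population.cdf(-7)
print(round(probability, 2))  # 0.93
```

Because -7 is 1.5 standard deviations above the mean, the result matches the standard normal cumulative probability at z = 1.5.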

