HBX Business Analytics Exam Preparation


Given the regression equation Selling Price = 13,490.45 + 255.36(House Size), which of the following values represents the value of Selling Price at which the regression line intersects the vertical axis?

$13,490.45. This is the y-intercept, the value at which the regression line intersects the y-axis. This happens when House Size = 0, giving the equation: Selling Price = 13,490.45 + 255.36*0 = 13,490.45.

Given the regression equation Selling Price = 13,490.45 + 255.36(House Size), which of the following values represents the value of House Size at which the regression line intersects the horizontal axis?

-52.83 square feet. The regression line intersects the horizontal axis when Selling Price = $0, that is, when House Size = -13,490.45/255.36 = -52.82914, which rounds to -52.83 square feet. Check: 13,490.45 + 255.36*(-52.83) is approximately $0.00.
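Both intercepts can be checked numerically. The following is a minimal Python sketch; the variable names are illustrative assumptions, and only the two coefficients come from the regression equation in the question.

```python
# Coefficients taken from the regression equation in the question.
intercept = 13_490.45   # Selling Price when House Size = 0
slope = 255.36          # change in Selling Price per square foot

# Vertical-axis intercept: set House Size to 0.
y_intercept = intercept + slope * 0

# Horizontal-axis intercept: solve 0 = intercept + slope * x for x.
x_intercept = -intercept / slope

print(f"y-intercept: {y_intercept:,.2f}")
print(f"x-intercept: {x_intercept:.2f} square feet")
```

A negative house size has no physical meaning; the horizontal intercept is only a property of the fitted line.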

Which of the following values most likely belongs in the Lower 95% cell for the independent variable in the output table?

-2.45 Since the p-value, 0.3956, is greater than 0.05, the linear relationship is not significant at the 95% confidence level. Therefore, the 95% confidence interval of the slope must contain zero. The confidence interval is centered around the slope of 1.78, so the lower and upper bounds must be equally distant from the slope. The Upper 95% minus the slope is 6.01-1.78=4.23, so the Lower 95% is 1.78-4.23=-2.45.
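The symmetry argument can be written out as a short calculation. This is a sketch in Python with illustrative variable names; the slope and upper bound are the values quoted in the explanation.

```python
# Values quoted in the explanation above.
slope = 1.78
upper_95 = 6.01

# The interval is centered on the slope, so both bounds are the same
# distance (the margin of error) away from it.
margin = upper_95 - slope
lower_95 = slope - margin

# If the interval contains zero, the slope is not significant at the 5% level.
contains_zero = lower_95 <= 0 <= upper_95

print(f"Lower 95%: {lower_95:.2f}, contains zero: {contains_zero}")
```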

Which of the 95% confidence intervals for a regression line's slope indicates that the linear relationship is NOT significant at the 5% level? Select all that apply.

-9.85; 5.26 and -5.26; 9.85. Each of these ranges contains zero, which indicates that the linear relationship is not significant at the 5% level.

If the two-sided p-value of a given sample mean is 0.0040, what is the one-sided p-value for that sample mean?

0.0020 The one-sided p-value is half of the two-sided p-value. Since the two-sided p-value is 0.0040, the one-sided p-value is 0.0040/2=0.0020.

Which is the best estimate of the approximate amount of variability in Win Percentage that is explained by the model?

0.6298 or 63.0% is the R-square, which indicates how much variability is accounted for by the model.

If you are performing a hypothesis test based on a 90% confidence level, what are your chances of making a type I error?

10% The probability of a type I error is equal to the significance level, which is 1-confidence level. A 90% confidence level indicates that the significance level is 10%. Therefore there is a 10% chance of making a type I error.

If a particular standardized test has a mean score of 500 and standard deviation of 100, what percentage of test-takers score between 500 and 600?

Approximately 34%. 600 is one standard deviation above the mean (600-500 = 100 = 1*stdev). We know that approximately 68% of the distribution is within 1 standard deviation of the mean. Because the distribution is symmetric, half of that, approximately 34%, falls between the mean and 1 standard deviation above the mean, that is, between 500 and 600.

If the average IQ is 100 and the standard deviation is 15, approximately what percentage of people have IQs above 130?

Approximately 2.5%. 130 is two standard deviations above the mean (130-100 = 30 = 2*15 = 2*stdev). We know that approximately 95% of the distribution is within 2 standard deviations of the mean. Therefore 5% must fall beyond 2 standard deviations, 2.5% at the top and 2.5% at the bottom, so approximately 2.5% of people have IQs above 130.
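The 68% and 95% rules of thumb used in the last two answers can be compared with exact normal probabilities. The sketch below uses Python's standard-library `statistics.NormalDist` and assumes both the test scores and the IQ scores are normally distributed.

```python
from statistics import NormalDist

# Standardized test: mean 500, standard deviation 100.
test_scores = NormalDist(mu=500, sigma=100)
p_500_to_600 = test_scores.cdf(600) - test_scores.cdf(500)

# IQ: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)
p_above_130 = 1 - iq.cdf(130)

print(f"P(500 <= score <= 600) = {p_500_to_600:.4f}")  # rule of thumb: about 34%
print(f"P(IQ > 130) = {p_above_130:.4f}")              # rule of thumb: about 2.5%
```

The exact upper-tail value is slightly below 2.5% because the 95% rule corresponds to 1.96 standard deviations rather than exactly 2.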

If the expected production volume when there are 120 workers is approximately 131,958 units, which of the following equations would provide a reasonable estimate of the 68% prediction interval for the output of those 120 workers?

131,958±14,994.93 A reasonable estimate of the prediction interval is the point forecast (131,958) plus or minus the z-value times the standard error of the regression (14,994.93). As usual, the z-value is based on the desired level of confidence. Since we want a 68% prediction interval, the z-value is equal to one. Therefore 131,958±14,994.93 is the best option.
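As a rough check, the interval bounds can be computed directly. This Python sketch follows the approximation described above (z = 1 for a 68% interval) and also shows the exact z-value for comparison; the variable names are illustrative.

```python
from statistics import NormalDist

point_forecast = 131_958
standard_error = 14_994.93   # standard error of the regression
confidence = 0.68

# The explanation rounds the z-value for 68% confidence to 1.
z = 1
lower = point_forecast - z * standard_error
upper = point_forecast + z * standard_error

# Exact two-sided z-value for 68% confidence, for comparison only.
exact_z = NormalDist().inv_cdf(0.5 + confidence / 2)

print(f"68% prediction interval: {lower:,.2f} to {upper:,.2f}")
print(f"exact z-value: {exact_z:.4f}")
```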

A sporting goods store manager wants to forecast annual sneaker revenues based on the type of sport (running, tennis, or walking), color (red, blue, white, black, or violet) and its target audience (men or women). How many independent variables should the manager include in her multiple regression analysis? Please enter your answer as an integer; that is with no decimal point.

7 Sales revenue is the dependent variable. Type of sport, color, and target audience are categorical variables which must be represented using dummy variables. Recall that it is necessary to use one fewer dummy variables than the number of options in a category. Thus, type of sport should be represented by 3-1=2 dummy variables, color should be represented by 5-1=4 dummy variables, and target audience should be represented by 2-1=1 dummy variables, for a total of 2+4+1=7 independent variables.
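The counting rule (one fewer dummy variable than the number of options in each category) can be sketched in Python as follows. The dictionary layout is an illustrative assumption; the category options are those listed in the question.

```python
# Category options as listed in the question.
categories = {
    "sport": ["running", "tennis", "walking"],
    "color": ["red", "blue", "white", "black", "violet"],
    "audience": ["men", "women"],
}

# Each categorical variable needs (number of options - 1) dummy variables.
dummy_counts = {name: len(options) - 1 for name, options in categories.items()}
total_independent_variables = sum(dummy_counts.values())

print(dummy_counts)
print(total_independent_variables)
```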

Calculate the 90% confidence interval for the true population mean based on a sample with

Because our sample has fewer than 30 cases, we cannot assume that the distribution of sample means will be normal, and must use the t-distribution. The margin of error is based on the significance level (1-confidence level, or 1-0.90=0.10), the standard deviation (in B2) and the sample size (in B3). We can compute the margin of error using the Excel function CONFIDENCE.T(0.10,B2,B3) and it is approximately 4.93. The lower bound of the 90% confidence interval is the sample mean minus the margin of error, that is B1-CONFIDENCE.T(0.10,B2,B3)=225-4.93=220.07. The upper bound of the 90% confidence interval is the sample mean plus the margin of error, that is B1+CONFIDENCE.T(0.10,B2,B3)= 225+4.93=229.93. You must link directly to cells to obtain the correct answer.

Calculate the 99% confidence interval for the true population mean for the BMI data. Recall that the new sample contains 15 people and has a mean of 25.97 kg/m2 and a standard deviation of 7.10 kg/m2.

Because our sample has fewer than 30 cases, we cannot assume that the distribution of sample means will be normal, and must use the t-distribution. The margin of error is based on the significance level (1-confidence level, or 1-0.99=0.01), the standard deviation (in B2) and the sample size (in B3). We can compute the margin of error using the Excel function CONFIDENCE.T(0.01,B2,B3). The lower bound of the 99% confidence interval is the sample mean minus the margin of error, that is B1-CONFIDENCE.T(0.01,B2,B3)= 25.97-5.46=20.51. The upper bound of the 99% confidence interval is the sample mean plus the margin of error, that is B1+CONFIDENCE.T(0.01,B2,B3)= 25.97+5.46=31.43. We can be 99% confident that the true mean BMI of all U.S. citizens is between 20.51 kg/m2 and 31.43 kg/m2.
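The margin of error returned by CONFIDENCE.T is the t critical value times s/sqrt(n). The Python sketch below reproduces the BMI interval under one labeled assumption: the critical value 2.977 (two-sided 99%, 14 degrees of freedom) is taken from a standard t table rather than computed, because the Python standard library has no t-distribution.

```python
from math import sqrt

sample_mean = 25.97   # kg/m2
sample_sd = 7.10      # kg/m2
n = 15

# Assumption: two-sided 99% t critical value for n - 1 = 14 degrees of
# freedom, looked up in a t table (not part of the original solution).
t_critical = 2.977

margin = t_critical * sample_sd / sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"margin of error: {margin:.2f}")
print(f"99% confidence interval: {lower:.2f} to {upper:.2f}")
```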

For the following scenario, determine whether it would be better to analyze cross-sectional or time series data. We want to compare the daily sales of stores in a mall during a day-long mall-wide event.

Cross-Sectional Since we are interested in the sales of different stores on a single day (a single point in time), we should analyze a cross-section of the stores in the mall.

To better assess student understanding of confidence intervals, a professor gives a test to a random sample of 15 students. The grades for those students are provided below. Calculate the 90% confidence interval for the true average grade on the test.

First, calculate the mean and standard deviation of the sample grades using the formulas =AVERAGE(A2:A16) and =STDEV.S(A2:A16) in any of the open cells. The values are approximately 76.93 and 11.73 respectively. Second, calculate the margin of error. Because the sample size is less than 30, use the function CONFIDENCE.T(alpha, standard_dev, size) to find the margin of error using the t-distribution. Here, alpha is 0.1 and the sample size is 15. For the standard deviation value, you need to reference the cell in which you calculated the standard deviation. The result is approximately CONFIDENCE.T(0.1,11.73,15) = 5.34. Alternatively, you could use the Descriptive Statistics tool to calculate the mean, standard deviation, and margin of error. To include the margin of error calculation in the Descriptive Statistics output, check the "Confidence Interval" box and adjust the confidence level to 90%. Note that the Descriptive Statistics tool uses CONFIDENCE.T by default to calculate the margin of error. The lower bound of the 90% confidence interval is the mean minus the margin of error, approximately 71.60 (computed from the unrounded mean and margin of error). The upper bound of the 90% confidence interval is the mean plus the margin of error, approximately 76.93+5.34=82.27. You must link directly to cells in all of your calculations in order to obtain the correct answer. You can also use a single formula to complete each of the calculations: =AVERAGE(A2:A16)-CONFIDENCE.T(0.1, STDEV.S(A2:A16),15) for the lower bound, and =AVERAGE(A2:A16)+CONFIDENCE.T(0.1, STDEV.S(A2:A16),15) for the upper bound.

The spreadsheet below contains data about US corn acreage planted and corn stock in storage at the beginning of the year for each year from 1976 to 2013. Create a regression model to analyze the relationship between corn acreage planted and corn stock. Be sure to include labels, residuals, and residual plots in your analysis.

From the Data menu, select Data Analysis, then select Regression. The Input Y Range is A1:A39 and the Input X Range is B1:B39. You must check the Labels box to ensure that the regression output table is appropriately labeled. You must also check the Residuals and Residual Plots boxes so that you are able to analyze the residuals.

When analyzing a residual plot, which of the following indicates that a linear model is a good fit?

Random spread of residuals around the x-axis - A linear model is a good fit if the residuals are spread randomly above and below the x-axis.

The owner of an ice cream shop wants to determine whether there is a relationship between ice cream sales and temperature. The owner collects data on temperature and sales for a random sample of 30 days and runs a regression to determine if there is a relationship between temperature (in degrees) and ice cream sales. The p-value for the two-sided hypothesis test is 0.04. How would you interpret the p-value?

If there is no relationship between temperature and sales, the chance of selecting a sample this extreme would be 4%. Correct. The null hypothesis is that there is no relationship. The p-value indicates how likely we would be to select a sample this extreme if the null hypothesis is true.

You report a confidence interval to your boss but she says that she wants a narrower range. SELECT ALL of the ways you can reduce the width of the confidence interval.

Increase the sample size: increasing the sample size provides a more accurate representation of the population and therefore reduces the width of the confidence interval. Decrease the confidence level: decreasing the confidence level reduces the width of the confidence interval.

If you are performing a hypothesis test based on a 90% confidence level, what are your chances of making a type II error?

It is not possible to tell without more information The confidence level does not provide any information about the likelihood of making a type II error. Calculating the chances of making a type II error is quite complex and beyond the scope of this course.

The linear relationship between two variables can be statistically significant but not explain a large percentage of the variation between the two variables. This would correspond to which pair of R^2 and p-value?

Low R-squared, Low p-value A low R-squared and low p-value indicates that the independent variable explains little variation in the dependent variable and the linear relationship between the two variables is significant.

Which of the following options most accurately describes the R2 value and the p-value of this relationship?

Low R2; high p-value (i.e., p-value greater than 0.05) A low R2 and high p-value indicates that the independent variable explains little variation in the dependent variable and the linear relationship is not significant. Since the data points are widely dispersed and do not indicate a clear linear pattern, this relationship likely has a low R2 and high p-value.

Which of the following statements about multicollinearity is TRUE? SELECT ALL THAT APPLY.

Multicollinearity occurs when two or more independent variables are highly correlated: multicollinearity means that two or more of the independent variables are collinear, meaning they are highly correlated. One or more of the independent variables may not be significant because the variable with which it is correlated serves as a proxy variable. Multicollinearity is usually not an issue when the regression model is only being used for forecasting: it is typically not a problem in that case, especially if the predictive power of the model is increased by the additional variable(s).

A curious student in a large economics course is interested in calculating the percentage of his classmates who scored lower than he did on the GMAT; he scored 490. He knows that GMAT scores are normally distributed and that the average score is approximately 540. He also knows that 95% of his classmates scored between 400 and 680. Based on this information, calculate the percentage of his classmates who scored lower than he did.

Since GMAT scores are normally distributed, we know that P(μ-1.96σ ≤ x ≤ μ+1.96σ) = 95%. Thus, to find the standard deviation, subtract the lower bound from the mean and divide by 1.96. The standard deviation of the distribution is (B1-B2)/1.96 = (540-400)/1.96 = 71.4. (Note that because the normal curve is symmetrical, we could calculate the same value using (B3-B1)/1.96 = (680-540)/1.96 = 71.4). To find the cumulative probability, P(x ≤ 490), use the Excel function NORM.DIST(x, mean, standard_dev, TRUE). Here, NORM.DIST(B4,B1,71.4,TRUE) = NORM.DIST(490,540,71.4,TRUE) = 0.24, or 24%. Approximately 24% of his classmates scored lower than he did.
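The two steps above (recover the standard deviation from the 95% range, then evaluate the cumulative probability) can be sketched in Python. `NormalDist.cdf` plays the role of Excel's NORM.DIST with cumulative set to TRUE; the variable names are illustrative.

```python
from statistics import NormalDist

mean_score = 540
lower_95 = 400      # lower end of the range containing 95% of classmates
student_score = 490

# z-value that leaves 2.5% in each tail (approximately 1.96).
z_95 = NormalDist().inv_cdf(0.975)

# The lower bound sits z_95 standard deviations below the mean.
sigma = (mean_score - lower_95) / z_95

# Cumulative probability P(x <= 490).
p_below = NormalDist(mu=mean_score, sigma=sigma).cdf(student_score)

print(f"standard deviation: {sigma:.1f}")
print(f"share of classmates scoring lower: {p_below:.0%}")
```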

Use the descriptive statistics tool to calculate the summary statistics for the data set provided below. Use C1 as the output range so that your response is graded correctly and include a label for the data.

The Input Range is A1:A13. You must check the Labels in first row box since we included a label in cell A1 to ensure that the output table is appropriately labeled, and you must select Summary Statistics to produce the output table. You need to use C1 as your output range so that your calculations are in the blue cells for grading.

Below are data showing students' grades on a statistics quiz and the number of hours they spent studying. Create a scatterplot to show the relationship.

The Input Y Range is Grade B1:B25 and the Input X Range is Hours Studied A1:A25. You must check the Labels in first row box since we included labels in cells A1 and B1 to ensure that the scatter plot's axes are appropriately labeled.

Based on the regression output, what proportion of the variability in revenue can be accounted for by whether the Red Sox are playing away? Enter the value of the percentage with exactly ONE digit to the right of the decimal place. See the drop bar if you need more detail on how to round your answer.

The R Square value measures how much of the total variation in the dependent variable (in this case, revenue) is explained by the independent variable (in this case, away game). As shown in the regression output, the R-square value is 0.2252, or approximately 22.5%.

For a normal distribution with mean 100 and standard deviation 10, find the probability of obtaining a value less than or equal to 118.

The cumulative probability associated with the value 118 is NORM.DIST(118,B1,B2,TRUE)=0.96, or 96%. Approximately 96% of the population has values less than or equal to 118. Note that because the normal distribution is continuous, the probability of an outcome being equal to a single, discrete value (such as 118) is 0. Thus the probability of obtaining a value less than 118 is equivalent to the probability of obtaining a value less than or equal to 118. You must link directly to cells to obtain the correct answer.
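For readers working outside Excel, the same cumulative probability can be obtained in Python. This is a sketch in which `norm_dist` is a hypothetical helper that mirrors NORM.DIST(x, mean, standard_dev, TRUE); it is not an Excel or library function.

```python
from statistics import NormalDist

def norm_dist(x, mean, standard_dev):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return NormalDist(mu=mean, sigma=standard_dev).cdf(x)

p = norm_dist(118, 100, 10)
print(f"P(x <= 118) = {p:.2f}")
```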

Given the general regression equation, ŷ = a + bx, which of the following describes ŷ? Select all that apply.

The expected value of y; the dependent variable; the value we are trying to predict.

Consider the four outliers in the 2012 revenue data: companies with revenue of $237 billion, $246 billion, $447 billion, and $453 billion. If we removed these companies from the data set, what would happen to the standard deviation?

The standard deviation would decrease. The standard deviation gives more weight to observations that are further from the mean. Therefore, removing the outliers would decrease the standard deviation.

A journalist wants to determine the average annual salary of CEOs in the S&P 1,500. He does not have time to survey all 1,500 CEOs but wants to be 95% confident that his estimate is within $50,000 of the true mean. The journalist takes a preliminary sample and estimates that the standard deviation is approximately $449,300. What is the minimum number of CEOs that the journalist must survey to be within $50,000 of the true average annual salary? Remember that the z-value associated with a 95% confidence interval is 1.96. (Please enter your answer as an integer; that is, as a whole number with no decimal point.)

The formula for calculating the minimum required sample size is n ≥ (z*(s/M))^2, where M = $50,000 is the desired margin of error for the confidence interval, s = $449,300 is the sample standard deviation, and z = 1.96. Using these data we find that (1.96*(449,300/50,000))^2 = 310.20. Since n must be an integer (let's not even think of what 0.20 CEOs would look like!) and n must be greater than or equal to 310.20, we must round up to 311. Ordinary rounding would take 310.20 down to 310, but here we need the smallest integer that satisfies the inequality. Therefore, the minimum required sample size is 311.
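The round-up step is easy to get wrong, so here is the calculation as a short Python sketch. `math.ceil` returns the smallest integer that is greater than or equal to its argument; the variable names are illustrative.

```python
import math

z = 1.96                 # z-value for 95% confidence
s = 449_300              # preliminary estimate of the standard deviation ($)
margin_of_error = 50_000 # desired margin of error ($)

# Minimum sample size: n >= (z * s / M)^2.
n_exact = (z * s / margin_of_error) ** 2

# Round up, because rounding down would violate the inequality.
n_min = math.ceil(n_exact)

print(f"n >= {n_exact:.2f}, so survey at least {n_min} CEOs")
```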

Calculate the 80% confidence interval for the true population mean based on a sample with x¯=225, s=8.5, and n=45.

The margin of error is based on the significance level (1-confidence level, in this case, 100%-80%=20%), the standard deviation (in cell B2) and the sample size (in cell B3). We can compute the margin of error using the Excel function CONFIDENCE.NORM(0.20,B2,B3)=1.62. The lower bound of the 80% confidence interval is the sample mean minus the margin of error, that is B1-CONFIDENCE.NORM(0.20,B2,B3)=225-1.62=223.38. The upper bound of the 80% confidence interval is the sample mean plus the margin of error, that is B1+CONFIDENCE.NORM(0.20,B2,B3)= 225+1.62=226.62.
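CONFIDENCE.NORM computes the z-value for the given significance level times s/sqrt(n). The Python sketch below reproduces the 80% interval; `confidence_norm` is a hypothetical helper written to mirror the Excel function's arguments.

```python
from math import sqrt
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Two-sided margin of error based on the normal distribution."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)

sample_mean, sample_sd, n = 225, 8.5, 45

margin = confidence_norm(0.20, sample_sd, n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"margin of error: {margin:.2f}")
print(f"80% confidence interval: {lower:.2f} to {upper:.2f}")
```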

Calculate the 95% confidence interval for the true population mean based on a sample with mean x¯=15, standard deviation s=2, and sample size n=100.

The margin of error is based on the significance level (1-confidence level, or 1-0.95=0.05), the standard deviation (in B2) and the sample size (in B3). We can compute the margin of error using the Excel function CONFIDENCE.NORM(0.05,B2,B3). The lower bound of the 95% confidence interval is the sample mean minus the margin of error, that is B1-CONFIDENCE.NORM(0.05,B2,B3)=15-0.39=14.61. The upper bound of the 95% confidence interval is the sample mean plus the margin of error, that is B1+CONFIDENCE.NORM(0.05,B2,B3)=15+0.39=15.39.

An airport shuttle company forecasts the number of hours its drivers will work based on the distance to be driven (in miles) and the number of jobs (each job requires the pickup and drop-off of one set of passengers) using the following regression equation: Travel time=-0.60+0.05(distance)+0.75(number of jobs) On a given day, Victor and Sofia drive approximately the same distance but Sofia has two more jobs than Victor. If Victor worked for 4 hours, for how long can the company expect Sofia to work?

The only difference between the workloads of the two drivers is the number of jobs each has; Sofia has two additional jobs. Therefore the company can expect Sofia to work the four hours Victor worked, plus an additional 0.75 hours for each of the two additional jobs, that is, 4+0.75(2)=5.5 hours.
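The reasoning can be confirmed by evaluating the regression equation. In the Python sketch below, the distance of 62 miles and Victor's 2 jobs are hypothetical values chosen only because they give Victor about 4 hours; the question does not state them.

```python
def travel_time(distance, jobs):
    """Regression equation from the question (hours)."""
    return -0.60 + 0.05 * distance + 0.75 * jobs

victor_hours = 4.0
extra_jobs = 2

# Same distance, so only the jobs coefficient contributes to the difference.
sofia_hours = victor_hours + 0.75 * extra_jobs

# Hypothetical inputs: 62 miles and 2 jobs give Victor about 4 hours.
distance, victor_jobs = 62, 2
difference = (travel_time(distance, victor_jobs + extra_jobs)
              - travel_time(distance, victor_jobs))

print(f"Sofia: {sofia_hours} hours (difference of {difference:.2f} hours)")
```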

A business school professor is interested to know if watching a video about the Central Limit Theorem helps students understand it. To assess this, the professor tests students' knowledge both immediately before they watch the video and immediately after. The professor takes a sample of students, and for each one compares their test score after the video to their score before the video. Using the data below, calculate the p-value for the following hypothesis test: H0: μ_after ≤ μ_before; Ha: μ_after > μ_before

The p-value of the one-sided hypothesis test is T.TEST(array1, array2, tails, type)=T.TEST(B2:B31,C2:C31,1,1), which is approximately 0.0128. You must designate this test as a one-sided test (that is, assign the value 1 to the tails argument) and as a type 1 (a paired test) because you are testing the same students on the same knowledge at two points in time. You must link directly to values in order to obtain the correct answer.

The census data below shows the number of building permits (in thousands) awarded monthly in each of the four major regions of the United States. The dataset includes a "Region" variable indicating the name of the region. Create a dummy variable called Midwest. Assign a 1 to observations from the Midwest and a 0 to all other observations.

There should be a 1 or 0 in cells D2:D89. In cell D2, enter the function =IF(B2="Midwest", 1,0), then copy and paste this function in cells D3:D89.
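The same dummy coding can be expressed outside Excel. The Python sketch below applies the logic of =IF(B2="Midwest", 1, 0) to a short, hypothetical list of region labels (not the 88 observations from the exercise).

```python
# Hypothetical sample of the "Region" column.
regions = ["Northeast", "Midwest", "South", "West", "Midwest"]

# 1 for Midwest observations, 0 for all other regions.
midwest = [1 if region == "Midwest" else 0 for region in regions]

print(midwest)
```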

The data below show the number of hours 60 fifth-grade students reported reading last week and each student's gender. Use the AVERAGEIF function to calculate the average number of hours spent reading last week for boys, and the average number of hours spent reading last week for girls.

This is a conditional mean, so you can either use AVERAGEIF(B2:B61,"Boys",A2:A61)=4.48 and AVERAGEIF(B2:B61,"Girls",A2:A61)=5.55 or AVERAGEIF(B2:B61,D2,A2:A61)=4.48 and AVERAGEIF(B2:B61,D3,A2:A61)=5.55. You could also just sort the data by gender and compute the averages of each gender, but we want you to learn how to do conditional averages in Excel. As always, it is important that you link to the cells with the data.
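A conditional mean can be sketched in Python as follows. `average_if` is a hypothetical helper that takes its arguments in the same order as Excel's AVERAGEIF(range, criteria, average_range), and the four-row data set is made up for illustration.

```python
from statistics import mean

def average_if(criteria_range, criterion, average_range):
    """Mean of the average_range values whose paired label equals criterion."""
    return mean(value
                for label, value in zip(criteria_range, average_range)
                if label == criterion)

# Hypothetical miniature data set, not the 60 students from the exercise.
gender = ["Boys", "Girls", "Boys", "Girls"]
hours = [4, 6, 5, 5]

boys_avg = average_if(gender, "Boys", hours)
girls_avg = average_if(gender, "Girls", hours)

print(boys_avg, girls_avg)
```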

For the following scenario, determine whether it would be better to analyze cross-sectional or time series data. We want to see if the Red Sox performance changes over the course of the baseball season.

Time Series Since we are interested in comparing the Red Sox performance at different points in time during the baseball season, we should analyze time series data.

If the mean of a normally distributed population is -10 with a standard deviation of 2, what is the likelihood of obtaining a value less than or equal to -7?

To calculate the likelihood of obtaining a value less than or equal to -7, P(x≤-7), use the Excel function NORM.DIST(x, mean, standard_dev, TRUE). Here, NORM.DIST(-7,B1,B2,TRUE)=NORM.DIST(-7,-10,2,TRUE)=0.93, or 93%. Approximately 93% of the population falls in the area under the curve less than or equal to -7.

The owner of a local health food store recently started a new ad campaign to attract more business and wants to test whether average daily sales have increased. Historically average daily sales were approximately $2,700. After the ad campaign, the owner took another random sample of forty-five days and found that average daily sales were $2,984 with a standard deviation of approximately $585. Calculate the upper bound of the 95% range of likely sample means for this one-sided hypothesis test using the CONFIDENCE.NORM function.

To construct a 95% range of likely sample means, calculate the margin of error using the function CONFIDENCE.NORM(alpha, standard_dev, size). However, CONFIDENCE.NORM finds the margin of error for a two-sided hypothesis test and this question asks for the upper bound of a one-sided test. To find the upper bound for the one-sided test you must first determine what two-sided test would have a 5% rejection region on the right side. Since the distribution of sample means is symmetric, a two-sided test with a 10% significance level would have a 5% rejection region on the left side of the normal distribution and a 5% rejection region on the right side. Thus, the upper bound for a two-sided test with alpha=0.1 will be the same as the upper bound on a one-sided test with alpha=0.05. The margin of error is CONFIDENCE.NORM(alpha, standard_dev, size)= CONFIDENCE.NORM(0.1,C3,C4)=CONFIDENCE.NORM(0.1,585,45)=$143.44. The upper bound of the 95% range of likely sample means for this one-sided hypothesis test is the population mean plus the margin of error, which is approximately $2,700+$143.44=$2,843.44.
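The equivalence between a one-sided 5% test and a two-sided 10% test can be checked numerically. In the Python sketch below, `confidence_norm` is a hypothetical helper mirroring Excel's CONFIDENCE.NORM, and the one-sided z-value is computed separately to show that the two margins agree.

```python
from math import sqrt
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Two-sided margin of error based on the normal distribution."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)

historical_mean = 2_700
sample_sd = 585
n = 45

# Two-sided alpha of 0.10 leaves 5% in the right tail.
margin = confidence_norm(0.10, sample_sd, n)

# Direct one-sided calculation with 5% in the right tail.
one_sided_margin = NormalDist().inv_cdf(0.95) * sample_sd / sqrt(n)

upper_bound = historical_mean + margin

print(f"margin of error: ${margin:.2f}")
print(f"upper bound: ${upper_bound:,.2f}")
```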

The manager of a direct mailing marketing firm wants to analyze whether, and if so how, the number of pieces of mail sent to households about a credit card promotion affects the number of credit card applications the firm receives. The manager collects data on the number of pieces of mail the firm has sent and number of applications received. Based on industry knowledge, the manager knows that customers don't necessarily apply for the credit card in the week they receive the mailing, but believes that some may apply in the week following the initial mailing. The manager wishes to perform a regression analysis using two independent variables: the number of pieces of mail sent during a specified week, and the number sent the week preceding the specified week. Create a new lagged variable in the highlighted column: "Pieces of Mail Sent the Preceding Week" to allow for the appropriate regression analysis.

To create the lagged variable, copy the data from C2:C32 and paste it into D3:D33. For example, 93,460 should be in cell D3; 85,220 should be in cell D4, and so on. Cell D2 should be blank and there is a new row of data that has an entry only in cell D33.
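Shifting a column down by one row is the whole operation. The Python sketch below shows it on a three-week list: the first two values are the ones quoted above, and the third is a hypothetical placeholder. Unlike the Excel copy-and-paste, this version does not create the extra trailing row.

```python
# Pieces of mail sent per week; the third value is hypothetical.
mail_sent = [93_460, 85_220, 101_300]

# Lagged variable: each week is paired with the preceding week's value.
# The first week has no preceding week, so it is left empty (None).
mail_sent_prev_week = [None] + mail_sent[:-1]

# Rows usable in the regression must have both variables present.
rows = list(zip(mail_sent, mail_sent_prev_week))
usable_rows = [row for row in rows if row[1] is not None]

print(mail_sent_prev_week)
print(usable_rows)
```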

IQ scores are known to be normally distributed. The mean IQ score is 100 and the standard deviation is 15. What percent of the population has an IQ over 115?

To find P(x>115), the percent of the population that has an IQ over 115, first compute the cumulative probability, P(x≤115), using the Excel function NORM.DIST(x, mean, standard_dev, TRUE). Here NORM.DIST(115,B1,B2,TRUE)=NORM.DIST(115,100,15,TRUE)=0.84, or 84%. Thus, P(x>115)=1-P(x≤115)=1-0.84=0.16, or 16%.

A streaming music site changed its format to focus on previously unreleased music from rising artists. The site manager now wants to determine whether the number of unique listeners per day has changed. Before the change in format, the site averaged 131,520 unique listeners per day. Now, beginning three months after the format change, the site manager takes a random sample of 30 days and finds that the site now has an average of 124,247 unique listeners per day. Using the data provided below, calculate the p-value for the following hypothesis test: H0: μ = 131,520; Ha: μ ≠ 131,520

To use Excel's T.TEST function for a hypothesis test with one sample, you must create a second column of data that will act as a second sample. So, first enter the historical average unique listeners into each cell in column B associated with a day in the sample; that is, enter 131,520 into cells B2 to B31. Then, the p-value of the two-sided hypothesis test is T.TEST(array1, array2, tails, type)=T.TEST(A2:A31,B2:B31,2,3), which is approximately 0.0743. You must link directly to values in order to obtain the correct answer.

IQ scores are known to be normally distributed. The mean IQ score is 100 and the standard deviation is 15. The top 25% of the population (ranked by IQ score) have IQ's above what value?

Use the properties of the normal distribution to solve this problem. Since you are only interested in the top 25%, calculate the IQ at which 75% of people are below. The Excel function NORM.INV(probability, mean, standard_dev) returns the inverse of a normal cumulative distribution function. Here, NORM.INV(0.75,B1,B2)=NORM.INV(0.75,100,15)=110 indicates that 75% of people have IQ's lower than 110. Hence, 25% of people have IQ's greater than 110.
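The inverse lookup can be reproduced in Python, where `NormalDist.inv_cdf` corresponds to Excel's NORM.INV. This is a sketch with illustrative variable names.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# The top 25% starts where 75% of the population lies below.
cutoff = iq.inv_cdf(0.75)

print(f"IQ cutoff for the top 25%: {cutoff:.1f}")
```

The unrounded cutoff is slightly above 110, which the answer rounds to 110.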

A p-value to test the significance of a linear relationship between two variables was calculated to be 0.0210. What can we conclude? Select all that apply.

We can be 90% confident that there is a significant linear relationship between the two variables: since the p-value, 0.0210, is less than 1-0.90=0.10, the relationship is significant at the 90% confidence level. We can be 95% confident that there is a significant linear relationship between the two variables: since the p-value, 0.0210, is less than 1-0.95=0.05, the relationship is also significant at the 95% confidence level.

Is the relationship between Red Sox away games and average daily revenues significant at the 95% confidence level? Choose the correct answer with the corresponding correct reasoning.

Yes, because the p-value of the independent variable is less than 0.05 - Since the p-value, 0.0005, is less than 0.05, we can be confident that the relationship is significant at the 5% significance level and, equivalently, at the 95% confidence level.

Recall that the z-value associated with a value measures the number of standard deviations the value is from the mean. If a particular standardized test has an average score of 500 and a standard deviation of 100, what z-value corresponds to a score of 350?

z=(x-µ)/σ. Here x= 350, µ=500, the population mean, and σ=100, the population standard deviation. Thus z = (350-500)/100 = (-150)/100 = -1.5

A manager of a factory wants to know if a new quality check protocol has decreased the number of units a worker produces in a day. Before the new protocol, a worker could produce 27 units per day. What alternative hypothesis should the manager use to test this claim?

µ < 27 units. The manager wants to know if the new quality check protocol has decreased the average number of units a worker can produce per day. For a one-sided test, the manager should use the alternative hypothesis Ha: μ<27 units. This is the claim the manager wishes to substantiate.

A manager of a factory wants to know if a new quality check protocol has decreased the number of units a worker produces in a day. Before the new protocol, a worker could produce 27 units per day. What null hypothesis should the manager use to test this claim?

µ ≥ 27 units This is the null hypothesis. Remember that the null and alternative hypotheses are opposites.

A manager of a factory wants to know if the average number of workplace accidents is different for workers who attended an equipment safety training compared to those who did not attend. What alternative hypothesis should the manager use to test this claim?

µ_attended ≠ µ_did not attend. The manager has reason to believe that the training has changed the average number of workplace accidents between the two groups of workers. For a two-sided test, the manager should use the alternative hypothesis Ha: µ_attended ≠ µ_did not attend. This is the claim the manager wishes to substantiate.

