Quantitative Methods: CH6 & CH7 Quizzes
The _____ is a measure of the error that results from using the estimated regression equation to predict the values of the dependent variable in the sample. a. sum of squares due to error (SSE) b. error term c. residual d. sum of squares due to regression (SSR)
a. sum of squares due to error (SSE) The value of sum of squares due to error is a measure of the error in using the estimated regression equation to predict the values of the dependent variable in the sample. The SSR measures how much the predicted y values on the estimated regression line deviate from the mean y value.
A pizza shop advertises that they deliver in 30 minutes or less or it is free. People who live in homes that are located on the opposite side of town believe it will take the pizza shop longer than 30 minutes to make and deliver the pizza. A random sample of 50 deliveries to homes across town was taken and the mean time was computed to be 32 minutes. What is the appropriate symbol to represent the value, 32? a. x̄ = 32 b. μ = 32 c. n = 32 d. p̂ = 32
a. x̄ = 32 The value 32 is the sample mean. The appropriate symbol is x̄ = 32.
The least squares regression line minimizes the sum of the _____. a. absolute deviations between actual and predicted x values b. squared differences between actual and predicted y values c. differences between actual and predicted y values d. absolute deviations between actual and predicted y values
b. squared differences between actual and predicted y values The least squares regression line minimizes the sum of the squared differences between actual and predicted y values.
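A minimal sketch of the least squares idea, assuming a small made-up sample (the data and the helper names `least_squares` and `sse` are illustrative, not from the quiz). It computes the fitted intercept and slope with the usual closed-form formulas and checks that nudging either coefficient does not reduce the sum of squared differences.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the line minimizing the sum of squared y differences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


def sse(xs, ys, b0, b1):
    """Sum of squared differences between actual and predicted y values."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares(xs, ys)
best = sse(xs, ys, b0, b1)

# Any other line should do no better than the least squares line.
worse_slope = sse(xs, ys, b0, b1 + 0.1)
worse_intercept = sse(xs, ys, b0 + 0.1, b1)
```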
The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is known as the _____. a. error term b. constant term c. residual d. model parameter
c. residual The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is known as the residual.
A regression analysis involving one independent variable and one dependent variable is referred to as a _____. a. data mining b. factor analysis c. simple linear regression d. time series analysis
c. simple linear regression A regression analysis involving one independent variable and one dependent variable is referred to as a simple linear regression.
In a simple linear regression model, y = β0 + β1x + ε, the parameter β1 represents the _____. a. mean value of x b. intercept c. slope of the true regression line d. error term
c. slope of the true regression line β0, read "beta zero," is the intercept parameter; and β1, read "beta one" is the parameter that represents the slope of the true regression line.
A procedure for using sample data to find the estimated regression equation is _____. a. point estimation b. interval estimation c. the least squares method d. extrapolation
c. the least squares method The least squares method is a procedure for using sample data to find the estimated regression equation.
If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to _____. a. have low variability b. have high precision c. be a random estimator of the population parameter d. be an unbiased estimator of the population parameter
d. be an unbiased estimator of the population parameter If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to be an unbiased estimator of the population parameter.
The purpose of statistical inference is to make estimates or draw conclusions about a _____. a. mean of the sample based upon the mean of the population b. sample based upon information obtained from the population c. statistic based upon information obtained from the population d. population based upon information obtained from the sample
d. population based upon information obtained from the sample When we make estimates of or draw conclusions about one or more characteristics of a population based upon the sample, we are using the process of statistical inference.
The value of the _____ is used to estimate the value of the population parameter. a. sample parameter b. population estimate c. population statistic d. sample statistic
d. sample statistic To estimate the value of a population parameter, we compute a corresponding characteristic of the sample referred to as a sample statistic.
The graph of the simple linear regression equation is a(n) _____. a. hyperbola b. parabola c. ellipse d. straight line
d. straight line The graph of the simple linear regression equation is a straight line.
What would be the coefficient of determination if the total sum of squares (SST) is 23.29 and the sum of squares due to regression (SSR) is 10.03? a. 0.43 b. 2.32 c. 0.19 d. 0.89
a. 0.43 The coefficient of determination r^2 = SSR/SST. Substituting the given values, we get r^2 = 0.43.
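The arithmetic above can be checked in a couple of lines. This is only a sketch using the two values given in the question; the variable names are illustrative.

```python
sst = 23.29  # total sum of squares, from the question
ssr = 10.03  # sum of squares due to regression, from the question

# Coefficient of determination: share of total variability explained.
r_squared = ssr / sst
r_squared_rounded = round(r_squared, 2)
```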
A statistics teacher started class one day by drawing the names of 10 students out of a hat and asked them to do as many pushups as they could. The 10 randomly selected students averaged 15 pushups per person with a standard deviation of 9 pushups. Suppose the distribution of the population of number of pushups that can be done is approximately normal. If we would like to capture the population mean with 95% confidence the margin of error would be _____. a. 2.262(9/√10) b. 1.960(9/√10) c. 2.125(15/√15) d. 3.250(15/√15)
a. 2.262(9/√10) The margin of error would be 2.262(9/√10). The critical value for 95% confidence with 9 degrees of freedom is 2.262 and the sample standard deviation is 9.
A statistics teacher started class one day by drawing the names of 10 students out of a hat and asked them to do as many pushups as they could. The 10 randomly selected students averaged 15 pushups per person with a standard deviation of 9 pushups. Suppose the distribution of the population of number of pushups that can be done is approximately normal. The 95% confidence interval for the true mean number of pushups that can be done is _____. a. 8.56 to 21.40 (note: the upper limit should read 21.44) b. 11.31 to 18.55 c. 13.02 to 16.98 d. 5.75 to 24.25
a. 8.56 to 21.40 (note: the upper limit should read 21.44) The 95% confidence interval for the true mean number of pushups that can be done is 15 ± 2.262(9/√10) = 15 ± 6.44, or 8.56 to 21.44.
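A short sketch of the pushup interval, assuming the critical value 2.262 quoted in the quiz for 9 degrees of freedom (it is hard-coded here rather than looked up). It also reproduces the standard error 9/√10 used in the related questions.

```python
import math

n = 10          # sample size
x_bar = 15      # sample mean
s = 9           # sample standard deviation
t_crit = 2.262  # t value for 95% confidence with n - 1 = 9 df, as given in the quiz

standard_error = s / math.sqrt(n)   # s / sqrt(n)
margin = t_crit * standard_error    # margin of error
lower = x_bar - margin
upper = x_bar + margin
```

Rounded to two decimals, the limits come out as 8.56 and 21.44, which supports the correction noted on the answer choice.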
_____ is used to test the hypothesis that the values of the regression parameters β1, β2, ..., βq are all zero. a. An F test b. A t test c. The least squares method d. Extrapolation
a. An F test An F test is used to test the hypothesis that the values of the regression parameters β1, β2, ..., βq are all zero.
A Type I error is committed when _____. a. a true null hypothesis is rejected b. the validity of a claim was rejected c. a true alternative hypothesis is not accepted d. the critical value is greater than the value of the test statistic
a. a true null hypothesis is rejected If we reject H0 when H0 is true, we have made a Type I error. Said another way, a Type I error is committed when a true null hypothesis is rejected.
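One way to see what a Type I error rate means is a small simulation. The sketch below assumes a two-tailed z test with known σ and a true null hypothesis; the function name, sample size, trial count, and seed are all illustrative choices, not part of the quiz. Because H0 is true in every trial, each rejection is a Type I error, and the rejection rate should land near α.

```python
import math
import random
from statistics import NormalDist


def type_one_error_rate(trials=4000, n=30, mu0=0.0, sigma=1.0, alpha=0.05, seed=1):
    """Fraction of simulated samples that reject a true H0: mu = mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials


rate = type_one_error_rate()
```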
As the number of degrees of freedom for a t distribution increases, the difference between the t distribution and the standard normal distribution _____. a. becomes smaller b. fluctuates c. becomes larger d. stays the same
a. becomes smaller As the number of degrees of freedom for a t distribution increases, the difference between the t distribution and the standard normal distribution becomes smaller.
The _____ is a measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation. a. coefficient of determination b. residual c. dummy variable d. interaction variable
a. coefficient of determination The coefficient of determination is a measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation.
A one-tailed test is a hypothesis test in which the rejection region is _____. a. in one tail of the sampling distribution b. only in the lower tail of the sampling distribution c. in both tails of the sampling distribution d. only in the upper tail of the sampling distribution
a. in one tail of the sampling distribution A one-tailed test is a hypothesis test in which the rejection region is in one tail of the sampling distribution. For a one-tailed test, a conclusion may be drawn using either the p-value method or a rejection region. If a rejection region is used, it falls in only one tail of the curve.
A null and alternative hypothesis for a one proportion z test are given as H0: p ≥ 0.8, Ha: p < 0.8. This hypothesis test is _____ a. lower-tailed b. upper-tailed c. two-tailed d. incorrectly stated.
a. lower-tailed If the alternative hypothesis contains the not-equal-to symbol, the test is two-tailed. If it contains the less-than symbol, the test is lower-tailed (left-tailed). If it contains the greater-than symbol, the test is upper-tailed (right-tailed). Here Ha: p < 0.8, so the test is lower-tailed.
A simple random sample of 31 observations was taken from a large population. The sample mean equals 5. Five is a _____. a. point estimate b. population mean c. standard error d. population parameter
a. point estimate A point estimate is the value of a point estimator used in a particular instance, such as an estimate of a population parameter.
In a simple linear regression analysis the quantity that gives the amount by which the dependent variable changes for a unit change in the independent variable is called the _____. a. slope of the regression line b. correlation coefficient c. standard error d. coefficient of determination
a. slope of the regression line In a simple linear regression analysis, the slope of the regression line gives the amount by which the dependent variable changes for a unit change in the independent variable.
In the graph of the simple linear regression equation, the parameter β0 represents the _____ of the true regression line. a. y-intercept b. x-intercept c. end-point d. slope
a. y-intercept In the graph of the simple linear regression equation, the parameter β0 represents the y-intercept of the true regression line.
A parameter is a numerical measure from a population, such as _____. a. μ b. x̄ c. p̄ d. s
a. μ An example of a parameter from a population is a population mean, μ.
The population parameters that describe the y-intercept and slope of the line relating y and x, respectively, are _____. a. a and b b. B0 and B1 c. y and x d. a and B
b. B0 and B1 The population parameters that describe the y-intercept and slope of the line relating y and x, respectively, are B0 and B1.
The average number of hours for a random sample of mail order pharmacists from company A was 50.1 hours last year. It is believed that changes to medical insurance have led to a reduction in the average work week. To test the validity of this belief, the hypotheses are _____. a. H0: μ > 50.1, Ha: μ < 50.1 b. H0: μ ≥ 50.1, Ha: μ < 50.1 c. H0: μ ≤ 50.1, Ha: μ > 50.1 d. H0: μ = 50.1, Ha: μ = 50.1
b. H0: μ ≥ 50.1, Ha: μ < 50.1 The assumption to be challenged is last year's average. The alternative hypothesis reflects the belief that the average work week has been reduced, so Ha: μ < 50.1. The null hypothesis is the complement of the alternative hypothesis: H0: μ ≥ 50.1.
_____ refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable. a. Autocorrelation b. Interaction c. Multicollinearity d. Covariance
b. Interaction Interaction refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
What are the two decisions that you can make from performing a hypothesis test? a. Accept the null hypothesis; Accept the alternative hypothesis b. Reject the null hypothesis; Fail to reject the null hypothesis c. Reject the alternative hypothesis; Accept the null hypothesis d. Make a Type I error; Make a Type II error
b. Reject the null hypothesis; Fail to reject the null hypothesis When we draw a conclusion, we either have enough evidence to reject the null hypothesis or we do not have enough evidence to reject the null hypothesis.
A sample of 37 AA batteries had a mean lifetime of 584 hours. A 95% confidence interval for the population mean was 579.2 < μ < 588.8. Which statement is the correct interpretation of the results? a. The probability that the population mean is between 579.2 hours and 588.8 hours is 0.95. b. We are 95% confident that the mean lifetime of all the bulbs in the population is between 579.2 hours and 588.8 hours. c. 95% of the light bulbs in the sample had lifetimes between 579.2 hours and 588.8 hours. d. None of these statements correctly interpret the results.
b. We are 95% confident that the mean lifetime of all the bulbs in the population is between 579.2 hours and 588.8 hours. We are 95% confident that the mean lifetime of all the bulbs in the population is between 579.2 hours and 588.8 hours.
A variable used to model the effect of categorical independent variables in a regression model which generally takes only the value zero or one is called _____. a. interaction b. a dummy variable c. a residual d. the coefficient of determination
b. a dummy variable A dummy variable is a variable used to model the effect of categorical independent variables in a regression model.
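A minimal sketch of dummy coding, assuming a made-up categorical variable (the region labels and the helper `dummy_encode` are hypothetical). It follows the common convention of leaving one category out as the baseline, so a variable with k categories yields k - 1 zero/one columns.

```python
def dummy_encode(values, baseline):
    """Build one 0/1 column per category, omitting the baseline category."""
    levels = sorted(set(values) - {baseline})
    return [
        {f"is_{level}": int(value == level) for level in levels}
        for value in values
    ]


# Hypothetical categorical observations; "north" is treated as the baseline.
regions = ["north", "south", "east", "north"]
rows = dummy_encode(regions, baseline="north")
```

A baseline observation gets zeros in every dummy column, so its effect is absorbed into the intercept of the regression model.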
In a linear regression model, the variable (or variables) used for predicting or explaining values of the response variable are known as the _____. It(they) is(are) denoted by x. a. dependent variable b. independent variable c. regression variable d. residual variable
b. independent variable The independent variable(s) is the variable (or variables) used for predicting or explaining values of the response variable. It(they) is(are) denoted by x.
Regression analysis involving one dependent variable and more than one independent variable is known as _____. a. simple regression b. multiple regression c. linear regression d. None of these choices are correct.
b. multiple regression Multiple regression is regression analysis involving one dependent variable and more than one independent variable.
A _____ is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables. a. Gantt chart b. scatter chart c. contingency table d. pie chart
b. scatter chart A scatter chart is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables.
In the graph of the simple linear regression equation, the parameter β1 is the _____ of the true regression line. a. x-intercept b. slope c. end-point d. y-intercept
b. slope In the graph of the simple linear regression equation, the parameter β1 is the slope of the true regression line.
In a random sample of 400 registered voters, 120 indicated they plan to vote for Trump for President. Determine a 95% confidence interval for the proportion of all the registered voters who will vote for Trump. a. (0.27, 0.32) b. (0.29, 0.30) c. (0.25, 0.34) d. Cannot be determined from the information given.
c. (0.25, 0.34) Use the formula p̂ ± 1.96 * √ ( p̂(1 - p̂) / n ) with p̂ = 120/400 = 0.3. Filling in the given values yields 0.3 ± 1.96 * √ ( 0.3(1 - 0.3) / 400 ) = 0.3 ± 0.045, so the 95% confidence interval runs from about 0.255 to 0.345, which corresponds to choice c.
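The interval formula for a proportion can be sketched as a small function. The name `proportion_ci` is illustrative, and the z value is taken from the standard normal distribution in Python's standard library instead of being typed in as 1.96.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                               # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)     # margin of error
    return p_hat - margin, p_hat + margin


# Voter example from the question: 120 of 400 sampled voters.
lower, upper = proportion_ci(120, 400)
```

The same function applies to any count and sample size, for example `proportion_ci(53, 150)` for a sample in which 53 of 150 respondents have the characteristic of interest.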
What would be the value of the sum of squares due to regression (SSR) if the total sum of squares (SST) is 25.32 and the sum of squares due to error (SSE) is 6.89? a. 15.32 b. 19.32 c. 18.43 d. 31.89
c. 18.43 The three quantities are related as SST = SSR + SSE. Substituting the values, we get SSR=18.43.
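The identity SST = SSR + SSE can be verified numerically. The sketch below assumes a small made-up data set (not from the quiz), fits the least squares line, and computes the three sums of squares separately.

```python
# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)                   # total variability
ssr = sum((p - y_bar) ** 2 for p in predicted)            # explained by the line
sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))    # left unexplained
```

For this data SST is 6, SSR is 3.6, and SSE is 2.4, so the decomposition holds up to floating-point rounding.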
A statistics teacher started class one day by drawing the names of 10 students out of a hat and asked them to do as many pushups as they could. The 10 randomly selected students averaged 15 pushups per person with a standard deviation of 9 pushups. Suppose the distribution of the population of number of pushups that can be done is approximately normal. What is the standard error of the mean? a. 3.061 b. 4.743 c. 2.876 (note: should read 2.846) d. 0.900
c. 2.876 (note: should read 2.846) The standard error of the mean for a t distribution is s/√n. For this problem, the standard error of the mean = 9/√10 = 2.846.
The owners of a fast food restaurant have automatic drink dispensers to help fill orders more quickly. When the 12 ounce button is pressed, they would like for exactly 12 ounces of beverage to be dispensed. There is, however, some variation in this amount. The company does not want the machine to systematically overfill or underfill the cups. Which of the following gives the correct set of hypotheses? a. H0: μ ≤ 12, Ha: μ > 12 b. H0: μ ≥ 12, Ha: μ < 12 c. H0: μ = 12, Ha: μ ≠ 12 d. H0: μ > 12, Ha: μ < 12
c. H0: μ = 12, Ha: μ ≠ 12 The correct set of hypotheses is H0: μ = 12, Ha: μ ≠ 12. They do not want the cups to be overfilled or underfilled, so they should use a two-sided alternative hypothesis.
The scatter chart below displays the residuals versus the independent variable, x. Which of the following conclusions can be drawn based upon this scatter chart? (No pic, but the points create a V shape with a horizontal dashed line through the points at 0) a. The model overpredicts the value of the dependent variable for small values and large values of the independent variable. b. The residuals have a constant variance. c. The model fails to capture the relationship between the variables accurately. d. The residuals are normally distributed.
c. The model fails to capture the relationship between the variables accurately. The residuals are positive for small and large values of the independent variable x but are negative for the remaining values of the independent variable. This pattern suggests that the linear relationship in the regression model underpredicts the value of the dependent variable for small and large values of the independent variable and overpredicts it for intermediate values. In this case, the regression model does not adequately capture the relationship between the independent variable x and the dependent variable y.
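The residual pattern described above can be reproduced with a small sketch. It assumes made-up data that follow a curve (y = x²) and fits a straight line to them anyway; the data and variable names are illustrative only.

```python
# Hypothetical curved data: y = x^2, which a straight line cannot capture.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares line fitted to the curved data.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

# Residual = observed y minus predicted y.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
```

Here the residuals are 2, -1, -2, -1, 2: positive at both ends (the line underpredicts) and negative in the middle (the line overpredicts), which is the V-shaped pattern in the residual plot.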
In a linear regression model, the variable that is being predicted or explained is known as _____. It is denoted by y and is often referred to as the response variable. a. regression variable b. residual variable c. dependent variable d. independent variable
c. dependent variable The dependent variable is the variable that is being predicted or explained. It is denoted by y and is often referred to as the response variable.
In the simple linear regression model, the _____ accounts for the variability in the dependent variable that cannot be explained by the linear relationship between the variables. a. constant term b. residual c. error term d. model parameter
c. error term In the simple linear regression model, the error term accounts for the variability in the dependent variable that cannot be explained by the linear relationship between the variables.
The coefficient of determination _____. a. is equal to negative one for the poorest fit b. is equal to zero for a perfect fit c. is used to evaluate the goodness of fit d. takes values between -1 to +1
c. is used to evaluate the goodness of fit The coefficient of determination is used to evaluate the goodness of fit for the estimated regression equation.
The degree of correlation among independent variables in a regression model is called _____. a. the coefficient of determination b. the sum of squared errors (SSE) c. multicollinearity d. interaction
c. multicollinearity Multicollinearity is the degree of correlation among independent variables in a regression model.
Two approaches to drawing a conclusion in a hypothesis test are _____. a. one-tailed and two-tailed b. Type I and Type II c. p-value and critical value d. null and alternative
c. p-value and critical value There are two methods that can be used to draw a conclusion while carrying out a hypothesis test. The value of the test statistic can be assessed using a p-value method or the test statistic can be compared to a critical value in order to test for significance.
The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. What is the point estimate of the proportion of the population that logged onto Facebook that day? a. 0.65 b. 0.25 c. 0.53 d. 0.35
d. 0.35 The point estimate of the proportion of the population that logged onto Facebook that day is 53/150 ≈ 0.35.
The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. Compute the 95% confidence interval for the population proportion. a. 0.35 ± 1.645 * √ ( 0.35(1 - 0.35) / 150 ) b. 0.53 ± 1.645 * √ ( 0.53(1 - 0.53) / 150 ) c. 0.53 ± 1.96 * √ ( 0.53(1 - 0.53) / 150 ) d. 0.35 ± 1.96 * √ ( 0.35(1 - 0.35) / 150 )
d. 0.35 ± 1.96 * √ ( 0.35(1 - 0.35) / 150 ) The general formula for an interval estimate of a population proportion is p̂ ± zα/2 * √ ( p̂(1 - p̂) / n ). In this case, p̂ = 53/150 ≈ 0.35, n = 150, and the critical value is 1.96. The resulting 95% confidence interval for the population proportion is 0.35 ± 1.96 * √ ( 0.35(1 - 0.35) / 150 ).
In order to determine an interval for the mean of a population with unknown standard deviation, a sample of 24 items is selected. The mean of the sample is determined to be 23. The number of degrees of freedom for reading the t value is _____. a. 21 b. 22 c. 24 d. 23
d. 23 The degrees of freedom for this t distribution is n - 1 = 23.
A pizza shop advertises that they deliver in 30 minutes or less or it is free. People who live in homes that are located on the opposite side of town believe it will take the pizza shop longer than 30 minutes to make and deliver the pizza. Write the null and alternative hypotheses that can be used to conduct a significance test. a. H0: μ ≥ 30, Ha: μ < 30 b. H0: μ < 30, Ha: μ > 30 c. H0: μ > 30, Ha: μ < 30 d. H0: μ ≤ 30, Ha: μ > 30
d. H0: μ ≤ 30, Ha: μ > 30 The null hypothesis is the statement that contains the equality. Here it states that the mean delivery time is no more than 30 minutes. The alternative hypothesis is the complement of the null hypothesis.
_____ refers to the degree of correlation among independent variables in a regression model. a. Confidence level b. Tolerance c. Rank d. Multicollinearity
d. Multicollinearity Multicollinearity is the degree of correlation among independent variables in a regression model.
_____ is a statistical procedure used to develop an equation showing how two variables are related. a. Data mining b. Factor analysis c. Time series analysis d. Regression analysis
d. Regression analysis Regression analysis is a statistical procedure used to develop an equation showing how variables are related.
The scatter chart below displays the residuals versus the independent variable, x. Which of the following conclusions can be drawn based upon this scatter chart? (There is a dashed horizontal line at 0, the points under it are close to the line and the points above are much more spread out away from the line) a. The model captures the relationship between the variables accurately. b. The model underpredicts the value of the dependent variable for intermediate values of the independent variable. c. The residuals have a constant variance. d. The residual distribution is not normally distributed.
d. The residual distribution is not normally distributed. The residuals in the given figure are not symmetrically distributed around zero. Many of the negative residuals are relatively close to zero, while the relatively few positive residuals tend to be far from zero. This skewness suggests that the residuals are not normally distributed.