ECON 5 - Linear Regression

Abby Kratz, a market specialist at the market research firm of Saez, Sikes, and Spitz, is analyzing household budget data collected by her firm. Abby's dependent variable is monthly household expenditures on groceries (in $'s), and her independent variable is annual household income (in $1,000's). Regression analysis of the data yielded the following tables.
            Coefficients  S.E.      t Statistic  p-value
Intercept   39.14942      22.30182  1.755436     0.109712
x           1.792312      0.407507  4.398234     0.001339
Source      df  SS        MS        F
Regression  1   16850.99  16850.99  19.34446
Residual    9   7839.915  871.1017
Total       10  24690.91
Se = 29.51443, r^2 = 0.682478
For a household with $50,000 annual income, Abby's model predicts monthly grocery expenditures of ________________.

$128.65
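
The prediction above is just the fitted line evaluated at x = 50 (income is measured in $1,000's). A minimal sketch, using the coefficients from the output table; the function name `predict` is only illustrative. With coefficients rounded to two decimals the result is 128.65, while the full-precision coefficients give roughly 128.77.

```python
# Coefficients copied from the regression output in the question above.
INTERCEPT = 39.14942
SLOPE = 1.792312  # dollars of monthly groceries per $1,000 of annual income


def predict(x, b0=INTERCEPT, b1=SLOPE):
    """Point prediction y-hat = b0 + b1 * x."""
    return b0 + b1 * x


full_precision = predict(50)                     # $50,000 income -> x = 50
rounded_coeffs = predict(50, b0=39.15, b1=1.79)  # coefficients rounded to 2 dp
print(f"{full_precision:.2f} {rounded_coeffs:.2f}")
```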

Annie Mikhail, market analyst for a national company specializing in historic city tours, is analyzing the relationship between the sales revenue from historic city tours and the size of the city. She gathers data from six cities in which the tours are offered. Annie's dependent variable is annual sales revenues and her independent variable is the city population. Regression analysis of the data yielded the following tables.
            Coefficients  S.E.      t Statistic  p-value
Intercept   -0.14156      0.292143  -0.48455     0.653331
x           0.105195      0.013231  7.950352     0.001356
Source      df  SS        MS        F
Regression  1   3.550325  3.550325  63.20809
Residual    4   0.224675  0.056169
Total       5   3.775
Se = 0.237, r^2 = 0.940483
For a city with a population of 500,000, Annie's model predicts annual sales of ________________.

$52,597

In the regression equation, y = 2.164 + 1.3657x, n = 6, the mean of x is 8.667, SSxx = 89.333 and Se = 3.44. A 95% confidence interval for the average of y when x = 8 is _________.

(9.13, 17.05)
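
The interval above follows from the usual formula for the mean response, ŷ ± t · Se · √(1/n + (x0 − x̄)²/SSxx). A small sketch of that calculation, assuming the table value t(0.025, df = 4) = 2.776; the function name is hypothetical.

```python
import math


def mean_response_interval(x0, b0, b1, n, x_bar, ss_xx, s_e, t_crit):
    """Confidence interval for the average y at x0 in simple regression."""
    y_hat = b0 + b1 * x0
    margin = t_crit * s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - margin, y_hat + margin


T_CRIT = 2.776  # t(0.025, df = n - 2 = 4), taken from a t table
low, high = mean_response_interval(
    x0=8, b0=2.164, b1=1.3657, n=6, x_bar=8.667, ss_xx=89.333, s_e=3.44,
    t_crit=T_CRIT,
)
print(f"({low:.2f}, {high:.2f})")
```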

A researcher has developed the regression equation y = 2.164 + 1.3657x, where n = 6, the mean of x is 8.667, Sxx = 89.333, and Se = 3.44. The 90% confidence prediction interval for y when x = 1 is ______.

(−3.14, 10.2)

In the regression equation, y = 54.78 + 1.45x, the x-intercept is _______.

-37.8

If you were to develop a regression line to predict y by x, what value would the coefficient of determination have?
x  213  196  184  202  221  247
y   76   65   62   68   71   75
(Do not round the intermediate values. Round your answer to 3 decimal places.) Coefficient of determination =

0.695
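
One way to check this value is to compute r² = SSxy² / (SSxx · SSyy) directly from the six data pairs. The sketch below does that with plain Python; the helper name `r_squared` is an illustrative choice.

```python
x = [213, 196, 184, 202, 221, 247]
y = [76, 65, 62, 68, 71, 75]


def r_squared(xs, ys):
    """Coefficient of determination for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    ss_xx = sum((xi - x_bar) ** 2 for xi in xs)
    ss_yy = sum((yi - y_bar) ** 2 for yi in ys)
    return ss_xy ** 2 / (ss_xx * ss_yy)


r2 = r_squared(x, y)
print(round(r2, 3))
```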

Louis Katz, a cost accountant at Papalote Plastics, Inc. (PPI), is analyzing the manufacturing costs of a molded plastic telephone handset produced by PPI. Louis's independent variable is production lot size (in 1,000's of units), and his dependent variable is the total cost of the lot (in $100's). Regression analysis of the data yielded the following tables.
            Coefficients  S.E.      t Statistic  p-value
Intercept   3.996         1.161268  3.441065     0.004885
x           0.358         0.102397  3.496205     0.004413
Source      df  SS        MS        F
Regression  1   9.858769  9.858769  12.22345
Residual    11  8.872     0.806545
Total       12  18.73077
Se = 0.898, r^2 = 0.526341
The correlation coefficient between Louis's variables is ________________.

0.73

In fitting a least squares line to n = 10 data points, the following quantities were computed: SSxx = 32, x̄ = 3, SSyy = 26, ȳ = 4, SSxy = 28. The coefficient of determination is __________.

0.9423

The following regression model was fitted to sample data with 12 observations: ŷ = 30 + 4.50x. What is the residual for an observation (x = 2, y = 40)?

1

A simple regression model for 10 pairs of data resulted in a standard error of the estimate of 3.95 (i.e., Se = 3.95). The sum of squares of error (SSE) is ______.

124.82

A researcher has developed the regression equation y = 2.164 + 1.3657x, where n = 6, the mean of x is 8.667, Sxx = 89.333, and Se = 3.44. The researcher wants to test if the slope is significantly positive, and he chooses a significance level of 0.05. The critical t value is ______.

2.132

The Decision Dilemma presented data on the prices of a Big Mac and the net hourly wage figures for ____ countries.

27

A simple regression model developed for ten pairs of data resulted in a sum of squares of error, SSE = 125. The standard error of the estimate is _______.

3.95

Some entries have been deleted from the following analysis of variance table that is part of the computer output from a simple regression analysis of a data set containing 12 matched-pair observations.
Source      df  SS        MS       F  p-Value
Regression  1   9315.23   9315.23     0.01734
Residual
Total       11  19222.72
Based on this output, what is the value of the standard error of estimate?

31.48

In fitting a least squares line to n = 10 data points, the following quantities were computed: SSxx = 32, x̄ = 3, SSyy = 26, ȳ = 4, SSxy = 28. A 95% prediction interval for y when xp = 4 is __________.

4.875 ± 1.062

If the sum of squares of error for a simple regression model fitted for twelve pairs of observations is 250, what is the standard error of the estimate for the model?

5.00
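
The last few standard-error questions all rest on one relation: Se = √(SSE / (n − 2)), or equivalently SSE = Se² · (n − 2). A brief sketch with illustrative function names:

```python
import math


def standard_error(sse, n):
    """Standard error of the estimate for simple regression (df = n - 2)."""
    return math.sqrt(sse / (n - 2))


def sse_from_standard_error(s_e, n):
    """Invert the relation to recover SSE from Se."""
    return s_e ** 2 * (n - 2)


sse_10_pairs = sse_from_standard_error(3.95, 10)  # about 124.82
se_10_pairs = standard_error(125, 10)             # about 3.95
se_12_pairs = standard_error(250, 12)             # sqrt(250 / 10) = 5.0
print(round(sse_10_pairs, 2), round(se_10_pairs, 2), round(se_12_pairs, 2))
```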

Some entries have been deleted from the following analysis of variance table that is part of the computer output from a simple regression analysis of a data set containing 12 matched-pair observations.
Source      df  SS        MS  F  p-Value
Regression  1   9315.23          0.017340862
Residual
Total       11  19222.72
Based on this output, what is the value of the F statistic?

9.4022

The equation of the trend line for the data based on sales (in $1000) of a local restaurant over the years 2005-2010 is Sales = -265575 + 132.571 year. The equation of the trend line when using 1 to 6 for 2005-2010 is ________.

97.284 + 132.571x
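
Recoding the years as x = year − 2004 leaves the slope unchanged and shifts the intercept to b0 + b1 · 2004. A short sketch of that substitution (the function name is hypothetical):

```python
def recode_intercept(b0, b1, shift):
    """Intercept of the line after substituting year = x + shift."""
    return b0 + b1 * shift


# 2005 -> 1, ..., 2010 -> 6 means year = x + 2004.
new_b0 = recode_intercept(b0=-265575, b1=132.571, shift=2004)
print(f"{new_b0:.3f} + 132.571x")
```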

Some entries have been deleted from the following analysis of variance table that is part of the computer output from a simple regression analysis of a data set containing 12 matched-pair observations.
Source      df  SS        MS  F  p-Value
Regression  1   9315.23          0.017340862
Residual
Total       11  19222.72
Based on this output, what is the value of the mean square error?

990.749
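
The missing ANOVA entries can be filled in from the regression sum of squares, the total sum of squares, and the total degrees of freedom (11, which corresponds to 12 observations). The following is a sketch under those inputs; the dictionary keys are illustrative names.

```python
import math


def complete_anova(ss_regression, ss_total, df_total):
    """Fill in the residual row, F, and Se for a simple regression ANOVA."""
    df_regression = 1
    df_residual = df_total - df_regression
    ss_residual = ss_total - ss_regression
    mse = ss_residual / df_residual
    return {
        "df_residual": df_residual,
        "ss_residual": ss_residual,
        "mse": mse,
        "f": (ss_regression / df_regression) / mse,
        "s_e": math.sqrt(mse),  # standard error of the estimate
    }


table = complete_anova(ss_regression=9315.23, ss_total=19222.72, df_total=11)
print({key: round(value, 4) for key, value in table.items()})
```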

A value of -1 for the coefficient of correlation between two variables means that the two variables are ________________. a. perfectly related b. very weakly related c. not related at all d. weakly related e. somewhat strongly related

Perfectly related. The coefficient of correlation ranges from -1 to 1, and a value of -1 indicates a perfect linear relationship in the negative direction.

In the regression equation, y = 2.164 + 1.3657x, n = 6, the mean of x is 8.667, SSxx = 89.333 and Se = 3.44. A 95% prediction interval for y when x = 8 is _________.

(2.75, 23.43). Here ŷ = 2.164 + 1.3657(8) = 13.09 and the margin is 2.776 × 3.44 × √(1 + 1/6 + (8 − 8.667)²/89.333) ≈ 10.34.
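
A quick numeric check of this prediction interval, using ŷ ± t · Se · √(1 + 1/n + (x0 − x̄)²/SSxx) and assuming the table value t(0.025, df = 4) = 2.776. The function name is illustrative; the result comes out at roughly (2.75, 23.43).

```python
import math


def prediction_interval(x0, b0, b1, n, x_bar, ss_xx, s_e, t_crit):
    """Prediction interval for a single y at x0 in simple regression."""
    y_hat = b0 + b1 * x0
    # The leading 1 accounts for the variability of an individual observation.
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - margin, y_hat + margin


pi_low, pi_high = prediction_interval(
    x0=8, b0=2.164, b1=1.3657, n=6, x_bar=8.667, ss_xx=89.333, s_e=3.44,
    t_crit=2.776,  # t(0.025, df = 4) from a t table
)
print(f"({pi_low:.2f}, {pi_high:.2f})")
```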

Which of the following assertions is true about the regression line? a. The regression line is also called the least cubes line and is found minimizing the sum of the cubes of the residuals. b. It is found minimizing the sum of the residuals squared, but, even though it would be unnecessarily complicated, it could also be found minimizing the sum of the residuals cubed. c. Depending on the data, some regression lines could have only positive residuals. d. Depending on the data, some regression lines could have all residuals equal to zero. e. Depending on the data, some regression lines could have only negative residuals.

Depending on the data, some regression lines could have all residuals equal to zero.

To test the overall significance of the regression model (i.e., to determine whether at least one of the regression coefficients other than β0 is significantly different from zero), it is appropriate to use a __________.

F-test

In the simple regression model, y = 21 − 5x, if the coefficient of determination is 0.81, we can say that the coefficient of correlation between y and x is 0.90. (T/F)

False

Prediction intervals get narrower as we extrapolate outside the range of the data. (T/F)

False

The strength of a linear relationship in simple linear regression changes if the units of the data are converted, say from feet to inches. (T/F)

False

The variability in the estimated slope is smaller when the x-values are more spread out. (T/F)

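True. The standard error of the slope is Se/√SSxx, so more spread in the x-values (a larger SSxx) gives a less variable slope estimate.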

Test the slope of the regression line for the following data. Use α = .01.
x  142  122  100  93  63   29   23
y   26   30   44  69  90  114  129
(Do not round the intermediate values. Round your answer to 2 decimal places.) Observed t = The decision is to

Observed t = −12.09 (146.07 is the F statistic, which equals t^2, so it is not the answer for t). The decision is to reject the null hypothesis.
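
The slope test for the data above can be checked numerically with t = b1 / s_b, where s_b = Se / √SSxx. The sketch below assumes the two-tailed table value t(0.005, df = 5) = 4.032; the function name is illustrative. For these data t comes out near −12.09, and its square is the F statistic of about 146.07.

```python
import math

x = [142, 122, 100, 93, 63, 29, 23]
y = [26, 30, 44, 69, 90, 114, 129]


def slope_t_statistic(xs, ys):
    """Observed t for H0: beta1 = 0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in xs)
    ss_yy = sum((yi - y_bar) ** 2 for yi in ys)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    b1 = ss_xy / ss_xx
    sse = ss_yy - b1 * ss_xy        # residual sum of squares
    s_e = math.sqrt(sse / (n - 2))  # standard error of the estimate
    s_b = s_e / math.sqrt(ss_xx)    # standard error of the slope
    return b1 / s_b


T_CRIT = 4.032  # two-tailed, alpha = 0.01, df = 5 (table value)
t_obs = slope_t_statistic(x, y)
reject_null = abs(t_obs) > T_CRIT
print(round(t_obs, 2), reject_null)
```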

In the Decision Dilemma Solved, data is presented from a regression analysis performed in Excel. In the regression statistics, the value for the R square is 0.514. What does this value tell the researcher about these data? a. The percent of variation in hourly wages that is explained by the price of a Big Mac is 71.7%. b. The null hypothesis should be rejected. c. The correlation coefficient is 0.717. d. There is no predictability for this model.

The correlation coefficient is 0.717.

The Decision Dilemma presented data on the prices of a Big Mac and the net hourly wage figures by country. Which of the following statements most likely reflects the research agenda? a. The net hourly wages will be used to predict the price of a Big Mac. b. The price of Big Mac's will be used to determine the country. c. The price of Big Mac's will be used to predict net hourly wages. d. The country will be used to predict net hourly wages.

The net hourly wages will be used to predict the price of a Big Mac.

Suppose we are making predictions of the dependent variable y for specific values of the independent variable x using a simple linear regression model holding the confidence level constant. Let Width (C.I) = the width of the confidence interval for the average value y for a given value of x, and Width (P.I) = the width of the prediction interval for a single value y for a given value of x. Which of the following statements is true? a. Width (C.I) = 0.5 Width (P.I) b. Width (C.I) > Width (P.I) c. Width (C.I) = Width (P.I) d. Width (C.I) < Width (P.I)

Width (C.I) < Width (P.I)

If the correlation coefficient between variables X and Y is roughly zero, then ______. a. Y is independent of X b. Y is dependent on X c. there is a linear correlation between Y and X d. Y is not necessarily independent of X e. Y is caused by X.

Y is not necessarily independent of X

Which of the following equations represents a linear relationship between y and x? (Note: m and b are constants.) a. y = mx + b b. y = m/x +b c. y = mx^2 d. y = mb

a. y = mx + b

The following data are the claims (in $ millions) for BlueCross BlueShield benefits for nine states, along with the surplus (in $ millions) that the company had in assets in those states.
State         Claim   Surplus
Alabama       $1,425  $277
Colorado      273     100
Florida       915     120
Illinois      1,687   259
Maine         234     40
Montana       142     25
North Dakota  259     57
Oklahoma      258     31
Texas         894     141
Use the data to compute a correlation coefficient, r, to determine the correlation between claims and surplus. (Round the intermediate values to 3 decimal places. Round your answer to 3 decimal places.) r = ?

r = 0.957, with claims = X and surplus = Y. X values: ΣX = 6087, mean = 676.333, Σ(X − Mx)^2 = SSx = 2679308. Y values: ΣY = 1050, mean = 116.667, Σ(Y − My)^2 = SSy = 72026. X and Y combined: N = 9, Σ(X − Mx)(Y − My) = 420333. r = Σ(X − Mx)(Y − My) / √((SSx)(SSy)) = 420333 / √((2679308)(72026)) = 0.9568
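
The same correlation can be reproduced from the raw claims and surplus figures. A minimal sketch using only the standard library; the function name `pearson_r` is an illustrative choice.

```python
import math

claims = [1425, 273, 915, 1687, 234, 142, 259, 258, 894]
surplus = [277, 100, 120, 259, 40, 25, 57, 31, 141]


def pearson_r(xs, ys):
    """Pearson correlation r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    ss_xx = sum((xi - x_bar) ** 2 for xi in xs)
    ss_yy = sum((yi - y_bar) ** 2 for yi in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


r = pearson_r(claims, surplus)
print(round(r, 3))
```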

A cost accountant is developing a regression model to predict the total cost of producing a batch of printed circuit boards as a linear function of batch size (the number of boards produced in one lot or batch). The intercept of this model is the ______. a. batch size b. total variable cost c. total cost d. unit variable cost e. fixed cost

fixed cost

If the relationship between y and x is given by the equation y = 1.57 + 0.0407 x, the slope coefficient 0.0407 means this: __________. a. for every unit increase in x, y decreases by 0.0407 b. for every unit increase in x, y increases by 0.0407 c. for every unit increase in x, y decreases by 1.6107 d. for every unit increase in x, y increases by 1.6107

for every unit increase in x, y increases by 0.0407

For a certain data set the regression equation is y = 37 + 13x. The correlation coefficient between y and x in this data set _______. a. is positive b. must be 0 c. is negative d. must be 3 e. must be 1

is positive

Annie Mikhail, market analyst for a national company specializing in historic city tours, is analyzing the relationship between the sales revenue from historic city tours and the size of the city. She gathers data from six cities in which the tours are offered. Annie's dependent variable is annual sales revenues and her independent variable is the city population. Regression analysis of the data yielded the following tables.
            Coefficients  S.E.      t Statistic  p-value
Intercept   -0.14156      0.292143  -0.48455     0.653331
x           0.105195      0.013231  7.950352     0.001356
Source      df  SS        MS        F
Regression  1   3.550325  3.550325  63.20809
Residual    4   0.224675  0.056169
Total       5   3.775
Se = 0.237, r^2 = 0.940483
Using α = 0.05, Annie should ________________.

reject H0: β1 = 0

The American Research Group, Inc. conducted a telephone survey of a random sample of 1,100 U.S. adults in a recent year and determined that the average amount of planned spending on gifts for the holiday season was $854 and that 40% of the purchases would be made from catalogs. Shown below are the average amounts of planned spending on gifts for the holiday season for 11 years along with the associated percentages to be made from catalogs.
Year  Average Spending ($)  % Purchases to be Made from Catalogs
1     1,037                 44
2     976                   42
3     1,004                 47
4     942                   47
5     907                   50
6     859                   51
7     431                   43
8     417                   36
9     658                   26
10    646                   42
11    854                   40
Develop a regression model to predict the number of rentals per day by the average family income. Comment on the output. For α = .01 the value of the test statistic is t = ? so the decision is ? the null hypothesis. (Do not round the intermediate values. Round the answer to 2 decimal places.)

t = 1.71, so the decision is to fail to reject the null hypothesis.

The standard error of the estimate is computed as the square root of the mean squared error, and it is a standard deviation of the errors. It is therefore useful for making a judgment about the fit of the regression model in conjunction with the assumption that __________.

the error terms are normally distributed

A regression line minimizes the sum of the squared error values. This means that the regression line minimizes the sum of ______ from each point in the scatter plot to the regression line. a. the squares of the distances b. the squares of the horizontal distances (differences in the x-coordinates) c. the squares of the vertical distances (differences in the x-coordinates) d. the squares of the horizontal distances (differences in the y-coordinates) e. the squares of the vertical distances (differences in the y-coordinates)

the squares of the vertical distances (differences in the y-coordinates)

A cost accountant is developing a regression model to predict the total cost of producing a batch of printed circuit boards as a linear function of batch size (the number of boards produced in one lot or batch). The slope of the accountant's model is ______. a. unit variable cost b. total variable cost c. fixed cost d. batch size e. total cost

unit variable cost

The Conference Board produces a Consumer Confidence Index (CCI) that reflects people's feelings about general business conditions, employment opportunities, and their own income prospects. Some researchers may feel that consumer confidence is a function of the median household income. Shown here are the CCIs for 9 years and the median household incomes for the same 9 years published by the U.S. Census Bureau. Determine the equation of the regression line to predict the CCI from the median household income. Compute the standard error of the estimate, se, for this model. Compute the value of r^2. Does median household income appear to be a good predictor of the CCI? Why or why not?
CCI    Median Household Income ($1,000)
116.8  37.415
91.5   36.770
68.5   35.501
61.6   35.047
65.9   34.700
90.6   34.942
100.0  35.887
104.6  36.306
125.4  37.005
*(Do not round the intermediate values. Round your answers to 4 decimal places.) **(Round the intermediate values to 4 decimal places. Round your answers to 3 decimal places.) y-hat = ? + ( ? )x se = ? r^2 = ? Median household income (appears/does not appear) to be a good predictor of the CCI?

ŷ = -599.367 + (19.220)x; se = 13.539; r^2 = 0.688. Median household income appears to be a good predictor of the CCI.

The following data are to be used to construct a regression model:
x  3  5  7  4  8  10  9
y  5  4  5  4  7  10  8
The regression equation is _______________. a. y = 16.49 + 1.43x b. y = 0.91 + 4.06x c. y = 1.19 + 0.75x d. y = 1.19 + 0.91x e. y = 0.75 + 0.18x

y = 1.19 + 0.75x

It appears that over the past 50 years, the number of farms in the United States declined while the average size of farms increased. The following data provided by the U.S. Department of Agriculture show five-year interval data for U.S. farms. Use these data to develop the equation of a regression line to predict the average size of a farm (y) by the number of farms (x). Discuss the slope and y-intercept of the model.
Year  Number of Farms (Millions)  Average Size (acres)
1960  5.67                        214
1965  4.66                        261
1970  3.92                        297
1975  3.31                        343
1980  2.90                        376
1985  2.49                        419
1990  2.47                        427
1995  2.27                        443
2000  2.14                        458
2005  2.07                        471
2010  2.17                        437
2015  2.09                        446
(Do not round the intermediate values. Round your answers to 2 decimal places.) y = ? + (?)x

y = 598.89 + (-71.75)x

Shown here are the labor force figures (in millions) published by the World Bank for the country of Bangladesh over a 10-year period.
Year  Labor Force (million)
2005  66.49
2006  67.83
2007  69.11
2008  70.37
2009  71.66
2010  73.01
2011  74.55
2012  76.04
2013  77.61
2014  78.62
Develop the equation of a trend line through these data and use the equation to predict the labor force of Bangladesh for the year 2017. (Do not round the intermediate values. Round your answers to 4 decimal places.) y = ? + (?)x (Round your answer to 2 decimal places.) y(2017) = ?

y = a + bx, with a = -2681.947 and b = 1.3707; y(2017) = 82.81
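
The trend line and the 2017 forecast can be reproduced with the least squares formulas b = SSxy / SSxx and a = ȳ − b · x̄, using the calendar year as x. This is a sketch with an illustrative helper `fit_line`; note that the intercept is negative.

```python
years = list(range(2005, 2015))
labor_force = [66.49, 67.83, 69.11, 70.37, 71.66,
               73.01, 74.55, 76.04, 77.61, 78.62]


def fit_line(xs, ys):
    """Least squares intercept and slope for y = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    ss_xx = sum((xi - x_bar) ** 2 for xi in xs)
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(years, labor_force)
forecast_2017 = a + b * 2017
print(round(a, 3), round(b, 4), round(forecast_2017, 2))
```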

Suppose that in a regression analysis, SST = 140, SSR = 35, and SSxy = −23.32. Then the corresponding coefficient of correlation r = ______.

−0.50

For a regression model y = 30 − 2x, the coefficient of determination was determined as equal to 0.81. The coefficient of correlation is __________.

−0.9
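
In simple regression the correlation coefficient has magnitude √r² and takes the sign of the slope (equivalently, the sign of SSxy). The sketch below applies that rule to the two questions above; the function name is hypothetical.

```python
import math


def correlation_from_r2(r_squared, sign_source):
    """Return sqrt(r^2) carrying the sign of the slope (or of SSxy)."""
    return math.copysign(math.sqrt(r_squared), sign_source)


# y = 30 - 2x with r^2 = 0.81: the slope is negative, so r is negative.
r_from_slope = correlation_from_r2(0.81, -2)

# SST = 140 and SSR = 35 give r^2 = SSR / SST; SSxy = -23.32 sets the sign.
r_from_ss = correlation_from_r2(35 / 140, -23.32)
print(round(r_from_slope, 2), round(r_from_ss, 2))
```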

Suppose for a given data set the regression equation is: y = 54.78 + 1.45x, and the point (0.00, 24.78) is in the data set. The residual for this point is _______.

−30.00

