BNAL Exam 2 13-14
TABLE 14-3 An economist is interested to see how consumption for an economy (in $ billions) is influenced by gross domestic product ($ billions) and aggregate price (consumer price index). The Microsoft Excel output of this regression is partially reproduced below. SUMMARY OUTPUT Referring to Table 14-3, to test for the significance of the coefficient on gross domestic product, the p-value is 0.0001. 0.8330. 0.8837. 0.9999.
0.0001.
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals are collected for the following variables:
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
Regression Statistics
Multiple R         0.5487
R Square           0.3011
Adjusted R Square  0.2538
Standard Error     6442.4456
Observations       80
ANOVA
            df  SS               MS              F       Significance F
Regression  5   1322911703.0671  264582340.6134  6.3747  0.0001
Residual    74  3071377751.1204  41505104.7449
Total       79  4394289454.1875
           Coefficients  Standard Error  t Stat   P-value
Intercept  -3862.4808    6180.9452       -0.6249  0.5340
Temp       51.7031       62.9439         0.8214   0.4140
Win%       21.1085       16.2338         1.3003   0.1975
OpWin%     11.3453       6.4617          1.7558   0.0833
Weekend    367.5377      2786.2639       0.1319   0.8954
Promotion  6927.8820     2784.3442       2.4882   0.0151
The coefficients of multiple determination of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308. What is the correct interpretation for the estimated coefficient for TEMP? Answers:
a. As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
b. As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70.
c. As the high temperature increases by one degree, the paid attendance will increase by 51.70.
d. As the high temperature increases by one degree, the paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
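The summary statistics in the baseball output can be re-derived from the ANOVA sums of squares, which is a useful check when reading Excel regression output. The sketch below uses only numbers printed in the question; the variable names are illustrative and not part of the original output.

```python
# Values copied from the regression output in the question above.
n = 80                     # observations
k = 5                      # independent variables
ssr = 1322911703.0671      # regression sum of squares
sse = 3071377751.1204      # residual sum of squares
sst = 4394289454.1875      # total sum of squares

# Coefficient of multiple determination and its adjusted version.
r_square = ssr / sst
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# Mean squares, overall F statistic, and standard error of the estimate.
msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse
std_error = mse ** 0.5

# t statistic for TEMP: coefficient divided by its standard error.
t_temp = 51.7031 / 62.9439

print(f"R Square          = {r_square:.4f}")
print(f"Adjusted R Square = {adj_r_square:.4f}")
print(f"F                 = {f_stat:.4f}")
print(f"Standard Error    = {std_error:.4f}")
print(f"t Stat (Temp)     = {t_temp:.4f}")
```

Each printed value should agree with the corresponding entry in the output to within rounding.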
True or False: So that we can fit curves as well as lines by regression, we often use mathematical manipulations for converting one variable into a different form. These manipulations are called dummy variables. Answers: True False
False
True or False: The stepwise regression approach takes into consideration all possible models. Answers: True False
False
A computer software developer would like to use the number of downloads (in thousands) for the trial version of his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the new shareware. Following is the output from a simple linear regression along with the residual plot and normal probability plot obtained from a data set of 30 different sharewares that he has developed: Referring to Table 13-11, which of the following is the correct interpretation for the slope coefficient? For each decrease of 1 thousand downloads, the expected revenue is estimated to increase by $3.7297 thousand. For each increase of 1 thousand downloads, the expected revenue is estimated to increase by $3.7297 thousand. For each decrease of 1 thousand dollars in expected revenue, the expected number of downloads is estimated to increase by 3.7297 thousand. For each increase of 1 thousand dollars in expected revenue, the expected number of downloads is estimated to increase by 3.7297 thousand.
For each increase of 1 thousand downloads, the expected revenue is estimated to increase by $3.7297 thousand.
Which of the following statements about moving averages is not true? Answers: a. It can be used to smooth a series b. It gives equal weight to all values in the computation c. It gives greater weight to more recent data d. It is simpler than the method of exponential smoothing
It gives greater weight to more recent data
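A short sketch of why option c is the false statement: in a simple moving average every value in the window receives the same weight, 1/L. The function name and the sample series below are hypothetical and chosen only for illustration.

```python
def moving_average(series, length):
    """Centered moving average of odd `length`.

    Every value in the window gets the same weight, 1 / length.
    Positions too close to either end have no full window and stay None.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")
    half = length // 2
    smoothed = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half : i + half + 1]
        smoothed[i] = sum(window) / length
    return smoothed


# Hypothetical annual series, not taken from the exam.
sales = [4.0, 6.0, 5.0, 8.0, 9.0, 5.0, 4.0]
ma3 = moving_average(sales, 3)
print(ma3)
```

Exponential smoothing, by contrast, applies geometrically declining weights, so it is the method that favors recent data.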
The Y-intercept (b0) represents the predicted value of Y when X = 0. change in estimated Y per unit change in X. predicted value of Y. variation around the sample regression line.
predicted value of Y when X = 0
In a multiple regression problem involving two independent variables, if b1 is computed to be +2.0, it means that the relationship between X1 and Y is significant. the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant. the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, without regard to X2. the estimated mean of Y is 2 when X1 equals zero.
the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant.
In selecting a forecasting model, we should perform a residual analysis. Answers: True False
True
One of the most common questions of prospective house buyers pertains to the cost of heating in dollars (Y). To provide its customers with information on that matter, a large real estate firm used the following 4 variables to predict heating costs: the daily minimum outside temperature in degrees Fahrenheit (X1), the amount of insulation in inches (X2), the number of windows in the house (X3), and the age of the furnace in years (X4). Given below are the Excel outputs of two regression models, Model 1 and Model 2. Referring to Table 14-6, the estimated value of the partial regression parameter β1 in Model 1 means that Answers: holding the effect of the other independent variables constant, an estimated expected $1 increase in heating costs is associated with a decrease in the daily minimum outside temperature by 4.51 degrees. holding the effect of the other independent variables constant, a 1 degree increase in the daily minimum outside temperature results in a decrease in heating costs by $4.51. holding the effect of the other independent variables constant, a 1 degree increase in the daily minimum outside temperature results in an estimated decrease in mean heating costs by $4.51. holding the effect of the other independent variables constant, a 1% increase in the daily minimum outside temperature results in an estimated decrease in mean heating costs by 4.51%.
holding the effect of the other independent variables constant, a 1 degree increase in the daily minimum outside temperature results in an estimated decrease in mean heating costs by $4.51.
A dummy variable is used as an independent variable in a regression model when Answers: the variable involved is numerical. the variable involved is categorical. a curvilinear relationship is suspected. when 2 independent variables interact.
the variable involved is categorical.
The method of moving averages is used Answers: a. to exponentiate a series b. in regression analysis c. to plot a series d. to smooth a series
to smooth a series
The overall upward or downward pattern of the data in an annual time series will be contained in the ____________ component. Answers: a. trend b. cyclical c. seasonal d. irregular
trend
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand increases and it decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model: Y = β0 + β1X + β2X² + ε, where Y = demand (in thousands) and X = retail price per carat. This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis is shown below:
SUMMARY OUTPUT
Regression Statistics
Multiple R      0.994
R Square        0.988
Standard Error  12.42
Observations    12
ANOVA
            df  SS      MS     F    p-value
Regression  2   115145  57573  373  0.0001
Residual    9   1388    154
Total       11  116533
           Coeff     StdError  t Stat  P-value
Intercept  286.42    9.66      29.64   0.0001
Price      -0.31     0.06      -5.14   0.0006
Price Sq   0.000067  0.00007   0.95    0.3647
What is the p-value associated with the test statistic for testing whether there is an upward curvature in the response curve relating the demand (Y) and the price (X)? Answers: a. 0.0001 b. 0.3647 c. None of the above d. 0.0006
0.3647
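Upward curvature corresponds to a positive coefficient on the squared price term, which is why the question points to the Price Sq row. As a rough illustration, the sketch below evaluates the fitted curve using the rounded coefficients from the output; the function names and the chosen price are illustrative assumptions, and the turning point is only approximate because the coefficients are rounded.

```python
# Rounded coefficients from the gem regression output above.
B0 = 286.42       # intercept
B1 = -0.31        # coefficient on price
B2 = 0.000067     # coefficient on price squared


def predicted_demand(price):
    """Fitted quadratic: estimated mean demand (in thousands) at a given price."""
    return B0 + B1 * price + B2 * price ** 2


# A parabola with B2 > 0 opens upward; its minimum sits at -B1 / (2 * B2).
turning_point = -B1 / (2 * B2)

print(f"Estimated demand at price 1000: {predicted_demand(1000):.2f}")
print(f"Approximate turning point:      {turning_point:.2f}")
```

A p-value of 0.3647 on Price Sq means the sample does not give strong evidence that this curvature exists in the population, even though the point estimate is positive.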
An economist is interested to see how consumption for an economy (in $ billions) is influenced by gross domestic product ($ billions) and aggregate price (consumer price index). The Microsoft Excel output of this regression is partially reproduced below. SUMMARY OUTPUT Referring to Table 14-3, to test for the significance of the coefficient on aggregate price index, the p-value is 0.0001. 0.8330. 0.8837. 0.9999.
0.8330.
TABLE 13-6 The following Excel tables are obtained when "Score received on an exam (measured in percentage points)" (Y) is regressed on "percentage attendance" (X) for 22 students in a Statistics for Business and Economics course. Referring to Table 13-6, which of the following statements is true? 14.26% of the total variability in score received can be explained by percentage attendance. 14.2% of the total variability in percentage attendance can be explained by score received. 2% of the total variability in score received can be explained by percentage attendance. 2% of the total variability in percentage attendance can be explained by score received.
2% of the total variability in score received can be explained by percentage attendance.
TABLE 14-4 A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below: Referring to Table 14-4, what fraction of the variability in house size is explained by income, size of family, and education? 27.0% 33.4% 74.8% 86.5%
74.8%
TABLE 13-11 A computer software developer would like to use the number of downloads (in thousands) for the trial version of his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the new shareware. Following is the output from a simple linear regression along with the residual plot and normal probability plot obtained from a data set of 30 different sharewares that he has developed: Referring to Table 13-11, which of the following is the correct interpretation for the coefficient of determination? Answers: 74.67% of the variation in revenue can be explained by the variation in the number of downloads. 75.54% of the variation in revenue can be explained by the variation in the number of downloads. 74.67% of the variation in the number of downloads can be explained by the variation in revenue. 75.54% of the variation in the number of downloads can be explained by the variation in revenue.
75.54% of the variation in revenue can be explained by the variation in the number of downloads.
The strength of the linear relationship between two numerical variables may be measured by the scatter plot. coefficient of correlation. slope. Y-intercept.
coefficient of correlation.
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand increases and it decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model: Y = β0 + β1X + β2X² + ε, where Y = demand (in thousands) and X = retail price per carat. This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis is shown below:
SUMMARY OUTPUT
Regression Statistics
Multiple R      0.994
R Square        0.988
Standard Error  12.42
Observations    12
ANOVA
            df  SS      MS     F    p-value
Regression  2   115145  57573  373  0.0001
Residual    9   1388    154
Total       11  116533
           Coeff     StdError  t Stat  P-value
Intercept  286.42    9.66      29.64   0.0001
Price      -0.31     0.06      -5.14   0.0006
Price Sq   0.000067  0.00007   0.95    0.3647
What is the correct interpretation of the coefficient of multiple determination? Answers: a. 98.8% of the total variation in demand can be explained by the linear relationship between demand and price. b. 98.8% of the total variation in demand can be explained by the quadratic relationship between demand and price. c. 98.8% of the total variation in demand can be explained by the addition of the square term in price. d. 98.8% of the total variation in demand can be explained by just the square term in price.
98.8% of the total variation in demand can be explained by the quadratic relationship between demand and price
In selecting an appropriate forecasting model, the following approaches are suggested: a. All of the above b. Use the principle of parsimony c. Perform a residual analysis d. Measure the size of the forecasting error
All of the above
A microeconomist wants to determine how corporate sales are influenced by capital and wage spending by companies. She proceeds to randomly select 26 large corporations and record information in millions of dollars. A statistical analyst discovers that capital spending by corporations has a significant inverse relationship with wage spending. What should the microeconomist who developed this multiple regression model be particularly concerned with? Answers: a. Normality of residuals b. Randomness of error term c. Collinearity d. Missing observations
Collinearity
Each forecast using the method of weighted average smoothing depends on all the previous observations in the time series Answers: True False
False
In stepwise regression, an independent variable is not allowed to be removed from the model once it has entered into the model. Answers: True False
False
True or False: Collinearity is present when there is a high degree of correlation between the dependent variable and any of the independent variables. Answers: True False
False
TABLE 13-11 A computer software developer would like to use the number of downloads (in thousands) for the trial version of his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the new shareware. Following is the output from a simple linear regression along with the residual plot and normal probability plot obtained from a data set of 30 different sharewares that he has developed: Referring to Table 13-11, which of the following is the correct null hypothesis for testing whether there is a linear relationship between revenue and the number of downloads? H0: b1 = 0 H0: b1 ≠ 0 H0: β1 = 0 H0: β1 ≠ 0
H0: β1 = 0
To explain personal consumption (CONS) measured in dollars, data is collected for > A regression analysis was performed with CONS as the dependent variable and ln(CRDTLIM), ln(APR), ln(ADVT), and GENDER as the independent variables. The estimated model was Y = 2.28 - 0.29 ln(CRDTLIM) + 5.77 ln(APR) + 2.35 ln(ADVT) + 0.39 GENDER, with 0 being used as an index for males and 1 as an index for females. What is the correct interpretation for the estimated coefficient for GENDER? Answers: Holding the effect of the other independent variables constant, mean personal consumption for females is estimated to be $0.39 higher than males. Holding the effect of the other independent variables constant, mean personal consumption for males is estimated to be $0.39 higher than females. Holding the effect of the other independent variables constant, mean personal consumption for females is estimated to be 0.39% higher than males. Holding the effect of the other independent variables constant, mean personal consumption for males is estimated to be 0.39% higher than females.
Holding the effect of the other independent variables constant, mean personal consumption for females is estimated to be $0.39 higher than males.
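The dummy-variable interpretation can be checked numerically: with the other predictors held at the same values, switching GENDER from 0 to 1 shifts the prediction by exactly the GENDER coefficient. The sketch below uses the estimated equation from the question; the input values for CRDTLIM, APR, and ADVT are hypothetical, since the question does not give their units or ranges.

```python
import math


def predicted_cons(crdtlim, apr, advt, gender):
    """Estimated model from the question; gender is 0 for male, 1 for female."""
    return (
        2.28
        - 0.29 * math.log(crdtlim)
        + 5.77 * math.log(apr)
        + 2.35 * math.log(advt)
        + 0.39 * gender
    )


# Hypothetical predictor values, identical for both predictions.
male = predicted_cons(5000.0, 15.0, 200.0, gender=0)
female = predicted_cons(5000.0, 15.0, 200.0, gender=1)
gap = female - male

print(f"Estimated female minus male consumption: {gap:.2f}")
```

The gap does not depend on the hypothetical inputs, because the log terms cancel when the two predictions are subtracted. It is measured in dollars, not percent, because the dependent variable itself is not logged.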
TABLE 13-6 The following Excel tables are obtained when "Score received on an exam (measured in percentage points)" (Y) is regressed on "percentage attendance" (X) for 22 students in a Statistics for Business and Economics course. Referring to Table 13-6, which of the following statements is true? If attendance increases by 0.341%, the estimated mean score received will increase by 1 percentage point. If attendance increases by 1%, the estimated mean score received will increase by 39.39 percentage points. If attendance increases by 1%, the estimated mean score received will increase by 0.341 percentage points. If the score received increases by 39.39%, the estimated mean attendance will go up by 1%.
If attendance increases by 1%, the estimated mean score received will increase by 0.341 percentage points.
TABLE 14-4 A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below: Referring to Table 14-4, at the 0.01 level of significance, what conclusion should the builder reach regarding the inclusion of Income in the regression model? Answers: Income is significant in explaining house size and should be included in the model because its p-value is less than 0.01. Income is significant in explaining house size and should be included in the model because its p-value is more than 0.01. Income is not significant in explaining house size and should not be included in the model because its p-value is less than 0.01. Income is not significant in explaining house size and should not be included in the model because its p-value is more than 0.01.
Income is significant in explaining house size and should be included in the model because its p-value is less than 0.01.
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below: Referring to Table 14-4, which of the independent variables in the model are significant at the 5% level? Income, Size, School Income, Size Size, School Income, School
Income, Size
Which of the following is not an advantage of exponential smoothing? Answers: a. It enables us to smooth out cyclical components b. It enables us to perform more than one-period ahead forecasting c. It enables us to perform one-period ahead forecasting d. It enables us to smooth out seasonal components
It enables us to perform more than one-period ahead forecasting
TABLE 14-6 One of the most common questions of prospective house buyers pertains to the cost of heating in dollars (Y). To provide its customers with information on that matter, a large real estate firm used the following 4 variables to predict heating costs: the daily minimum outside temperature in degrees Fahrenheit (X1), the amount of insulation in inches (X2), the number of windows in the house (X3), and the age of the furnace in years (X4). Given below are the Excel outputs of two regression models, Model 1 and Model 2. Referring to Table 14-6, what is your decision and conclusion for the test H0: β2 = 0 vs. H1: β2 < 0 at the α = 0.01 level of significance using Model 1? Do not reject H0 and conclude that the amount of insulation has a linear effect on heating costs. Reject H0 and conclude that the amount of insulation does not have a linear effect on heating costs. Reject H0 and conclude that the amount of insulation has a negative linear effect on heating costs. Do not reject H0 and conclude that the amount of insulation has a negative linear effect on heating costs.
Reject H0 and conclude that the amount of insulation has a negative linear effect on heating costs.
The Chancellor of a university has commissioned a team to collect data on students' GPAs and the amount of time they spend bar hopping every week (measured in minutes). He wants to know if imposing much tougher regulations on all campus bars to make it more difficult for students to spend time in any campus bar will have a significant impact on general students' GPAs. His team should use a t test on the slope of the population regression. (T/F)
True
True or False: The goals of model building are to find a good model with the fewest independent variables that is easier to interpret and has lower probability of collinearity. Answers: True False
True
True or False: The principle of parsimony indicates that the simplest model that gets the job done adequately should be used Answers: True False
True
The effect of an unpredictable, rare event will be contained in the ___________ component. Answers: a. cyclical b. irregular c. seasonal d. trend
irregular
In a simple linear regression problem, r and b1 may have opposite signs. must have the same sign. must have opposite signs. are equal.
must have the same sign.
In multiple regression, the __________ procedure permits variables to enter and leave the model at different stages of its development. Answers: a. best subsets b. forward selection c. backward elimination d. stepwise regression
stepwise regression
A regression diagnostic tool used to study the possible effects of collinearity is Answers: a. the slope b. the Y-intercept c. the VIF d. the standard error of the estimate
the VIF
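The VIF for predictor j is 1 / (1 - Rj²), where Rj² is the coefficient of multiple determination from regressing predictor j on all the other predictors. The sketch below applies that formula to the five Rj² values listed in the baseball attendance question earlier in this set, assuming they follow the order of the coefficient table. The cutoff of 5 is a commonly cited rule of thumb and is an assumption here, not something stated in the exam.

```python
# R-square of each predictor regressed on the remaining predictors,
# as listed ("respectively") in the baseball attendance question.
r_square_j = {
    "Temp": 0.2675,
    "Win%": 0.3101,
    "OpWin%": 0.1038,
    "Weekend": 0.7325,
    "Promotion": 0.7308,
}

# Variance inflationary factor: VIF_j = 1 / (1 - R_j^2).
vif = {name: 1 / (1 - r2) for name, r2 in r_square_j.items()}

RULE_OF_THUMB = 5  # assumed cutoff; some texts use 10 instead

for name, value in vif.items():
    flag = "possible collinearity" if value > RULE_OF_THUMB else "ok"
    print(f"{name:<10} VIF = {value:.4f}  ({flag})")
```

Under that assumed cutoff, none of the five predictors would be flagged, although Weekend and Promotion have the largest values.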
When using the exponentially weighted moving average for purposes of forecasting rather than smoothing a. the current smoothed value becomes the forecast b. None of the above c. the previous smoothed value becomes the forecast d. the next smoothed value becomes the forecast
the current smoothed value becomes the forecast
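A minimal sketch of exponential smoothing used for forecasting: each smoothed value is E_i = W·Y_i + (1 - W)·E_(i-1), with E_1 = Y_1, and the forecast for the next period is the current (most recent) smoothed value. The function name, the series, and the smoothing coefficient W = 0.5 are hypothetical choices for illustration.

```python
def exponential_smoothing(series, w):
    """Return the exponentially smoothed series for a coefficient 0 < w < 1."""
    if not 0 < w < 1:
        raise ValueError("w must be strictly between 0 and 1")
    smoothed = [series[0]]  # initialize with the first observation: E_1 = Y_1
    for y in series[1:]:
        # E_i = w * Y_i + (1 - w) * E_(i-1)
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


# Hypothetical series, not taken from the exam.
sales = [10.0, 12.0, 11.0, 15.0]
smoothed = exponential_smoothing(sales, 0.5)

# The forecast for period 5 is the current smoothed value E_4.
forecast_next = smoothed[-1]

print(smoothed)
print(forecast_next)
```

Because the same smoothed value is carried forward, this method is suited to one-period-ahead forecasts only.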
The residuals represent the difference between the actual Y values and the mean of Y. the difference between the actual Y values and the predicted Y values. the square root of the slope. the predicted value of Y for the average X value.
the difference between the actual Y values and the predicted Y values
The slope (b1) represents the predicted value of Y when X = 0. the estimated average change in Y per unit change in X. the predicted value of Y. variation around the line of regression.
the estimated average change in Y per unit change in X.
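Several cards in this set (slope, Y-intercept, residuals, and the sign of r) refer to the same least-squares quantities. The sketch below computes them for a small hypothetical data set; the function name and the data are illustrative assumptions.

```python
def fit_line(x, y):
    """Least-squares intercept b0, slope b1, and correlation r for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = ssxy / ssx                    # estimated change in Y per unit change in X
    b0 = y_bar - b1 * x_bar            # predicted Y when X = 0
    r = ssxy / (ssx * ssy) ** 0.5      # shares its numerator, and so its sign, with b1
    return b0, b1, r


# Hypothetical data, not taken from the exam.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1, r = fit_line(x, y)

# Residual = actual Y minus predicted Y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, r = {r:.4f}")
print([round(e, 4) for e in residuals])
```

The residuals of a least-squares line with an intercept sum to zero, up to floating-point error.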