bnal test 2 quiz
The following Excel tables are obtained when "Score received on an exam (measured in percentage points)" (Y) is regressed on "percentage attendance" (X) for 22 students in a Statistics for Business and Economics course. Referring to Table 13-6, which of the following statements is true? a. 2% of the total variability in score received can be explained by percentage attendance. b. 14.26% of the total variability in percentage attendance can be explained by score received. c. 14.26% of the total variability in score received can be explained by percentage attendance. d. 2% of the total variability in percentage attendance can be explained by score received.
2% of the total variability in score received can be explained by percentage attendance.
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below: Referring to Table 14-4, what fraction of the variability in house size is explained by income, size of family, and education? a. 74.8% b. 27.0% c. 33.4% d. 86.5%
74.8%
A candy bar manufacturer is interested in trying to estimate how sales are influenced by the price of their product. To do this, the company randomly chooses 6 small cities and offers the candy bar at different prices. Using candy bar sales as the dependent variable, the company will conduct a simple linear regression on the data below: Referring to Table 13-2, what is the percentage of the total variation in candy bar sales explained by the regression model? a. 100% b. 88.54% c. 78.39% d. 48.19%
78.39%
The y-intercept (B0) represents the a. predicted value of Y when X=0 b. change in estimated Y per unit change in X c. predicted value of Y d. variation around the sample regression line
a. predicted value of Y when X=0
A microeconomist wants to determine how corporate sales are influenced by capital and wage spending by companies. She proceeds to randomly select 26 large corporations and record information in millions of dollars. A statistical analyst discovers that capital spending by corporations has a significant inverse relationship with wage spending. What should the microeconomist who developed this multiple regression model be particularly concerned with? a. Missing observations b. Normality of residuals c. Collinearity d. Randomness of error term
c. Collinearity
The following table contains the number of complaints received in a department store for the first 6 months of last year. Month Complaints January 36 February 45 March 81 April 90 May 108 June 144 If a three-term moving average is used to smooth this series, what would be the second calculated term? a. 36 b. 40.5 c. 54 d. 72
72
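The three-term moving average in the card above can be checked with a short sketch. The function name `moving_average` is just an illustrative helper, not something from the course material; the data are the six monthly complaint counts from the question.

```python
# Monthly complaints, January through June, as given in the question.
complaints = [36, 45, 81, 90, 108, 144]


def moving_average(series, window=3):
    """Return every moving average that can be computed with the given window."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


smoothed = moving_average(complaints)
# The first computed term averages Jan-Mar, the second averages Feb-Apr.
print(smoothed[0], smoothed[1])
```

The first computed term is (36 + 45 + 81) / 3 = 54 and the second is (45 + 81 + 90) / 3 = 72, which matches the answer.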
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below: Referring to Table 14-4, at the 0.01 level of significance, what conclusion should the builder reach regarding the inclusion of Income in the regression model? a. Income is significant in explaining house size and should be included in the model because its p-value is less than 0.01. b. Income is significant in explaining house size and should be included in the model because its p-value is more than 0.01. c. Income is not significant in explaining house size and should be included in the model because its p-value is less than 0.01. d. Income is not significant in explaining house size and should be included in the model because its p-value is more than 0.01.
Income is significant in explaining house size and should be included in the model because its p-value is less than 0.01.
One of the most common questions of prospective house buyers pertains to the cost of heating in dollars (Y). To provide its customers with information on that matter, a large real estate firm used the following 4 variables to predict heating costs: the daily minimum outside temperature in degrees Fahrenheit (X1), the amount of insulation in inches (X2), the number of windows in the house (X3), and the age of the furnace in years (X4). Given below are the Excel outputs of two regression models. Model 1 Model 2 Referring to Table 14-6, what is your decision and conclusion for the test H0: β2 = 0 vs H1: β2 < 0 at the α = 0.01 level of significance using Model 1? a. Do not reject H0 and conclude that the amount of insulation has a negative linear effect on heating costs. b. Reject H0 and conclude that the amount of insulation has a negative linear effect on heating costs. c. Reject H0 and conclude that the amount of insulation does not have a linear effect on heating costs. d. Do not reject H0 and conclude that the amount of insulation has a linear effect on heating costs.
Reject H0 and conclude that the amount of insulation has a negative linear effect on heating costs.
In a multiple regression problem involving two independent variables, if b1 is computed to be +2.0, it means that a. the relationship between X1 and Y is significant b. the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant c. the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, without regard to X2 d. the estimated mean of Y is 2 when X1 equals zero
b. the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant
One of the most common questions of prospective house buyers pertains to the cost of heating in dollars (Y). To provide its customers with information on that matter, a large real estate firm used the following 4 variables to predict heating costs: the daily minimum outside temperature in degrees Fahrenheit (X1), the amount of insulation in inches (X2), the number of windows in the house (X3), and the age of the furnace in years (X4). Given below are the Excel outputs of two regression models. Model 1 Model 2 Referring to Table 14-6, the estimated value of the partial regression parameter β1 in Model 1 means that a. holding the effect of the other independent variables constant, a 1 degree increase in the daily minimum outside temperature results in an estimated decrease in mean heating costs by $4.51. b. holding the effect of the other independent variables constant, a 1 degree increase in the daily minimum outside temperature results in a decrease in heating costs by $4.51. c. holding the effect of the other independent variables constant, a 1% increase in the daily minimum outside temperature results in an estimated decrease in mean heating costs by 4.51%. d. holding the effect of the other independent variables constant, an estimated expected $1 increase in heating costs is associated with a decrease in the daily minimum outside temperature by 4.51 degrees.
holding the effect of the other independent variables constant, a 1 degree increase in the daily minimum outside temperature results in an estimated decrease in mean heating costs by $4.51.
The following table contains the number of complaints received in a department store for the first 6 months of last year. Month Complaints January 36 February 45 March 81 April 90 May 108 June 144 Suppose the last two smoothed values are 81 and 96 (Note: they are not). What would you forecast as the value of the time series for July? a. 81 b. 86 c. 91 d. 96
96
A candy bar manufacturer is interested in trying to estimate how sales are influenced by the price of their product. To do this, the company randomly chooses 6 small cities and offers the candy bar at different prices. Using candy bar sales as the dependent variable, the company will conduct a simple linear regression on the data below: Referring to Table 13-2, what is the coefficient of correlation for these data? a. -0.8854 b. -0.7839 c. 0.7839 d. 0.8854
-0.8854
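In simple linear regression the coefficient of correlation is the square root of the coefficient of determination, carrying the sign of the slope. A minimal sketch, assuming the r² of 0.7839 and the slope of -48.193 that other cards in this set give for Table 13-2:

```python
import math

# Values taken from the other Table 13-2 cards in this set.
r_squared = 0.7839   # coefficient of determination
slope = -48.193      # estimated slope b1

# r has the same magnitude as sqrt(r^2) and the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)
print(round(r, 4))
```

Because the slope is negative, r comes out negative, which rules out the two positive options.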
A candy bar manufacturer is interested in trying to estimate how sales are influenced by the price of their product. To do this, the company randomly chooses 6 small cities and offers the candy bar at different prices. Using candy bar sales as the dependent variable, the company will conduct a simple linear regression on the data below: Referring to Table 13-2, what is the estimated mean change in the sales of the candy bar if price goes up by $1.00? a. 161.386 b. 0.784 c. -3.810 d. -48.193
-48.193
An economist is interested to see how consumption for an economy (in $ billions) is influenced by gross domestic product ($ billions) and aggregate price (consumer price index). The Microsoft Excel output of this regression is partially reproduced below. SUMMARY OUTPUT Referring to Table 14-3, to test for the significance of the coefficient on gross domestic product, the p-value is a. 0.9999 b. 0.8837 c. 0.8330 d. 0.0001
0.0001
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand increases and it decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model: Y = β0 + β1X + β2X² + ε, where Y = demand (in thousands) and X = retail price per carat. This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis is shown below:
SUMMARY OUTPUT
Regression Statistics
Multiple R       0.994
R Square         0.988
Standard Error   12.42
Observations     12
ANOVA
             df   SS       MS      F     p-value
Regression   2    115145   57573   373   0.0001
Residual     9    1388     154
Total        11   116533
             Coeff      StdError   t Stat   P-value
Intercept    286.42     9.66       29.64    0.0001
Price        -0.31      0.06       -5.14    0.0006
Price Sq     0.000067   0.00007    0.95     0.3647
What is the p-value associated with the test statistic for testing whether there is an upward curvature in the response curve relating the demand (Y) and the price (X)? a. 0.3647 b. 0.0006 c. 0.0001 d. None of the above
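d. None of the above (0.3647 is the two-tailed p-value for Price Sq; testing for upward curvature is a one-tailed test of H1: β2 > 0, so the p-value is 0.3647/2 ≈ 0.1824)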
An economist is interested to see how consumption for an economy (in $ billions) is influenced by gross domestic product ($ billions) and aggregate price (consumer price index). The Microsoft Excel output of this regression is partially reproduced below. SUMMARY OUTPUT Referring to Table 14-3, to test for the significance of the coefficient on aggregate price index, the p-value is a. 0.0001 b. 0.8330 c. 0.8837 d. 0.9999
0.8330
A professor of industrial relations believes that an individual's wage rate at a factory (Y) depends on his performance rating (X1) and the number of economics courses the employee successfully completed in college (X2). The professor randomly selects 6 workers and collects the following information: Referring to Table 14-2, suppose an employee had never taken an economics course and managed to score a 5 on his performance rating. What is his estimated expected wage rate? a. 10.90 b. 12.20 c. 24.87 d. 25.70
12.20
The following table contains the number of complaints received in a department store for the first 6 months of last year. Month Complaints January 36 February 45 March 81 April 90 May 108 June 144 If this series is smoothed using exponential smoothing with a smoothing constant of 1/3, what would be the first term? a. 36 b. 39 c. 42 d. 45
36
The following table contains the number of complaints received in a department store for the first 6 months of last year. Month Complaints January 36 February 45 March 81 April 90 May 108 June 144 If this series is smoothed using exponential smoothing with a smoothing constant of 1/3, what would be the second term? a. 53 b. 45 c. 42 d. 39
39
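The two exponential smoothing cards above follow the usual recursion E1 = Y1 and Ei = W·Yi + (1 − W)·E(i−1). A small sketch with W = 1/3; the helper name `exponential_smoothing` is illustrative, and `Fraction` is used only to keep the arithmetic exact:

```python
from fractions import Fraction

# Monthly complaints, January through June, as given in the question.
complaints = [36, 45, 81, 90, 108, 144]


def exponential_smoothing(series, w):
    """E1 = Y1; each later value is w * Y_i + (1 - w) * E_(i-1)."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


smoothed = exponential_smoothing(complaints, Fraction(1, 3))
print([float(value) for value in smoothed])

# For forecasting, the current (last) smoothed value is the forecast
# for the next period, here July.
july_forecast = smoothed[-1]
```

The first term is simply the January value, 36, and the second is (1/3)(45) + (2/3)(36) = 39.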
A candy bar manufacturer is interested in trying to estimate how sales are influenced by the price of their product. To do this, the company randomly chooses 6 small cities and offers the candy bar at different prices. Using candy bar sales as the dependent variable, the company will conduct a simple linear regression on the data below: Referring to Table 13-2, if the price of the candy bar is set at $2, the estimated mean sales will be a. 30 b. 65 c. 90 d. 100
65
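The prediction above is just the fitted line evaluated at a price of $2. The sketch below assumes the intercept is 161.386 (the table itself is not reproduced here; that number appears only as an option in the slope card) together with the slope of -48.193 from this set:

```python
# Assumed coefficients for Table 13-2: the intercept is inferred from the
# answer options in this set, the slope is the answer to the slope card.
b0 = 161.386
b1 = -48.193


def predict_sales(price):
    """Estimated mean sales from the fitted line: b0 + b1 * price."""
    return b0 + b1 * price


estimated_sales = predict_sales(2.0)
print(round(estimated_sales, 1))
```

Under those assumptions the estimate is 161.386 − 2(48.193), or about 65.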
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand increases and it decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model: Y = β0 + β1X + β2X² + ε, where Y = demand (in thousands) and X = retail price per carat. This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis is shown below:
SUMMARY OUTPUT
Regression Statistics
Multiple R       0.994
R Square         0.988
Standard Error   12.42
Observations     12
ANOVA
             df   SS       MS      F     p-value
Regression   2    115145   57573   373   0.0001
Residual     9    1388     154
Total        11   116533
             Coeff      StdError   t Stat   P-value
Intercept    286.42     9.66       29.64    0.0001
Price        -0.31      0.06       -5.14    0.0006
Price Sq     0.000067   0.00007    0.95     0.3647
What is the correct interpretation of the coefficient of multiple determination? a. 98.8% of the total variation in demand can be explained by the linear relationship between demand and price b. 98.8% of the total variation in demand can be explained by the quadratic relationship between demand and price c. 98.8% of the total variation in demand can be explained by the addition of the square term in price d. 98.8% of the total variation in demand can be explained by just the square term in price
98.8% of the total variation in demand can be explained by the quadratic relationship between demand and price
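The 98.8% figure can be recovered from the ANOVA sums of squares quoted in the gem question, since R² = SSR / SST. A quick check using only numbers from that output:

```python
# Sums of squares and degrees of freedom from the ANOVA table in the question.
ssr, df_regression = 115145, 2   # regression
sse, df_residual = 1388, 9       # residual
sst = 116533                     # total

# Coefficient of multiple determination: share of total variation explained.
r_squared = ssr / sst

# Overall F statistic: mean square regression over mean square error.
f_stat = (ssr / df_regression) / (sse / df_residual)

print(round(r_squared, 3), round(f_stat))
```

Both values agree with the R Square and F entries in the summary output, and SSR + SSE adds up to SST as it should.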
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected. ATTENDANCE = Paid attendance for the game; TEMP = High temperature for the day; WIN% = Team's winning percentage at the time of the game; OPWIN% = Opponent team's winning percentage at the time of the game; WEEKEND = 1 if game played on Friday, Saturday or Sunday, 0 otherwise; PROMOTION = 1 if a promotion was held, 0 if no promotion was held. The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below. The coefficients of multiple determination of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308. What is the correct interpretation for the estimated coefficient for TEMP? a. As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70 b. As the high temperature increases by one degree, the paid attendance will increase by 51.70 c. As the temperature increases by one degree, the paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model d. As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
True or False: Collinearity is present when there is a high degree of correlation between the dependent variable and any of the independent variables.
False
True or False: Each forecast using the method of weighted average smoothing depends on all the previous observations in the time series
False
True or False: In stepwise regression, an independent variable is not allowed to be removed from the model once it has entered into the model
False
True or False: So that we can fit curves as well as lines by regression, we often use mathematical manipulations for converting one variable into a different form. These manipulations are called dummy variables
False
True or False: The stepwise regression approach takes into consideration all possible models
False
A computer software developer would like to use the number of downloads (in thousands) for the trial version of his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the new shareware. Following is the output from a simple linear regression along with the residual plot and normal probability plot obtained from a data set of 30 different sharewares that he has developed: Referring to Table 13-11, which of the following is the correct interpretation for the slope coefficient? a. For each increase of 1 thousand downloads, the expected revenue is estimated to increase by $3.7297 thousand. b. For each decrease of 1 thousand downloads, the expected revenue is estimated to increase by $3.7297 thousand. c. For each decrease of 1 thousand dollars in expected revenue, the expected number of downloads is estimated to increase by 3.7297 thousand. d. For each increase of 1 thousand dollars in expected revenue, the expected number of downloads is estimated to increase by 3.7297 thousand.
For each increase of 1 thousand downloads, the expected revenue is estimated to increase by $3.7297 thousand.
The following Excel tables are obtained when "Score received on an exam (measured in percentage points)" (Y) is regressed on "percentage attendance" (X) for 22 students in a Statistics for Business and Economics course. Referring to Table 13-6, which of the following statements is true? a. If attendance increases by 0.341%, the estimated mean score received will increase by 1 percentage point. b. If attendance increases by 1%, the estimated mean score received will increase by 39.39 percentage points. c. If attendance increases by 1%, the estimated mean score received will increase by 0.341 percentage points. d. If the score received increases by 39.39%, the estimated mean attendance will go up by 1%.
If attendance increases by 1%, the estimated mean score received will increase by 0.341 percentage points.
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below: Referring to Table 14-4, which of the independent variables in the model are significant at the 5% level? a. Income, Size b. Income, Size, School c. Size, School d. Income, School
Income, Size
True or False: The Chancellor of a university has commissioned a team to collect data on students' GPAs and the amount of time they spend bar hopping every week (measured in minutes). He wants to know if imposing much tougher regulations on all campus bars to make it more difficult for students to spend time in any campus bar will have a significant impact on general students' GPAs. His team should use a t test on the slope of the population regression.
True
True or False: In selecting a forecasting model, we should perform a residual analysis.
True
True or False: The goals of model building are to find a good model with the fewest independent variables that is easier to interpret and has lower probability of collinearity.
True
True or False: The principle of parsimony indicates that the simplest model that gets the job done adequately should be used
True
The slope (b1) represents a. the estimated average change in Y per unit change in X b. predicted value of Y when X=0 c. the predicted value of Y d. Variation around the line of regression
a. the estimated average change in Y per unit change in X
A computer software developer would like to use the number of downloads (in thousands) for the trial version of his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the new shareware. Following is the output from a simple linear regression along with the residual plot and normal probability plot obtained from a data set of 30 different sharewares that he has developed: Referring to Table 13-11, which of the following is the correct interpretation for the coefficient of determination? a. 74.67% of the variation in revenue can be explained by the variation in the number of downloads. b. 75.54% of the variation in revenue can be explained by the variation in the number of downloads. c. 74.67% of the variation in the number of downloads can be explained by the variation in revenue d. 75.54% of the variation in the number of downloads can be explained by the variation in revenue
b. 75.54% of the variation in revenue can be explained by the variation in the number of downloads.
The residuals represent a. the difference between the actual Y values and the mean of Y b. the difference between the actual Y values and the predicted Y values c. the square of the slope d. the predicted value of Y for the average X value
b. the difference between the actual Y values and the predicted Y values
A dummy variable is used as an independent variable in a regression model when a. the variable involved is numerical b. the variable involved is categorical c. a curvilinear relationship is suspected d. when 2 independent variables interact
b. the variable involved is categorical
A professor of industrial relations believes that an individual's wage rate at a factory (Y) depends on his performance rating (X1) and the number of economics courses the employee successfully completed in college (X2). The professor randomly selects 6 workers and collects the following information: Referring to Table 14-2, for these data, what is the estimated coefficient for performance rating, b1? a. 9.103 b. 6.932 c. 1.054 d. 0.616
c. 1.054
A professor of industrial relations believes that an individual's wage rate at a factory (Y) depends on his performance rating (X1) and the number of economics courses the employee successfully completed in college (X2). The professor randomly selects 6 workers and collects the following information: Referring to Table 14-2, an employee who took 12 economics courses scores 10 on the performance rating. What is her estimated expected wage rate? a. 10.90 b. 12.20 c. 24.87 d. 25.70
c. 24.87
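Both wage-rate cards for Table 14-2 plug values into the fitted equation Ŷ = b0 + b1·X1 + b2·X2. A sketch using the rounded coefficients given as answers elsewhere in this set (b0 = 6.932, b1 = 1.054, b2 = 0.616); the function name `predict_wage` is illustrative only:

```python
# Rounded coefficients taken from the Table 14-2 answers in this set.
b0 = 6.932   # regression constant
b1 = 1.054   # performance rating
b2 = 0.616   # number of economics courses


def predict_wage(rating, econ_courses):
    """Estimated expected wage rate from the fitted multiple regression."""
    return b0 + b1 * rating + b2 * econ_courses


no_courses = predict_wage(rating=5, econ_courses=0)
twelve_courses = predict_wage(rating=10, econ_courses=12)
print(round(no_courses, 3), round(twelve_courses, 3))
```

With these rounded coefficients the second prediction works out to 24.864; the keyed 24.87 presumably comes from the unrounded Excel coefficients, which are not shown here.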
A professor of industrial relations believes that an individual's wage rate at a factory (Y) depends on his performance rating (X1) and the number of economics courses the employee successfully completed in college (X2). The professor randomly selects 6 workers and collects the following information: Referring to Table 14-2, for these data, what is the value for the regression constant, b0? a. 0.616 b. 1.054 c. 6.932 d. 9.103
c. 6.932
A computer software developer would like to use the number of downloads (in thousands) for the trial version of his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the new shareware. Following is the output from a simple linear regression along with the residual plot and normal probability plot obtained from a data set of 30 different sharewares that he has developed: Referring to Table 13-11, which of the following is the correct null hypothesis for testing whether there is a linear relationship between revenue and the number of downloads? a. H0: b1=0 b. H0: b1≠0 c. H0: B1=0 d. H0: B1≠0
c. H0: B1=0
Which of the following is not an advantage of exponential smoothing? a. It enables us to smooth out cyclical components b. It enables us to perform one-period ahead forecasting c. It enables us to perform more than one-period ahead forecasting d. It enables us to smooth out seasonal components
c. It enables us to perform more than one-period ahead forecasting
The strength of the linear relationship between two numerical variables may be measured by the a. slope b. scatter plot c. coefficient of correlation d. Y-intercept
c. coefficient of correlation
A regression diagnostic tool used to study the possible effects of collinearity is a. the standard error of the estimate b. the slope c. the VIF d. the Y-intercept
c. the VIF
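The VIF for predictor j is 1 / (1 − Rj²), where Rj² is the coefficient of multiple determination of that predictor regressed on all the others. As an illustration, the sketch below applies it to the five Rj² values quoted in the baseball attendance card, assuming "respectively" follows the order in which the predictors are listed there:

```python
# R_j^2 values quoted in the baseball attendance question; the mapping to
# variable names assumes they follow the listed order of the predictors.
r2_by_predictor = {
    "TEMP": 0.2675,
    "WIN%": 0.3101,
    "OPWIN%": 0.1038,
    "WEEKEND": 0.7325,
    "PROMOTION": 0.7308,
}

# Variance inflationary factor: VIF_j = 1 / (1 - R_j^2).
vif = {name: 1 / (1 - r2) for name, r2 in r2_by_predictor.items()}

for name, value in vif.items():
    print(f"{name}: {value:.2f}")
```

A common rule of thumb treats a VIF above 5 (some texts use 10) as a sign of troublesome collinearity; none of these values reaches 5.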
When using the exponentially weighted moving average for purposes of forecasting rather than smoothing a. the previous smoothed value become the forecast b. the next smoothed value becomes the forecast c. the current smoothed value becomes the forecast d. none of the above
c. the current smoothed value becomes the forecast
The method of moving averages is used a. to exponentiate a series b. to plot a series c. to smooth a series d. in regression analysis
c. to smooth a series
A professor of industrial relations believes that an individual's wage rate at a factory (Y) depends on his performance rating (X1) and the number of economics courses the employee successfully completed in college (X2). The professor randomly selects 6 workers and collects the following information: Referring to Table 14-2, for these data, what is the estimated coefficient for the number of economics courses taken, b2? a. 9.103 b. 6.932 c. 1.054 d. 0.616
d. 0.616
In selecting an appropriate forecasting model, the following approaches are suggested: a. Measure the size of the forecasting error b. Perform a residual analysis c. Use the principle of parsimony d. All of the above
d. All of the above
To explain personal consumption (CONS) measured in dollars, data is collected for > A regression analysis was performed with CONS as the dependent variable and ln(CRDTLIM), ln(APR), ln(ADVT), and GENDER as the independent variables. The estimated model was Y = 2.28 - 0.29 ln(CRDTLIM) + 5.77 ln(APR) + 2.35 ln(ADVT) + 0.39 GENDER, with 0 being used as an index for males and 1 as an index for females. What is the correct interpretation for the estimated coefficient for GENDER? a. Holding the effect of the other independent variables constant, mean personal consumption for females is estimated to be 0.39% higher than males. b. Holding the effect of the other independent variables constant, mean personal consumption for males is estimated to be $0.39 higher than females. c. Holding the effect of the other independent variables constant, mean personal consumption for males is estimated to be 0.39% higher than females. d. Holding the effect of the other independent variables constant, mean personal consumption for females is estimated to be $0.39 higher than males.
d. Holding the effect of the other independent variables constant, mean personal consumption for females is estimated to be $0.39 higher than males.
The effect of an unpredictable, rare event will be contained in the ___________ component. a. cyclical b. trend c. seasonal d. irregular
d. irregular
Which of the following statements about moving averages is not true? a. it gives equal weight to all values in the computation b. it is simpler than the method of exponential smoothing c. it can be used to smooth a series d. it gives greater weight to more recent data
d. it gives greater weight to more recent data
In a simple linear regression problem, r and b1 a. are equal b. may have opposite signs c. must have opposite signs d. must have the same sign
d. must have the same sign
In multiple regression, the __________ procedure permits variables to enter and leave the model at different stages of its development. a. backward regression b. best subsets c. forward selection d. stepwise regression
d. stepwise regression
The overall upward or downward pattern of the data in an annual time series will be contained in the ___ component a. irregular b. seasonal c. cyclical d. trend
d. trend