Exam 2 - Data Analysis Quizzes
Serial correlation in the residuals of a time series regression can occur if you fail to include a relevant lag of the dependent variable as an explanatory variable in the regression. T/F?
True
The current value of the dependent variable cannot also be an explanatory variable in a regression. T/F?
True
Omission of a relevant (significant) explanatory variable can cause serial correlation in time series regression residuals. T/F?
True.
The dependent variable in a regression is the number of minutes each patient at a medical practice spends in the waiting room before their appointment. There are several potential explanatory variables and a constant in the regression. The practice would like to include a set of dummy variables to examine whether waiting time varies depending on the day of the week. The practice is open only on the five days Monday through Friday. To capture day-of-the-week effects, the regression should include:
A set of four separate dummy variables with D1 coded as 1 for Tuesday and 0 for other days, D2 coded as 1 for Wednesday and 0 for other days, D3 coded as 1 for Thursday and 0 for other days, D4 coded as 1 for Friday and 0 for other days.
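As an illustration, here is a minimal sketch of this dummy coding in Python (pandas and statsmodels assumed; the data are hypothetical), with Monday left out as the control day:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical waiting-room data; the practice is open Monday through Friday.
df = pd.DataFrame({
    "wait_minutes": [12, 25, 18, 30, 22, 15, 28, 20, 27, 24],
    "day": ["Mon", "Tue", "Wed", "Thu", "Fri",
            "Mon", "Tue", "Wed", "Thu", "Fri"],
})

# Four dummies, one for each day except Monday (the omitted control category).
for d in ["Tue", "Wed", "Thu", "Fri"]:
    df["D_" + d] = (df["day"] == d).astype(int)

# Each dummy coefficient is the average extra wait relative to Monday.
model = smf.ols("wait_minutes ~ D_Tue + D_Wed + D_Thu + D_Fri", data=df).fit()
print(model.params)
```

Coding all five days as dummies alongside the constant would create perfect multicollinearity (the dummy-variable trap), which is why one day is kept out as the control.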
You cannot use both the current value of an explanatory variable and lags of that same explanatory variable in a regression. T/F?
False
If the plot of a time series displays no consistent upward or downward movement, then an ADF unit root test equation should include:
a constant but no trend
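A minimal sketch of such a test, assuming statsmodels and simulated data; regression="c" includes a constant but no trend (a trending series would use "ct" instead):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Simulate a stationary AR(1) series with no persistent upward or downward drift.
rng = np.random.default_rng(0)
e = rng.normal(size=200)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.5 * y[t - 1] + e[t]

# regression="c": the test equation includes a constant but no trend term.
adf_stat, p_value, *rest = adfuller(y, regression="c", autolag="AIC")
print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")
```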
A population regression equation is: Y = β0 + β1log(X1) + β2X2 + β3X3. Suppose that β3 is equal to 4.2. This coefficient estimate shows that:
after accounting for the influences of X1 and X2, a one point increase in X3 on average results in a 4.2 point increase in Y
A population regression equation is: log(Y) = β0 + β1log(X1) + β2log(X2) + β3X3. Suppose that β3 is equal to 6.8. This coefficient estimate shows that:
after accounting for the influences of X1 and X2, a one point increase in X3 on average results in a 6.8 percent increase in Y
A population regression equation is: Y = β0 + β1log(X1) + β2X2 + β3X3. Suppose that β1 is equal to 1.2. This coefficient estimate shows that:
after accounting for the influences of X2 and X3, a 1% increase in X1 on average results in a 1.2 point increase in Y
A population regression equation is: log(Y) = β0 + β1log(X1) + β2log(X2) + β3X3. Suppose that β1 is equal to 2.5. This coefficient estimate shows that:
after accounting for the influences of X2 and X3, a one percent increase in X1 on average results in a 2.5 percent increase in Y
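A quick numerical check of this log-log interpretation (the coefficient value comes from the question above; the X1 values are purely illustrative):

```python
import numpy as np

beta1 = 2.5
x1_old, x1_new = 100.0, 101.0                      # a 1% increase in X1
change_in_log_y = beta1 * (np.log(x1_new) - np.log(x1_old))
pct_change_in_y = 100 * (np.exp(change_in_log_y) - 1)
print(round(pct_change_in_y, 2))                   # ~2.52, i.e. roughly 2.5%
```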
Suppose that you are running the statistical test of normality. The p-value of the test is 0.35 (35%). You are using a test size of 0.05 (5%). You should conclude that the stochastic errors:
are normally distributed.
The null hypothesis of the LM test for serial correlation is that the regression residuals:
are not serially correlated
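A minimal sketch of the LM (Breusch-Godfrey) test using statsmodels on simulated data; a small p-value would lead you to reject this null:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2 + 3 * x + rng.normal(size=100)

results = sm.OLS(y, sm.add_constant(x)).fit()

# H0: the residuals are not serially correlated (tested here up to 2 lags).
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(results, nlags=2)
print(f"LM p-value = {lm_pvalue:.3f}")
```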
Suppose you find that your regression estimates suffer from heteroscedasticity. As a result, the t-statistics of the beta coefficients and the associated p-values:
are unreliable and may result in incorrect decisions to reject or fail to reject the null hypothesis.
A population regression equation is: y = β0 + β1x + β2x^2 + ε. Suppose that β1 is equal to 1.2 and β2 is equal to -0.086. As x increases, y will:
at first increase but eventually begin to decrease
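The turning point follows from setting the derivative 1.2 + 2(-0.086)x to zero; a quick check with the coefficients given in the question:

```python
beta1, beta2 = 1.2, -0.086
turning_point = -beta1 / (2 * beta2)
print(round(turning_point, 2))   # ~6.98: y rises until about x = 7, then declines
```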
If serial correlation is due to misspecification of the regression equation, then the estimated beta coefficients are:
biased
If the plot of a time series displays consistent upward movement, then an ADF unit root test equation should include:
both a constant and a trend
The p-value of an ADF unit root test is equal to 0.015. This time series:
can be used as either the dependent variable or as the explanatory variable in a regression without differencing it.
The p-value of an ADF unit root test is equal to 0.75. You should:
conclude that the series is nonstationary and difference the series before using it in a regression.
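A minimal sketch of that workflow (statsmodels and pandas assumed; the random walk is simulated):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(2)
y = pd.Series(rng.normal(size=200).cumsum())        # a random walk (nonstationary)

p_value = adfuller(y, regression="c")[1]
if p_value > 0.05:                                   # fail to reject H0: nonstationary
    dy = y.diff().dropna()                           # first difference the series
    print("ADF p-value after differencing:",
          round(adfuller(dy, regression="c")[1], 3))
```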
Heteroscedasticity tends to occur in regression models using:
cross-sectional data
The dependent variable in a regression is the number of minutes each patient at the ER of a medical institution waits before being taken for medical evaluation and care. There are several potential explanatory variables and a constant in the regression. The set of explanatory variables includes "Bristol", a dummy variable equal to 1 for patients visiting the Bristol hospital and 0 otherwise. It also includes "Kingsport", a dummy variable equal to 1 for patients visiting the Kingsport hospital and 0 otherwise. The control (omitted) category for the "Bristol" and "Kingsport" dummies is "Johnson City". Suppose that the coefficient multiplying "Kingsport" is 1.1 with a p-value of 0.36. Test size is 5%. This result implies that, on average, after controlling for the effects of the other explanatory variables, patients visiting the Kingsport ER
do not wait significantly longer than patients visiting the Johnson City ER.
Suppose that a regression equation is correctly specified; however, there is serial correlation in the residuals. In this case:
estimates of the beta coefficients are unbiased; however, the t-statistics and p-values of the coefficients may be incorrect
You should difference a time series if an ADF unit root test:
fails to reject Ho
Suppose that you are running the statistical test of normality. The null hypothesis of the test is that the stochastic errors:
follow a normal distribution.
Suppose that you are examining sales revenue over time. You decide to log the series and then difference it. The differenced log series is approximately the:
growth rate of sales revenue
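A quick illustration with hypothetical revenue figures, comparing the differenced log series with the exact percent change:

```python
import numpy as np
import pandas as pd

revenue = pd.Series([100.0, 103.0, 108.0, 110.0])
growth_approx = 100 * np.log(revenue).diff()   # differenced log series (percent)
growth_exact = 100 * revenue.pct_change()      # exact period-to-period growth rate
print(pd.DataFrame({"approx": growth_approx, "exact": growth_exact}).round(2))
```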
Suppose that you run the statistical test for heteroscedasticity. The null hypothesis of the test is:
homoscedasticity
Suppose that the plot of a time series shows that it is growing over time at a faster rate. In other words, the graph is curved upward and becoming steeper over time. Before running a unit root test you should:
log the series
The dependent variable in a regression is the dollar salary of attorneys in a large metropolitan area. There are several explanatory variables. One explanatory variable is Gender. Gender is a dummy variable with a value of 1 for males and 0 for females. Suppose that the estimated regression coefficient multiplying Gender is equal to 2493.32 with a p-value of 0.0079. Test size is 5%. This result implies that, on average, after controlling for the effects of the other explanatory variables:
male attorneys earn $2493.32 more than female attorneys
Suppose that after running the statistical test you find that the stochastic errors are not normally distributed. In this case the p-values calculated for t-statistics, the F-statistic, and Wald tests:
may be incorrect and cannot be used reliably to reject or fail to reject the null hypothesis.
The null hypothesis of an ADF unit root test is that the time series is:
nonstationary
The p-value of an ADF unit root test is equal to 0.015. You should:
reject Ho and conclude that the series is stationary.
Suppose that you are running an LM test for serial correlation. The p-value of the test statistic is equal to 0.0042. In this case you should:
reject the null hypothesis and conclude that there is serial correlation.
A marketing director would like to estimate the effect of advertising expenditures (AE) on the quantity of his product sold (Q), controlling for the effects of price (P) and average household income (I) in the area. The director also would like to control for whether the store was in the Northeast, Southeast, Midwest, or Western region. He should include:
separate (0,1) dummy variables for three of the regions, keeping one region out of the regression as a control.
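A minimal sketch of building those region dummies in pandas (the data are hypothetical); drop_first=True omits one region so it serves as the control and the dummy-variable trap is avoided:

```python
import pandas as pd

df = pd.DataFrame({
    "Q":      [120, 95, 130, 105, 110, 98, 125, 102],
    "AE":     [10, 8, 12, 9, 11, 7, 13, 8],
    "P":      [5.0, 5.5, 4.8, 5.2, 5.1, 5.6, 4.9, 5.3],
    "I":      [60, 55, 65, 58, 62, 54, 66, 57],
    "region": ["Northeast", "Southeast", "Midwest", "West",
               "Northeast", "Southeast", "Midwest", "West"],
})

# Three (0,1) dummies; the dropped region is the omitted control category.
dummies = pd.get_dummies(df["region"], prefix="reg", drop_first=True, dtype=int)
df = pd.concat([df, dummies], axis=1)
print(df.columns.tolist())
```

These dummy columns would then enter the regression of Q on AE, P, and I alongside the constant.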
Suppose that you run the statistical test for heteroscedasticity. The p-value of the test is 0.02 (2%). You are using a test size of 0.05 (5%). You should conclude that your regression estimates:
suffer from heteroscedasticity.
One statistical test for heteroscedasticity is:
the Breusch-Pagan-Godfrey (BPG) test.
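A minimal sketch of the BPG test with statsmodels, on simulated data whose error variance grows with the regressor:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, size=200)
y = 1 + 2 * x + rng.normal(scale=x, size=200)   # error variance grows with x

X = sm.add_constant(x)
results = sm.OLS(y, X).fit()

# H0: homoscedasticity; a small p-value points to heteroscedasticity.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, X)
print(f"BPG p-value = {lm_pvalue:.4f}")
```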
One statistical test for non-normality is:
the Jarque-Bera test.
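A minimal sketch of the Jarque-Bera test with statsmodels (the residuals here are simulated from a normal distribution, so the test should not reject):

```python
import numpy as np
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(4)
resid = rng.normal(size=500)                   # stand-in for regression residuals

# H0: the errors follow a normal distribution (tested via skewness and kurtosis).
jb_stat, jb_pvalue, skew, kurt = jarque_bera(resid)
print(f"JB p-value = {jb_pvalue:.3f}")
```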
The dependent variable in a regression is the dollar salary of attorneys in a large metropolitan area. There are several explanatory variables. One explanatory variable is Gender. Gender is a dummy variable with a value of 1 for males and 0 for females. Suppose that the estimated regression coefficient multiplying Gender is equal to 2493.32 with a p-value of 0.23. Test size is 5%. This result implies that, on average, after controlling for the effects of the other explanatory variables:
the salary of male attorneys does not significantly differ from that of female attorneys
Heteroscedasticity occurs when:
the variance of the stochastic disturbances is not constant.
Suppose you are running a simple regression with sales revenue (Revenue) as the dependent variable and advertising expenditures (Expend) as the explanatory variable. A scatterplot shows that Revenue is increasing faster than Expend. You should:
use 100*log(Revenue) as your dependent variable.
The dependent variable in a regression is the number of minutes each patient at the ER of a medical institution waits before being taken for medical evaluation and care. There are several potential explanatory variables and a constant in the regression. The set of explanatory variables includes "Bristol", a dummy variable equal to 1 for patients visiting the Bristol hospital and 0 otherwise. It also includes "Kingsport", a dummy variable equal to 1 for patients visiting the Kingsport hospital and 0 otherwise. The control (omitted) category for the "Bristol" and "Kingsport" dummies is "Johnson City". Suppose that the coefficient multiplying "Bristol" is 12.3 with a p-value of 0.015. Test size is 5%. This result implies that, on average, after controlling for the effects of the other explanatory variables, patients visiting the Bristol ER
wait 12.3 minutes longer than patients visiting the Johnson City ER.