Chapter 9
A researcher estimates the effect on crime rates of spending on police by using city-level data. Which of the following represents simultaneous causality? A. Cities with high crime rates may need a larger police force, and thus more spending. More police spending, in turn, reduces crime. B. More spending on police causes the crime rate to decrease. However, there is large measurement error in spending reports, which makes it difficult to identify this causal relationship. C. There is unobserved officer quality, which is correlated with more spending and lower crime rates. D. The researcher only has data on cities with both high spending on police and low crime rate.
A
An ordinary least squares regression of Y onto X will be internally inconsistent if X is correlated with the error term. A. True. B. False.
A
Each of the five primary threats to internal validity implies that X is correlated with the error term. A. True. B. False.
A
Sample selection bias A. occurs when a selection process influences the availability of data and that process is related to the dependent variable. B. results in the OLS estimator being biased, although it is still consistent. C. is only important for finite sample results. D. is more important for nonlinear least squares estimation than for OLS.
A
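Illustration (a minimal sketch, not part of the original question set): the simulation below assumes a hypothetical model Y = X + u with standard normal X and u, and keeps only observations with Y > 0, so that selection depends on the dependent variable. Under these assumptions the slope in the selected sample should fall well below the true value of 1, while the full-sample slope stays close to it.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    xd = x - x.mean()
    return np.sum(xd * (y - y.mean())) / np.sum(xd**2)

rng = np.random.default_rng(0)
n = 200_000                      # large n so sampling noise is small
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n) # assumed true slope is 1

# Hypothetical selection rule: data are only available when Y > 0
keep = y > 0

slope_full = ols_slope(x, y)
slope_selected = ols_slope(x[keep], y[keep])
print(f"full sample: {slope_full:.3f}, selected sample: {slope_selected:.3f}")
```

Increasing n does not close the gap between the two slopes, which is why the bias is not merely a finite-sample problem.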
A statistical analysis is internally valid if A. all t−statistics are greater than |1.96| B. the statistical inferences about causal effects are valid for the population studied C. the regression R2 > 0.05 D. the population is small, say less than 2,000, and can be observed
B
A study based on OLS regressions is internally valid if A. weighted least squares produces similar results, and the t−statistic is normally distributed in large samples. B. the OLS estimator is unbiased and consistent, and the standard errors are computed in a way that makes confidence intervals have the desired confidence level. C. the errors are homoskedastic, and there are no more than two binary variables present among the regressors. D. you use a two−sided alternative hypothesis, and standard errors are calculated using the heteroskedasticity−robust formula.
B
Simultaneous causality bias A. happens in complicated systems of equations called block recursive systems. B. arises in a regression of Y on X when, in addition to the causal link of interest from X to Y, there is a causal link from Y to X. C. results in biased estimators if there is heteroskedasticity in the error term. D. is also called sample selection bias.
B
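Illustration (a sketch under stated assumptions, not from the original cards): suppose Y = beta1*X + u and, simultaneously, X = gamma1*Y + v, with independent standard normal u and v. The values beta1 = -1 and gamma1 = 0.5 are hypothetical, loosely echoing the police-spending example. Solving the two equations gives the reduced forms used below, and the OLS slope of Y on X should then settle near -0.4 rather than the causal effect of -1.

```python
import numpy as np

def simulate_simultaneity(n=100_000, beta1=-1.0, gamma1=0.5, seed=0):
    """OLS slope of Y on X when causality runs in both directions."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)   # shock to Y
    v = rng.normal(size=n)   # shock to X
    denom = 1.0 - gamma1 * beta1
    # Reduced forms obtained by solving the two structural equations
    x = (gamma1 * u + v) / denom
    y = (beta1 * v + u) / denom
    # X depends on u through the feedback link, so cov(X, u) != 0
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

ols_slope = simulate_simultaneity()
print(f"true beta1 = -1.0, OLS slope = {ols_slope:.3f}")
```

With these parameter values the probability limit of the slope is (gamma1 + beta1) / (gamma1**2 + 1) = -0.4, so a larger sample does not remove the bias.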
The components of internal validity are A. nonstochastic explanatory variables, and prediction intervals close to the sample mean. B. unbiasedness and consistency of the estimator, and desired significance level of hypothesis testing. C. a regression R2 above 0.75 and serially uncorrelated errors. D. a large sample, and BLUE property of the estimator.
B
The reliability of a study using multiple regression analysis depends on all of the following with the exception of A. errors−in−variables. B. presence of homoskedasticity in the error term. C. omitted variable bias. D. external validity.
B
What is the difference between internal validity and external validity? A. A statistical analysis is said to have external validity if the statistical inferences about causal effects are valid for the population being studied. The analysis is said to have internal validity if conclusions can be generalized to other populations and settings. B. A statistical analysis is said to have internal validity if the statistical inferences about causal effects are valid for the population being studied. The analysis is said to have external validity if conclusions can be generalized to other populations and settings. C. A statistical analysis is said to have internal validity if the statistical inferences about causal effects can only be verified by a few researchers. The analysis is said to have external validity if conclusions can be verified by many researchers. D. Internal validity and external validity are equivalent.
B
What is the difference between the population studied and the population of interest? A. The population studied is mandated by the government, while the population of interest is chosen freely. B. The population studied is the population from which the sample was drawn, while the population of interest is the population to which causal inferences from this study are to be applied. C. The population of interest is the population from which the sample was drawn, while the population studied is the population to which causal inferences from this study are to be applied. D. The population studied and the population of interest are always equivalent.
B
What is the trade-off when including an extra variable in a regression? A. Including an extra variable could make estimated coefficients more significant, but it always decreases the regression R2. B. An extra variable could control for omitted variable bias, but it also increases the variance of other estimated coefficients. C. Including an extra variable always makes estimated coefficients more significant, but it makes the results much harder to interpret. D. Including an extra variable always makes estimated coefficients more significant, but it also introduces multicollinearity.
B
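Illustration (a hedged Monte Carlo sketch, not from the original cards): the variance side of this trade-off can be seen by adding a regressor W that is highly correlated with X but does not belong in the model. The setup (true model Y = X + u, correlation 0.9 between X and W, n = 100, 500 replications) is an assumption chosen for illustration only.

```python
import numpy as np

def slope_spread(reps=500, n=100, rho=0.9, seed=0):
    """Sampling std. dev. of the coefficient on x, without and with w."""
    rng = np.random.default_rng(seed)
    short, long_ = [], []
    for _ in range(reps):
        x = rng.normal(size=n)
        # w is correlated with x (correlation rho) but has no effect on y
        w = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = 1.0 * x + rng.normal(size=n)
        X_short = np.column_stack([np.ones(n), x])
        X_long = np.column_stack([np.ones(n), x, w])
        short.append(np.linalg.lstsq(X_short, y, rcond=None)[0][1])
        long_.append(np.linalg.lstsq(X_long, y, rcond=None)[0][1])
    return np.std(short), np.std(long_)

sd_short, sd_long = slope_spread()
print(f"sd without w: {sd_short:.3f}, sd with w: {sd_long:.3f}")
```

Both regressions are centered on the true coefficient here, but the one that includes W is noticeably less precise. If W did affect Y, omitting it would instead bias the coefficient on X, which is the other side of the trade-off.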
When regression models are used for predictions, concerns about: A. unbiased estimation are more important than concerns about external validity. B. external validity are important. C. unbiased estimation and external validity are both equally important. D. unbiased estimation are very important.
B
A researcher estimates a regression using two different software packages. The first uses the homoskedasticity-only formula for standard errors. The second uses the heteroskedasticity-robust formula. The standard errors are very different. Which should the researcher use? A. The homoskedasticity-only standard errors should be used. B. In this case, the homoskedasticity-only and the heteroskedasticity-robust standard errors are equivalent. C. The heteroskedasticity-robust standard errors should be used. D. The researcher would need more information to answer this question.
C
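Illustration (a minimal sketch, not part of the original question set): the two formulas can be computed by hand for a simple regression. The data-generating process below, in which the error's standard deviation is proportional to |X|, is an assumption made for illustration. The robust formula includes the n/(n - 2) degrees-of-freedom adjustment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=n)
u = x * rng.normal(size=n)        # heteroskedastic: Var(u | x) = x**2
y = 1.0 + 2.0 * x + u             # hypothetical true intercept and slope

# OLS estimates for a single regressor
xd = x - x.mean()
sxx = np.sum(xd**2)
b1 = np.sum(xd * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
resid = y - b0 - b1 * x

# Homoskedasticity-only standard error of the slope
se_homo = np.sqrt(np.sum(resid**2) / (n - 2) / sxx)

# Heteroskedasticity-robust standard error of the slope
se_robust = np.sqrt(n / (n - 2) * np.sum(xd**2 * resid**2) / sxx**2)

print(f"slope = {b1:.3f}, homoskedasticity-only SE = {se_homo:.4f}, "
      f"robust SE = {se_robust:.4f}")
```

In this setup the homoskedasticity-only formula understates the sampling uncertainty, so confidence intervals built from it would be too narrow.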
A statistical analysis is internally valid if A. the hypothesized parameter value is inside the confidence interval. B. statistical inference is conducted inside the sample period. C. the statistical inferences about causal effects are valid for the population being studied. D. its inferences and conclusions can be generalized from the population and setting studied to other populations and settings.
C
Applying the analysis from the California test scores to another U.S. state is an example of looking for A. internal validity. B. simultaneous causality bias. C. external validity. D. sample selection bias.
C
Errors−in−variables bias A. is particularly severe when the source is an error in the measurement of the dependent variable. B. becomes larger as the variance in the explanatory variable increases relative to the error variance. C. arises from error in the measurement of the independent variable. D. is only a problem in small samples.
C
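Illustration (a sketch assuming classical measurement error, not from the original cards): let the true regressor X have variance 1 and let the observed regressor be X plus independent noise with variance 1. With a hypothetical true slope of 2, the slope estimated from the mismeasured regressor should shrink toward 2 * 1 / (1 + 1) = 1.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    xd = x - x.mean()
    return np.sum(xd * (y - y.mean())) / np.sum(xd**2)

rng = np.random.default_rng(0)
n = 200_000
x_true = rng.normal(size=n)               # Var(X) = 1
w = rng.normal(size=n)                    # measurement error, Var(w) = 1
x_obs = x_true + w                        # what the researcher observes
y = 2.0 * x_true + rng.normal(size=n)     # assumed true slope is 2

slope_true = ols_slope(x_true, y)
slope_obs = ols_slope(x_obs, y)
print(f"using true X: {slope_true:.3f}, using mismeasured X: {slope_obs:.3f}")
```

The attenuation factor Var(X) / (Var(X) + Var(w)) moves toward 1 as the variance of X grows relative to the measurement error variance, so the bias shrinks in that case and does not vanish as n increases.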
The analysis is externally valid if A. the study has passed a double blind refereeing process for a journal. B. the statistical inferences about causal effects are valid for the population being studied. C. its inferences and conclusions can be generalized from the population and setting studied to other populations and settings. D. some committee outside the author's department has validated the findings.
C
You try to explain the number of IBM shares traded in the stock market per day in 2005. As an independent variable you choose the closing price of the share. This is an example of A. sample selection bias since you should analyze more than one stock. B. invalid inference due to a small sample size. C. simultaneous causality. D. a situation where homoskedasticity−only standard errors should be used since you only analyze one company.
C
Comparing the California test scores to test scores in Massachusetts is appropriate for external validity if A. the student−to−teacher ratio did not differ by more than five on average. B. the two income distributions were very similar. C. Massachusetts also allowed beach walking to be an appropriate P.E. activity. D. the institutional settings in California and Massachusetts, such as organization in classroom instruction and curriculum, were similar in the two states.
D
Correlation of the regression error across observations A. makes the OLS estimator inconsistent, but not unbiased. B. is not a problem in cross−sections since the data can always be "reshuffled." C. results in correct OLS standard errors if heteroskedasticity−robust standard errors are used. D. results in incorrect OLS standard errors.
D
Reliable prediction using multiple regression has all of the following requirements, with the exception of: A. when the aim is to estimate a causal effect, it is important to choose control variables to reduce the threat of omitted variable bias. B. data used to estimate the prediction model and the observations for which the prediction is to be made are drawn from the same distribution. C. if there are many predictors, then there are some estimators that can provide more accurate out-of-sample predictions than OLS. D. you should use homoskedasticity-only standard errors since these are often smaller than heteroskedasticity-robust standard errors.
D
The true causal effect might not be the same in the population studied and the population of interest because A. of differences in characteristics of the population B. of geographical differences C. the study is out of date D. all of the above
D