Econometrics


The adjusted R^2, or $\bar{R}^2$, is given by:

$\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\cdot\frac{SSR}{TSS}$
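
As a quick check of the formula above, here is a minimal Python sketch. The function name `adjusted_r2` and the sample numbers (SSR = 60, TSS = 100, n = 100, k = 3) are made up for illustration and do not come from any data set in these cards.

```python
def adjusted_r2(ssr: float, tss: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - ((n - 1) / (n - k - 1)) * (SSR / TSS)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than k + 1")
    return 1.0 - (n - 1) / (n - k - 1) * (ssr / tss)

# Hypothetical inputs: 100 observations, 3 regressors.
r2 = 1.0 - 60.0 / 100.0                      # ordinary R^2
adj = adjusted_r2(ssr=60.0, tss=100.0, n=100, k=3)
print(round(r2, 4), round(adj, 4))
```

Because (n − 1)/(n − k − 1) is greater than 1 whenever k ≥ 1, the adjusted value is always below the ordinary R^2.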

To measure the fit of the probit​ model, you​ should:

use the "fraction correctly predicted" or the "pseudo-R^2."

In the case of a simple​ regression, where the independent variable is measured with i.i.d.​ error

$\hat{\beta}_1 \xrightarrow{p} \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2}\,\beta_1$.

The TSLS estimator​ is:

consistent and has a normal distribution in large samples.

Consider the regression model Wage=β0+β1Female+u Where Female ​(=1 if​ female) is an indicator variable and u the error term. Identify the dependent and independent variables in the regression model above. Wage is the __________​variable, while Female is the__________ variable.

(1) dependent (2) independent

Workers in the South earn $_____ more per hour than workers in the West, on average, controlling for other variables in the regression.

0.28 (Column 3 South)

Consider the following regression output where the dependent variable is test scores and the two explanatory variables are the student-teacher ratio and the percent of English learners: Test Score (hat)= 698.9-1.10 *STR -0.650 * PctEL

0.43 (1.10/2.56)

The critical value of $F_{4,\infty}$ at the 5% significance level is:

2.37
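
One way to see where this value comes from, as a sketch: with infinite denominator degrees of freedom, an $F_{q,\infty}$ random variable is a chi-squared with q degrees of freedom divided by q. The 9.49 below is the standard tabulated 5% critical value of the chi-squared distribution with 4 degrees of freedom.

```latex
F_{4,\infty} = \frac{\chi^2_4}{4},
\qquad
\chi^2_{4,\,0.95} \approx 9.49
\;\Rightarrow\;
F_{4,\infty,\,0.95} \approx \frac{9.49}{4} \approx 2.37 .
```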

Based on the multiple regression results in part (2), we could say that

27.9% of the variation in school is explained by the variables

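Consider the estimated equation from your textbook: TestScore = 698.9 − 2.28 × STR, R^2 = 0.051, SER = 18.6, with standard errors (10.4) and (0.52) reported for the intercept and the slope. The t-statistic for the slope is approximately:

4.38 in absolute value (−2.28/0.52 ≈ −4.38)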

In the model $\ln(Y_i) = \beta_0 + \beta_1 X_i + \mu_i$, the elasticity of E(Y|X) with respect to X is: A. cannot be calculated because the function is nonlinear. B. $\beta_1 X$. C. $\frac{\beta_1 X}{\beta_0 + \beta_1 X}$. D. $\beta_1$.

B. $\beta_1 X$.
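
A short derivation sketch of this answer. It differentiates the log-linear model holding the error fixed, which is a simplifying assumption made here for illustration (it matches the elasticity of E(Y|X) when the conditional mean of $e^{\mu}$ does not depend on X).

```latex
\ln Y = \beta_0 + \beta_1 X + \mu
\;\Rightarrow\;
\frac{d \ln Y}{dX} = \frac{1}{Y}\frac{dY}{dX} = \beta_1
\;\Rightarrow\;
\text{elasticity} = \frac{dY}{dX}\cdot\frac{X}{Y} = \beta_1 X .
```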

Let W be the included exogenous variables in a regression function that also has endogenous regressors (X). The W variables can: A. make an instrument uncorrelated with μ. B. have the property $E(\mu_i \mid W_i) = 0$. C. be control variables. D. all of the above.

D) All of the above

In an instrumental variable regression model with one​ regressor, Xi​, and two​ instruments, Z1i and Z2i​, the value of the J​-statistic is J​ = 18.5. Which of the following statements is​ correct?

D. $E(u_i \mid Z_{1i}, Z_{2i}) \neq 0$, but there is insufficient information to infer whether $E(u_i \mid Z_{1i}) \neq 0$.

Suppose that a state offered voluntary standardized tests to all its third graders and that these data were used in a study of class size on student performance. Which of the following would generate selection​ bias?

Schools with​ higher-achieving students could be more likely to volunteer to take the test.

Consider a panel data set and the following regression model. Yit=β0+β1Xit+uit What does subscript i refer​ to?

Subscript i identifies the entity

Consider a panel data set and the following regression model. Yit=β0+β1Xit+uit What do subscripts i and t refer​ to?

Subscripts i and t identify the entity and time period respectively.

Consider a panel data set and the following regression model. Yit=β0+β1Xit+uit What does subscript t refer​ to?

Subscript t identifies the time period.

The Boston HMDA data set was collected by researchers at the Federal Reserve Bank of Boston. The data set combines information from mortgage applications and a follow-up survey of the banks and other lending institutions that received these mortgage applications. The data pertain to mortgage applications made in 1990 in the greater Boston metropolitan area. The full data set has 2925 observations, consisting of all mortgage applications by blacks and Hispanics plus a random sample of mortgage applications by whites. [Table of regression results not shown]

The marginal effect in column​ (1) is the estimated​ coefficient, whereas the marginal effects in columns​ (2) and​ (3) are not the estimated coefficients directly.

Why is the regressor West omitted from the regression? What would happen if it were included?

The regressor West is omitted to avoid perfect multicollinearity. If West were included, the OLS estimator could not be computed.

HAC standard errors and clustered standard errors are related as​ follows:

clustered standard errors are one type of HAC standard error.

A researcher plans to study the causal effect of police on crime using data from a random sample of U.S. counties. He plans to regress the county's crime rate on the (per capita) size of the county's police force. Why is this regression likely to suffer from omitted variable bias?

There are other important determinants of a county's crime rate, including demographic characteristics of the population, that if left out of the regression would bias the estimated partial effect of the (per capita) size of the county's police force.

The coefficient on DadColl from the regression in part (2) indicates that

This person would be expected to have 0.696 more years of schooling than the same person whose father did not have a college degree

Suppose you are interested in investigating the wage gender gap using data on earnings of men and women. Which of the following models best serves this​ purpose?

Wage=β0+β1Female+u​, where Female ​(=1 if​ female) is an indicator variable and u the error term.

The difference between an unbalanced and a balanced panel is that

an unbalanced panel contains missing observations for at least one time period or one entity.

Sally is a 26-year-old female college graduate. Betsy is a 42-year-old female college graduate. Construct a 95% confidence interval for the expected difference between their earnings. The 95% confidence interval for the expected difference between their earnings is (_______, ________).

Age difference = 42 − 26 = 16 years. 95% confidence interval = age difference × (Age coefficient in column 4 ± 1.96 × its standard error) = 16 × (0.28 ± 1.96 × 0.04) = (3.23, 5.73)
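
The arithmetic in this answer can be reproduced with a few lines of Python. This is only a sketch: it assumes, as the answer does, that 0.28 is the Age coefficient and 0.04 its standard error in column 4, and the helper name `scaled_ci` is made up for illustration.

```python
def scaled_ci(coef: float, se: float, scale: float, z: float = 1.96):
    """CI for scale * beta, given an estimate of beta and its standard error."""
    low = scale * (coef - z * se)
    high = scale * (coef + z * se)
    return low, high

age_gap = 42 - 26                      # Betsy's age minus Sally's age
low, high = scaled_ci(coef=0.28, se=0.04, scale=age_gap)
print(f"({low:.2f}, {high:.2f})")
```

Scaling both endpoints by 16 works because the two women differ only in age, so the expected earnings difference is 16 times the Age coefficient.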

The homoskedasticity-only F-statistic and the heteroskedasticity-robust F-statistic typically are:

different

When testing joint​ hypotheses, you can​ use:

either the F​-statistic or the​ chi-squared statistic.

Threats to internal validity lead​ to

failures of one or more of the least squares assumptions.

Panel data is also called

longitudinal data.

The logic of control variables in IV​ regression:

parallels the logic of control variables in OLS.

Nonlinear least​ squares:

solves the minimization of the sum of squared predictive mistakes through sophisticated mathematical​ routines, essentially by​ trial-and-error methods.

Probit coefficients are typically estimated​ using:

the method of maximum likelihood.

A statistical analysis is internally valid​ if:

the statistical inferences about causal effects are valid for the population studied.

If the estimates of the coefficients of interest change substantially across specifications

then this often provides evidence that the original specification had omitted variable bias

The notation for panel data is ​(Xit​, Yit​), i​ = 1, ​..., n and t​ = 1, ​..., T because

there are n entities and T time periods.

In the multiple regression model, the SER is given by

$SER = \sqrt{\frac{1}{n-k-1}\sum_{i=1}^{n}\hat{u}_i^2}$
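
A minimal sketch of the SER computation: the SER is the square root of the sum of squared residuals divided by n − k − 1. The function name `ser` and the residual values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def ser(residuals, k: int) -> float:
    """Standard error of the regression with k regressors plus an intercept."""
    n = len(residuals)
    ssr = sum(u * u for u in residuals)      # sum of squared residuals
    return math.sqrt(ssr / (n - k - 1))      # degrees-of-freedom adjustment

# Hypothetical residuals from a regression with one regressor (k = 1).
example_residuals = [1.0, -1.0, 2.0, -2.0, 0.0, 0.0]
example_ser = ser(example_residuals, k=1)
print(round(example_ser, 4))
```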

What is the difference in the expected hourly earnings of a 25-year old male with a college degree as compared to a 30-year old female with a college degree?

0.74

Open the Excel data set CPS08, described in Empirical Exercise 4.1. The variables are described in the Word file CPS_Description.docx. Regress average hourly earnings (AHE) on age, female, and bachelor.
1) If age increases from 35 to 36, by how much is AHE expected to change? A. $0.740 B. $0.370 C. $0.585 D. $0.058
2) Re-run the regression in part (1) but use the natural logarithm of AHE as the dependent variable. The effect of an increase in age from 35 to 36 is? A. 0.027 dollars B. 2.73 percent C. 0.027 years D. 0.273 percent
3) Re-run the regression in part (2) but use the natural logarithm of age (ln age) instead of age. Remember the dependent variable is ln AHE. The expected change in AHE given an increase in age from 35 to 36 is A. 2.26 percent B. 0.026 percent C. 2.62 dollars D. 0.262 percent
4) Re-run the regression in part (1) but add the square of age to the model (include both age and age^2). What is the expected AHE of a 40-year-old woman with a college degree? A. $26.23 B. $29.24 C. $25.51 D. $22.70
5) Based on your results in part (4) you would conclude that the quadratic in age is A. an unnecessary addition because the natural logarithm of age used in part (2) is superior B. an unnecessary addition because the coefficient on age^2 is not statistically significant at the 5% level C. an appropriate functional form because the sum of the coefficients on age and age^2 is larger than the coefficient on age when entered alone as in part (1) D. an appropriate functional form because the coefficient on age^2 is statistically significant at the 5% level
6) Age is a rough proxy for experience. It has been proposed that women of the same age and maybe the same experience as men earn less every year they age. Regress ln AHE on bachelor, female, age, and the interaction of age and female. Based on this regression you conclude that A. Women's wages increase 1.68% less than men's wages for every increment in age B. Women's wages increase 30.98% less than men's wages for every increment in age C. Women's wages increase 3.46% less than men's wages for every increment in age D. Women's wages increase 3.46% less than men's wages for every increment in age
7) Based on the results in part (7), there is no statistical support for the proposition that women's wages increase less than men's wages for each year that they age. A. False, the coefficient on age is positive and statistically significant B. True, the coefficient on the interaction term is statistically insignificant C. True, the coefficient on the interaction term is irrelevant to the question D. False, the coefficient on the interaction term is negative and statistically significant
8) The effect of age on ln AHE is different for high school graduates than for college graduates? A. True, the t-statistic on the interaction term is greater than 1.96 B. All the above C. True, the p-value on the interaction term is less than 0.05 so it's statistically significant D. True, college graduates earn 1.7% more per year of age than do high school graduates
9) Based on the results in part (9), the expected wage (in logs) of a 40-year-old woman with a bachelor's degree is A. 4.40 B. 3.30 C. 2.63 D. 3.02

1) $0.585
2) 2.73 percent
3) 2.26 percent
4) $26.23
5) an unnecessary addition because the coefficient on age^2 is not statistically significant at the 5% level
6) Women's wages increase 1.68% less than men's wages for every increment in age
7) False, the coefficient on the interaction term is negative and statistically significant
8) All the above
9) 3.30

How many years of schooling would a black female be expected to have if she has a base year test score of 50; a father who went to college; is from a family with income greater than $25,000 that owns its home; she is from a county where the unemployment rate is 6.0 and the state hourly wage in manufacturing is $8.00; and she lives 100 miles from the nearest 4-year college?

14.68

Imagine you regressed earnings of individuals on a constant, a binary variable ("Male") which takes on the value 1 for males and is 0 otherwise, and another binary variable ("Female") which takes on the value 1 for females and is 0 otherwise. Because females typically earn less than males, you would expect A) the coefficient for Male to have a positive sign, and for Female a negative sign. B) both coefficients to be the same distance from the constant, one above and the other below. C) none of the OLS estimators exist because there is perfect multicollinearity. D) this to yield a difference in means statistic

C) none of the OLS estimators exist because there is perfect multicollinearity.

The t-statistic for the male-female earnings difference estimated from this regression is _______. Is the male-female earnings difference estimated from this regression statistically significant at the 5% level? Since the absolute value of the t-statistic is ________ than the critical value for 95% confidence, the male-female earnings difference estimated from this regression _____ statistically significant at the 5% level.

t = −2.72/0.21 = −12.95 (the Female coefficient in column 1 divided by its standard error); greater; is

The interpretation of the slope coefficient in the model $\ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + \mu_i$ is as follows:

a​ 1% change in X is associated with a β1​% change in Y.

In the multiple regression model, the t-statistic for testing that the slope is significantly different from zero is calculated

by dividing the estimate by its standard error

Imagine that you were told that the t−statistic for the slope coefficient of the regression line TestScore= 698.9 − 2.28 × STR was 4.38. What are the units of measurement for the t−​statistic?

standard deviations

Instrumental variables regression uses instruments​ to:

isolate the movements in X that are uncorrelated with μ.

Rerun the regression in part (2), but drop the variable bachelor. What happens to the coefficient on female?

It falls by $1.13 in absolute value, which suggests that there is omitted variable bias when bachelor is excluded.

The probit​ model: A. is the same as the logit model. B. always gives the same fit for the predicted values as the linear probability model for values between 0.1 and 0.9. C. forces the predicted values to lie between 0 and 1. D. should not be used since it is too complicated.

C) forces the predicted values to lie between 0 and 1.

For W to be an effective control variable in IV​ estimation, the following condition must​ hold:

E(μi|Zi, Wi)=E(μi|Wi).

A researcher is using a panel data set on n = 1000 workers over T = 10 years (from 2001 through 2010) that contains the workers' earnings, gender, education, and age. The researcher is interested in the effect of education on earnings. Determine whether each of the following is an example of unobserved person-specific or time-specific variables that are correlated with both education and earnings. 1. Unobserved ability _______(a)_____ 2. Unemployment level ______(b)_______ 3. Unobserved motivation ______(c)_______ 4. Unobserved household environment ________(d)________ 5. GDP growth ________(e)_________ (f) How would you control for these person-specific and time-specific effects in a panel data regression?

a) Person-specific b) Time-specific c) Person-specific d) Person-specific e) Time-specific f) Include person-specific and time-specific variables in the regression.

In their study of the effectiveness of cardiac​ catheterization, McClellan,​ McNeil, and Newhouse​ (1994) used as an instrument the difference in distance to cardiac catheterization and regular hospitals. a) How could you determine whether this instrument is​ relevant? b) How could you determine whether this instrument—the difference in distance to cardiac catheterization and regular hospitals—is ​exogenous?

a) Use both​ (a) and​ (c). b) Since there is one endogenous regressor and one​ instrument, the ​J-test cannot be used to test the exogeneity of the instruments. Expert judgment is required to assess the exogeneity.

Consider the following regression function to answer the questions below. (GRAPH WITH CURVE PICTURE) (a) Which of the following specifies a nonlinear regression that model this​ shape? (b) Which of the following economic relationships may exhibit a shape like​ this? ​(Check all that apply​)

a) $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i$. b) - The relationship between wage earnings and years of experience. - The relationship between income and fertility. - The relationship between time spent studying for an exam and the grade on that exam.

Construct a 95% confidence interval for the male-female earnings difference. The 95% confidence interval for the male-female earnings difference is (_______, ______).

Female coefficient in column 1 ± 1.96 × its standard error: −2.72 ± 1.96 × 0.21 = (−3.13, −2.31)
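
The t-statistic and confidence interval for the Female coefficient can be checked together. This sketch takes −2.72 and 0.21 from the answers on these cards as the coefficient and its standard error; the variable names are illustrative only.

```python
coef_female = -2.72    # Female coefficient, column 1 (from the card)
se_female = 0.21       # its standard error (from the card)
z_crit = 1.96          # two-sided 5% critical value, large-sample normal

t_stat = coef_female / se_female
ci_low = coef_female - z_crit * se_female
ci_high = coef_female + z_crit * se_female
significant = abs(t_stat) > z_crit

print(f"t = {t_stat:.2f}, CI = ({ci_low:.2f}, {ci_high:.2f}), significant: {significant}")
```

The interval excludes zero exactly when |t| exceeds 1.96, so the two cards always agree on significance.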

In the binary dependent variable​ model, a predicted value of 0.6 means that

given the values for the explanatory​ variables, there is a 60 percent probability that the dependent variable will equal one.

The fixed effects regression​ model:

has n different intercepts.

The question of​ reliability/unreliability of a multiple regression depends​ on:

internal and external validity.

The binary dependent variable model is an example of a

limited dependent variable model.

A "Cobb-Douglas" production function relates production (Q) to factors of production, capital (K), labor (L), raw materials (M), and an error term u using the equation $Q = \lambda K^{\beta_1} L^{\beta_2} M^{\beta_3} e^{u}$, where λ, β1, β2, and β3 are production parameters. Suppose that you have data on production and the factors of production from a random sample of firms with the same Cobb-Douglas production function. Which of the following regression functions provides the most useful transformation to estimate the model?

logarithmic regression function.
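
To see why the logarithmic form is the useful one, take natural logs of both sides of the production function. The relabeling of $\ln\lambda$ as an intercept $\beta_0$ is notation introduced here for illustration.

```latex
\ln Q = \ln\lambda + \beta_1 \ln K + \beta_2 \ln L + \beta_3 \ln M + u
      = \beta_0 + \beta_1 \ln K + \beta_2 \ln L + \beta_3 \ln M + u,
\qquad \beta_0 \equiv \ln\lambda .
```

The transformed model is linear in the parameters, so it can be estimated by OLS, and each slope is the elasticity of output with respect to that factor.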

The OLS estimator is derived by

minimizing the sum of squared residuals.

Suppose you are interested in studying the relationship between education and wage. More specifically, suppose that you believe the relationship to be captured by the following linear regression model: Wage = β0 + β1 Education + u. Suppose further that the only unobservable that can possibly affect both wage and education is the intelligence of the individual.
OLS assumption (1): The conditional distribution of u_i given X_i has a mean of zero. Mathematically, $E(u_i \mid X_i) = 0$.
(a) Which of the following provides evidence in favor of OLS assumption #1? (Check all that apply)
(b) Which of the following provides evidence against OLS assumption #1? (Check all that apply)
OLS assumption (2): $(X_i, Y_i)$, $i = 1, \ldots, n$, are independently and identically distributed.
(c) Suppose you would like to draw a sample to study the effect of education on wage. Which of the following provides evidence in favor of OLS assumption #2? (Check all that apply)
(d) Suppose you would like to draw a sample to study the effect of education on wage. Which of the following provides evidence against OLS assumption #2? (Check all that apply)
OLS assumption (3): Large outliers are unlikely. Mathematically, X and Y have nonzero finite fourth moments: $0 < E(X_i^4) < \infty$ and $0 < E(Y_i^4) < \infty$.
(e) Suppose you would like to draw a sample to study the effect of education on wage. Which of the following provides evidence in favor of OLS assumption #3? (Check all that apply)
(f) Suppose you would like to draw a sample to study the effect of education on wage. Which of the following provides evidence against OLS assumption #3? (Check all that apply)

(a) 1 answer: E(Intelligence | Education = x) = E(Intelligence | Education = y) for all x ≠ y.
(b) 2 answers: corr(Intelligence, Education) ≠ 0; cov(Intelligence, Education) ≠ 0.
(c) 1 answer: A random sample is drawn from a population of college graduates.
(d) 2 answers: A sample consisting of all honor students is drawn from a population of college graduates; observations consisting of the same group of college students are drawn repeatedly each year over the course of their college careers.
(e) 2 answers: The maximum wage an individual can get is a finite number; the years of education an individual can get is bounded above.
(f) 2 answers: Half of the wages in the sample were incorrectly multiplied by 1 million when recorded; for some individuals in the sample, years of education were recorded in days rather than years.

(a) Consider a man with 17 years of education and 5 years of experience who is from a western state. Use the results from column (4) of the table and the method in Key Concept 8.1 to estimate the expected change in the logarithm of average hourly earnings (AHE) associated with an additional year of experience. The expected change in the logarithm of average hourly earnings (AHE) associated with an additional year of experience is ______(a)_____%. (Round your response to two decimal places.)
(b) Consider a man with 17 years of education and 11 years of experience who is from a western state. Use the results from column (4) of the table and the method in Key Concept 8.1 to estimate the expected change in the logarithm of average hourly earnings (AHE) associated with an additional year of experience. The expected change in the logarithm of average hourly earnings (AHE) associated with an additional year of experience is _____(b)_____%. (Round your response to two decimal places.)
(c) Why are the answers to Scenario A and Scenario B different?
(d) The t-statistic for the difference between the effects in Scenario A and Scenario B is _______(d)______. (Round your response to two decimal places.)
(e) Is the difference between the effects in Scenario A and Scenario B statistically significant at the 5% level? (Y/N)
(f) How would you change the regression if you suspected that the effect of experience on earnings was different for men than for women?

(a) 1.22% (b) 0.98% (c) The regression is nonlinear in experience. (d) 7.80 (the coefficient on (Potential experience)^2 in column (4) divided by its standard error, in absolute value) (e) Yes (f) Include interaction terms Female × Potential experience and Female × (Potential experience)^2.

In this exercise, you will use these data to investigate the relationship between the number of completed years of education for young adults and the distance from each student's high school to the nearest four-year college. (Proximity lowers the cost of education, so that students who live closer to a four-year college should, on average, complete more years of higher education.) The following table contains data from a random sample of high school seniors interviewed in 1980 and re-interviewed in 1986. Use a statistical package of your choice to answer the following questions. Suppose you are interested in estimating the following model: ED = β0 + β1 Dist + u. Run a regression of years of completed education (ED) on distance to the nearest college (Dist), where Dist is measured in tens of miles. (For example, Dist = 2 means that the distance is 20 miles.)
(a) What is the estimated intercept β0?
(b) What is the estimated slope β1?
(c) Is the estimated intercept β0 meaningful in this case? (Y/N)
(d) How does the average value of years of completed schooling change when colleges are built close to where students go to high school?
(e) Bob's high school was 39 miles from the nearest college. Predict Bob's years of completed education using the estimated regression.
(f) John's high school was 15 miles from the nearest college. Predict John's years of completed education using the estimated regression.
(g) Compute the R^2 for the regression above.
(h) Does distance to college explain a large fraction of the variance in educational attainment across individuals? (Y/N)
(i) Compute the value of the standard error of the regression and specify its units: The standard error of the regression (SER) is _________ __________.

(a) 13.824 (b) 0.012 (c) Yes (d) The regression predicts that if colleges are built 10 miles closer to where students go to high​ school, average years of college will decrease by 0.012 years. (e) 13.87 (f) 13.84 (g) 0.0002 (h) No (i) 3.7825, years
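
Parts (e) and (f) follow directly from the estimates in (a) and (b). The sketch below uses the intercept 13.824 and slope 0.012 from this card; the function name `predict_ed` is made up for illustration. Note the conversion from miles to tens of miles, since that is how Dist is measured.

```python
INTERCEPT = 13.824     # estimated intercept from part (a)
SLOPE = 0.012          # estimated slope from part (b), per 10 miles

def predict_ed(distance_miles: float) -> float:
    """Predicted years of completed education; Dist is in tens of miles."""
    dist = distance_miles / 10.0
    return INTERCEPT + SLOPE * dist

bob = predict_ed(39)    # Bob: 39 miles from the nearest college
john = predict_ed(15)   # John: 15 miles from the nearest college
print(f"Bob: {bob:.2f}, John: {john:.2f}")
```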

Suppose that a​ researcher, using data on class size ​(CS​) and average test scores from 94 ​third-grade classes, estimates the OLS regression TestScore=567.236+(−6.3438)×CS, R2=0.09, SER=12.5. (a) A classroom has 20 students. The​ regression's prediction for that​ classroom's average test score is ___________ ​(Round your response to two decimal places.​) (b)Last year a classroom had 17 ​students, and this year it has 21 students. The​ regression's prediction for the change in the classroom average test score is ______________ ​(Round your response to two decimal places.​) (c) The sample average class size across the 94 classrooms is 23.33. The sample average of the test scores across the 94 classrooms is ______________​(​Hint: Review the formulas for the OLS​ estimators.) ​(Round your response to two decimal places.​) (d) The sample standard deviation of test scores across the 94 classrooms is ____________

(a) 440.36 (b) -25.38 (c) 419.24 (d) 13.1
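
Parts (a) through (c) can be verified with the estimated coefficients from the question. Part (c) relies on the fact that the OLS regression line passes through the point of sample means, so the mean test score equals the intercept plus the slope times the mean class size. The function name `predict_score` is illustrative.

```python
B0 = 567.236     # estimated intercept from the question
B1 = -6.3438     # estimated slope on class size (CS)

def predict_score(class_size: float) -> float:
    """Predicted average test score for a given class size."""
    return B0 + B1 * class_size

pred_20 = predict_score(20)            # (a) classroom with 20 students
change = B1 * (21 - 17)                # (b) class size rises from 17 to 21
mean_score = predict_score(23.33)      # (c) line passes through the means

print(f"{pred_20:.2f}, {change:.2f}, {mean_score:.2f}")
```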

The estimated regression is Yi=41+0.56Xi Compute the estimated​ regression's prediction for the average score of students given 95​, 127​, or 152 minutes to complete the exam. (a) Given 95 ​minutes, the estimated​ regression's prediction for the average score of students is ______________ (b) Given 127 ​minutes, the estimated​ regression's prediction for the average score of students is _________________ (c) Given 152 ​minutes, the estimated​ regression's prediction for the average score of students is _______________ (d) Compute the estimated gain in score for a student who is given an additional 46 minutes on the exam. The estimated gain in score for a student who is given an additional 46 minutes on the exam is ___________

(a) 94.2 (b) 112.12 (c) 126.12 (d) 25.76 (0.56 × 46)

In this exercise, you will investigate the relationship between earnings and height. These data are taken from the US National Health Interview Survey for 1994. Use a statistical package of your choice to answer the following questions. Run a regression of Earnings on Height.
(a) Is the estimated slope statistically significant?
(b) Construct a 95% confidence interval for the slope coefficient using heteroskedasticity-robust standard errors. The 95% confidence interval for the slope coefficient is [_____________, __________________]. (Round your responses to three decimal places)
(c) Run a regression of Earnings on Height using data for female workers only. Is the estimated slope statistically significant?
(d) Construct a 95% confidence interval for the slope coefficient using heteroskedasticity-robust standard errors. The 95% confidence interval for the slope coefficient is [___________, _______________]. (Round your responses to three decimal places)
(e) Run a regression of Earnings on Height using data for male workers only. Is the estimated slope statistically significant?
(f) Construct a 95% confidence interval for the slope coefficient using heteroskedasticity-robust standard errors. The 95% confidence interval for the slope coefficient is [_______________, _____________]. (Round your responses to three decimal places)
(g) Can you reject the null hypothesis that the effect of height on earnings is the same for men and women?

(a) No (b) -926.980, 2150.005 (c) No (d) -10923.158, 5058.995 (e) No (f) -8550.075, 22665.640 (g) No

A professor decides to run an experiment to measure the effect of time pressure on final exam scores. He gives each of the 400 students in his course the same final exam, but some students have 90 minutes to complete the exam while others have 120 minutes. Each student is randomly assigned one of the examination times based on the flip of a coin. Let Y_i denote the number of points scored on the exam by the ith student (0 ≤ Y_i ≤ 100), let X_i denote the amount of time that the student has to complete the exam (X_i = 90 or 120), and consider the regression model $Y_i = \beta_0 + \beta_1 X_i + u_i$, $E(u_i) = 0$. (a) Which of the following are true about the unobservable u_i? (Check all that apply)

(a) Two are correct: - u_i represents factors other than time that influence the student's performance on the exam. - Different students will have different values of u_i because they have unobserved individual-specific traits that affect exam performance.

Suppose you are interested in studying the relationship between education and wage. More specifically, suppose that you believe the relationship to be captured by the following linear regression model: Wage = β0 + β1 Education + u. Suppose further that you estimate the unknown population linear regression model by OLS. (a) What is the difference between $\beta_1$ and $\hat{\beta}_1$? (b) What is the difference between $u$ and $\hat{u}$? (c) What is the difference between the OLS predicted value $\widehat{Wage}$ and E(Wage | Education)?

(a) $\beta_1$ is a true population parameter, the slope of the population regression line, while $\hat{\beta}_1$ is the OLS estimator of $\beta_1$. (b) $u$ represents the deviation of observations from the population regression line, while $\hat{u}$ is the difference between Wage and its predicted value $\widehat{Wage}$. (c) E(Wage | Education) is the expected value of Wage for given values of Education, while $\widehat{Wage}$ is the OLS predicted value of Wage for given values of Education.

Which of the following variables are likely useful to add to the regression to control for important omitted variables?

- the fraction of young males in the county population - the average level of education in the county - the average income per capita of the county

Workers in the Northeast earn $_____ more per hour than workers in the West, on average, controlling for other variables in the regression.

0.72 (column 3 Northeast)

The t-statistic for the college-high school earnings difference estimated from this regression is _______. Is the college-high school earnings difference estimated from this regression statistically significant at the 5% level? Since the absolute value of the t-statistic is _________ than the critical value for 95% confidence, the college-high school earnings difference estimated from this regression _______ statistically significant at the 5% level.

t = 5.62/0.22 = 25.55 (the College coefficient in column 1 divided by its standard error); greater; is

What is the difference between internal validity and external validity​?

A statistical analysis is said to have internal validity if the statistical inferences about causal effects are valid for the population being studied. The analysis is said to have external validity if conclusions can be generalized to other populations and settings.

The t-statistic for the coefficient on Age is _____. The p-value for the preceding t-statistic is _______. Does this imply that age is an important determinant of earnings?

A) 4.67 (the Age coefficient in column 2 divided by its standard error: 0.28/0.06) B) 0.0000 C) Yes, age is an important determinant of earnings because the low p-value implies that the coefficient on age is statistically significant at the 1% level.
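
A sketch of how the t-statistic and its p-value relate, using only the standard library. It assumes the large-sample normal approximation for the t-statistic, under which the two-sided p-value equals `erfc(|t| / sqrt(2))`. The function name `two_sided_p` is illustrative; 0.28 and 0.06 are the values quoted in the answer.

```python
import math

def two_sided_p(t: float) -> float:
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))

t_age = 0.28 / 0.06            # Age coefficient divided by its standard error
p_age = two_sided_p(t_age)

print(f"t = {t_age:.2f}, p = {p_age:.4f}")
```

The p-value is far below 0.01, which is why it rounds to 0.0000 at four decimal places.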

Consider the population regression of log earnings ​[Yi​, where Yi ​= ​ln(Earningsi​)] against two binary​ variables: whether a worker is married ​(D1i​, where D1i ​= 1 if the ith person is​ married) and the​ worker's gender ​(D2i​, where D2i ​= 1 if the ith person is​ female), and the product of the two binary variables Yi=β0+β1D1i+β2D2i+β3D1i×D2i+μi. The interaction​ term: A. does not make sense since it could be zero for married males. B. allows the population effect on log earnings of being married to depend on gender. C. indicates the effect of being married on log earnings. D. cannot be estimated without the presence of a continuous variable.

B. allows the population effect on log earnings of being married to depend on gender.

The Least Squares Assumptions: $Y_i = \beta_0 + \beta_1 X_i + u_i$, $i = 1, \ldots, n$, where 1. The error term u_i has conditional mean zero given X_i: $E(u_i \mid X_i) = 0$; 2. $(X_i, Y_i)$, $i = 1, \ldots, n$, are independent and identically distributed (i.i.d.) draws from their joint distribution; and 3. Large outliers are unlikely: X_i and Y_i have nonzero finite fourth moments. Assuming this year's class is a typical representation of the same class in other years, are OLS assumptions (2) and (3) satisfied?

Both OLS assumption​ #2 and OLS assumption​ #3 are satisfied.

Consider the polynomial regression model of degree r, $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_r X_i^r + \mu_i$. The null hypothesis that the regression is linear and the alternative that it is a polynomial of degree r correspond to: A. H0: β2=0, β3=0,...,βr=0 vs. H1: all βj≠0, j=2,...,r. B. H0: βr=0 vs. H1: βr≠0. C. H0: β2=0, β3=0,...,βr=0 vs. H1: at least one βj≠0, j=2,...,r. D. H0: β1=0 vs. H1: β1≠0.

C. H0 : β2=0, β3=0,...,βr=0 vs. H1 : at least one βj≠0, j=2,...,r.

The OLS residuals, ui (hat), are sample counterparts of the population

Errors.

What is the difference between the population studied and the population of interest​?

The population studied is the population from which the sample was​ drawn, while the population of interest is the population to which causal inferences from this study are to be applied.

Suppose that you have just read a careful statistical study of the effect of advertising on the demand for cigarettes. Using data from New York during the​ 1970s, the study concluded that advertising on buses and subways was more effective than print advertising. Use the concept of external validity to determine if these results are likely to apply to Boston in the​ 1970s; Los Angeles in the​ 1970s; New York in 2010.

The results are likely to apply to Boston in the​ 1970s, but not to Los Angeles in the 1970s or New York in 2010.

Now run a regression of average hourly earnings (AHE) on bachelor, female, and age. Comparing the coefficient on age in part (1) with the coefficient on age when bachelor and female are included, you could conclude that

There is no evidence of omitted variable bias in the simple regression of AHE on age

Consider the regression model Wage= B0 + B1Female + u Where Female(=1 if female) is an indicator variable and u the error term. Identify the dependent and independent variables in the regression model above. Wage is the _____________ variable, while Female is the ______________ variable.

Wage is the dependent variable, while Female is the independent variable.

Do there appear to be important regional differences?

Yes, because wages are not consistent across the region.

An example of a quadratic regression model is

Yi = β0 + β1Xi + β2Xi^2 + ui.

A person with a college degree earns on average $8.03 more than someone with less than a college degree, holding age and gender constant. If the t-statistic and standard error were removed from the output, could you still test the following null: H0: Bbachelor = 4.0?

You could use the 95% confidence interval from which you would reject the null.

Sales in a company are $191 million in 2009 and increase to $200 million in 2010. Compute the percentage increase in sales using the usual formula 100×(Sales(2010)−Sales(2009))/Sales(2009). Compare this value to the approximation 100×[ln(Sales(2010))−ln(Sales(2009))]. 100×(Sales(2010)−Sales(2009))/Sales(2009) = ____(a)_____% 100×[ln(Sales(2010))−ln(Sales(2009))] = ____(b)_____% Now, assume that sales in a company are $191 million in 2009 and increase to $263 million in 2010. 100×(Sales(2010)−Sales(2009))/Sales(2009) = ____(c)_____% 100×[ln(Sales(2010))−ln(Sales(2009))] = ____(d)_____% The approximation performs _______(e)________ when the change is small. The quality of the approximation ______(f)_____ as the percentage change increases.

a) 4.712 b) 4.604 c) 37.696 d) 31.989 e) better f) deteriorates
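
The comparison in this card can be reproduced with a short sketch. The function names are illustrative only; the sales figures (191, 200, 263) are the ones quoted in the question. The last decimal may differ slightly from the card depending on rounding.

```python
import math

def pct_change(old: float, new: float) -> float:
    """Exact percentage change: 100 * (new - old) / old."""
    return 100.0 * (new - old) / old

def log_approx(old: float, new: float) -> float:
    """Log-difference approximation: 100 * [ln(new) - ln(old)]."""
    return 100.0 * (math.log(new) - math.log(old))

for new_sales in (200.0, 263.0):
    exact = pct_change(191.0, new_sales)
    approx = log_approx(191.0, new_sales)
    # The gap between the two measures widens as the change gets larger.
    print(f"191 -> {new_sales:.0f}: exact {exact:.3f}%, log approx {approx:.3f}%, "
          f"gap {exact - approx:.3f}")
```

The gap is about a tenth of a percentage point for the small change and several percentage points for the large one, which is the pattern in answers (e) and (f).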

This problem is inspired by a study of the "gender gap" in earnings in top corporate jobs [Bertrand and Hallock (2001)]. The study compares total compensation among top executives in a large set of U.S. public corporations in the 1990s. (Each year these publicly traded corporations must report total compensation levels for their top five executives.) Let Female be an indicator variable that is equal to 1 for females and 0 for males. A regression of the logarithm of earnings onto Female yields ln(Earnings) = 6.44 − 0.43 Female, SER = 2.48, with standard errors (0.01) and (0.05). Calculate the average hourly earnings for top male and female executives. The hourly earnings for top male executives is $____(a)____ per hour. (Round your response to two decimal places.) The hourly earnings for top female executives is $______(b)_____ per hour. (Round your response to two decimal places.) What is the estimated average difference between earnings of top male executives and top female executives? The estimated average difference between earnings of top male executives and top female executives is $_____(c)_____ per hour. (Round your response to two decimal places.) What is the estimator of the standard deviation of the regression error? The estimator of the standard deviation of the regression error is ____(d)_____ (Round your response to two decimal places.) Calculate the t-statistic for Female. The t-statistic for Female is ______(e)_______. (Round your response to two decimal places.) f) Looking at the t-statistic, does this regression suggest that female top executives earn less than top male executives? (y/n) g) Does this imply that there is gender discrimination? (y/n) Two new variables, the market value of the firm (a measure of firm size, in millions of dollars) and stock return (a measure of firm performance, in percentage points), are added to the regression: ln(Earnings) = 3.86 − 0.28 Female + 0.37 ln(MarketValue) + 0.004 Return, with standard errors (0.03), (0.04), (0.004), (0.003), n = 46,670, R^-2 = 0.345. If MarketValue increases by 1.77%, what is the increase in earnings? If MarketValue increases by 1.77%, earnings increase by _____(h)_____ _______(i)_______. (Round your response to two decimal places.) (j) The coefficient on Female is now −0.28. Why has it changed from the first regression? A. Female is correlated with the two new included variables. B. MarketValue is important for explaining ln(Earnings). C. The first regression suffered from omitted variable bias. D. All of the above. (k) Are large firms more likely to have female top executives than small firms? A. Yes. B. There is no relationship between the genders. C. No.

a) 626.41 b) 407.48 c) 218.93 d) 2.48 e) -8.60 f) Yes g) No h) 0.65 i) % j) all of the above k) no
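
The numerical answers (a), (b), (c), (e), and (h) follow from the reported coefficients. The sketch below assumes only the numbers quoted in the question; variable names are illustrative. The card's answer (c) appears to be the difference of the two rounded values, so the unrounded difference can differ in the last decimal.

```python
import math

# Coefficients from the first regression: ln(Earnings) = 6.44 - 0.43 Female
b0, b_female, se_female = 6.44, -0.43, 0.05

male = math.exp(b0)                 # Female = 0
female = math.exp(b0 + b_female)    # Female = 1
gap = male - female                 # unrounded difference
t_female = b_female / se_female     # t-statistic for H0: coefficient = 0

# Second regression: the coefficient on ln(MarketValue) is an elasticity,
# so a 1.77% rise in MarketValue changes earnings by 0.37 * 1.77 percent.
pct_earnings = 0.37 * 1.77

print(f"male {male:.2f}, female {female:.2f}, gap {gap:.2f}")
print(f"t-statistic {t_female:.2f}, earnings change {pct_earnings:.2f}%")
```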

The true causal effect might not be the same in the population studied and the population of interest​ because

all of the above.

The interpretation of the slope coefficient in the model ​ln(Yi​) ​= β0 ​ + β1 ​ ln(Xi​)+ ui is as​ follows:

a​ 1% change in X is associated with a β1 ​ % change in Y.

The adjusted and unadjusted R^2 from the regression in part (2) are very similar because

(n-1)/(n-k-1) is close to 1.0
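
A quick numerical check of this point, as a sketch: the sample size, number of regressors, and R^2 below are hypothetical and are not taken from the regression in part (2). It uses the adjusted R^2 formula 1 - ((n-1)/(n-k-1))(SSR/TSS), with SSR/TSS = 1 - R^2.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - ((n-1)/(n-k-1)) * (SSR/TSS), where SSR/TSS = 1 - R^2."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)

# Hypothetical values: a large sample with few regressors.
r2, n, k = 0.279, 4000, 3
factor = (n - 1) / (n - k - 1)
print(f"(n-1)/(n-k-1) = {factor:.5f}")
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adjusted_r2(r2, n, k):.4f}")

# With a small sample and more regressors the two measures separate.
print(f"n=20, k=5: adjusted R^2 = {adjusted_r2(0.5, 20, 5):.4f} vs R^2 = 0.5")
```

Because the factor (n-1)/(n-k-1) is at least 1 whenever k is at least 1, the adjusted value is never above the unadjusted one.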

The OLS residuals

can be calculated by subtracting the fitted values from the actual values

The OLS estimators of the coefficients in multiple regression will have omitted variable bias

if an omitted determinant of Yi is correlated with at least one of the regressors.

One of the least squares assumptions in the multiple regression model is that you have random variables which are "i.i.d." This stands for

independently and identically distributed

In the simple linear regression model, the regression slope:

indicates by how many units Y increases, given a one-unit increase in X.

Changing the units of measurement, e.g. measuring test scores in 100s, will do all of the following EXCEPT for changing the

interpretation of the effect that a change in X has on the change in Y

A nonlinear function

is a function with a slope that is not constant.

The dummy variable trap is an example of

perfect multicollinearity

The best way to interpret polynomial regressions is​ to:

plot the estimated regression function and to calculate the estimated effect on Y associated with a change in X for one or more values of X.

To obtain the slope estimator using the least squares principle, you divide the

sample covariance of X and Y by the sample variance of X
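
As a minimal sketch of this formula, the code below computes the slope as the sample covariance divided by the sample variance, then the intercept and the residuals (actual minus fitted values). The data are made up for illustration and the function name is hypothetical.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Slope = sample cov(X, Y) / sample var(X); intercept = Ybar - slope * Xbar."""
    x_bar, y_bar = mean(x), mean(y)
    n = len(x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

# Made-up data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = ols_slope_intercept(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # actual minus fitted
print(slope, intercept, residuals)
```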

If you wanted to test, using a 5% significance level, whether or not a specific slope coefficient is equal to one, then you should:

subtract 1 from the estimated coefficient, divide the difference by the standard error, and then check whether the resulting ratio is larger than 1.96 in absolute value
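
The procedure can be sketched as follows. The estimates and standard errors are hypothetical, and the function names are illustrative; 1.96 is the two-sided critical value used in the answer above.

```python
def t_stat(estimate: float, hypothesized: float, std_error: float) -> float:
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error

def rejects_null(t: float, critical: float = 1.96) -> bool:
    """Two-sided test: reject when |t| exceeds the critical value."""
    return abs(t) > critical

# Hypothetical slope estimates, both with standard error 0.10, tested against 1.
for estimate in (1.30, 1.10):
    t = t_stat(estimate, 1.0, 0.10)
    print(f"estimate {estimate}: t = {t:.2f}, reject H0: slope = 1? {rejects_null(t)}")
```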

Based on the regression in part (8) (AHE on age and female), test the following H0:Bage=Bfemale=0

the F^act statistic is 178.4 so you would reject the null at the 1% level

Internal validity is​ that

the estimator of the causal effect should be unbiased and consistent

The regression R2 is a measure​ of:

the goodness of fit of your regression line.

Comparing the California test scores to test scores in Massachusetts is appropriate for external validity​ if

the institutional settings in California and​ Massachusetts, such as organization in classroom instruction and​ curriculum, were similar in the two states.

Suppose that the crime rate is positively affected by the fraction of young males in the population, and that countries with high crime rates tend to hire more police. Use the following expression for omitted variable bias to determine whether the regression will likely over- or underestimate the effect of police on the crime rate: B1 (hat) →p B1 + ρXu (σu/σX)

the regression will likely overestimate B1. That is B1 (hat) is likely to be larger than B1.

Based on a comparison of the coefficients on dist in the regression in part (1) and part (2) we would conclude that

there is an omitted variable problem because the coefficient on distance was reduced by 57% suggesting that other factors correlated with distance but also correlated with completed schooling were not included in the simple regression.

The error term is homoskedastic if

var(ui | Xi = x) is constant for i = 1, ..., n.

A researcher is using a panel data set on n​ = 1000 workers over T​ = 10 years​ (from 2001 through​ 2010) that contains the​ workers' earnings,​ gender, education, and age. The researcher is interested in the effect of education on earnings. Suppose you run a regression of earnings on​ person-specific and​ time-specific control variables. Why might the regression error for a given individual be serially​ correlated?

(2 answers) - An unexpected earnings increase that is persistent through some part of the sample period. - An unexpected natural disaster occurs in a particular individual's city.

Suppose that n = 331 i.i.d. observations for (Yi, Xi) yield the following regression results: Y = 32.18 + 69.03X, SER = 15.67, R2 = 0.81, with standard errors (16.4) and (13.4). Another researcher is interested in the same regression, but he makes an error when he enters the data into the regression: He enters each observation twice, so he has 662 observations (with observation 1 entered twice, observation 2 entered twice, and so forth). (a) Which of the following estimated parameters change as a result? (Check all that apply) (b) Using the 662 observations, what results will be produced by his regression program? Y = 32.18 + 69.03X, SER = ___(Bi)____, R2 = 0.81, with standard errors (_____(Bii)______) and (_____(Biii)______) (c) Which (if any) of the internal validity conditions are violated?

(a) - The standard error of the regression (SER) - The standard errors of the estimated coefficients (b) Bi) 15.65 (= sqrt(2(331-2)/(662-2)) * 15.67: the sum of squared residuals doubles while the degrees of freedom rise from 329 to 660) Bii) 11.58 (= sqrt((331-2)/(662-2)) * 16.4) Biii) 9.46 (= sqrt((331-2)/(662-2)) * 13.4; the program treats the duplicated data as extra information, so the reported standard errors shrink) (c) Measurement error.
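
The scaling behind answers (Bi) to (Biii) can be checked with a short sketch. It assumes the homoskedasticity-only standard error formula, under which duplicating every observation doubles both the sum of squared residuals and the sum of squared deviations of X; variable names are illustrative.

```python
import math

n, k = 331, 2                          # original sample size; intercept plus one slope
ser, se_b0, se_b1 = 15.67, 16.4, 13.4  # values reported in the question

n2 = 2 * n                             # every observation entered twice

# SER^2 = SSR / (n - k). SSR doubles, degrees of freedom go from n-k to 2n-k.
ser_dup = ser * math.sqrt(2 * (n - k) / (n2 - k))

# Coefficient SEs (homoskedasticity-only) shrink by sqrt((n-k)/(2n-k)).
shrink = math.sqrt((n - k) / (n2 - k))
se_b0_dup = se_b0 * shrink
se_b1_dup = se_b1 * shrink

print(f"SER {ser_dup:.2f}, SE(intercept) {se_b0_dup:.2f}, SE(slope) {se_b1_dup:.2f}")
```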

Four hundred driver's license applicants were randomly selected and asked whether they passed their driving test (Passi = 1) or failed their test (Passi = 0); data were also located on their gender (Malei = 1 if male and = 0 if female) and their years of driving experience (Experiencei, in years). The following table summarizes the results from several probit, logit, and linear probability models. GRAPH Use the results in column (2) to answer the following questions. a) Is the coefficient on Experience significant at any reasonable level? b) John has 15 years of driving experience. What is the predicted probability that he will pass the test? The predicted probability that John will pass the test is ______b)________ (Round your response to three decimal places) c) Katherine is a new driver (zero years of experience). What is the predicted probability that she will pass the test? The predicted probability that Katherine will pass the test is _______c________ (Round your response to three decimal places) d) Which of the figures below is more likely to show predicted probabilities from the logit model? Figure (a) = curved; Figure (b) = straight diagonal line

(a) The coefficient on Experience is significant at the 1% significance level. b) 0.844 (= 1/(1 + e^-(1.058 + 0.042*15))) c) 0.742 (= 1/(1 + e^-1.058)) d) Figure (a) (which is curved)
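
A small sketch of the logit calculation, using the intercept (1.058) and Experience coefficient (0.042) quoted in the answer; the function name is illustrative. The logistic CDF is 1/(1 + e^(-z)), so the exponent carries a minus sign.

```python
import math

def logit_prob(z: float) -> float:
    """Logistic CDF: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

b0, b_exp = 1.058, 0.042   # coefficients as quoted in the card's answer

p_john = logit_prob(b0 + b_exp * 15)   # 15 years of experience
p_katherine = logit_prob(b0)           # zero years of experience
print(f"John {p_john:.3f}, Katherine {p_katherine:.3f}")
```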

Four hundred​ driver's license applicants were randomly selected and asked whether they passed their driving test ​(Passi=1​) or failed their test ​(Passi=0​); data were also located on their gender ​(Malei=1 if male and​ = 0 if​ female) and their years of driving experience ​(Experiencei​, in​ years). The following table summarizes the results from several probit, logit, and linear probability models. GRAPH (a) Is the coefficient on Experience significant at any reasonable​ level? b) Matthew has 11 years of driving experience. What is the predicted probability that he will pass the​ test? The predicted probability that Matthew will pass the test is ______b)_________ (c) Christopher is a new driver​ (zero years of​ experience). What is the predicted probability that he will pass the​ test? The predicted probability that Christopher will pass the test is _______(c)_________ (d) The sample included values of Experience between 0 and 40​ years, and only four people in the sample had more than 30 years of driving experience. Jed is 95 years old and has been driving since he was 17. What is the​ model's prediction for the probability that Jed will pass the​ test? The predicted probability that Jed will pass the test is _______d)_________ (e) Do you think the previous prediction is​ reliable?

(a) The coefficient on Experience is significant at the 1% significance level. b) 0.864 (use the z table: compute 0.712 (constant, column 1) + 0.036 (Experience, column 1) * years of experience, then look up the result. For example, with 20 years: 0.712 + 0.036 * 20 = 1.432; in the z table, row 1.4 and column 0.03 give 0.924.) c) 0.762 d) 1.000 e) No
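
The z-table lookup can be replaced by evaluating the standard normal CDF directly. This sketch uses the probit coefficients quoted in the answer (0.712 and 0.036) and takes Jed's experience as 95 - 17 = 78 years; function names are illustrative. Exact evaluation can differ from a printed z table in the third decimal, because the table rounds z to two decimals.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b0, b_exp = 0.712, 0.036   # coefficients as quoted in the card's answer

def pass_prob(years: float) -> float:
    """Probit prediction: Phi(b0 + b_exp * years)."""
    return std_normal_cdf(b0 + b_exp * years)

for name, years in (("Matthew", 11), ("Christopher", 0), ("Jed", 95 - 17)):
    print(f"{name}: z = {b0 + b_exp * years:.3f}, Pr(pass) = {pass_prob(years):.3f}")
```

Jed's 78 years lie far outside the 0 to 40 range of the sample, which is why answer (e) treats that prediction as unreliable.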

Suppose that the linear probability model yields a predicted value of Y that is equal to 1.3. Explain why this is nonsensical.

The predicted value of Y must be between 0 and 1

The following questions refer to the panel data regressions summarized in Table 12.1. (GRAPH) Suppose that the federal government is considering a new tax on cigarettes that is estimated to increase the retail price by ​$0.60 per pack. If the current price per pack is​ $7.50, use the regression in column​ (1) to predict the change in demand. The expected percentage change in cigarette demand is _______a)________​%. ​(Round your response to two decimal places​.) Construct a​ 95% confidence interval for the change in demand. The confidence interval is ​(_________b)______%, _________c)_________%). ​(Round your responses to two decimal places​.) d) Suppose that the United States enters a​ recession, and income falls by 3​%. Use the regression in column​ (1) to predict the change in demand. The expected percentage change in demand is ______d_______ ​(Round your response to two decimal places​.) e) Suppose that the recession lasts less than 1 year. Do you think that the regression in column​ (1) will be able to reliably predict the effect of income change on cigarette​ demand? Why or why​ not? f)Suppose that the ​F-statistic in column​ (1) were 3.6. Would the regression provide a reliable measure of the effect of a price change on cigarette​ demand? Why or why​ not?

a) -7.24% (= -0.94 * 0.0770 * 100%, where 0.0770 = ln(8.10) - ln(7.50) is the change in the log price) b) -10.41% (= (-0.94 - 1.96*0.21) * 0.0770 * 100%) c) -4.07% (= (-0.94 + 1.96*0.21) * 0.0770 * 100%). The -0.94 is the coefficient on ln(Pcig 1995) - ln(Pcig 1985) in column 1, and 0.21 is the standard error reported below it. d) -1.59% (= 0.53 * (-0.03) * 100%, where 0.53 is the coefficient on ln(Inc1995) in column 1 and -0.03 is the 3% fall in income) e) Both (a) and (c) are correct. f) No, the instrumental variable would be too weak (irrelevant) if the F-statistic in column (1) were less than 10.
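
The arithmetic for (a) to (d) can be sketched as below, using only the coefficients quoted in the answer (-0.94 with standard error 0.21, and 0.53). Rounding the log price change to four decimals (0.0770) mirrors the card's working and is an assumption of this sketch; without that rounding the second decimal can shift by 0.01.

```python
import math

beta_price, se_price = -0.94, 0.21   # price elasticity and its SE, column (1)
beta_income = 0.53                   # income elasticity, column (1)

# Price rises from $7.50 to $8.10; rounded as in the card's working.
d_ln_price = round(math.log(7.50 + 0.60) - math.log(7.50), 4)

point = 100 * beta_price * d_ln_price
lower = 100 * (beta_price - 1.96 * se_price) * d_ln_price
upper = 100 * (beta_price + 1.96 * se_price) * d_ln_price

income_effect = 100 * beta_income * (-0.03)   # income falls by 3%

print(f"change in demand {point:.2f}%, 95% CI ({lower:.2f}%, {upper:.2f}%)")
print(f"effect of a 3% income fall {income_effect:.2f}%")
```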

New Jersey has a population of 6.5 million people. Suppose that New Jersey increased the tax on a case of beer by​ $1 (in 1988​ dollars). Use the results in column​ (4) to predict the number of lives that would be saved over the next year. The predicted number of lives that would be saved over the next year is _____(a)_____ Construct a​ 95% confidence interval for your answer. The​ 95% confidence interval for the number of lives that would be saved over the next year is ​(______(b)_____, _____(c)____) The drinking age in New Jersey is 21. Suppose that New Jersey lowered its drinking age to 18. Use the results in column​ (4) to predict the change in the number of traffic fatalities in the next year. The predicted ______(d)_____in the number of traffic fatalities in the next year is _______(e)_______ Construct a​ 90% confidence interval for your answer. The​ 90% confidence interval for the predicted increase in the number of traffic fatalities in the next year is (________(f)______, _______(g)______) Suppose that real income per capita in New Jersey increases by​ 1% in the next year. Use the results in column​ (4) to predict the change in the number of traffic fatalities in the next year. The predicted ______(h)_____ in the number of traffic fatalities in the next year is ________(i)_______ Construct a​ 90% confidence interval for your answer. The​ 90% confidence interval for the predicted increase in the number of traffic fatalities in the next year is ​[_____(j)______, ______(k)_____) l) Refer to the reported F​-Statistics and p​-values associated with testing for exclusion of group of variables. Should time effects be included in the​ regression? m)A researcher conjectures that the unemployment rate has a different effect on traffic fatalities in the western states than in the other states. How would you test this​ hypothesis?

a) 279.50 (beer tax coefficient in column 4 = -0.43; 6.5 million people = 650 ten-thousands; 650 * 0.43 = 279.5) b) -140.92 (0.33 is the SE of the beer tax coefficient in column 4; -0.43 + 1.96*0.33 = 0.2168, then multiply by -650 = -140.92) c) 699.92 (-0.43 - 1.96*0.33 = -1.0768, then multiply by -650 = 699.92) d) increase e) 20.15 (0.031 (column 4, drinking age 18) * 650 = 20.15) f) -61.11 (0.031 - 1.645*0.076 (SE, column 4, drinking age 18) = -0.09402, then multiply by 650 = -61.11) g) 101.41 (0.031 + 1.645*0.076 = 0.15602, then multiply by 650 = 101.41) h) increase i) 11.31 (1.74% (column 4, real income) * 650 = 11.31) j) 4.04 k) 18.58 l) yes m) I would include a binary variable west (=1 if the state is in the west and 0 otherwise), and an interaction term west*Unemployment rate. Then, I would test if the estimated coefficient for the interaction term is significant at a reasonable level.
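
A sketch of the point estimates and confidence intervals in (a) to (g). It assumes, as the card's arithmetic does, that the coefficients are measured per 10,000 people, so New Jersey's 6.5 million residents count as 650 units. The helper name is hypothetical, and the coefficients and standard errors are the ones quoted in the answer.

```python
POP_UNITS = 6_500_000 / 10_000   # 650 units of 10,000 people

def effect_with_ci(coef: float, se: float, critical: float):
    """Scale a coefficient and its confidence interval to the state population."""
    point = coef * POP_UNITS
    half_width = critical * se * POP_UNITS
    return point, point - half_width, point + half_width

# Lives saved by a $1 beer tax increase: coefficient -0.43 means 0.43 fewer deaths.
saved, saved_lo, saved_hi = effect_with_ci(0.43, 0.33, 1.96)

# Extra fatalities from lowering the drinking age to 18, 90% interval.
extra, extra_lo, extra_hi = effect_with_ci(0.031, 0.076, 1.645)

print(f"lives saved {saved:.2f}, 95% CI ({saved_lo:.2f}, {saved_hi:.2f})")
print(f"extra fatalities {extra:.2f}, 90% CI ({extra_lo:.2f}, {extra_hi:.2f})")
```

Both intervals include zero, so neither effect is statistically distinguishable from zero at the stated level.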

Consider the following binary variable version of the fixed effects model. Each regressor Dj is a binary variable that equals 1 when i = j and 0 otherwise. Note that the binary variable D1i for the first group is arbitrarily omitted. Yit = β0 + β1Xit + γ2D2i + γ3D3i + ... + γnDni + uit. Use the regression in the equation above to answer the following questions. What is the slope and intercept for entity 1 in time period 1? The slope of entity 1 in time period 1 is ______a)_____. The intercept of entity 1 in time period 1 is _____(b)_____. What is the slope and intercept for entity 3 in time period 3? The slope of entity 3 in time period 3 is ______(c)_____. The intercept of entity 3 in time period 3 is _____(d)______. What is the slope and intercept for entity 2 in time period 1? The slope of entity 2 in time period 1 is _____(e)______. The intercept of entity 2 in time period 1 is _____(f)______.

a) β1 b) β0 c) β1 d) β0 + γ3 e) β1 f) β0 + γ2

A set of instruments must satisfy the following two conditions to be valid: (i) Instrument Relevance and (ii) Instrument Exogeneity. Consider the instrumental variable regression model Yi = β0 + β1Xi + β2Wi + ui, where Xi is correlated with ui, Wi (the exogenous regressor) is uncorrelated with ui, and Zi is an instrument. Suppose that the following three assumptions are satisfied. 1. E(ui | W1i, ..., Wri) = 0; 2. (X1i, ..., Xki, W1i, ..., Wri, Z1i, ..., Zmi, Yi) are i.i.d. draws from their joint distribution; 3. Large outliers are unlikely: The X's, W's, Z's, and Y have nonzero finite fourth moments. a) Which of the two conditions, (i) and (ii), for a valid instrument is not satisfied when Zi is independent of (Yi, Xi, Wi)? b) Which of the two conditions, (i) and (ii), for a valid instrument is not satisfied when Zi = Wi? c) Which of the two conditions, (i) and (ii), for a valid instrument is not satisfied when Wi = 1 for all i? d) Which of the two conditions, (i) and (ii), for a valid instrument is not satisfied when Zi = Xi?

a) Only​ (i) is not satisfied b) Only​ (i) is not satisfied. c) Only​ (i) is not satisfied. d) Only​ (ii) is not satisfied.

Consider the problem of estimating the elasticity of demand for butter. The demand equation is given by ln(Qbutteri) = β0 + β1 ln(Pbutteri) + ui, where Qbutteri is the ith observation on the quantity of butter consumed, Pbutteri is its price, and ui represents other factors that affect demand, such as income and consumer tastes. a) In the above demand curve regression model, is ln(Pbutteri) positively or negatively correlated with the error, ui? b) If β1 is estimated by OLS, would you expect the estimated value to be larger or smaller than the true value of β1?

a) ln(Pbutteri) is positively correlated with the regression error, ui. b) The OLS estimator of β1 is likely to be larger than the true value of β1, because ln(Pbutteri) is positively correlated with the regression error, ui.

The rule of thumb for checking for weak instruments is as​ follows: for the case of a single endogenous​ regressor:

a​ first-stage F​-statistic ​< 10 indicates that the instruments are weak.

F-statistics computed using maximum likelihood estimators:

can be used to test joint hypotheses.

Consider a model with one endogenous regressor and two instruments. Then the J​-statistic will be​ large:

if the coefficients are very different when estimating the coefficients using one instrument at a time.

In the expression Pr(deny = 1 | P/I ratio, black) = Φ(−2.26 + 2.74 P/I ratio + 0.71 black), the effect of increasing the P/I ratio from 0.3 to 0.4 for a black person (assume a probit model):

is 9.4 percentage points.
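
The effect is the difference between two predicted probabilities, each evaluated with the standard normal CDF. The sketch below uses the coefficients from the expression in the question; function names are illustrative. It evaluates the difference both exactly and with z rounded to two decimals, as a printed z table would require. The rounded version comes out close to the card's 9.4 and the exact version slightly lower, which suggests (as an assumption) that the card's figure comes from a table lookup.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_index(pi_ratio: float, black: int) -> float:
    """Probit index from the question: -2.26 + 2.74 * P/I ratio + 0.71 * black."""
    return -2.26 + 2.74 * pi_ratio + 0.71 * black

z_low, z_high = z_index(0.3, 1), z_index(0.4, 1)

# Difference in predicted denial probabilities, in percentage points.
exact = 100 * (std_normal_cdf(z_high) - std_normal_cdf(z_low))
table = 100 * (std_normal_cdf(round(z_high, 2)) - std_normal_cdf(round(z_low, 2)))

print(f"exact: {exact:.1f} points, z rounded to two decimals: {table:.1f} points")
```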

In panel​ data, the regression​ error:

is likely to be correlated over time within an entity.

In panel​ data, the standard errors are clustered because the regression​ error:

is likely to be correlated over time within an entity.

In the case of​ errors-in-variables bias:

the OLS estimator is consistent if the variance in the unobservable variable is relatively large compared to the variance in the measurement error.

Weak instruments are a problem​ because:

the TSLS estimator may not be normally​ distributed, even in large samples.

The linear probability model​ is:

the application of the linear multiple regression model to a binary dependent variable.

In the probit​ regression, the coefficient β1 ​indicates:

the change in the z​-value associated with a unit change in X.

In the time fixed effects regression​ model, you should exclude one of the binary variables for the time periods when an intercept is present in the​ equation:

to avoid perfect multicollinearity.

The distinction between endogenous and exogenous variables​ is:

whether or not the variables are correlated with the error term.

In the multiple regression model, the adjusted R^2, R^-2

will never be greater than the regression R^2

A survey of earnings contains an unusually high fraction of individuals who state their weekly earnings in​ 100s, such as​ 300, 400,​ 500, etc. This is an example​ of:

​errors-in-variables bias.

Consider the following regression model Yi=β0+β1Xi+ui (a) Suppose that Y is measured with random error. Does this mean that regression analysis is​ unreliable? (b) ​Now, suppose that X is measured with random error. Does this mean that regression analysis is​ unreliable?

(a) No (b) Yes

All of the following are true

- A high R^2 or adjusted R^2 does not mean that the regressors are a true cause of the dependent variable - A high R^2 or adjusted R^2 does not mean that there is no omitted variable bias - A high R^2 or adjusted R^2 does not necessarily mean that you have the most appropriate set of regressors - A high R^2 or adjusted R^2 does not always mean that an added variable is statistically significant

How many years of schooling would a person be expected to have if all you knew was that they lived 100 miles from the nearest 4-year college

13.22

How many years of schooling would a black female be expected to have if she had the same characteristics as in part (7) but her family had less than $25,000 in income and they did not own their own family home?

14.14

Assume that you had estimated the following quadratic regression model: Test Score = 607.3 + 3.85 Income − 0.0423 Income^2. If income increased from 10 to 11 ($10,000 to $11,000), then the predicted effect on test scores would be:

2.96
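
Because the regression is quadratic, the effect of a change in income is the difference between the predicted values at the two income levels, not the coefficient on Income alone. A minimal sketch using the coefficients in the question (the function name is illustrative):

```python
def predicted_score(income: float) -> float:
    """Estimated quadratic regression: 607.3 + 3.85*Income - 0.0423*Income^2."""
    return 607.3 + 3.85 * income - 0.0423 * income ** 2

effect_10_to_11 = predicted_score(11) - predicted_score(10)
effect_40_to_41 = predicted_score(41) - predicted_score(40)

# The same one-unit change has a smaller effect at higher income levels.
print(f"10 -> 11: {effect_10_to_11:.2f}")
print(f"40 -> 41: {effect_40_to_41:.2f}")
```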

Suppose that a researcher, using data on the class size (CS) and average test scores from 103 third-grade classes, estimates the OLS regression Test score (hat)= 515.196 + (-5.7618) * CS, R^2=0.06, SER=11.4. A classroom has 21 students. The regression's prediction for that classroom's average test score is ________

515.196 + (-5.7618) * 21 = 394.20

The multiple regression includes two regressors: Yi = B0 + B1X1i + B2X2i + ui. What is the expected change in Y if X1 increases by 4 units and X2 is unchanged? - The expected change in Y if X1 increases by 4 units and X2 is unchanged is ________. What is the expected change in Y if X2 decreases by 7 units and X1 is unchanged? - The expected change in Y if X2 decreases by 7 units and X1 is unchanged is ______. What is the expected change in Y if X1 increases by 2 units and X2 decreases by 5 units? - The expected change in Y if X1 increases by 2 units and X2 decreases by 5 units is __________.

A) 4B1 B) -7B2 C) 2B1-5B2

Changing the units of measurement—that ​is, measuring test scores in​ 100s, will do all of the following except for changing​ the: A. interpretation of the effect that a change in X has on the change in Y. B. numerical value of the intercept. C. residuals. D. numerical value of the slope estimate.

A. interpretation of the effect that a change in X has on the change in Y.

A nonlinear​ function: A. is a function with a slope that is not constant. B. can be adequately described by a straight line between the dependent variable and one of the explanatory variables. C. makes little​ sense, because variables in the real world are related linearly. D. is a concept that only applies to the case of a single or two explanatory variables since you cannot draw a line in four dimensions.

A. is a function with a slope that is not constant.

The coefficient on age shows that

AHE increases by 0.605 for every one-year increase in age

A researcher estimates the effect on crime rates of spending on police by using​ city-level data. Which of the following represents simultaneous​ causality?

Cities with high crime rates may need a larger police​ force, and thus more spending. More police​ spending, in​ turn, reduces crime.

Construct a confidence interval of 95% for the college-high school earnings difference. The 95% confidence interval for the college-high school earnings difference is (______,______)

Column 1, College: 5.62 ± 1.96 × 0.22 = (5.19, 6.05)

Consider a regression with two variables, in which X1i is the variable of interest and X2i is the control variable. Conditional mean independence requires:

E(ui | X1i, X2i) = E(ui | X2i)

The coefficient on females in part (2) indicates that

Females obtain 0.145 more years of schooling than do males adjusted for other factors

Consider the regression in part (2). Based on a joint test of the hypothesis H0: Bfemale = Bbachelor = 0 we would

Reject H0 because the F^act is 822, which is much larger than the F2,∞ critical value of 3.0

Given the following hypothesis: H0:B(females)=0.0 adjusted for age and education we would

Reject H0 because the 95% confidence interval does not include zero

Data were collected from a random sample of 340 home sales from a community in 2003. Let Price denote the selling price (in $1,000), BDR denote the number of bedrooms, Bath denote the number of bathrooms, Hsize denote the size of the house (in square feet), Lsize denote the lot size (in square feet), Age denote the age of the house (in years), and Poor denote a binary variable that is equal to 1 if the condition of the house is reported as "poor". An estimated regression yields Price (hat)= 122.8 + 0.500BDR + 24.1Bath +0.161Hsize + 0.004Lsize + 0.093Age - 50.3Poor, R^-2=0.74, SER=42.7 Suppose that a homeowner converts part of an existing family room in her house into a new bathroom. What is the expected increase in value of the house? The expected increase in value of the house is $_______.

The expected increase in value of the house is $24,100 (24.1 × 1,000 = 24,100)

Using the Excel data set, run a regression of years of completed schooling (ed) on distance (in 10s of miles) from a 4-year college (dist)

Years of completed schooling decreased by 0.073 years for every 10-mile increase in distance from the nearest 4-year college

Regress completed schooling (ed) on the variables dist, female, black, hispanic, byset, dadcoll, incomehi, ownhome, cue80, and stwfg80. The coefficient on distance (dist) now indicates that, adjusted for other factors,

Years of completed schooling increase by 0.032 years for every 10 miles closer one lives to the nearest 4-year college.

Consider the multiple regression model with two regressors X1 and X2, where both variables are determinants of the dependent variable. You first regress Y on X1 only and find no relationship. However, when regressing Y on X1 and X2, the slope coefficient B1 (hat) changes by a large amount. This suggests that your first regression suffers from:

omitted variable bias.

