Econometrics final
Fixed effects models can solve the omitted variables problem as long as the omitted variable is time-invariant (i.e., is assumed "fixed" over the sample time period).
true
If Yt and Xt are both I(1) in the regression Yt = β0 + β1Xt + ut and the residual, ût = Yt - β̂0 - β̂1Xt, is I(0), then Yt and Xt are cointegrated.
true
The Durbin-Watson test can be used to test for first-order and second-order serial correlation.
False. The Durbin-Watson test detects first-order serial correlation only.
1. Studentized residuals examine outliers by considering a. The model's errors, ûi b. The model's fitted values, Ŷi c. The model's estimated coefficients, β̂i
A
1. Which of the following is true of Regression Specification Error Test (RESET)? a. It tests if the functional form of a regression model is misspecified.
A
1. Which of the following models is a Koyck model? a. Yt = d0 + d1Xt + d2Yt-1+ ut
A
5. The model Yt = b1Yt-1 + ut where |b1| < 1 is an A) AR(1) process
A
1. A study on "Reducing Cheating in College" indicated that a student with even odds of cheating experienced a 25 percentage point reduction in the likelihood of cheating after completing a week-long course on Ethics but only a 10 percentage point decrease after being suspended from campus for one week. a) According to the report, the coefficient for completing Ethics, call it bEthics, is [1/-1/-0.5/-0.25]. b) Given the coefficient value above, the likelihood for a student with a 25 percent chance of cheating prior to completing Ethics would fall from 25 percent to [26.5/6.25/18.75/12.5] percent. c) According to the report, the coefficient for a suspension's impact, call it bSuspension, is [-0.3/0.30/-0.50/-0.40].
a) -1. Detailed solution: ME = 0.5*(1-0.5)*bEthics = -0.25 implies bEthics = -1. b) 6.25. Detailed solution: ME = 0.25*(1-0.25)*bEthics = 0.25*0.75*(-1) = -0.1875 or -18.75 percentage points, and 25% - 18.75% = 6.25%. c) -0.40. Detailed solution: ME = -0.10 = 0.5*(1-0.5)*bSuspension = 0.25*bSuspension, so bSuspension = -0.10/0.25 = -0.40.
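The rule of thumb used in this solution, marginal effect = p(1 - p)b, can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the approximation is evaluated at the stated baseline probability.

```python
def logit_marginal_effect(p, b):
    """Approximate change in probability for a one-unit change in X,
    evaluated at baseline probability p, for a logit coefficient b."""
    return p * (1 - p) * b


def coefficient_from_effect(p, effect):
    """Back out the logit coefficient implied by a reported marginal effect."""
    return effect / (p * (1 - p))


# Ethics course: a 25 percentage point drop for a 50-50 student.
b_ethics = coefficient_from_effect(0.5, -0.25)

# Same coefficient applied to a student who starts at a 25% chance of cheating.
effect_at_25 = logit_marginal_effect(0.25, b_ethics)
new_prob = 0.25 + effect_at_25

# Suspension: a 10 percentage point drop for a 50-50 student.
b_suspension = coefficient_from_effect(0.5, -0.10)

print(b_ethics, effect_at_25, new_prob, b_suspension)
```

The printed values correspond to bEthics = -1, a marginal effect of -0.1875, a new probability of 6.25%, and bSuspension = -0.40.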
1. The same newspaper reported on a follow-up study to the one above, stating: "Data on 100 firms in the year 2015 revealed that CEO yearly salary rose by six percent for each percentage point increase in the firm's return on equity. If return on equity were zero, the predicted CEO salary would be $1,500,000. It should be noted that four of the highest revenue-generating firms in the sample had CEO salaries of only $1 due to their compensation coming entirely from stock options." a) Using Salary for CEO salary and ROE for return on equity in percentage points, the econometric model being used is i. lnSalary = b0 + b1ROEi ii. lnSalary = b0 + b1lnROEi iii. Salary = b0 + b1lnROEi iv. Salary = b0 + b1ROEi b) The numeric values for the coefficients cited in the article are b0 = [xx.xxx] and b1 = [x.xx]. c) The last sentence, in terms of the CLRM assumptions, indicates d) Eliminating the four CEO salaries of $1 would cause b0 to [increase/decrease] in value and cause b1 to [increase/decrease] in value.
a) Answer i. is correct: lnSalary = β0 + β1ROEi. b) b0 = [14.221] and b1 = [0.06], giving lnSalary = 14.221 + 0.06ROEi. c) The answer is ii., outliers. d) The answer is decrease and increase.
5. Consider the following AC and PAC for monthly data on variable Yt. The data generating process for Yt appears as a. an AR(1) process b. a unit root process (i.e., ρ = 1) c. a white noise process
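The intercept in part b comes from taking the natural log of the salary predicted at ROE = 0. A minimal sketch, assuming the log-linear form lnSalary = b0 + b1*ROE; the variable and function names are hypothetical.

```python
import math

# Intercept: log of the predicted salary when ROE = 0.
b0 = math.log(1_500_000)
# Slope: "six percent per percentage point of ROE" in a log-linear model.
b1 = 0.06


def predicted_salary(roe):
    """Predicted salary in dollars for a given ROE (in percentage points)."""
    return math.exp(b0 + b1 * roe)


print(round(b0, 3))                                 # intercept to three decimals
print(predicted_salary(0))                          # salary at ROE = 0
print(predicted_salary(1) / predicted_salary(0))    # growth factor per ROE point
```

Note that the 6 percent reading of b1 is the usual small-change approximation; the exact growth factor per point of ROE is exp(0.06), slightly above 1.06.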
B
1. Which two of the following conditions must a stationary variable uphold? a. zero multicollinearity b. a constant mean, E(Yt) = Ῡ c. serially uncorrelated errors d. constant variance, E(Yt - Ῡ)² = σ²
B and D
1. T/F. An Andrews test of a logit model gave a χ2 of 40 with 100 observations. Given the critical Chi-squared value at the 5% significance level is 18.3, this test provides evidence that the model fits the data well.
False. The χ2 stat = 40 > χ2 crit = 18.3 at 5%, so the test rejects the hypothesis of a good fit.
Stationarity is a common cause of the spuriousness among time series variables.
False. Nonstationarity is the common cause
Let community loan approval rates be determined by the following equation Apprate = β0 + β1percmin + β2avginc + β3avgwlth + β4avgdebt + u where percmin is the percentage minority in the community, avginc is average income, avgwlth is average wealth, and avgdebt is some measure of average debt obligations. The null hypothesis that there is no difference in loan rates across neighborhoods due to racial and ethnic composition, when average income, average wealth, and average debt have been controlled for is given by
H0: β1 = 0 if you aren't sure whether minorities are favored or discriminated against; H0: β1 ≥ 0 if you think they are discriminated against. HA: β1 ≠ 0 if you don't know whether any discrimination is favorable or harmful; HA: β1 < 0 if you think it is harmful.
1. T/F. The Koyck model is an autoregressive model.
True
5. T/F. In the Dickey-Fuller test, if ρ = 1 that implies δ = 0 and that a unit root exists.
True
5. T/F. The null hypothesis for a Dickey-Fuller test is that a unit root exists indicating the data series is nonstationary.
True
lnImportst = β0 + β1lnRGDPt + ut where Imports is in $US millions, GDP is in $US billions and CPI is an index valued at 100. For this sample, β0 is [a] (x.xx) and β1 is [b] (y.yy) b. The approximate ρ from the original model is [c] (x.xxx) and is calculated by using the [d] (Durbin-Watson statistic/R-squared/F-statistic.) c. Run a Durbin-Watson test for the model to determine whether there is first-order serial correlation at the 5% level. The null hypothesis is that [e](serial/no serial) correlation exists and the alternative hypothesis is that [f] (serial/no serial) correlation exists. The lower bound for the critical value is [g] (x.xx) while the upper bound is [h] (x.xxx). We conclude that there [i] (is/is no) evidence for first-order serial correlation because [j] (pick a,b,c, or d) a. DW < dL b. DW > dL c. DW < du d. dL < DW < du
a) 1.44 b) 1.81 c) .647 d) DW stat e) no serial f) serial g) 1.3 h) 1.4 i) is j) a
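Parts c through j rely on the approximation ρ ≈ 1 - DW/2 and on the lower-tail Durbin-Watson decision rule. The sketch below is a hedged illustration: the Durbin-Watson statistic itself is not shown in these notes, so the value used here is backed out from ρ = 0.647 and is an assumption, as are the function names.

```python
def rho_from_dw(dw):
    """Approximate first-order autocorrelation implied by a DW statistic."""
    return 1 - dw / 2


def dw_decision(dw, d_lower, d_upper):
    """Decision rule for positive first-order serial correlation."""
    if dw < d_lower:
        return "evidence of first-order serial correlation"
    if dw > d_upper:
        return "no evidence of first-order serial correlation"
    return "inconclusive"


# Assumed DW value, implied by the reported rho of 0.647.
dw_implied = 2 * (1 - 0.647)

print(round(dw_implied, 3))
print(round(rho_from_dw(dw_implied), 3))
print(dw_decision(dw_implied, d_lower=1.3, d_upper=1.4))
```

With the bounds from the answer (dL = 1.3, dU = 1.4), the implied statistic falls below dL, which matches answer choice a.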
Researchers have found that tax compliance has a nonlinear relationship with income. Consider the following illustrative regression equation UnderreportedIncomei = 43 - 0.5GovInvesti - 1Incomei + 0.01Incomei² - 1.2AuditFundsi n = 30 where UnderreportedIncome is given as a percentage (meaning 12% is given as 12), Income = yearly income in $1,000s, GovInvest = Government Investment expenditures in $10,000s, and AuditFunds = funds in $10,000s devoted to auditing tax returns. Coefficient standard errors are in parentheses. a. A $20,000 increase in funding for auditing will cause a: b. What income level has the highest reporting of income? c. Assuming government investment and auditing funding are each $100,000, the model predicts the percentage of income underreported for people earning $10,000 to be [aa] (xx) percent. d. Assuming government investment and auditing funding are each $100,000, the model predicts the percentage of income underreported for people earning $100,000 to be [zz] (xx) percent. e. Previous tax studies revealed that the income level which generates the lowest underreporting is $40,000. Which regression is an accurate revision of the regression equation to test the restriction that $40,000 is the income level that minimizes underreporting? f. A Wald test was run to determine whether the estimated income level that minimizes underreporting is statistically equivalent to $40,000, as in the previous problem. The null hypothesis is that the estimated income level and $40,000 are equivalent, and the test yielded an F statistic of 4.32 and an associated p-value of 0.043. The results indicate the estimated value [a] (is/is not) statistically equivalent to $40,000 at the 5% significance level. g. What type of variable is income above $500k? h. A $10,000 increase in audit funding causes what change for those earning less than $500,000, and what change for those earning above $500,000?
a) 2.4. The answer is c because the -1.2 coefficient for AuditFunds indicates that every additional $10,000 devoted to auditing tax returns lowers underreported income by 1.2 percentage points. Thus, $20,000 raises reported income by 2.4 percentage points. b) $50,000 c) 17 percent: Under = 43 - 0.5*10 - 1*10 + 0.01*10² - 1.2*10 = 43 - 5 - 10 + 1 - 12 = 17 d) 26 percent: Under = 43 - 0.5*10 - 1*100 + 0.01*100² - 1.2*10 = 43 - 5 - 100 + 100 - 12 = 26 e) UnderreportedIncomei = β0 + β1GovInvesti + β3(Incomei² − 80Incomei) + β4AuditFundsi f) is not g) dummy variable h) Below $500,000: their reported income rises by 2 percentage points. Above $500,000: their reported income rises by 1.5 percentage points.
1. Whether a person becomes an entrepreneur is modeled as Pi = b0 + b1FAMINCi + b2FEMALEi + b3IMMIGRi + b4AGEi + b5AGEi² + ui where Pi = 1 if the ith person is an entrepreneur (given by the variable ENTREP) and 0 if the ith person is not an entrepreneur, FAMINCi = family income for the ith person measured in $10,000 increments, FEMALEi = 1 if the ith person is female and 0 otherwise, IMMIGRi = 1 if the ith person immigrated to the U.S. and 0 otherwise, and AGEi = age of the ith person. Use the logit output below to answer the following questions for a person on the cusp of becoming an entrepreneur (i.e., the 50-50 likelihood person). a) How does being a non-immigrant affect the likelihood of a person becoming an entrepreneur? b) Are wealthy people more likely to become entrepreneurs?
a) According to the logit model, being an immigrant raises the likelihood of the cusp person becoming an entrepreneur by 0.25*0.3255 = 0.081 or 8.1 percentage points. b) No, the coefficient for wealth is negative. Using the better logit model, we know the marginal effect is 0.25*(-0.03) = -0.0075, implying every $10,000 increase in wealth lowers the likelihood of the cusp person becoming an entrepreneur by 0.75 percentage points.
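The predictions in parts c and d and the turning point in part b follow directly from the quadratic in Income. A minimal sketch using the coefficients from the regression equation in the question; the function and variable names are hypothetical.

```python
def underreported(gov_invest, income, audit_funds):
    """Predicted percentage of income underreported.

    gov_invest and audit_funds are in $10,000s; income is in $1,000s.
    """
    return (43 - 0.5 * gov_invest - 1.0 * income
            + 0.01 * income ** 2 - 1.2 * audit_funds)


# Turning point of -1*Income + 0.01*Income^2: set the derivative to zero.
# -1 + 2*0.01*Income = 0  =>  Income = 1 / (2*0.01), in $1,000s.
turning_point = 1.0 / (2 * 0.01)

# $100,000 of government investment and audit funding is 10 units of each.
pred_10k = underreported(gov_invest=10, income=10, audit_funds=10)
pred_100k = underreported(gov_invest=10, income=100, audit_funds=10)

print(turning_point, pred_10k, pred_100k)
```

Because the coefficient on the squared term is positive, the turning point is a minimum of underreporting, which is why it corresponds to the highest reporting of income.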
1. Consider the airline ticket demand curve given below with standard errors given in parentheses. Q = 16,000 - 2.5P + 3PS + 0.25YD s.e. (40,000) (1.25) (2) (0.1) n = 100 and R2 = 0.56 where Q = quantity of airline tickets sold, P = price of airline tickets in $US, PS = price of train fare in $US, and YD = disposable income in $US. a) Interpret the four regression coefficients. b) Predict airline ticket sales if P = $300, PS = $100, and YD = $50,000. c) How will a $1,000 increase in disposable income affect airline ticket sales? How will a $10 increase in train fare affect airline ticket sales? How will a $50 increase in airline ticket prices affect airline ticket sales? d) Calculate the t-stats for each of the four coefficients. e) Is b2 (the coefficient for PS) statistically significant at the 5% significance level? (Be sure to show the details of your hypothesis test.) f) Other studies show b3 (the coefficient for YD) to be 0.35. Is this regression's b3 value statistically different from 0.35 at the 10% significance level? (Be sure to show the details of your hypothesis test.) g) Interpret the R2 value. Briefly explain why the adjusted R2 is lower than the R2 value. h) Is train travel a substitute for airline travel according to these results? If so, is it a statistically significant substitute at the 5% significance level? (Be sure to show the details of your hypothesis test.) i) Now consider the addition of TV and radio advertising, given by TV and Radio, respectively, to yield Q = 15,000 - 2P + 3PS + 0.2YD + 2TV + 3Radio s.e. (30,000) (1) (1.5) (0.1) (2) (2) n = 96 and R2 = 0.59. Should the advertising variables be added to the model? Answer by running the appropriate test.
a) If P = PS = YD = 0, 16,000 airline tickets would be sold. Every $1 increase in airline ticket price causes quantity demanded to decrease by 2.5. Every $1 increase in train fare raises airline ticket sales by 3. A $1 increase in disposable income causes a 0.25 increase in airline ticket sales. b) Q = 16,000 - 2.5*300 + 3*100 + 0.25*50,000 = 28,050 c) A $1,000 increase in disposable income increases airline ticket sales by 250. A $10 increase in train fare will raise airline ticket sales by 30. A $50 increase in airline ticket prices will lower airline ticket sales by 125. d) t-stat b0 = 16,000/40,000 = 0.4; t-stat b1 = -2.5/1.25 = -2; t-stat b2 = 3/2 = 1.5; t-stat b3 = 0.25/0.1 = 2.5. e) Ho: b2 = 0, HA: b2 ≠ 0; t-stat b2 = 3/2 = 1.5; df = 100 - 4 = 96; tcrit(0.025) = 1.984. Fail to reject Ho because t-stat < t-crit. f) Ho: b3 = 0.35, HA: b3 ≠ 0.35; t-stat b3 = (0.25 - 0.35)/0.1 = -1; df = 100 - 4 = 96; tcrit(0.05) = 1.66. Fail to reject Ho because |t-stat| < t-crit. No, they are not statistically different. g) 56% of the variation in ticket sales is explained by our model. Adjusted R2 adjusts for the degrees of freedom (which decrease for every additional regressor). h) The positive coefficient in front of train ticket price indicates it is a substitute for plane travel. To test whether it is statistically significant: Ho: b2 = 0, HA: b2 ≠ 0; t-stat b2 = 3/2 = 1.5; df = 100 - 4 = 96; tcrit(0.025) = 1.984. Fail to reject Ho because t-stat < t-crit, so it is not a statistically significant substitute. i) Ho: b4 = b5 = 0, HA: b4 and/or b5 ≠ 0; F-stat = ((0.59 - 0.56)/2)/((1 - 0.59)/90) = 3.293 with df = 96 - 6 = 90; F-crit = 3.0977. F-stat > F-crit, so reject Ho and conclude that advertising has an impact.
A recent study of the socio-economic factors contributing to religious affiliation across U.S. cities produced the following results Noni = 15 + 1.5Wi + 0.1UNi - 0.2MeanAgei + 1Edi (s.e.) (15) (0.5) (0.5) (0.1) (0.2) n = 105 R2 = 0.56 where Non = percentage of non-religious in the city (where a value of 20 represents 20% of population being non-religious), W = yearly welfare payments per capita in $1000s (e.g., W = 2 represents $2000 of welfare payments per person), UN = unemployment rate as a percentage (where a value of 4.5 represents 4.5%), MeanAge = average age of population, Ed = average years of education in population. a) T/F. The constant term in the regression is statistically significant. b) T/F. The variable Edi in the regression is statistically significant. c) A city receiving $1000 in W, with an unemployment rate of 5%, an average age of 50, and an average education of 12 years is predicted to have a percentage of non-religious equal to [xx]? d) The coefficient for UNi implies a. a one percentage point increase in unemployment causes a 0.1 percentage point increase in non-religious population e) The coefficient for MeanAgei implies a. a one year increase in average age of population causes a 0.2 percentage point decrease in non-religious population f) Welfare payments must [increase or decrease] by [?] to raise non-religion by 6 percentage points.
a) False. Its t-stat = 15/15 = 1 b) True. Its t-stat = 1/0.2 = 5 c) 19 (15 + 1.5*1 + 0.1*5 - 0.2*50 + 1*12) d) A e) A f) Increase, by $4,000 per capita (6/1.5 = 4, and W is measured in $1000s)
1. The Institute of Applied Econometrics' report Education's Impact on Hourly Wages states the following: 1000 randomly selected U.S. citizens were polled regarding their education level and hourly wage. The average citizen has 12 years of education and an hourly wage of $12.36. A bivariate regression model of the data reveals every additional year of education raises hourly wages by $1.28 on average with a standard error of $0.07. Results indicate that 21 percent of the variation in wages is explained by this model. One important caveat is that the variance of model errors increases for higher levels of education. a) According to the report, an additional year of education increases hourly wages by [1.28] for the average citizen. b) How precise is the above measure of education's impact on the average citizen's hourly wage? To answer, the relevant range wherein we can expect our $1.28 measure to lie for 95 out of 100 similar samples of data has a lower bound of $[1.14] and an upper bound of $[1.42]. (Put differently, you are providing the values for the 95% confidence interval.) c) Using Wage (in $ per hour) as the dependent variable and Education (number of years in school) as the independent variable, provide the appropriate numerical values in the spaces below as specified in the article. d) The R2 of the report's regression model is [x.21]. e) The last sentence from the report indicates the model may violate the regression model assumption of no (heteroscedasticity) f) When the CLRM assumptions are maintained, the resulting regression coefficients will have the following two properties of (unbiasedness)
a) The hourly wage will rise by $1.28 for each additional year, implying another year would raise the average wage from $12.36 to $13.64. b) The hourly wage will rise by $1.28 ± 2*0.07, or $1.14 to $1.42, for each additional year, implying another year would raise the average wage to between $13.50 and $13.78. c) Wage = -[3] + [1.28]Education. Detailed solution: β̂0 = Ῡ - β̂1X̄ = 12.36 - 1.28*12 = -3, so Wage = -3 + 1.28Education d) R2 = 0.21 e) heteroscedasticity f) unbiasedness
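The prediction in part b and the t-statistics in parts d through f are simple arithmetic on the reported coefficients and standard errors. A minimal sketch with hypothetical names; the critical values are taken from the answer above rather than computed.

```python
# Coefficients and standard errors from the reported demand equation.
coefs = {"const": 16000.0, "P": -2.5, "PS": 3.0, "YD": 0.25}
ses = {"const": 40000.0, "P": 1.25, "PS": 2.0, "YD": 0.1}


def predict_q(p, ps, yd):
    """Predicted airline ticket sales."""
    return (coefs["const"] + coefs["P"] * p
            + coefs["PS"] * ps + coefs["YD"] * yd)


def t_stat(coef, se, hypothesized=0.0):
    """t-statistic for H0: coefficient equals the hypothesized value."""
    return (coef - hypothesized) / se


q_hat = predict_q(p=300, ps=100, yd=50_000)
t_stats = {name: t_stat(coefs[name], ses[name]) for name in coefs}
t_yd_vs_035 = t_stat(coefs["YD"], ses["YD"], hypothesized=0.35)

print(q_hat)
print(t_stats)
# Two-tailed comparisons against the critical values quoted in the answer.
print(abs(t_stats["PS"]) > 1.984)   # 5% level, df = 96
print(abs(t_yd_vs_035) > 1.66)      # 10% level, df = 96
```

Both comparisons print False, meaning neither null hypothesis is rejected at the stated levels.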
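The confidence interval in part b and the intercept in part c can be reproduced from the figures in the report. A small sketch, assuming the two-standard-error rule of thumb for a 95% interval; the variable names are hypothetical.

```python
slope = 1.28        # dollars per additional year of education
se = 0.07           # standard error of the slope
mean_wage = 12.36   # average hourly wage
mean_ed = 12        # average years of education

# Approximate 95% confidence interval: slope plus or minus two standard errors.
ci_lower = slope - 2 * se
ci_upper = slope + 2 * se

# The fitted line passes through the sample means, so
# intercept = mean(Y) - slope * mean(X).
intercept = mean_wage - slope * mean_ed

print(round(ci_lower, 2), round(ci_upper, 2), round(intercept, 2))
```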
A newspaper article reports on a study as follows. "Data on 100 firms in the year 2015 revealed that CEO yearly salary rose by $20,000 for each additional $1 million in the firm's annual revenue. The study seems suspicious to this reporter, however, because the study apparently shows that CEO salary would be $500,000 even if annual revenue were zero." a. Using Salary (in $1000s) as the dependent variable and Revenue (for annual revenue in millions of $U.S.) as the independent variable, provide the appropriate numerical values in the spaces below as specified in the article. b. The predicted CEO salary for a company earning $10 million in annual revenue is $[700] thousand. c. T/F. The reporter's suspicions may be unwarranted because the intercept is often inconsequential to the analysis and there may be few or no observations near zero d) The $20,000 impact of revenues on CEO salary is statistically significant. e) 68% lie between f) 95% lie between
a) Salaryi = [500] + [20]Revenuei b) 500 + 20*10 = $700 thousand c) True d) True. The two-tailed t-stat for the slope is 20,000/1,000 = 20, which greatly exceeds the critical value of tcrit = ±1.98 for n = 100 and k = 2 e) $19,000 and $21,000 f) $18,000 and $22,000
1. A newspaper article reports on a study as follows. "Data on 100 randomly chosen firms in the year 2015 revealed that CEO yearly salary rose by $20,000 (within a margin of error of $2,000) for each additional $1 million in the firm's annual revenue. The study seems suspicious to this reporter, however, because it shows that CEO salary would be $500,000 even if annual revenue were zero. Moreover, though the researchers show the model they use is valid as its errors 'follow a normal distribution', a prior study found an average impact of $21,500 for each additional $1 million in the firm's annual revenue. Clearly a difference of $1,500 is significant and calls the researchers' results into question." a. T/F. The $20,000 impact of revenues on CEO salary is statistically significant. b. Were 100 more random samples (of 100 firms each) collected, we would expect 68% of the estimates of the impact of annual revenues on CEO salaries to lie between $[21,000] and $[19,000], using the same regression techniques? c. Were 100 more random samples (of 100 firms each) collected, we would expect 95% of the estimates of the impact of annual revenues on CEO salaries to lie between $[22,000] and $[18,000], using the same regression techniques? d. Regarding the reporter's claim that "Clearly a difference of $1,500 is significant and calls the researchers' results into question" e. The Jarque-Bera (JB) probability for the researchers' model must be
a) True. The two-tailed t-stat for the slope is 20,000/2,000 = 10, which greatly exceeds the critical value of tcrit = ±1.98 for n = 100 and k = 2 b) $21,000 and $19,000 c) $22,000 and $18,000 d) iii. is correct because there is no statistical difference between $20,000 and $21,500. e) i. Our rule of thumb is that JB prob > 0.20 for a normal distribution.
output table with 5, 2.5, ? The t-stat for β0 is [a] (x.x) and it [b] (is/is not) statistically significant at the 5% significance level. The model explains [c] (xx) percent of the variation in quantity. The degrees of freedom for a t-test are [d] (xx). Assuming a critical t value of 2.0, the slope [e] (is/is not) statistically different from the value 1.0. In this regression, "Quantity" is the [f] (dependent/independent) variable. "Price" is assumed to be a [g] (random/non-random/dependent) variable. If the model has a Jarque-Bera probability (JB prob) of 0.15, this implies the errors [h] (are/are not) normally distributed. Non-normally distributed errors imply that our model's [i] (t-stats/R-square/betas) may not be valid. To sell 80 units, price must be [j] (xx) At a price of $10, quantity will equal [k] (no hint). A correct interpretation of the slope is [l] (choose i, ii, iii, or iv)
a. 2.0 b. is not c. 62 d. 28 e. is not f. dependent (it is the Y variable) g. non-random (recall X var is fixed in repeated samples) h. are not (because JB prob < 0.20) i. t-stats (because statistical tests are predicated on betas being normally distributed, which they may not be if errors are not) j. $50 (using equation 80 = 5 + 1.5*P giving P = 50). k. 20 (using equation Q = 5 + 1.5*10). l. iii. (note that the slope is 1.5, implying a $1 increase in price causes quantity to rise by 1.5, so a $10 increase would cause quantity to rise by 15).
1. A newspaper article reports on a study as follows. "Data on 100 firms in the year 2015 revealed that CEO yearly salary rose by $10,000 for each percentage point increase in the firm's return on equity and rose by $20,000 for each additional $1 million dollars in the firm's annual revenue. The average CEO salary was $2,600,000 and the average ROE was 10% while the average annual revenue was $20,000,000. These variables together explained 68 percent of the changes in CEO salary." The newspaper reporter went on to state "Though the report seemed quite thorough, one glaring error was found by this reporter. The authors claimed that 'Measures of CEO emotional intelligence and openness to worker ideas - which social psychologists claim are crucial to employees' performance - played an important role, as well. The addition of these two variables raised the model's explanatory power to 70 percent.' Upon close scrutiny, the researchers seemed to have erred as each of these variables was statistically insignificant. Whatever explanatory power they might have had on their own, they provided no real explanatory power when controlling for the firm's return on equity and its annual revenue." b) Substitute relevant numbers for the coefficients given the analysis cited in the article. (Hint: to calculate b0 recall that the regression surface passes through Ῡ, X̄1 and X̄2.) c) Using the researchers' model, predict the CEO salary for a firm with a 15% ROE and $30,000,000 annual revenue. d) Should the researchers have excluded emotional intelligence and openness to worker ideas from their model, as the reporter suggested? Prove and explain your answer with the appropriate hypothesis test.
b) b0 = Ῡ - b1X̄1 - b2X̄2 = 2,600,000 - 10,000*10 - 20,000*20 = 2,600,000 - 500,000 = $2,100,000, so Ŷi = 2,100,000 + 10,000*ROEi + 20,000*Revi (with Rev in $ millions). c) Ŷi = 2,100,000 + 10,000*15 + 20,000*30 = 2,100,000 + 750,000 = $2,850,000 d) Ho: the coefficients on the two added variables both equal 0. Fstat = ((0.70 - 0.68)/2)/((1 - 0.70)/(100 - 5)) = 3.17; Fcrit = 2.36 at the 10% significance level and 3.09 at the 5% significance level. Because Fstat > Fcrit, reject Ho: the two variables are jointly significant, so they should not have been excluded.
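The joint test in part d uses the R-squared form of the F-statistic for added variables. A minimal sketch with a hypothetical helper name; the critical values are the ones quoted in the answer, not computed here.

```python
def f_stat_added_vars(r2_restricted, r2_unrestricted, q, n, k_unrestricted):
    """F-statistic for H0: the q added coefficients are all zero.

    k_unrestricted counts every estimated coefficient in the larger
    model, including the intercept.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k_unrestricted)
    return numerator / denominator


# Two added variables; the larger model has 4 slopes plus an intercept.
f_stat = f_stat_added_vars(0.68, 0.70, q=2, n=100, k_unrestricted=5)

print(round(f_stat, 2))
print(f_stat > 2.36)   # 10% critical value quoted in the answer
print(f_stat > 3.09)   # 5% critical value quoted in the answer
```

The statistic exceeds both quoted critical values, so the added variables are jointly significant even if each is individually insignificant.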
Even if your model variables are cointegrated, the model cannot be estimated in its original units if those model variables are nonstationary.
false
Imagine we have twenty years of cost data for each of ten firms. We have all the relevant variables related to costs, except for a variable representing management expertise. If each firm had the same management team over the twenty years, we should choose a random effects model over a fixed effects model because management styles between firms are expected to vary in a random manner.
false
The distributed lag model is more likely to exhibit serial correlation than the autoregressive model.
false
The least squares dummy variable estimators often yield different regression results (e.g., coefficients, standard errors, R2) than the fixed effects within group estimator.
false
We should reject the null hypothesis of the following White test results. Heteroskedasticity Test: White Obs*R-squared 9.0 Prob. Chi-Square(6) 0.11
false
Dummy variables such as ethnicity cannot be estimated in a [a] (fixed effects/random effects) model but can be estimated in a [b] (fixed effects/random effects) model.
fixed, random
Restaurant Chain Sales. Consider the following regression output from a sales function with local income, Income, as a regressor and sales, Sales as the regressand. We can see that there [a] (is/is not) evidence of serial correlation from the [b] (R-squared/t-statistics/F-statistic/Durbin-Watson stat). Assume we also run a Breusch-Godfrey test with two lags. Assume the R2 value for the test is 0.50. The estimated χ2 statistic must be [c] (xx) and the χ2crit at the 5% significance value is [d] (x.xx) which suggests we [e] (have/do not have) evidence of second-order serial correlation
is, DW stat, 23, 5.99, have. There is evidence of serial correlation from the Durbin-Watson stat of 0.31 because it is much lower than any dL critical value when k' = 1 and df = 40 (the lowest value found on the online table I viewed was 1.2 for df = 40). For the Breusch-Godfrey test with two lags and a test R2 of 0.50, the estimated χ2 statistic must be 23 (n*R2 = 46*0.50), and the χ2crit at the 5% significance level is 5.99, which suggests we have evidence of second-order serial correlation.
The Institute of Econometric Excellence's report "Adult Males Living with Parents" states the following: The examination of the percentage of males aged 25-34 living with their parents across 1000 US cities reveals that three economic and three sociological variables were deemed determinative. In a solely economic regression model, called Model E, the three economic variables alone explained 45% of the changes in cohabitation. In a solely sociological regression model, called Model S, the three sociology variables alone explained 47% of the changes in cohabitation. To determine whether the economic variables added any predictive power to Model S, and to avoid variables confounding one another, the researchers added the predicted percentages of cohabitation from Model E to Model S. The coefficient for predicted cohabitation was 2.5 with a standard error of 1.0. The authors then added the predicted percentages of cohabitation from Model S to Model E and found that the coefficient for predicted cohabitation was -3.0 with a standard error of 1.0. The econometric term for the problem of variables "confounding" one another is [a] (serial correlation/multicollinearity/heteroscedasticity) and it would appear in regression results in terms of [b] (VIFs exceeding 5/Durbin-Watson statistic near 2/failing to reject the H0 of a White test). The test procedure the researchers are running by adding the predicted cohabitation variables to different regression models is the [c] (Ramsey RESET/Davidson-MacKinnon J/Quandt-Andrews Breakpoint test). According to these results, if our sole desire is to accurately predict the percentage of cohabitation in a city, the variables that should be included in a regression model are the [d] (sociology only/economic only/sociology and economic) because the test reveals the [e] (sociology only/economic only/sociology and economic) are statistically significant.
multicollinearity, VIFs exceeding 5, Davidson-MacKinnon J, sociology and economic, sociology and economic (the t-stats on the added predicted values are 2.5/1.0 = 2.5 and -3.0/1.0 = -3.0, both significant)
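The Breusch-Godfrey arithmetic in this answer can be verified with the standard library alone. The sketch below is hedged: it uses the n*R2 form of the LM statistic from the answer and relies on the fact that a chi-square distribution with 2 degrees of freedom has the closed-form upper tail exp(-x/2); the function names are hypothetical.

```python
import math


def lm_statistic(n, r2_aux):
    """LM statistic from an auxiliary regression: n times its R-squared."""
    return n * r2_aux


def chi2_crit_df2(alpha):
    """Critical value of a chi-square with 2 df, from P(X > x) = exp(-x/2)."""
    return -2 * math.log(alpha)


lm = lm_statistic(n=46, r2_aux=0.50)
crit = chi2_crit_df2(0.05)

print(lm, round(crit, 2))
print(lm > crit)  # True means reject H0 of no serial correlation up to lag 2
```

The closed form only applies to 2 degrees of freedom (two lags here); other lag lengths need a chi-square table or a statistics library.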
A Hausman test whose chi-squared statistic has a p-value of 0.001 indicates we would [a] (reject/fail to reject) the null hypothesis that [b] (fixed effects/random effects) are appropriate.
reject, random effects
notes
t-stat = coeff/s.e. If prob > .05, fail to reject the null (not statistically significant); if prob < .05, reject the null. To test against a given value: t-stat = (coeff - given value)/given s.e., then compare with the given t-crit (if close, is stat sig; if 1+, is not?). F-test for joint significance.
A Wald test was run to determine whether the estimated optimal time for studying was (statistically) different from 3 hours. The F statistic for the test was 4.67 and its associated p-value was 0.5. The estimated optimal time is (statistically) equivalent to 3 hours.
true
Cointegration occurs when the stochastic trends in two or more non-stationary series cancel out in a regression and reveal a nonspurious relationship where the regression errors are stationary.
true
If your model variables are nonstationary and not cointegrated, the model should be estimated using first differences (ΔYt and ΔXt).
true
Impure heteroscedasticity can occur when the regression model is incorrectly specified.
true
Long runs of positive (or negative) errors followed by a long run of negative (or positive) errors is one sign of serial correlation
true
The null hypothesis for a Dickey-Fuller test is that a unit root exists indicating the data series is nonstationary
true
The validity of the random effects estimator relies on the random effects to be uncorrelated with the regressors.
true
True/False. If R2 = 0.4 and the explained sum of squares = 12, the total sum of squares must be 30.
true. If ESS = 12 and R2 is 0.4 that means TSS must be 30. As R2=ESS/TSS or 1 - RSS/TSS. Note that TSS = ESS + RSS where RSS = residual sum of squared errors (which must be 18 given TSS and ESS).
Wagei = 3.0 + 1.50*Edi, where Wage = hourly wage in U.S. dollars and Ed = years of education. The predicted hourly wage for someone with 5 years of education is $[xx.xx](xx.xx), while the predicted hourly wage for someone with 10 years of education is $[yy.yy](xx.xx). We would predict the wage to be $21 for someone with [aa](xx) years of education.
10.5 , 18, 12
The Chi-squared critical value at the 5% significance level for an Andrews test with data grouped in quartiles is [9.5]
9.5
It is hypothesized that theft, or larceny, in the US is a function of unemployment. The following regression comes from data on these variables over the period 1975-2005 larcenyt = 360 + 55.0unemploymentt where larcenyt is in 1,000s of $US (e.g., a value of $200,000 of stolen property appears as 200) and unemploymentt is in percentage point terms (e.g., 4.5% unemployment appears as 4.5). A. The predicted value of larceny for a 10% unemployment rate is $[z] (360,055/360,550/415,000/910,000). B. A 2-percentage point increase in unemployment will lead to an increase in larceny of $[
910,000 and 110,000
The model Yt = β1Yt-1 + ut where |β1| < 1 is an
AR(1) process
1. DFFITS examine outliers by considering a. The model's errors, ûi b. The model's Ŷi values c. The model's estimated coefficients, β̂i.
B
1. Which of the following models is a distributed lag model? a. Yt = d0 + d1Xt + d2Yt-1+ ut b. Yt = b0 + b1Xt + b2Xt-1 + b3Xt-2 + ut
B
Consider the following test regression. û²it = α0 + α1X1it + α2X2it + α3X3it + α4X4it The test that is being run is a [a] (BG/BPG/Durbin-Watson) test for a [b] (cross-sectional/time series/fixed effects panel/pooled) data set. The underlying regression model given the test equation is [c] (a/b/c/d) a. Yi = β0 + β1X1i + β2X2i + β3X3i + β4X4i + ui b. Yit = β0 + β1X1it + β2X2it + β3X3it + β4X4it + uit c. Yt = β0 + β1X1t + β2X2t + β3X3t + β4X4t + ut d. Not enough information If the test regression results yield an R2 of 0.07 with n = 200, we should [d] (reject/fail to reject) Ho at the 5% significance level.
BPG, pooled, b, reject (n*R2 = 200*0.07 = 14 exceeds the 5% χ2 critical value of 9.49 with 4 degrees of freedom)
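The rejection in part d can be checked by converting the n*R2 statistic into a p-value. This is an illustrative sketch, not a general-purpose routine: it uses the closed-form upper tail of the chi-square distribution, which holds only for an even number of degrees of freedom, and the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square with an even number of degrees of freedom.

    Uses the identity P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Four regressors in the auxiliary regression give 4 degrees of freedom.
lm = 200 * 0.07
p_value = chi2_upper_tail_even_df(lm, df=4)

print(round(lm, 2), round(p_value, 4))
print(p_value < 0.05)  # True means reject H0 of homoskedasticity
```

As a sanity check, the same function evaluated at the tabulated 5% critical value of 9.488 returns approximately 0.05.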
1. DFBETAS examine outliers by considering a. The model's errors, ûi b. The model's fitted values, Ŷi c. The model's estimated coefficients.
C
Stating that the likelihood of getting a particular estimator value (say a slope value, b1, of -2.68) by chance is 3%, implies a. The |tstat| < |tcrit| for the particular estimator. b. The p-value for the estimator is approximately 0.3. c. The p-value for the estimator is approximately 0.03.
C
Impure heteroscedasticity occurs when the nature of the data causes the model error variance to change systematically.
False. This describes pure heteroscedasticity.
1. T/F. Imagine we have twenty years of cost data for each of ten firms. We have all the relevant variables related to costs, except for a variable representing management expertise. If each firm had the same management team over the twenty years, we should choose a random effects model over a fixed effects model, because management styles between firms are assumed to vary in a random manner.
False, the "effect" from the same team over the same years of data is best represented as "fixed".
5. T/F. The alternative hypothesis for a Dickey-Fuller test is given as HA: d ≠ 0 (or r ≠ 1).
False. The test is one-sided: the alternative is HA: d < 0 (or r < 1).
Consider the following regression equation. Wagei = 3.0 + 1.50*Edi where Wage = hourly wage in U.S. dollars and Ed = years of education. The correct interpretation of the two regression coefficients is
If education were 0 the hourly wage would be $3. Every additional year of education raises hourly wages by $1.50
Two benefits from using panel data include
Less multicollinearity. Solving omitted variables problems
Jarque-Bera
Our rule of thumb is that JB prob > 0.20 for a normal distribution.
1. A Hausman test whose chi-squared statistic has a p-value of 0.001 indicates we would [reject/fail to reject] the null hypothesis that [fixed effects are appropriate/random effects are appropriate].
Reject; random effects are appropriate. Rejecting this null favors the fixed effects model.
The Institute of Econometric Excellence's report "Adult Males Living with Parents" states the following: The examination of the percentage of males aged 25-34 living with their parents across 1000 US cities reveals that together three economic and three social variables explain 85% of the changes in "cohabitation." To test the adequacy of the model, researchers added the squared and cubed values for the predicted percentages of cohabitation which raised the explanatory power by only one percentage point. The name of the test the authors were running to test the model adequacy is the [a] (Ramsey RESET /Davidson-MacKinnon J/Quandt-Andrews Breakpoint test). Results from the cohabitation model show [b] (no signs/signs) of misspecification. This is proven by using [c] (an F-statistic /t-statistic/J-statistic) whose value is [d] (xx.xx) which exceeds its critical value of [e] (x.xx) at the 5% significance level. (Email your calculations to verify results or receive partial credit.)
Ramsey RESET, signs, F-statistic, 35.43, 3.01. Misspecification is evident because the F-stat of 35.43 exceeds the critical F of 3.01. The number of restrictions is 2 (the squared and cubed fitted values), n = 1000, and k in the unrestricted model is 8 (3 social variables + 3 economic variables + 2 fitted-value terms in the test). Fstat = ((0.86 - 0.85)/2)/((1 - 0.86)/(1000 - 8)) = 35.43
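The F-statistic in this answer is the usual R2-based test for added regressors. A minimal sketch that reproduces the arithmetic, using the degrees-of-freedom convention from the answer (n - 8); the function name is illustrative, and the critical value 3.01 is taken from the answer rather than computed:

```python
def r2_f_statistic(r2_unrestricted, r2_restricted, n, k_unrestricted, restrictions):
    """F-statistic comparing restricted and unrestricted models via their R-squared values."""
    numerator = (r2_unrestricted - r2_restricted) / restrictions
    denominator = (1 - r2_unrestricted) / (n - k_unrestricted)
    return numerator / denominator

# RESET adds the squared and cubed fitted values, so there are 2 restrictions.
f_stat = r2_f_statistic(0.86, 0.85, n=1000, k_unrestricted=8, restrictions=2)
f_crit = 3.01  # critical value quoted in the answer above
misspecified = f_stat > f_crit

print(round(f_stat, 2), misspecified)
```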
1. Consider the following regression equation with standard errors in parentheses. CMt = 180 - 0.30Incomet - 0.05Mt (s.e.) (20) (0.1) (0.01) where CMt = quarterly Egyptian child mortality rate, Incomet = quarterly aggregate income of American farmers, and Mt = quarterly Honduran money supply. R2 = 0.92, DW = 0.4752, F-stat (prob)= 0.04 and n = 100 Which of the following is not evidence of this being a spurious regression relationship? a. R2 > DW b. The variables' relationship is not economically significant or meaningful c. The variables' relationship is not statistically significant d. The overall regression appears statistically significant but the individual variables are not.
The answers are c and d. Both t-stats exceed 2 in absolute value (-0.30/0.1 = -3 and -0.05/0.01 = -5), so the individual variables are statistically significant, and the overall regression is significant as well (F-stat prob = 0.04).
Assume that the correlation between height and weight for adult males in Tampa is 0.50. What would happen to the correlation if we included Tampa women and children in our sample, knowing that individuals' gender and age influence both their height and their weight?
The correlation would rise above 0.5 because the relative weight of the shared factors would increase with their inclusion
A recent study of monthly larceny crimes in the U.S. produced the following results. larcenyt = 300 + 50unemploymentt - 5t n = 103 R2 = 0.20 where larcenyt is in 1,000s of $US (e.g., a value of $200,000 of stolen property appears as 200), unemploymentt is in percentage point terms (e.g., 4.5% unemployment appears as 4.5), and t represents the month (where first month has a value of "1" and second month as "2"). If the model regressors are statistically insignificant individually and the Global F-statistic has a p-value of 0.04, this implies: The Global F-statistic measures the statistical significance of the model's R2. Calculating this F-statistic by hand yields a value of (use the format x.xx):
The overall model is statistically significant and thus is better for predicting larceny than using the mean level of monthly larceny. F = (0.20/2)/((1 - 0.20)/(103 - 2 - 1)) = 12.50. Every additional month leads to a $5,000 decrease in larceny.
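The hand calculation of 12.5 uses the Global F formula F = (R2/k) / ((1 - R2)/(n - k - 1)), with k = 2 regressors (unemployment and the time trend). A minimal sketch of that formula (the function name is illustrative):

```python
def global_f_statistic(r_squared, n, k):
    """Global F-statistic for overall significance, computed from R-squared."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# n = 103 months, k = 2 regressors, R2 = 0.20 as reported in the question.
f_global = global_f_statistic(0.20, n=103, k=2)

print(round(f_global, 2))
```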
Consider the following regression equation with standard errors in parentheses. CMt = 180 - 0.30Incomet - 0.05Mt (s.e.) (20) (0.1) (0.01) where CMt = quarterly Egyptian child mortality rate, Incomet = quarterly aggregate income of American farmers, and Mt = quarterly Honduran money supply. R2 = 0.92, DW = 0.4752, F-stat (prob)= 0.04 and n = 100 Which of the following is not evidence of this being a spurious regression relationship?
The variables' relationship is not statistically significant; the overall regression appears statistically significant but the individual variables are not.
1. T/F. The validity of the random effects estimator relies on the random effects to be uncorrelated with the regressors.
True
Both the error term, u, and regressors, x, are random variables.
True
Good in theory is good in practice
True
1. T/F. If the marginal impact is a 10 percentage point increase of Y for an additional X when facing "two-to-one odds", the coefficient for X must be 0.45.
True. 2/3*1/3*b = 0.10 => 2/9*b = 0.10 =>b = 0.45
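The calculation above uses the logit marginal effect, ME = p(1 - p)b, after converting odds to a probability. A minimal sketch of the same steps (function names are illustrative, not from the source):

```python
def odds_to_probability(odds_for, odds_against):
    """Convert 'a-to-b odds' into a probability: a / (a + b)."""
    return odds_for / (odds_for + odds_against)

def logit_coefficient(marginal_effect, p):
    """Back out the logit coefficient b from ME = p * (1 - p) * b."""
    return marginal_effect / (p * (1 - p))

p = odds_to_probability(2, 1)   # two-to-one odds correspond to p = 2/3
b = logit_coefficient(0.10, p)  # a 10 percentage point marginal effect

print(round(p, 4), round(b, 2))
```

The same function handles even odds: with p = 0.5, the factor p(1 - p) is 0.25, so the coefficient is four times the marginal effect.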
Solve for the natural rate of unemployment for the following two Phillips curve models. ∆πt = 2.961687 - 0.483849ut and ∆πt = -2.601144 + 14.92249
Un = 6.121097698 and Un = 5.736894997
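The natural rate is the unemployment rate at which inflation stops changing, so each model is solved by setting ∆πt = 0. A minimal sketch, assuming the first model is linear in ut and the second takes the reciprocal form ∆πt = a + b(1/ut); the reciprocal form is an assumption, chosen because it is the one the stated answer implies:

```python
def natural_rate_linear(intercept, slope):
    """Solve 0 = intercept + slope * u for u."""
    return -intercept / slope

def natural_rate_reciprocal(intercept, slope):
    """Solve 0 = intercept + slope * (1 / u) for u (assumed functional form)."""
    return -slope / intercept

un_linear = natural_rate_linear(2.961687, -0.483849)
un_reciprocal = natural_rate_reciprocal(-2.601144, 14.92249)

print(round(un_linear, 4), round(un_reciprocal, 4))
```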
[Correlation and VIF tables not shown.] Is there evidence of problematic multicollinearity from the Correlation and VIF tables?
Yes, VIFs > 5 and correlation > 0.8 for study and study2
Which of the following models is a distributed lag model?
Yt = β0 + β1Xt + β2Xt-1 + β3Xt-2 + ut
Which of the following models is a Koyck model?
Yt = δ0 + δ1Xt + δ2Yt-1+ ut
1. Consider the following output table (coefficient, standard error, t-statistic, p-value): C = 5.0, 2.5, ?, 0.06; Price = 1.5, 0.5, ?, 0.006. a) The t-stat for β0 is [a] (x.x) and it [b] (is/is not) statistically significant at the 5% significance level. b) The model explains [c] (xx) percent of the variation in quantity. c) The degrees of freedom for a t-test are [d] (xx). d) Assuming a critical t value of 2.0, the slope [e] (is/is not) statistically different from the value 1.0. e) In this regression, "Quantity" is the [f] (dependent/independent) variable. "Price" is assumed to be a [g] (random/non-random/dependent) variable. f) If the model has a Jarque-Bera probability (JB prob) of 0.15, this implies the errors [are not] (are/are not) normally distributed. Non-normally distributed errors imply that our model's [t-stats] (t-stats/R-square/betas) may not be valid. g) To sell 80 units, price must be [j] (x.x). h) At a price of $10, quantity will equal [k] (no hint). i) A correct interpretation of the slope is [l] (choose i, ii, iii, or iv): iii. every $10 increase in price causes a 15 increase in quantity
a) 2.0, is not b) 62 c) 28 d) Is not, t = (1.5-1)/0.5 = 1 which is < tcrit = 2.0 e) dependent, nonrandom f) are not, t-stats g) $50 (80 = 5 + 1.5*P => 75/1.5 = P) h) 20 units will be sold i) iii
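Several of these answers are direct arithmetic on the output table. A minimal sketch, assuming the table columns are coefficient and standard error and the fitted line is Quantity = 5.0 + 1.5 * Price (variable names are illustrative):

```python
b0, se_b0 = 5.0, 2.5  # intercept and its standard error
b1, se_b1 = 1.5, 0.5  # slope on Price and its standard error

# a) t-stat for the intercept against zero.
t_intercept = b0 / se_b0

# d) t-stat for testing whether the slope differs from 1.0.
t_slope_vs_one = (b1 - 1.0) / se_b1
slope_differs_from_one = abs(t_slope_vs_one) > 2.0

# g) Price needed to sell 80 units: invert the fitted line.
price_for_80 = (80 - b0) / b1

# h) Predicted quantity at a price of $10.
quantity_at_10 = b0 + b1 * 10

print(t_intercept, t_slope_vs_one, price_for_80, quantity_at_10)
```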
Consider the following regression equation: Hourly wage = -0.01445 + 0.724097*Years of Education a) Interpret the regression coefficients (slope). Every additional year of education raises hourly wage by 72 cents. b) Predict the hourly wage for someone with 5 years of education. $3.61 c) Predict the hourly wage for someone with 10 years of education. $7.23
a) Every additional year of education raises hourly wage by 72 cents. b) $3.61 (-0.01445 + 0.724097*5 = 3.606) c) $7.23 (-0.01445 + 0.724097*10 = 7.227)
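The two predictions can be checked by evaluating the fitted line directly. A minimal sketch using the coefficients from the question (the function name is illustrative); the 5-year prediction evaluates to about 3.606, which rounds to $3.61:

```python
def predict_wage(years_of_education):
    """Predicted hourly wage from the fitted line -0.01445 + 0.724097 * education."""
    return -0.01445 + 0.724097 * years_of_education

wage_5 = predict_wage(5)
wage_10 = predict_wage(10)

print(round(wage_5, 2), round(wage_10, 2))
```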
Consider the following regression equation below with standard errors given in parentheses. colGPAi = 1.2 + 0.3hsGPAi + 0.01ACTi + 0.5Studyi - 0.1Studyi2 + 0.1rmhsGPAi s.e. (0.5) (0.1) (0.01) (0.4) (0.1) (0.025) n = 100, R2 = 0.45, F-statistic = 15.38, and F-stat (prob) = 0.002 where colGPAi = college grade point average, hsGPAi = high school GPA, ACTi = achievement test score, Study = hours studying per day, rmhsGPA = roommate's high school GPA. a) According to the results, every 1 point increase in a student's high school GPA results in a [b] (rise/fall) of college GPA of [a.a] (x.x) points. This result is [d] (statistically significant/insignificant). b) The model predicts a college GPA of [x.xx] (x.xx) for someone with a high school GPA of 3.0, a 25 ACT score, who studies for 4 hours a day, and whose roommate had a 3.0 hs GPA.
a) rise, 0.3, statistically significant (t = 0.3/0.1 = 3) b) 3.05
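The prediction in part b) requires squaring the study hours before applying the Study2 coefficient. A minimal sketch using the coefficients from the question (the function and argument names are illustrative):

```python
def predict_col_gpa(hs_gpa, act, study_hours, roommate_hs_gpa):
    """Predicted college GPA from the fitted equation, including the squared study term."""
    return (1.2
            + 0.3 * hs_gpa
            + 0.01 * act
            + 0.5 * study_hours
            - 0.1 * study_hours ** 2
            + 0.1 * roommate_hs_gpa)

predicted_gpa = predict_col_gpa(hs_gpa=3.0, act=25, study_hours=4, roommate_hs_gpa=3.0)

# a) t-stat for hsGPA: coefficient divided by its standard error.
t_hs_gpa = 0.3 / 0.1

print(round(predicted_gpa, 2), round(t_hs_gpa, 2))
```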
1. The Institute of Econometric Excellence's report "Adult Males Living with Parents" states the following: The examination of the percentage of males aged 25-34 living with their parents across 1000 US cities reveals that three economic and three sociological variables were deemed determinative. In a solely economic regression model, called Model E, the three economic variables alone explained 45% of the changes in cohabitation. In a solely sociological regression model, called Model S, the three sociology variables alone explained 47% of the changes in cohabitation. To determine whether the economic variables added any predictive power to Model S, and to avoid variables confounding one another, the researchers added the predicted percentages of cohabitation from Model E to Model S. The coefficient for predicted cohabitation was 2.5 with a standard error of 1.0. The authors then added the predicted percentages of cohabitation from Model S to Model E and found that the coefficient for predicted cohabitation was -3.0 with a standard error of 1.0. a) The econometric term for the problem of variables "confounding" one another is [multicollinearity] and it would appear in regression results in terms of [VIFs exceeding 5]. b) The test procedure the researchers are running by adding the predicted cohabitation variables to different regression models is the [Ramsey RESET /Davidson-MacKinnon J/Quandt-Andrews Breakpoint] test. c) According to these results, if our sole desire is to accurately predict the percentage of cohabitation in a city, the variables that should be included in a regression model are the [sociology only/economic only/both the sociology and economic] because the test reveals [sociology only/economic only/both the sociology and economic] are statistically significant.
b) Davidson-MacKinnon J c) both the sociology and economic; both the sociology and economic. Both sociology and economic variables should be included because the test above indicated that each model adds value to the other.
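The J test conclusion depends on whether the coefficient on each model's fitted values is significant when added to the rival model. A minimal sketch of that check, assuming a large-sample two-tailed critical value of 1.96 (the critical value and the names below are assumptions, not from the source):

```python
T_CRIT = 1.96  # assumed large-sample 5% two-tailed critical value (n = 1000)

def is_significant(coefficient, std_error, t_crit=T_CRIT):
    """Return True when the t-stat exceeds the critical value in absolute terms."""
    return abs(coefficient / std_error) > t_crit

# Fitted values from Model E added to Model S: coefficient 2.5, s.e. 1.0.
economic_adds_to_s = is_significant(2.5, 1.0)

# Fitted values from Model S added to Model E: coefficient -3.0, s.e. 1.0.
sociology_adds_to_e = is_significant(-3.0, 1.0)

include_both = economic_adds_to_s and sociology_adds_to_e
print(economic_adds_to_s, sociology_adds_to_e, include_both)
```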
Which two of the following conditions must a stationary variable uphold?
Constant mean, constant variance, and time-independent covariance
The University of Tampa provost asks you to find the relationship between weekly hours spent working (work) and weekly hours spent studying (study). He knows that students choose both of these activities. Given this information, the relationship is best characterized as a [y] (correlation/regression) because work is a/an [z] (dependent/independent) variable and study is a/an [b] (dependent/independent) variable.
correlation, independent, independent
Assume the correlation between ROE and Annual Revenue, Rev, is 0.5. If ROE were added to the model in question 1 - making it a multi-variate model with ROE and Rev as regressors - we would expect to the coefficient on Rev to [a] (decrease/increase/remain the same) and be [b] (positive/negative/zero) and we would expect R2 to [c] (rise/fall/stay the same) and Adjusted R2 to [d] (rise/fall/stay the same).
decrease, positive, rise, rise
A constant error variance is one sign of heteroscedasticity.
false
Heteroscedasticity can arise under the following conditions except when:
A superfluous (i.e., unnecessary) regressor is included in our model.
First-order serial correlation occurs when:
the errors are correlated with their previous period values.
Applications-Science Grants. Did student science grants increase applications to Cragmoore College? Consider the following regression results (with standard errors in the parentheses) for the years 1959 through 2016, appt = 2,000 + 0.05 grantst + 240 wart - 32 rivalt (s.e.) (1000) (0.01) (70) (4) a. Are the distributed lags individually statistically significant? To answer, we use [a] (an F-test/two-tailed t-tests/one-tailed t-tests/a Wald test) with degrees of freedom of [xx] (xx). The critical value for our test statistic is [y.yy] (x.xx). From this test, we find that [z] (none/some/all) of the distributed lags are statistically significant. b. Are the distributed lags jointly statistically significant? To answer, we use [c] (F-test/two-tailed t-tests/one-tailed t-tests/a Wald test) whose critical value for our test statistic is [d] (x.xx). We compare this critical value to our test statistic of [bb] (x). Because the critical value is [g] (larger/smaller) than the test statistic, we [e] (fail to reject/reject) the null hypothesis and conclude that the distributed lags are [f] (not jointly significant/jointly significant). We might expect the result above because of [h] (our model being properly specified/multicollinearity/serial correlation/heteroscedastic model errors)
a. Two-tailed t-tests, 50, 2.01, none. b. F-test, 3.18, 5, smaller, reject, jointly significant, multicollinearity. 300, 1, multi, serial
When the CLRM assumptions are maintained, the resulting regression coefficients will have the following two properties of
unbiasedness and efficiency
The following is a BPG/White [x] (x) test (with standard errors in parentheses) for heteroscedasticity. Given the results, we are most likely to [y] (reject/fail to reject) the null hypothesis of [z] (homoscedasticity/heteroscedasticity) because a statistically significant relationship [a] (does/does not) exist between the squared errors and some of the regressors. ûi2 = 8 + 2X1i + 1.5X2i + 4X1i2 + 3X2i2 + 1X1iX2i (s.e.) (4) (3) (2) (1) (1) (2)
White, reject, homoscedasticity, does. Some of the t-stats are significant (4/1 = 4 on X1i2 and 3/1 = 3 on X2i2). The White test is used for impure heteroscedasticity.
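The "reject" decision follows from the t-stats on the regressors in the test equation. A minimal sketch that computes them from the coefficients and standard errors shown, assuming a critical value of 1.96 (the critical value and the term labels are assumptions for illustration):

```python
T_CRIT = 1.96  # assumed large-sample 5% two-tailed critical value

# (coefficient, standard error) for each non-constant term in the test regression.
terms = {
    "X1": (2.0, 3.0),
    "X2": (1.5, 2.0),
    "X1^2": (4.0, 1.0),
    "X2^2": (3.0, 1.0),
    "X1*X2": (1.0, 2.0),
}

t_stats = {name: coef / se for name, (coef, se) in terms.items()}
significant = [name for name, t in t_stats.items() if abs(t) > T_CRIT]

# Any significant term suggests the squared errors vary with the regressors.
reject_homoscedasticity = len(significant) > 0

print(significant, reject_homoscedasticity)
```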