Econometrics Midterm 1 Review
See image
Draw a graph with y on the vertical axis and x on the horizontal axis. On this picture: a. Draw 5 data points on the graph (they can be anywhere you like), and draw a line representing the OLS regression line given those 5 data points (it does not need to be perfectly placed) b. For one of the 5 data points, indicate the distance representing 𝑦̂𝑖 c. For one of the 5 data points (it can be the same point as in part b.), indicate the distance representing 𝑢̂𝑖
𝑉𝑎𝑟(𝛽̂1) depends on 𝜎², which we do not know. On the other hand, 𝑉𝑎𝑟̂(𝛽̂1) depends on 𝜎̂², which we can calculate from the OLS residuals. Hence we can calculate 𝑉𝑎𝑟̂(𝛽̂1) but not 𝑉𝑎𝑟(𝛽̂1).
Explain carefully why we cannot calculate 𝑉𝑎𝑟(𝛽̂1), but we can calculate 𝑉𝑎𝑟̂ (𝛽̂1).
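For reference, the two quantities can be written side by side. This is a sketch under the homoskedasticity assumption (SLR.5) and conditional on the sample values of x. The symbols SST_x (total sample variation in x) and SSR (sum of squared residuals) are standard textbook notation assumed here, and may differ from the class notes.

```latex
\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{SST_x},
\qquad
\widehat{\operatorname{Var}}(\hat\beta_1) = \frac{\hat\sigma^2}{SST_x},
\qquad
\hat\sigma^2 = \frac{SSR}{n-2} = \frac{1}{n-2}\sum_{i=1}^{n}\hat u_i^{\,2},
\qquad
SST_x = \sum_{i=1}^{n}(x_i-\bar x)^2 .
```

The only difference is that the unknown error variance is replaced by an estimate built from the residuals, which are observable after estimation.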
Everything is correct except for the last sentence "Thus, I can test...". The issue with the last sentence is that, as we discussed in class, the 𝑢̂𝑖 and xi are uncorrelated for ***every*** OLS regression, by construction. I.e. the 𝑢̂𝑖 and xi are uncorrelated even if the true 𝑢𝑖 and xi are correlated (i.e. the ZCM doesn't hold). Hence finding that 𝑢̂𝑖 and xi are uncorrelated (which you will always find!) provides no information on whether the ZCM holds.
Explain what is wrong with the following "logic": "If the ZCM assumption doesn't hold in my SLR model, then 𝛽̂0 and 𝛽̂1 will generally be biased. The ZCM assumption requires that the ui are uncorrelated with the xi. I don't observe the ui, so I can't directly check whether the ui are uncorrelated with the xi. However, after OLS estimation, I can calculate 𝑢̂𝑖, and these 𝑢̂𝑖 are estimates of the ui. Thus, I can test whether the ZCM assumption is correct by checking whether the 𝑢̂𝑖 are uncorrelated with the xi, i.e. if I find that 𝑢̂𝑖 and xi are uncorrelated, this is evidence that the ZCM assumption is correct"
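The point can be checked numerically. The following is a minimal simulation sketch, not part of the original exam: the data-generating process, the parameter values, and the helper names `ols_slr` and `sample_cov` are all assumptions made up for illustration. The true error is built to be correlated with x (so the ZCM fails), yet the OLS residuals still have zero sample covariance with x.

```python
import random


def ols_slr(x, y):
    """Return (b0, b1) from a simple OLS regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def sample_cov(a, b):
    """Sample covariance between two equal-length lists."""
    n = len(a)
    abar = sum(a) / n
    bbar = sum(b) / n
    return sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b)) / (n - 1)


random.seed(0)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
# Hypothetical setup: the true error depends on x, so the ZCM fails.
u = [0.8 * xi + random.gauss(0, 1) for xi in x]
# Assumed true model: y = 1 + 2x + u.
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

b0, b1 = ols_slr(x, y)
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

cov_true = sample_cov(u, x)       # clearly positive: ZCM is violated
cov_resid = sample_cov(resid, x)  # zero up to floating-point error

print("slope estimate:", b1)
print("cov(u, x):", cov_true)
print("cov(u_hat, x):", cov_resid)
```

In this sketch the slope estimate is biased away from the assumed true value of 2, but the residual check still "passes", which is why the check carries no information about the ZCM.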
c) The variance of 𝑢 depends on the value of the 𝑥 variables
Heteroskedasticity is when a) The expected value of 𝑢 depends on the value of the 𝑥 variables b) The expected value of 𝑢 does not depend on the value of the 𝑥 variables c) The variance of 𝑢 depends on the value of the 𝑥 variables d) The variance of 𝑢 does not depend on the value of the 𝑥 variables
d) The variance of 𝑢 depends on the value of the 𝑥 variables
Heteroskedasticity is when a) The expected value of 𝑢 does not depend on the value of the 𝑥 variables b) The expected value of 𝑢 depends on the value of the 𝑥 variables c) The variance of 𝑢 does not depend on the value of the 𝑥 variables d) The variance of 𝑢 depends on the value of the 𝑥 variables
c) an increase in the variance of ui
In SLR, which of the following changes (all else equal) would tend to increase 𝑉𝑎𝑟(𝛽̂1)? a) an increase in the sample variance of xi b) an increase in the number of observations in the sample c) an increase in the variance of ui d) a and b e) a and c f) b and c
b) an increase in the variance of ui
In SLR, which of the following changes (all else equal) would tend to increase 𝑉𝑎𝑟(𝛽̂1)? a) an increase in the sample variance of xi b) an increase in the variance of ui c) an increase in the number of observations in the sample d) a and b e) a and c f) b and c
f) a and c
In the Multiple Linear Regression model 𝑦 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + 𝑢, the ZCM assumption requires: a) x1 to be uncorrelated with u b) x1 to be uncorrelated with x2 c) x2 to be uncorrelated with u d) a and b e) b and c f) a and c g) a, b, and c h) none of the above
e) a and c
In the Multiple Linear Regression model 𝑦 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + 𝑢, the ZCM assumption requires: a) x1 to be uncorrelated with u b) x1 to be uncorrelated with x2 c) x2 to be uncorrelated with u d) a and b e) a and c f) b and c g) a, b, and c h) none of the above
Suppose we are trying to measure the impact of hours spent in a job training program on wages, i.e. 𝑤𝑎𝑔𝑒 = 𝛽0 + 𝛽1𝑗𝑜𝑏𝑡𝑟𝑎𝑖𝑛𝑖𝑛𝑔 + 𝑢. If individuals choose their amount of job training on their own, jobtraining could easily be correlated with omitted variables in u (e.g. education, work ability, etc.). But suppose we can instead randomly assign individuals to receive different levels of job training (e.g. we randomly select a third of the individuals to get 10 hours of training (jobtraining = 10), a third of the individuals to get 5 hours of training (jobtraining = 5), and a third of the individuals to get 0 hours of training (jobtraining = 0)). Then, because the value of jobtraining was determined by a roll of a die (or computer random number generator), it will essentially be uncorrelated with any omitted variables in u. With no correlation between jobtraining and the omitted variables, there is no omitted variable bias.
In the context of a specific SLR model example of your choice (i.e. tell me what your y variable is (e.g. educ, sales, donations, etc. or whatever you want), and what your x variable is (e.g. another variable of your choice that you think might affect your y)), explain how you could use randomization to avoid omitted variable bias (OVB). Be sure to explain carefully "why" randomization ensures no OVB.
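A small simulation can make the randomization argument concrete. This is a hedged sketch with an invented data-generating process: the variable `ability`, the coefficient values, and the helper `ols_slope` are hypothetical and are not taken from the exam. Wages depend on training and on an unobserved ability term; when people self-select into training based on ability the OLS slope is biased, and when training is assigned at random it is close to the assumed true effect.

```python
import random


def ols_slope(x, y):
    """Return the OLS slope from a simple regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


random.seed(1)
n = 20000
TRUE_BETA1 = 2.0  # assumed true effect of one hour of training on wage

# Unobserved ability sits in the error term u.
ability = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]


def wages(train):
    """Hypothetical wage equation: wage = 10 + 2*train + 3*ability + noise."""
    return [10 + TRUE_BETA1 * t + 3 * a + e
            for t, a, e in zip(train, ability, noise)]


# Case 1: self-selection. Higher-ability people choose more training.
train_self = [5 + 2 * a + random.gauss(0, 1) for a in ability]
# Case 2: randomization. Training is 0, 5, or 10 hours, assigned at random.
train_rand = [random.choice([0, 5, 10]) for _ in range(n)]

slope_self = ols_slope(train_self, wages(train_self))
slope_rand = ols_slope(train_rand, wages(train_rand))

print("self-selected slope:", slope_self)
print("randomized slope:", slope_rand)
```

Under these assumptions the self-selected slope overstates the effect because it also picks up the ability premium, while the randomized slope does not.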
Family income and cigarettes smoked explain 2.98% of the variation in birthweight.
Interpret the R-squared value of 0.0298. (One sentence.)
A $1000 increase in family income is associated with an expected increase in birthweight of 0.0927 ounces, holding number of cigarettes smoked fixed.
Interpret the estimate of the slope parameter on the faminc variable. (One sentence.)
2213.6
Now suppose that instead of measuring addoll in units of hundreds of dollars, you measure it in units of thousands of dollars, i.e. you define the new variable addoll_1000 = addoll/10 and run a new regression of salesperweek (not salesperweek_100!) on logprice and addoll_1000. a) What would the estimated slope coefficient on addoll_1000 in the new regression be?
8516
Now suppose that instead of measuring addoll in units of hundreds of dollars, you measure it in units of thousands of dollars, i.e. you define the new variable addoll_1000 = addoll/10 and run a new regression of salesperweek (not salesperweek_100!) on logprice and addoll_1000. b) What would the estimated constant term in the new regression be?
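The rescaling rule behind these two answers (dividing a regressor by 10 multiplies its slope by 10 and leaves the constant unchanged) can be verified on toy data. The sketch below uses a simple regression with made-up numbers rather than the exam's actual data set, which is not available here; the helper `ols_slr` is hypothetical.

```python
def ols_slr(x, y):
    """Return (b0, b1) from a simple OLS regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Made-up weekly data, for illustration only.
addoll = [1.0, 2.0, 3.0, 4.0, 5.0]        # advertising, hundreds of dollars
salesperweek = [3.0, 5.0, 4.0, 8.0, 9.0]  # unit sales

# Same spending measured in thousands of dollars.
addoll_1000 = [a / 10 for a in addoll]

b0, b1 = ols_slr(addoll, salesperweek)
b0_new, b1_new = ols_slr(addoll_1000, salesperweek)

print("original:", b0, b1)
print("rescaled x:", b0_new, b1_new)
```

A one-unit change in addoll_1000 is ten times as much spending as a one-unit change in addoll, so the slope has to be ten times larger for the fitted values to stay the same.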
Yes
Now suppose that instead of regressing salesperweek on the two variables logprice and addollars, you instead run a new regression where you regress salesperweek on three different variables: sitehits (visits to your website in that week), coupons (the number of coupons you gave to consumers in that week), and discounts (the amount of discounts you gave that week). Could the R2 in the new regression be greater than 0.4738?
Yes
Now suppose that instead of regressing salesperweek on the two variables logprice and addollars, you instead run a new regression where you regress salesperweek on three different variables: sitehits (visits to your website in that week), coupons (the number of coupons you gave to consumers in that week), and discounts (the amount of discounts you gave that week). Could the R2 in the new regression be lower than 0.4738?
b) The total sample variation in 𝑦𝑖
SST measures a) The total sample variation in 𝑦̂𝑖 b) The total sample variation in 𝑦𝑖 c) The total sample variation in 𝑦̂𝑖 minus the total sample variation in 𝑦̅ d) The total sample variation in 𝑦𝑖 minus the total sample variation in 𝑦̂
a) The total sample variation in 𝑦𝑖
SST measures a) The total sample variation in 𝑦𝑖 b) The total sample variation in 𝑦̂𝑖 c) The total sample variation in 𝑦𝑖 minus the total sample variation in 𝑦̂𝑖 d) The total sample variation in 𝑦̂𝑖 minus the total sample variation in 𝑦̅
190, i.e. $190,000 (0.5*140 + 100 + 20)
Suppose a study of house prices estimates the following OLS regression line 𝑝𝑟𝑖𝑐𝑒 = 200 + 0.5𝑠𝑞𝑓𝑡 + 100𝑏𝑒𝑑𝑟𝑜𝑜𝑚𝑠 + 20𝑏𝑎𝑡ℎ𝑟𝑜𝑜𝑚𝑠 where price is the selling price of the house (in thousands of dollars), sqft is the size of the house (in square feet), bedrooms is the number of bedrooms the house has, and bathrooms is the number of bathrooms the house has. Suppose you are considering building a new addition on the back of your house that will contain a 100 square foot bedroom and a 40 square foot bathroom. According to the regression results, by how much should you expect the new addition to increase the value (i.e. price) of your house?
220, i.e. $220,000 (0.5*200 + 100 + 20)
Suppose a study of house prices estimates the following OLS regression line 𝑝𝑟𝑖𝑐𝑒 = 200 + 0.5𝑠𝑞𝑓𝑡 + 100𝑏𝑒𝑑𝑟𝑜𝑜𝑚𝑠 + 20𝑏𝑎𝑡ℎ𝑟𝑜𝑜𝑚𝑠 where price is the selling price of the house (in thousands of dollars), sqft is the size of the house (in square feet), bedrooms is the number of bedrooms the house has, and bathrooms is the number of bathrooms the house has. Suppose you are considering building a new addition on the back of your house that will contain a 150 square foot bedroom and a 50 square foot bathroom. According to the regression results, by how much should you expect the new addition to increase the value (i.e. price) of your house?
2.213
Suppose that instead of measuring salesperweek in number of units, you measure it in hundreds of units, i.e. you define the new variable salesperweek_100 = salesperweek/100 and run a new regression of salesperweek_100 on logprice and addoll. a) What would the estimated slope coefficient on addoll in the new regression be?
85.16
Suppose that instead of measuring salesperweek in number of units, you measure it in hundreds of units, i.e. you define the new variable salesperweek_100 = salesperweek/100 and run a new regression of salesperweek_100 on logprice and addoll. b) What would the estimated constant term in the new regression be?
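The reasoning behind both answers can be written out in one line. This is a sketch in generic notation: the hatted coefficients stand for the original regression's estimates, and only the constant 8516 is taken from the answers given here.

```latex
\widehat{salesperweek} = \hat\beta_0 + \hat\beta_1\,logprice + \hat\beta_2\,addoll
\;\;\Longrightarrow\;\;
\widehat{salesperweek\_100}
= \frac{\hat\beta_0}{100} + \frac{\hat\beta_1}{100}\,logprice + \frac{\hat\beta_2}{100}\,addoll ,
\qquad\text{e.g.}\quad \frac{8516}{100} = 85.16 .
```

Dividing the dependent variable by 100 divides every estimated coefficient, including the constant, by 100.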
Omitted variable bias depends on two things: the correlation between the omitted variable and the included variable (in this case corr(educ, teendruguse), or 𝛿 in notes), and the effect of the omitted variable on the y variable (in this case, the effect of educ on crimerate, or 𝛽2 in notes). Because I think education is generally a good thing, I would expect educ to be negatively correlated with teendruguse, i.e. that cities with higher education levels would tend to have lower teen drug use; and I would expect educ to have a negative effect on crimerate, i.e. that increased education levels would tend to cause lower crime rates. Because both these objects (𝛽2 and 𝛿) are negative, our OVB formula implies that 𝛽̂1 will be positively biased. Thus, since the estimate 1.534 is positively biased, I would expect the true effect of increasing teendruguse by 1 on crimerate to be something less than 1.534.
Suppose you are using data from a sample of cities to study the effect of illegal drug use on crime. You consider the following regression model: 𝑐𝑟𝑖𝑚𝑒𝑟𝑎𝑡𝑒 = 𝛽0 + 𝛽1𝑡𝑒𝑒𝑛𝑑𝑟𝑢𝑔𝑢𝑠𝑒 + 𝑢 where crimerate is the measured crime rate in each city, and teendruguse is a variable measuring the proportion of young adults in the city who have used illegal drugs in the past year. Suppose that the OLS estimate of the slope parameter is 𝛽̂1 = 1.534. Now consider another variable called educ, which is omitted from the above regression and measures the average education level in each city. Use economic reasoning to argue that the omission of educ from the above regression model would bias 𝛽̂1. Be sure to explain, based on your economic reasoning, whether you expect the bias to be positive or negative. Given this reasoning, do you think the effect of increasing teendruguse by 1 on crimerate is more likely to be greater than 1.534 or less than 1.534?
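The sign argument can be summarized with the standard omitted variable bias formula. This is a sketch in textbook notation, which is assumed to match the 𝛽2 and 𝛿 of the class notes: 𝛿 is taken to be the slope from a regression of educ on teendruguse, so it has the same sign as corr(educ, teendruguse).

```latex
E[\hat\beta_1] = \beta_1 + \beta_2\,\delta ,
\qquad
\text{bias} = \beta_2\,\delta ,
\qquad
\beta_2 < 0 \ \text{and}\ \delta < 0 \;\Longrightarrow\; \beta_2\,\delta > 0 .
```

A positive bias means the OLS estimate tends to overstate 𝛽1, which is why the true effect is expected to be below the reported estimate.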
Omitted variable bias depends on two things: the correlation between the omitted variable and the included variable (in this case corr(educ, teendruguse), or 𝛿 in notes), and the effect of the omitted variable on the y variable (in this case, the effect of educ on crimerate, or 𝛽2 in notes). Because I think education is generally a good thing, I would expect educ to be negatively correlated with teendruguse, i.e. that cities with higher education levels would tend to have lower teen drug use; and I would expect educ to have a negative effect on crimerate, i.e. that increased education levels would tend to cause lower crime rates. Because both these objects (𝛽2 and 𝛿) are negative, our OVB formula implies that 𝛽̂1 will be positively biased. Thus, since the estimate 2.134 is positively biased, I would expect the true effect of increasing teendruguse by 1 on crimerate to be something less than 2.134.
Suppose you are using data from a sample of cities to study the effect of illegal drug use on crime. You consider the following regression model: 𝑐𝑟𝑖𝑚𝑒𝑟𝑎𝑡𝑒 = 𝛽0 + 𝛽1𝑡𝑒𝑒𝑛𝑑𝑟𝑢𝑔𝑢𝑠𝑒 + 𝑢 where crimerate is the measured crime rate in each city, and teendruguse is a variable measuring the proportion of young adults in the city who have used illegal drugs in the past year. Suppose that the OLS estimate of the slope parameter is 𝛽̂1 = 2.134. Now consider another variable called educ, which is omitted from the above regression and measures the average education level in each city. Use economic reasoning to argue that the omission of educ from the above regression model would bias 𝛽̂1. Be sure to explain, based on your economic reasoning, whether you expect the bias to be positive or negative. Given this reasoning, do you think the effect of increasing teendruguse by 1 on crimerate is more likely to be greater than 2.134 or less than 2.134?
It tells us that increasing price by 1% leads to about 3.7541 fewer unit sales per week.
1. a) What does the estimated coefficient on logprice tell us about the extent to which increasing price will affect sales?
It tells us that increasing advertising by $10 per week leads to 2.214 more unit sales per week.
1. b) What does the estimated coefficient on addoll imply about the effect of increasing your advertising spending by $10 per week (remember that addoll is measured in units of hundreds of dollars)?
0.22136
2. (2 points) Suppose that instead of measuring salesperweek in number of units, you measure it in hundreds of units, i.e. you define the new variable salesperweek_100 = salesperweek/100 and run a new regression of salesperweek_100 on logprice and addoll. a) What would the estimated slope coefficient on addoll in the new regression be?
8.516
2. (2 points) Suppose that instead of measuring salesperweek in number of units, you measure it in hundreds of units, i.e. you define the new variable salesperweek_100 = salesperweek/100 and run a new regression of salesperweek_100 on logprice and addoll. b) What would the estimated constant term in the new regression be?
221.366
3. (2 points) Now suppose that instead of measuring addoll in units of hundreds of dollars, you measure it in units of thousands of dollars, i.e. you define the new variable addoll_1000 = addoll/10 and run a new regression of salesperweek (not salesperweek_100!) on logprice and addoll_1000. a) What would the estimated slope coefficient on addoll_1000 in the new regression be?
851.6248
3. (2 points) Now suppose that instead of measuring addoll in units of hundreds of dollars, you measure it in units of thousands of dollars, i.e. you define the new variable addoll_1000 = addoll/10 and run a new regression of salesperweek (not salesperweek_100!) on logprice and addoll_1000. b) What would the estimated constant term in the new regression be?
yes
4. Now suppose that instead of regressing salesperweek on the two variables logprice and addollars, you instead run a new regression where you regress salesperweek on three different variables: sitehits (visits to your website in that week), coupons (the number of coupons you gave to consumers in that week), and discounts (the amount of discounts you gave that week). Could the R2 in the new regression be greater than 0.4738?
yes
4. Now suppose that instead of regressing salesperweek on the two variables logprice and addollars, you instead run a new regression where you regress salesperweek on three different variables: sitehits (visits to your website in that week), coupons (the number of coupons you gave to consumers in that week), and discounts (the amount of discounts you gave that week). Could the R2 in the new regression be lower than 0.4738?
𝛽̂1 would not change, i.e. 𝛽̂1 = 4. Remember that because of the log, 𝛽̂1 measures the effect of a percentage change in the x variable. So, for example, the slope in the first regression measures the effect of a 100% increase in price measured in dollars, and the slope in the second regression measures the effect of a 100% increase in price measured in cents. But a 100% increase in price measured in dollars is *equivalent* to a 100% increase in price measured in cents (in contrast, increasing price measured in dollars by 1 is *not equivalent* to increasing price measured in cents by 1). Thus the slopes in the two regressions measure the same thing and will be the same.
Consider the following regression model 𝑠𝑎𝑙𝑒𝑠 = 𝛽0 + 𝛽1log(𝑝𝑟𝑖𝑐𝑒) + 𝑢 where price is the average weekly price (measured in dollars), log(price) is the natural log of price, and sales is your weekly sales. Suppose the OLS slope parameter estimate is 𝛽̂1 = 4. Now suppose that instead of measuring price in dollars, you measure it in cents. Call this new variable price_incents, take the natural log of it, and consider the new regression model 𝑠𝑎𝑙𝑒𝑠 = 𝛽0 + 𝛽1log(𝑝𝑟𝑖𝑐𝑒_𝑖𝑛𝑐𝑒𝑛𝑡𝑠) + 𝑢 What do you expect 𝛽̂1 to be in this new regression model? Explain your reasoning.
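The same conclusion follows algebraically from the properties of the log. In this sketch the hatted coefficients are the estimates from the regression on log(price), and the only fact used is that price_incents = 100 × price.

```latex
\log(price\_incents) = \log(100 \cdot price) = \log(100) + \log(price)
\;\;\Longrightarrow\;\;
\widehat{sales} = \hat\beta_0 + \hat\beta_1 \log(price)
= \bigl(\hat\beta_0 - \hat\beta_1 \log 100\bigr) + \hat\beta_1 \log(price\_incents) .
```

The change of units is absorbed entirely by the constant term, so only the intercept estimate changes and the slope estimate is the same in both regressions.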
𝛽̂1 would not change, i.e. 𝛽̂1 = 2. Remember that because of the log, 𝛽̂1 measures the effect of a percentage change in the x variable. So, for example, the slope in the first regression measures the effect of a 100% increase in price measured in dollars, and the slope in the second regression measures the effect of a 100% increase in price measured in cents. But a 100% increase in price measured in dollars is *equivalent* to a 100% increase in price measured in cents (in contrast, increasing price measured in dollars by 1 is *not equivalent* to increasing price measured in cents by 1). Thus the slopes in the two regressions measure the same thing and will be the same.
Consider the following regression model 𝑠𝑎𝑙𝑒𝑠 = 𝛽0 + 𝛽1log(𝑝𝑟𝑖𝑐𝑒) + 𝑢 where price is the average weekly price (measured in dollars), log(price) is the natural log of price, and sales is your weekly sales. Suppose the OLS slope parameter estimate is 𝛽̂1 = 2. Now suppose that instead of measuring price in dollars, you measure it in cents. Call this new variable price_incents, take the natural log of it, and consider the new regression model 𝑠𝑎𝑙𝑒𝑠 = 𝛽0 + 𝛽1log(𝑝𝑟𝑖𝑐𝑒_𝑖𝑛𝑐𝑒𝑛𝑡𝑠) + 𝑢 What do you expect 𝛽̂1 to be in this new regression model? Explain your reasoning.
As we discussed in class, including an irrelevant variable does not bias the other coefficients, i.e. 𝐸[𝛽̂1] = 𝛽1. However, it will tend to increase 𝑉𝑎𝑟[𝛽̂1].
Consider the model 𝑦 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + 𝑢 Suppose that x2 is an irrelevant variable, but you include it in your OLS regression. Explain in words how including x2 in your regression (as compared to not including it) is likely to affect 𝐸[𝛽̂1] and 𝑉𝑎𝑟[𝛽̂1].
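The variance claim can be seen from the standard formula for the sampling variance of an OLS slope under homoskedasticity. This is a sketch in textbook notation, which may differ from the class notes: SST_1 is the total sample variation in x1, and R_1^2 is the R-squared from regressing x1 on x2.

```latex
\text{with } x_2 \text{ included:}\quad
\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{SST_1\,(1 - R_1^2)} ,
\qquad
\text{with } x_2 \text{ excluded:}\quad
\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{SST_1} .
```

Since R_1^2 lies between 0 and 1, the first variance is at least as large as the second, and strictly larger whenever x1 and x2 are correlated in the sample. Because x2 is irrelevant, leaving it out causes no bias either, so including it only costs precision.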
a) The effect on y of a one unit increase in x depends on the initial value of x
Consider the multiple linear regression model 𝑦 = 𝛽0 + 𝛽1𝑥 + 𝛽2𝑥² + 𝑢 where the second explanatory variable is the square of the first explanatory variable. Which of the following statements are true? a) The effect on y of a one unit increase in x depends on the initial value of x b) The effect on y of a one unit increase in x is measured by 𝛽̂1 c) The effect on y of a one unit increase in x is measured by 𝛽1 d) None of the above
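A short derivation shows why the effect depends on the starting value of x. It holds u fixed and uses only the model as stated.

```latex
\frac{\partial y}{\partial x} = \beta_1 + 2\beta_2 x ,
\qquad
\Delta y = \bigl[\beta_1 (x+1) + \beta_2 (x+1)^2\bigr] - \bigl[\beta_1 x + \beta_2 x^2\bigr]
= \beta_1 + \beta_2\,(2x + 1) .
```

Both expressions contain x, so unless 𝛽2 = 0 neither 𝛽1 nor 𝛽̂1 alone measures the effect of a one unit increase in x.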
c) The effect on y of a one unit increase in x depends on the initial value of x
Consider the multiple linear regression model 𝑦 = 𝛽0 + 𝛽1𝑥 + 𝛽2𝑥² + 𝑢 where the second explanatory variable is the square of the first explanatory variable. Which of the following statements are true? a) The effect on y of a one unit increase in x is measured by 𝛽̂1 b) The effect on y of a one unit increase in x is measured by 𝛽1 c) The effect on y of a one unit increase in x depends on the initial value of x d) None of the above
It tells us that increasing advertising by $10 per week leads to 22.14 more unit sales per week
What does the estimated coefficient on addoll imply about the effect of increasing your advertising spending by $10 per week (remember that addoll is measured in units of hundreds of dollars)?
It tells us that increasing price by 1% leads to about 37.54 fewer unit sales per week.
What does the estimated coefficient on logprice tell us about the extent to which increasing price will affect sales?
The predicted difference (Mother A minus Mother B) is (40 − 50)·b_faminc + (10 − 0)·b_cigs = −0.93 − 4.63 = −5.56 ounces.
What is the predicted difference in birthweight (bwght) between babies born to Mother A and Mother B, where Mother A has family income of $40,000 and smokes 10 cigarettes per day and Mother B has family income of $50,000 and does not smoke? Show your work.
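The arithmetic can be laid out step by step. The regression output itself is not reproduced in this review, so the coefficient values in this sketch are inferred from the other answers: 0.0927 is the faminc slope quoted in the interpretation question, and the cigs slope of about −0.463 is implied by the −4.63 term in the answer. Family income is measured in thousands of dollars.

```latex
\widehat{bwght}_A - \widehat{bwght}_B
= (40 - 50)\,\hat\beta_{faminc} + (10 - 0)\,\hat\beta_{cigs}
\approx (-10)(0.0927) + (10)(-0.463)
= -0.927 - 4.63
\approx -5.56 \text{ ounces} .
```

The constant term cancels in the difference, so only the two slope estimates are needed.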
The sample variance of bwght is SST/(n-1) = 574611.72/1387 = 414.28.
What is the variance of bwght? (You can leave your answer as a fraction.) Note that SST and number of observations are shown in the regression output above - SST is 574611.72 and the number of observations is 1388
e) Assumption SLR.5
Which of the SLR assumptions do we not require for the OLS estimators to be unbiased a) Assumption SLR.1 b) Assumption SLR.2 c) Assumption SLR.3 d) Assumption SLR.4 e) Assumption SLR.5
a) Assumption SLR.5
Which of the SLR assumptions do we not require for the OLS estimators to be unbiased a) Assumption SLR.5 b) Assumption SLR.4 c) Assumption SLR.3 d) Assumption SLR.2 e) Assumption SLR.1
d) 𝑢̂𝑖 = 𝑦𝑖 − 𝛽̂0 − 𝛽̂1𝑥𝑖
Which of the following is true about the OLS residuals 𝑢̂ 𝑖? a) 𝑢̂ 𝑖 = 𝛽0 + 𝛽1𝑥𝑖 b) 𝑢̂𝑖 = 𝛽̂0 + 𝛽̂1𝑥𝑖 c) 𝑢̂ 𝑖 = 𝑦𝑖 − 𝛽0 − 𝛽1𝑥𝑖 d) 𝑢̂𝑖 = 𝑦𝑖 − 𝛽̂0 − 𝛽̂1𝑥𝑖
c) 𝑢̂ 𝑖 = 𝑦𝑖 − 𝛽̂0 − 𝛽̂1𝑥𝑖
Which of the following is true about the OLS residuals 𝑢̂ 𝑖? a) 𝑢̂ 𝑖 = 𝛽̂0 + 𝛽̂1𝑥𝑖 b) 𝑢̂ 𝑖 = 𝛽0 + 𝛽1𝑥𝑖 c) 𝑢̂ 𝑖 = 𝑦𝑖 − 𝛽̂0 − 𝛽̂1𝑥𝑖 d) 𝑢̂ 𝑖 = 𝑦𝑖 − 𝛽0 − 𝛽1𝑥𝑖
e) a and c
Which of the following is/are true about the OLS residuals 𝑢̂ 𝑖? a) They have zero correlation with all the independent variables 𝑥𝑖1, ... , 𝑥𝑖𝑘 b) They have zero correlation with the 𝑦𝑖 c) They have zero correlation with the fitted/predicted values 𝑦̂𝑖 d) a and b e) a and c f) b and c g) all of the above
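These residual properties can be checked on a toy example. The sketch below uses a single regressor and made-up data (the question itself concerns k regressors); the helper names `ols_slr` and `sample_cov` are hypothetical. It computes the sample covariance of the residuals with x, with the fitted values, and with y.

```python
def ols_slr(x, y):
    """Return (b0, b1) from a simple OLS regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def sample_cov(a, b):
    """Sample covariance between two equal-length lists."""
    n = len(a)
    abar = sum(a) / n
    bbar = sum(b) / n
    return sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b)) / (n - 1)


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 4.0, 8.0, 9.0]

b0, b1 = ols_slr(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

cov_resid_x = sample_cov(resid, x)            # zero by construction
cov_resid_fitted = sample_cov(resid, fitted)  # zero by construction
cov_resid_y = sample_cov(resid, y)            # positive unless the fit is perfect

print("cov(u_hat, x):", cov_resid_x)
print("cov(u_hat, y_hat):", cov_resid_fitted)
print("cov(u_hat, y):", cov_resid_y)
```

The residuals are correlated with y because y is the fitted value plus the residual, so the covariance between them equals the sample variance of the residuals.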
a. R-squared is not changed (0.0298) by a change in units of measurement. b. b_packs = 20·b_cigs = −9.27
You decide to replace cigs (number of cigarettes smoked) by packs (number of packs smoked), where packs = cigs/20. You run the regression of bwght on faminc and packs. a. (2 points) What is the R-squared value from this new regression? b. (3 points) What is the estimated slope parameter on packs from this new regression?
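Both parts follow from substituting cigs = 20 × packs into the fitted equation. This is a sketch in generic notation; the hatted coefficients are the estimates from the original regression of bwght on faminc and cigs.

```latex
\widehat{bwght}
= \hat\beta_0 + \hat\beta_{faminc}\,faminc + \hat\beta_{cigs}\,cigs
= \hat\beta_0 + \hat\beta_{faminc}\,faminc + \bigl(20\,\hat\beta_{cigs}\bigr)\,packs
\;\;\Longrightarrow\;\;
\hat\beta_{packs} = 20\,\hat\beta_{cigs} .
```

The fitted values and residuals are identical in the two regressions, so SSR, SST, and therefore R-squared are unchanged.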
A one-percent increase in family income is associated with an expected increase in birthweight of 0.0185 ounces, holding number of cigarettes smoked fixed.
You decide to replace faminc by log(faminc) (natural logarithm) in the regression. The resulting slope estimate on the log(faminc) variable is 1.85. Interpret this estimate. (One sentence.)
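The 0.0185 figure comes from the usual approximation for a level-log model, where a small change in log(faminc) is roughly the proportional change in faminc.

```latex
\Delta \widehat{bwght} = \hat\beta\,\Delta\log(faminc)
\approx \frac{\hat\beta}{100}\,\bigl(\%\Delta faminc\bigr)
= \frac{1.85}{100} \times 1 = 0.0185 \text{ ounces} .
```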
f) a and c
Which of the following is/are true about the OLS residuals 𝑢̂𝑖? a) They have zero correlation with the fitted/predicted values 𝑦̂𝑖 b) They have zero correlation with the 𝑦𝑖 c) They have zero correlation with all the independent variables 𝑥𝑖1, ... , 𝑥𝑖𝑘 d) a and b e) b and c f) a and c g) all of the above