econometrics!
heterosked dummy dependent variable
Dummy dependent variable always suffers from heteroskedasticity
What is the interpretation of a slope in regression? o Bivariate o Multivariate o When one or both variables are in logs? o When the dependent variable is a dummy variable?
The slope is the change, in percentage points, in the probability that y is in the "1" category.
o How to fix standard errors (panel data)? Which cross-section unit should we cluster on?
(", cluster(cross‐section unit)")
Panel data - how to fix efficiency? Why do we almost never estimate with random effects?
("random effects")
heterosked- definition and consequences; when does it occur?
Heteroskedasticity: the variance of the errors differs across values of the x's, violating MLR.5 (var[u|x] = σ^2). Consequences: OLS is still unbiased and consistent but inefficient (not BLUE), and the usual SEs are biased, so we can't use t, F, or LM tests. Occurs when: the y data are means (averages), or y is a dummy dependent variable.
missing data (violates MLR.2)
If data are missing for any variable, that observation can't be used. If data are missing at random, there is no bias. Data missing systematically violates MLR.4 and biases the estimates.
ME (measurement error): how does ME in the dependent variable affect slopes and standard errors? How does the answer differ when the dependent variable is continuous versus a dummy (or, more generally, bounded)?
If y is a dummy variable, cov(y*, e) ≠ 0: the measurement error is negatively correlated with the true value, so it is nonclassical measurement error. Other nonclassical measurement errors: years of education, over/underreporting. If E[e] ≠ 0, the intercept estimate may be biased; if cov(y*, e) ≠ 0, both the intercept AND the slopes may be biased (like OVB).
Measurement error (ME) in the dependent variable: how does it affect slopes? Standard errors?
ME in y --> okay; ME in x --> bad. ME in y (assuming E[e] = 0 and cov(y*, e) = 0): coefficients remain unbiased, but there is higher MSE --> more noise --> larger SEs, which affects t- and F-tests. So while the OLS estimates of α and β remain unbiased, they become less precise: their standard errors increase because the standard error of the regression (the RMSE) increases due to the "noise" in y.
specification issues: interactions of non-dummy variables
The interaction interpretation generalizes to non-dummies as well. Ex: lnwage = B0 + B1*yrsed + B2*potexp + B3*(potexp*yrsed) + u. B1 is the return to education at potexp = 0; B3 is the change in the return to education for each additional year of experience. Report nonlinear effects using the slope at the mean: plug the mean of potexp into the estimated coefficients and solve for the return to education (see the sketch below).
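A minimal Stata sketch of reporting the interaction effect at the mean; the variable names (lnwage, yrsed, potexp) and the experience value in the last line are hypothetical placeholders:
* fit the interacted model with factor-variable notation so margins knows about the interaction
regress lnwage c.yrsed##c.potexp
* return to education evaluated with potexp held at its sample mean
margins, dydx(yrsed) atmeans
* or by hand: plug a chosen experience level (here 10 years, illustrative) into B1 + B3*potexp
display _b[yrsed] + 10*_b[c.yrsed#c.potexp]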
specification issues: interpreting dummy variables, time-period dummies, and multiple categories
Interpret a dummy coefficient by comparing expected values between 0 and 1: it is an intercept shift, E[y|x, d=1] - E[y|x, d=0]. Time-period dummy: the coefficient is the difference (change) in the dependent variable between that period and the excluded period. Multiple categories: the coefficients on the dummies are the differences in average y between each included group and the excluded group, and the B0 estimate (the intercept) is the mean of y for the excluded group. With controls, B0 is the intercept for the excluded group and the coefficients on the dummies are the differences in y between the included and excluded groups holding the x's constant --> intercept shifts!
dummy dependent variables
Interpretation of Bj: the change in the probability that y = 1 as the x's change; units are percentage points. With a dummy dependent variable we have a linear probability model. Problems: it violates the homoskedasticity assumption, so SEs and inference are wrong, and predicted values can lie outside [0,1].
random effects
RE does not deal with bias; it is just more efficient than OLS (smaller SEs), assuming ai is uncorrelated with all the x's: cov(ai, xit) = 0.
Conditions for valid IV estimation:
(1) strong first stage [F-stat on the INSTRUMENTS alone > 10 --> t-stat > 3 for a single instrument]; (2) instrument independent of the model error.
The IV must have two properties. First stage: cov(x,z) ≠ 0 (instrument relevance) --> always check. Second assumption: cov(u,z) = 0 (instrument exogeneity) --> untestable. IV uses variation in x generated by the "randomly assigned" z to indirectly get at the effect of x on y; it gives you the effect on y per unit change in x.
How to check the first stage (testing cov(x,z) ≠ 0): with y = B0 + B1*x1 + B2*x2 + u and x1 = π0 + π1*z + π2*x2 + e, we must have π1 ≠ 0, so test H0: π1 = 0 against Ha: π1 ≠ 0 by regressing x on the z's and controls (see the sketch below).
There is NO WAY to test the second condition when you are just identified (# instruments = # endogenous variables). Resist all temptation to think that there is. There is only evidence which "makes a case" for it, such as z being uncorrelated with the controls, and sometimes "placebo regressions" (examining outcomes you know should be unaffected by the treatment). To test an instrument's validity, you would need some other valid way to identify the relationship between y and x, such as another instrument.
If R^2 = 0, there is no first stage (e.g. treatment and control groups took the same amount of the drug) and you can't learn anything; with no first stage the R^2 in the denominator is small and the standard errors/confidence intervals blow up.
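A minimal Stata sketch of checking the first stage, with hypothetical names (x1 endogenous, z1 z2 instruments, x2 a control):
* first-stage regression: the endogenous variable on the instruments and controls
regress x1 z1 z2 x2
* F-test on the instruments alone (not the controls); rule of thumb wants F > 10
test z1 z2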
What are the consequences of non‐normal errors? Under what conditions can they be addressed
(large sample --> central limit theorem).
Fixed effects: when is it equivalent to first differences? When T>2, what is it equivalent to? o What do they control for? What drops out of the regression when you control for them?
(Equivalent to first differences when T=2 time periods; when T>2, FE is equivalent to demeaning the data, i.e. including a dummy for each unit. They control for time-constant unit characteristics (ai); anything fixed over time drops out of the regression.)
Difference-in-differences
* interaction of a time-period dummy with another dummy
specification issues: interpreting non-linearities
* non-linear effects are reported using the slope at the mean
Why is a strong first stage important: (a) for bias reasons (b) for precision reasons
The first stage is the effect of the instrument on x. We need a strong first stage because a weak first stage magnifies bias in IV and leads to large SEs on the IV estimates. IV sacrifices precision to reduce bias and has larger SEs than OLS, so we want to use OLS, but can only do so if cov(x,u) = 0 holds. Since R^2 < 1, IV standard errors are always larger than OLS; however IV is consistent while OLS is inconsistent when cov(x,u) ≠ 0 and cov(z,u) = 0. The more instruments you are using, the larger the first-stage F-stat on the instruments should be: you need a higher standard of evidence.
panel data in Stata
First tell Stata that it's a panel: xtset xsecvar timevar. Random effects: xtreg y x1 x2 ... xk, re. Fixed effects: xtreg y x1 x2 ... xk, fe. FE assumes cov(uit, x's) = 0; RE additionally assumes cov(ai, x's) = 0. The ai assumption is testable (Hausman test): if it holds, the FE and RE estimates should be similar.
Asymptotics - What are the consequences of non‐normal errors? Under what conditions can they be addressed
(large sample --> central limit theorem).
testing for endogeneity
1) Estimate the reduced form (first stage) for x1: regress x1 on all the IVs and controls, and get the residuals. 2) Regress y on x1, the controls, and those residuals. 3) If the coefficient on the residuals is statistically significantly different from zero, then x1 is endogenous. (See the sketch below.)
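A minimal Stata sketch of this endogeneity (control-function) test, with hypothetical names x1 (suspect variable), z (instrument), and w (control):
* step 1: reduced form / first stage for x1, keep the residuals
regress x1 z w
predict vhat, resid
* step 2: add the residuals to the structural regression
regress y x1 w vhat
* step 3: if vhat is significant, x1 is endogenous
test vhat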
Why would a coefficient get larger and more significant with FE? Can taking differences make OVB worse in some cases?
1) Maybe the cross-section regression had OVB. 2) Taking differences can make OVB worse in some cases: attenuation bias is usually worse in differenced estimation. Adding controls makes the attenuation factor smaller, and differencing makes the bias larger (it removes signal only, not noise).
what is R^2
R^2 = 1 - SSR/SST = SSE/SST. SST = total variation in y; SSE = variation in y explained by the x's; SSR = variation in the residuals; SST = SSE + SSR.
controlling for fixed effects
1) Estimate ai by including a dummy for each individual (here, 46 units in 2 time periods, so 45 dummies). We're more interested in the other coefficients, so we can instead difference out the dummies: period 2 - period 1 = the first difference. By assumption there is no correlation between the x's and the error --> no bias/inconsistency. 2) In Stata: create dummies for each individual and include them in the regression, tab city, gen(citydummy), or use the areg command: areg y x yrdummies, absorb(city). (See the sketch below.)
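A minimal Stata sketch showing three equivalent ways to absorb the city fixed effects (y, x, city, year are hypothetical names); the slope on x is the same in each:
* (1) a dummy for each city
regress y x i.city
* (2) absorb the city dummies without reporting them
areg y x, absorb(city)
* (3) the within (fixed-effects) estimator
xtset city year
xtreg y x, fe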
What are the different kinds of data? 5
1. Cross-sectional data: collected by sampling a population at a given point in time; ignore any minor timing differences in collecting the data; the ordering of the data does not matter; assume random sampling.
2. Time-series data: data collected over time on one or more variables, e.g. stock prices or homicide rates; timing is important and observations have a chronological ordering; more difficult to analyze because economic observations are rarely independent across time; the frequency at which the data are collected is important.
3. Pooled cross sections: independent cross sections, usually collected at different points in time, combined into a single data set; a mix of cross-sectional and time-series data; good for looking at policy changes; keeping track of the year of each observation is very important; analyzed much like standard cross-sectional data.
4. Panel (or longitudinal) data: a data set constructed from repeated cross sections over time; with a balanced panel the same units appear in each period, with an unbalanced panel some units do not appear in every period (often due to attrition); the ordering within the cross section does not matter; allows us to study lags in behavior and policy and to control for certain unobserved characteristics, but is more difficult to obtain.
5. Experimental data.
MLR1-5
1. The population relationship is linear in parameters. 2. Random sample. 3. No x is constant and there is no perfect linear relationship among the x's (no perfect multicollinearity). 4. Zero conditional mean: the expected value of the error does not depend on anything that affects the outcome; an unbiased proxy for the counterfactual. 5. Homoskedasticity: the variance of the errors is unchanged for any value of the x's. (MLR.6: normal distribution of the errors.) MLR.1-5 are needed for BLUE.
Hypothesis testing: o How do you do a t‐test? What does it test? o How do you calculate a confidence interval? o How do you do an F‐test? What does it test?
1. Write down the null, H0: B1 = 0. 2. Write down the alternative, Ha: B1 ≠ 0. 3. Pick alpha and get tc (the critical value): t table with small n, normal table with large n; alpha = Pr(reject null | null is true). 4. Form the test statistic t = (B1hat - B1,0)/SE(B1hat). 5. Decision rule: if |t| > tc, reject H0; if |t| < tc, do not reject (not strong evidence). Confidence interval: estimate +/- tc*SE(estimate), where tc ≈ 1.96 for 95% two-tailed. Two-tailed tests are more conservative: harder to reject the null, you need stronger evidence than in a one-tailed test. (See the sketch below.)
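A minimal Stata sketch reproducing the t-stat and 95% confidence interval for one slope by hand (variable names hypothetical):
regress y x1 x2
* t-stat for H0: the coefficient on x1 equals zero
display _b[x1]/_se[x1]
* 95% confidence interval: estimate +/- critical value * SE
display _b[x1] - invttail(e(df_r), 0.025)*_se[x1]
display _b[x1] + invttail(e(df_r), 0.025)*_se[x1]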
Different forms of IV o IV o Indirect least squares (when can you do it). Be ready to compute IV estimates this way o Two‐stage least squares
Three equivalent ways (four including the control function); all give exactly the same slope estimate as IV.
IV: cov(y,z)/cov(x,z).
ILS (indirect least squares): the ratio of two OLS slope coefficients, reduced form / first stage = (OLS reg of y on z)/(OLS reg of x on z and the other controls); also known as the Wald estimator (ratio of differences) when z is a dummy. IV = ILS when there is only one instrument. When the instrument z is continuous (not a dummy for treatment), you can't use the ratio-of-differences form.
2SLS (two-stage least squares): more efficient than IV but with an increase in bias; basically a more general way of thinking about IV, using the variation in x induced by z to determine the impact on y. Equation: cov(y, xhat)/var(xhat), where xhat is predicted from the first stage with z. 1) Regress x on z and predict xhat (the first-stage regression); this rescales z into units of x. 2) Regress y on the predicted xhat. Doing the two stages by hand gives the right slope but the wrong standard errors. (See the sketch below.)
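A minimal Stata sketch with one instrument z and one control w (hypothetical names); all three routes give the same slope on x, but only ivreg reports correct SEs:
* (1) IV / 2SLS in one step
ivreg y (x = z) w
* (2) indirect least squares: reduced form and first stage, take the ratio of the slopes on z
regress y z w
regress x z w
* (3) manual 2SLS: right slope, wrong standard errors
regress x z w
predict xhat
regress y xhat w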
o When are IV estimates = the effect of the treatment on the treated
When there are no always-takers (e.g. no one outside the assigned treatment group can get the treatment), so the compliers are the entire treated group and the LATE equals the effect of the treatment on the treated.
IV vs. OLS slope estimates
B_IV = cov(y,z)/cov(x,z); B_OLS = cov(y,x)/cov(x,x) = cov(y,x)/var(x)
zero conditional mean
E[u|x] = 0. 1. E[u|x] is constant (no relationship between u and x); 2. E[u] = 0. Together these imply cov(u,x) = 0 (no linear relationship).
What problems does IV deal with? How do IV standard errors compare to OLS?
IV methods deal with OVB, ME in x, and simultaneity. IV standard errors are larger than OLS: IV sacrifices precision to reduce bias. Good for OVB, e.g. people who take the drug might differ in ways that affect the outcome (contamination of an RCT), so use an instrumental variable; ideally z is randomly assigned. IV is used when you have a natural experiment in z which also affects x and you are really interested in the effect of x on y; the IV estimate gives you the effect on y per unit change in x. IV estimation can also be used to eliminate attenuation due to measurement error: a second report of x is used as an instrument for the first report; if the two reports' errors are independent (uncorrelated), the second report is by definition uncorrelated with the source of attenuation bias. Classical ME attenuates the first stage and the reduced form by the same factor lambda, so IV divides it out; it only gets rid of attenuation bias if the instrument is uncorrelated with the measurement error. When IV estimates are larger in magnitude than OLS, one reason can be that IV is getting rid of attenuation bias.
IV and heterogeneous treatment effects: LATE interpretation of IV o What is the monotonicity assumption that is required for this?
LATE: under the monotonicity assumption and heterogeneous treatment effects, the IV estimate is a LATE (local average treatment effect). Intuition: IV is estimated only off of those whose x value is moved by the instrument, the people on the margin (absent heterogeneous treatment effects, LATE = ATE = a constant). In many policy contexts this is just fine; these are the people we are interested in. Monotonicity assumption (needed on top of validity for the LATE to be meaningful): increases in z move everyone's x (weakly) in the same direction. It is untestable; it requires knowledge of the counterfactual (x at different values of z). Same as saying there are no defiers. Some people may not be affected at all and that's okay; there are compliers, never-takers, and always-takers.
summary of panel data info
OLS SEs: with heteroskedasticity, use robust SEs to get consistent SEs; with autocorrelation, cluster to get consistent SEs. OLS is inefficient: you could get smaller SEs from the same data. With heteroskedasticity, use WLS (but you need to know the form of the heteroskedasticity); with autocorrelation, use RE, but that requires the implausible assumption that there is no OVB from fixed unobserved characteristics.
What do SLR.1-4 imply? What does adding SLR.5 give?
Under SLR.1-4 the OLS estimator is unbiased and consistent!! Adding SLR.5 (homoskedasticity), the OLS estimator is BLUE (efficient).
SLR.1-5
SLR.1: the population relationship is linear in parameters. SLR.2: random sample of (x, y) from the population. SLR.3: there is variation in x (no perfect collinearity). SLR.4: E[u|x] = 0 (zero conditional mean). SLR.5: var[u|x] = σ^2 (homoskedasticity).
regression discontinuity: what assumptions are necessary
What assumptions are necessary? o All variables except treatment status are smooth in the running variable in a neighborhood of the threshold. o No sorting at the threshold: agents cannot fine tune the level of their running variable in a neighborhood of the threshold
what is a fuzzy RD?
When being to the right of the threshold does not perfectly predict the treatment status; instead, it is a "first stage" in an IV regression.
For what purpose, if any, are OLS residuals a good proxy for the errors?
(a) Estimating the variance of the errors and (c) estimating standard errors. The residuals cannot be used to evaluate the correlation with the right-hand-side variables because it is assumed (and in fact imposed) that that correlation is zero in order to fit the line. The variance of the residuals, also known as the mean squared error, is used to estimate the variance of the errors in the standard error formula.
adding more x's - three effects
1) always increases R^2; 2) may reduce OVB; 3) does not tell us anything about causality!
o Who are "compliers"? Who are "always‐takers" and "never‐takers"? Who are "defiers"
Compliers: people whose treatment status is moved by the instrument. Always-takers: people who take the treatment regardless of the instrument. Never-takers: people who never take the treatment regardless of the instrument. Defiers: people who do the opposite of what the instrument pushes them toward (ruled out by monotonicity).
What does "consistent" mean?
Asymptotics asks how bias, efficiency, and the sampling distribution change as the sample size increases: the large-sample properties of estimators and test statistics. An estimator is consistent if, as n --> infinity, it converges to the true parameter (its SE --> 0 and any bias disappears). OLS is unbiased if E[u|x] = 0 and consistent if E[u] = 0 and cov(x,u) = 0. Inconsistency does not go away with a larger sample; bias may go away with a large sample if the estimator is consistent. Efficiency: OLS is asymptotically efficient under MLR.1-5. Inference: the central limit theorem says the OLS estimators are asymptotically normally distributed, so use the LM stat. With rare exceptions, unbiased estimators are also consistent; some estimators are biased but consistent. The opposite of a consistent estimator is one with asymptotic bias, bias that remains as the sample gets arbitrarily large. In the contexts we study, where the OLS estimator is biased it is also inconsistent. (Non-normal errors: large sample --> central limit theorem.)
if you have multiple instruments you can test whether iv suffers from omitted variables bias under the assumption that
at least one instrument is valid
interpretations of IV > OLS
Possible reasons: 1) attenuation bias caused by ME makes OLS smaller than the truth; 2) the first stage is weak and the reduced form is partly driven by third factors besides the instrument (the validity condition fails); 3) IV estimates a local average treatment effect (LATE), the effect for those whose decision (e.g. education) is influenced by the instrument, only those at the margin. The LATE interpretation rests on the untestable monotonicity assumption that nobody is pushed in the wrong direction; it is a causal effect only for the subset of people who can be induced to change behavior by the instrument.
OLS
The average residual is zero; the standard error of the regression is the sample standard deviation of the residuals; OLS is the line of best fit; MSE = (standard error of the regression)^2.
ME in independent variables: how does it affect slopes? Standard errors?
Bad! The estimators are biased: the result is attenuation bias (biases the OLS estimators toward zero) and it violates MLR.4. As the noise/(signal + noise) ratio gets larger, the slope estimate goes toward zero; the attenuation factor is always less than 1, leading to estimates that are always too small in magnitude; the larger the attenuation factor, the smaller the downward bias. OLS is biased and inconsistent, and things get even worse in the multivariate case; the standard errors are biased too. Fix: divide by the attenuation factor if we have it, OR take the slope from a regression of the first report on the second report to estimate the attenuation factor.
What is the fundamental problem of causal inference?
can't observe the counterfactual
what is the OLS approach
Choose estimates of the intercept and slope so that the actual values of y are on average as close as possible to the estimated line. The error belongs to the population equation (unobservable); the residual belongs to the estimated equation.
Hausman test
Compares RE and FE: 1) estimate each and use estimates store name; 2) hausman consist_fe effic_re; 3) if you reject, use FE. Also used to test IV against OLS: 1) regress x1 on the z's and controls and get the residuals; 2) regress y on x1, the controls, and the residuals; 3) if the coefficient on the residuals is significant, x1 is correlated with u, so OLS is biased; use IV (provided the IVs are valid).
Advance IV: Control function (what is it? What do the coefficients represent?)
The control function is an approach for testing IV against OLS (the most flexible method): it controls for the bad/endogenous part of the variation in x; if this residual has a significant relationship with y, OLS is biased. Regress x on the other x's and the z's; get the residuals; regress y on x, the other x's, and the residuals; if the residual is statistically significant, x is correlated with u! IV can test whether OLS suffers from OVB, but only under the assumption that IV does not. Control function: (1) estimate an OLS regression of x on z and the controls and get the residuals (the first stage); (2) regress y on x, the same controls, and the first-stage residuals. This controls for the bad or endogenous part of the variation in x, the part correlated with the error; if this residual has a significant relationship with y, it suggests OLS is biased. z and the controls are called exogenous variables; x and y are endogenous. Results: the estimates will be the same as IV, and the estimate of B2 (the coefficient on the residual) can be used to test the difference between OLS and IV: B2 = B_OLS - B_IV. Example: run OLS controlling for the first-stage residual (predict eta_hat, resid); the eta_hat coefficient is the difference between the OLS and IV educ coefficients; the slope estimates are the same as IV, but the SEs are wrong.
ME: When other controls are in the regression (attenuation bias gets worse) o When fixed effects are added o When estimating in differences (handout!)
Controls make measurement-error attenuation worse, and the estimates of all coefficients will be biased (with multiple regression, matters are more complicated, as measurement error in one variable can bias the estimates of the effects of other independent variables in unpredictable directions). When FEs are added or when estimating in differences, attenuation also gets worse, because demeaning/differencing removes signal but not noise.
Panel data- what is it
Panel data has cross-sectional and time-series components and can be used to address some kinds of OVB: if the omitted variable is fixed over time, a fixed-effects approach removes the bias. FEs amount to controlling for a host of dummy variables or differencing out the mean levels of the variables.
alternative form of the White test
Detects nonlinear forms of heteroskedasticity: 1) estimate the model and get the residuals; 2) regress the squared residuals on yhat and yhat^2; use the R^2 for an F or LM test.
o Interpreting all of the coefficients in a "difference‐in‐differences" regression.
Difference-in-differences: the interaction of a time dummy with another dummy. First define the two dummy variables. The coefficient on the uninteracted variable (the main effect) is the change for the excluded group, e.g. the change from '79 to '81 in the other cities. The coefficient on the interacted variable represents the difference in changes between the included and excluded groups, e.g. how much larger the change in employment was in Miami than elsewhere --> the diff-in-diff. Example: (Miami '81 - Miami '79) = change in Miami; (other '81 - other '79) = change elsewhere; diff-in-diff = change in Miami - change elsewhere. (See the sketch below.)
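A minimal Stata sketch of the diff-in-diff regression; emp, miami, and post are hypothetical names for the outcome, the group dummy, and the post-period dummy:
* interaction of the group dummy with the time dummy
gen miami_post = miami*post
regress emp miami post miami_post
* the coefficient on miami_post is the difference-in-differences estimate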
differencing data and running OLS reg is the same as...
Differencing the data and running an OLS regression is the same as estimating the fixed-effects version, i.e. controlling for ai with dummies (exactly so when T=2).
the standard errors on an IV estimate
(e) IV standard errors are always larger than OLS since R^2 < 1: the standard error in the IV case differs from OLS only by the R^2 (from the first-stage regression of x on z) in the denominator. However, OLS is inconsistent and IV is consistent: IV sacrifices precision to reduce bias. This is the motive for the control function or Hausman test: if OLS is not significantly different from IV, use OLS.
endogenous/exogenous variables
Endogenous variable: x, correlated with u; its coefficient suffers from OVB or ME. Exogenous variables: z, not correlated with u; the instruments and controls. ** Need at least as many instruments as endogenous variables.
FE vs. FD
Equivalent when there are only 2 periods; with more than 2, FE demeans the data. FE uses variation within individuals over time (the within estimator, based on demeaned equations); FD uses period-to-period changes. FE is more common: easier to do, easy with an unbalanced panel, and more efficient than differencing when there is no autocorrelation. FD removes more of the individual variation over time than FE.
unobserved fixed effect-
The composite error is vit = ai + eit, where eit is the idiosyncratic error and ai is the time-constant component of the composite error (fixed over time, no t subscript). If ai is correlated with any x, OLS will be biased and inconsistent: violates MLR.4.
contaminated trial example
Estimate B1 = (mean y in treatment - mean y in control) / (mean x in treatment - mean x in control). This is the instrumental-variables (Wald) estimate of the effect of x on y: the instrument z is a variable correlated with x but uncorrelated with u, here the dummy for being in the treatment group.
testing for HTSC in IV (adjustment to the BP test)
1) Estimate the IV model and get the residuals; 2) regress the squared residuals on all the exogenous variables (the instruments and controls, NOT the x's you are instrumenting for); 3) test for joint significance. (Also: IV estimation can be used to eliminate attenuation bias; a second report of x can be used as an instrument for the first report of x.)
hausman test and FE/RE
Fail to reject: use RE. Reject: use FE. H0: cov(xit, ai) = 0, so the RE estimator is efficient and consistent; Ha: cov(xit, ai) ≠ 0, so only the FE estimator is consistent. Estimate with RE and then with FE; if the coefficient estimates are significantly different, reject the null. Failure to reject the null means no detectable bias in random effects. You must assume there is no OVB in the FE estimator (the estimator maintained under Ha). (See the sketch below.)
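A minimal Stata sketch of the FE/RE Hausman test (id, year, y, x1, x2 are hypothetical names):
xtset id year
* FE: consistent under both hypotheses
xtreg y x1 x2, fe
estimates store fe_est
* RE: efficient, but consistent only under the null
xtreg y x1 x2, re
estimates store re_est
* hausman consistent efficient; rejecting favors FE
hausman fe_est re_est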
fixes of autocorrelation
Fix 1 (standard errors): "cluster" provides consistent SEs under an arbitrary form of error correlation over time; SEs get larger (less information --> less precise; need large n), and the coefficients stay the same. Fix 2 (efficiency): random effects; assumes ai is uncorrelated with the x's; does not remove OVB, just more efficient than OLS (reduces SEs).
heterosked- how to fix standard errors; how to fix efficiency, and when not to
fix: (", robust") robust standard errors - makes it consistent but not efficient - biased in small samples but consistent, less biased as n--> inf - still estimating by OLS just fixing SE's *almost always good in decent size sample what is robust: estimate the whole variance expression- but substitute the squared residuals for the real variance(std. dev^2), in this case averaged over a large number of observations these are equal- law of large numbers efficiency fix: WLS--> makes it efficient ("WLS"), what is WLS= weighted least squares- will downright observations that tend to be noisier (bigger error variance) and upweight less noisy observations to produce a better estimate of line than OLS general: divide whole equation by sqrt(n) for homoskedastic error - just use for efficiency (to reduce SE's) -OLS still unbiased and consistent if estimates are very different w/ WLS, prob other assm violated problematic: can't manipulate it to reflect what you want!
The Frisch‐Waugh theorem: How do you do residual regression to get the same slope as a multivariate regression?
The following two steps deliver the same slope as the multivariate regression: 1) regress x1 on all the other x's and get the residuals; 2) regress y on those residuals!
partialling out-
The following two steps deliver the same slope as the multiple regression: 1) regress x1 on x2, ..., xk to get the residuals ri; 2) regress y on the residuals (bivariate). If there were a perfect linear relationship between x1 and the other x's, there would be no variation left to relate to y: var(ri) = 0. (See the sketch below.)
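A minimal Stata sketch of partialling out (hypothetical names); the slope on r1 matches the slope on x1 from the full regression:
* step 1: residualize x1 with respect to the other regressors
regress x1 x2 x3
predict r1, resid
* step 2: bivariate regression of y on the residuals
regress y r1
* compare with the multivariate slope on x1
regress y x1 x2 x3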
advanced IV: Hausman test: what does it test? How do you compute it by hand?
It is a test for endogeneity! Hausman tests compare estimates from a potentially biased estimator that is efficient under the null hypothesis (OLS) against a less efficient estimator that we assume produces unbiased estimates (IV). Idea: we'd like to use OLS if we can, but it is only valid if cov(x,u) = 0, which is our null. The test compares the OLS and IV estimates: if they are statistically different, it suggests OLS was biased and cov(x,u) ≠ 0. If the OLS estimates are bigger than the IV estimates, there is OVB in OLS (prefer IV if corr(z,u)/corr(z,x) < corr(x,u)). H0: cov(x,u) = 0 --> use OLS. Ha: cov(x,u) ≠ 0 --> use IV. By hand: the variance of the difference (in this case) is the difference of the variances, so se(B_IV - B_OLS) = sqrt(se(B_IV)^2 - se(B_OLS)^2), and the t-stat is (B_IV - B_OLS)/se(B_IV - B_OLS). In Stata: reg lwage controls; estimates store ols_est; ivreg lwage (educ = nearc4) controls; estimates store iv_est; hausman iv_est ols_est. iv_est holds the estimates that are consistent by assumption; ols_est holds the estimates that are efficient under the null.
IV estimation with multiple instruments
More than one instrument for x is also okay: just add them to the first stage; 2SLS still applies. IV estimation with multiple endogenous variables: more than one endogenous variable is also okay, but we need at least one instrument for each endogenous variable. We then have more than one first stage, and each endogenous x should have a significant relationship with the instruments collectively, conditional on the controls: an F-test on just the instruments (not the controls).
sample selection: Which is worse: selection on Y, or selection on X?
Non-random samples are a version of missing data. Selection on x: with a constant treatment effect --> no bias; with heterogeneous treatment effects --> a local ATE rather than the overall ATE; it does not usually lead to bias. Selection on y: very, very bad --> always leads to bias. Selection on the errors: also bad!
SLR.6
normality: the errors are normally distributed, so the estimators are jointly normally distributed in any sample size
Determining the sign or magnitude of omitted variable bias
Omitted variable bias: omitting a variable only leads to OVB if it is correlated with the included x AND with y. The bias is zero if B2 = 0 or if x1 and x2 are uncorrelated: bias = B2 * delta1, where delta1 is the slope from regressing the omitted x2 on the included x1 (see the equation on sheet 3).
what is zero conditional mean- 3 ways to write
Other factors which affect y are on average unrelated to x; an untestable assumption. E[u|x] = 0; E[u] = 0; cov(u,x) = 0.
p-value
The probability of observing a t-stat as large as the one we got if the null were true. Small --> the null is highly unlikely to be true. Large --> if the null were true, it would be easy to find a t-stat like ours.
What is the "reduced form" (also known in some contexts as the "intention to treat" estimates)? o What can it be used to test?
The reduced form is the OLS regression of y on z, also called the intention-to-treat estimate. It tests whether there is an effect: it tells us whether the IV estimate is significantly different from zero. If the instrument is valid, the reduced form is supposed to be driven only by the instrument's relationship to the treatment (this does not always hold). There can only be a nonzero IV estimate if there is a nonzero reduced form, so the reduced form is a sufficient statistic for whether there is any nonzero IV estimate. Having a first stage is critical for valid IV estimation. If you have a valid instrumental variable, testing the reduced-form relationship with the instrument is sufficient to answer "is there a causal effect of x on y," even when the first stage is weak but not zero.
"How to test for heteroskedasticity": just the idea -
Regress the squared residuals on "stuff"; no details of specific tests are important. 1) Estimate the model and get the residuals; 2) regress the squared residuals on all the x's and test whether they are jointly significant; if you reject the null, there is heteroskedasticity and we need to correct for it. Breusch-Pagan (BP) test: detects linear forms of heteroskedasticity; uses an auxiliary regression of the squared residuals on all the x's and the R^2 for an F or LM test. White test: allows for nonlinearities; same as BP but adding squares and interactions. Alternative: regress the squared residuals on yhat and yhat^2 and use the R^2 for an F/LM test. (See the sketch below.)
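A minimal Stata sketch of the idea, plus the built-in tests for reference (hypothetical names):
regress y x1 x2
predict uhat, resid
gen uhat2 = uhat^2
* auxiliary regression of the squared residuals on the x's; use its R^2 for an F or LM test
regress uhat2 x1 x2
* built-in versions: Breusch-Pagan and White
estat hettest
estat imtest, white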
LM stat
Used for joint hypothesis testing with a large sample; you don't need normality. df = q. Use it when MLR.1-5 hold but not MLR.6 (normality), as long as we have a large sample. 1) Estimate the restricted model and get the residuals; 2) regress those residuals on all the x's and get the R^2; 3) LM = n*R^2; use the chi-squared(q) distribution for the critical value. (See the sketch below.)
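A minimal Stata sketch of the LM test for H0: the coefficients on x3 and x4 are zero (q = 2; names hypothetical):
* step 1: restricted model, keep the residuals
regress y x1 x2
predict uhat_r, resid
* step 2: regress the residuals on ALL the x's
regress uhat_r x1 x2 x3 x4
* step 3: LM = n*R^2; compare with a chi-squared(2) critical value, or get the p-value directly
display e(N)*e(r2)
display chi2tail(2, e(N)*e(r2))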
what in the end is the value of the control function?
To see and test the difference between OLS and IV. There are also useful generalizations of the control-function approach, e.g. interacting the residuals with other controls, which lets you control more flexibly for the bad/endogenous variation; useful if the outcome is specified as a nonlinear function.
heterosked- Consequences for slopes Consequences for standard errors Consequences for efficiency
Slopes: the OLS estimates remain unbiased; we only need MLR.1-4 to show that the OLS estimator is unbiased, so a violation of the homoskedasticity assumption has no effect on this property. Efficiency: inefficient; the variance of the OLS estimator is inflated. Standard errors: the standard formulas for the SEs of the OLS estimates are wrong; they don't take account of the extra sampling variation.
The forces which affect the size of standard errors on regression slopes
The standard error is the estimate of the standard deviation of the estimator, on the order of s/sqrt(n). The larger the variance of the errors (larger sqrt(MSE) --> more noise), the larger the variance of the estimator and the SEs. A larger sample size --> smaller SEs. More variation in the x's --> smaller SEs (more efficient). But unnecessary x's raise the R^2 from regressing xj on the other x's and so raise the SEs: a bias/efficiency trade-off (since adding x's may reduce OVB).
IV with multiple regression
Stata: IV + controls: ivreg y (x = z) controls. Multiple IVs: ivreg y (x = z1 z2) controls. Always use ivreg (not broken up into two stages by hand) to avoid wrong SEs. Adding controls is straightforward: like Frisch-Waugh, all the same formulas apply to the residuals of x and z after regressing them on the other controls; the first stage and the reduced form both include the controls. First stage: regress x on z and the controls; you need the experiment to induce variation in x conditional on the controls. In 2SLS, include the controls when forming the predicted values!
data problems: variables that are unobservable (violates MLR.4)
Substitute observable proxies (e.g. IQ scores for ability). If used as the treatment variable: noisy reports, with consequences like ME. If used as controls: they do not fully capture the desired variation in the variable we would like to measure.
testing exclusion restrictions: how to (F-test)
Testing exclusion restrictions: compare R^2's. 1) Estimate the restricted model (without the variables being tested) and get Rr^2; 2) estimate the unrestricted model with all the x's and get Rur^2; 3) F = [(Rur^2 - Rr^2)/q] / [(1 - Rur^2)/(n - k - 1)], where q = the number of x's you're testing (the number of restrictions), k = the total number of x's in the unrestricted regression, and n = the number of observations. Reject H0 at the chosen significance level if F > c. If the null is that all the slopes are zero, then F = (R^2/k) / ((1 - R^2)/(n - k - 1)). (See the sketch below.)
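A minimal Stata sketch of testing the exclusion of x3 and x4 (q = 2; names hypothetical):
* unrestricted model with all the x's
regress y x1 x2 x3 x4
* joint F-test of the q = 2 exclusion restrictions
test x3 x4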
advanced Iv What is an "over‐id" test? When can you do it? What is the "Q" (degrees of freedom) in an overid test?
Testing overidentifying restrictions: having an instrument lets us test the OLS assumption cov(x,u) = 0, but how can we test the IV assumption that cov(z1,u) = 0? If there is just one instrument we cannot: just as with OLS, if we estimate by IV with z1, then u and z1 will be uncorrelated by construction. If we had a second instrument z2, we could estimate IV using z2 and then see whether the residual from those IV estimates is correlated with z1. Like the Hausman test, it is done under the assumption that at least one instrument is valid. The overid test (tests the IV assumption cov(z,u) = 0): 1) estimate the model by IV using all of the instruments and obtain the residuals: ivreg y (x1 = z1 z2) controls; predict uhat_iv, resid. 2) Regress the residuals on all the exogenous variables (IVs and controls): reg uhat_iv z1 z2 controls. 3) Construct the LM stat nR^2; e.g. with one extra instrument, compare to the chi-squared(1) distribution. H0: all instruments are uncorrelated with the error, cov(z1,u) = cov(z2,u) = cov(z3,u) = 0. Q is the number of EXTRA instruments; under the null that all of the instruments are uncorrelated with the error, LM ~ chi-squared(q), where q is the number of extra instruments. If we reject, some IVs are not exogenous (they are correlated with u); one of the instruments is invalid, under the assumption that at least one of the instruments is valid (and so produces a consistent answer). If we fail to reject, that does not say our instruments are valid! If the coefficient gets larger when an instrument is added, that is not a good sign.
specification issues: interactions with dummies
The coefficient on the uninteracted variable (the main effect) represents the slope for the excluded group, i.e. the slope when d = 0. The coefficients on the interacted variables represent the differences in slopes between the included and excluded groups, i.e. the difference in slope between d = 1 and d = 0: y rises __ more per unit of xi in group A than in group D, where y rises __ per unit of xi.
what is the fundamental problem of causal inference:
the counterfactual outcome is unobservable for every individual
chow test
To determine whether separate models are needed for different groups. Ex: H0: the group dummy and interaction coefficients are all zero (e.g. B1 = B3 = 0); Ha: at least one is nonzero. 1) Estimate the fully interacted model and get Rur^2; 2) estimate the pooled model and get Rr^2; 3) compute the F-stat; if you reject the null, the model should be split. (See the sketch below.)
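A minimal Stata sketch of a Chow test via the fully interacted model (d is the group dummy; all names hypothetical):
* build the interactions and estimate the fully interacted (unrestricted) model
gen d_x1 = d*x1
gen d_x2 = d*x2
regress y x1 x2 d d_x1 d_x2
* jointly test the group dummy and the interactions; rejecting means split the model
test d d_x1 d_x2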
the hausman test failed to reject the null. this means that
Under the assumption that IV is consistent, the OLS estimates are statistically indistinguishable from the IV estimates; it does not mean that OLS is consistent.
Heterogeneous treatment effects Idea of "average" treatment effect. When does OLS estimate it? And selection on X: "local" average treatment effect
violates MLR.1-- fix: respecify your model
autocorrelation
With autocorrelation the OLS standard errors are biased and inconsistent: cluster --> consistency; RE --> efficiency and consistency; either way the usual SEs are biased. Autocorrelation: the errors are correlated across periods (we continue to assume no error correlation between different individuals). Consequences: OLS calculates biased SEs, so you can't do inference, and OLS is inefficient (not BLUE); violates MLR.5.
variance of OLS
We don't know the population error variance because we don't observe the errors, only the residuals. Since we do observe the residuals, we can estimate the error variance: MSE = s^2 = (1/(n-k-1)) * sum(residuals^2) = SSR/(n-k-1).
Median/quantile regression: o When do we use it? How should we interpret the slope?
We use it when we have outliers (which threaten MLR.3). Two fixes for outliers (which can lead to bias): 1) trim the data; 2) least absolute deviations (LAD) regression, which is more like a median while OLS is a mean. In Stata use qreg; the residuals are median zero. LAD is biased in small samples but consistent, and it estimates a different parameter than OLS!
interpreting coefficients
y on x (level-level) --> change in y = B1 * change in x
y on log(x) (level-log) --> change in y = (B1/100) per 1% change in x
log(y) on x (log-level) --> % change in y = (B1*100) * change in x
log(y) on log(x) (log-log) --> % change in y = B1 * % change in x (elasticity)
why you need to control for year effects in panel regressions!
Year dummies capture the influence of aggregate time-series trends. Time-series regressions or panel regressions that fail to control for year effects pick up the influence of aggregate trends that have nothing to do with the causal relationship: you have OVB if you fail to control for year effects!