Endogeneity
What are the 4 reasons endogeneity may arise?
1. Omitted variable bias: the variation in both x and y may be due to another variable that is unobserved. 2. Measurement error: the data may provide only an inaccurate measure of x. 3. Simultaneity bias: variations in x may themselves be triggered by variations in y, so causality runs both ways. 4. Selection bias: y or x may be observed only for a selected sample of individuals, depending on unobserved characteristics.
Correlation
A measure of the strength and direction of the linear relationship between two variables
In order for β̂1 to be a consistent estimator of β1 when there is measurement error in the dependent variable, what must be true?
E(xi(ui + ei)) = E(xiui) + E(xiei) = 0
endogenous
If an explanatory variable is correlated with the error term (i.e., the exogeneity assumption does not hold) ⇒ the OLS estimator is: 1) biased: E(β̂1) ≠ β1; 2) inconsistent: the bias persists in large samples
exogenous
If an explanatory variable is not correlated with the error term
What does it mean if we have a measurement error?
If one or more relevant regressors are measured with error, the OLS estimator is biased in finite samples. The bias persists in large samples, i.e., the OLS estimator is inconsistent.
Is mismeasuring the dependent variable problematic?
In general, mismeasuring the dependent variable is unproblematic, as long as this measurement error is uncorrelated with the regressors
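A small simulation can illustrate this card. The sketch below is only illustrative: the data-generating process (standard normal x, u and e, true slope 2.0) and the helper names `ols_slope` and `simulate_y_error` are assumptions made for the example, not part of the original notes.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x: Cov(x, y) / Var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x

def simulate_y_error(n=50_000, beta1=2.0, seed=0):
    """Compare the OLS slope with and without measurement error in y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    # Measurement error in y, drawn independently of x (assumed design).
    e = [rng.gauss(0, 1) for _ in range(n)]
    y_true = [1.0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    y_obs = [yi + ei for yi, ei in zip(y_true, e)]
    return ols_slope(x, y_true), ols_slope(x, y_obs)

slope_true, slope_obs = simulate_y_error()
print(f"slope with y measured exactly:    {slope_true:.3f}")
print(f"slope with y measured with error: {slope_obs:.3f}")
```

Both slopes should land close to the assumed true value of 2.0. The error in y only adds noise to the composite error term, so the estimate becomes less precise but not systematically wrong.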
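How can we prove inconsistency in the classical errors-in-variables case?
Take the probability limit of β̂1 and show that it differs from β1: under classical errors-in-variables the estimate is attenuated toward zero. The magnitude of this bias depends on σ²e, the variance (dispersion) of the measurement error.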
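The dependence of the bias on the measurement-error variance can be checked numerically. This is a sketch under assumed settings: the true regressor has variance 1, the true slope is 2.0, and the names `ols_slope` and `attenuated_slope` are hypothetical. For this design the slope should approach 2.0 / (1 + σ²e) as the sample grows.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x: Cov(x, y) / Var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x

def attenuated_slope(sigma_e, n=50_000, beta1=2.0, seed=0):
    """OLS slope when x* is observed as x = x* + e with sd(e) = sigma_e."""
    rng = random.Random(seed)
    x_star = [rng.gauss(0, 1) for _ in range(n)]           # true regressor
    e = [rng.gauss(0, sigma_e) for _ in range(n)]          # classical error
    u = [rng.gauss(0, 1) for _ in range(n)]                # structural error
    x_obs = [xs + ei for xs, ei in zip(x_star, e)]         # mismeasured x
    y = [1.0 + beta1 * xs + ui for xs, ui in zip(x_star, u)]
    return ols_slope(x_obs, y)

slope_no_error = attenuated_slope(sigma_e=0.0)
slope_with_error = attenuated_slope(sigma_e=1.0)
print(f"sigma_e = 0: slope = {slope_no_error:.3f}")
print(f"sigma_e = 1: slope = {slope_with_error:.3f}")
```

With σe = 1 the error variance equals the variance of the true regressor, so the estimated slope should be roughly half of the true slope, and raising `n` does not remove the gap.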
What is a causal relationship?
The simple OLS estimate can be given a causal interpretation as long as the zero conditional mean assumption, E(ui | xi) = 0, holds. In the absence of this assumption, two very important properties, unbiasedness and consistency, no longer hold.
What is omitted variable bias?
The bias that arises when the regressor is correlated with a variable that has been omitted from the analysis and that determines, in part, the dependent variable. The OLS estimator is therefore inconsistent.
What is a long regression
The regression that includes the additional explanatory variable, which is correlated with another explanatory variable already in the regression
What is a short regression
The regression that leaves out an explanatory variable that is correlated with an included explanatory variable
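The difference between the short and the long regression can be sketched with simulated data. Everything in this example is an assumption made for illustration: the variable names `x1` and `x2`, the coefficients (1.0 and 2.0), the way `x1` depends on `x2`, and the helper functions. The short regression regresses y on `x1` only; the long regression also includes `x2`.

```python
import random

def demean(values):
    """Subtract the sample mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def simple_slope(x, y):
    """Slope from regressing y on x (with an intercept)."""
    xd, yd = demean(x), demean(y)
    return dot(xd, yd) / dot(xd, xd)

def two_regressor_slopes(x1, x2, y):
    """Slopes from regressing y on x1 and x2 (with an intercept),
    obtained by solving the 2x2 normal equations on demeaned data."""
    a, b, c = demean(x1), demean(x2), demean(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(0)
n = 50_000
beta1, beta2 = 1.0, 2.0
x2 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.5 * w + rng.gauss(0, 1) for w in x2]      # x1 is correlated with x2
y = [1.0 + beta1 * a + beta2 * w + rng.gauss(0, 1) for a, w in zip(x1, x2)]

b_short = simple_slope(x1, y)                      # short: y on x1 only
b_long1, b_long2 = two_regressor_slopes(x1, x2, y) # long: y on x1 and x2
delta = simple_slope(x1, x2)                       # auxiliary: x2 on x1

print(f"short regression slope on x1: {b_short:.3f}")
print(f"long regression slope on x1:  {b_long1:.3f}")
print(f"long slope + beta2 * delta:   {b_long1 + b_long2 * delta:.3f}")
```

In this design the long regression should recover a slope near 1.0 on `x1`, while the short regression should sit near 1.8, because it also picks up the effect of the omitted `x2`. The last line checks the omitted variable bias identity: the short slope equals the long slope plus the omitted variable's coefficient times the slope from regressing `x2` on `x1`.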
Classical-Errors-in-Variables assumption
is that the measurement error is uncorrelated with the unobserved explanatory variable (i.e., E(x*i ei) = 0 or Cov(x*i, ei) = 0)
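Under this assumption the probability limit of the OLS slope can be worked out in a few lines. The derivation below is a sketch with assumed notation: the observed regressor is x_i = x*_i + e_i, the model is y_i = β0 + β1 x*_i + u_i, and it additionally assumes Cov(x*_i, u_i) = 0 and Cov(e_i, u_i) = 0, which the card does not state.

```latex
\begin{align*}
\operatorname{plim}\hat\beta_1
  &= \frac{\operatorname{Cov}(x_i, y_i)}{\operatorname{Var}(x_i)}
   = \frac{\operatorname{Cov}(x_i^* + e_i,\; \beta_0 + \beta_1 x_i^* + u_i)}
          {\operatorname{Var}(x_i^* + e_i)} \\
  &= \frac{\beta_1 \operatorname{Var}(x_i^*) + \beta_1 \operatorname{Cov}(x_i^*, e_i)
           + \operatorname{Cov}(x_i^*, u_i) + \operatorname{Cov}(e_i, u_i)}
          {\operatorname{Var}(x_i^*) + 2\operatorname{Cov}(x_i^*, e_i) + \operatorname{Var}(e_i)} \\
  &= \beta_1 \, \frac{\sigma_{x^*}^2}{\sigma_{x^*}^2 + \sigma_e^2}
\end{align*}
```

The last step uses Cov(x*_i, e_i) = 0 together with the two extra assumptions. Because the ratio is below one whenever σ²e > 0, the slope is pulled toward zero, and more so the larger σ²e is relative to σ²x*.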
Causality
the relationship between cause and effect
Internal validity
the statistical inferences about causal effects are valid for the population being studied