Econometrics Chapters 3&4

what happens if the mean of the error term doesn't equal zero in the sample?

In a small sample, the mean of the error term is unlikely to be exactly zero, but as the size of the sample approaches infinity, the mean of the sample approaches zero. *As long as the equation has a constant term, the estimate of Beta0 will absorb any non-zero mean. *In essence, the constant term equals the fixed portion of Y that cannot be explained by the independent variables, and the error term equals the stochastic portion of the unexplained value of Y.

specification error

The specification of a model involves three elements: the independent variables, the functional form of the variables, and the properties of the stochastic error term. A mistake in any of the three elements results in a specification error. A specification error is usually the most disastrous error to the validity of the estimated equation.

six steps in applied regression analysis

1. Review the literature and develop a theoretical model.
2. Specify the model: select the independent variables and how they should be measured, the functional (mathematical) form of the variables, and the properties of the stochastic error term.
3. Hypothesize the expected signs of the coefficients.
4. Collect the data. Inspect and clean the data.
5. Estimate and evaluate the equation.
6. Document the results.

classical assumptions: (the five ones in class)

1. True model: the regression model is linear, correctly specified, and has an additive error term (I).
2. X's are not correlated with the error term and are not perfectly collinear: all explanatory variables are uncorrelated with the error term (III), and no explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity) (VI).
3. Expected value of the error term = 0: the error term has a zero population mean (II).
4. Serial independence of error terms and homoskedasticity: observations of the error term are uncorrelated with each other (no serial correlation) (IV), and the error term has a constant variance (no heteroskedasticity) (V).
5. Normally distributed error term: the error term is normally distributed (VII).

how to remember the gauss-markov theorem?

BLUE acronym: Best Linear Unbiased Estimator. *Best means minimum variance, so it could be written MvLUE. *If an equation's coefficient estimator is unbiased, then E(beta-hat k) = Beta k (k = 0,1,2,...,K). *Best means that beta-hat k has the smallest variance possible out of all the linear unbiased estimators of Beta k. An unbiased estimator with the smallest variance is called efficient, and that estimator is said to have the property of efficiency.

how is the equation estimated?

EViews can estimate the equation in less than a second! Typically, estimation is done using OLS.
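To make "estimation is done using OLS" concrete, here is a minimal sketch in Python with NumPy rather than EViews. The data are simulated, and the true model Y = 2 + 0.5X + err is an assumption chosen only for illustration.

```python
import numpy as np

# Hypothetical data: the true model Y = 2 + 0.5*X + err is assumed for illustration.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
err = rng.normal(0, 1, size=n)
y = 2.0 + 0.5 * x + err

# Design matrix with a column of ones for the constant term.
X = np.column_stack([np.ones(n), x])

# OLS picks the beta-hats that minimize the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The same estimates from the normal equations (X'X) b = X'y.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

residuals = y - X @ beta_hat
print("beta-hat (constant, slope):", beta_hat)
print("sum of residuals:", residuals.sum())
```

With a constant term in the equation, the residuals sum to zero up to rounding error.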

standard error of beta-hat

SE(beta-hat) is the square root of the estimated variance of beta-hat. Like the variance, it is affected by the size of the sample: an increase in sample size will cause SE(beta-hat) to fall. The larger the sample, the more precise the coefficient estimates will be.
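A rough sketch of how SE(beta-hat) shrinks as the sample grows. The helper name ols_standard_errors, the simulated model, and the two sample sizes are all assumptions made for this illustration.

```python
import numpy as np

def ols_standard_errors(x, y):
    """Return (beta_hat, se) for a regression of y on a constant and x."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    dof = n - X.shape[1]                    # sample size minus coefficients estimated
    s2 = resid @ resid / dof                # estimated variance of the error term
    var_beta = s2 * np.linalg.inv(X.T @ X)  # estimated variances of the beta-hats
    return beta_hat, np.sqrt(np.diag(var_beta))

rng = np.random.default_rng(1)

def simulate(n):
    # Hypothetical true model: Y = 2 + 0.5*X + err
    x = rng.uniform(0, 10, size=n)
    y = 2.0 + 0.5 * x + rng.normal(0, 1, size=n)
    return ols_standard_errors(x, y)

_, se_small = simulate(50)
_, se_large = simulate(5000)
print("SE(slope), n = 50:  ", se_small[1])
print("SE(slope), n = 5000:", se_large[1])
```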

Gauss-Markov Theorem:

Under class assumptions (I-IV), the Ordinary Least Squares estimator of Beta k is the minimum variance estimator from among the set of all linear unbiased estimators of Beta k, for k = 0,1,2,...,K.

How does the constant term absorb the non-zero mean of the error term in a sample?

Yi = Beta0 + Beta1Xi + erri
Suppose the mean of erri is 3 instead of zero; then E(erri - 3) = 0. If we add 3 to the constant term and subtract it from the error term, we obtain:
Yi = (Beta0 + 3) + Beta1Xi + (erri - 3)
which can be rewritten as:
Yi = Beta0* + Beta1Xi + erri*
where Beta0* = Beta0 + 3 and erri* = erri - 3. The new error term erri* has a zero mean.
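The algebra above can be checked with a small simulation. This is only a sketch: the values Beta0 = 1 and Beta1 = 2 and the simulated data are assumptions for illustration, while the error mean of 3 follows the card.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
beta0, beta1 = 1.0, 2.0                 # hypothetical true coefficients
x = rng.uniform(0, 10, size=n)
err = rng.normal(3.0, 1.0, size=n)      # error term with mean 3 instead of 0
y = beta0 + beta1 * x + err

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# The constant estimates Beta0* = Beta0 + 3; the slope is unaffected.
print("estimated constant:", beta_hat[0])
print("estimated slope:   ", beta_hat[1])
print("mean of residuals: ", residuals.mean())
```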

Class Assumption V: the error term is normally distributed

Although we have already assumed that the observations of the error term are drawn independently from a distribution that has a zero mean and a constant variance, we have said little about the shape of that distribution. This assumption says that the shape is normal.

unbiased estimator

an estimator Beta-hat is an unbiased estimator if its sampling distribution has as its expected value the true value of beta. E(beta-hat) = beta
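One way to see E(beta-hat) = beta is a Monte Carlo sketch: draw many samples from the same model, estimate each by OLS, and average the beta-hats. The true coefficients, sample size, and number of replications below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 30, 2000
true_beta = np.array([2.0, 0.5])        # hypothetical Beta0 and Beta1

# Keep the X values fixed and redraw only the error term in each replication.
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])
errors = rng.normal(0, 1, size=(n, reps))
Y = (X @ true_beta)[:, None] + errors   # one column of Y per replication

# lstsq accepts a 2-D right-hand side and fits every column at once.
beta_hats, *_ = np.linalg.lstsq(X, Y, rcond=None)   # shape (2, reps)

mean_beta_hat = beta_hats.mean(axis=1)
print("mean of the beta-hats:", mean_beta_hat)
print("true betas:           ", true_beta)
```

Any single beta-hat can miss the true value; it is the center of the sampling distribution that matches it.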

efficiency:

an unbiased estimator with the smallest variance has the property of efficiency

the measure of the central tendency of the sampling distribution of beta-hat

It can be thought of as the mean of the beta-hats. It is denoted by E(beta-hat), read as "the expected value of beta-hat."

biased estimator

if an estimator produces beta-hats that are not centered around the true beta, the estimator is referred to as a biased estimator

Class Assumption 2 PART 1: [all explanatory variables are uncorrelated with the error term]

If an explanatory variable and the error term were instead correlated with each other, the OLS estimates would be likely to attribute to X some of the variation in Y that actually came from the error term. *If the error term and X were positively correlated, the estimated coefficient would probably be higher than it would otherwise have been (biased upward), because the OLS program would mistakenly attribute the variation in Y caused by err to X instead.
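A sketch of the upward bias described above. Everything here is simulated: the true slope of 2 and the way the correlated X is built (by adding the error term to an uncorrelated X) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
err = rng.normal(0, 1, size=n)

x_uncorrelated = rng.normal(0, 1, size=n)   # satisfies the assumption
x_correlated = x_uncorrelated + err         # positively correlated with err

def ols_slope(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat[1]

# Same hypothetical true model in both cases: Y = 1 + 2*X + err
slope_ok = ols_slope(x_uncorrelated, 1.0 + 2.0 * x_uncorrelated + err)
slope_biased = ols_slope(x_correlated, 1.0 + 2.0 * x_correlated + err)

print("slope when X and err are uncorrelated:", slope_ok)
print("slope when X and err are correlated:  ", slope_biased)
```

In this setup the second slope settles near 2.5 rather than 2, because OLS credits X with variation in Y that came from err.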

how to decrease variance?

Increase the size of the sample. A larger sample increases the degrees of freedom, since the number of degrees of freedom equals the sample size minus the number of coefficients (parameters) estimated.

Gauss-Markov needs only class assumptions (I-IV); what happens when you add the last assumption (V)?

The Gauss-Markov result is strengthened: the OLS estimator can be shown to be the best (minimum variance) unbiased estimator out of all possible unbiased estimators, not just the linear ones. So when all five assumptions are met, OLS is BUE. With all five assumptions, the OLS coefficient estimators have these properties: 1. unbiased 2. minimum variance 3. consistent 4. normally distributed

Class Assumption 2 PART 2: [no explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity)]

Perfect collinearity between two independent variables implies that they are really the same variable, or that one is a multiple of the other, and/or that a constant has been added to one of the variables. That is, the relative movements of one explanatory variable will be matched exactly by the relative movements of the other. Then the OLS estimation procedure will be incapable of distinguishing one variable from the other. Example: sales tax is a perfect linear function of sales, so including both produces perfect multicollinearity.
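The sales tax example can be sketched numerically. The 7% tax rate and the simulated sales figures are assumptions for illustration. When one column is an exact multiple of another, the data matrix loses rank and OLS has no unique solution.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
sales = rng.uniform(100, 1000, size=n)   # hypothetical sales figures
sales_tax = 0.07 * sales                 # perfect linear function of sales

X = np.column_stack([np.ones(n), sales, sales_tax])

rank = np.linalg.matrix_rank(X)
corr = np.corrcoef(sales, sales_tax)[0, 1]

print("columns in X:", X.shape[1])
print("rank of X:   ", rank)             # fewer than the number of columns
print("correlation between sales and sales tax:", corr)
```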

dummy variables

takes on the value of one or zero (and only those values) depending on whether a specified condition is met.

True model: ASSUMPTION 1: regression model is linear, correctly specified, and has an additive error term:

The assumption does not require the underlying theory to be linear. For example, an exponential function: Yi = e^Beta0 * Xi^Beta1 * e^erri, where e is the base of the natural log, can be transformed by taking the natural log of both sides of the equation: ln(Yi) = Beta0 + Beta1 ln(Xi) + erri. If the variables are relabeled Yi* = ln(Yi) and Xi* = ln(Xi), then the form of the equation becomes linear: Yi* = Beta0 + Beta1 Xi* + erri.
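A short sketch of the log transformation in practice. The data are simulated from the exponential form, and the values Beta0 = 1 and Beta1 = 0.7 are assumptions for illustration. Running OLS on the logged variables recovers the coefficients.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
beta0, beta1 = 1.0, 0.7                  # hypothetical true coefficients
x = rng.uniform(1, 100, size=n)
err = rng.normal(0, 0.2, size=n)

# Nonlinear in the variables: Y = e^Beta0 * X^Beta1 * e^err
y = np.exp(beta0) * x**beta1 * np.exp(err)

# Relabel: Y* = ln(Y), X* = ln(X). The equation is now linear in the coefficients.
y_star = np.log(y)
x_star = np.log(x)

X = np.column_stack([np.ones(n), x_star])
beta_hat, *_ = np.linalg.lstsq(X, y_star, rcond=None)
print("estimated Beta0, Beta1:", beta_hat)
```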

what does the number of degrees of freedom signify?

the more degrees of freedom there are, the better - because when the number of degrees of freedom is large, every positive error is likely to be balanced by a negative error. When the degrees of freedom are low, the random element is likely to fail to provide such offsetting observations. For example, the more a coin is flipped, the more likely it is that the observed proportion of heads will reflect the true probability of 0.5
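The coin-flip intuition can be sketched with a quick simulation. The sample sizes are arbitrary choices for illustration. With few flips the observed proportion can sit far from 0.5; with many flips the positive and negative deviations tend to offset.

```python
import numpy as np

rng = np.random.default_rng(7)

proportions = {}
for n in (10, 100, 10_000, 1_000_000):
    flips = rng.integers(0, 2, size=n)   # 1 = heads, 0 = tails
    proportions[n] = flips.mean()
    print(f"{n:>9} flips: proportion of heads = {proportions[n]:.4f}")
```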

Class Assumption IV part 1: observations of the error term are uncorrelated with each other:

The observations of the error term are drawn independently of each other. If a systematic correlation exists between one observation of the error term and another, then OLS estimates will be less precise than estimates that account for the correlation. *If, over all the observations of the sample, err(t+1) is correlated with err(t), then the error term is said to be serially correlated (and Assumption IV is VIOLATED).
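A sketch contrasting independent errors with serially correlated ones. The serially correlated series is built as err(t) = rho * err(t-1) + shock(t); that form and the value rho = 0.8 are assumptions for illustration, not something the card specifies.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
rho = 0.8                                # hypothetical strength of the correlation

independent_err = rng.normal(0, 1, size=n)

# Each error carries over part of the previous period's error.
shocks = rng.normal(0, 1, size=n)
serial_err = np.empty(n)
serial_err[0] = shocks[0]
for t in range(1, n):
    serial_err[t] = rho * serial_err[t - 1] + shocks[t]

def lag1_correlation(e):
    """Correlation between err(t+1) and err(t) across the sample."""
    return np.corrcoef(e[1:], e[:-1])[0, 1]

print("independent errors:        ", lag1_correlation(independent_err))
print("serially correlated errors:", lag1_correlation(serial_err))
```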

ASSUMPTION 3: error term has a zero population mean

The specific value of the error term for each observation is determined purely by chance. Think of each observation of the error term as being drawn from a random variable distribution whose mean is zero. That is, when the entire population of possible values for the stochastic error term is considered, the average value of that population is zero. For a small sample, it is not likely that the mean is exactly zero.

standard error of beta-hat

the square root of the estimated variance of the coefficient estimate is the standard error of Beta-hat, SE(beta-hat k)

CLASS ASSUMPTION IV part 2: error term has a constant variance

The variance of the distribution from which the observations of the error term are drawn is constant. That is, the observations of the error term are assumed to be drawn continually from identical distributions. For example, suppose you're studying the amount of money that the 50 states spend on education. New York and California are more heavily populated than New Hampshire and Nevada, so it's probable that the variance of the error term for big states is larger than it is for small states: the amount of unexplained variation in educational expenditures seems likely to be larger in big states like New York than in small states like New Hampshire. The violation of this assumption (textbook Assumption V) is referred to as heteroskedasticity.
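The big-state versus small-state idea can be sketched with simulated data. Nothing here is real expenditure data: the population range, the cutoff of 20, and the rule that the error's standard deviation is proportional to population are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 2000                                     # hypothetical sample of regions
population = rng.uniform(1, 40, size=n)      # in millions, made up

# Heteroskedastic error: its standard deviation grows with population.
err = rng.normal(0, 1, size=n) * population

big = population > 20
var_big = err[big].var()
var_small = err[~big].var()

print("error variance, big regions:  ", var_big)
print("error variance, small regions:", var_small)
```

Under a constant variance the two numbers would be roughly equal; here the big-region variance is several times larger.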

omitted condition

When there are two conditions, only one dummy variable is constructed: one fewer than the number of conditions. The event not explicitly represented by a dummy variable, the OMITTED CONDITION, forms the basis against which the included conditions are compared. Be careful never to use two dummy variables to describe two conditions; doing so is the dummy variable trap, because together with the constant term it creates perfect multicollinearity.
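A minimal sketch of the dummy variable trap. The two conditions (labeled male and female here) and the eight observations are hypothetical. With a constant term, the two dummies always sum to the column of ones, so the data matrix loses rank.

```python
import numpy as np

# Hypothetical two-condition variable: 1 = male, 0 = female
male = np.array([1, 0, 1, 1, 0, 0, 1, 0])
female = 1 - male
ones = np.ones(len(male))

# Trap: a constant plus a dummy for BOTH conditions (male + female = constant).
X_trap = np.column_stack([ones, male, female])

# Correct: one dummy only; "female" is the omitted condition.
X_ok = np.column_stack([ones, male])

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)

print("trap:    columns =", X_trap.shape[1], " rank =", rank_trap)
print("correct: columns =", X_ok.shape[1], " rank =", rank_ok)
```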

notation of variance:

Variance could be written VAR(beta-hat) or sigma^2(beta-hat). *Remember that the standard deviation is the square root of the variance. The variance of the estimates is a population parameter that is never actually observed in practice; instead, it is estimated with sigma-hat^2(beta-hat k), also written as s^2(beta-hat k).

