Studenmund

The Gauss-Markov Theorem is perhaps most easily remembered by stating that...

"OLS is BLUE" where BLUE stands for "Best (meaning minimum variance) Linear Unbiased Estimator." Students who might forget that "best" stands for minimum variance might be better served by remembering "OLS is MvLUE," but such a phrase is hardly catchy or easy to remember.

Irrelevant Variables

A variable in an equation that does not belong

Two-sided Test

Another approach is to use a two-sided test (or a two-tailed test) in which the alternative hypothesis has values on both sides of the null hypothesis. For a two-sided test around zero, the null and alternative hypotheses are, for example, H0: β = 0 and H1: β ≠ 0.

Chapter 4 - The Classical Model

Chapter 8 - Multicollinearity

T-test vs F-test

Econometricians generally use the t-test to test hypotheses about individual regression slope coefficients. Tests of more than one coefficient at a time (joint hypotheses) are typically done with the F-test

5. The error term has a constant variance

For example, suppose that you're studying the amount of money that the 50 states spend on education. New York and California are more heavily populated than New Hampshire and Nevada, so it's probable that the variance of the error term for big states is larger than it is for small states. The amount of unexplained variation in educational expenditures seems likely to be larger in big states like New York than in small states like New Hampshire. The violation of Assumption V is referred to as heteroskedasticity.

Omitted Variable Bias

The bias caused by leaving a variable out of an equation.

Expected Bias Equation

Expected bias = β2 · α-hat1, where β2 is the coefficient (and thus the expected sign) of the omitted variable in the true equation and α-hat1 is the slope coefficient from a regression of the omitted variable on the included variable, reflecting the correlation between the two.
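
As an illustrative sketch of how the sign works out (the numbers here are assumed, not from the text): if the omitted variable's true coefficient is positive (β2 = 0.5) and the omitted variable is positively correlated with the included variable (α-hat1 = 2), the expected bias is β2 · α-hat1 = 0.5 × 2 = +1.0, so the included variable's estimated coefficient is biased upward; if the two signs differ, the bias is downward.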

Limitations of the t-Test: 3. The t-test is not intended for tests of the entire population

- If a coefficient is calculated from the entire population, then an unbiased estimate already measures the population value and a significant t-test adds nothing to this knowledge. One might forget this property and attach too much importance to t-scores that have been obtained from samples that approximate the population in size. This point can perhaps best be seen by remembering that the t-score is the estimated regression coefficient divided by the standard error of the estimated regression coefficient. If the sample size is large enough to approach the population, then the standard error will approach zero and the t-score t = β-hat/SE(β-hat) will grow without bound. Thus, the mere existence of a large t-score for a huge sample has no real substantive significance.

Limitations of the t-Test: 1. The t-test does not test theoretical validity

- Some researchers conclude that any statistically significant result is also a theoretically correct one. For example, if the amount of rainfall in the UK turns out to be statistically significant in an equation explaining the CPI, that does not mean the relationship has any theoretical validity.

Limitations of the t-Test: 2. The t-test does not test "Importance"

- Some researchers draw the conclusion that the most statistically significant variable in their estimated regression is also the most important in terms of explaining the largest portion of the movement of the dependent variable. Statistical significance says little, if anything, about which variables determine the major portion of the variation in the dependent variable. In the example, X2 is 20 times more important than X1 in explaining Y, but X1 has the higher t-score.

Consequences of Multicollinearity

1. Estimates will remain unbiased. 2. The variances and standard errors of the estimates will increase. 3. The computed t-scores will fall. 4. Estimates will become very sensitive to changes in specification. 5. The overall fit of the equation and the estimation of the coefficients of nonmulticollinear variables will be largely unaffected.

The Consequences of Serial Correlation

1. Pure serial correlation does not cause bias in the coefficient estimates. 2. Serial correlation causes OLS to no longer be the minimum variance estimator (of all the linear unbiased estimators). 3. Serial correlation causes the OLS estimates of the SE(β-hat)s to be biased, leading to unreliable hypothesis testing.

The four steps to use when working with the t-test are:

1. Set up the null and alternative hypotheses. 2. Choose a level of significance and therefore a critical t-value. 3. Run the regression and obtain an estimated t-value (or t-score). 4. Apply the decision rule by comparing the calculated t-value with the critical t-value in order to reject or not reject the null hypothesis.
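
The decision rule in step 4 can be sketched in a few lines of Python; the coefficient estimate, standard error, degrees of freedom, and one-sided 5-percent significance level below are illustrative assumptions, not values from the text, and SciPy is assumed to be available.

# A minimal sketch of the four t-test steps.
from scipy import stats

beta_hat = 0.42       # step 3: estimated coefficient from the regression (assumed value)
se_beta_hat = 0.15    # step 3: its estimated standard error (assumed value)
df = 26               # degrees of freedom, N - K - 1 (assumed value)

# Step 1: H0: beta <= 0 versus H1: beta > 0 (one-sided test)
# Step 2: a 5-percent level of significance gives the critical t-value
t_critical = stats.t.ppf(1 - 0.05, df)

# Step 3: the calculated t-score
t_score = beta_hat / se_beta_hat

# Step 4: reject H0 only if the t-score exceeds the critical value
# and has the sign implied by H1 (positive here)
if t_score > t_critical:
    print("Reject H0 at the 5-percent level")
else:
    print("Cannot reject H0")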

Characteristics of time-series

1. The order of observations in a time series is fixed. 2. Time series samples tend to be much smaller than cross-sectional ones. 3. The theory underlying time-series analysis can be quite complex. 4. The stochastic error term in a time-series equation is often affected by events that took place in a previous time period.

The 7 Classical Assumptions

1. The regression model is linear, is correctly specified, and has an additive error term. 2. The error term has a zero population mean. 3. All explanatory variables are uncorrelated with the error term. 4. Observations of the error term are uncorrelated with each other (no serial correlation). 5. The error term has a constant variance (no heteroskedasticity). 6. No explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity). 7. The error term is normally distributed (this assumption is optional but usually is invoked).

Given all seven classical assumptions, the OLS coefficient estimators can be shown to have the following properties:

1. They are unbiased. That is, E(β-hat) is β. This means that the OLS estimates of the coefficients are centered around the true population values of the parameters being estimated. 2. They are minimum variance. The distribution of the coefficient estimates around the true parameter values is as tightly or narrowly distributed as is possible for an unbiased distribution. No other unbiased estimator has a lower variance for each estimated coefficient than OLS. 3. They are consistent. As the sample size approaches infinity, the estimates converge to the true population parameters. Put differently, as the sample size gets larger, the variance gets smaller, and each estimate approaches the true value of the coefficient being estimated. 4. They are normally distributed. The β-hat's are N(β, VAR[β-hat]). Thus various statistical tests based on the normal distribution may indeed be applied to these estimates, as will be done in Chapter 5.
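
A small simulation can make properties 1 and 3 concrete; this is a sketch with an assumed data-generating process (true coefficients 2.0 and 0.5) and assumed sample sizes, using NumPy rather than anything from the text.

import numpy as np

rng = np.random.default_rng(0)
true_beta0, true_beta1 = 2.0, 0.5   # assumed true parameters

for n in (30, 3000):
    slopes = []
    for _ in range(2000):
        x = rng.normal(size=n)
        y = true_beta0 + true_beta1 * x + rng.normal(size=n)   # classical error term
        slope, intercept = np.polyfit(x, y, 1)                 # OLS fit of y on x
        slopes.append(slope)
    # The slope estimates center on 0.5 (unbiasedness) and their variance
    # shrinks as n grows (consistency)
    print(n, round(np.mean(slopes), 3), round(np.var(slopes), 5))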

Confidence Interval

A confidence interval is a range of values that will contain the true value of β a certain percentage of the time, say 90 or 95 percent. The formula for a confidence interval is β-hat ± tc · SE(β-hat), where tc is the two-sided critical t-value for the chosen level of confidence.
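
As an illustrative worked example (the numbers are assumed, not from the text): with β-hat = 1.0, SE(β-hat) = 0.2, and a two-sided 5-percent critical t-value of about 2.05 (roughly 28 degrees of freedom), the 95-percent confidence interval is 1.0 ± 2.05 × 0.2, or about 0.59 to 1.41.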

Decision Rules of hypothesis Testing: Critical Value

A critical value is a value that divides the "acceptance" region from the rejection region when testing a null hypothesis. In the pictured example, 1.8 is the critical value.

Decision Rules of hypothesis Testing: A Decision Rule is...

A decision rule is a method of deciding whether to reject a null hypothesis. Typically, a decision rule involves comparing a sample statistic with a preselected critical value found in tables such as those in the end of this text. A decision rule should be formulated before regression estimates are obtained. The range of possible values of β-hat is divided into two regions, an "acceptance" region and a rejection region, where the terms are expressed relative to the null hypothesis.

Dominant Variable

A special case related to perfect multicollinearity occurs when a variable that is definitionally related to the dependent variable is included as an independent variable in a regression equation. Such a dominant variable is by definition so highly correlated with the dependent variable that it completely masks the effects of all other independent variables in the equation. In a sense, this is a case of perfect collinearity between the dependent variable and an independent variable. The relationship is definitional, and the dominant variable should be dropped from the equation to get reasonable estimates of the coefficients of the other variables. Be careful, though! Dominant variables shouldn't be confused with highly significant or important explanatory variables. Instead, they should be recognized as being virtually identical to the dependent variable. While the fit between the two is superb, knowledge of that fit could have been obtained from the definitions of the variables without any econometric estimation.

An equation that is "linear in the coefficients"

An equation is linear in the coefficients only if the coefficients (the βs) appear in their simplest form: they are not raised to any powers (other than one), are not multiplied or divided by other coefficients, and do not themselves include some sort of function (like logs or exponents). For example, Equation 7.1 is linear in the coefficients, but Equation 7.3 is not. • The use of OLS requires that the equation be linear in the coefficients, but there is a wide variety of functional forms that are linear in the coefficients while being nonlinear in the variables.

An equation that is "linear in variables"

An equation is linear in the variables if plotting the function in terms of X and Y generates a straight line. Equation 7.1, for example, is linear in the variables.

Classical Assumption 2: What happens if the mean doesn't equal zero in a sample?

As long as you have a constant term in the equation, the estimate of beta-0 will absorb the nonzero mean. In essence, the constant term equals the fixed portion of Y that cannot be explained by the independent variables, and the error term equals the stochastic portion of the unexplained value of Y.

Specification Error

Before any equation can be estimated, it must be specified. Specifying an econometric equation consists of three parts: choosing the correct independent variables, the correct functional form, and the correct form of the stochastic error term. A specification error results when any one of these choices is made incorrectly.

Impure Serial Correlation

By impure serial correlation we mean serial correlation that is caused by a specification error such as an omitted variable or an incorrect functional form. While pure serial correlation is caused by the underlying distribution of the error term of the true specification of an equation (which cannot be changed by the researcher), impure serial correlation is caused by a specification error that often can be corrected. • The best remedy is to attempt to find the omitted variable or the correct functional form for the equation.

Chapter 5 - Hypothesis Testing and Statistical Inference

Chapter 6 - Specification: Choosing the Independent Variables

Chapter 7 - Specification: Choosing a Functional Form

Chapter 9 - Serial Correlation

Remedies to Multicollinearity: Do nothing

Every remedy has a drawback, so sometimes doing nothing is the best thing. • One reason for doing nothing is that multicollinearity in an equation will not always reduce the t-scores enough to make them insignificant or change β-hats enough to make them differ from expectations. • A second reason for doing nothing is that the deletion of a multicollinear variable that belongs in an equation will cause specification bias. • The final reason for considering doing nothing to offset multicollinearity is that every time a regression is rerun, we risk encountering a specification that fits because it accidentally works for the particular data set involved, not because it is the truth.

The Consequences of an Omitted Variable

If we leave an important variable out of an equation, we violate Classical Assumption 3 (that the explanatory variables are independent of the error term), unless the omitted variable is uncorrelated with all the included variables (which is extremely unlikely). To explain why this is so, we start by saying most pairs of variables are correlated to some degree. When an important variable is omitted, the impact of that variable goes into the error term. Thus, the error term is no longer independent of the explanatory variable(s). Other explanatory variables will also try to compensate for the impact of the omitted variable. When there is a violation of one of the Classical Assumptions, the Gauss-Markov Theorem does not hold, and the OLS estimates are not BLUE.

Imperfect Multicollinearity

Imperfect multicollinearity can be defined as a linear functional relationship between two or more independent variables that is so strong that it can significantly affect the estimation of the coefficients of the variables. • The stochastic error term, u, implies that although the relationship between X1 and X2 might be fairly strong, it is not strong enough to allow X1 to be completely explained by X2; some unexplained variation still remains.

Degrees of Freedom

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary. In a regression, degrees of freedom = N - K - 1, where N is the number of observations and K is the number of independent variables.

t-Statistic

In statistics, the t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error. In the example, the t-stat for Ni is -4.42. We reach that value by dividing the estimated regression coefficient (-9075) by its standard error (2053).

Correcting for an Omitted Variable

In theory, the solution to a problem of omitted variable bias seems easy: add the omitted variable to the equation! Unfortunately, that's easier said than done, for a couple of reasons: 1. The omitted variable is hard to detect - the amount of bias introduced can be small and not immediately detectable. 2. It is hard to choose which variable to add to an equation once you decide that it is suffering from omitted variable bias. - Some researchers will add all possible relevant variables at once, which leads to less precise estimates. - Others will test a number of different variables and keep the one in the equation that does the best statistical job of appearing to reduce bias. This is invalid because the variable that best corrects a case of specification bias might do so only by chance rather than by being the true solution to the problem.

Distribution of the β-hat's

Just as we would like the distribution of the β-hat's to be centered around the true population β, we would also like the distribution to be as narrow as possible.

You have an equation of weight on height. Suppose you take a sample of six students and apply OLS to get an estimate of β1, the coefficient on height. What will happen if you select a second sample of six students and do the same thing? Will you get the same β-hat1?

No they will not. While a single sample provides a single estimate of beta-1, that estimate comes from a sampling distribution with a mean and a variance. Other estimates from that sampling distribution will most likely be different.

Level of Confidence

Now and then researchers will use the phrase "degree of confidence" or "level of confidence" when they test hypotheses. What do they mean? The level of confidence is nothing more than 100 percent minus the level of significance. Thus a t-test for which we use a 5-percent level of significance can also be said to have a 95-percent level of confidence.

Remedies to Multicollinearity: Drop a Redundant Variable

On occasion, the simple solution of dropping one of the multicollinear variables is a good one.

6. No explanatory variable is a perfect linear function of any other explanatory variable(s).

Perfect Collinearity - implies that two independent variables are really the same variable • Suppose that you decide to build a model of the profits of tire stores in your city and you include annual sales of tires (in dollars) at each store and the annual sales tax paid by each store as independent variables. Since the tire stores are all in the same city, they all pay the same percentage sales tax, so the sales tax paid will be a constant percentage of their total sales (in dollars). If the sales tax rate is 7%, then the total taxes paid will be 7% of sales for each and every tire store. Thus sales tax will be a perfect linear function of sales, and you'll have perfect multicollinearity! • Perfect multicollinearity also can occur when two independent variables always sum to a third or when one of the explanatory variables doesn't change within the sample. With perfect multicollinearity, OLS (or any other estimation technique) will be unable to estimate the coefficients of the collinear variables (unless there is a rounding error).
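
A quick sketch of why estimation breaks down under perfect multicollinearity; the tire-store figures below are made up, and NumPy is assumed to be available.

import numpy as np

sales = np.array([100.0, 250.0, 400.0, 320.0, 180.0])   # annual tire sales (assumed data)
sales_tax = 0.07 * sales                                 # perfectly collinear with sales
X = np.column_stack([np.ones_like(sales), sales, sales_tax])

# The design matrix is rank-deficient: one column is an exact linear
# function of another, so X'X cannot be inverted to give unique estimates.
print(np.linalg.matrix_rank(X))       # 2, not 3
print(np.linalg.det(X.T @ X))         # essentially zero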

When to reject Ho with critical t-values

Reject the null hypothesis if the sample t-value is greater in absolute value than the critical t-value and also has the sign implied by H1. Do not reject H0 otherwise, that is, if either condition fails.

When to reject Ho with p-values

Reject the null hypothesis if the sample p-value is less than the selected level of significance and if β-hatk has the sign implied by H1. Do not reject H0 otherwise.

Omitted Variable

Suppose that you forget to include one of the relevant independent variables when you first specify an equation (after all, no one's perfect!). Or suppose that you can't get data for one of the variables that you do think of. The result in both these situations is an omitted variable, defined as an important explanatory variable that has been left out of a regression equation.

The F-Test

The F-test is a formal hypothesis test that is designed to deal with a null hypothesis that contains multiple hypotheses or a single hypothesis about a group of coefficients. Such "joint" or "compound" null hypotheses are appropriate whenever the underlying economic theory specifies values for multiple coefficients simultaneously.

Alternative hypothesis

The alternative hypothesis typically is a statement of the values that the researcher expects. The notation used to specify the alternative hypothesis is "H1:" followed by a statement of the range of values you expect; the book also uses Ha. (Example pictured)

1. The regression model is linear, is correctly specified, and has an additive error term

The assumption that the regression model is linear does not require the underlying theory to be linear. Two additional properties also must hold: • First, we assume that the equation is correctly specified. If an equation has an omitted variable or an incorrect functional form, the odds are against that equation working well. • Second, we assume that a stochastic error term has been added to the equation. This error term must be an additive one and cannot be multiplied by or divided into any of the variables in the equation.

Decision Rule to use in the F-test

The decision rule to use in the F-test is to reject the null hypothesis if the calculated F-value (F) from Equation 5.10 is greater than the appropriate critical F-value (Fc); that is, reject H0 if F > Fc.
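
A minimal sketch of this decision rule in Python; the residual sums of squares, number of restrictions, degrees of freedom, and 5-percent level below are assumed for illustration, and SciPy is assumed to be available.

from scipy import stats

rss_restricted = 120.0    # residual sum of squares with the null hypothesis imposed (assumed)
rss_unrestricted = 90.0   # residual sum of squares from the unrestricted equation (assumed)
m = 2                     # number of restrictions in the joint null hypothesis (assumed)
df_denom = 40             # N - K - 1 for the unrestricted equation (assumed)

# Calculated F-value: the drop in fit per restriction relative to the unrestricted fit
f_calculated = ((rss_restricted - rss_unrestricted) / m) / (rss_unrestricted / df_denom)
f_critical = stats.f.ppf(1 - 0.05, m, df_denom)   # 5-percent critical F-value

if f_calculated > f_critical:
    print("Reject the joint null hypothesis")
else:
    print("Cannot reject the joint null hypothesis")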

Double-Log Form

The double-log form is the most common functional form that is nonlinear in the variables while still being linear in the coefficients. Indeed, the double-log form is so popular that some researchers use it as their default functional form instead of the linear form. In a double-log functional form, the natural log of Y is the dependent variable and the natural log of X is the independent variable
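
A sketch of the form with two explanatory variables (notation assumed to follow the text's βs): ln(Y) = β0 + β1 ln(X1) + β2 ln(X2) + ε. In this form each slope coefficient can be read as an elasticity: β1 is the percentage change in Y associated with a one-percent change in X1, holding X2 constant.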

One-sided Test

The hypothesis pictured is an example of a one-sided test because the alternative hypothesis has values on only one side of the null hypothesis (for example, H0: β ≤ 0 and H1: β > 0).

Remedies to Multicollinearity: Increase the Size of the Sample

The idea behind increasing the size of the sample is that a larger data set (often requiring new data collection) will allow more accurate estimates than a small one, since the larger sample normally will reduce the variance of the estimated coefficients, diminishing the impact of the multicollinearity. For most time series data sets, however, this solution isn't feasible.

The level of Type I Error in a hypothesis test is also called...

The level of significance

Level of Significance

The level of significance indicates the probability of observing an estimated t-value greater than the critical t-value if the null hypothesis were correct. Examples: 1%, 5%, 10%. Some researchers produce tables of regression results, typically without hypothesized signs for their coefficients, and then mark "significant" coefficients with asterisks. • The asterisks indicate when the t-score is larger in absolute value than the two-sided 10-percent critical value (which merits one asterisk), • the two-sided 5-percent critical value (**), • or the two-sided 1-percent critical value (***). Such a use of the t-value should be regarded as descriptive only.

Linear Form

The linear regression model, used almost exclusively in this text thus far, is based on the assumption that the slope of the relationship between the independent variable and the dependent variable is constant

First-Order Serial Correlation

The most commonly assumed kind of serial correlation is first-order serial correlation, in which the current value of the error term is a function of the previous value of the error term: εt = ρ·εt-1 + ut, where ρ (rho) is the first-order autocorrelation coefficient and ut is a classical (not serially correlated) error term.
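
A small simulation of this process (the value of ρ and the sample length are assumed; NumPy is assumed to be available):

import numpy as np

rng = np.random.default_rng(1)
rho, n = 0.7, 200                      # assumed autocorrelation coefficient and sample length
u = rng.normal(size=n)                 # classical, non-serially-correlated errors
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + u[t]   # first-order serial correlation

# Adjacent error terms are now correlated, violating Classical Assumption IV
print(np.corrcoef(eps[:-1], eps[1:])[0, 1])   # roughly 0.7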

Null hypothesis

The null hypothesis typically is a statement of the values that the researcher does not expect. The notation used to specify the null hypothesis is "Ho:" followed by the statement of the range of values you do not expect. • We never say we accept the null hypothesis; we must always say that we cannot reject the null hypothesis. (Example pictured)

Decision Rules of hypothesis Testing: What does the rejection region measure?

The rejection region measures the probability of a Type I Error if the null hypothesis is true. Again, decreasing the chance of a Type I Error means increasing the chance of a Type II Error (not rejecting a false null hypothesis). If you make the rejection region so small that you almost never reject a true null hypothesis, then you're going to be unable to reject almost every null hypothesis, whether it's true or not! As a result, the probability of a Type II Error will rise.

What does the term Classical mean when referring to the classical model of econometrics?

The term classical refers to a set of fairly basic assumptions required to hold in order for OLS to be considered the "best" estimator available for regression models. When one or more of these assumptions do not hold, other estimation techniques such as Generalized Least Squares may be better than OLS.

Variance inflation factor (VIF)

The variance inflation factor (VIF) is a method of detecting the severity of multicollinearity by looking at the extent to which a given explanatory variable can be explained by all the other explanatory variables in the equation: VIFi = 1/(1 - R²i), where R²i is the R² of the auxiliary regression of Xi on the other explanatory variables. How high is high? An R²i of 1, indicating perfect multicollinearity, produces a VIF of infinity, whereas an R²i of 0, indicating no multicollinearity at all, produces a VIF of 1. A common rule of thumb is that if the VIF is greater than 5, the multicollinearity is severe.
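
A sketch of the calculation in Python; the data-generating values are assumed for illustration, and NumPy is assumed to be available.

import numpy as np

def vif(X):
    """VIF for each column of X (explanatory variables only, no constant column)."""
    n, k = X.shape
    out = []
    for i in range(k):
        y = X[:, i]
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)   # auxiliary regression
        resid = y - others @ coef
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.3, size=100)   # strongly collinear with x1 (assumed)
x3 = rng.normal(size=100)
print(vif(np.column_stack([x1, x2, x3])))   # x1 and x2 show high VIFs, x3 stays near 1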

p-value

There's an alternative to the t-test based on a measure called the p-value, or marginal significance level. A p-value for a t-score is the probability of observing a t-score that size or larger (in absolute value) if the null hypothesis were true. Graphically, it's two times the area under the curve of the t-distribution between the absolute value of the actual t-score and infinity. A p-value is a probability, so it runs from 0 to 1. It tells us the lowest level of significance at which we could reject the null hypothesis (assuming that the estimate is in the expected direction). A small p-value casts doubt on the null hypothesis, so to reject a null hypothesis, we need a low p-value.

Perfect Multicollinearity examples

Think about the distance between two cities measured in miles with X1 and in kilometers with X2. The data for the variables look quite different, but they're perfectly correlated! A more subtle example is when two variables always add up to the same amount; for instance P1, the percent of voters who voted in favor of a proposition, and P2, the percent who voted against it (assuming no abstentions), would always add up to 100% and therefore would be perfectly (negatively) correlated. • If this is the case, one variable should be dropped. • This violates Assumption 6.

Pure Serial Correlation

This occurs when assumption 4, which assumes uncorrelated observations of the error term, is violated in a correctly specified equation.

Critical t-value

To decide whether to reject or not to reject a null hypothesis based on a calculated t-value, we use a critical t-value. A critical t-value is the value that distinguishes the "acceptance" region from the rejection region. The critical t-value is selected from a t-table. A critical t-value is a function of the probability of Type I Error that the researcher wants to specify.

Example of type I and II errors: Ho: The defendant is innocent H1: The defendant is guilty What would be a type I error and what would be a type II error?

Type I Error = Sending an innocent defendant to jail. Type II Error = Freeing a guilty defendant. Decreasing the level of type I error means increasing the level of type II error.

Type I Error

Type I: We reject a true null hypothesis

Type II Error

Type II: We do not reject a false null hypothesis

2. The error term has a zero population mean

• As was pointed out in Section 1.2, econometricians add a stochastic (random) error term to regression equations to account for variation in the dependent variable that is not explained by the model. The specific value of the error term for each observation is determined purely by chance. • For a small sample, it is not likely that the mean is exactly zero. But as the size of the sample approaches infinity, the mean of the sample approaches zero.

7. The error term is normally distributed

• Assumption VII states that the observations of the error term are drawn from a distribution that is normal (that is, bell-shaped, and generally following the symmetrical pattern portrayed in Figure 4.3). Even though Assumption VII is optional, it's usually advisable to add the assumption of normality to the other six assumptions for two reasons: 1. The error term ei can be thought of as the sum of a number of minor influences or errors. As the number of these minor influences gets larger, the distribution of the error term tends to approach the normal distribution. 2. The t-statistic and the F-statistic, which will be developed in Chapter 5, are not truly applicable unless the error term is normally distributed.

Unbiasedness

• For an estimation technique to be "good," the mean of the sampling distribution of the β-hats it produces should equal the true population β. This property has a special name in econometrics: unbiasedness. • An estimator β-hat is an unbiased estimator if its sampling distribution has as its expected value the true value of β. • No systematic distortion: the mean of the sampling distribution of β-hat should equal the true β value. • If an estimator produces β-hats that are not centered around the true β, the estimator is referred to as a biased estimator.

4. Observations of the error term are uncorrelated with each other

• If a systematic correlation exists between one observation of the error term and another, then OLS estimates will be less precise than estimates that account for the correlation. • This assumption is most important in time-series models. - Assumption IV says that an increase in the error term in one time period (a random shock, for example) does not show up in or affect in any way the error term in another time period. In some cases, though, this assumption is unrealistic, since the effects of a random shock sometimes last for a number of time periods.

What happens if you suppress the constant term?

• In most cases, suppressing the constant term leads to a violation of the Classical Assumptions, because it's very rare that economic theory implies that the true intercept, βo, equals zero • If you omit the constant term, then the impact of the constant is forced into the estimates of the other coefficients, causing potential bias. • forcing the regression through the origin makes the slope appear to be significantly positive.

Impact of Irrelevant Variables

• Increases the variance of the estimated coefficients while at the same time decreasing the absolute magnitude of their t-scores • It will decrease the adjusted R² (R-bar²) but not the R²

The Standard Error of β-hat

• The standard error of the estimated coefficient, SE(β-hat), is the square root of the estimated variance of the β-hats; it is similarly affected by the size of the sample and the other factors we have mentioned. • An increase in sample size will cause the SE to fall; the larger the sample, the more precise our coefficient estimates will be.

"Do no rely on estimates of the constant term"

• the intercept should not be relied on for purposes of analysis or inference • the constant term is the value of the dependent variable when all the independent variables and the error term are zero, but the variables used for economic analysis are usually positive. Thus, the origin often lies outside the range of sample observations

Distribution of the β-hat's: Sample size

•The variance of the distribution of the β-hat's can be decreased by increasing the size of the sample. • This also increases the degrees of freedom, since the number of degrees of freedom equals the sample size minus the number of coefficients or parameters estimated. • As the number of observations increases, other things held constant, the variance of the sampling distribution tends to decrease. • Although it is not true that a sample of 60 will always produce estimates closer to the true β than a sample of 6, it is quite likely to do so; such larger samples should be sought

