Hagrannsóknir (Econometrics)

a binary variable is called a

dummy variable

True

Adjusted R Squared can be negative

b1 meaning

how much the dependent variable changes for a one-unit change in the independent variable

upward bias.

if E(B̃1) > B1, then we say that B̃1 has an upward bias

statistically significant

if a regression coefficient is different from zero in a two-sided test, the corresponding variable is said to be...

reject null in F-test

if the unconstrained equation fits significantly better than the constrained equation

In the simple linear regression model y = β0+β1x+u E (u|x) = 0 the regression slope

indicates by how many units the conditional mean of y increases, given a one unit increase in x.

a binary/dummy variable

is a variable that only ever takes the values of 0 or 1

linear probability model

is heteroskedastic by definition

The population parameter in the null hypothesis

is not always equal to zero.

rejection rule

is that H0 is rejected in favor of H1 at the 5% significance level if t>c

data frequency

daily, weekly, monthly, quarterly, and annually.

sum of squared residuals (SSR)

the sum of the squared differences between the actual data points (y) and the regression line (y hat)

normal distribution in stata

display normal(z)
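
For example (a minimal sketch; 1.96 is just an illustrative z-value), the command below returns the cumulative probability P(Z ≤ 1.96) ≈ .975:
display normal(1.96)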

Severe Imperfect Multicollinearity

linear functional relationship between two or more independent variables so strong that it can significantly affect the estimations of coefficients.

Linear Regression

means a regression that is linear in the parameters. It may or may not be linear in the variables.

R (bar) ^2 (adjusted R^2)

measures the percentage of the variation of Y around its mean that is explained by the regression equation, adjusted for degrees of freedom.

Skewness

measures the asymmetry of a distribution (logs can reduce skewness)

The natural log of X

monotonically increasing transformation of X

a general way to run a higher-order terms specification test; includes squares and possibly higher-order fitted values in the regression; rejection implies that some combination of higher-order terms and interactions of your X variables would produce a better model

regression specification error test (RESET)

RSS

regression sum of squares; how much variation does your model explain?

standard error of B

se(Bj) = σ̂ / [SSTj(1 − Rj²)]^(1/2)

partial effect of a change in X on Y

the coefficient in a regression represents the...

Probit coefficients are typically estimated​ using:

the method of maximum likelihood

unbiasedness of the error variance

theorem 2.3

unbiasedness of OLS

theorem 3.1

sampling variances of OLS slope estimators

theorem 3.2

unbiased estimator of the error variance

theorem 3.3

omitted variable bias

when the omitted variable is correlated with the included explanatory variables (and itself affects the dependent variable)

Consider a regression model of a variable on itself: y = β0 + β1y + u. What are the OLS estimators of β0 and β1, β̂0 and β̂1?

β̂0 = 0, β̂1 = 1

Explained Sum of Squares

the total variation of the fitted Y values around their average (i.e. the variation that is explained by the regression):

Residual Sum of Squares (RSS)

the unexplained variation of the Y values around the regression line: RSS = Σ(i)(Yi − Ŷi)² = Σûi²

RESET test

regression specification error test; tests the null hypothesis that your model is correctly specified. A high test statistic means the model is probably not correct

standard error of ^B1

σ/ √SSTx

variance inflation factor (VIF)

Var(β̂j) = (σ²/SSTj) × VIFj

standard error of the regression (SER)

σ̂ = √(σ̂²)

Var(~Y)

σ²Y/n

SE(~Y)

SE(~Y) = s/√n, the standard error of ~Y

regression R^2=

ESS/TSS

The probit​ model

forces the predicted values to lie between 0 and 1.

β₁

notation for the population slope

alternative hypothesis

H1: B ≠ 0

How to find the OLS estimators ^β₀ and ^β₁

• Solve: ^β₁ = sXY/s²X = Σ(i=1,n)(Xi − ~X)(Yi − ~Y) / Σ(i=1,n)(Xi − ~X)²; ^β₀ = ~Y − ^β₁~X

Steps in regression analysis step 4

4) Collect the data. Inspect and clean the data

semi-elasticity

=100*b1

This is not actually correct

AIC is defined in terms of the log-likelihood. The likelihood is loosely interpreted as the probability of occurrence of an event

in the model ln(y)=B0+B1X1 +u the elasticity of E(Y|X) with respect to X is

B1X

econometric model.

Crime=B0+B1edu+B2exper+B3training+u

conditional mean of Y

E (Y|X) = β₀ + β₁X

errors have zero expected value (not systematically biasing your estimate by a non-zero amount)

Gauss-Markov assumption 4

null hypothesis

Ho: B=0

Infinity

If p= n-1

Multiple Restrictions

More than one restriction of the parameters in an econometric model.

What does B1 represent

The Slope

What does B0 represent

The intercept

t Statistic

The statistic used to test a single hypothesis about the parameters in an econometric model.

Impact of variance

Weights equalize

regress squared residuals on all explanatory variables, their squares, and interactions; detects more general deviations from homoskedasticity than the BP test

White test for heteroscedasticity

Residual

^ui=Yi-^Yi=Yi-b₀-b₁Xi

classical assumption 3

all explanatory variables are uncorrelated with the error term

OLS estimates

best fit

Y-hat =

b₀ + b₁X

binary variables

can take on only two values

f distribution in stata

does my model matter? display F(df1, df2, z) — a high F value means it's unlikely that your model doesn't matter (ergo... it matters)
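
For example (a minimal sketch; the degrees of freedom and F-value are illustrative), Stata's F() gives the cumulative probability and Ftail() the upper-tail p-value:
display Ftail(5, 40, 2.47)
This returns roughly .048.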

r̂ij are the residuals from a regression of xj on all other explanatory variables (their squares enter the robust variance formula)

heteroskedasticity-robust OLS standard errors
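
In Stata, these can be requested directly with the vce(robust) option (a minimal sketch; y and x are placeholder variable names):
regress y x, vce(robust)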

rejection rule

if p-value < alpha, we reject null

MSE

mean square error

ols estimator is derived by

minimizing the sum of squared residuals

correlation coefficient

quantifies correlation/dependence measure of the strength and direction of the linear relationship between two variables

What is the difference between u and uhat?

"u represents the deviation of observations from the population regression line, while uhat is the difference between Wage and its predicted value Wage."

If we can reject the null hypothesis that β1=0 then we say that

"β1 is statistically significant", or "β1 is positive/negative and statistically significant."

residual sum of squares=

Σ(actual − predicted)²

first moment of standardized variable

0

second moment of standardized variable

1

identification solutions

1) generate a variable that we know can't be correlated with other factors because it is random 2) figure out a way of identifying exogenous variation (variation that is random)

Incorporating nonlinearities in simple regression

1. logarithmic transformations of dependent variable 2. logarithmic transformations of both dependent and independent variable

Steps in regression analysis step 2

2) Specify the model: select independent variables and the functional form

Steps in regression analysis step 3

3) Hypothesize the expected signs of the coefficients

Let X be a normally distributed random variable with mean 100 and standard deviation 20. Find two values, a and b, symmetric about the mean, such that the probability of the random variable being between them is 0.99.

48.5, 151.5
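
Worked check: the central 99% of a normal distribution lies within z = ±2.576 standard deviations, so a = 100 − 2.576 × 20 ≈ 48.5 and b = 100 + 2.576 × 20 ≈ 151.5.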

Steps in regression analysis step 6

6) Document Results

A multiple regression includes two regressors: Yi = B0 + B1 X1i + B2 X2i + ui What is the expected change in Y if X1 increases by 8 units and X2 decreases by 9 units?

8B1 - 9B2

one-tailed test

>2.1%

Causal Effect

A ceteris paribus change in one variable that has an effect on another variable

Cross-sectional Data Set

A data set collected by sampling a population at a given point in time

Which of the following statements is true of hypothesis testing?

A restricted model will always have fewer parameters than its unrestricted counterpart.

How to pick the best model in portfolio

Adjusted R Squared, Akaike's Information Criterion (AIC), Schwarz Information Criterion, Hannan-Quinn Information Criterion

less than or equal to R squared

Adjusted R squared is

Rbar^2

Adjusted R² penalizes adding another independent variable, because R² will never decrease when another independent variable is added

Of the following assumptions, which one(s) is (are) necessary to guarantee unbi- asedness of the OLS estimator in a multiple linear regression context? a) Linearity of the model in the parameters. b) Zero conditional mean of the error term. c) Absence of perfect multicollinearity. d) Homoskedasticity of the error term. e) Random sampling.

All except d

Assumption 3

All explanatory variables are uncorrelated with error term

Assumption 3

All explanatory variables are uncorrelated with the error term

Two sided test

Alternative hypothesis has values on both sides of the null hypothesis

the slope estimate

B1=Δy/Δx

simplest way to test for heteroskedasticity; regress residuals on our Xs

Breusch-Pagan test

Perfect Multicollinearity

Cannot invert (X'X) matrix

Dummy Variables

Capture changes due to qualitative events: seasons, strikes, war/flood/storm

different units (households, firms, industries, states, countries); differences in characteristics across units result in variation in the disturbance terms

Cross-sectional data is concerned with

Retrospective Data

Data collected based on past, rather than current, information.

Residual (e)

Difference between the estimated value of the dependent variable and the actual value of the dependent variable: e = Y − Ŷ

Type 2 error

Failing to reject a false null hypothesis

An unbiased estimator with the smallest variance is

EFFICIENT

Property two of variance

For any constants a and b, Var(aX + b) = a^2Var(x)

one-sided alternative hypothesis

H1: Bj>0

Stationary

If both the mean and variance are finite and constant, then the time series is said to be

Rejection Rule

In hypothesis testing,the rule that determines when the null hypothesis is rejected in favor of the alternative hypothesis.

Sum of Squared Residuals (SSR)

In simple and multiple regression analysis, the sum of the squared OLS residuals across all observations.

Goodness of fit 3

Is the data set reasonably large and accurate?

Goodness of fit 1

Is the equation supported by theory?

T-test

Is the appropriate test to use when the stochastic error term is normally distributed and when the variance of that distribution must be estimated

Sampling Distribution

Just as the error term follows a probability distribution, so too do the estimates of β. In fact, each different sample of data typically produces a different estimate of β. The probability distribution of these β̂ values across different samples is called the sampling distribution of β̂.

Simple Regression

Linear regression model with one regressor

Method one: Method of Moments

MOM estimators are constructed by replacing the population moment, such as µ with its unbiased sample counterpart, such as ¯x. For the method of moments estimators, we use the sample counterparts to choose our estimates of β0 and β1.

Variance

Measures how dispersed the data is. How far away x is from the population mean

Correlation

Measures how linear the data is (how related the variables x and y are)

The detection of Multicollinearity

Multicollinearity exists in every equation; the important question is how much. The severity can change from sample to sample. There are no generally accepted, true statistical tests for multicollinearity.

Panel/Longitudinal data

Multiple individuals observed over multiple time periods. Example: vouchers for low-income families to move to higher-income areas, with those people tracked over time

Does omitting a variable in our regression always cause OMITTED VARIABLE BIAS?

No

Assumption 6

No explanatory variable is a perfect linear function of any other explanatory variable (no perfect multicollinearity)

Exogenous variables

Not correlated with the error term

adding independent variables

R squared can be made arbitrarily large by

overall significance of the regression

F = (R²/k) / ((1 − R²)/(n − k − 1))

Assumption 1

Regression is linear, correctly specified, and has an additive error term

Type 1 error

Reject a true hypothesis

Type 1 error

Reject the null hypothesis that is actually true

Chow test

Run separate regressions, the new unrestricted SSR is given by the sum of the SSR of these two separate regressions, then just run a regression for the restricted model

Coefficient of Determination (R²)

R² = ESS/TSS = 1- RSS/TSS

Response variable

See dependent variable

Binary variables

Success or failure x=1 or x=0

Which of the following tests helps in the detection of heteroskedasticity?

The Breusch-Pagan test

True Model

The actual population model relating the dependent variable to the relevant independent variables, plus a disturbance, where the zero conditional mean assumption holds.

Variance inflation factor

The amount by which the variance is inflated due to multicollinearity

Consistency

The first asymptotic property of estimators concerns how far the estimator is likely to be from the parameter it is supposed to be estimating as we let the sample size increase indefinitely

Data Frequency

The frequency at which time series data are collected. Yearly, quarterly, and monthly are the most common data frequencies

OLS Intercept Estimate

The intercept in an OLS estimation.

The residual

The residual for observation i is the difference between the actual yi and its fitted value: ûi = yi − ŷi = yi − β̂0 − β̂1xi. These are not the same as the errors in the population regression function.

Sample Variation in the explanatory variable

The sample outcomes on x are not all the same value

Dependent Variable

The variable to be explained in a multiple regression model

Weighted Least Squares

A method of estimation in which the data, both the Xs and Y, are weighted by a factor depending on the VGP

T/F Although the expected value of a random variable can be positive or negative its variance is always non-negative.

True, follows from the definition of the variance of a random variable.

What does the measure of goodness of fit tell us?

Useful for evaluating the quality of the regression and comparing models that have different data sets or combinations of independent variables

Method two: Least Squares Estimator

Using calculus, we set the derivative wrt the model parameters of the objective function, the sum of the squared error, equal to zero

It is binomially distributed

Which of the following assumptions about the error term is not part of the so-called Classical Assumptions?

linear in the variables

X vs Y gives a straight line

population simple linear regression model

Y = β₀ + β₁X + u

The OLS residuals, u i, are defined as follows:

Yi - Yhat i

Confidence Interval for β₁

^β₁ ± z(1−α/2) × SE(^β₁), i.e. [^β₁ − z(1−α/2) × SE(^β₁), ^β₁ + z(1−α/2) × SE(^β₁)]

An estimate is:

a nonrandom number.

AIC (akaike's information criterion)

adjusts RSS for the sample size and K; the lower it is, the better the equation; penalizes more for irrelevant variables than Ramsey does

We would like to predict sales from the amount of money insurance companies spent on advertising. Which would be the independent variable?

advertising.

As the sample size increases the variance for B1 ______

decreases

ols residuals, ui, are sample counterparts of the population

errors

the fixed effects regression model

has n different intercepts

univariate model

has u as the error term (the unknown component not accounted for in the y = b0 + b1x1 model)

Xi

independent variable, or regressor, or explanatory variable, or covariate

β₀

intercept. • It measures the average value of Y when X equals zero (this is often not economically meaningful by itself)

The question of​ reliability/unreliability of a multiple regression depends​ on:

internal and external validity.

"Changing the units of measurement that is, measuring test scores in 100s, will do all of the following except for changing the:"

interpretation of the effect that a change in X has on the change in Y

The correlation coefficient

is a measure of linear association.

The population correlation coefficient between two variables X and Y

is a measure of linear association.

The size of the test

is the probability of committing a type I error.

omitted variable bias

occurs when a regressor (the student-teacher ratio) is correlated with a variable that has been omitted from the analysis (the percent of English learners) and that in part determines the dependent variable (test scores); the OLS estimator will then have omitted variable bias

one of the primary advantages of using econometrics over typical results from economic theory is

it potentially provides you with quantitative answers for a policy problem rather than simply suggesting the direction of the response

constant elasticity model.

log(yi) = b0 + b1xi + ui; rescaling y by a constant c1 gives log(c1·yi) = [log(c1) + b0] + b1xi + ui

semi-elastic

log-level 100 x beta j

elastic

log-log

The Adjusted R^2 is always ___________ than the R^2

lower

maximum likelihood estimation yields the values of the coefficients that

maximize the likelihood function

sum(aXi + b) equals

n × a × X̄ + n × b

denominator degrees of freedom

n-k-1=df

proxy variable approach

need a proxy correlated with x and uncorrelated with u; the best proxy is often a lagged version of the same variable

M (in the F-test)=

number of constraints

Experimental data

often collected in laboratory environments in the natural sciences, but they are much more difficult to obtain in the social sciences.

Sources of variation in Y other than Xs:

omitted variation, measurement errors, functional form problems, random variation/unpredictability

respondent bias

people misrepresent themselves and give answers they think you want to hear

one way of dealing with omitted variables (but imperfect); think of running them as something more like a specification test, or a test for the influence of possible omitted variable bias (worried about nonrandom sampling)

proxy variables

correlate in stata

pwcorr

increasing the sample size

reduces the variance

regress in stata

regress y x

if the absolute value of your calculated t statistic exceeds the critical value from the standard normal distribution, you can

reject the null hypothesis

p value

smallest significance level at which the null would be rejected

p value

smallest significance level at which the null hypothesis would still be rejected; small value is evidence against the null hypothesis

efficient estimator

smallest variance and unbiased

cross-sectional

snapshot at a point in time

σ/ √SSTx

standard error of ^B1

central limit theorem (asymptotic law)

states that, given a sufficiently large sample size from a population with a finite variance, the distribution of the sample mean is approximately normal and centered at the population mean

measures how many estimated standard deviations the estimated coefficient is away from zero

t statistic

average treatment effect

tells us what on average our unit treatment effect is (the average difference between all the treated (Z=1) and control (Z=0) groups)

F-Test

test the naive model against the sophisticated model

structural estimation

testing the model and its assumptions deliberately and explicitly

regression

tests the relationship between variables

A large p value implies

that the observed value Ȳ is consistent with the null hypothesis

first order conditions

the OLS first order conditions can be obtained by the method of moments: under the assumptions E(u) = 0 and E(xj·u) = 0 for j = 1, 2, ..., k. The equations are the sample counterparts of these population moments, although we have omitted the division by the sample size n.

We are able to estimate the true β₀ and β₁ more accurately if we use

the OLS method rather than any other method that also gives unbiased linear estimators of the true parameters. This is because the OLS estimators are BLUE (best linear unbiased estimators)

In the case of​ errors-in-variables bias, the precise size and direction of the bias depend​ on

the correlation between the measured variable and the measurement error.

Heteroskedasticity is a problem with the

error term

R^2 is a measure of

the goodness of fit of a line

Comparing the California test scores to test scores in Massachusetts is appropriate for external validity​ if

the institutional settings in California and​ Massachusetts, such as organization in classroom instruction and​ curriculum, were similar in the two states.

OLS sampling errors

the variance in our estimator lets us infer the accuracy of a given estimate

heteroskedasticity

the variance of the error term increases as X increase, violating Assumption V

1. If the dependent variable is multiplied by a constant,

then the OLS intercept and slope estimates are also multiplied by that constant

normal sampling distributions

theorem 4.1

the slope estimator B1 has a a smaller standard error, other things equal, if

there is more variation in the explanatory variable X

The OLS slope estimator, β1, has a smaller standard error, other things equal, if

there is more variation in the explanatory variable, x

measurement error inconsistency

this factor (which involves the error variance of a regression of the true value of x1 on the other explanatory variables) will always be between zero and one; implies we are consistently biased towards zero

rejecting a true hypothesis

type 1 error

serial correlation

violates Classical Assumption IV because observations of the error term ARE correlated (based on past data)

causal effect

when one variable has an effect on another variable

well identified model

when we can argue that an estimator is likely to yield an unbiased estimate of our parameter of interest (one can correctly "identify" what plays a causal role and what does not)

ceteris paribus

which means "other (relevant) factors being equal"

participation bias

whoever chooses to participate is inherently biased

Non-Linear function of the independent variables examples

• Yi = β₀+β₁X²i+ui • Yi = β₀+β₁1/Xi+ui • Yi = β₀+β₁log(Xi)+ui

Distribution of OLS estimators

• ^β₁~ N(β₁, Var(^β₁)) • ^β₀~ N(β₀, Var(^β₀))

Median

"central tendency" of data Half of the data is above and the other half of the data is below

F=

F = ((RSSm − RSS)/M) / (RSS/(N − K − 1))

For n = 25, sample mean =645, and sample standard deviation s = 55, construct a 99% confidence interval for the population mean.

(614.23, 675.77)
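
Worked check: with df = 24, t(0.005, 24) ≈ 2.797, so the interval is 645 ± 2.797 × 55/√25 = 645 ± 30.77, i.e. (614.23, 675.77).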

adjusted R^2=

1 − (ESS/(n−k−1)) / (TSS/(n−1)), where ESS here denotes the error (residual) sum of squares

t-statistic

(^β₁-β₁)/SE(^β₁)

State whether the following statements about heteroskedasticity are true or false. If false - give reasons why. (a) Heteroskedasticity occurs when the disturbance term in a regression model is correlated with one of the explanatory variables.

(a) False - Heteroskedasticity occurs when the variance of the disturbance term is not the same for all observations.

total sum of squares=

Σ(actual − average)²

(t/f) When R2 = 1 F=0 and when R2=0 F=infinite.

False; it is the reverse: from F = (R²/k)/((1−R²)/(n−k−1)), R² = 1 gives F = ∞ and R² = 0 gives F = 0.

Adjusted R²

1-((n-1)/(n-k-1))((RSS)/(TSS))

Properties of R²

1. 0 ≤ R² ≤ 1 2. For the simple regression, R² = ρ²YX

Non linear functions that model economic phenomena

1. Diminishing marginal returns 2. Estimating the price elasticity of demand

FOUR Assumptions required for unbiasedness of OLS

1. Linear in parameters 2. Random Sampling 3. Sample variation in independent variable 4. Zero Conditional mean

Randomized Controlled experiement SRTE

1. Sampling 2. Randomization, random assignment 3. treatment 4. Estimation

Steps for finding covariance

1. find x̄ and ȳ 2. find dx and dy (x − x̄, y − ȳ) 3. find (dx·dy) 4. sum 5. divide by n − 1 (see the worked example below)
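
Worked example (made-up numbers): for x = (1, 2, 3) and y = (2, 4, 9), x̄ = 2 and ȳ = 5; dx = (−1, 0, 1) and dy = (−3, −1, 4); dx·dy = (3, 0, 4), which sums to 7; dividing by n − 1 = 2 gives a sample covariance of 3.5.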

OLS will have omitted variable bias when two conditions hold

1. when the omitted variable is correlated with the included regressor 2. when the omitted variable is a determinant of the dependent variable another way to say it 1. X is correlated with the omitted variable 2. The omitted variable is a determinant of the dependent variable Y

sample covariance

(1/(n−1)) × Σ(xi − x̄)(yi − ȳ)

New assumption

The p independent variables are linearly independent: no independent variable can be written as a linear combination of the other p−1 variables

Heteroskedasticity

Patterned error terms

Steps for hypothesis test

State the null and alternative hypotheses. Calculate the value of the test statistic (z or t, as appropriate). Draw a picture of what Ha looks like and find the p-value. State a concluding sentence.

Null hypothesis (H0)

Statement of the values that the researcher does not expect

Alternative hypothesis (HA)

Statement of the values that the researcher expects

linear in parameters

assumption SLR.1

they are not independent and each will have a different distribution conditional on the other

if knowing something about random variable B tells you something about variable A then...

reject

if | t-statistic |>critical value, we ___________ the null hypothesis

log just the independent variable

impact of independent variable increases at a decreasing rate

how do you control for the person specific and time specific effects in panel regression data

include person-specific and period-specific (dummy) variables in the regression

may increase sampling variance

including irrelevant variables...

irrelevant variable added

increase variance, decrease t-scores, usually decrease adjusted R^2 but not regular R^2

log just the dependent variable

increasing the independent variable causes the dependent variable to increase at an increasing rate

whether two random variables are related (or not)

independence and conditionality are how we discuss...

Because of random sampling, the observations (Xi, Yi) are

independent of each other and identically distributed (they are i.i.d.)

In the simple linear regression model, the regression slope

indicates by how many units Y increases, given a one unit increase in X.

In the simple linear regression model y = β0+β1x+u E (u|x) = 0 the regression slope

indicates by how many units the conditional mean of y increases, given a one unit increase in x.

a small p value

indicates evidence against the null hypothesis

estat ovtest

is my model omitting variables?

if the absolute value of the calulated t statistic exceeds the critical value from the standard normal distribution you can

reject the null hypothesis

To obtain the slope estimator using the least squares principle, you divide the

sample covariance of X and Y by the sample variance of X.

In the simple regression model y = β0 + β1x + u, to obtain the slope estimator using the least squares principle, you divide the

sample covariance of x and y by the sample variance of x.

standard error of Bˆ1

se(B1) = σ/√SSTx = σ/[Σ(x − x̄)²]^(1/2)

consistency of OLS (estimator is consistent for a population parameter)

theorem 5.1

asymptotic normality of OLS

theorem 5.2

The interpretation of the slope coefficient in the model Yi = β0 + β1 ln(Xi) + ui is as follows:

a 1% change in X is associated with a change in Y of 0.01 × β1.

Stochastic error term

A term that is added to a regression equation to introduce all of the variation in Y that cannot be explained by the included Xs

F statistic

is used to test joint hypotheses about regression coefficients, e.g. the two restrictions B1 = 0 and B2 = 0. The F stat is the average of the squared t stats: if the t stats are uncorrelated, then F = ½(t1² + t2²)

the assumption of constant variance of the error term

is violated by the linear probability model

the r squared of the restricted model

is zero by definition

VALID INSTRUMENT : Z

uses an instrument variable Z to isolate the part of X that is uncorrelated with u

Fixed effect

it captures anything that is specific to an individual but does not change over time

Testing the hypothesis that β1 = 0 is somewhat special, because

it is essentially a test of whether or not Xi has any effect on Yi.

if people are not honest in a survey

it may lead to a non-random measurement error

regression of Y on an indicator variable for treatment X which takes on the value 1 when treatment occurred and 0 otherwise; 1 if person is a woman, 0 if person is a man

dummy variable

investigator bias

easy to influence the outcome via subjective lens

OLS is best and unbiased (under assumptions MLR.1 - MLR.5 ); want unbiased estimator with the smallest variance (doesn't have to be OLS)

efficiency of OLS

under assumptions MLR.1 - MLR.5, OLS is best and unbiased; want unbiased estimator with the smallest variance (doesn't have to be OLS)

efficiency of OLS

When testing joint​ hypotheses, you can​ use

either the F​-statistic or the​ chi-squared statistic.

Of the following assumptions, which one(s) is (are) necessary to guarantee unbi- asedness of the OLS estimator in a multiple linear regression context? a) Linearity of the model in the parameters. b) Zero conditional mean of the error term. c) Absence of perfect multicollinearity. d) Homoskedasticity of the error term. e) Random sampling.

All of the above except d).

In a random sample

All the individuals or units from the population have the same probability of being chosen

In a random sample:

All the individuals or units from the population have the same probability of being chosen.

Hannan Quinn Information Criterion

Alternative to AIC; used infrequently. Look for the model with the smallest H-Q value

Degrees of freedom

Excess of the number of observations (N) over the number of coefficients estimated (K+1)

F-statistic

F = ((RSSC − RSSU)/m) / (RSSU/(n − k − 1)) ~ F(m, n−k−1), where m = # of linear restrictions and k = number of regressors in the unconstrained model

Denote SSRr the sum of squared residuals of a restricted model and SSRur the sum of squared residuals of an unrestricted model, n the sample size and k the number of regressors of the unrestricted model. All the following are correct formulae for the F-statistic for testing q restrictions with the exception of

F = ((SSRur−SSRr)/q)/(SSRr/(n-k-1))

Confidence interval

Set of values at which we fail to reject the null

Upward Bias

The expected value of an estimator is greater than the population parameter value.

Consider a regression model where the R2 is equal to 0.2359 with n = 46 and k = 5. You are testing the overall significance of the regression using the F -Test. What is the p-value of the test?

The p-value is approximately 5% (but a little bit smaller than that).
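
Worked check: F = (0.2359/5) / ((1 − 0.2359)/(46 − 5 − 1)) ≈ 2.47; in Stata, display Ftail(5, 40, 2.47) returns roughly .048.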

Intercept Parameter (Constant)

The parameter in a simple and multiple linear regression model that gives the expected value of the dependent variable when all the independent variables equal zero.

The effect of a one unit increase in X on Y is measured by

The slope of a line relating Y to X. It is β₁in the equation Y=β₀+β₁X • β₁= ΔY/ΔX

p-Value

The smallest significance level at which the null hypothesis can be rejected. Equivalently, the largest significance level at which the null hypothesis cannot be rejected.

σ^2

population variance

Park test

take the residuals from the estimated regression, square them and take logs to form the dependent variable, and run a regression against the log of the suspected proportionality factor Z; if the t-statistic is large, that is evidence of heteroskedasticity

In the case of​ errors-in-variables bias

the OLS estimator is consistent if the variance in the unobservable variable is relatively large compared to the variance in the measurement error.

BLUE

the OLS estimator with the smallest variance in the class of linear unbiased estimators of the parameters

In a randomized controlled experiment

there is a control group and a treatment group.

beta 1 hat has a smaller standard error, ceteris paribus, if

there is more variation in the explanatory variable, x

The OLS slope estimator, β1, has a smaller standard error, other things equal, if

there is more variation in the explanatory variable, x.

if the significance level is made smaller and smaller

there will be a point where H0 cannot be rejected

"If the ommitted variable biased exists, what is the effect, if any, on our regression estimations (the numbers Stata gives us for the Betas/coefficients)"

they are biased and inconsistent

discrete variable

variable with countable outcomes within an interval, e.g. rolling a die

Endogenous Variable

variables correlated with the error term

1) often mitigates influence of outlier observations 2) can help to secure normality and homoscedasticity

why take logs?

1) weakens influence of outlier observations 2) can help secure normality and homoscedasticity

why take logs?

1) one sided tests imply a certain amount of certainty in one's model 2) set the bar for lower significance

why use two sided tests?

In the multiple regression model, the adjusted R2, R2

will never be greater than the regression R2

Let ŷ be the fitted values. The OLS residuals, û, are defined as follows:

y − ŷ

Properties of the Adjusted R²

• Adjusted R² ≤ R²
• Adding a regressor to the model has two opposite effects on the adjusted R²: 1. the SSR falls, which increases the adjusted R²; 2. the factor (n−1)/(n−k−1) increases, which decreases the adjusted R². (Whether the adjusted R² increases or decreases depends on which of the two effects is stronger.)
• The R² is always positive, but the adjusted R² can sometimes be negative
• The adjusted R² can be used to compare two regressions with the same dependent variable but different numbers of regressors

The bias of ^β₁is

• Cov(Xi,ui)/Var(Xi) • does not depend on the sample size

Goodness of fit 4

Is OLS the best estimator?

Ordinary Least squares

Is a regression estimation technique that calculates the B^ so as to minimize the sum of the squared residuals

Slope Dummy

A quantitative variable and dummy variable interaction is typically called a

Type I Error

A rejection of the null hypothesis when it is true.

Continuous random variable

A variable that takes on any particular real value with zero probability. Example: the speed of a car; there is no way to tell exactly what speed the car will be going at a particular moment

Dummy Variable, or binary variable,

A variable that takes on only the values 0 or 1

Which of the following is true of the standard error the OLS slope estimator, s.e. β ?

It is an estimate of the standard deviation of the OLS slope estimator.

Properties of variance

It is desirable for the sampling distribution to be as narrow (as precise) as possible

Perfect Multicollinearity

It is the case where the variation in one explanatory variable can be completely explained by movements in another explanatory variable.

total sample variation in explanatory variable xj (converges to n * var(xj)); more sample variation leads to more precise estimates so just increase sample size (good variation)

SST (variance)

How to get a log out of an equation

Exponentiate the other side: if log(y) = 10.6, then y = e^10.6
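
A quick check in Stata (10.6 is just the example value): display exp(10.6) returns about 40134.8.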

Slope Parameter

The coefficient on an independent variable in a simple and multiple regression model

Elasticity

The percentage change in one variable given a 1% ceteris paribus increase in another variable.

no multicollinearity (model is estimating what it should be)

assumption MLR.3

Fitted value

We can define a fitted value for y when x=xi as ^yi=^B0+^B1xi This is the value we predict for y when x=xi for the estimated intercept and slope

Consider a regression model where the R2 is equal to 0.257 with n = 33 and k = 3. You are testing the overall significance of the regression using the F-Test. Which of the statements below is correct?

We can reject the null hypothesis at the 5% level of significance but not at the 1% level of significance.
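
Worked check: F = (0.257/3) / ((1 − 0.257)/(33 − 3 − 1)) ≈ 3.34; the critical values of F(3, 29) are roughly 2.93 at the 5% level and 4.54 at the 1% level.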

Take an observed (that is, estimated) 95% confidence interval for a parameter in a multiple linear regression model. Then:

We cannot assign a probability to the event that the true parameter value lies inside that interval.

Y =

average Y for given X + Error

constant elasticity model

both variables are in logarithmic form

Irrelevant Variable

a variable in an equation that doesn't belong there Decreases R(bar)^2

correlation between X and Y

can be calculated by dividing the covariance between X and Y by the product of the two standard deviations

internal validity of the model

concerns about identification within an econometric model

causes of heteroskedasticity

1) misspecification (omitted variables) 2) outliers 3) skewness 4) incorrect data transformation 5) incorrect functional form for the model 6) improved data collection procedures

IV procedure

1) regress X on Z 2) take predicted X values 3) regress y on predicted Xs (if Z only influences Y through X, then we have identified variation in X that is NOT jointly determined)

What is captured by the error term?

1. Omitted or left out variables 2. measurement error in data 3. different functional form than the chosen regression 4. unpredictable events or purely random variation

What conditions are necessary for an omitted variable to cause OVB?

1. The omitted variable has an effect on Y (i.e. β₂is different from O). 2. The omitted variable is correlated with one of the regressors (in our example, Ai is correlated with Xi)

Omitted Variable Bias (OVB) occurs when

Assumption 1 of OLS is violated i.e. when E (ui|Xi=xi) depends on the value of xi

False

Autocorrelation occurs primarily with cross-sectional data

Law of large numbers

Cannot survey the entire population; too expensive and would take too long. The larger the sample size, the closer the sample mean gets to the actual population mean

Type II Error

Failing to reject the null hypothesis when it is, in fact, false

(t/f) The adjusted and unadjusted R2s are identical only when the unadjusted R2 is equal to 1.

True; from R̄² = 1 − (1 − R²)(n−1)/(n−k−1), the two coincide when R² = 1.

zero conditional mean assumption

When we combine mean independence with the assumption E(u) = 0, we obtain the zero conditional mean assumption E(u|x) = 0

Var(aX + bY ) is

a²σ²X + 2abσXY + b²σ²Y

Nonexperimental data

not accumulated through controlled experiments on individuals, firms, or segments of the economy (also called observational data, or retrospective data, to emphasize the fact that the researcher is a passive collector of the data)

Let σ²X be the population variance and X̄ be the sample mean. Var(X̄) is:

equal to σ²X divided by the sample size.

Let σ²X be the population variance and X̄ be the sample mean. Var(X̄) is

equal to σ²X divided by the sample size.

example of time series data

studying inflation in us from 1970 to 2006

SSE

sum of squared errors; how much of reality does your model not explain?

SSxy

sum of the cross-products

SSE

sum of the squared errors

significance level

the probability of rejecting H0 when it is in fact true.

causality

the relationship between cause and effect

In testing multiple exclusion restrictions in the multiple regression model under the Classical Linear Model assumptions, we are more likely to reject the null that some coefficients are zero if:

the residual sum of squares of the restricted model is large relative to that of the unrestricted model.

ρ²YX

the squared sample correlation coefficient between Y and X

Sample mean

unbiased estimator for population mean

U(Residual error)

u = y − ŷ; the difference between the actual value and the estimated value

Non-linear in the parameters

• Yi = β₀ + β₁²Xi • Yi = β₀ + β₀β₁Xi

Linear function of the independent variables:

• Yi = β₀+β₁Xi+ui

what kind of p-value to reject the null?

SMALL

SST formula

SSE + SSR = SST

the explained sum of squares (SSE)

SSE = Σ(Ŷi − Ȳ)²

the residual sum of squares (SSR)

SSR = Σûi²

Predictor Variable

See explanatory variable

Control Variable

See explanatory variables

Economic Significance

See practical significance

Residual Sum of Squares (SSR)

See sum of squared residuals

t Ratio

See t statistic

homoskedasticity

Variance of disturbance term is constant

Heteroskedasticity

Variance of disturbance term is not constant

3. The goodness of fit model, as measured by the R-squared

does not depend on the units of measurement of our variables

Low R2

doesn't mean OLS is useless; it is still possible that our OLS gives a good estimate

if MLR3 is violated with a perfect linear function

drop one of the independent variables

A binary variable is often called a:

dummy variable

How do we interpret B2 in words Yi = B0 + B1X1i + B2X2i + ui

"A 1 unit increase in X2 is associated with a B2 increase in Y on average, holding all else constant"

"Suppose that a researcher, using wage data on 235 randomly selected male workers and 263 female workers, estimates the OLS regression Wage = 11.769 + 1.993 × Male, Rsq= 0.04, SER= 3.9, with the respective standard errors for the constant and the Male coefficient of (0.2162) and (0.3384). Wage is measured in dollars per hour and Male is a binary variable that is equal to 1 if the person is a male and 0 if the person is a female. What does an Rsq of 0.04 mean ?"

"It means using our explanatory variable, we are only able to explain 4% of the variation in the dependant variable"

SSE/SST

% explained variation (equation)

R-squared form of the F statistic.

F = ((Rur² − Rr²)/q) / ((1 − Rur²)/dfur)

F statistic

F = ((SSRr − SSRur)/q) / (SSRur/(n − k − 1))

Consider the regression line testscore = 698.9 − 2.28 × STR. You are told the t statistic on the slope coefficient is 4.38. What is the standard error of the slope coefficient?

.52

Consider the following regression line: testscore = 698.9 - 2.28 × STR. You are told that the t-statistic on the slope coefficient is 4.38. What is the standard error of the slope coefficient?

.52
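
Worked check: SE = |coefficient| / t = 2.28/4.38 ≈ 0.52.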

A box has 20 screws, three of which are known to be defective. What is the probability that the first two screws taken out of the box are both defective?

0.0158
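
Worked check: P = (3/20) × (2/19) = 6/380 ≈ 0.0158.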

Find the probability that a standard normal random variable has a value greater than -1.56.

0.9406
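
Worked check: P(Z > −1.56) = Φ(1.56) ≈ 0.9406; in Stata, display normal(1.56).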

Steps in regression analysis step 1

1) Review literature and develop theoretical model

partialling out

1) regress the explanatory variable X1 on all other explanatory variables 2) regress Y on X's 3) regress the residuals of Y on the residuals of X1

Steps in regression analysis step 5

5) Estimate and evaluate the equation

Confidence Interval for a Single Coeeff in Multiple Regression

95% CI: (Bj − 1.96·SE(Bj), Bj + 1.96·SE(Bj))

-5

A fitted regression equation is given by Ŷ = 20 + 0.75X. What is the value of the residual at the point X=100, Y=90?
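
Worked check: Ŷ = 20 + 0.75 × 100 = 95, so the residual is 90 − 95 = −5.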

One-Tailed Test

A hypothesis test against a one-sided alternative.

Zero Conditional Mean Assumption

A key assumption used in regression analysis that states that, given any values of the explanatory variables, the expected value of the error equals zero.

Multiple Linear Regression Model

A model linear in its parameters, where the dependent variable is a function of independent variables plus an error term.

Empirical Analysis

A study that uses data in a formal econometric analysis to test a theory, estimate a relationship, or determine the effectiveness of a policy.

Best Linear Unbiased Estimator (BLUE)

Among all linear unbiased estimators, the one with the smallest variance. OLS is BLUE, conditional on the sample values of the explanatory variables, under the Gauss-Markov assumptions.

Classical Error Term

An error term satisfying Assumptions I through V (of the classical assumptions) (Called classical normal error term if assumption VII is added)

(a) What is meant by an unbiased estimator?

An estimator is unbiased if the mean of its sampling distribution equals the true parameter. The mean of the sampling distribution is the expected value of the estimator. Thus lack of bias means that E(θ̂) = θ, where θ̂ is the estimator of the true parameter θ. This means that in repeated random sampling we get on average the correct estimate.

Experiment

Any procedure that can, in theory, be infinitely repeated and has a well defined set of outcomes

Goodness of fit 6

Are obviously important variables included?

micronumerosity

Arthur Goldberger defines as the "problem of small sample size."

Consider the multiple regression model with two regressors X1 and X2, where both variables are determinants of the dependent variable. You first regress Y on X1 only and find no relationship. However, when regressing Y on X1 and X2, the slope coefficient β1 changes by a large amount. This suggests that your first regression suffers from:

omitted variable bias.

Specific form

B0+B1R1+B2M1+E

95% confidence interval for B0 is interval

B0-1.96SE (B0), B0+1.96SE(B0)

OLS regression line / sample regression function (SRF): ŷ = B̂0 + B̂1x1 + ... + B̂kxk

B̂0 = OLS intercept estimate; B̂1, ..., B̂k = OLS slope estimates

"A professor decides to run an experiment to measure the effect of time pressure on final exam scores. He gives each of the 400 students in his course the same final exam, but some students have 90 minutes to complete the exam while others have 120 minutes. Each student is randomly assigned one of the examination times based on the flip of a coin. Let Yi denote the number of points scored on the exam by the ith student ( 0 less than or equl to Yi less than or equal to 100), let Xi denote the amount of time that the student has to complete the exam (Xi = 90 or 120), and consider the regression model Yi = Beta0 + Beta1 Xi + ui , E( ui) = 0 The Least Squares Assumptions Reminder 1. The error term ui has conditional mean zero given Xi : Yi = Beta0 + Beta1 Xi + ui , i = 1,..., n where E (u|Xi)= 0; 2. Xi ,Yi, i = 1,..., n, are independent and identically distributed (i.i.d.) draws from their joint distribution; and 3. Large outliers are unlikely: Xi and Yi have nonzero finite fourth moments. Assuming this year's class is a typical representation of the same class in other years, are OLS assumption (2) and (3) satisfied?"

Both OLS assumption #2 and OLS assumption #3 are satisfied

Formula for correlation

Cov(x,y)/(Sx·Sy), where S denotes the standard deviations

Consider a regression with two variables, in which X1i is the variable of interest and X2i is the control variable. Conditional mean independence requires

E(ui |X1i, X2i) = E(ui |X2i)

Zero Conditional Mean assumption

E(u|x)=0

population regression function (PRF): E(y|x) = B0 + B1x

E(y|x) is a linear function of x. The linearity means that a one-unit increase in x changes the expected value of y by the amount B1. For any given value of x, the distribution of y is centered about E(y|x).

Goodness of fit 8

Free of econometric problems?

sampling is random (not biasing your estimate through selection)

Gauss-Markov assumption 2

Instrument variables

IV regression uses these additional variables as tools to isolate the movements of X that are uncorrelated with u, which in turn permits consistent estimation of the regression coefficients

statistically insignificant

If H0 is not rejected, we say that "xj is statistically insignificant at the 5% level."

multicollinearity

High (but not perfect) correlation between two or more independent variables

Two lines of hypothesis test

Ho=null hypothesis (statement being tested) Ha=the statement we hope or suspect is true instead

Kurtosis

How "thick" are tails of the data How many observations are in the tail of the data

R^2

How much variation in y is explained by a variation in x

Quadratic Functions

How to capture diminishing returns: y = b0 + b1x + b2x². To approximately determine the marginal effect of x on y we use Δy ≈ (b1 + 2b2x)·Δx

Goodness of fit 5

How well do the coefficients correspond to the expectations developed by the researcher before the data were collected?

Goodness of fit 2

How well does the estimated equation fit the data?

The estimated effect of gender gap is statistically significant at the: I. 5% level II. 1% level III. 0.01% level

I,II,III

In the estimated model log(q) = 502.57 − 0.9log(p) + 0.6log(ps) + 0.3log(y), where p is the price and q is the demanded quantity of a certain good, ps is the price of a substitute good and y is disposable income, what is the interpretation of the coefficient on p? (Assume that the Gauss-Markov assumptions hold in the population model.)

If the price increases by 1%, the demanded quantity is estimated to be 0.9% lower, ceteris paribus

In the estimated model log(q)=502.57−0.9log(p)+0.6log(ps)+0.3log(y),where p is the price and q is the demanded quantity of a certain good, ps is the price of a substitute good and y is disposable income, what is the meaning of the coefficient on p? (Assume that the Gauss-Markov assumptions hold in the theoretical model)

If the price increases by 1%, the demanded quantity will be 0.9% lower on average, ceteris paribus.

heteroskedasticity

If the variance of the error term depends on x, the error term exhibits heteroskedasticity (non-constant error variance)

Standard Error of the Regression (SER)

In a simple and multiple regression analysis, the estimate of the standard deviation of the population error, obtained as the square root of the sum of squared residuals over the degrees of freedom

R-squared

In a simple or multiple regression model, the proportion of the total sample variation in the dependent variable that is explained by the independent variable.

Denominator Degrees of Freedom

In an F test, the degrees of freedom in the unrestricted model.

Numerator Degrees of Freedom

In an F test, the number of restrictions being tested.

Null Hypothesis

In classical hypothesis testing, we take this hypothesis as true and require the data to provide substantial evidence against it.

are typically random

In distributed lag models, both X and Y

Degrees of Freedom (df)

In multiple regression analysis, the number of observations minus the number of estimated parameters.

Perfect Collinearity

In multiple regression, one independent variable is an exact linear function of one or more other independent variables.

Explanatory Variable (Independent Variable)

In regression analysis, a variable that is used to explain variation in the dependent variable

Intercept

In the equation of a line, the value of the y variable when the x variable is zero.

is measured at different points in time and it is the relationship between the measurements that is the issue

In time series data, the same unit

For an instrument Z to be valid, it must satisfy two conditions:

Instrument relevance: Corr(Zi, Xi) ≠ 0. Instrument exogeneity: Corr(Zi, ui) = 0.

decision rule

Is a method of deciding whether to reject a null hypothesis

Critical values

Is a value that divides the acceptance region from the rejection region when testing a null hypothesis

Why does the dependent variable appear in logarithmic form?

It allows us to approximate a constant percentage change in the dependent variable, y, due to a change in the independent variable,x

In the estimated model log(q)=502.57−0.9log(p)+0.6log(ps)+0.3log(y),where p is the price and q is the demanded quantity of a certain good, ps is the price of a substitute good and y is disposable income, what is the meaning of the coefficient on ps?

It is the cross-price elasticity of demand in relation to the substitute good and it bears the expected sign.

In the estimated model log(q) = 502.57 − 0.9log(p) + 0.6log(ps) + 0.3log(y), where p is the price and q is the demanded quantity of a certain good, ps is the price of a substitute good and y is disposable income, what is the interpretation of the coefficient on ps?

It is the estimate of the cross-price elasticity of demand in relation to the substitute good and it bears the expected sign.

VIF = 1

No correlation, no variance inflation, no multicollinearity

No Multicollinearity

No linear relationship at all; the regressors are completely orthogonal. Not really typical of economic data

not an issue, but it will rarely occur with economic data

No multicollinearity is

Consider the model grade = β0 + β1study + β2leisure + β3sleep + β4work + u, where each regressor is the number of hours per week a student spends in each one of the named activities. The dependent variable is the student's final grade for BEE1023 Introduction to Econometrics. What assumption is necessarily violated if the weekly endowment of time (168 hours) is entirely spent either studying, or sleeping, or working, or in leisure activities?

No perfect multicollinearity

null vs alternative hypothesis

Null Hypothesis is the statement which we are testing and the alternative hypothesis is the statement which must be true if the null is false. null is result *NOT* expected and alternative is result expected.

Expected Values of OLS estimators

Our objective is to show the OLS estimators provide unbiased estimates of the true population parameters. We need to show that E(^B0) = B0 and E(^B1) = B1

p-value

P-value for a t-value is the probability of observing a t-value of that size or larger in absolute value if the null is true

P value meanings

If p ≤ α, reject H0; if p > α, fail to reject.

False; pairwise correlations can miss multicollinearity that involves a linear combination of three or more variables.

Pairwise correlations will always successfully reveal whether multicollinearity is present in your estimating samples

pooled cross section

Pooling cross sections from different years is often an effective way of analyzing the effects of a new government policy. The idea is to collect data from the years before and after a key policy change

µY

Population mean of Y

Econometrics

Quantitative measurement and analysis of actual economic and business phenomena

The regression sum of squares (SSR) as a proportion of the total sum of squares (SST)

R squared measures

increase as more variables are added

R squared will only

from a regression of explanatory variable on all other independent variables (including a constant) (converges to a fixed number)

R-squared (variance)

Type 1 error

REJECT when you were supposed to accept (false positive)

Regression through the Origin

Regression analysis where the intercept is set to zero; the slopes are obtained by minimizing the sum of squared residuals, as usual (i.e. regression with no intercept)

Ordinary least squares

Regression estimate technique that calculates the beta that minimizes sum of squared residuals

Statistically Significant

Rejecting the null hypothesis that a parameter is equal to zero against the specified alternative, at the chosen significance level.

Type I Error

Rejecting the null hypothesis when it is, in fact, true

Exclusion Restrictions

Restrictions which state that certain variables are excluded from the model (or have zero population coefficients).

One of your friends is using data on individuals to study the determinants of smoking at your university. She is particularly concerned with estimating marginal effects on the probability of smoking at the extremes. She asks you whether she should use a​ probit, logit, or linear probability model. What advice do you give​ her?

She should use the logit or​ probit, but not the linear probability model.

? Unbiasedness of OLS

Show that OLS estimators are unbiased

Unbiasedness of ^B0

Show that the expected value of ^B0 is B0 by plugging in mean

true

T or F: t-statistic is a number, not a random variable

(T/F) If X and Y are independent, the conditional probability density function f(X|Y) is equal to the marginal probability density function fX(X)

TRUE

(T/F) Consider the model, Consumption = β₀ + β₁Wage + ε. The sample regression function estimated with OLS gives you the average (or expected) value of Consumption for each value of Wage.

TRUE.

Units of Measurement

The OLS estimates change in expected ways when the units of measurement of the dependent and independent variables change

Sum of Squared residuals

The OLS estimator chooses ^B0 and ^B1 to make the SSR as small as possible

If repeated samples of the same size are taken, on average their value will be equal to B1

The OLS estimator for Beta 1 is unbiased means

Under the assumption of the Gauss-Markov Theorem, in the simple linear regres- sion model, the OLS estimator is BLUE. This means what?

The OLS estimator is the estimator that has the smallest variance in the class of linear unbiased estimators of the parameters.

False

The Variance Inflation Factor (VIF) is one (1) if there is a large degree of multicollinearity in a multiple regression model.

Which of the following Gauss-Markov assumptions is violated by the linear probability model?

The assumption of constant variance of the error term.

still hold in the general case of multiple regression: Linearity, Unbiasedness, Minimum Variance

The desirable small sample properties

Partial Effect

The effect of an explanatory variable on the dependent variable, holding other factors in the regression model fixed.

Linear in Parameters

The equation is linear in parameters B0 and B1. There are no restrictions on how y and x relate to the original explained and explanatory variables of interest

OLS Regression Line

The equation relating the predicted value of the dependent variable to the independent variables, where the parameter estimates have been obtained by OLS.

Assumption 5

The error term has a constant variance (no Heteroskedasticity)

Assumption 2

The error term has a zero population mean

Ordinary Least Squares

The goal of OLS is to closely "fit" a function with the data. It does so by minimizing the sum of squared errors from the data.

Which of the following is true of the OLS t statistics?

The heteroskedasticity-robust t statistics are justified only if the sample size is large.

Alternative Hypothesis

The hypothesis against which the null hypothesis is tested.

Which of the following is not correct in a regression model containing an interaction term between two independent variables, x1 and x2:

The interaction term coefficient is the effect of a unit increase in √x1x2

Omitted variable bias means the first least squares assumption, E(u|X) = 0, is incorrect. Because this first assumption fails, the OLS estimator is biased, the bias does not vanish even in a very large sample, and the OLS estimator is inconsistent.

The larger ρXu (the correlation between X and u), the larger the bias (p. 182)

Ordinary Least Squares (OLS)

The most common way of estimating the parameters β₀ and β₁

Classical Linear Model (CLM)

The multiple linear regression model under the full set of classical linear model assumptions.

The natural logarithm

The nonlinear function that plays the most important role in econometric analysis: y = log(x). The relationship between y and x displays diminishing marginal returns. 100 · Δlog(x) ≈ %Δx

standard error of the regression (SER)

The positive square root of ^σ², denoted ^σ; it is an estimator of the standard deviation of the error term.

Practical Significance

The practical or economic importance of an estimate, which is measured by its sign and magnitude, as opposed to its statistical significance.

Suppose that the linear probability model yields a predicted value of Y that is equal to 1.3. Explain why this is nonsensical.

The predicted value of Y must be between 0 and 1.

Significance Level

The probability of a Type I error in hypothesis testing.

Joint probability

The probability of two events occurring together (the probability of someone buying a ticket and being a businessman)

Misspecification Analysis

The process of determining likely biases that can arise from omitted variables, measurement error, simultaneity, and other kinds of model misspecification.

Regression Analysis

The study of the relationship between one variable (dependent variable) and one or more other variables (independent, or explanatory, variables).

Which of the following statements is true?

The upper bound of the confidence interval for a population parameter, say β, is given by ^β + critical value · standard error(^β).

Error Term (Disturbance)

The variable in a simple or multiple regression equation that contains unobserved factors which affect the dependent variable. The error term may also include measurement errors in the observed dependent or independent variables.

"If there is multicollinearity among independent variables, then a variable that appears significant may not indeed be so" Is this statement valid?

This statement is erroneous, just the opposite is true. Multicollinearity increases the standard errors and lowers t-statistics. A lower t-statistic is likely to make a variable insignificant rather than significant.

True

The three information criteria give basically the same answer for most problems

normality assumption

To make the sampling distributions of the Bˆj tractable, we now assume that the unobserved error is normally distributed in the population. The population error u is independent of the explanatory variables x1, x2, ..., xk and is normally distributed with zero mean and variance σ^2: u~Normal(0,σ^2).

True/False: Multiple linear regression is used to model annual income (y) using number of years of education (x1) and number of years employed in current job (x2). It is possible that the regression coefficient of x2 is positive in a simple linear regression but negative in a multiple regression.

True

(T/F) The sample average of the OLS residuals is zero.

True. (Σ(i=1,n)^ui)/n = ~Y - (1/n)Σ(i=1,n)^Yi = ~Y - (^β₀ + ^β₁~X) = 0, since ^β₀ = ~Y - ^β₁~X

T/F If two variables are independent their correlation coefficient will always be zero

True. If two variables X and Y are independent, E(XY) = E(X)E(Y), so Cov(X,Y) = E(XY) - E(X)E(Y) = 0 and the correlation coefficient is zero.

T/F The coefficient of correlation will have the same sign as that of the covariance between the two variables.

True. The covariance can be positive or negative but the standard deviation always takes a positive value. Then from the formula to compute the correlation coefficient it must have the same sign as that of the covariance between two variables.

5. For the estimated regression model Y = b1 + b2X + b3X², we will find the minimum value of Y over all values of X occurring at X* = -b2/(2b3) if b3 turns out to be a positive number.

True. When b3 is positive the quadratic is U-shaped, resulting in a minimum.

Multicollinearity

Two or more predictor variables in the multiple regression are highly correlated. One can be linearly predicted from the other

are still BLUE and consistent; the Gauss-Markov Theorem still holds

Under the expanded classical assumptions, the OLS estimators

Interpreting R^2 and adjusted R^2 WHAT THEY DO TELL YOU

They tell you whether the regressors are good at predicting or explaining the values of the dependent variable in the sample of data on hand.

OLS regression line

^y = ^b0 + ^b1x

example of a quadratic regression model is

Y=B0+B1X+B2X^2+u

Theoretical regression equation

Y=B0+B1X1+E

exogenous explanatory variables

Zero Conditional Mean- The error u has an expected value of zero given any values of the independent variables. In other words, E(u|x1, x2, ..., xk)=0.

95% Confidence Interval for β₁

[^β₁- 1.96 x SE(^β₁), ^β₁+1.96 x SE(^β₁)] means we are 95% confident that this interval covers the true β₁

interaction term

a term in a regression where two independent variables are multiplied together and run as their own variable (if we believe that one variable affects our model in different ways for different values of the other variable)

zero

a variable is irrelevant to a model if its coefficient's true value is __________

Random variable

a variable that takes on numerical value and has an outcome that is determined by an experiment

confidence interval=

^β ± t_c × SE(^β)

Lagged Dependent Variable

better at capturing dynamics: "today is influenced by yesterday"

Omitted variable bias

bias on the coefficients; violate Assumption III because omitted variable is correlated with explanatory variables (which determine the error term)

attenuation bias

bias towards zero that results from classical measurement error; increases risk of type 2 errors (false negatives not false positives)

omitted variable

bias, violate classical assumption 3

Dummy variable

binary metric

specifying an equation

choosing the correct independent variables, choosing the correct functional form, choosing the correct form of the stochastic term

law of large numbers (asymptotic law)

as a sample size grows, its mean gets closer to the average of the whole population

Attrition bias

Assignment was random, but people drop out of the study. If the characteristics of those who drop out are systematically different from the characteristics of the remainder of the treatment group, then we have attrition bias

random sampling

assumption MLR.2

let W be the included exogenous variables in a regression function that also has endogenous regressor X the W variables can

be a control variable, make an instrument uncorrelated with u, have the property E(u|W)=0

they themselves are random variables

because estimators are functions of random variables (the sample data)...

the following problems could be analyzed using probit and logit estimation with the exception of whether or not

being a female has an effect on earnings

the regression residuals are our estimated errors

best fit

heteroskedasticity

non-constant variance of errors. Tests: the White test regresses the squared residuals on the regressors, their squares, and their cross products; Breusch-Pagan does the same using the independent variables alone. A low p-value means you probably have it. Stata: estat hettest. Fixes: robust or clustered standard errors
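
The card's commands are Stata; a comparable sketch in Python with statsmodels (an assumption on my part — data and names illustrative):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Illustrative data whose error spread grows with x (heteroskedastic)
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=200)
y = 1.0 + 0.5 * x + rng.normal(scale=0.2 * x)

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Breusch-Pagan: do the regressors predict the squared residuals?
# A LOW p-value rejects homoskedasticity.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X)
print("Breusch-Pagan p-value:", lm_pvalue)

# Remedy: heteroskedasticity-robust (HC1) standard errors
print(sm.OLS(y, X).fit(cov_type="HC1").bse)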

log functions, when to use

consumer side, indifference curve, constant elasticity

2.57, 1.96, 1.64

critical values for when there is a large sample size (>120) for 1% sig level, 5% sig level, 10% sig level

The most frequently used experimental or observational data in econometrics are of the following type:

cross-sectional data

cdf

cumulative distribution function; the integral of the pdf. How likely are you to be at or below a given value (in a range)? Always non-decreasing

first order serial correlation

current value of the error terms is a function of the last value of the error term

SSE/SST or 1- SSR/SST

equation for R^2

sum(y hat-y bar)^2

equation for SSE

sum(yi-y hat)^2

equation for SSR, another equation is: sum(ui hat)^2
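
These three identities are easy to verify numerically; a small Python sketch (numpy assumed, data made up):

import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)

b1, b0 = np.polyfit(x, y, 1)  # least squares slope and intercept
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum((y - y_hat) ** 2)         # sum of squared residuals

print(np.isclose(sst, sse + ssr))  # SST = SSE + SSR when an intercept is included
print(sse / sst, 1 - ssr / sst)    # the two equivalent forms of R^2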

To infer the political tendencies of the students at your college/university, you sample 150 of them. Only one of the following is a simple random sample: You

have your statistical package generate 150 random numbers in the range from 1 to the total number of students in your academic institution, and then choose the corresponding names in the student telephone directory.

Level of Significance

indicates the probability of observing an estimated t-value greater than the critical t-value if the null hypothesis were correct.

standard errors will be biased; arises when one or more of our X variables is correlated with the variance of our errors (will alter your standard errors in ways that will make you think you've found signals that are stronger or weaker than they actually are)

heteroskedasticity implies...

^r²ij are the squared residuals from a regression of xj on all the other explanatory variables

heteroskedasticity-robust OLS standard errors

filing cabinet bias

hiding studies when you're wrong so that the anomalous study looks like the accurate one. Non-results are just as important as results

multicollinearity

high correlation among some of the independent variables

the sampling variance because there is more noise (bad variation)

high error variance increases...

OVB is problematic, because

if it occurs, our OLS estimator ^β₁ will be biased and inconsistent, and so will not correctly measure the effect of changing Xi on Yi.

polynomial form

if the slope of relationship depends on the level of the variable

Weights

inverses of the variances

A type II error

is the error you make when not rejecting the null hypothesis when it is false.

Gauss-Markov assumptions

justifies the use of the OLS method rather than using a variety of competing estimators.
1. (Linear in parameters) y = B0 + B1x + u
2. (Random sampling) We have a random sample of size n, {(xi, yi): i = 1, 2, ..., n}, following the population model
3. (Sample variation in the explanatory variable) The sample outcomes on x, namely {xi, i = 1, ..., n}, are not all the same value
4. (Zero conditional mean) E(u|x) = 0
5. (Homoskedasticity) Var(u|x) = σ^2

fourth moment of standardized variable

kurtosis

Orthogonal

lack of linear relationship between data

The OLS estimator is derived by

minimizing the sum of squared residuals.

pooled data

multiple cross-sections at different periods of time

Cross sectional data set

multiple individuals/entities at same time

The Classical Assumptions

must be met in order for OLS estimators to be the best available.
I. The regression model is linear, is correctly specified, and has an additive error term.
II. The error term has a zero population mean.
III. All explanatory variables are uncorrelated with the error term.
IV. Observations of the error term are uncorrelated with each other (no serial correlation).
V. The error term has a constant variance (no heteroskedasticity).
VI. No explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity).
VII. The error term is normally distributed (this assumption is optional but usually is invoked).

time series data

one entity observed over multiple periods

pdf

probability density function; the derivative of the cdf. For a continuous random variable, integrating the pdf over an interval gives the probability that the value of the variable lies within that interval

discrete distribution

probability of different outcomes for a variable that can take one of a finite number of outcomes along a discrete scale

semi-log functions, when to use

producer side, diminishing returns, utility curve, constant semi-elasticity

Correlation Coefficient of Perfect Multicollinearity

|r| = 1.0

to decide whether the slope coefficient indicates a large effect of X on Y, you look at the

size of the slope coefficient

The cumulative probability distribution shows the probability:

that a random variable is less than or equal to a particular value.

A large p-value implies

that the observed value ~Y^act is consistent with the null hypothesis.

You have to worry about perfect multicollinearity in the multiple regression model because

the OLS estimator cannot be computed in this situation.

When there are omitted variables in the regression, which are determinants of the dependent variable, then

the OLS estimator is biased if the omitted variable is correlated with the included variable.

Variance

the expected squared distance of X from its expected value: Var(X) = E[(X - E(X))²]

Multiple Regression model

this model permits estimating the effect on Y of changing one variable, X1, while holding the other regressors (X2, X3, ...) constant. Test score example: isolate the effect on test scores (Y) of the student-teacher ratio (X1) while holding constant the percentage of students in the district who are English learners (X2)

unemployment level

time specific

in the time fixed effects regression model you should exclude one of the binary variables for the time periods when an intercept is present in the equation

to avoid perfect multicollinearity

represents total variation in dependent variable

total sum of squares

TSS

total sum of squares how much story is there to explain?

goodness-of-fit

total sum of squares (SST): SST = ∑(yi - ~y)²
explained sum of squares (SSE): SSE = ∑(^yi - ~y)²
residual sum of squares or sum of squared residuals (SSR): SSR = ∑^ui²
SSR/SST + SSE/SST = 1, so R² = SSE/SST = 1 - SSR/SST
Equivalently, R² = [∑(yi - ~y)(^yi - ~^y)]² / [∑(yi - ~y)² · ∑(^yi - ~^y)²]

ideal randomized controlled experiments in economics are

useful because they give a definition of a causal effect

why use logarithms

useful when measuring relative or percentage change

discrete heteroskedasticity

two distributions, one with larger variance even though both are centered around 0 (heights of mice and bball players)

one is an exact linear combination of the other

two variables are multicollinear when...

error term or disturbance

u

measures how far each number is from the mean

variance

variance of the error term

variance

Mean Squared Error

variance plus bias^2; lower is better

pure serial correlation

serial correlation that arises in an otherwise correctly specified equation; violates Assumption IV

What do we use regression analysis for?

• To estimate the mean or average value of the dependent variable, given the values of the independent variables. • To test a hypothesis implied by economic theory • To predict, or forecast, the mean value of the dependent variable given the independent variables.

Total Sum of Squares (TSS)

• the total variation of the actual Y values around their sample average: • TSS = Σ(i)(Yi-~Y)²

A multiple regression includes two regressors: Yi = B0 + B1 X1i + B2 X2i + ui What is the expected change in Y if X2 decreases by 10 units and X1 is unchanged?

-10B2

One sided test

The alternative hypothesis has values on only one side of the null hypothesis

the intercept in the multiple regression model

determines the height of the regression line

degrees of freedom

df=n-(k+1) = (number of observations)- (number of estimated parameters).

When there are ∞ degrees of freedom, the t∞ distribution

equals the standard normal distribution.

ui

error. • It captures that there are likely other variables (that we are not explicitly considering) that affect Y in addition to X.

non-normality of error term

errors aren't normally distributed. Fixes: change the model, get more data, transform variables, or report it and walk away

time series

observations on one or more variables followed over a period of time

log(wage)=2.48+.094log(education) Interpret

a 1% increase in education is associated with a 0.094% increase in wage (an elasticity of 0.094)

non-independence of chronological errors

predictable errors. Durbin-Watson test: compares each error to the previous error to test for independence. Two-tailed: a statistic close to 2 means likely no serial correlation; values near the edges (0 or 4) mean the errors are likely serially correlated. Stata: regress y x, then estat dwatson
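
The card's commands are Stata; a rough Python equivalent with statsmodels (an assumption — illustrative data with AR(1) errors):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)

# Build serially correlated errors: u_t = 0.8*u_(t-1) + e_t
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()

y = 1.0 + 0.5 * x + u
fit = sm.OLS(y, sm.add_constant(x)).fit()

# Near 2: likely independent errors; near 0 or 4: serial correlation
print(durbin_watson(fit.resid))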

in a quasi experiment

randomness is introduced by variations in individual circumstances that make it appear as if the treatment is randomly assigned

B. True/False: Multiple linear regression is used to model annual income (y) using number of years of education (x1) and number of years employed in current job (x2). It is possible that R2 is equal to 0.30 in a simple linear regression model using only x1 and equal to 0.26 in a multiple regression model using both x1 and x2.

False

Population Regression Function (PRF)

See conditional expectation

Explained Variable

See dependent variable

Predicted Variable

See dependent variable

Underspecifying the Model

See excluding a relevant variable.

F-test for overall significance=

(ESS/K)/(RSS/(N-K-1))

Consequences of Multicollinearity

1. Estimates will remain unbiased.
2. The variances and standard errors of the estimates will increase.
3. The computed t-scores will fall.
4. Estimates will become sensitive to changes in specification.
5. The overall fit of the equation and estimation of the coefficients of nonmulticollinear variables will be largely unaffected.

In a multiple linear regression where the Gauss-Markov assumptions hold, why can you interpret each coefficient as a ceteris paribus effect?

Because the Ordinary Least Squares (OLS) estimator of the coefficient on variable xj is based on the covariance between the dependent variable and the variable xj after the effects of the other regressors have been removed.

Confidence interval formula

CI = X_bar ± z × (st. dev./√n)
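
A numerical sketch of this formula in Python (scipy assumed; the usual 1.96 is just norm.ppf(0.975), and the data are made up):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
sample = rng.normal(loc=10.0, scale=2.0, size=121)

x_bar = sample.mean()
se = sample.std(ddof=1) / np.sqrt(sample.size)  # st. dev. / sqrt(n)
z = norm.ppf(0.975)  # ~1.96 for a 95% interval

print(x_bar - z * se, x_bar + z * se)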

T-Test

Econometricians generally use the t-test to test hypotheses about individual regression slope coefficients.

we would calculate an estimate that is not efficient in the class of linear, unbiased estimators since BLU has a smaller variance while also being linear and unbiased

If we persist in using the conventional OLS formula and ignore heteroskedasticity,

What does it mean when we say that OLS is unbiased?

In expectation, the difference between the OLS estimator and the true population parameter is equal to zero.

Restricted Model

In hypothesis testing, the model obtained after imposing all of the restrictions required under the null.

Unrestricted Model

In hypothesis testing, the model that has no restrictions placed on its parameters.

Critical Value

In hypothesis testing, the value against which a test statistic is compared to determine whether or not the null hypothesis is rejected.

Properties of the Mean

Its expected value equals the true mean of the variable being estimated; i.e., the sample mean is an unbiased estimator

Classical Assumption

Must be met in order for OLS estimators to be the best available

probability of an event A or B(Pr(A or B)) to occur equals

Pr(A) + Pr(B) if A and B are mutually exclusive

Specialized Variables used to represent

Present/Absent Yes/No Male/Female

Estimator

Rule that assigns each possible outcome of the sample a value

total sum of squares (SST)

SST = ∑(Y - ~Y)²

Standard deviations formula within correlation

Sqrt(sum(dx^2)/(n-1))

log(1+x) Interpretation

The quality of the approximation log(1+x) ≈ x decreases as the value of x gets larger

There may be approximation errors in the calculation of the least squares estimates

The regression model includes a random disturbance term for a variety of reasons. Which of the following is NOT one of them?

Error Variance

The variance of the error term in a multiple regression model.

Heteroskedasticity

The variance of the error term, given the explanatory variables, is not constant.

When the sample size n is large, the 90% confidence interval for the population mean is

~Y ± 1.64·SE(~Y).

coefficient of determination

elasticity

in the log-log model= b1

Standard Deviation of Y

sY=√(Σ(i)(Yi-~Y)²/(n-1))

semi elasticity model

semi log form suggests...

β₁

slope. • It measures the change in Y resulting from a unit change in X

SSy

sum of squares Y

E(u|x) = E(u)

u is mean independent of x.

Multiple linear regression model

y=B0+B1x1+B2x2+B3x3+...+Bkxk+u -B0 is the intercept. -B1 is the parameter associated with x1. -B2 is the parameter associated with x2, and so on.

to provide quantitative answers to policy questions

you should examine empirical evidence

In the multiple regression model the standard error of regression is given by

√((1/(n-k-1)) Σ(i=1,n) ^u²i)

Difference between non-linear and linear

Change in y depends on the starting point of x

Assumption 4

Observations of the error term are uncorrelated with each other

Reduced importance

Observations with large variance

if F> Fc

then x1 and x2 are jointly significant and we reject the null

unbiasedness of OLS

theorem 2.1

we would be using a biased estimator

If we used estimator of variance under homoskedasticity for heteroskedasticity

multicollinearity

Occurs when the independent variables (or error observations) are highly correlated among themselves.

Heteroskedasticity

Occurs when the variance of the residuals is not the same across all observations in the sample

Explained Sum of Squares (SSE)

The total sample variation of the fitted values in a simple or multiple regression model.

The overall regression F-statistic tests the null hypothesis that

all slope coefficients are zero

(c) How can multicollinearity be detected?

(c) The classic case of multicollinearity occurs when none of the explanatory variables is statistically significant even though R2 may be high. High simple correlation coefficients among explanatory variables are sometimes used as a measure of multicollinearity. Another indication is when estimates are very sensitive to changes in specification.

Desirable properties of ~Y

1. unbiased 2. consistent

multicollinearity

2 or more x's are highly correlated. Symptoms: high R^2 and high F, but low t-statistics. Check: correl x1 x2 x3 — any result > |0.3| is worth looking at. Solutions: more data; acknowledge it; include only one (but have a good reason)

b) How is bias defined?

Bias is defined as the difference between the expected value of the estimator and the true parameter. That is, bias = E(theta^hat) - (theta)

p-value

2Φ(-|t-stat|) • the probability of drawing a value of ~Y that differs from µ₀ at least as much as the value actually computed from the sample. The smaller the p-value, the more statistical evidence there is against H₀
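
That formula is one line in Python (scipy assumed; the t-statistic is made up):

from scipy.stats import norm

t_stat = 2.15  # illustrative value
p_value = 2 * norm.cdf(-abs(t_stat))  # 2Φ(-|t-stat|)
print(p_value)  # ≈ 0.032: reject H0 at the 5% level but not at 1%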

If you choose a higher level of significance, a regression coefficient is more likely to be significant.

3. True. With an increase in the level of significance it becomes more likely that the t-statistics will be significant.

Assume that you assign the following subjective probabilities for your final grade in your econometrics course (the standard GPA scale of 4 = A to 0 = F applies): Grade Probability A 0.50 B 0.50 C 0 D 0 F 0

3.5

A bag has five pearls in it, out of which one is artificial. If three pearls are taken out at random, what is the probability that the artificial pearl is one of them?

3/5

s²Y = (1/(n-1))∑(i=1,n)(Yi - ~Y)² • sample variance used to estimate the population variance. It is unbiased and consistent

Residual

Difference between actual value and estimated value

Variances of the OLS estimators

It is important to know how far we can expect ^B1 to be from B1 on average -This allows us to choose the best estimator -The measure of spread in the distribution of ^B1 that is easiest to work with is the variance

residuals will follow a sine-wave type pattern

Positive autocorrelation

Assumption 7

The error term is normally distributed

R^2

is the fraction of the sample variance of Y explained by the regressors: R^2 = ESS/TSS = 1 - SSR/TSS. In multiple regression, R^2 increases whenever a regressor is added, unless the estimated coefficient on the added regressor is exactly 0. An increase in R^2 does not mean that adding a variable actually improves the fit of the model; it can give an inflated estimate of fit

t distribution

a large t value (roughly |t| ≥ 2) means we can reject the claim that the variable doesn't matter (ergo... the variable matters)

error term=

true Y - expected Y

Suppose we have the linear regression model y = β0 + β1x1 + β2x2 + u, and we would like to test the hypothesis H0 : β1 = β2. Denote ^β1 and ^β2 the OLS estimators of β1 and β2. Which of the following statistics can be used to test H0?

(^β₁ - ^β₂) / √(Var(^β₁) - 2Cov(^β₁, ^β₂) + Var(^β₂))
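A sketch of how one might compute this statistic in Python with statsmodels (data and true coefficients are made up; cov_params() supplies the Var and Cov terms):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.8 * x2 + rng.normal(size=n)  # H0: beta1 = beta2 is true here

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

b1, b2 = fit.params[1], fit.params[2]
V = fit.cov_params()  # estimated covariance matrix of the coefficients
t_stat = (b1 - b2) / np.sqrt(V[1, 1] - 2 * V[1, 2] + V[2, 2])
print(t_stat)  # compare with a standard normal / t critical value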

Parameters of the regression model

β₀ and β₁

Variance of ^B1

σ^2 / sum(xi - ¯x)^2; invalid in the presence of heteroskedasticity

OLS 3 assumptions

• E(ui | Xi) = 0. In words, the error term ui has a conditional mean of zero given Xi. • (Xi, Yi), i = 1, 2, ..., n are i.i.d. That is, we have a random sample. • E(Xi^4) < ∞ and E(Yi^4) < ∞, i.e. Xi and Yi have finite fourth moments. In practice, this means that large outliers (values of Xi and Yi that are far outside the range of the data) are unlikely.

Under the OLS assumptions, we have the following results for ^β₁(same results hold for ^β₀):

• E(^β₁) = β₁. In words, ^β₁ is an unbiased estimator of β₁ • As the sample size n increases, ^β₁ gets closer and closer to β₁, i.e. ^β₁ is a consistent estimator of β₁. • If n is large, the distribution of ^β₁ is well approximated by a normal distribution

Briefly explain what is meant by Multicollinearity

• A problem arises if some explanatory variables are highly correlated. • How do we deal with multicollinearity? Use economic theory to tell you what is most important to put in the regression

The 1st OLS assumption E(ui | Xi)=0 means

• That ui and Xi are unrelated and that the expected value of omitted variables is 0 for any Xi. • Also under assumption 1, the predicted value ^Yi, is an estimate of the expected value of Yi given Xi. • Is not a strong assumption at all, it is just a normalization

1- SSR/SST

% explained variation (equation)

Two ways in which a regression function can be linear

1. Linear in the variables 2. Linear in the parameters

multiple hypotheses test

A test of multiple restrictions

if added independent variables have little or no explanatory power

Adjusted R squared can decline

Time Series Data

Data collected over time on one or more variables.

The conditional expectation of Y given X, E[Y |X = x], is calculated as follows:

Σ(i=1,k) yi · Pr(Y = yi | X = x).

D. True/False: To perform simple linear regression the explanatory variable must follow a normal distribution.

False

Does regression prove causality?

No, it only tests the strength and direction of the quantitative relationship involved

Standard deviation

Square root of the variance

General form

T=f(R,M)+E

R-Squared Form of the F Statistic

The F statistic for testing exclusion restrictions expressed in terms of R-squareds from the restricted and unrestricted models.

Omitted Variable Bias

The bias that arises in the OLS estimators when a relevant variable is omitted from the regression.

Homoscedastic errors

The variance of each ui is constant for all i

CLT

Theorem that states the average from a random sample for any population, when standardized, has an asymptotic standard normal distribution

autocorrelation

Time series data usually exhibit

False

Under autocorrelation, the conventionally calculated regression variance estimator, s², is unbiased since this has nothing to do with the disturbance term.

we believe that our identified coefficient can be generalized to outside our sample

a model is externally valid if...

an estimator is consistent for a population parameter if

consistency

t statistic

(estimator - hypothesized value) / standard error of the estimator

Effects of a disturbance or shock linger in time, but then die out

musical instrument

SSx

sum of squares X

The linear probability model​ is:

the application of the linear multiple regression model to a binary dependent variable.

GDP growth

time specific

"Suppose you are interested in studying the relationship between education and wage. More specifically, suppose that you believe the relationship to be captured by the following linear regression model, Wage = Beta0 + Beta1 Education + u Suppose further that you estimate the unknown population linear regression model by OLS. What is the difference between Beta1 and Beta1hat ?"

"Beta1 is a true population parameter, the slope of the population regression line, while Beta1hat is the OLS estimator of Beta1

semilog form

"increasing at a decreasing rate" form...; perfect for percentage terms (ln X...change in Y related to 1 percent increase in X...ln Y...percent change in Y related to a one-unit increase in X)

correlation coefficient

"r" measures strength and direction of linear relationship between two variables r=1 perfect positively correlated r= 0 variables uncorrelated

algebraic properties of olS Statistics

(1) The sum, and hence the sample average, of the OLS residuals is zero: ∑^ui = 0. (2) The sample covariance between the regressors and the OLS residuals is zero: ∑xi^ui = 0. (3) The point (~x, ~y) is always on the OLS regression line. yi = ^yi + ^ui.
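
A quick numerical check of all three properties in Python (numpy assumed, data made up):

import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(size=100)
y = 3.0 - 1.5 * x + rng.normal(size=100)

b1, b0 = np.polyfit(x, y, 1)
u_hat = y - (b0 + b1 * x)  # OLS residuals

print(np.isclose(u_hat.sum(), 0.0))              # (1) residuals sum to zero
print(np.isclose((x * u_hat).sum(), 0.0))        # (2) zero sample covariance with x
print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # (3) the point (x-bar, y-bar) is on the line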

Let R2 be the R-squared of a regression (that includes a constant), SST be the total sum of squares of the dependent variable, SSR be the residual sum of squares and df be the degrees of freedom. The estimator of the error variance, ^σ² = SSR/df, can be re-written as:

(1−R^2)SST /df

Var(^β₀)

(Σ(i)X²i)(^σ²u) / (n·Σ(i)(Xi-~X)²)

15. Consider a regression model of a variable on itself: y = β0 + β1y + u If you were to estimate this regression model by OLS, what would be the value of R-squared?

1

components of OLS variation

1) error variance (σ2); bad variation 2) total sample variation in Xs (Var X); good variation 3) linear relationships among the independent variables

Standard Normal Distribution

A normal distribution with a mean of 0 and a standard deviation of 1.

Define a time series data set...

A time series data set consists of observations on a variable or several variables over time. Examples of time series data include stock prices, money supply, consumer price index, gross domestic product, annual homicide rates, and automobile sales figures

The dummy variable trap is an example of:

A. perfect multicollinearity.

Exogenous Explanatory Variable

An explanatory variable that is uncorrelated with the error term.

best linear unbiased estimator (BLUE)

~B = ∑wiyi (a linear estimator is a weighted sum of the yi; the BLUE is the unbiased one with the smallest variance)

(T/F) The Adjusted R-squared (~R^2) is always greater than or equal to the R-squared (R^2)

FALSE. Because [(n-1)/(n-k-1)] is always greater or equal to 1, ~R^2 <= R^2

Type 2 error

Failing to reject the null hypothesis when it is false

Dependent variable

Function of movements in a set of other variables

Gauss-Markov Theorem

Given assumptions 1-6, the OLS estimator of Bk is the minimum variance estimator among all linear unbiased estimators of Bk, for k = 0, 1, 2, ..., K

p-value

Given the observed value of the t statistic, what is the smallest significance level at which the null hypothesis would be rejected? P(|T|>|t|)

Estimated regression coefficients

Guesses of the true regression coefficients and are obtained from data from a sample of the Ys and Xs

Level of significance

Indicates the probability of observing an estimated t value greater than the critical value if the null hypothesis were correct. Measures the amount of Type I error

R^2

Is the ratio of the explained sum of squares to the total sum of squares

Type 2 error

NOT REJECT when you were supposed to REJECT (false negative)

the much larger family GLS

OLS is a special case of

from a regression of explanatory variable xj on all other independent variables (including a constant); converges to a fixed number

R-squared (variance)

Goodness of fit

R-squared is the coefficient of determination. It is the ratio of the explained variation compared to the total variation, and is the fraction of the sample variation in y that is explained by x. It is equal to the square of the sample correlation coefficient between yi and ^yi R^2=SSE/SST = 1 - SSR/SST

Coefficient of Determination, R^2

R2 is the ratio of the explained sum of squares to the total sum of squares: R2 = Explained sum of squares / Total sum of squares. It is used to measure the fit of the regression: an R2 of 1 is a perfect fit; an R2 of 0 shows a failure of the estimated regression. A major problem with R2 is that adding another independent variable to a particular equation can never decrease R2, because TSS does not change and SSR cannot increase.

Adjusted R^2

R^2 adjusted for degrees of freedom

linear in parameters

SLR.1

Overspecifing the Model

See inclusion of an irrelevant variable.

Properties of standard error

Square root of variance

Residual Sum of Squares

The Residual Sum of Squares is the amount of variation that is left unexplained by the regression line, that is, the sum of the squared differences between the predicted and observed values.

Ordinary Least Squares

The estimates given by ^B0 and ^B1 are the OLS estimates of B0 and B1. Can derive unbiasedness, consistency

Gauss-Markv Assumptions

The set of assumptions under which OLS is BLUE

The error term is homoskedastic if

Var(u|x) is constant.

Estimated regression equation

Y^=1034+6.38Xi

ols residuals Ui are defined as follows

Yi- Y^i

Alternate Hypothesis

a statement of the values that the researcher expects

the model is properly specified and linear

assumption MLR.1

As variability of xi increases, the variance of ^b1 ____

decreases

Yi

dependent variable

any difference in the predicted dependent variable and the actual dependent variable is due to

factors subsumed in the model error term

exponential functions, when to use

growth

a large p value

is in favour of the null

log-log

log(y) = b0 + b1·log(x); for every 1% increase in x, y increases/decreases by b1%

instrument relevance

means that some of the variance in the regressor is related to variation in the instrument

coefficient of variation

measure of spread that describes the amount of variability relative to the mean unitless, so you can use it to compare the spread of different data sets instead of the standard deviation

β₀

notation for population intercept

The central limit theorem

states conditions under which a variable involving the sum of Y1,..., Yn i.i.d. variables becomes the standard normal distribution.
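
A simulation sketch of the theorem in Python (numpy assumed): means of i.i.d. draws from a decidedly non-normal (exponential) population, once standardized, behave like N(0, 1).

import numpy as np

rng = np.random.default_rng(7)
n, reps = 100, 10_000

# Exponential(1) population: mean 1, standard deviation 1, not normal
samples = rng.exponential(scale=1.0, size=(reps, n))

# Standardize each sample mean: (Ybar - mu) / (sigma / sqrt(n))
z = (samples.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))

print(z.mean(), z.std())  # close to 0 and 1, as the CLT predicts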

simple linear regression

statistical procedure used to determine a model (equation of a straight line) which "best" fits all points on the scatter plot for bivariate data

to standardize a variable

subtract its mean and divide by standard deviation

In the log-log model, the slope coefficient indicates

the elasticity of Y with respect to X.

The proof that OLS is BLUE (Gauss-Markov Theorem) requires all of the following assumptions with the exception of: a) the errors are homoskedastic. b) the errors are normally distributed. c) E(u|x) = 0. d) there is variation in the regressors.

the errors are normally distributed.

the t statistic is calculated by dividing

the estimator minus its hypothesized value by the standard error of the estimator

conditional mean independence assumption

the explanatory variable must not contain information about the mean of ANY unobserved factors (that is, about the error term u)

regression through the origin

the line passes through the point x = 0, ~y = 0. To obtain the slope estimate, we still rely on the method of ordinary least squares, which in this case minimizes the sum of squared residuals: ~y = ~B1x

unbiasedness

the mean of the sampling distribution is equal to the true population value of the parameter

if there is heteroskedasticity

the ols is not the most efficient estimator and the standard errors are not valid for inference

slope parameters

the parameters other than the intercept, even though this is not always literally what they are; in a quadratic model, for example, neither b1 nor b2 is itself a slope, but together they determine the slope of the relationship between consumption and income

the fitted value of Y is also called

the predicted value, ^Y

the significance level of a test

the probability of rejecting the null hypothesis when it is true

consistency

the probability that the estimate is close to the true population value can be made high by increasing the sample size

probability of an outcome

the proportion of times that the outcome occurs in the long run

demand equation

the quantity demanded of each commodity depends on the price of the goods, the price of substitute and complementary goods, the consumer's income, and the individual's characteristics that affect taste.

A statistical analysis is internally valid​ if

the statistical inferences about causal effects are valid for the population studied

regression

the statistical technique of modeling the relationship between variables

2. If the independent variable is multiplied or divided by some nonzero constant,

then the OLS slope coefficient is divided or multiplied, respectively, by that constant

variances of OLS estimators

theorem 2.2

states that OLS estimator is the Best Linear Unbiased Estimator (under assumptions MLR.1-MLR.5)

theorem 3.4 (Gauss Markov theorem)

t distribution for standardized estimators (allows us to form a general null hypothesis)

theorem 4.2

how to choose variables

theory, t-test, adjusted R^2, and bias

us unemployment rate % from 65-95

time series

spurious regression

time series producing high (inaccurate) values of R^2

rejecting a true null hypothesis

type 1 error

Non-Orthogonal

typical of economic data which are not collected by experiments

With a biased estimator, confidence intervals and hypothesis tests are

typically invalid

"A professor decides to run an experiment to measure the effect of time pressure on final exam scores. He gives each of the 400 students in his course the same final exam, but some students have 90 minutes to complete the exam while others have 120 minutes. Each student is randomly assigned one of the examination times based on the flip of a coin. Let Yi denote the number of points scored on the exam by the ith student ( 0 less than or equl to Yi less than or equal to 100), let Xi denote the amount of time that the student has to complete the exam (Xi = 90 or 120), and consider the regression model Yi = Beta0 + Beta1 Xi + ui , E(ui) = 0 Which of the following are true about the unobservable ui ? "

ui represents factors other than time that influence the student's performance on the exam.

"Under the least squares assumptions for the multiple regression problem (zero conditional mean for the error term, all Xi and Yi being i.i.d., all Xi and ?i having finite fourth moments, no perfect multicollinearity), the OLS estimators for the slopes and intercept:"

unbiased and consistent

If Cov(Xi,ui) > 0, then ^β₁is biased

upwards (positive bias)

standardized beta coefficients

use for regression when variables are in different units; one st. dev increase in X leads to an increase of beta coefficient amount in Y

When testing joint hypotheses, you should

use the F-statistics and conclude that at least one of the restrictions does not hold if the statistic exceeds the critical value.

To measure the fit of the probit​ model, you​ should:

use the​ "fraction correctly​ predicted" or the​ "pseudo R squared​."

Expected bias analysis

used to determine the likely sign of omitted variable bias: the sign of the correlation between the omitted variable and the explanatory variable times the sign of the omitted variable's coefficient

mean square error

used to decide if a small variance offsets bias (with a sampling distribution with narrow distribution but offset from the true value a little)

multicollinearity

violates assumption VI; estimates remain unbiased and overall fit will be generally unaffected; variances and standard errors will be large

fundamental problem of causal inference

we can never observe both potential outcomes Yi(0) and Yi(1) (counterfactuals are unobservable); we cannot know the unit effects (the individual average difference between counterfactual state)

estimating the error variance

we can use data to estimate σ^2, which allows us to estimate var(^b1)

When estimating σ²u, we divide by n - 2, the degrees of freedom, because

we lose 2 for estimating β₀ and β₁

If we want to test whether a binary variable equaling 0 or 1 generates equal results,

we should test the null hypothesis that the slope coefficient β₁=0

constrained equation in the F-test

what the null would look like if it were correct

level-level

^y = ^B0 + ^B1x + ...; ^B1: for every 1-unit increase in x, ^y increases/decreases by ^B1

Model with two Independent Variables

y=B0+B1x1+B2x2+u -B0 is the intercept. -B1 measures the change in y with respect to x1, holding other factors fixed. -B2 measures the change in y with respect to x2, holding other factors fixed

simple linear regression model

y = Bo + B1x + u. y: the dependent variable; x: independent variable; u: error term or disturbance; B1: slope parameter in the relationship between y and x; Bo: intercept parameter. If the other factors in u are held fixed, so that the change in u is zero (Δu = 0), then x has a linear effect on y: Δy = B1·Δx if Δu = 0.

Let y^i be the fitted values. The OLS residuals, u^i , are defined as follows

yi-y^i

var(^B1)

σ^2/ sum(xi- ¯x)^2

Variance of ^B0

(σ²/n) · ∑xi² / ∑(xi - ~x)²

Linearity in the parameters

• Yi = β₀+β₁Xi+ui • Yi = β₀+β₁X²i+ui • Yi = β₀+β₁1/Xi+ui • Yi = β₀+β₁log(Xi)+ui

~Y

• estimator of µY • often called a point estimate

t/f An efficient estimator means an estimator with minimum variance.

(c) False. To be efficient an estimator must be both unbiased and have minimum variance.

consider the following least squares specification between test scores and income. test score = 557.8 + 36.42 ln(income). According to this equation, a 1% increase in income is associated with an increase in test scores of

.36 points

Assume that Y is normally distributed N(μ, σ2). Moving from the mean (μ) 1.96 standard deviations to the left and 1.96 standard deviations to the right, then the area under the normal p.d.f. is

0.95

Threats to external validity

1. Non-representative sample 2. Non-representative treatment 3. General equilibrium effects and externalities 4. Treatment vs. eligibility 5. Treatment vs. choice

OLS Slope Estimate

A slope in an OLS regression line.

Minimum Variance Unbiased Estimators

An estimator with the smallest variance in the class of all unbiased estimators.

Unbiased Estimator

An estimator ^β is an unbiased estimator if its sampling distribution has as its expected value the true value of β: E(^β) = β

hourly data, which has a major problem

Annual data probably have less of an autocorrelation problem than

Formula for slope (B1)

B1 = Cov(x,y)/Var(x)
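
This identity is easy to confirm numerically in Python (numpy assumed, data made up):

import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(size=1000)
y = 4.0 + 2.5 * x + rng.normal(size=1000)

b1_cov = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # Cov(x, y) / Var(x)
b1_ols = np.polyfit(x, y, 1)[0]                  # OLS slope

print(b1_cov, b1_ols)  # identical up to floating-point error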

According to Guass-Markov Theorem what is OLS

BLUE = best, linear, unbiased, estimator

BLUE

Best - minimum variance Linear Unbiased Estimator

omitted variable bias.

Bias(~B1) = E(~B1) - B1 = B2·~δ1

Experimental Data

Data that have been obtained through a controlled experiment

The regression R2 is defined as follows:

ESS/TSS

(T/F) In the expression Pr(Y = 1) = Φ (β₀ + β₁X) from a probit model, β₁cannot be negative, since probabilities have to lie between 0 and 1.

FALSE. Even if β₁ is negative, Φ(β₀ + β₁X) will never be negative, as the function Φ(.) is the cumulative distribution function of the N(0,1), and all cumulative distribution functions lie, by definition, between 0 and 1.
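
A quick numerical illustration in Python (scipy assumed; the betas are made up, with β₁ deliberately negative):

from scipy.stats import norm

beta0, beta1 = 0.5, -2.0  # illustrative probit coefficients
for x in (-2.0, 0.0, 2.0):
    p = norm.cdf(beta0 + beta1 * x)  # Pr(Y = 1) = Φ(β₀ + β₁X)
    print(x, p)  # always strictly between 0 and 1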

(T/F) When we drop a variable from a model, the total sum of squares (TSS) increases.

FALSE. The TSS is the variation of Y around its sample average ~Y . It is unrelated to the number of X's used to explain Y .

(T/F) Among all unbiased estimators that are weighted averages of Y1; :::; Yn, ^β₁ is the most unbiased estimator of β₁.

False. ^β₁ is unbiased, so it is not more or less unbiased than any other unbiased estimator. The Gauss-Markov theorem says that it is the best (most efficient) among such linear unbiased estimators.

False

For most economic time series, the autocorrelation coefficient is negative

the model is properly specified and linear

Gauss-Markov assumption 1

endogenous explanatory variable

If xj is correlated with u for any reason, then xj is said to be an endogenous explanatory variable

False

If you wish to use a set of dummy variables to capture six different categories of an explanatory variable, you should use six different dummy variables, each equal to one when the observation is a member of a specific category and zero otherwise.

Variance Inflation Factor (VIF)

In multiple regression analysis under the Gauss-Markov assumption, the term in the sampling variance affected by correlation among the explanatory variables.

Excluding a Relevant Variable

In multiple regression analysis, leaving out a variable that has a non-zero partial effect on the dependent variable.

Critical t-value

Is the value that distinguishes the acceptance region from the rejection region

random white noise

Jaggedness in residuals due to

Specialized Variables

Lagged Variables Time Trend Dummy Variables

What can one do about OVB other than trying to determine the direction of the bias?

Main strategies are: 1. Multiple Regression 2. Instrumental Variables 3. Randomized Experiments

random movement from one state to another

Markov Process Represents the

Z distribution

Mean of 0 and standard deviation of 1

Dummy

Measure structural change; measure impact of strike; account for seasonality; measure gender differences

Trend

Measure technological change; exogenous factors for population growth

"In a regression, if the t-statistic for a coefficient is 1.83, do you reject the null hypothesis of the coefficient being equal to zero at the 5% level?"

No

Assumption 6

No explanatory variable is a perfect linear function of any other variable

In the model Grade = β0 + β1study + β2leisure + β3sleep + β4work + u, where each regressor is the amount of time (hours), per week, a student spends in each one of the named activities and where the time allocation for each activity is explaining Grades (where Grade is the final grade of Introduction to Econometrics), what assumption is necessarily violated if the weekly endowment of time (168 hours) is entirely spent either studying, or sleeping, or working, or in leisure activities?

No perfect multicollinearity.

Minimum Variance: Heteroskedasticity

Non-Constant Variance makes OLS estimators differ from BLU Estimators

classical linear model (CLM) assumptions

Normality Assumption_The population error u is independent of the explanatory variables x1, x2, ..., xk and is normally distributed with zero mean and variance σ^2: u~Normal(0,σ^2).

Best Linear Unbiased Estimators (BLUE)

OLS estimators have minimum variance among all unbiased estimators of the β's that are linear functions of the Y's.

gauss-markov theorem

OLS is BLUE (best, minimum variance, linear unbiased estimator); assumptions 1-6

errors in variable assumption (measurement error)

OLS is biased and inconsistent because the mismeasured variable is endogenous

The difference between the standard deviation of Y and the SER is

SER measures the deviation of the Y values around the regression line and the standard deviation of Y measures the deviation of the Y values around the sample mean

random sampling

SLR.2

sample variation in the explanatory variable

SLR.3

zero conditional mean

SLR.4

In the simple regression model an unbiased estimator for V ar(u) = σ2, the variance of the population regression errors, is:

SSR/(n−2)

Increasing return to education

Since the change in wage for an extra year of education increases as education increases, the return to education is increasing

Difference between z and t distributions

t distributions are lower and wider, with more kurtosis (fatter tails); the spread of the distribution is larger than in the standard normal (meaning values are more dispersed)

(T/F) A random variable X can only take on the following values: 0 with probability b; 10 with probability 4b; 20 with probability 4b; and 100 with probability b. Therefore b must be equal to 0.1

TRUE. Since probabilities must sum to 1, if X can only take on the specified values it must be the case that b + 4b + 4b + b = 1. Hence b = 0.1.

Residual

The difference between the actual value and the fitted (or predicted) value; there is a residual for each observation in the sample used to obtain an OLS regression line.

Fitted Value

The estimated values of the dependent variable when the values of the independent variables for each observation are plugged into the OLS regression line

Downward Bias

The expected value of an estimator is below the population value of the parameter.

Mean Independent

The key requirement in a simple and multiple regression analysis, which says the unobserved error has a mean (E(u)) that does not change across subsets of the population defined by different values (x) of the explanatory variables.

Assumption 1

The regression model is linear, is correctly specified, and has an additive error term

Gauss-Markov Theorem

The theorem that states that, under the five Gauss-Markov assumptions (for cross-sectional or time series models), the OLS estimator is BLUE (conditional on the sample values of the explanatory variables).

J. True/False: In trying to obtain a model to estimate grades on a statistics test, a professor wanted to include, among other factors, whether the person had taken the course previously. To do this, the professor included a dummy variable in her regression that was equal to 1 if the person had previously taken the course, and 0 otherwise. The interpretation of the coefficient associated with this dummy variable would be the average amount the repeat students tended to be above or below non-repeaters, with all other factors the same.

True

Take an observed (that is, estimated) 95% confidence interval for a parameter of a multiple linear regression. Then:

We cannot assign a probability to the event that the true parameter value lies inside that interval.

homoskedasticity

assumption SLR.5

yi =

b₀ + b₁xi + ei

cross sectional data

data constructed of individual observations taken at a single point in time (individuals, households) (must be more or less independent to draw meaningful inferences)

Omitted Variable

defined as an important explanatory variable that has been left out of a regression equation; its omission causes omitted variable bias.

residual sum of squares

defines sample variation in ^u: sum(yi - ^yi)^2

Explained sum of squares

defines sample variation in ^y: sum(^yi - ~y)^2

Total sum of squares

defines sample variation in y: sum(yi - ~y)^2

Consider the regression model Wage = β0 + β1Female + u Where Female (=1 if female) is an indicator variable and u the error term. Identify the dependent and independent variables in the regression model above. Wage is the __________ variable

dependent

chi squared distribution in stata

display chi2(df, z)

type 2 error

fail to reject a false null

how you plan to identify your parameter of interest (one that will satisfy the conditional mean independence assumption)

identification strategy

Interaction Term

is an independent variable that is a multiple of two or more other independent variables.

The expected value of a discrete random variable

is computed as a weighted average of the possible outcomes of that random variable, where the weights are the probabilities of those outcomes.

The larger the error variance, the ___ is var(^B1)

larger

panel data

the same cross-section of individuals/entities followed over a period of time

The power of the test is

one minus the probability of committing a type II error

unobserved ability

person-specific

A manufacturer claims that his tires last at least 40,000 miles. A test on 25 tires reveals that the mean life of a tire is 39,750 miles, with a standard deviation of 387 miles. Compute the actual value of the t statistic.

t = (39,750 − 40,000) / (387/√25) = −250/77.4 = −3.23.

empirical analysis

uses data to test a theory or to estimate a relationship.

Proportionate Change

(x1 - x0)/x0, where x0 = beginning value and x1 = ending value

Standard Error of the OLS estimators:

• se(^β₀) = √Var(^β₀) • se(^β₁) = √Var(^β₁)

Assumptions for OLS

1. Parameters are linear
2. Random sampling: no bias; sample estimates are unbiasedly targeted toward the population parameters
3. Var(x) > 0: x cannot be constant; it has to have variation and be dispersed
4. E(u) = 0: the expected value of the error is 0
5. Homoskedasticity: equality of error variances across x values

Which of the following is true of confidence intervals?

Confidence intervals are also called interval estimates.

Nonexperimental Data

Data that have not been obtained through a controlled experiment.

Assumption 2

Error term has a zero population mean

E. True/False: It is impossible for R*2 to decrease when you add additional explanatory variables to a linear regression model.

False

F. True/False: In a particular model, the sum of the squared residuals was 847. If the model had 5 independent variables, and the data set contained 40 observations, the value of the standard error of the estimate is 24.2.

False

(T/F) The standard error of the regression is equal to 1-R².

False. SER = √((1/(n-2))Σ(i=1,n)^u²i) and 1-R² = (Σ(i=1,n)^u²i)/(Σ(i=1,n)(Yi-~Y)²), and the two are not equal.

Assumption 4

Observations of error term are uncorrelated with each other

Perfect Multicollinearity

Perfect linear relationship among variables: one or more variables are redundant; holds for all observations; not typical of economic data; usually introduced into a problem by the researcher (Dummy Variable Trap)

lie on top of each other and are indistinguishable. Move along one by a certain amount and you move along the other by the exact same ammount

Perfect multicollinearity, the two lines

Regression analysis

Statistical technique that attempts to explain movements in one variable as a function of movements in a set of other variables

true

T or F: R^2 never decreases when you add variables to the model

critical value

c, rejection of H0 will occur for 5% of all random samples when H0 is true.

unit elastic

level-log

Constant elasticity of demand function

log(y)=B0+B1log(x) B1 is the elasticity

Log-level

log(y) = B0 + B1x + ...; for every 1-unit increase in x, y increases/decreases by (B1 × 100)%

Cross sectional data

one time period multiple entities

economic significance

the economic significance of a variable is determined by the size and sign of its coefficient B

1) variables measured in natural units such as years 2) variables measured in percentage points 3) if variables take on 0 or negative values

when to not take logs

For n = 121, sample mean=96, and a known population standard deviation σX = 14, construct a 95% confidence interval for the population mean.

96 ± 1.96·(14/√121) = 96 ± 2.49, i.e. (93.51, 98.49).

to derive the least squares estimator Uy, you find the estimator m which minimizes

Σ(i=1,n)(Yi − m)²

^β₁~ N(β₁, Var(^β₁)) and ^β₀~ N(β₀, Var(^β₀)) implies that

(^β₀ - ß₀)/SE(^ß₀) ~ N(0, 1) and ( ^β₁ - ß₁)/SE(^ß₁) ~ N(0, 1)

(a) OLS is an estimating procedure that minimizes the sum of errors squared, Σeᵢ²

(a) False, OLS minimizes the sum of squared residuals.

(a) State the null and alternative hypotheses in testing the overall significance of the regression. (b) How is the overall significance of the regression tested? What is its rationale?

(a) Testing the overall significance of the regression refers to testing the hypothesis that none of the independent variables helps to explain the variation of the dependent variable about its mean. Formally, the null hypothesis is H0: β2 = β3 = ... = βK = 0 against the alternative hypothesis H1: not all βi's are 0. (b) The overall significance of the regression is tested by calculating the F ratio of the explained to the unexplained variance. A "high" value for the F statistic suggests a significant relationship between the dependent and independent variables, leading to the rejection of the null hypothesis that the coefficients of all explanatory variables are jointly zero.

t/f An estimator of a parameter is a random variable but the parameter is non-random.

(a) True. We normally assume the parameter to be estimated is some fixed number, although unknown.

(a) What is meant by perfect multicollinearity? What is its effect?

(a) Two or more independent variables are perfectly collinear if one or more of the variables can be expressed as a linear combination of the other variable(s). For example, there is perfect multicollinearity between X1 and X2 if X1=2X2 or X1=5-X2. If two or more explanatory variables are perfectly linearly correlated it will be impossible to calculate OLS estimates of the parameters.

t/f An unbiased estimator of a parameter (theta) means that it will always be equal to (theta)

(b) False. An estimator is unbiased if on average it is equal to the true unknown parameter.

(b) For a given significance level and degrees of freedom, if the computed ItI exceeds the critical t value we should accept the null hypothesis.

(b) False. The null hypothesis should be rejected.

(b) What is meant by high but not perfect multicollinearity? What problems may result?

(b) High but not perfect multicollinearity refers to the case in which two or more independent variables in the regression model are highly correlated. This may make it difficult to isolate the effect that each of the highly collinear explanatory variables has on the dependent variable.

State whether the following statements about heteroskedasticity are true or false. If false - give reasons why. (b) In the presence of heteroskedasticity OLS is an inefficient estimation technique and the t- and F-tests are invalid.

(b) True.

State whether the following statements about heteroskedasticity are true or false. If false - give reasons why. (c) Heteroskedasticity can be detected with a Chow test.

(c) False - The Chow test is used to test for structural change.

(c) The coefficient of correlation r has the same sign as the estimated slope coefficient.

(c) True. The numerator of both involves the covariance between Y and X which can be positive or negative.

t/f An estimator can be BLUE only if its sampling distribution is normal.

(d) False. No probabilistic assumption is required for an estimator to be BLUE.

(d) What can be done to overcome or reduce the problems resulting from multicollinearity?

(d) Serious multicollinearity may sometimes be corrected by (1) extending the size of the sample data, (2) using some prior information about one of the estimates, (3) transforming the functional form, or (4) dropping one of the highly collinear variables (however, this may lead to specification bias, so care must be taken)

State whether the following statements about heteroskedasticity are true or false. If false - give reasons why. (d) Sometimes apparent heteroskedasticity can be caused by a mathematical misspecification of the regression model. This can happen for example, if the dependent variable ought to be logarithmic, but a linear regression is run.

(d) True

explained sum of squares=

Σ(i)(^yi − ~y)²: the sum of squared deviations of the fitted (predicted) values from the sample average

In one sentence discuss what is meant by the following statements: (i) When we say that a time series variable is I(0) we mean ? (ii) When we say that a time series variable is I(1) we mean ?

(i) When we say that a time series variable is I(0) we mean the time series is stationary and does not need to be differenced. (ii) When we say that a time series variable is I(1) we mean the time series is non-stationary and needs to be differenced once to make it stationary.

panel data

(or longitudinal data) set consists of a time series for each cross-sectional member in the data set.

sum of squared deviations from the mean

Σxi² − n(~x)²: the sum of the individually squared x values minus n times the squared mean

standardizing a variable

(value − mean) / std dev. Standardize so you can compare different units. Not a summary statistic: it gives a value for each observation so you can compare across the board, making different values relative to each other.
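
A minimal sketch in Python with numpy (sample values are illustrative):

    import numpy as np
    x = np.array([2.0, 4.0, 6.0, 8.0])
    z = (x - x.mean()) / x.std(ddof=1)  # subtract the mean, divide by the sample std dev
    print(z)                            # one z-score per observation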

Consider the following estimated model (by OLS), where return is the total return of holding a firm's stock during one year, dkr is the firm's debt to capital ratio, eps denotes earnings per share, netinc denotes net income and salary denotes total compensation, in millions of dollars, for the CEO (estimated standard errors of the parameters in parentheses below the estimates). The model was estimated using data on n = 142 firms. return = −12.3 + 0.32 dkr + 0.043eps − 0.005 netinc + 0.0035salary, (6.89) (0.150) (0.078) (0.0047) (0.0022) n = 142, R2 = 0.0395 Which of the following is the 99% confidence interval for the coefficient on dkr?

(−0.0664,0.7064)
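
A minimal check in Python, using the normal critical value for a 99% interval (estimate and standard error from the card):

    from scipy.stats import norm
    b, se = 0.32, 0.150
    z = norm.ppf(0.995)                                # ≈ 2.576
    print(round(b - z * se, 4), round(b + z * se, 4))  # -0.0664 0.7064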

A spark plug manufacturer believes that his plug lasts an average of 30,000 miles, with a standard deviation of 2,500 miles. What is the probability that a given spark plug of this type will last 37,500 miles before replacement? Assume a normal distribution.

0.0013
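
A minimal check in Python (standardize, then take the upper tail of the normal distribution):

    from scipy.stats import norm
    z = (37500 - 30000) / 2500   # = 3.0 standard deviations above the mean
    print(round(norm.sf(z), 4))  # upper-tail probability ≈ 0.0013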

Consider the following estimated model (by OLS), where return is the total return of holding a firm's stock during one year, dkr is the firm's debt to capital ratio, eps denotes earnings per share, netinc denotes net income and salary denotes total compensation, in millions of dollars, for the CEO (estimated standard errors of the parameters in parentheses below the estimates). The model was estimated using data on n = 142 firms. return = −12.3 + 0.32 dkr + 0.043eps − 0.005 netinc + 0.0035salary, (6.89) (0.150) (0.078) (0.0047) (0.0022) n = 142, R2 = 0.0395 What is the correlation between the fitted values and the dependent variable?

0.1987, since the correlation between the fitted values and the dependent variable equals √R² = √0.0395.

Let Z be a standard normal random variable. Find Pr(-0.5 < Z < 0.5).

0.3830, since Pr(−0.5 < Z < 0.5) = 2Φ(0.5) − 1.

The probability of stock A rising is 0.3; and of stock B rising is 0.4. What is the probability that neither of the stocks rise, assuming that these two stocks are independent?

0.42, since by independence Pr(neither rises) = (1 − 0.3)(1 − 0.4).

7 Assumptions for OLS to be BLUE

1) Regression model is linear, correctly specified, and has an additive error term 2) Error term has a zero population mean 3) All explanatory variables are uncorrelated with the error term 4) Observations of the error term are uncorrelated with one another (no serial correlation) 5) No explanatory variable is a perfect linear function of the others (no perfect multicollinearity) 6) Error term has constant variance (no heteroskedasticity) 7) Error term is normally distributed

The Least Squares Assumption in Multiple Regression

1. Conditional distribution of u given X1i, X2i, ... has a mean of zero 2. (X1i, X2i, ..., Yi), i = 1..n, are IID 3. Large outliers are unlikely 4. No perfect multicollinearity - under perfect multicollinearity one of the regressors is a perfect linear function of the other regressors, which makes it impossible to compute OLS (produces division by zero)

OLS assumptions for multiple regression

1. E(ui | X1i = x1i, X2i = x2i, ..., Xki = xki) = 0. In words, the expectation of ui is zero regardless of the values of the k regressors. 2. (X1i, X2i, ..., Xki, Yi) are independently and identically distributed (i.i.d.). This is true with random sampling. 3. (X1i, X2i, ..., Xki, Yi) have finite fourth moments; that is, large outliers are unlikely (this is generally true in economic data). 4. New assumption: the regressors (X1i, X2i, ..., Xki) are not perfectly multicollinear. This means that none of the regressors can be written as a perfect linear function of only the other regressors.

Suppose you believe there is heteroskedasticity proportional to the square of an explanatory variable x and so you divide all the data by x prior to applying OLS. If in fact there is no heteroskedasticity this is undesirable because it causes your OLS estimates to be biased.

1. False. If there is no heteroskedasticity and you divide all the data through by x, you will introduce heteroskedasticity into the equation. In the presence of heteroskedasticity OLS estimates are unbiased.

Steps of hypothesis testing for a parameter of the linear regression model

1. Formulate the null hypothesis (e.g. H0 : βj = 0) 2. Formulate the alternative hypothesis (e.g. H1 : βj ≠ 0) 3. Specify the significance level α (e.g. α = 5%) 4. Calculate the actual value of the decision variable, called the t-statistic. 5. Compute the critical values z(α/2) and z(1−α/2). 6. Decide whether you can or cannot reject the null hypothesis. (See the sketch below.)
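
Steps 4-6 can be checked in software; a minimal sketch with statsmodels on simulated data (all names and numbers illustrative):

    import numpy as np
    import statsmodels.api as sm
    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    y = 1.0 + 2.0 * x + rng.normal(size=100)
    res = sm.OLS(y, sm.add_constant(x)).fit()
    print(res.tvalues)  # t statistics for H0: beta_j = 0
    print(res.pvalues)  # corresponding two-sided p-values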

Two conditions for a valid instrument

1. Instrument relevance: corr(Z,X) ≠ 0. 2. Instrument exogeneity: corr(Z,u) = 0. Thus an instrument that is relevant and exogenous can capture movements in X that are exogenous. This exogenous variation can in turn be used to estimate the population coefficient β1.

unbiasedness of OLS

1. LINEAR IN PARAMETERS: In the population model, the dependent variable, y, is related to the independent variable, x, and the error (or disturbance), u, as y = β0 + β1x + u, where β0 and β1 are the population intercept and slope parameters, respectively. 2. RANDOM SAMPLING: We have a random sample of size n, {(xi, yi): i = 1, 2, ..., n}, following the population model. 3. SAMPLE VARIATION IN THE EXPLANATORY VARIABLE: The sample outcomes on x, namely {xi: i = 1, ..., n}, are not all the same value. 4. ZERO CONDITIONAL MEAN: The error u has an expected value of zero given any value of the explanatory variable; in other words, E(u|x) = 0. 5. HOMOSKEDASTICITY (constant variance): The error u has the same variance given any value of the explanatory variable; in other words, Var(u|x) = σ².

The Expected Value of the OLS Estimators

1. Linear in Parameters- The model in the population can be written as y=B0+B1x1+B2x2+...+Bkxk+u, where B0, B1, ..., Bk are the unknown parameters (constants) of interest and u is an unobserved random error or disturbance term 2. Random Sampling- We have a random sample of n observations, {(xi1, xi2, ..., xik, yi ): i=1, 2, ..., n} 3. No Perfect Collinearity- In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables. 4. Zero Conditional Mean- The error u has an expected value of zero given any values of the independent variables. In other words, E(u|x1, x2, ..., xk)=0.

Six Steps in the Applied Regression Analysis

1. Review the literature and develop the theoretical model 2. Specify the model : select the independent variables and functional form 3. Hypothesize the expected signs of the coefficients 4. Collect, inspect and clean the data 5. Estimate and evaluate the equation 6. Document the results.

There is a simple relationship between the simple-regression estimator ~β1 and the multiple-regression estimator ^β1, which allows for interesting comparisons between simple and multiple regression: ~β1 = ^β1 + ^β2~δ1. The two are equal in two cases:

1. The partial effect of x2 on ^y is zero in the sample; that is, ^β2 = 0. 2. x1 and x2 are uncorrelated in the sample; that is, ~δ1 = 0.

What does the sign of Cov(Xi,ui) depend on?

1. The sign of Cov(Xi,Ai), i.e. whether the omitted variable Ai is positively or negatively correlated with Xi 2. The sign of β₂, i.e. whether the omitted variable Ai positively or negatively affects Yi

The four specification criteria

1. Theory: Is the variable's place in the equation unambiguous and theoretically sound? 2. t-Test: Is the variable's estimated coefficient significant in the expected direction? 3. R2: Does the overall fit of the equation (adjusted for degrees of freedom) improve when the variable is added to the equation? 4. Bias: Do other variables' coefficients change significantly when the variable is added to the equation?

Assuming that x1 and x2 are not uncorrelated, we can draw the following conclusions:

1. When β2 ≠ 0, ~β1 is biased, ^β1 is unbiased, and Var(~β1) < Var(^β1). 2. When β2 = 0, ~β1 and ^β1 are both unbiased, and Var(~β1) < Var(^β1).

5 necessary conditions for CLM to hold

1. errors must have constant variance (CV) 2. errors must be normally distributed (N) 3. errors must be sequentially independent (IE) 4. explanatory variables must be independent of each other (IV) 5. all relevant independent variables must be counted (C) cv ie iv n c

6 main CLM (classical linear model) assumptions

1. linear in parameters y = b0 + b1x1 + b2x2...+ bkxk + u 2. random sampling of population 3. no exact linear relationship in x's 4. conditional expected value of error is 0 e(u | x) = 0 5. variance of error is constant 6. error is independent of x's and normally distributed cov(x, u) = 0 and u is normally distributed

assumptions about variance in error for ols estimator

1. not serially correlated 2. homoskedasticity 3. normally distributed

Functional Form 1. Linear 2. Double Log 3. Semilog 4. polynomial

1. the slope of the relationship between the independent variable and the dependent variable is constant: slopes are constant, elasticities are not 2. the natural log of Y is the dependent variable and the natural log of X is the independent variable: lnY = β0 + β1 lnX1 + β2 lnX2 + e; the elasticities of the model are constant and the slopes are not. 3. is a variant of the double-log equation in which some but not all of the variables (dependent and independent) are expressed in terms of their natural logs: Yi = β0 + β1 lnX1i + β2X2i + ei 4. express Y as a function of independent variables, some of which are raised to powers other than 1. For example, in a second-degree polynomial (also called a quadratic) equation, at least one independent variable is squared: Yi = β0 + β1X1i + β2(X1i)² + β3X2i + ei

Algebraic properties of OLS statistics

1. the sum of the OLS residuals is zero 2. The sample covariance between Xi and OLS residuals is zero 3. The point (¯x,¯y) is always on the regression line

Two sided t-tests

1. two sided tests of whether an estimated coefficient is significantly different from zero 2. Two-sided tests of whether an estimated coefficient is significantly different from a specific nonzero value

Given the following probability distribution: X = 1, 2, 3, 4 with P(X) = 0.2, 0.3, 0.3, 0.2. What is the variance of the random variable X?

1.05
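
A minimal check in Python (distribution taken from the card):

    import numpy as np
    x = np.array([1, 2, 3, 4])
    p = np.array([0.2, 0.3, 0.3, 0.2])
    mu = (x * p).sum()               # E[X] = 2.5
    var = ((x - mu) ** 2 * p).sum()  # E[(X - mu)^2] = 1.05
    print(mu, var)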

Threats to internal validity

1. Non-random assignment 2. Failure to follow random protocol 3. Attrition bias 4. Hawthorne effect 5. Failure to follow intended treatment protocol

what is the estimator of σ^2

(1/n)Σ^u²i = SSR/n, but it is biased because it does not account for two restrictions: Σ^ui = 0 and Σxi^ui = 0. If we know n−2 of the residuals, we can get the other two by using these restrictions, so the unbiased estimator is (1/(n−2))Σ^u²i = SSR/(n−2)

If we wish to test the null hypothesis that B4 =B5=B6 in a model with 30 observations and five explanatory variables other than the intercept term, we will need to compare the F-test statistic for this hypothesis with the critical values of an F-distribution with 2 and 24 degrees of freedom.

10. True. While three parameters are involved, there are only two restrictions. The number of restrictions provides the so-called numerator degrees of freedom; the degrees of freedom in the unrestricted model, N − K, is the denominator and will be 30 − 6 = 24.

2. When Y is a binary variable, using the linear probability model ensures that the predicted probability that Y=1 is between 0 and 1 for all values of X.

2. False. OLS predictions will not necessarily fall between 0 and 1.

4. Multicollinearity causes the values of estimated coefficients to be insensitive to the presence or absence of other variables in the model. ˆ

4. False. Multicollinearity can make it difficult to distinguish the individual effects on the dependent variable of one regressor from that of another regressor and can cause the values of estimated coefficients to be sensitive to the presence or absence of other variables in the model.

to test for the significance of entity fixed effects you should calculate the F statistic and compare it to the critical value from your F distribution where w equals

47

sum of deviations from the mean

values above the mean exactly offset values below it, so the deviations from the mean always sum to zero: Σ(xi − ~x) = Σxi − n~x = 0

Consider the following estimated model (standard errors in parentheses) wage = 235.3923 + 60.87774educ − 2.216635hours, (104.1423) (5.716796) (1.738286) n = 935, R2 = 0.108555, where wage is the wage in euros, educ is the level of education measured in years and hours is the average weekly hours of work. What is the F-statistic for the overall significance of the model.

56.747
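
A minimal check of this number in Python, using the R²-form of the overall F statistic:

    r2, n, k = 0.108555, 935, 2
    F = (r2 / k) / ((1 - r2) / (n - k - 1))  # F statistic for joint significance
    print(round(F, 3))                       # ≈ 56.747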

7. Consider a model where the return to education depends upon the amount of work experience: log(wage) = β0 + β1educ + β2exper + β3(exper·educ), where educ stands for the number of years of education and exper is the number of years of work experience. The appropriate test of the null hypothesis that the return to education does not depend on the level of work experience is H0: β1 = β3

7. False. The appropriate test is H0 : B3=0.

8. Suppose a null hypothesis CAN NOT be rejected at the 13% significance level. Then, based upon this information, we can conclude that the p-value of the test may be equal to 7%.

8. False. The p value would have to be greater than 0.13.

Denote the R² of the unrestricted model by R²(UR) and the R² of the restricted model by R²(R). Let R²(UR) and R²(R) be 0.4366 and 0.4149 respectively. The difference between the unrestricted and the restricted model is that you have imposed two restrictions. The unrestricted model has one intercept and 3 regressors. There are 420 observations. The F-statistic in this case is

8.01
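
A minimal check in Python, using the restricted/unrestricted R² form of the F statistic:

    r2_ur, r2_r, n, k, q = 0.4366, 0.4149, 420, 3, 2
    F = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))  # q = number of restrictions
    print(round(F, 2))                                      # ≈ 8.01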

9. Assuming that the number of regressors is greater than 1 then the adjusted and unadjusted R2s are identical only when the unadjusted R2 is equal to 1.

9. True. Follows from adjusted R² = 1 − [(N − 1)/(N − K)](1 − R²).

Pooled Cross Section

A data configuration where independent cross sections, usually collected at different points in time, are combined to produce a single data set.

Define cross sectional data set...

A data set collected by sampling a population at a given point in time.

Panel Data

A data set constructed from repeated cross sections over time. With a balanced panel, the same units appear in each time period. With an unbalanced panel, some units do not appear in each time period, often due to attrition.

Biased Toward Zero

A description of an estimator whose expectation in absolute value is less than the absolute value of the population parameter.

Ordinary Least Squares (OLS)

A method for estimating the parameters of a simple and multiple linear regression model. The OLS estimates are obtained by minimizing the sum of squared residuals (loss function).

Logarithmic transformations of the dependent variable

A model that gives approximately a constant percentage effect is log(wage) = β0 + β1educ + u; the effect of education on wage then is %Δwage ≈ 100·β1·Δeduc

Simple Linear Regression Model

A model where the dependent variable is a linear function of a single independent variable, plus an error term.

Constant Elasticity model

A model where the elasticity of the dependent variable, with respect to an explanatory variable, is constant; in multiple regression, both variables appear in logarithmic form.

Population Model

A model, especially a multiple linear regression model, that describes a population.

Define panel data set...

A panel data (or longitudinal data) set consists of a time series for each cross-sectional member in the data set. As an example, suppose we have wage, education, and employment history for a set of individuals followed over a ten-year period.

Confidence Interval (CI)

A rule used to construct a random interval so that a certain percentage of all data sets, determined by the confidence level, yields an interval that contains the population value.

Random Sampling

A sampling scheme whereby each observation is drawn at random from the population. In particular, no unit is more likely to be selected than any other unit, and each draw is independent of all other draws.

F Statistic

A statistic used to test multiple hypotheses about the parameters in a multiple regression model.

Micronumerosity

A term introduced by Arthur Goldberger to describe properties of econometric estimators with small sample sizes.

Multicollinearity

A term that refers to correlation among the independent variables in a multiple regression model; it is usually invoked when some correlations are "large," but an actual magnitude is not well defined.

Two-Tailed Test

A test against a two-sided alternative.

Joint Hypotheses Test

A test involving more than one restriction on the parameters in a model.

Multiple Hypotheses Test

A test of a null hypothesis involving more than one restriction on the parameters.

Overall Significance of the Regression

A test of the joint significance of all explanatory variables appearing in a multiple regression equation.

Multiple Regression Analysis

A type of analysis that is used to describe estimation of and inference in the multiple linear regression model.

Lagged Dependent or Independent

Account for dynamics in time series; account for habits or learning

Hypothesis test

Always says something about the population: testing whether something is true of the population or not. Can never prove H0 true, only that it is not true.

One-Sided Alternative

An alternative hypothesis that states that the parameter is greater than (or less than) the value hypothesized under the null.

Two-Sided Alternative

An alternative where the population parameter can be either less than or greater than the value stated under the null hypothesis.

Econometric Model

An equation relating the dependent variable to a set of explanatory variables and unobserved disturbances, where unknown population parameters determine the ceteris paribus effect of each explanatory variable

Endogenous Explanatory Variable

An explanatory variable in multiple regression that is correlated with the error term, either because of an omitted variable, measurement error, or simultaneity.

What is the trade-off when including an extra variable in a regression?

An extra variable could control for omitted variable bias, but it also increases the variance of other estimated coefficients.

Suppose you are interested in investigating the wage gender gap using data on earnings of men and women. Which of the following models best serves this purpose? A. Female = β0 + β1Wage + u where Female (=1 if female) is an indicator variable and u the error term. B. Wage = β0 + β1 Female + u where Female (=1 if female) is an indicator variable and u the error term. C. Wage = β0 + u where u is the error term. D. Male = β0 + β1Female + u where Male (=1 if male) is an indicator variable and u the error term

B

In a multiple linear regression where the Gauss-Markov assumptions MLR.1 through MLR.4 hold, why can you interpret each coefficient as a ceteris paribus effect?

Because the Ordinary Least Squares (OLS) estimator of the coefficient on variable xj is based on the covariance between the dependent variable and the variable xj after the effects of the other regressors have been removed

simplest way to test for heteroskedasticity; regress residuals on our x's

Breusch-Pagan test
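
A minimal sketch of the test using statsmodels' het_breuschpagan on simulated heteroskedastic data (all names and numbers illustrative):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_breuschpagan
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 1 + 2 * x + rng.normal(size=200) * (1 + np.abs(x))  # error variance grows with |x|
    X = sm.add_constant(x)
    res = sm.OLS(y, X).fit()
    lm, lm_pval, fval, f_pval = het_breuschpagan(res.resid, X)
    print(lm_pval)  # a small p-value points to heteroskedasticity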

The 95% confidence interval for β1 is the interval: A. (^β1 − 1.645SE(^β1), ^β1 + 1.645SE(^β1)) B. (β1 − 1.96SE(β1), β1 + 1.96SE(β1)) C. (^β1 − 1.96SE(^β1), ^β1 + 1.96SE(^β1)) D. (^β1 − 1.96, ^β1 + 1.96)

C

T distribution

Centered at zero but has a larger standard deviation (fatter tails) than the standard normal

sample of 100 households and their consumption and income patterns using these observations, you estimate the following regression Ci= B1Yi+ui where C is consumption and Y is disposable income. The estimate of B1 will tell you

Change in consumption/ change in income

Method Three: Maximum Likelihood

Chooses model parameters to maximize the likelihood function: the value that makes the likelihood of the observed data the largest should be chosen. Often it is more convenient to work with the log likelihood function. Using calculus, we find the derivative of the objective function (the log likelihood) with respect to the model parameters and set it equal to zero to find the maximum.
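
A minimal numerical sketch of the idea, assuming a normal likelihood with σ fixed at 1 and a hypothetical sample (the MLE of the mean coincides with the sample average):

    import numpy as np
    from scipy.optimize import minimize_scalar
    data = np.array([1.2, 0.8, 1.5, 0.9, 1.1])            # hypothetical sample
    negloglik = lambda mu: 0.5 * np.sum((data - mu) ** 2)  # -log L up to constants
    print(minimize_scalar(negloglik).x, data.mean())       # the two agree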

Specification

Choosing from the following components: 1) the independent variables and how they should be measured 2) the functional form of the variables 3) the properties of the stochastic error term

Run separate regressions, the new unrestricted SSR is given by the sum of the SSR of these two separate regressions, then just run a regression for the restricted model

Chow test
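
A minimal sketch of the Chow F statistic computed from the pooled and per-group SSRs (all numbers hypothetical):

    ssr_pooled, ssr_1, ssr_2 = 120.0, 50.0, 55.0  # hypothetical SSR values
    k, n = 3, 100                                 # parameters per regression, total observations
    F = ((ssr_pooled - (ssr_1 + ssr_2)) / k) / ((ssr_1 + ssr_2) / (n - 2 * k))
    print(round(F, 2))                            # compare with an F(k, n - 2k) critical value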

A researcher estimates the effect on crime rates of spending on police by using city-level data. Which of the following represents simultaneous causality?

Cities with high crime rates may need a larger police force, and thus more spending. More police spending, in turn, reduces crime.

Gauss-Markov and normality assumptions are collectively referred to as the...

Classical Linear Model (CLM) assumptions

Define Cointegration

Cointegration consists of matching the degree of nonstationarity of the variables in an equation in a way that makes the errors of the equation stationary. Even though individual variables might be nonstationary it's possible for linear combinations of nonstationary variables to be stationary or cointegrated.

Conditional distributions

Continuous "the probability that y=y given that x=x" Probability of the next thing happening is dependent on the first thing happening Free throw shooting (making the first what is the probability of making the second) (Not making the first what is probability of the second)

Using the textbook example of 420 California school districts and the regression of test scores on the student-teacher ratio, you find that the standard error on the slope coefficient is 0.51 when using the heteroskedasticity-robust formula, while it is 0.48 when employing the homoskedasticity-only formula. When calculating the t statistic, the recommended procedure is to: A. use the homoskedasticity-only formula because the t statistic becomes larger. B. first test for homoskedasticity of the errors and then make a decision. C. make a decision depending on how much different the estimate of the slope is under the two procedures. D. use the heteroskedasticity-robust formula.

D

Covariance

Describes the relationship between two variables and the entire population (if they move apart or move together)

Fan Shapes

An undesirable residual pattern indicating heteroskedasticity; the desirable pattern is a constant, even spread around a line with intercept zero and slope zero

Coefficients (B)

Determine the coordinates of the straight line at any point

Drawbacks of correlation

Does not perform well with non-linear data No units Does not do well with skewness (outliers)

Fix Multicollinearity

Drop a redundant variable Increase the sample size Transform variables (use logs/change time series frequency)

X and Y are two random variables. Which of the following statements holds true regardless of whether X and Y are independently distributed?

E(Y ) = E[E(Y |X)]

with i.i.d. sampling each of the following is true except

E(~Y) < E(Y)

The condition for ^β₁to be an unbiased estimator of β₁is

E(^β₁)=β₁

an estimator ~μ of the population value μ is unbiased if

E(~μ) = μ

Zero conditional mean assumption

E(u|x) = E(u) = 0. If u and x are uncorrelated, they are not linearly related. The average value of u does not depend on the value of x.

Expressing zero condition mean and homoskedacity assumption compactly:

E(y|x) = β0 + β1x Var(y|x) = σ^2

An estimator θ^ of the population value θ is unbiased if

E(θ^) = θ

R^2

ESS/TSS -- the coefficient of determination -> measures goodness of fit

Variance formulas

Var(X) = E[(X − µ)²] = E(X²) − µ²

Let Y be a random variable with mean μY. Then Var(Y) equals:

E[(Y − μY)²]

Consider the model: log(price) = β0 + β1score + β2breeder + u, where price is the price of an adult horse, score is the grade given by a jury (higher score means higher quality of the horse) and breeder is the reputation of the horse breeder. The estimated model is: log(price) = 5.84 + 0.21score + 0.13breeder What is the interpretation of the estimated coefficient on score?

Each additional grade point increases the horse's price by approximately 21% (100·^β1 %), on average, ceteris paribus.

Parameter estimators

Each of the three approaches generates our OLS estimators, ^β1 and ^β0

Economic vs Econometric model

An economic model consists of mathematical equations that describe relationships. An econometric model takes an economic model, which does not account for variables that are not directly observed, and includes them in the analysis. The choice of these variables is based on economic theory.

Correlation vs Causality

Economists want causality instead of correlation Shark attacks in summer as well as ice cream consumption. Positive correlation but do not cause one another. Just because things are correlated does not mean they cause each other.

Assumption 7

Error is normally distributed

Assumption 5

Error term has constant variance

Step one of TSLS

Estimate by OLS the first-stage auxiliary regression X = π0 + π1Z + u and obtain the predicted values ^X

In the regression model y=β0 +β1x+β2d+β3(x×d)+u, where x is a continuous variable and d is a dummy variable, to test that the intercept and slope parameters for d = 0 and d = 1 are identical, you must use the

F-statistic for the joint hypothesis that β2 = 0, β3 = 0.

weak instrument

F < 10; Cov(z,x) is small, which will lead to a ^β1 that can be volatile, and its distribution may be very spread out and non-standard

(T/F) If E(Xi) = µ, then W =[((ΣXi)+4)/N] is an unbiased estimator of µ

FALSE. E(W) = E[(ΣXi+4)/N] = 1/N E(ΣXi + 4) = 1/N [E(X₁) + E(X₂) + ... + 4] = 1/N [µ + µ + ... + 4] = (Nµ + 4)/N = µ + 4/N ≠ µ

(T/F) If the true model is Y = β₀ + β₁X₁ + β₂X₂ + ε but you omit X₂ and estimate Y = β₀ + β₁X₁ + ε, your estimate of β₁ will always be biased

FALSE. By omitting a variable that is part of the model we risk obtaining an estimate of β₁ that is biased (omitted variable bias). However, for an omitted variable to cause a bias two conditions must hold: • The first one is that the omitted variable (in this case X₂) causes Y. Since the question says that, according to the model considered, X₂ is one of the variables that explains Y, this condition is likely to hold in this case. • The second condition is that the omitted variable is correlated with the variable whose coefficient may be biased. As long as X₂ is not correlated with X₁, the estimate of β₁ will still be unbiased even if we omit X₂.

(T/F) Assume that H0 : μY = μY,0 and H1 : μY > μY,0, and Y is normally distributed. To compute the critical value for this 1-sided test, we divide by two the positive critical value of the 2-sided test.

FALSE. If we want to conduct this test at the significance level α (where usually α = 0.01, 0.05, or 0.10) we have to look for the critical value Z(1−α) in the corresponding table. If the test is 2-sided we have to look for Z(1−α/2).

(T/F) The higher the standard error of an estimator ^β₁, the more likely it is that you will reject the null hypothesis H0 : β₁ = 0.

FALSE. We divide by the standard error to find the actual value of the t statistic, therefore a higher SE reduces the absolute value of the statistic, thus it becomes less likely that we reject the null.

(T/F) In the following model, Wage = α₀+α₁Educ+α₂Female+α₃Black+α₄Female X Educ+u, to check whether the returns to education are the same for males and females you would have to test a joint hypothesis with an F test.

FALSE. You only have to test one hypothesis on one coefficient: H₀: α₄= 0 (Note: You would have to test a joint hypothesis with an F test if you wanted to check whether average wages are equal for men and women -other things equal. In this case, H₀ : α₂= 0 and α₄= 0.)

Statistically Insignificant

Failure to reject the null hypothesis that a population parameter is equal to zero, at the chosen significance level.

Type II Error

Failure to reject the null hypothesis when it is false.

Jointly Insignificant

Failure to reject, using an F test at a specified significance level, that all coefficients for a group of explanatory variables are zero.

C. True/False: Multiple linear regression is used to model annual income (y) using number of years of education (x1) and number of years employed in current job (x2). If the F-statistic for testing H0 : B1 = B2 = 0 has a p-value equal to 0.001, then we can conclude that both explanatory variables have an effect on annual income.

False

H. True/False: A regression had the following results: SST = 82.55, SSE = 29.85. It can be said that 73.4% of the variation in the dependent variable is explained by the independent variables in the regression.

False

(T/F) To obtain the slope estimator using the least squares principle, we divide the sample covariance of X and Y by the sample variance of Y .

False. Instead, to get the slope estimator we divide the sample covariance of X and Y by the sample variance of X.

T/F If the correlation coefficient between two variables is zero, it means that the two variables are independent.

False in general. Correlation is a measure of linear dependence between two random variables; variables with zero correlation can still be nonlinearly related.

(t/f) When we say that an estimated regression coefficient is statistically significant we mean that it is statistically different from 1.

False. It is statistically different from 0, not 1.

t/f The way to determine whether a group of explanatory variables exerts significant influence on the dependent variable is to see if any of the explanatory variables has a significant t statistic; if not, they are statistically insignificant as a group.

False. Use the F test not individual t tests.

Consider the following estimated model (by OLS), where return is the total return of holding a firm's stock during one year, dkr is the firm's debt to capital ratio, eps denotes earnings per share, netinc denotes net income and salary denotes total compensation, in millions of dollars, for the CEO (estimated standard errors of the 3 parameters in parentheses below the estimates). The model was estimated using data on n = 142 firms. return = −12.3 + 0.32 dkr + 0.043eps − 0.005 netinc + 0.0035salary, (6.89) (0.150) (0.078) (0.0047) (0.0022) n = 142, R2 = 0.0395 What can you say about the estimated coefficients of the variable salary? (con- sider a two-sided alternative for testing significance of the parameters)

For each additional million dollars in the wage of the CEO, return is predicted to increase by 0.0035, on average, ceteris paribus. But it is not statistically significant at the 5% level of significance.

Consider the following estimated model (standard errors in parentheses) wage = 235.3923 + 60.87774educ − 2.216635hours, (104.1423) (5.716796) (1.738286) n = 935, R2 = 0.108555, where wage is the wage in euros, educ is the level of education measured in years and hours is the average weekly hours of work. What can you say about the estimated coefficient of the variable educ? (consider a two-sided alternative for testing significance of the parameters)

For each additional year of education, wage is predicted to increase by 60.88 euros, on average, ceteris paribus. It is also statistically significant at the 5% level of significance.

housing = 164 + .27(income) Interpret

For every additional (marginal) dollar of income earned, $0.27 goes toward housing.

no multicollinearity (model is estimating what it should be)

Gauss-Markov assumption 3

errors are homoscedastic

Gauss-Markov assumption 5

How to fix serial correlation?

Generalized least squares: either the Cochrane-Orcutt method, Prais-Winsten, or Newey-West standard errors
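
A minimal sketch of the Newey-West option, using statsmodels' HAC covariance on simulated data (lag length and data are illustrative):

    import numpy as np
    import statsmodels.api as sm
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 1 + 2 * x + rng.normal(size=200)
    res = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
    print(res.bse)  # Newey-West (HAC) standard errors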

Gauss-Markov theorm

Given the classical assumptions, the OLS estimator of β is the minimum-variance estimator among the set of linear unbiased estimators: BLUE

Test of Joint Hypotheses : Test using F stat

H0: β1 = 0 and β2 = 0 vs. H1: at least one of β1, β2 is not 0

Null hypothesis

H0: βj = 0. Since βj measures the partial effect of xj on (the expected value of) y, after controlling for all other independent variables, this means that, once x1, x2, ..., xj−1, xj+1, ..., xk have been accounted for, xj has no effect on the expected value of y

two-sided alternative

H1: βj ≠ 0. xj has a ceteris paribus effect on y without specifying whether the effect is positive or negative. This is the relevant alternative when the sign of βj is not well determined by theory (or common sense). Even when we know whether βj is positive or negative under the alternative, a two-sided test is often prudent.

if the null hypothesis states H0: E(Y) = μY,0, then a two-sided alternative hypothesis is

H1: E(Y) ≠ μY,0

False

Heteroskedasticity means constant variance

True

Heteroskedasticity occurs primarily with cross-sectional data.

cross-sectional data

Heteroskedasticity typical of

Two stage Least Square Estimator

If the instrument Z satisfies the conditions of instrument relevance and exogeneity, the coefficient β1 can be estimated using the IV estimator in TWO STAGES: 1. Decompose X into two components: a problematic component that may be correlated with the regression error, and another problem-free component uncorrelated with the error. 2. Use the problem-free component to estimate β1.

statistically significant

If H0 is rejected in favor of the model at the 5% level, we usually say that "xj is statistically significant, or statistically different from zero, at the 5% level."

jointly statistically significant

If H0 is rejected, then we say that xk-q+1, ..., xk are jointly statistically significant

Gauss Markov

If OLS assumptions 1-4 hold, then the OLS estimators are unbiased for the population parameters: E(^β0) = β0 and E(^β1) = β1

VIF Rule of Thumb

A VIF greater than 10 indicates the presence of a high degree of multicollinearity

perfect collinearity

If an independent variable is an exact linear combination of the other independent variables, then we say the model suffers from perfect collinearity

Covariance Stationary

If both the mean and variance are finite and constant, and the covariance of the time series with leading or lagged values of itself is constant, then the time series is said to be covariance stationary

we have autocorrelation

If the covariances between error terms are not zero

jointly insignificant

If the null is not rejected, then the variables are jointly insignificant, which often justifies dropping them from the model.

The residual ^ui = yi − ^yi

If ^ui > 0, then ^yi is below yi, which means that, for this observation, yi is underpredicted. If ^ui < 0, then yi < ^yi, and yi is overpredicted. 1. The sample average of the residuals is zero, and so the average of the ^yi equals ~y. 2. The sample covariance between each independent variable and the OLS residuals is zero; consequently, the sample covariance between the OLS fitted values and the OLS residuals is zero. 3. The point (~x1, ~x2, ..., ~xk, ~y) is always on the OLS regression line: ~y = ^β0 + ^β1~x1 + ^β2~x2 + ... + ^βk~xk

In the estimated model log(q)=502.57−0.9log(p)+0.6log(ps)+0.3log(y), where p is the price and q is the demanded quantity of a certain good, ps is the price of a substitute good and y is disposable income, what is the meaning of the coefficient on ps?

It is the cross-price elasticity of demand in relation to the substitute good and it bears the expected sign.

"Suppose that a researcher, using wage data on 235 randomly selected male workers and 263 female workers, estimates the OLS regression: Wage = 11.769 + 1.993 × Male, Rsq= 0.04, SER= 3.9. What does the SER of 3.9 tell us ?"

It measures the average size of the OLS residual (the average mistake made by the OLS regression line) and tells us our average error is 3.9 dollars.

Multiple Regression

A linear regression model with more than one regressor

Heteroskedasticity: Impact on OLS Properties

Linearity: Still Met Unbiasedness: Still Met Minimum Variance: Not Met

Most important non-linear econometric data tool

Logs

Akaike Information Criterion

Look for the model with the smallest AIC

Schwarz's Information Criterion

Look for the model with the smallest Schwarz value

P value

The lowest significance level at which we can reject the null

Formula for margin of error

ME = z·(σ/√n)

Classical Linear Model Assumptions

MLR assumptions 1-6. Including linearity in the parameters, no perfect colinearity, the zero conditional mean assumption, homoskedasticity, no serial correlation, and normality of the errors.

Goodness of fit

Most theoretically logical functional form

Markov Example

Move from being unemployed to employed, back to being unemployed

it is a matter of degree, check for severity

Multicollinearity is not a present/absent problem

Some or all of your t-ratios are individually small (cannot reject individual slopes being zero), but the F-test value is large (rejects all slopes simultaneously zero)

Multicollinearity may be present in your model if

degrees of freedom

N-K-1

Why are the coefficients of probit and logit models estimated by maximum likelihood instead of OLS?

OLS cannot be used because the regression function is not a linear function of the regression coefficients.

unbiased

OLS estimated coefficients centered around the true/population values

gauss-markov theorem

OLS estimator is the best linear unbiased estimator (BLUE) of linear model with average zero error + constant error variance

the variance in our estimator lets us infer the likely accuracy of a given estimate

OLS sampling errors

Increased Importance

Observations with small variances

Conditional probability

One thing depends on the other. (Conditional on walsh being a student, what are the chances he drinks on tuesday)

Rejection region

The range of test-statistic values for which H0 is rejected; its probability when H0 is true is the significance level, so values landing there can lead us to reject a true null (Type I error)

method of least squares

Process of fitting a mathematical function to a set of measured points by minimizing the sum of the squares of the distances from the points to the curve.

when the estimated slope coefficient in the simple regression model b1 is zero, then

R^2=0

R-squared/ coefficient of determination

R^2=SSE/SST=1-SSR/SST When interpreting R^2, we usually multiply it by 100 to change it into a percent: 100xR^2 is the percentage of the sample variation in y that is explained by x.

Homoskedasticity

The random error terms have constant variance across observations

serial correlation

Refers to the situation in which the residual terms are correlated with one another; error terms follow each other, and the estimated standard errors are thus too small

Limitations of t-test

Researchers confuse statistical significance with theoretical validity or empirical importance. Does not say anything about which variables determine the major portion of the variation in the dependent variable.

In the simple regression model an unbiased estimator for V ar(u) = σ2, the variance of the population regression errors, is:

SSR/(n−2).

total sample variation in explanatory variable xj; converges to n * var(xj)

SST (variance)

Which of the following statements is correct?

SST = SSE + SSR

Which of the following statements is correct? a) SST = SSE + SSR b) SSE = SSR + SST c) SSE > SST d) R2 =1−SSE/SST

SST = SSE + SSR

decomposition of total variation

SST = SSE + SSR

Pooled cross sectional

Same as cross-sectional data, except that you pool several cross sections collected at different points in time, with different units sampled each time

Panel data

Same entity multiple periods

Labor economists studying the determinants of women's earnings discovered a puzzling empirical result. Using randomly selected employed women, they regressed earnings on the women's number of children and a set of control variables (age, education, occupation, and so forth). They found that women with more children had higher wages, controlling for these other factors. What is most likely causing this result?

Sample selection bias

Suppose that a state offered voluntary standardized tests to all its third graders and that these data were used in a study of class size on student performance. Which of the following would generate selection bias?

Schools with higher-achieving students could be more likely to volunteer to take the test.

Observational Data

See Nonexperimental Data

Sample Regression Function (SRF)

See OLS regression line

Coefficient of Determination

See R-sqaured

Covariate

See explanatory variable

Independent Variable

See explanatory variable

Regressor

See explanatory variable

Level of significance

Shows the probability of observing a t-value greater than the critical t-value, shows the probability of making type 1 error

just plot residuals against time

Simplest method of detecting autocorrelation

Time Series data

A single individual/entity/variable with data over time (multiple points in time), same units; e.g., statistically tracking unemployment each year

Some Multicollinearity

Some linear relationship Typical of economic data

Imagine that you were told that the t-statistic for the slope coefficient of the regression line Test Score = 698.9 − 2.28 × STR was 4.38. What are the units of measurement for the t-statistic?

Standard deviations

OLS characteristics

Sum of residuals = 0; OLS is the best linear unbiased estimator under the classical assumptions; OLS itself is an estimator, and a given ^β produced by OLS is an estimate

(T/F) Suppose you run a test of the hypothesis H₀ : β₁= 0 against the two-sided alternative H₁: β₁≠ 0. Your t-statistic takes the value |-2.001 | > 1.96. You therefore reject the null at the 10% significance level.

TRUE. According to your t-statistic you'd reject the hypothesis at the 5% significance level, as |−2.001| > 1.96. If you reject at 5%, you necessarily reject at 10% (the lower the significance level, the more difficult it is to reject the null).

(T/F) In the model Y = β₀ + β₁X + u, if Cov(Y,X) > 0 then the estimate ^β₁will be greater than zero.

TRUE. Since ^β₁= sXY/s²X and s²X is always positive, the sign of ^β₁is determined by the sign of sXY

(T/F) Everything else equal, the length of the confidence interval decreases with the sample size n

TRUE. Since the length is proportional to the standard error of the estimator and the standard error decreases with n, the length of the confidence interval decreases with n if everything else stays the same.

Inclusion of an Irrelevant Variable

The inclusion of an explanatory variable that has a zero population parameter when estimating an equation by OLS.

False

The Classical Assumption regarding the variance of the disturbance term is that the variance varies from observation to observation.

False

The Cochrane-Orcutt procedure can be used to correct for heteroskedasticity.

Which of the following statements is true?

The F statistic is always nonnegative as SSRr is never smaller than SSRur.

True

The autocorrelation coefficient can be any number between -1 and +1.

What is meant by the best unbiased or efficient estimator? Why is this important?

The best unbiased or efficient estimator refers to the one with the smallest variance among unbiased estimators. It is the unbiased estimator with the most compact or least spread out distribution. This is very important because the researcher would be more certain that the estimator is closer to the true population parameter being estimated.

Normality Assumption

The classical linear model assumption which states that the error (or dependent variable) has a normal distribution, conditional on the explanatory variables.

Interpretation of the constant

The constant includes the fixed portion of Y that cannot be explained by independent variables

Why does a regression have an error?

The error u arises because of factors, or variables, that influence Y but are not included in the regression function

Assumption 5: Homoskedasticity

The error u has the same variance given any value of the explanatory variables: Var(u|x) = σ². The variance of u, conditional on x, is constant; this implies efficiency properties.

Homoskedasticity

The errors in a regression model have constant variance conditional on the explanatory variables.

Consider the model: log(price) = β0 + β1score + β2breeder + u, where price is the price of an adult horse, score is the grade given by a jury (higher score means higher quality of the horse) and breeder is the reputation of the horse breeder. Because reputation of the breeder is difficult to measure we decided to estimate the model omitting the variable breeder. What bias can you expect in the score coefficient, assuming breeder reputation is positively correlated with score and β2 > 0?

The estimated coefficient of score will be biased upwards.

Jointly Statistically Significant

The null hypothesis that two or more explanatory variables have zero population coefficients is rejected at the chosen significance level.

Homoskedasticity

The pattern of the covariation is constant (the same) around the regression line, whether the values are small, medium, or large.

Which of the following correctly identifies an advantage of using adjusted R2 over R2?

The penalty of adding new independent variables is better understood through adjusted R² than through R².

Semi-elasticity

The percentage change in the dependent variable given a one-unit increase in an independent variable (only dependent variable appears in logarithmic form).

Sampling distribution of ^β

The probability distribution of the ^β values across different samples

First Order Conditions

The set of linear equations used to solve for the OLS estimates.

Total Sum of Squares (SST)

The total sample variation in a dependent variable about its sample average.

A researcher estimates a regression using two different software packages. The first uses the homoskedasticity-only formula for standard errors. The second uses the heteroskedasticity-robust formula. The standard errors are very different. Which should the researcher use?

The heteroskedasticity-robust standard errors should be used.

A researcher investigating the determinants of the demand for public transport in a certain city has the following data for 100 residents for the previous calendar year: expenditure on public transport, E, measured in dollars; number of days worked, W; and number of days not worked, NW. By definition NW is equal to 365 − W. He attempts to fit the following model: E = β1 + β2W + β3NW + e. Explain why he is unable to fit this equation. How might he resolve the problem?

There is exact multicollinearity since there is an exact linear relationship between W, NW and the constant term. As a consequence it is not possible to tell whether variations in E are attributable to variations in W or variations in NW, or both. One way of dealing with the problem would be to drop NW from the regression. The interpretation of b2 now is that it is an estimate of the extra expenditure on transport per day worked, compared with expenditure per day not worked.

inclusion of an irrelevant variable or overspecifying the model

This means that one (or more) of the independent variables is included in the model even though it has no partial effect on y in the population.

Comment on whether the following statement is true or false. If a variable in a model is significant at the 10% level, it is also significant at the 5% level.

This statement is false. It works the other way around. If a variable is significant at the 5% level, it is also significant at the 10% level. This is most easily explained on the basis of the p-value. If the p-value is smaller than 0.05 (5%) we say that a variable is significant at the 5% level. Clearly, if p is smaller than 0.05 it is certainly smaller than 0.10 (10%).

Comment on whether this statement is true or false. "The assumption of homoskedasticity states that the variance of the OLS residuals is constant

This statement is false. The homoskedasticity assumption states that the error terms have a constant variance (independent of the regressors). While some people use the terms 'disturbance' ('error term') and 'residual' interchangeably, this is incorrect. Error terms are unobservables in our model and depend upon the unknown population parameters. Residuals are observable and depend upon the estimates for these parameters. Assumptions are always stated in terms of the error terms, never in terms of the residuals (which result after we have estimated the model).

"High multicollinearity affects standard errors of estimated coefficients and therefore estimates are not efficient" Is this statement valid? If yes, cite what assumptions and properties enable you to agree with this statement. If not, explain why not.

This statement is not valid because high multicollinearity does not affect the assumptions made on the model and hence the properties of unbiasedness and efficiency are unaffected by multicollinearity.

G. True/False: Consider a regression in which b2 = - 1.5 and the standard error of this coefficient equals 0.3. To determine whether X2 is a significant explanatory variable, you would compute an observed t-value of - 5.0.

True

I. True/False: A regression had the following results: SST = 82.55, SSE = 29.85. It can be said that 63.84% of the variation in the dependent variable is explained by the independent variables in the regression.

True

(T/F) The output from the Stata command "regress y x" reports the p-value associated with the test of the null hypothesis that β₁= 0.

True. The p-value associated with the test of the null hypothesis that β₁= 0, is reported in a Stata regression under the column "P > | t |."

(T/F) The t-statistic is calculated by dividing the estimator minus its hypothesized value by the standard error of the estimator.

True. The t-statistic is constructed by taking the estimator and subtracting off the hypothesized value and then dividing that quantity by the standard error of the estimator.

(T/F) When the estimated slope coefficient in the simple regression model, ^β₁is zero, then R² = 0.

True. When ^β₁= 0 then Xi explains none of the variation of Yi, and so the ESS (Explained Sum of Squares) = 0. Thus we have R²= ESS/TSS = 0

(T/F) In the presence of heteroskedasticity, and assuming that the usual least squares assumptions hold, the OLS estimator is unbiased and consistent, but not BLUE.

True. With both homoskedasticity and heteroskedasticity, the OLS estimator is unbiased and consistent, but it requires homoskedasticity to be BLUE.

False

Under heteroskedasticity, the conventionally calculated regression variance estimator, s², is unbiased since it has nothing to do with the disturbance term

Total sum of square

Uses the squared variations of Y around its mean as a measure of the amount of variation to be explained by the regression: TSS = ESS + RSS

The error term is homoskedastic if

Var(u|x) is constant

σ², error variance

Var(u|x)=E(u^2|x)-[E(u|x)]^2

Property one of variance

Var(X) = 0 iff there is a constant c such that P(X = c) = 1, in which case E(X) = c

Omitted Variables

Variables that are in ui i.e. variables that affect Y other than X

Independent or explanatory variables

Variables within the function

Random sampling

We have a random sampling size of n, following the population model y=B0 + B1x + u

What does it mean when you calculate a 95% confidence interval?

Were the procedure you used to construct the confidence interval to be repeated on multiple samples, the calculated confidence interval (which would differ for each sample) would encompass the true parameter 95% of the time.

downward bias.

When E(~β1) < β1, ~β1 has a downward bias.

Exclusion Restriction

Z cannot be part of the true model. This means that Z is correlated with X but has no direct effect on Y. In other words, in the presence of X, Z has no additional explanatory power for Y in the model

z stat or t stat

Use the z-score when the population standard deviation is known; use the t-score when you know only the sample standard deviation

Predicted Value

^Y = b₀+b₁Xi

estimation results are summarized as

^Yi = ^β₀ + ^β₁Xi, with SE(^β₀) and SE(^β₁) reported in parentheses beneath the coefficient estimates

The Central Limit Theorem (CLT) implies that

in large samples, ^βj ~ N(βj, Var(^βj)) approximately, and (^βj-βj)/SE(^βj) ~ N(0,1)

(T/f) The OLS intercept coefficient ^β₀ is equal to the average of the Yi in the sample.

False in general: ^β₀ = ~Y - ^β₁~X, so ^β₀ equals ~Y only when ^β₁~X = 0

Var(^β₁)

^σ²(u)/Σ(i)(Xi-~X)²

interpretation of the slope coefficient in the model ln(Yi) = β0 + β1 ln(Xi)+ ui is as follows:

a 1% change in X is associated with a β1 % change in Y.

smaller coefficients (downward bias)

a bias towards zero means...

The interpretation of the slope coefficient in the model ln(Yi) = β0 + β1Xi + ui is as follows:

a change in X by one unit is associated with a 100 β1 % change in Y.

causality vs correlation

a correlation between two variables does not imply causality

coefficient of determination

a descriptive measure of the strength of the regression relationship, a measure of how well the regression line fits the data

F Test

a formal hypothesis test that is designed to deal with a null hypothesis that contains multiple hypotheses or a single hypothesis about a group of coefficients.

scatter plot

a graph with points plotted to show a possible relationship between two sets of data.

Simple correlation coefficient (r)

a measure of the strength and direction of the linear relationship between two variables (-1 to +1). If r is high in absolute value, multicollinearity is a potential problem; .8 and up is high

error variance

a measure of the variability of the distance between our actual and predicted Y observations (the higher the error variance, the noisier our beta estimates will be)

we believe coefficients are appropriately specified and identified for the sample

a model is internally valid if...

An estimate is

a nonrandom number

assume that for the T=2 time period case you have estimated a simple regression in changes model and found a statistically significant positive intercept

a positive mean change in the LHS variable in the absence of a change in the RHS variable

An estimator is

a random variable.

confidence interval

a range of values so defined that there is a specified probability that the value of a parameter lies within it.

probability distribution

a list of all possible outcomes of a random variable together with their probabilities

specification

a specific version of a more general econometric model

Null Hypothesis

a statement of the values that the researcher does not expect.

robust standard errors

a technique to obtain heteroskedasticity-consistent standard errors of OLS coefficients, valid under heteroscedasticity

Discrete random variable

a variable that takes on only a finite or countably infinite number of values (e.g., flipping a coin: the outcome will be one value or another)

residual (e)=

actual value - estimated value

Ramsey RESET test

add ^Y², ^Y³, and ^Y⁴ (powers of the fitted values); perform an F-test; if the overall fits differ, the model is likely misspecified; checks for omitted variables and specification errors
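
A minimal Stata sketch of the built-in RESET command (auto dataset used purely for illustration):

    sysuse auto, clear
    regress price mpg weight
    estat ovtest         // Ramsey RESET based on powers of the fitted values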

11. We would like to predict sales from the amount of money insurance companies spent on advertising. Which would be the independent variable? a) sales. b) advertising. c) insufficient information to decide.

advertising

changing the unit of measurement of any independent variable, where log of the independent variable appears in the regression

affects only the intercept coefficient

Imperfect multicolinearity

affects the standard errors

slope dummy variables

aka interaction term; allows slope of the relationship between dependent variable and independent variable to be different whether or not the dummy is met

null hypothesis of an F-test for overall significance=

all slope coefficients equal zero simultaneously; the null hypothesis is that the fit of the equation isn't significantly better than that provided by using the mean alone

Ceteris Paribus

all other relevant factors are held fixed

The overall regression F-statistic tests the null hypothesis that

all slope coefficients are zero.

In the Chow test the null hypothesis is:

all the coefficients in a regression model are the same in two separate populations.

panel data

also called longitudinal data

biased estimator

an estimator that comes from a sampling distribution that is not centered around the true value

example of cross sectional data

analyzing the behavior of unemployment rates across US states in March 2006

the adjusted R-squared takes into account the number of variables in a model

and may decrease

degrees of freedom

the number of observations beyond what you need to create a best-fit line for y = b0 + b1x1; with regressors x1, x2, ..., xk, the model estimates k + 1 coefficients, so df = n - (k + 1)

The F statistic is always nonnegative

as SSRr is never smaller than SSRur

errors have zero conditional mean; all variables must be exogenous (more likely to hold in multivariate OLS because fewer things end up in the error term)

assumption MLR.4

homoskedasticity

assumption MLR.5

normality of error terms

assumption MLR.6

random sampling

assumption SLR.2

sample variation in explanatory variable

assumption SLR.3

zero conditional mean

assumption SLR.4

classical linear model (CLM)

assumptions MLR.1-MLR.6

A researcher plans to study the causal effect of police on crime using data from a random sample of U.S. counties. He plans to regress the county's crime rate on the (per capita) size of the county's police force. Which of the following variable(s) is/are likely NOT to be useful to add to the regression to control for important omitted variables? a. The average level of education in the county. b. The number of bowling alleys in the county. c. The fraction of young males in the county population. d. The average income per capita of the county.

b

minimize the sum of squared regression residuals

best fit

Joint probability distribution

both discrete variables

An estimator θ^1 of the population value θ is more efficient when compared to another estimator θ^2, if

both estimators are unbiased, and Var(θ^1) < Var(θ^2).

the regression slope indicates

by how many units the conditional mean of y increases, given a one unit increase in x

Consider the multiple regression model with two regressors X1 and X2, where both variables are determinants of the dependent variable. When omitting X2 from the regression, there will be omitted variable bias for β1: A. only if X2 is a dummy variable. B. if X2 is measured in percentages. C. if X1 and X2 are correlated. D. always.

c

The correlation between X and Y

can be calculated by dividing the covariance between X and Y by the product of the two standard deviations.

logged coefficients

can be interpreted as percent changes for small deviations in the dependent variable; gives "midpoint" percentage changes; logarithmic changes are elasticities

the confidence interval for the sample regression function slope

can be used to conduct a test about a hypothesized population regression function slope

F statistic computed using maximum likelihood estimators

can be used to test joint hypotheses

Binary variables:test

can take on only two values.

all of the following are reasons why economists do not use experimental data more frequently EXCEPT that real-world experiments

cannot be executed in economics

1) misspecification (omitted variables) 2) outliers 3) skewness 4) incorrect data transformation 5) incorrect functional form for the model 6) improved data collection procedures

causes of heteroskedasticity

ordinary least squares

chooses the estimates to minimize the sum of squared residuals. ∑(y-B0-B1x1-B2x2)^2

expected value of a discrete random variable

computed as a weighted average of the possible outcomes of that random variable, where the weights are the probabilities of those outcomes

least squares assumptions

1) the conditional distribution of Ui given Xi has a mean of zero; 2) (Xi, Yi), i = 1, ..., n, are independently and identically distributed; 3) large outliers are unlikely

the explanatory variable must not contain information about the mean of ANY unobserved factors (i.e., about the error term u)

conditional mean independence assumption

1) errors start to look normal 2) in large samples, the t-distribution is close to the N(0,1) distribution (same with confidence intervals and F tests)

consequences of asymptotic normality

estimates will be less precise because the error variance is higher (otherwise, OLS will be unbiased and consistent)

consequences of measurement error in the dependent variable

the probability that the estimate is close to the true population value can be made high by increasing the sample size

consistency

cross-sectional data set

consists of a sample of individuals, households, firms, cities,states, countries, or a variety of other units, taken at a given point in time.

economic model

consists of mathematical equations that describe various relationships. basic premise underlying these models is utility maximization.

time series data

consists of observations on a variable or several variables over time. Examples of time series data include stock prices, money supply, consumer price index, gross domestic product, annual homicide rates, and automobile sales figures. Because past events can influence future events and lags in behavior are prevalent in the social sciences, time is an important dimension in a time series data set.

Log transformations of the dependent and independent variables

constant elasticity model: e.g., dependent variable log(salary), independent variable log(sales). The coefficient on log(sales) is the estimated elasticity of salary with respect to sales; a coefficient of 0.257 implies that a 1% increase in firm sales increases CEO salary by 0.257%.
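
A minimal Stata sketch; salary and sales are hypothetical variable names standing in for a CEO-salary dataset:

    gen lsalary = log(salary)   // log of the dependent variable
    gen lsales = log(sales)     // log of the explanatory variable
    regress lsalary lsales      // the slope estimates the elasticity of salary with respect to sales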

Linear Regression Model

describes the relationship between the two random variables • Yi=β₀+β₁Xi+ui

uses of econometrics

description, hypothesis, forecast

in the study of the effectiveness of cardiac catheterization, using the difference in distance to cardiac catheterization and regular hospitals as the instrument

to determine if the instrument is weak, compute the value of the first-stage F statistic

total sum of squares (SST)

difference between data point (y) and sample average (y bar)

explained sum of squares (SSE)

difference between regression line (y hat) and sample average (y bar)

Suppose that n = 100, and that we want to test at 1% level whether the population mean is equal to 20 versus the alternative that it is not equal to 20. The sample mean is found to be 18 and the sample standard deviation is 10. Your conclusion is:

do not reject the null hypothesis.
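
Worked out: t = (18 - 20)/(10/√100) = -2.0; the two-sided 1% critical value is roughly 2.58 (normal approximation), and |-2.0| < 2.58, so the null is not rejected.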

Interpreting R^2 and adjusted R^2 THEY DONT TELL YOU

do not tell you 1. whether an included variable is statistically significant 2. whether the regressors are the true cause of the movement in the dependent variable 3. whether there is omitted variable bias 4. whether you have chosen the most appropriate set of regressors.

The availability of computer-related leisure activities in the district. If this variable is omitted, it will likely produce a(an) _________ bias of the estimated effect on test scores of increasing the number of computers per student.

downward

If Cov(Xi,ui) < 0, then ^β₁is biased

downwards (negative bias)

ols unbiased

E(b^1) = b1

log-log form

elasticities are constant, slopes are not; one percent increase in X leads to a percent increase in Y equal to the coefficient of the X

sum(yi-y bar)^2

equation for SST

Error vs residual

error is the deviation from the true value, while the residual is the difference between the observed and estimated values

classical assumption 5

error term has constant variance

classical assumption 2

error term has zero population mean

OLS is biased and inconsistent because the mismeasured variable is endogenous

errors-in-variables (measurement error in an explanatory variable)

A survey of earnings contains an unusually high fraction of individuals who state their weekly earnings in​ 100s, such as​ 300, 400,​ 500, etc. This is an example​ of:

errors-in-variables bias.

Standard Error of the regression SER

estimates the standard deviation of the error term u. Thus the SER is a measure of the spread of the distribution of Y around the regression line. SER = √(SSR/(n - k - 1)), where k is the number of slope coefficients.

b₀

estimator of β₀

b₁

estimator of β₁

In general, the t-statistic has the following form:

(estimator - hypothesized value)/standard error of the estimator

causal inference

evaluating whether a change in x will lead to a change in y assuming nothing else changes (ceteris paribus)

to provide quantitative answer to policy question

examine empirical evidence

Omitted variable bias

exists if the omitted variable is correlated with the included regressor and is a determinant of the dependent variable.

variance

expected squared deviation from the mean

proportionality model of heteroskedasticity

expenditure in Rhode Island shows less variability in absolute value than in California, because the error variance is proportional to the size of the observations (the relative percentage variation is similar)

represents variation explained by regression

explained sum of squares

R^2=

explained sum of squares/total sum of squares

stochastic error term

explains all changes in Y not explained by changes in Xs

total sum of squares (SST)

explains total variation of the model, what's explained by the model and what is not

type 2 error

failing to reject a false null hypothesis

Threats to internal validity lead​ to:

failures of one or more of the least squares assumptions.

in practice the most difficult aspect of IV estimation is

finding instruments that are both relevant and exogenous

GLS (generalized least squares)

a fix for heteroskedasticity
WLS (weighted least squares): use when you have enough information to describe why you have this problem
feasible GLS: use when Var(u|x) is nonconstant but of unknown form f(x)
in Stata:
    regress y x [iweight = variable]
    estat hettest
    predict weight, residuals
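
A minimal Stata sketch of testing for heteroskedasticity and of the robust-standard-error fix (auto dataset used purely for illustration):

    sysuse auto, clear
    regress price mpg
    estat hettest                    // Breusch-Pagan test of H0: constant error variance
    regress price mpg, vce(robust)   // heteroskedasticity-robust standard errors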

Durbin-Watson d test

for FIRST-ORDER serial correlation; no lagged dependent variable, includes an intercept; has upper and lower critical values; if d is below the lower critical value, you reject the null; if d is above the upper critical value, you fail to reject the null; in between, the test is inconclusive

how well the explanatory variable explains the dependent variable (if we only know x, how much can we say about y)

goodness of fit

our model explains a lot of the variation in Y but doesn't tell us anything causal

high r-squared only tells us that...

b0 meaning

how much of the dependent variable is fixed; the y-intercept of the line of best fit that minimizes the sum of squared residuals

covariance

how much two random variables change together

R^2

how well the regression line fits the data

Omitted variable without Bias

if the correlation between the omitted variable and the included explanatory variables is 0, or if the coefficient of the omitted variable is 0

Multicollinearity

with imperfect multicollinearity, one regressor is highly (but not perfectly) correlated with the others, which can lead to one or more coefficients being estimated imprecisely

identification (GM assumption 4)

if the zero conditional mean assumption holds, then we may interpret our coefficients on our X variables as causal (a change in X does not systematically cause a change in Y other than through the impact of the coefficient)

fail to reject

if | t-statistic |<critical value, we ____________ the null hypothesis

why don't we use slope-intercept form?

implies causation

finding a small value of the p value

indicates evidence against the null hypothesis

Multivariate Regression Coefficient

indicates the change in the dependent variable associated with a one-unit increase in the independent variable in question

In the regression model y=β0 +β1x+β2d+β3(x×d)+u, where x is a continuous variable and d is a dummy variable, β2

indicates the difference in the intercept when d = 1 compared to the base group.

16. In the regression model y=β0 +β1x+β2d+β3(x×d)+u, where x is a continuous variable and d is a dummy variable, β3

indicates the difference in the slope parameter when d = 1 compared to the base group.

Variance inflation factor

is a method of detecting the severity of multicollinearity by looking at the extent to which a given explanatory variable can be explained by all the other explanatory variables in an equation. The higher the VIF, the more severe the effects of multicollinearity; if VIF > 5, multicollinearity is severe.
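
A minimal Stata sketch (auto dataset used purely for illustration):

    sysuse auto, clear
    regress price mpg weight length
    estat vif            // reports a VIF for each regressor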

Adjusted R^2

a modified version of R² that doesn't always increase when you add another regressor: adjusted R² = 1 - [(n - 1)/(n - k - 1)](SSR/TSS), so adjusted R² is always less than R². It usefully quantifies the extent to which the regressors account for, or explain, the variation in the dependent variable; use it to decide whether to add a regressor.

Confidence Interval

is a range of values that will contain the true value of β a certain percentage of the time

marginal effect on x and y

is constant and equal to B1: Δy = B1·Δx

reject null if p-value

is less than the level of significance (while beta has same sign as HA)

Multiple regression analysis

is more amenable to ceteris paribus analysis because it allows us to explicitly control for many other factors that simultaneously affect the dependent variable. This is important both for testing economic theories and for evaluating policy effects when we must rely on nonexperimental data. Because multiple regression models can accommodate many explanatory variables that may be correlated, we can hope to infer causality in cases where simple regression analysis would be misleading.

A Control Variable in Multiple Regression

is not the object of interest in the study; rather, it is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias

endogenous variable

is one that is correlated with u

exogenous variable

is one that is uncorrelated with u

The population regression line

is the relationship that holds between Y and X on average in the population

The standard error of the estimated coefficient, SE(Beta)

is the square root of the estimated variance of the ^β's; it is similarly affected by the size of the sample and the other factors we've mentioned. For example, an increase in sample size will cause SE(^β) to fall; the larger the sample, the more precise our coefficient estimates will be.

An estimator is unbiased if

its expected value is equal to the true population parameter it is supposed to be estimating

detect serial correlation

regress the residuals on lagged residuals (and the regressors) → Breusch-Godfrey test with a test statistic of NR²
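
A minimal Stata sketch; y, x, and the time variable t are hypothetical names for a time-series dataset:

    tsset t                   // declare the data as a time series
    regress y x
    estat bgodfrey, lags(1)   // Breusch-Godfrey LM test; the statistic is (approximately) N·R²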

Newey west standard errors

larger SEs, because serial correlation is accounted for

Meaning of Linear Regression

linear in the parameters B0 and B1; there are no restrictions on how y and x relate to the original explained and explanatory variables of interest

constant elasticity model

log-log form suggests...

making comparisons across different scales (data that varies in magnitude)

logs are a way of...

When looking at the relationship between 2 variables, it is always good to start by

looking at a scatterplot

impure serial correlation

looks like serial correlation but is caused by some other specification error

in the context of a controlled experiment consider the simple linear regression formulation, let Yi be the outcome Xi the treatment level when the treatment is binary, and u contain all the additional determinants of the outcome. Then calling B^1 a difference estimator:

makes sense since it is the difference between the sample average outcome of the treatment group and the sample average outcome of the control group

The effect that x has on y is a ________ and constant and equal to B1

marginal

p-value

marginal significance value; probability of observing a t-score that size or larger if the null were true; lowest level of significance at which we can reject the null

MSR

mean square regression

minimum variance unbiased estimators

means that OLS has the smallest variance among unbiased estimators; we no longer have to restrict our comparison to estimators that are linear in the yi.

Imperfect multicollinearity:

means that two or more of the regressors are highly correlated

Imperfect Multicollinearity

means two or more of the regressors are highly correlated, in the sense that there is a linear function of the regressors that is highly correlated with another regressor. It does not pose any problems for the theory of OLS, but at least one individual regressor will be imprecisely estimated (larger sampling variance).

level of significance

a measure of the probability of a type I error

this factor (which involves the error variance of a regression of the true value of x1 on the other explanatory variables) will always be between zero and one; implies we are consistently biased towards zero

measurement error inconsistency

covariance

measures how much two random variables vary together; when one is big the other also tends to be big (height and weight of animals)

To obtain OLS estimates of the unknown coefficients β₀, β₁, ... , βk , we

minimize the sum of squared residuals (RSS) with respect to ^β₀, ^β₁, ... , ^βk : ∑(i=1,n) (Yi-b₀-b₁X1i-...-bkXki)²

the OLS estimator is derived by

minimizing the sum of squared residuals

pooled cross sections

multiple units of observation at multiple times, but different units in each period (e.g., the effect of a change in property taxes on house prices)

panel/longitudinal data

multiple units of observation with multiple time observations for each (have both cross-sectional and time series dimensions) (city crime statistics)

linear in the coefficients

must be true to perform linear regression

Bernoulli random variable

mutually exclusive, exhaustive, binary outcomes; an extreme case: if the two probabilities don't add up to 1, there must be other outcomes

instrumental variable

needed when some regressors are endogenous (correlated with the error term); involves finding instruments that are correlated with the endogenous regressors but uncorrelated with the error term

"In a regression, if the p-value for a coefficient is 0.0834, do you reject the null hypothesis of it being equal to zero at the 5% level? "

no

Irrelevant variable effect

no bias but increased variance and decreased adjusted-R^2

serial correlation effects

no bias in coefficients, biased SEs, OLS is no longer the minimum variance estimator

classical assumption 6

no explanatory variable is a perfect linear function of any other explanatory variable

davidson-mackinnon j test

non-nested model specification test; nnest in Stata

What to do about multicollinearity?

nothing, drop a redundant variable (based on theory), or try to center the variables

M

number of constraints, numerator's degrees of freedom for an F-test

time series data

observations of a variable or several variables over time; typical features include trends and seasonality and serially correlated (stock prices, GDP)

classical assumption 4

observations of the error term are uncorrelated

serially correlated

observations that occur before and after each other tend to be similar

T-distribution

obtained from a standard normal distribution

The true causal effect might not be the same in the population studied and the population of interest​ because

differences in characteristics of the population, geographical differences, or the study being out of date.

high correlation

OLS estimators are still unbiased, but the parameters are estimated with lower precision when regressors are correlated

lagged dependent variables as proxies

omitted unobserved factors may be proxied by the value of the dependent variable from an earlier time period

Consider the multiple regression model with two regressors X1 and X2, where both variables are determinants of the dependent variable. You regress Y on X1 only and find no relationship, However when regressing Y on X1 and X2 the slope coefficient changes by a large amount, the first regression suffers from

omitted variable bias

The power of the test is

one minus the probability of committing a type II error.

The significance level of a test is:

the probability of rejecting the null hypothesis when it is true.

proxy variables

one way of dealing with omitted variables (but imperfect); think of running them as something more like a specification test, or a test for the influence of possible omitted variable bias (worried about nonrandom sampling)

Linear in the Coefficients

only if the coefficients (the βs) appear in their simplest form—they are not raised to any powers (other than one), are not multiplied or divided by other coefficients, and do not themselves include some sort of function (like logs or exponents).

Degrees of Freedom

the excess of the number of observations (N) over the number of coefficients (including the intercept) estimated (K + 1). Higher DF = more reliable estimates

OLS

ordinary least squares: the best-fit line that minimizes the sum of squared residuals, i.e., the squared differences between observed and predicted values

exogenous

originating outside the system, ie determined by processes unrelated to the question at hand (randomly assigned treatments and their outcomes)

endogenous

originating within the system, ie co-influential or jointly determined (education and earnings)

dummy variable trap is an example of

perfect multicollinearity

unobserved household environment

person-specific

unobserved motivation

person-specific

the best way to interpret polynomial regressions is to:

plot the estimated regression function and to calculate the estimated effect on Y associated with a change in X for one or more values of X

Linear in Variables

plotting the function in terms of X and Y generates a straight line.

counterfactual (randomized experiments)

potential outcome of each individual as opposed to the other (if it were not the case that the individual received treatment, they would have been in control)

significance level

probability of rejecting the null when it is in fact true

quasi experiments

provide a bridge between the econometric analysis of observational data sets and the statistical ideal of a true randomized controlled experiment

confidence interval (CI)

provide a range of likely values for the population parameter, not just a point estimate: β = ^β ± c·SE(^β)

multiple restrictions

putting more than one restriction on the parameters in the model

numerator degrees of freedom

q=dfr-dfur

econometrics

quantitative measurement of actual economic and business phenomena

stochastic variables

random variables

estimate of the treatment effect; yielded by the regression of Y on indicator variable Z

randomized experiments

they create groups that on average are virtually identical to each other (able to attribute difference in groups to treatment)

randomized experiments are the gold standard for answering causal questions because...

excluding a relevant variable or underspecifying the model

rather than including an irrelevant variable, we omit a variable that actually belongs in the true (or population) model.

biased toward zero

refers to cases where E(B ̃1) is closer to zero than is B1. Therefore, if B1 is positive, then B ̃1 is biased toward zero if it has a downward bias. On the other hand, if B1<0, then B ̃1 is biased toward zero if it has an upward bias.

White test for heteroskedasticity

regress the squared residuals on all explanatory variables, their squares, and their interactions; detects more general deviations from homoskedasticity than the BP test
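
A minimal Stata sketch (auto dataset used purely for illustration):

    sysuse auto, clear
    regress price mpg weight
    estat imtest, white   // White's general test for heteroskedasticity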

OLS

regression estimation technique that calculates beta hats so as to minimize the sum of the squared residuals

classical assumption 1

regression is linear, specified correctly, and has an additive error term

dummy/indicator variable

regression of Y on an indicator variable for treatment X which takes on the value 1 when treatment occurred and 0 otherwise; 1 if person is a woman, 0 if person is a man

Type 1 error

reject a true null

A predicted value of a dependent variable:

represents the expected value of the dependent variable given particular values for the explanatory variables.

represents variation not explained by regression

residual sum of squares

deviations from regression line

residuals

why partialling out works

residuals from the first regression are the part of the explanatory variable that is uncorrelated with the other explanatory variables (slope coefficient from 2nd regression represents isolated effect of explanatory variable on dep. variable)

restricted vs. unrestricted model

restricted: log(salary) = B0 + B1·years + B2·gamesyr + u
unrestricted: log(salary) = B0 + B1·years + B2·gamesyr + B3·bavg + B4·hrunsyr + B5·rbisyr + u
The restricted model always has fewer parameters than the unrestricted model.

omitted variable bias

results in a misestimated coefficient on the included variable, which is trying to capture the roles of both variables "underspecifying the model"

Generalized least squares

rids an equation of first-order serial correlation problems and makes it a minimum-variance equation again; the resulting standard errors are valid, so confidence intervals are accurately sized (typically wider than the biased OLS ones)

the larger the variability of the unobserved factors (bad variation)

sampling variability of the estimated regression coefficients will be higher...

the higher the variation in the explanatory variable (good variation)

sampling variability of the estimated regression coefficients will be lower...

standard deviation of B

sd(B)=σ/[SST(1-R^2)]^(1/2)

When testing joint hypotheses, you should

use the F-statistic and conclude that at least one of the restrictions does not hold if the statistic exceeds the critical value.
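
A minimal Stata sketch; y, x1, x2, and x3 are hypothetical variable names:

    regress y x1 x2 x3
    test x2 x3   // F-test of H0: the coefficients on x2 and x3 are jointly zero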

elastic demand

sensitive to changes in price and income

multivariate regression coefficient

serve to isolate the impact on Y of a change in one variable from the impact on Y of changes in the other variables

one of your friends is using data on individuals to study the determinants of smoking at your university; she is concerned with estimating marginal effects on the probability of smoking at the extremes. What should she use?

she should use the logit or probit, but not the linear probability model
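
A minimal Stata sketch; smokes (a binary outcome) and age are hypothetical variable names for such a dataset:

    logit smokes age     // or: probit smokes age
    margins, dydx(age)   // average marginal effect on the probability of smoking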

variance inflation factors

show how much the variance of a coefficient estimate is inflated by the addition of another (correlated) variable (above 5 is a concern)

how could you determine whether this instrument the difference in distance to cardiac catheterization and regular hospitals is exogenous

since there is one endogenous regressor and one instrument, the J test can't be used to test the exogeneity of the instrument; expert judgment is required to assess the exogeneity

we must make a case for using proxies and arguing that they do not threaten any inferences we make (especially causal ones)

since we can never observe our unobservables...

third moment of standardized variable

skewness (+ right, - left)

summary of functional forms involving logs

slide 18 notes 2-5

standard error of estimated model

small error compared to the mean indicates that unexplained results/errors are small compared to reality

Nonlinear least​ squares

solves the minimization of the sum of squared predictive mistakes through sophisticated mathematical routines, essentially by trial-and-error methods.

example of randomized controlled experiment

some 5th graders in a specific elementary school are allowed to use computers at school, while others are not, and their end-of-year performance is compared holding other factors constant

variance vs std deviation

squaring the std deviation emphasizes dispersion and draws attention to things that are unusual

install function in stata

ssc install ____

sample variance of y

SST/(n-1), i.e., sum(yi-¯y)²/(n-1)

estimated standard deviations of the regression coefficients; measure how precisely the regression coefficients are estimated

standard errors

when you add state fixed effects to a simple regression model for U.S states over a certain time period and the regression R^2 increases significantly then it is safe to assume that

state fixed effects account for a large amount of the variation in the data

Ordinary least squares

sum of the squared vertical distances between the data points and the estimated regression line (the RSS)

summarize in stata

summarize

Reject the null hypothesis if

t-stat falls outside of the critical values, or if p-value ≤ α, or if the value of β₁ under H₀ falls outside the confidence interval

The rejection rule for the one-sided alternative H₁: β₁ < 0

t<-c

t statistic

t=B^/se(B^)

Dummy Variable

takes on the value of one or zero (and only those values) depending on whether a specified condition is met, e.g., Male = 1, Female = 0

reduced form estimation

testing something close to the assumptions of the model. more potential sources for error

The cumulative probability distribution shows the probability

that a random variable is less than or equal to a particular value.

in the case of the simple regression model Y=Bo+B1Xi+ui, i=1 when X and u are correlated then

the OLS estimator is inconsistent

In testing multiple exclusion restrictions in the multiple regression model under the classical assumptions, we are more likely to reject the null that some coefficients are zero if:

the R-squared of the unrestricted model is large relative to the R-squared of the restricted model.
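
Stated in R² form (an equivalent way to write the F statistic): F = [(R²ur - R²r)/q] / [(1 - R²ur)/(n - k - 1)], which is large exactly when the unrestricted R² is large relative to the restricted R².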

to decide whether Y=B0+B1X+u or ln(Y)=B0+B1X+u fits the data better you cant consult the regression because

the TSS is not measured in the same units in the two models

In the probit​ regression, the coefficient beta 1 ​indicates:

the change in the z-value associated with a unit change in X.

Econometrics

the development of statistical methods for estimating economic relationships, testing economic theories, and evaluating and implementing government and business policy. The most common application of econometrics is the forecasting of such important macroeconomic variables as interest rates, inflation rates, and gross domestic product.

The Student t distribution is:

the distribution of the ratio of a standard normal random variable, divided by the square root of an independently distributed chi−squared random variable with m degrees of freedom divided by m.

Each slope coefficient βj, in the multiple regression, measures

the effect of a one unit change in the corresponding regressor Xji , holding all else (e.g. the other regressors) constant.

why is the variance of the error term the same as the variance of our dependent variable y?

conditional on x, the systematic part β0 + β1x is nonrandom, so all of the conditional variation in y comes from the error term: Var(y|x) = Var(u|x) = σ²

MLR.6

the error term is independent of the explanatory variables x1, x2, x3,....xk and is normally distributed with mean zero and variance o^2

classical assumption 7

the error term is normally distributed

A type I error is

the error you make when rejecting the null hypothesis when it is true

Internal validity is​ that:

the estimator of the causal effect should be unbiased and consistent.

The regression R2 is a measure of

the goodness of fit of your regression line.

15. Which of the following is not correct in a regression model containing an interaction term between two independent variables, x1 and x2:

the interaction term coefficient is the effect of a unit increase in √x₁x₂.

When Xi is a binary variable

the interpretation of the estimated coefficients ^β₀ and ^β₁is different. ^β₁measures the difference in ^Yi between Xi=0 and Xi=1

Take an observed (that is, estimated) 95% confidence interval for a parameter of a multiple linear regression. If you increase the confidence level to 99% then, necessarily:

the length of the confidence interval increases.

Total Sum of Squares

the sum, over all observations, of the squared differences of each observation from the overall mean.

Explained Sum of Squares (ESS)

the total variation of the fitted Y values around their average (i.e. the variation that is explained by the regression): • ESS = Σ(i)(^Yi-~Y)²

reason why estimators have a sampling distribution

the values of explanatory variable and the error term differ across samples

MLR.5 homoskedasticity

the variance of ui is the same for all xi and all i

the larger F is, the larger SSR restricted relative to SSR unrestricted

the worse the explanatory power of the restricted model, implying H0 is false

2. The availability of computerized adaptive learning tools in the district. If this variable is omitted, it will likely produce a(an) ___________ bias of the estimated effect on test scores of increasing the number of computers per student.

upward

MSE=

variance + bias^2 (the lower the better!)

homoskedasticity

variance does not change for different observations of the error term

1) total sample variation 2) linear relationships between x variables 3) error variance

what determines the size of our standard errors?

coefficient of determination (R^2)

of all the variation there is to explain, what share does your model explain?

R-squared can never decrease

when another independent variable is added to a regression

Perfect multicollinearity is

when one of the regressors is an exact linear function of the other regressors.

omitted variable bias (OVB)

when our model has omitted an important variable that is correlated with one or more of our included (x) variables, causing biased estimators

inverse functional form

when the impact of the independent variable on Y approaches zero as that variable approaches infinity

Level-log

y = b0 + b1·log(x): for every 1% increase in x, y increases/decreases by b1/100 UNITS.

"In a regression, if the confidence interval for a coefficient is (1.83, 2.76), do you reject the null hypothesis of the coefficient being equal to zero at the 5% level?"

yes

In the simple regression model y = β0 + β1x + u, the simple average of the OLS residuals is

zero

two-tailed test

|t| > c, where c is chosen to make the area in each tail of the t distribution equal 2.5%. In other words, c is the 97.5th percentile in the t distribution with n-k-1 degrees of freedom. When n-k-1 = 25, the 5% critical value for a two-sided test is c = 2.060.

The estimates Bˆ1 and Bˆ2 have partial effect, or ceteris paribus

Δy = B1Δx1 + B2Δx2; holding x2 fixed (Δx2 = 0), Δy = B1Δx1; holding x1 fixed, Δy = B2Δx2. In general, Δy = B1Δx1 + B2Δx2 + ... + BkΔxk, and holding all regressors but x1 fixed, Δy = B1Δx1.

Assume that Y is normally distributed N(μ, σ²). To find Pr(c1 ≤ Y ≤ c2), where c1 < c2 and di = (ci - μ)/σ, you need to calculate Pr(d1 ≤ Z ≤ d2) =

Φ(d2 ) − Φ(d1 )

Properties of OLS

• The sample regression function obtained through OLS always passes through the sample mean values of X and Y . • ~û = (Σ(i) ûi)/n = 0 (mean value of residuals is zero) • Σ(i) ûiXi = 0 (^ui and Xi are uncorrelated) • Given the OLS assumptions and homoscedastic errors, the OLS estimators have minimum variance among all unbiased estimators of the β's that are linear functions of the Y 's. They are Best Linear Unbiased Estimators (BLUE).

We can decompose each Yi value into the fitted (or predicted) part given Xi and the residual part, which we called ^ui :

• Yi = ^Yi + ^ui
• Yi - ~Y = (^Yi - ~Y) + ^ui
• (Yi - ~Y) = (^Yi - ~Y) + (Yi - ^Yi)
• Σ(i)(Yi - ~Y)² = Σ(i)(^Yi - ~Y)² + Σ(i)(Yi - ^Yi)²
• TSS = ESS + RSS

Root Mean Squared Error (RMSE)

• ^σu = √^σ²u = √Σ(i)û²i/(n-2) = √Σ(i)(Yi-^Yi)²/(n-2) • Also called the standard error of the regression (SER) • is a measure of the deviation of the Y values around the regression line

OLS estimators of β₀ and β₁

• denoted ^β₀ and ^β₁ • The estimators that minimize the sum of squared residuals ∑(i=1,n) (Yi-b₀-b₁Xi)²

There a 5 basic steps to a standard hypothesis test. What are these?

• null and alternative hypotheses • test statistic (with distribution under the null) • significance level and critical value • decision rule • inference and conclusion

