Quant2 Final

Stochastic Error Term

A term added to a regression equation to introduce all the variation in Y that cannot be explained by the included X's.

The F test statistic in a one-way ANOVA is

A) MSB/MSW.
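
A minimal Python sketch (not from the course materials; the group data are made up) showing how MSB and MSW are computed and that scipy.stats.f_oneway reports the same F statistic:

    import numpy as np
    from scipy import stats

    # three hypothetical treatment groups
    groups = [np.array([4.0, 5.0, 6.0, 5.5]),
              np.array([6.5, 7.0, 8.0, 7.5]),
              np.array([5.0, 5.5, 6.5, 6.0])]

    n = sum(len(g) for g in groups)            # total observations
    c = len(groups)                            # number of groups
    grand_mean = np.concatenate(groups).mean()

    ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between-group SS
    ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within-group SS

    msb = ssb / (c - 1)    # mean square between, df = c - 1
    msw = ssw / (n - c)    # mean square within,  df = n - c
    print("F = MSB/MSW =", msb / msw)
    print(stats.f_oneway(*groups))             # same F, plus the p-value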

Specification of the Model

Choose independent variables, functional form, and stochastic error term

Which of the following is true for a two-factor ANOVA model:

D) A and B above are true

The Y-intercept (b0) in the expression

D) predicted value of Y when X = 0.

The p-value measures

D) the lowest possible level of significance for which the null hypothesis would still be rejected.

Critical Value

Divides the acceptance region from the rejection region

Gauss-Markov Theorem

Given the Classical Assumptions, the OLS estimator is the minimum variance estimator from all linear unbiased estimators

Consequence of Omitting a Relevant Variable

Leads to biased estimates of other variables

Four sources of "specification error"

Omitted variables, Measurement Error, Different Functional Form, Nature of Random Component

Meaning of Regression Coefficient

The impact of a one-unit increase in X1 on the dependent variable Y, holding all other independent variables constant.
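
A minimal Python sketch (illustrative only; the data are simulated and the true betas are chosen by hand) showing this "holding all else constant" interpretation with statsmodels:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 200
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    y = 3.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)

    X = sm.add_constant(np.column_stack([x1, x2]))   # adds the intercept column
    results = sm.OLS(y, X).fit()
    # each slope estimate is the change in Y for a one-unit increase in that X,
    # holding the other X constant; they should land near 2.0 and -1.0
    print(results.params)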

T/F A completely randomized ANOVA design with 4 groups would have 12 possible pairwise mean comparisons.

f (with 4 groups there are only 4(4-1)/2 = 6 pairwise mean comparisons)

T/F Data that exhibit an autocorrelation effect violate the regression assumption of homoscedasticity, or constant variance.

f (autocorrelation violates the assumption of independent errors, not constant variance)

T/F The coefficient of determination is computed as the ratio of SSE to SST.

f (the coefficient of determination is SSR/SST, not SSE/SST)

T/F If the null hypothesis is true in one-way ANOVA, then both MSB and MSW provide unbiased estimates of the population variance of the variable under investigation.

t

T/F The confidence interval for the mean of Y given X in regression analysis is always narrower than the prediction interval for an individual response Y given the same data set, X value, and confidence level.

t

The least squares method minimizes which of the following?

C) SSE

Interaction between main effects in an experimental design can be tested in

C) a two-factor model.

A regression diagnostic tool used to study the possible effects of collinearity is

C) the VIF.
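
A minimal sketch (assumed setup, simulated data) of computing VIFs with statsmodels; x2 is built to be collinear with x1 so its VIF comes out large:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=100)
    x2 = 0.9 * x1 + rng.normal(scale=0.3, size=100)   # deliberately collinear with x1
    x3 = rng.normal(size=100)

    X = sm.add_constant(np.column_stack([x1, x2, x3]))
    # VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other X's
    for j in range(1, X.shape[1]):                    # skip the constant column
        print("VIF for x%d: %.2f" % (j, variance_inflation_factor(X, j)))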

The coefficient of determination (r^2) tells us

C) the proportion of total variation (SST) that is explained (SSR).

In a multiple regression model, which of the following is correct regarding the value of the adjusted r^2?

E) All of the above.

In a multiple regression problem involving two independent variables, if b1 is computed to be +2.0, it means that

D) the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant.

What do we need to predict direction of change for individual variables?

Knowledge of economic theory and general characteristics of how explanatory variables relate to the dependent variable under consideration

Consequence of Including an Irrelevant Variable

Leads to higher variances of estimated coefficients

Level of Significance

Level of Type 1 Error

R Squared Formula

The ratio SSR/SST: the explained sum of squares divided by the total sum of squares
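
A minimal numpy sketch (simulated data) confirming the decomposition SST = SSR + SSE and that r^2 = SSR/SST:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(size=50)
    y = 1.0 + 0.8 * x + rng.normal(scale=0.5, size=50)

    b1, b0 = np.polyfit(x, y, 1)            # simple OLS fit (slope, intercept)
    y_hat = b0 + b1 * x

    sst = ((y - y.mean()) ** 2).sum()       # total sum of squares
    ssr = ((y_hat - y.mean()) ** 2).sum()   # explained (regression) sum of squares
    sse = ((y - y_hat) ** 2).sum()          # residual sum of squares

    print("SST = SSR + SSE:", np.isclose(sst, ssr + sse))
    print("R^2 = SSR/SST =", ssr / sst)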

T/F Collinearity among explanatory variables in multiple regression analysis increases the standard errors of the beta coefficient estimates, thereby increasing the likelihood of committing a Type II error.

t

T/F Even though multicollinearity and autocorrelation are violations of major assumptions of multiple regression analysis, our simulations reveal that the expected values of the coefficient estimates are still equal to the true population values.

t

T/F If there are four dummy variables for some phenomenon such as occupation group, and all four are included in the regression, then the constant term must be excluded.

t
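
A minimal pandas sketch (the occupation labels are made up) of the usual way to avoid this dummy-variable trap: with a constant term in the model, keep c - 1 dummies for c categories by dropping one:

    import pandas as pd

    occ = pd.Series(["clerical", "managerial", "technical", "sales",
                     "clerical", "sales", "technical", "managerial"])

    # drop_first=True keeps 3 dummies for 4 categories; the omitted category
    # is absorbed by the constant term, avoiding perfect multicollinearity
    dummies = pd.get_dummies(occ, drop_first=True, dtype=int)
    print(dummies.head())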

T/F Regression analysis is used for prediction and estimation of the impact and explanatory power of an independent variable on a dependent variable, while correlation analysis is used to measure the strength of the linear relationship between two numerical variables and implies no causal relationship.

t

Sum of Residuals in OLS

zero
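
A quick check (simulated data, illustrative only) that OLS residuals sum to zero whenever an intercept is included:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x = rng.normal(size=30)
    y = 2.0 + 1.5 * x + rng.normal(size=30)

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print(res.resid.sum())   # ~0 up to floating-point error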

Degrees of Freedom, Multiple Regression

n - k - 1, where n is the number of observations and k is the number of independent variables (one degree of freedom is lost for each estimated coefficient, including the intercept)

Standard Error of Beta Coefficient

A measure of sampling variation (standard deviation) of the slope term estimates of the population parameters. Dividing this value into the beta coefficient estimate yields a t-ratio for comparison with the null that the true Beta equals zero
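
A minimal statsmodels sketch (simulated data) showing that dividing each coefficient estimate by its standard error reproduces the reported t-ratios:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    X = sm.add_constant(rng.normal(size=(100, 2)))
    y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=100)

    res = sm.OLS(y, X).fit()
    print(res.params / res.bse)   # coefficient estimate / standard error
    print(res.tvalues)            # statsmodels' t-ratios -- the same numbers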

Durbin-Watson Statistic

A measure of the extent of serial correlation. A value of 2 indicates no evidence of serial correlation.
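
A minimal sketch (simulated AR(1) errors; parameter values chosen arbitrarily) of computing the Durbin-Watson statistic on regression residuals with statsmodels:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(5)
    n = 200
    e = np.zeros(n)
    for t in range(1, n):                 # AR(1) errors: e_t = 0.7*e_(t-1) + u_t
        e[t] = 0.7 * e[t - 1] + rng.normal()
    x = rng.normal(size=n)
    y = 1.0 + 0.5 * x + e

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print(durbin_watson(res.resid))       # well below 2 => positive serial correlation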

Null Hypothesis

A numeric contention or requirement for which the researcher seeks to determine whether the statistical evidence is supportive

Confidence Interval

A range containing the true value of an item a specific percentage of the time.

The degrees of freedom for the F test in a one-way ANOVA for the numerator and denominator, respectively, are

A) (c - 1) and (n - c).

High levels of intercorrelation among independent variables in a multiple regression model

A) can result in a problem called multicollinearity

In a one-way ANOVA

A) there is no interaction term.

Why would you use the Levene procedure?

A) to test for homogeneity of variance
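
A minimal scipy sketch (the sample values are made up) of Levene's test for homogeneity of variance across groups:

    from scipy import stats

    g1 = [23, 25, 28, 30, 27]
    g2 = [31, 35, 40, 29, 36]
    g3 = [22, 24, 26, 23, 25]

    stat, p = stats.levene(g1, g2, g3)
    print(stat, p)   # a large p-value gives no evidence against equal variances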

Type II Error

Accepting a false null hypothesis. Also known as a beta error, and can be calculated only in reference to specific values of the alternative hypothesis

Two Sided Test

Alternative hypothesis is given for two sides of the null (null=0)

One Sided Test

Alternative hypothesis is only given for one side of the null, a more "powerful" test than a two sided test, meaning lower probability of Type II error

Residual Sum of Squares, a.k.a SSE for Sum of Squared Errors

Amount of squared deviation that is unexplained by the regression line: the sum of the squared differences between the actual value of Y and the value predicted by the estimated regression equation, that is Sum(Y - Yhat)^2

Explained Sum of Squares, a.k.a SSR for Sum of Squares Regression

Amount of the squared deviation of the predicted value of Y as determined by the estimated regression equation from the mean of Y, that is Sum(Yhat - Ybar)^2

In a two-way ANOVA the degrees of freedom for the interaction term is

B) (r - 1)(c - 1).

If the Durbin-Watson statistic has a value close to 0, which assumption is violated?

B) independence of errors

An interaction term in a multiple regression model may be used when

B) the relationship between X1 and Y changes for differing values of X2.

The standard error of the estimate is a measure of

B) the variation of the dependent variable around the sample regression line.

Logarithm transformations can be used in regression analysis

B) to change a nonlinear model into a linear model.

Based on the residual plot (not reproduced here), you will conclude that there might be a violation of which of the following assumptions?

C) homoscedasticity

OLS is BLUE

Best Linear Unbiased Estimator, meaning that no other unbiased linear estimator has a lower variance than the least-squares measures

Estimated Regression Coefficients

Beta hats: empirical best guesses of the true population coefficients, obtained from a sample

Regression Coefficient Elasticity Measure

Slope-term multiplied by the ratio of X to Y (typically evaluated at their means). The transformed slope-term then reads as the percentage change in Y per one percent change in X.

Serial Correlation a.k.a. Autocorrelation

Correlation of the error terms, typically first order correlation where the error for time period t is correlated with the error from the prior time period, t-1. Tends to reduce standard errors, meaning that we are more likely to say that a variable matters when it doesn't (Type I Error)

Which of the following will generally lead to a model exhibiting a "better fit?"

D) All of the above tend to improve the "goodness of fit".

Clearly among all of the statistical techniques that we have studied, the most powerful and useful, because it is capable of incorporating other statistical models, is

D) Multiple regression analysis

In a one-way ANOVA, if the computed F statistic exceeds the critical F value we may

D) reject H0 since there is evidence that some of the means differ, and thus evidence of a treatment effect.

Why would you use the Tukey-Kramer procedure?

D) to test for differences in pairwise means
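
A minimal statsmodels sketch (simulated group data) of the Tukey-Kramer style pairwise mean comparisons that follow a significant ANOVA:

    import numpy as np
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    rng = np.random.default_rng(6)
    values = np.concatenate([rng.normal(10.0, 2, 20),
                             rng.normal(12.0, 2, 20),
                             rng.normal(10.5, 2, 20)])
    groups = np.repeat(["A", "B", "C"], 20)

    # one row per pairwise comparison of group means, with adjusted intervals
    print(pairwise_tukeyhsd(values, groups, alpha=0.05))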

If the F statistic yields a p-value for a multiple regression model that is statistically significant, which of the following is true?

D) The regression coefficient for at least one independent variable is statistically significantly different from zero (β1 ≠ 0; or β2 ≠ 0; or ... or βk ≠ 0)

Cross-Sectional

Data set that includes entries from the same time period but different economic entities (countries, for instance)

Time Series Data

Data set that is ordered by time, typically generating a serial correlation (autocorrelation) violation of the randomness of the error term

Three purposes of econometrics

Describe reality, Test hypotheses, Predict the future

Critical T Value

Determined by degrees of freedom and the level of significance

Residual, e

Difference between dependent variable's actual value and the estimated value of the dependent variable from the regression results

Sampling Distribution

Distribution of different values of B Hat across different samples

If the correlation coefficient (r) = 1.00 for two variables, then

E) A and D above.

In a simple two-variable linear regression model with

E) All of the above are true about the simple two-variable model.

The width of the prediction interval for the predicted value of Y is dependent on

E) All of the above.

Signs that multicollinearity among explanatory variables may be a problem are indicated by

E) All the above are indicators of multicollinearity problems.

Correlation of the error terms in a multiple regression model

E) All the above are true

Multiple Regression Analysis

E) All the above are true of multiple regression analysis.

If a categorical independent variable contains 5 distinct types, such as a Student Home Residence factor (OKC region, Tulsa Region, Other OK, Texas, Other), then when the model contains a constant term, ________ dummy variable(s) will be needed to uniquely represent these categories.

C) 4

If one categorical independent variable contains 4 types and a second categorical independent variable contains two types, then when the model contains a constant term, ________ dummy variable(s) will be needed to uniquely represent these categories.

C) 4

If the residuals in a regression analysis of time ordered data are not correlated, the value of the Durbin-Watson D statistic should be near ________.

B) 2.0

Omitted Variable

Important explanatory variable has been left out

Multicollinearity

Intercorrelation of explanatory variables that can inflate the standard errors of the coefficient estimates, thereby leading one to say that a variable does not matter when it actually does (Type II error)

Classical Assumptions

Linear, Zero Population Mean, Explanatory Variables Uncorrelated with Error, Error Term is Uncorrelated with Itself, Error has Constant Variance, No Perfect Multicollinearity

Multivariate Regression Coefficient

Measures change in the dependent variable associated with a one unit increase in the independent variable, holding all other independent variables constant.

R Squared Meaning

Measures the percentage of the variation of Y around the mean of Y that is explained by the regression equation.

Simple Correlation Coefficient, r

Measures the strength and direction of a linear relationship between two variables

Dummy variable

Only takes on values 0 and 1

Econometrics standard tool

Ordinary Least Squares or OLS: Single-equation linear regression analysis

OLS or "Ordinary Least-Squares"

Regression technique which minimizes the sum of squared residuals
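
A minimal numpy sketch (simulated data) of the closed-form OLS solution beta_hat = (X'X)^(-1) X'y, the choice of coefficients that minimizes the sum of squared residuals:

    import numpy as np

    rng = np.random.default_rng(7)
    n = 100
    X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
    y = X @ np.array([2.0, 1.0, -0.5]) + rng.normal(size=n)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # solves the normal equations
    residuals = y - X @ beta_hat
    print(beta_hat)
    print((residuals ** 2).sum())   # the minimized sum of squared residuals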

Type I Error

Rejecting a true null hypothesis. Also known as the alpha error, as determined by the level of significance

Unbiased Estimator

Sampling distribution has its expected value equal to the true value of B.

Normalized Beta Coefficient

Slope-term multiplied by the ratio of the standard deviation of the independent variable to the standard deviation of the dependent variable. Transformed slope-term then reads as the standard deviation change in Y per one standard deviation change in X.
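
A minimal numpy sketch (simulated data) of the rescaling described above; for a single predictor the standardized slope equals the correlation coefficient r:

    import numpy as np

    rng = np.random.default_rng(8)
    x = rng.normal(loc=50, scale=10, size=200)
    y = 5.0 + 0.3 * x + rng.normal(scale=4, size=200)

    b1, b0 = np.polyfit(x, y, 1)                     # raw slope and intercept
    beta_std = b1 * x.std(ddof=1) / y.std(ddof=1)    # slope in standard-deviation units
    print(beta_std)
    print(np.corrcoef(x, y)[0, 1])                   # matches r in the one-predictor case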

Standard Error of the Estimate

Square-root of the Mean Squared Error (MSE), a measure of the average deviation of error terms about the regression line.

Alternative Hypothesis

Statement in opposition to the null; the researcher seeks to determine whether the statistical evidence is sufficient to call the null hypothesis into question. The "research question" that statistical evidence seeks to confirm

Econometrics

Statistical measurement of economic phenomena to determine independent influence of explanatory variables on a specified dependent variable

Regression Analysis

Statistical technique to "explain" movements in one variable as a function of movements in another

Total Sum of Squares, a.k.a. SST for Sum of Squares Total

Sum of squared variations of Y around its mean

Testing for the existence of statistically significant correlation between two variables is equivalent to

B) testing that the slope (β1) differs from zero in a two variable regression.

Adjusted R Squared Formula

1 - MSE/MST, where MSE is the Mean Squared Error (variance of the error term) and MST is the estimated variance of the dependent variable, SST/(n - 1)

Adjusted R Squared Meaning

The R Squared that has been adjusted for Degrees of Freedom lost, since adding an independent variable to the original R Squared will likely increase it, even if the fit isn't necessarily better.

The strength of the linear relationship between two numerical variables may be measured by the

B) coefficient of correlation.

Mean Squared Error

Variance of the Regression, or SSE/(n-k-1) where SSE is "sum of squared errors," n is # of observations, k is # of independent variables.
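
A minimal statsmodels sketch (simulated data) computing MSE = SSE/(n - k - 1) by hand and checking it against the package's mse_resid; its square root is the standard error of the estimate:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(9)
    n, k = 80, 2
    X = sm.add_constant(rng.normal(size=(n, k)))
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

    res = sm.OLS(y, X).fit()
    sse = (res.resid ** 2).sum()
    mse = sse / (n - k - 1)          # variance of the regression
    print(np.sqrt(mse))              # standard error of the estimate
    print(np.sqrt(res.mse_resid))    # statsmodels' equivalent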

What do we mean when we say that a simple linear regression model is "statistically" useful?

B) The model is a better predictor of Y than the sample mean.

Correlation of the resulting error terms with any of the explanatory variables and the dependent variable.

Zero; that is, the regression results yield estimated errors that are uncorrelated with any of the variables used in the regression model

