POS 3713: Final Exam

What does rMSE tell us?

"On average, our model is off by ..."

Should you use Logit with Small Sample Sizes?

No.

Is Leverage always bad?

No. It depends on the direction.

In Multiple Regression, instead of drawing a line through points, we draw a....

Plane through the points in multiple dimensions.

With Models with Interaction Terms, we can still find differences between all of our Binary Categories by carefully...

Plugging in 0 and 1 in our Model where appropriate. (...): Several Examples in Notes
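A minimal sketch of "plugging in 0 and 1", assuming a hypothetical model Y = a + B1*X1 + B2*X2 + B3*(X1*X2) with two Binary (X)'s. The coefficient values are invented for illustration and are not taken from the notes.

```python
# Hypothetical coefficients (assumed for illustration only).
a, b1, b2, b3 = 10.0, 2.0, 3.0, 1.5

def predict(x1, x2):
    """Predicted Y for the model Y = a + b1*x1 + b2*x2 + b3*(x1*x2)."""
    return a + b1 * x1 + b2 * x2 + b3 * x1 * x2

# Plug in every combination of 0 and 1 to get each group's predicted value.
predictions = {(x1, x2): predict(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}

for group, y_hat in predictions.items():
    print(group, y_hat)

# The effect of X1 is b1 when X2 = 0, and b1 + b3 when X2 = 1.
effect_x1_when_x2_is_0 = predictions[(1, 0)] - predictions[(0, 0)]
effect_x1_when_x2_is_1 = predictions[(1, 1)] - predictions[(0, 1)]
```

Comparing any two of the four predicted values gives the difference between those two categories.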

We can use Residuals in order to test...

Goodness of Fit.

How do we phrase the effect of (X) on (Y) for a Binary (X)?

"(X) have B more/less (Y) than not-(X)." "Compared to not-(X), (X) have B more/less (Y)." (STN): not-(X) is the Reference Category.

How are B and a estimates calculated in Bivariate Regression?

(...): Look in Notes
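As a hedged sketch, the standard least-squares formulas for the bivariate case are B = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2) and a = ybar - B*xbar. The notation in the course notes may differ, and the data and the function name `fit_bivariate` below are made up for illustration.

```python
from statistics import mean

def fit_bivariate(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Co-variation of x and y around their means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Variation of x around its mean.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_bivariate(x, y)
print(round(a, 3), round(b, 3))
```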

(...): Know the Following Formulas/Models -Population and Sample Regression Model -Regression Coefficient -Standardized Coefficients -Quick and Easy Approximation

(...): See Formulas/Models in Notes

What are 2 Problems of the Linear Probability Model?

1. Impossible Probabilities 2. Straight Line not fitting the Data
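The first problem can be seen with a small sketch. The coefficients below are hypothetical, and the same values are reused for both functions only to keep the comparison simple (a fitted Logit model would have its own coefficients). The straight line produces "probabilities" below 0 and above 1, while the logistic curve stays between 0 and 1.

```python
import math

# Hypothetical coefficients (assumed for illustration only).
a, b = 0.1, 0.2

def lpm_probability(x):
    """Linear Probability Model: a straight line, so it is unbounded."""
    return a + b * x

def logit_probability(x):
    """Logistic curve: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

for x in (-2, 0, 2, 6):
    print(x, round(lpm_probability(x), 3), round(logit_probability(x), 3))
```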

What is the Range of R-Squared?

0-1.

What are the Critical Values at the .05 level in Logit Models for 1 and 2-Tailed Tests?

1-Tailed: 1.645 2-Tailed: 1.960

You may use OLS only if...(2)

1. (Y) is Continuous and Unbounded 2. (Y) is Normally Distributed

How do we calculate RSS (3)?

1. Calculate your Model's Residuals 2. Square them 3. Add them up
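The three steps can be sketched as follows. The observed and fitted values are hypothetical numbers chosen for illustration.

```python
# Hypothetical observed values and fitted values from some model.
y_observed = [2.0, 4.0, 5.0, 4.0, 5.0]
y_predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

# Step 1: residual = observed - predicted.
residuals = [y - y_hat for y, y_hat in zip(y_observed, y_predicted)]

# Step 2: square each residual.
squared_residuals = [u ** 2 for u in residuals]

# Step 3: add them up.
rss = sum(squared_residuals)
print(round(rss, 3))
```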

What are the 5 Ordinary Least Squares Assumptions?

1. Linear in the Parameters (a and B) (Y = a + Bx + u) 2. Random Sample (so we can estimate the Coefficients of our (X)'s) 3. Variation in (X) (obvious) 4. Zero Conditional Mean (our estimates are Conditional on (X), and we have no Confounders (Z)) 5. Homoscedasticity (the errors have constant variance; more broadly, we assume all errors are Independent and Identically Distributed, and for Time Series Data we assume there is no Autocorrelation) (...): See notes for more clarity

What are the 2 Measures of Model Fit?

1. Root Mean Squared Error (rMSE) 2. R2 (R^2) (R-Squared)

What are the 3 Ways in which we can test to see which (X) has the largest effect on (Y)?/Largest Substantive Significance.

1. Standardized Coefficients 2. See the Change in Fitted Values/Predictions of (Y) between the 25th and 75th percentile of (X) 3. Quick and Easy Approximation (OO)
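For the first approach, a common textbook definition of a standardized coefficient is B * sd(X) / sd(Y), which expresses the effect in standard deviations. This is a sketch under that assumption; the course notes may define it differently. The data, the slope value, and the function name are hypothetical.

```python
from statistics import stdev

def standardized_coefficient(b, x, y):
    """Rescale b to 'standard deviations of y per standard deviation of x'."""
    return b * stdev(x) / stdev(y)

# Hypothetical data and an unstandardized slope assumed to come from a fitted model.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b = 0.6

b_std = standardized_coefficient(b, x, y)
print(round(b_std, 4))
```

Because standardized coefficients share the same units, they can be compared across (X)'s measured on different scales.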

How do we calculate TSS (3)?

1. Subtract the mean of (Y) from all of the observed values of (Y) 2. Square them 3. Add them up
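The same three steps in a short sketch, using hypothetical (Y) values:

```python
from statistics import mean

# Hypothetical observed values of Y.
y_observed = [2.0, 4.0, 5.0, 4.0, 5.0]
y_bar = mean(y_observed)

# Step 1: subtract the mean of Y from each observed value.
deviations = [y - y_bar for y in y_observed]

# Step 2: square the deviations.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: add them up.
tss = sum(squared_deviations)
print(tss)
```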

What are the 4 ways in which we can test for Statistical Significance in our B estimate?

1. T-Test. Find the t-ratio ((B-B*)/se(B)), then compare it to the corresponding Critical Value. If the absolute value of the t-ratio is greater than the Critical Value, then your estimate is statistically significant. 2. If your B is at least twice as big as your Standard Error, then your B should be statistically significant. 3. If your Confidence Interval includes 0, then your B is not Statistically Significant. 4. Look at the p-value for your B. If it's less than .05, then your B is Statistically Significant. (STN): These Tests are also dependent on whether you have a One or Two-Tailed Hypothesis.
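A rough sketch of the first three checks, assuming a two-tailed test at the .05 level with a large sample, so the critical value is taken to be 1.96 (with small samples the t critical value depends on the degrees of freedom). The estimate and standard error are hypothetical.

```python
def t_ratio(b, se, b_null=0.0):
    """t-ratio = (B - B*) / se(B), where B* is the value under the null."""
    return (b - b_null) / se

def is_significant(b, se, critical_value=1.96, b_null=0.0):
    """Two-tailed check: compare |t| to the critical value."""
    return abs(t_ratio(b, se, b_null)) > critical_value

# Hypothetical estimate and standard error.
b, se = 0.6, 0.25

t = t_ratio(b, se)
twice_rule = abs(b) >= 2 * se           # quick rule of thumb
lower, upper = b - 1.96 * se, b + 1.96 * se
ci_excludes_zero = not (lower <= 0.0 <= upper)

print(round(t, 2), is_significant(b, se), twice_rule, ci_excludes_zero)
```

With these numbers all three checks agree, which is what we expect since they are different views of the same comparison.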

What is the importance/purpose of Binary/Dummy Variables in Regression (3)?

1. They can help you overcome the Linearity Assumption of Ordinal Variables. 2. You must use them to Measure Categorical Variables in Regression. 3. You can only compare cases to the Reference Category. (OO)

What are the 2 Main Reasons for Using Multiple Regression?

1. We may have multiple Hypotheses 2. We can control for (Z)'s

What is Influence?

A case that is both an Outlier and has Leverage is said to "Influence" the Regression Line. It affects the Constant (a) and Slope (B).

What is an Outlier?

A case with an unusual (Y) given its (X) value.

With Logit, we use ___ to perform Hypothesis Tests.

B (coefficient estimate). (STN): The substantive meaning of B is different in each type of regression model.

Regression does a better job of Predicting (Y) values than Predicting the Mean. Why?

Because Regression gives you Coefficients, which estimate how much of an effect (X) has on (Y), not just the strength of the relationship as a Difference of Means test would.

Why shouldn't you use OLS with a binary (Y)?

Because it will violate the assumptions of OLS (Linearity/Normality).

How do we phrase the effect of (X) on (Y) for a Continuous (X)?

Bivariate: "A 1 unit change in (X) leads to a B unit change in (Y)." Multiple: " A 1 unit change in (X1) leads to a B1 unit change in (Y), holding constant the effects of all other (Xi)."

Regression only gives you correlations, not...

Cause.

We typically use rMSE to...

Compare Models. (The one with the smaller rMSE is usually better.)

Regression gives us ___________ Expectations.

Conditional.

How do we calculate ESS?

ESS = TSS - RSS

Regression is a form of _________ Model.

Empirical.

Including (Z) in our model allows us to...

Examine the Effect of (X) on (Y) while holding (Z) constant

R-Squared is very important if we are trying to Predict, but not so much for....

Hypothesis Testing.

When is Multicollinearity a problem?

When our coefficient estimates are not statistically significant (Multicollinearity inflates the Standard Errors).

Why do we need a Reference Category for Binary (X)'s?

In order to estimate B, the variables cannot be Perfectly Collinear/Related. If we don't leave one variable out as the Reference, then they'd be Perfectly Related.

When the # of Variables increases, R-Squared will...

Increase (or at least never decrease).

How do we select a Reference Category?

It should be driven by the interesting comparisons, but really it's just arbitrary.

Z and T distributions are the same with _____ sample sizes.

Large.

What is Ordinary Least Squares (OLS) Regression? What is the Process (3)?

Minimizing the Sum of the Squared Errors. 1. Start with a Scatterplot 2. Draw the line that Minimizes the Sum of Squared Errors 3. Predict the Average Value of (Y) for each value of (X)

Interactions introduce....

Multicollinearity (STN): We may also have (Z)'s and/or Sample Problems (OO)

a and B are the ________ of our Regression Models.

Parameters.

In Multiple Regression, each B tells us the ______ effect of each (X) on (Y).

Partial.

What are the 2 Most Common Types of Probability Models for a Binary (Y)?

Probit and Logit.

Logit and Probit are Examples of ______ Models with Binary (Y)'s.

Proper.

What is the Formula for R-Squared?

R-Squared = ESS/TSS
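A small sketch that puts the pieces together, using ESS = TSS - RSS and R-Squared = ESS/TSS. The observed and fitted values are hypothetical.

```python
from statistics import mean

def r_squared(y_observed, y_predicted):
    """R-Squared = ESS / TSS, with ESS computed as TSS - RSS."""
    y_bar = mean(y_observed)
    tss = sum((y - y_bar) ** 2 for y in y_observed)
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(y_observed, y_predicted))
    ess = tss - rss
    return ess / tss

# Hypothetical observed values and fitted values from some model.
y_observed = [2.0, 4.0, 5.0, 4.0, 5.0]
y_predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(y_observed, y_predicted)
print(round(r2, 3))
```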

What does a Statistically Significant Effect for one of the Dummy Variables tell us?

That that category is different from its Reference Category. (STN): Dummy Variable is another name for Binary Variables in a Regression Model.

What does "k" mean/signify?

The # of parameters.

What is ESS?

The Explained Sum of Squares. The Deviations from the mean predicted by our model.

Marginal Effect has the same meaning as...

The Partial Effect/Coefficient.

What is the Essence of Regression?

The Regression Line for Y on X estimates the Average Value of Y corresponding to Each Value of X. "What is the predicted value for Y given specific info/values for X?"

What is RSS?

The Residual Sum of Squares. The Total Deviations that our Model Does Not Explain.

What is TSS?

The Total Sum of Squares. The Total Deviations from the Mean.

B1 tells us the.. B2 tells us the...

The effect of (X1) on (Y) that cannot be explained by (X2). The effect of (X2) on (Y) that cannot be explained by/controlling for (X1)

What is Multicollinearity?

The correlation between our (X)'s. (OO)

What is a Residual (ui)?

The difference between the actual value of (Y) and the predicted value of (Y) from our empirical model. Formula: Residual (ui) = Y (observed) - Yhat (predicted)

How do Models control for (Z)?

The formula for each B does not use all of the variation in (X) and (Y). Just that which can't be explained by some other variable in our model. (...): See Triple Venn Diagram Example in Notes.

What is rMSE?

The measure of the Typical Deviations from the Regression Line.
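A hedged sketch of the calculation: rMSE is the square root of the average squared residual. Texts differ on the denominator (n, or n minus the number of parameters k), so the function below takes `k` as an optional argument; which version the course uses is an assumption to check against the notes. The data are hypothetical.

```python
import math

def rmse(y_observed, y_predicted, k=0):
    """Root mean squared error; divides RSS by (n - k). Use k=0 to divide by n."""
    n = len(y_observed)
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(y_observed, y_predicted))
    return math.sqrt(rss / (n - k))

# Hypothetical observed values and fitted values from a bivariate model.
y_observed = [2.0, 4.0, 5.0, 4.0, 5.0]
y_predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

rmse_n = rmse(y_observed, y_predicted)         # divide by n
rmse_df = rmse(y_observed, y_predicted, k=2)   # divide by n - k (a and B)
print(round(rmse_n, 3), round(rmse_df, 3))
```

Either way, the result is in the units of (Y), which is why it can be read as "on average, our model is off by ...".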

The closer the R-Squared is to 1...

The more of the Variation our Model explains and the better it is at predicting.

How many Binary Variables do you include in the Model?

The number of categories in the variable minus one.
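A minimal sketch of building the Binary Variables by hand, leaving out the Reference Category. The variable, its categories, and the helper name `make_dummies` are hypothetical.

```python
def make_dummies(values, reference):
    """Build one 0/1 variable per category, excluding the reference category."""
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference must be one of the observed categories")
    kept = [c for c in categories if c != reference]
    return {c: [1 if v == c else 0 for v in values] for c in kept}

# Hypothetical categorical variable with 3 categories.
party = ["Dem", "Rep", "Ind", "Rep", "Dem"]

dummies = make_dummies(party, reference="Ind")
for name, column in dummies.items():
    print(name, column)
```

Three categories yield two Binary Variables; a case with 0 on both belongs to the Reference Category.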

What is R-Squared?

The proportion of the variance in (Y) that our model explains.

TRUE or FALSE: When using Dummy Variables, we can only make inferences about our Reference Category and not necessarily other Dummy Variables included in the model.

True.

TRUE or FALSE: We can Transform the Typical OLS Model to estimate models that take into account the Distribution of (Y). Give an Example of this.

True. Logistic Regression AKA Logit.

Which form of Uncertainty in our sample parameter estimates is usually more important?

Uncertainty around B (our slope estimate).

How do you calculate Predicted Values with Several (X)'s?

Vary 1 (X) and hold the others at some constant. (STN): "Some Constant" is typically the Mean.
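A sketch of this procedure for a model with two (X)'s, holding (X2) at its mean while (X1) varies. The coefficients and data are hypothetical.

```python
from statistics import mean

# Hypothetical coefficients for Y = a + b1*X1 + b2*X2.
a, b1, b2 = 1.0, 0.5, 2.0

# Hypothetical sample values of X2, used only to find its mean.
x2 = [1.0, 2.0, 3.0, 6.0]
x2_mean = mean(x2)

def predict(x1_value, x2_value):
    """Predicted Y for given values of X1 and X2."""
    return a + b1 * x1_value + b2 * x2_value

# Vary X1 from a low value to a high value while holding X2 at its mean.
predicted_low = predict(10.0, x2_mean)
predicted_high = predict(40.0, x2_mean)
print(predicted_low, predicted_high, predicted_high - predicted_low)
```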

How do you Interpret a Linear Probability Model?

We have a Binary (Y), but the model is the same as before. "A one unit change in (X) results in a B unit change in the Probability that (Y)=1."

What is the importance of Goodness of Fit?

We want to know how well the model predicts/explains the Dependent Variable.

What is Leverage?

When a case has an unusual (X) value.

When do we use Probit/Logit Models?

When we have a Continuous (X) and a Categorical (Y).

When do you use Interaction Terms within a model?

When you believe that the effect of one (X) on (Y) depends on the value of another (X).

Show the model for Bivariate Regression.

Y = a + Bx, where Y = Dependent Variable; X = Independent Variable; B = Coefficient for X on Y; a = Constant (the value of Y when X = 0)

What is the Multiple Regression Model?

Y = a + B1x1 + B2x2 + B3x3 + .... + Bnxn + u

What happens if you do use OLS with a Binary (Y)?

You will be estimating a Linear Probability Model.

With Logit, we use a _-test, instead of a t-test for Statistical Significance.

Z-Test.

