BA with RRRRR

Standard Significance Level

.1, .05, .01, .001

using R to adjust for heteroscedasticity

1. White estimator for robustness 2. tiny g and tiny w??

R linear model output order of terms

1. Estimate 2. Std. Error 3. t value 4. Pr(>|t|) (the p-value)
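
A minimal sketch of where that column order shows up, assuming R's built-in mtcars data and an arbitrary choice of predictors (neither comes from the course):

```r
# Fit a linear model on the built-in mtcars data (illustrative choice only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# The coefficient table has one row per term and four columns,
# in the same order as the list above.
coef_table <- summary(fit)$coefficients
print(colnames(coef_table))
print(coef_table)
```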

model selection

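1. r^2 -- always goes up when you add variables, so it cannot flag overfitting 2. adjusted r^2 -- penalizes added variables, but is still a weak basis for selection 3. t-testing -- repeated tests compound Type I error 4. AIC or BIC -- the preferred ways to select your model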

where does endogeneity come from?

1. sample selection bias 2. omitted variable bias 3. measurement error bias 4. reverse regression bias 5. simultaneity bias

Covariance

A measure of linear association between two variables. Positive values indicate a positive relationship; negative values indicate a negative relationship

Correlation

A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other. Between -1 and 1

normal random variable

A random variable whose probability distribution defines a standard bell-shaped curve

heteroscedasticity

The condition in which the variance of the error term (and so of y) is not equal across the values of x -- very problematic because the usual standard errors are wrong, which messes up the hypothesis test results and confidence intervals

interaction variable

a variable z whose different values are associated with differences in the pattern and/or strength of the relationship between x and y. The interaction term shows the differences (if any) between the slopes of the regression line of y on x at different levels of z

independence

a critical assumption that is made in statistics all the time -- assume that rows are independent of each other. This means the joint probability separates into the product of the individual probabilities

r squared

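the coefficient of determination; the share of the variation in y explained by the model (between 0 and 1). In simple regression it equals the squared correlation between x and y, so it measures the strength of the linear relationship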

endogeneity

a relationship (correlation) between the error term and the independent variable; a violation of base assumptions

joint hypothesis

testing two or more hypotheses at once (Ex: B1=0 and B2=0); use ANOVA to calculate the F stat and find the p-value to decide whether or not to reject the null
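
A hedged sketch of the ANOVA approach in R: fit the model without the tested variables (restricted) and with them (unrestricted), then compare. The mtcars data and the chosen variables are assumptions for illustration, not the class example.

```r
# Null hypothesis: the coefficients on hp and qsec are both zero.
restricted   <- lm(mpg ~ wt, data = mtcars)              # model under the null
unrestricted <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # model with both terms

# anova() on nested models reports the F statistic and its p-value.
joint_test <- anova(restricted, unrestricted)
print(joint_test)

f_stat  <- joint_test$F[2]          # row 2 holds the comparison
p_value <- joint_test$`Pr(>F)`[2]   # reject the null if this is below the chosen level
```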

sample selection bias

the bias introduced when data availability leads to certain observations being excluded from the analysis. Can be reduced by adding other variables

Beta 0

the intercept; B0 = y-bar - B1(x-bar)

conditional probability

the probability that one event happens given that another event is already known to have happened

confidence intervals

the range on either side of an estimate that is likely to contain the true value for the whole population; built by reversing the t-statistic: estimate +/- (critical t value)(standard error)
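
A small sketch of building a 95% interval by hand and checking it against confint(), assuming the built-in mtcars data (an illustrative choice):

```r
fit <- lm(mpg ~ wt, data = mtcars)

# Pull the estimate and its standard error for the wt coefficient.
est <- coef(summary(fit))["wt", "Estimate"]
se  <- coef(summary(fit))["wt", "Std. Error"]

# Critical t value for a two-sided 95% interval.
t_crit <- qt(0.975, df = df.residual(fit))

manual_ci  <- c(est - t_crit * se, est + t_crit * se)
builtin_ci <- confint(fit, "wt", level = 0.95)

print(manual_ci)
print(builtin_ci)
```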

how to show interaction variables in R?

variable1:variable2 (or variable1*variable2, which adds both variables plus their interaction)
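
A quick sketch of the formula syntax, assuming the built-in mtcars data; wt and hp are stand-ins for whatever two variables you interact:

```r
# ":" adds only the interaction term, so the main effects are listed separately.
fit_colon <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)

# "*" is shorthand for both main effects plus their interaction.
fit_star <- lm(mpg ~ wt * hp, data = mtcars)

# Both formulas describe the same model.
print(names(coef(fit_star)))
print(all.equal(coef(fit_colon), coef(fit_star)))
```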

joint distributions

ways to describe potential relationships when you have multiple random variables

what are you doing when you log a rate?

you're looking at a growth rate

logs in linear model

tells us that we're interpreting a percentage change

BIC

Bayesian information criterion; ln(n)*r-2lnL where n = number of observations, r = number of variables, and L = likelihood

simultaneity bias

Bias in an estimate of a causal effect due to reverse causation (x impacts y but y also impacts x)

reverse regression bias

Cannot simply reverse the linear regression model; example from class: estimating bwght=B0+B1cigs is not equivalent to estimating cigs=B0+B1bwght, because in the reversed model the regressor bwght is correlated with the error term

complicated hypothesis

Example: B1=B3; need to use algebra to rearrange the linear model in terms of the difference (theta = B1 - B3) and then test theta = 0
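
A sketch of the rearranging trick, assuming a two-variable model on the built-in mtcars data (wt and hp play the roles of the two variables being compared; they are not the class example). Substituting B_wt = theta + B_hp into mpg = B0 + B_wt*wt + B_hp*hp gives mpg = B0 + theta*wt + B_hp*(wt + hp), so the coefficient on wt in the rearranged model is theta.

```r
original <- lm(mpg ~ wt + hp, data = mtcars)

# Rearranged model: the coefficient on wt is now theta = B_wt - B_hp.
rearranged <- lm(mpg ~ wt + I(wt + hp), data = mtcars)
theta_hat  <- coef(rearranged)["wt"]

# Same number as subtracting the original coefficients...
print(theta_hat)
print(coef(original)["wt"] - coef(original)["hp"])

# ...but now summary() gives a standard error and t test for theta = 0 directly.
print(coef(summary(rearranged))["wt", ])
```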

as we get more data, what happens to the variance

The variance of our estimates decreases, so a larger data set provides more precise results than a smaller one

is there a test for endogeneity?

No -- you have to know your data and think about room for bias

Beta 1

the slope; B1 = Sxy/Sx^2 (sample covariance of x and y over sample variance of x); coefficient of the independent variable that shows the relationship between independent and dependent
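
A short check that the hand formulas for the slope and intercept match lm(), assuming the built-in mtcars data with wt as x and mpg as y (illustrative choices):

```r
x <- mtcars$wt
y <- mtcars$mpg

# Slope: sample covariance over sample variance of x.
b1 <- cov(x, y) / var(x)

# Intercept: y-bar minus slope times x-bar.
b0 <- mean(y) - b1 * mean(x)

fit <- lm(y ~ x)
print(c(b0, b1))
print(coef(fit))
```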

omitted variable bias

The bias that arises in the OLS estimators when a relevant variable is omitted from the regression. Can reduce omitted variable bias by adding relevant variables

expected value

The mean of a probability distribution.

central limit theorem

The theorem that, as sample size n increases, the distribution of the means of randomly selected samples of size n approaches a normal distribution. CLT lets us treat the sample mean as approximately normal even when the distribution of the underlying data is unknown
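
A simulation sketch of the idea, assuming an exponential distribution (clearly not bell-shaped) as the underlying data; the sample size of 50 and the 5000 repetitions are arbitrary choices:

```r
set.seed(42)

# Draw 5000 samples of size 50 from an exponential distribution with mean 1
# and keep the mean of each sample.
sample_means <- replicate(5000, mean(rexp(50, rate = 1)))

# The sample means should center near 1 with spread near 1/sqrt(50).
print(mean(sample_means))
print(sd(sample_means))
print(1 / sqrt(50))
```

Plotting hist(sample_means) should show a roughly bell-shaped histogram even though the raw draws are heavily skewed.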

white estimator for robustness

adjusts the variance (standard error) estimates to account for heteroscedasticity -- you can use the White estimator on both heteroscedastic and homoscedastic data
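
A base-R sketch of the basic White (HC0) variance formula, (X'X)^-1 X' diag(e^2) X (X'X)^-1, assuming the built-in mtcars data. This is a hand computation for study purposes, not necessarily how the course did it; the sandwich package's vcovHC(fit, type = "HC0") is the packaged counterpart.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)   # design matrix, including the intercept column
e <- residuals(fit)      # OLS residuals

bread <- solve(crossprod(X))   # (X'X)^-1
meat  <- crossprod(X * e)      # X' diag(e^2) X: each row of X scaled by its residual

white_vcov <- bread %*% meat %*% bread
robust_se  <- sqrt(diag(white_vcov))
usual_se   <- sqrt(diag(vcov(fit)))

# Coefficients stay the same; only the standard errors change.
print(cbind(usual_se, robust_se))
```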

AIC

Akaike information criterion; 2r-2lnL where r = number of variables and L = likelihood

likelihood

another way to find the best fit; essentially the joint PDF of the data treated as a function of the parameters. Maximizing it gives our estimates of mu and the population standard deviation

uniform random variable

any random variable with a uniform density function

weighted least squares method

assigns less weight to observations with greater error variance; in R, pass the weights to lm() through its weights argument
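
A simulated sketch of weighted least squares with lm(), assuming we know the error standard deviation grows in proportion to x, so the weights are 1/x^2. The data, the true coefficients (2 and 3), and the weight choice are all made up for illustration.

```r
set.seed(1)
n <- 200
x <- runif(n, 1, 10)

# Heteroscedastic data: the error sd is proportional to x.
y <- 2 + 3 * x + rnorm(n, sd = x)

ols <- lm(y ~ x)
wls <- lm(y ~ x, weights = 1 / x^2)   # noisier observations get less weight

# Compare the slope estimates and their standard errors.
print(coef(summary(ols))["x", c("Estimate", "Std. Error")])
print(coef(summary(wls))["x", c("Estimate", "Std. Error")])
```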

are lower values better or worse in AIC and BIC when selecting model?

better
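
A sketch of comparing two candidate models and rebuilding the criteria by hand, assuming the built-in mtcars data. Note one assumption about R itself: the parameter count k that R reports for a linear model includes the error variance, so it is one more than the number of coefficients.

```r
small <- lm(mpg ~ wt, data = mtcars)
large <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Rebuild AIC and BIC for the small model from the log-likelihood.
ll <- logLik(small)
k  <- attr(ll, "df")   # estimated parameters as counted by R
n  <- nobs(small)      # number of observations

manual_aic <- 2 * k - 2 * as.numeric(ll)
manual_bic <- log(n) * k - 2 * as.numeric(ll)

# Built-in versions; the model with the lower value is preferred.
print(AIC(small, large))
print(BIC(small, large))
```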

if we know there is exogeneity, what is bias?

bias(beta 1) = 0

bias

the gap between what an estimator gives on average and the true value; comes from the breaking of fundamental assumptions

when log is on the right of the linear model...

divide the coefficient by 100: a 1% increase in x changes y by about B1/100 units

when there are logs on both sides of a linear model

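do nothing: the coefficient is already a percentage-to-percentage effect (a 1% increase in x changes y by about B1 percent)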

quadratic models

include a squared variable in the linear model to recognize diminishing marginal returns

iid

independent and identically distributed -- all values are independent of other values and all values are from the same distribution

t statistic

indicates the distance of a sample mean (or estimate) from the hypothesized population value in terms of the estimated standard error; asks whether the result is something that plausibly came from the null distribution or something far out in the tails

Probability

likelihood that a particular event will occur

when log is on the left of the linear model...

multiply the coefficient by 100: a one-unit increase in x changes y by about 100*B1 percent
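
The three log rules in this set (log on the right, log on the left, logs on both sides) can be summarized side by side. These are the usual small-change approximations, written with generic symbols that are not from the class notes:

```latex
\begin{align*}
&\text{log on the right:} & y &= \beta_0 + \beta_1 \ln x + u & \Delta y &\approx (\beta_1 / 100)\,\%\Delta x \\
&\text{log on the left:}  & \ln y &= \beta_0 + \beta_1 x + u & \%\Delta y &\approx (100\,\beta_1)\,\Delta x \\
&\text{logs on both sides:} & \ln y &= \beta_0 + \beta_1 \ln x + u & \%\Delta y &\approx \beta_1\,\%\Delta x
\end{align*}
```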

measurement error bias

occurs when we have inaccurate data due to a faulty or inappropriate measuring tool (example: mothers reported number of cigs they smoked/day while pregnant)

OLS

ordinary least squares -- standard linear model that finds the best fit line by minimizing the sum of squared errors (residuals)
