Econometrics 1
heteroskedasticity
- The situation in which the variance of the error term, given the explanatory variables, is not constant: at different values of x the error has a different standard error, so the data show a different shape of dispersion on the graph (e.g., at every level of education the spread of the distribution changes, the SD is different). This destroys the integrity of the model and invalidates the calculations used to compute t-scores. The OLS estimator is still unbiased, but the usual standard errors are incorrect, so robust standard errors are needed for hypothesis tests.
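A minimal sketch in pure Python (all data are made up for illustration) of the point above: the OLS slope is computed the same way either way, but the conventional standard error assumes constant error variance, while a White (HC0) robust standard error does not.

```python
import math

# Hypothetical data where the spread of y grows with x (heteroskedasticity)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 4.3, 5.5, 8.9, 9.2, 14.1, 11.8, 18.5]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# OLS slope and intercept (simple regression closed form)
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Residuals
u = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Conventional SE assumes homoskedasticity: sigma^2 / Sxx
sigma2 = sum(ui ** 2 for ui in u) / (n - 2)
se_naive = math.sqrt(sigma2 / sxx)

# White (HC0) robust SE lets Var(u|x) vary with x
se_robust = math.sqrt(sum(((xi - xbar) ** 2) * ui ** 2
                          for xi, ui in zip(x, u)) / sxx ** 2)

print(f"slope={b1:.3f}  naive SE={se_naive:.3f}  robust SE={se_robust:.3f}")
```

With heteroskedastic data the two standard errors disagree, which is exactly why hypothesis tests should use the robust version.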
ceteris paribus
A Latin phrase that means "all other things held constant"
coefficient of variation
A measure of dispersion calculated by dividing a distribution's standard deviation by its mean: (SD/mean) x 100%. • So what: it compares variability in two groups of scores whose means are known to be different (a good candidate for the qualifying exam). • It allows us to put the SD in the yardstick of the expected value: if we talk about 20, is 20 large or small? The CV answers this in terms of the expected mean and SD, like using a yardstick to measure a room, expressing the SD in the units of the mean.
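The "yardstick" idea above can be sketched in a few lines of Python (the two data sets are hypothetical): incomes and test scores have very different means, but the CV puts their spreads on a common percentage scale.

```python
import statistics

# Hypothetical groups with very different means
incomes = [40000, 55000, 62000, 48000, 70000]
scores = [72, 85, 90, 78, 95]

def cv(data):
    """Coefficient of variation: sample SD divided by mean, times 100%."""
    return statistics.stdev(data) / statistics.mean(data) * 100

print(f"CV incomes: {cv(incomes):.1f}%   CV scores: {cv(scores):.1f}%")
```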
variance
A measure of variability based on the squared deviations of the data values about the mean - the average of the squared differences
dummy variable
A qualitative variable that takes on the value zero or one. Dummy variables only have a slope when they are interacted (fig 7.1). The main advantage of running one regression with a dichotomous variable instead of several separate regressions is the gain in degrees of freedom; as degrees of freedom increase, statistical power increases. This is another reason we prefer a model with fewer variables: as the number of variables increases, R-squared can only increase or remain the same, which is why we use adjusted R-squared. Intercept: _cons is the intercept; with a binary variable, Beta 0 applies when female = 0, so the intercept represents males when female = 1 is the dummy. With dummies we now care about the significance of the intercept. It is incorrect to say something is "more significant" than something else.
Adjusted R-Squared
A goodness-of-fit measure in multiple regression analysis that penalizes additional explanatory variables by using a degrees of freedom adjustment in estimating the error variance: adjusted R² = 1 - [SSR/(n-k-1)] / [SST/(n-1)]
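Plugging hypothetical numbers into the formula above shows the penalty at work: the adjusted value is always below the plain R² once k > 0.

```python
# Hypothetical fit: SSR and SST from a regression with n observations
# and k explanatory variables
ssr, sst = 40.0, 100.0
n, k = 50, 3

r2 = 1 - ssr / sst
adj_r2 = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```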
Asymptotics
Properties of estimators and test statistics that apply when the sample size grows without bound.
Confidence intervals
Provide a range of likely values for the population parameter. Confidence intervals come into play when we are using interval estimation, which involves the following steps: 1) select a level of confidence (e.g., 95%) - note how this differs from the 1% or 5% significance levels we use with p-values and point estimates, the form of estimation we've been using mostly in class; 2) analyze the sample data; 3) extract a number from a statistical table; 4) build an interval that surrounds the sample statistic.
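The four steps above can be sketched in Python for the simple case of a mean (the sample is hypothetical, and 1.96 is the large-sample critical value extracted "from the table" for 95% confidence):

```python
import math
import statistics

# Step 2: analyze the sample data (hypothetical measurements)
sample = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9, 12.4, 12.1]

mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

# Steps 1, 3, 4: 95% confidence, z = 1.96, interval around the statistic
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

For small samples a t critical value would replace 1.96, but the construction is the same.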
Galton 1886
Regression Towards Mediocrity in Hereditary Stature.
Retrospective Data
Data collected in the present based on recollections of past events; apt to be inaccurate because of faulty memory, bias, mood, and situation.
F statistic
The F-statistic is the ratio of the mean square for the model to the mean square for the residuals (mean SSM / mean SSR); the ANOVA test statistic, equal to the ratio of two independent estimates of the common population variance, s²A / s²W, which is also the ratio of explained variance to unexplained variance.
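The ratio of mean squares can be computed directly (the sums of squares and dimensions below are hypothetical):

```python
# Hypothetical decomposition: F is the mean square of the model over
# the mean square of the residuals
ssm, ssr = 80.0, 20.0      # model and residual sums of squares
k, n = 2, 30               # number of regressors, sample size

msm = ssm / k              # mean square, model
msr = ssr / (n - k - 1)    # mean square, residual
f_stat = msm / msr
print(f"F = {f_stat:.2f}")
```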
interaction term
One variable interacts with another to impact the dependent variable. The relationship between Y and X1 depends on X2.
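A tiny numeric sketch of that dependence (the coefficients are made up): in y = b0 + b1·x1 + b2·x2 + b3·(x1·x2), the partial effect of x1 is b1 + b3·x2, so the slope on x1 literally changes as x2 changes.

```python
# Hypothetical coefficients for y = b0 + b1*x1 + b2*x2 + b3*(x1*x2)
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -0.4

def slope_of_x1(x2):
    """Partial effect of x1 on y, evaluated at a given value of x2."""
    return b1 + b3 * x2

print(slope_of_x1(0))   # slope when x2 = 0
print(slope_of_x1(5))   # a different slope when x2 = 5
```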
Efficiency
If an estimator has a smaller variance, it is more efficient; the smallest variance is the most efficient. Target analogy: an unbiased estimator hits the target on average but may not come close on any one shot. You can correct for bias, but you cannot correct for inefficiency (compare qualitative studies, which do not measure error). Among "consistent and efficient," "consistent and unbiased," and "consistent but not efficient," which do we use? Most likely the efficient one, although it may be biased.
x
independent variable, the explanatory variable, the control variable, the predictor variable, or the regressor. (The term covariate is also used for x.)
A statistic
is a numerical value calculated from a sample that is variable and known; a statistic is a number that is computed from the data in a sample. Naghshpour, Shahdad (2012-11-10). Statistics for Economics (Kindle Location 4792). Business Expert Press. Kindle Edition.
unbiased
is achieving correct results on average • an estimator is said to be unbiased if its expected value, on average, equals the population characteristic; e.g., the expected sample mean equals the population mean. If what you are getting is, on average, the same as what you are hoping for, then it is unbiased • there is no such thing as consistency for a statistic; only an estimator can be consistent
estimator
it is a rule that can be applied to any sample of data to produce an estimate (p. 102); a statistic based on sample observations that is used to estimate the numerical value of an unknown population parameter
Shortcoming of qualitative analysis (Dr. N)
Qualitative studies lack validity because they cannot calculate a standard deviation; you can't prove or disprove anything. With statistics, you can calculate a margin of error, i.e., the probability of a Type I error.
hypothesis test
o Start with a null and an alternative hypothesis, based on a theory o Calculate a test statistic (it will either be given to us or will come from the computer) o Convert the test statistic into a p-value o The last step is inference: if the p-value is small enough, reject the null; if not small enough, fail to reject • a test is one-tailed if it is directional • a two-tailed test is half as powerful; it looks at the two end points of the normal distribution and is conceptually the same as a confidence interval
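The four steps above can be sketched as a two-tailed z test in pure Python (the sample numbers are hypothetical; the normal CDF is built from the standard library's error function):

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Step 1: H0: mu = 100 against H1: mu != 100 (hypothetical values)
sample_mean, mu0, sd, n = 103.0, 100.0, 10.0, 64

# Step 2: calculate the test statistic
z = (sample_mean - mu0) / (sd / math.sqrt(n))

# Step 3: convert it into a two-tailed p-value
p_value = 2 * (1 - normal_cdf(abs(z)))

# Step 4: inference
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("reject H0" if p_value <= 0.05 else "fail to reject H0")
```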
p value
probability value; for your results to be significant at the 5% level, this should be less than or equal to .05 (at the 10% level, less than .1). The p-value is a share of the area under the curve, and the whole area under the curve sums to 1.
R-squared
The ratio SSM / SST. A statistic, sometimes called the coefficient of determination, defined as the explained sum of squares divided by the total sum of squares. It measures the strength of the relationship between a dependent variable and one or more independent variables.
Chi-square
represents the distribution function of a variance. Naghshpour, Shahdad (2012-11-10). Statistics for Economics (Kindle Locations 4642-4643). Business Expert Press. Kindle Edition.
F statistic
the ANOVA test statistic, equal to the ratio of two independent estimates of the common population variance, s²A / s²W, which is also the ratio of explained variance to unexplained variance • The null hypothesis for an F statistic is always that the model is not a good fit
odds ratio
the chances of returns, i.e., the ratio of the odds of an outcome in one group to the odds in another; a coefficient (exponentiated, in a logit model) gives us this information
marginal economic approach
the last unit you produce determines the price
Elasticity
the percent change in one variable given a 1% ceteris paribus increase in another variable. A measure of how much one economic variable responds to changes in another economic variable.
consistent
an estimator is consistent if its variance gets smaller as the sample size gets larger
goodness of fit
To see whether the null hypothesis can be rejected, look at the p-value of the F statistic to judge the validity of the model; R-squared is the measure of goodness of fit.
prediction
using previous data to make a statement about points outside of the data, e.g., predicting a future temperature from our sample (the book sometimes uses this interchangeably with estimation, which is not correct)
inference
using statistics to draw conclusions about parameters (Dr. N). The book's definition: • in inference, take what you observe, then use that to make conclusions about parameters, which are never observed • deduce a conclusion from observing facts and, based on those facts, decide what the probability is that this can or cannot happen
estimation
any kind of computation within our data (contrast with prediction): an estimate stays between the data points we have, e.g., the average temperature of our sample
simple linear regression model
y = b0 + b1x + u
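The model above can be estimated by OLS with the textbook closed-form slope and intercept; here is a minimal Python sketch on made-up data that lies exactly on the line y = 3 + 2x, so OLS should recover b0 = 3 and b1 = 2:

```python
# Hypothetical data generated exactly from y = 3 + 2x
x = [1, 2, 3, 4, 5]
y = [5, 7, 9, 11, 13]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# OLS: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
print(f"y-hat = {b0:.1f} + {b1:.1f}x")
```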
Z- score
A variable is standardized in the sample by subtracting its mean and dividing by its standard deviation (p. 189 PDF); slopes on standardized variables are called standardized coefficients or beta coefficients (beta-j hat). Z-score = (observed - expected) / SD. Two options: standardize the xs only, or standardize both the dependent and independent variables to obtain beta coefficients.
multicollinearity
a case of multiple regression in which the predictor variables are themselves highly correlated
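A quick way to see the problem numerically (the data below are hypothetical, with x2 constructed to be nearly a linear function of x1): compute the correlation between the predictors and the implied variance inflation factor, VIF = 1/(1 - r²), which blows up as the correlation approaches 1.

```python
import math

# Hypothetical predictors: x2 is almost an exact linear function of x1
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]

n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
sxx = sum((a - m1) ** 2 for a in x1)
syy = sum((b - m2) ** 2 for b in x2)

r = sxy / math.sqrt(sxx * syy)   # Pearson correlation
vif = 1 / (1 - r ** 2)           # variance inflation factor
print(f"r = {r:.4f}, VIF = {vif:.0f}")
```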
causal effect
a ceteris paribus change in one variable has an effect on another variable.
parameter
a characteristic of a population that is constant and generally unknown. Why is there no population variance? A parameter is a constant number, and constant numbers do not have variance.
dichotomous variable
a discrete variable that has only two possible amounts or categories
z score
a measure of how many standard deviations you are away from the norm (average or mean); a variable is standardized in the sample by subtracting its mean and dividing by its standard deviation (p. 189 PDF) - slopes on standardized variables are called standardized coefficients or beta coefficients
t-statistic
coefficient divided by standard error; used to test hypotheses about a population when the value of the population variance (and standard deviation) is unknown; uses the same formula as the z-statistic except that the estimated standard error is substituted for the standard error in the denominator
Chow test
Compares a restricted (pooled) model with an unrestricted model estimated separately on subsamples; put the sums of squared residuals into the F formula (2.47, p. 150). Dr. N has written articles using the Chow test. When two models are not nested, you augment by creating a bigger model that becomes the unrestricted model; the procedure tests whether one model is better than another. Use the Chow test when you think your sample is not homogeneous, i.e., it contains two categories that should not be lumped together when estimating the regression model.
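Under the usual setup (same regression run pooled and then separately on two groups), the Chow statistic is an F test on the restricted vs. unrestricted sums of squared residuals. A sketch with hypothetical SSR values:

```python
# Hypothetical sums of squared residuals: pooled (restricted) model vs.
# the model estimated separately on two subsamples (unrestricted)
ssr_pooled = 120.0
ssr_group1, ssr_group2 = 45.0, 50.0
n, k = 100, 3                      # total observations, regressors

ssr_unrestricted = ssr_group1 + ssr_group2
chow_f = ((ssr_pooled - ssr_unrestricted) / (k + 1)) / \
         (ssr_unrestricted / (n - 2 * (k + 1)))
print(f"Chow F = {chow_f:.3f}")    # compare to F(k+1, n-2(k+1))
```

A large F relative to the F(k+1, n-2(k+1)) critical value says the two groups should not be lumped together.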
linear
A model is linear if it is linear in the parameters (see Assumption MLR.1 under the Gauss-Markov assumptions).
Gauss-Markov assumptions
1) Linear 2) Random sampling 3) NPC - no perfect collinearity 4) Zero conditional mean (expected error = 0) 5) Homoskedasticity. A set of assumptions under which OLS is BLUE.
Assumption MLR.1 (Linear in Parameters): the model in the population can be written as y = β0 + β1x1 + β2x2 + ... + βkxk + u, where β0, β1, ..., βk are the unknown parameters (constants) of interest and u is an unobserved random error or disturbance term.
Assumption MLR.2 (Random Sampling): we have a random sample of n observations, {(xi1, xi2, ..., xik, yi): i = 1, 2, ..., n}, following the population model in Assumption MLR.1.
Assumption MLR.3 (No Perfect Collinearity): in the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.
Assumption MLR.4 (Zero Conditional Mean): the error u has an expected value of zero given any values of the independent variables. In other words, E(u|x1, x2, ..., xk) = 0.
Assumption MLR.5 (Homoskedasticity): the error u has the same variance given any value of the explanatory variables. In other words, Var(u|x1, ..., xk) = σ².
Regression - 3 properties
1. Unbiasedness: an estimator is said to be unbiased if, and only if, its expected value is equal to the parameter. • Regression is popular because one can theoretically prove that the OLS estimator is an unbiased estimator of the population slope. • If the assumptions of regression are met, the estimate is unbiased. 2. Consistency: an estimator is said to be consistent if its variance gets smaller as the sample size gets larger. 3. Efficiency: an estimator is said to be efficient if it has a smaller variance than any other estimator. • OLS refers to the minimization of the squared errors (the distances between the observations and the regression line) and meets the definition of efficiency.
Type I error
Error of rejecting the null hypothesis when in fact it is true (also called a "false positive"): you think you found a cause-and-effect relationship, but one is not there.
standard deviation
A computed measure of how much scores vary around the mean score. The square root of the variance.
correlation coefficient
A statistical measure of the extent to which two factors vary together, and thus of how well either factor predicts the other; it ranges from -1 to +1. • When close to 1 in absolute value it is high; when close to 0 it is low. • There are procedures to test whether a correlation is significant or not. • A significant correlation still does not mean the variables are causally or even substantively related. • Association is a better word to use than correlation (a favorite word of IDV students; correlation is a specifically defined relationship), so in scientific writing use association instead of using correlation casually.
Time series data
A time series data set consists of observations on a variable or several variables over time.
Kahn & Roseman
Advertising as an Engineering Science
Type II error
An error that occurs when a researcher concludes that the independent variable had no effect on the dependent variable, when in truth it did; a false negative.
population regression function (PRF)
E(y|x) = b0 + b1x
Sum of squares model
SSM uses the differences between the mean value of Y and the regression line
Sum of Squares - Residuals
SSR uses the differences between the observed data and the regression line
SST
SST uses the differences between the observed data and the mean value of Y
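The three sums of squares defined in the cards above can be verified on a small made-up data set: fit the OLS line, then check that SST = SSM + SSR and that R² = SSM/SST.

```python
# Small worked example on hypothetical data
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# OLS fit (simple regression closed form)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # total
ssm = sum((yh - ybar) ** 2 for yh in yhat)            # model (explained)
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual

print(f"SST={sst:.2f} SSM={ssm:.2f} SSR={ssr:.2f} R^2={ssm / sst:.2f}")
```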
standard error of the regression
The SER is an estimator of the standard deviation of the error term. This estimate is usually reported by regression packages, although it is called different things by different packages. (In addition to SER, σ̂ is also called the standard error of the estimate and the root mean squared error.) (p. 100 of the econometrics PDF)
zero conditional mean
The error u has an expected value of zero given any value of the explanatory variable. In other words, E(u│x)=0
Homoskedasticity
The errors in a regression model have constant variance conditional on the explanatory variables: Var(u|x) = σ². Related notes: the F statistic uses the sum of squares model over the sum of squares residual (SSM/SSR), while SSM/SST is the coefficient of determination; it has to be less than one, because we cannot explain more than the total variation - the sum of squares model cannot be more than the sum of squares total. Don't take coefficients at face value; make sure they are statistically significant. How do you determine that? Use a t-test: take the coefficient and divide it by its standard error.
R squared
The percentage of total variation in the response variable, Y, that is explained by the regression equation: explained variance/total variance (of the total model; in simple regression, one variable). In Stata, model sum of squares divided by total sum of squares = R²; it tells us how much of the variation the model explains.
Gauss-Markov Theorem
Under Assumptions MLR.1 through MLR.5, β̂0, β̂1, ..., β̂k are the best linear unbiased estimators (BLUEs) of β0, β1, ..., βk, respectively (Theorem 3.4, p. 102 of the PDF). The theorem states that under the five Gauss-Markov assumptions (for cross-sectional or time-series models), the OLS estimator is BLUE (best linear unbiased estimator), conditional on the sample values of the explanatory variables. Given classical assumptions I through VI, the OLS estimator of βᵢ is the minimum-variance estimator from among the set of all linear unbiased estimators of βᵢ, for i = 0, 1, 2, ...
sampling variance of the OLS slope estimator
Var(β̂j) = σ² / [SSTj(1 - R²j)] (p. 94 of the PDF). Compare the variance of a sample mean (ȳj) used to estimate an unknown population mean, which is given by σ²/nj.
sampling variances of the OLS slope estimators
Var(β̂j) = σ² / [SSTj(1 - R²j)]
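Plugging hypothetical values into the formula above shows how collinearity inflates the variance: the larger R²j (from regressing xj on the other regressors), the bigger Var(β̂j).

```python
# Hypothetical ingredients of Var(b_j-hat) = sigma^2 / (SST_j * (1 - R2_j))
sigma2 = 4.0    # error variance
sst_j = 200.0   # total sample variation in x_j
r2_j = 0.5      # R^2 from regressing x_j on the other regressors

var_bj = sigma2 / (sst_j * (1 - r2_j))
print(f"Var(b_j-hat) = {var_bj:.3f}")  # doubles if r2_j rises to 0.75
```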
R squared
Wooldridge: In a multiple regression model, the proportion of the total sample variation in the dependent variable that is explained by the independent variables. In simple regression, it is the variation in Y explained by the one independent variable.
Endogenous Explanatory Variable
an explanatory variable in a multiple regression model that is correlated with the error term, either because of an omitted variable, measurement error, or simultaneity.
exogenous explanatory variable
an explanatory variable that is uncorrelated with the error term.
variance
the average of the squared deviations from the mean; a measure of spread within a distribution (the square of the standard deviation)
Cleary & Sharpe randomness in the stock market
financial specialists did not do better than random changes in the stock market in the short term; changes in the long term occurred with changes in the economy. Changes in the market are inherently random and hard to predict.
y
dependent variable, the explained variable,response variable, the predicted variable, or the regressand
p value
Do not use it to judge "how significant" a variable is; either it is significant or it isn't, based on the pre-determined level. Do not use it to compare the relative strength of variables.
u
error term or disturbance in the relationship, represents factors other than x that affect y. A simple regression analysis effectively treats all factors affecting y other than x as being unobserved. You can usefully think of u as standing for "unobserved."
panel data (or longitudinal data)
set consists of a time series for each cross-sectional member in the data set. As an example, suppose we have wage, education, and employment history for a set of individuals followed over a ten-year period. Or we might collect information, such as investment and financial data, about the same set of firms over a five-year time period. Panel data can also be collected on geographical units. For example, we can collect data for the same set of counties in the United States on immigration flows, tax rates, wage rates, government expenditures, and so on, for the years 1980, 1985, and 1990. The key feature of panel data that distinguishes them from a pooled cross section is that the same cross-sectional units (individuals, firms, or counties in the preceding examples) are followed over a given time period.
beta coefficient
Used when all variables are standardized in order to compare different slopes. Standardized coefficients: regression coefficients that measure the standard-deviation change in the dependent variable given a one-standard-deviation increase in an independent variable. Standardized random variable: a random variable transformed by subtracting off its expected value and dividing the result by its standard deviation; the new random variable has mean zero and standard deviation one. To standardize a coefficient, use the z-score: (observed - expected) / standard deviation. Gives the ability to compare slopes of unlike items, such as test scores and salary.
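For a simple regression, the conversion above reduces to multiplying the raw slope by SD(x)/SD(y); a small Python sketch on made-up data (in simple regression the standardized slope equals the correlation coefficient, so it should land just below 1 here):

```python
import statistics

# Hypothetical data with a strong, nearly exact linear relationship
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Raw OLS slope in original units
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)

# Standardized (beta) coefficient: slope times SD(x)/SD(y)
beta1 = b1 * statistics.stdev(x) / statistics.stdev(y)
print(f"b1 = {b1:.3f}, standardized beta = {beta1:.3f}")
```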
central limit theorem
states that in repeated random samples from a population, the sample mean will have a distribution function approximated by the normal distribution, the expected value of the sample mean is equal to the true value of the population mean, and the variance of the sample mean is equal to the population variance divided by the sample size. Naghshpour, Shahdad (2012-11-10). Statistics for Economics (Kindle Locations 4637-4640). Business Expert Press. Kindle Edition. • As the sample size increases, the distribution of the sample mean of a randomly selected sample approaches the normal distribution. • The sampling distribution curve (for sample sizes of 30 and over) will be centered on the population parameter value and will have all the properties of a normal distribution. • If we sample randomly and repeatedly, then the sample statistic (mean, median, etc.) will be an unbiased estimate of the population parameter, and it will have a standard deviation that declines as the sample size grows. • Inference about the sample mean: it will be unbiased, and its variance will be smaller than the variance of the population. • Why is there no population variance? A parameter is a constant number, and constant numbers do not have variance. • The sample mean is an unbiased estimator: on average the sample mean equals the population mean, and the variance of the sample mean equals the variance of the population divided by the sample size; that is why, as the sample size increases, the sample-mean variance gets smaller and smaller. • What does it mean if that variance approaches 0? The sample mean equals the population mean, which happens only with an infinite sample size. • As the sample size gets larger, the estimate becomes more consistent.
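The claims above can be checked by simulation (pure Python, seeded so the run is reproducible): draw many samples from a non-normal population (uniform on [0, 1], variance 1/12) and verify that the variance of the sample means is roughly the population variance divided by the sample size.

```python
import random
import statistics

random.seed(42)

# Repeated random samples from a non-normal (uniform) population
pop_var = 1 / 12          # variance of Uniform(0, 1)
n, reps = 30, 2000        # sample size, number of repeated samples

means = [statistics.mean(random.random() for _ in range(n))
         for _ in range(reps)]
var_of_means = statistics.pvariance(means)

# Should be close to the theoretical value sigma^2 / n
print(f"observed {var_of_means:.5f} vs theoretical {pop_var / n:.5f}")
```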
normal distribution
• Mean = 0, SD = 1 (the standard normal) • Things are more likely to happen around the mean • Look at the corner areas in the normal distribution: o they give the likelihood that something falls in a particular region • If the SD is big, there is a smaller probability of covering the entire area • This is the basis for building the confidence-interval approach • If the shaded area under the curve is 0.06, there is a 6% probability that the number falls into that area, so we can't make a conclusion; take a random sample with mean weight 250 and shaded area 0.008: there is a 0.8% chance that it falls in this area, i.e., a 0.008 probability of being wrong • When being analytical, provide the probability of making a Type I error • The central limit theorem distinguishes analysis from superstition • There is no SD for 1- or 2-case samples (a sample of size 2 has 1 degree of freedom); you must have a minimum of 3, and in practice we require 30 observations or more