ECON 3010, EXAM 1 Study Guide


If an independent variable in a multiple linear regression model is an exact linear combination of other independent variables, the model suffers from the problem of . . . - heteroskedasticity - omitted variable bias - homoskedasticity - perfect collinearity

perfect collinearity

Suppose you are interested in using a simple linear regression model to study the effects of time spent studying on exam scores. Your linear regression model is as follows: examscore = β0 + β1study + u. Suppose the zero conditional mean assumption holds, that is, E(u | study)=0, resulting in the following population regression function: E(examscore | study) = β0 + β1study. Identify the systematic parts and the unsystematic parts of the examscore.

systematic part: β0 + β1study (the part of examscore explained by study) unsystematic part: u (the part not explained by study)

The Gauss-Markov theorem will not hold if . . . - the regression model relies on the method of random sampling for collection of data - the error term has an expected value of zero given any values of the independent variables - the error term has the same variance given any values of the explanatory variables - the independent variables have exact linear relationships among them

the independent variables have exact linear relationships among them

Which of the following is true of BLUE? - It is a rule that can be applied to any one value of the data to produce an estimate. - An estimator is linear if and only if it can be expressed as a linear function of the data on the dependent variable. - It is the best linear uniform estimator. - An estimator βi^ is an unbiased estimator of βi if Var(βi^) = βi for any β0, β1, . . . , βk.

An estimator is linear if and only if it can be expressed as a linear function of the data on the dependent variable.

Regardless of initial estriol level, a 1 mg/24h increase in estriol in the mother's system results in a predicted change in birth weight of . . .

*bweight is measured in hundreds of grams, estriol is measured in milligrams +0.6 hundred grams or +60 grams

Suppose you are interested in studying the effects of the hormones of mothers and the birth weight of their children. You hypothesize the following relationship between birth weight and the level of the estriol, a type of estrogen, of the mother while pregnant: bweight = β0 + β1estriol. You gather data and run a simple ordinary least squares (OLS) linear regression. The result is the following OLS regression line relating birth weight and estriol. bweight^ = 17 + 0.6 estriol. If a mother has no estriol in her system, her child's predicted birth weight is . . .

*bweight is measured in hundreds of grams, estriol is measured in milligrams bweight^ = 17 + 0.6(0) = 17 hundred grams, or 17 × 100 = 1,700 grams

bweight^ = 17 + 0.6 estriol If a mother's estriol level is 14 mg/24h, her child's predicted birth weight is . . .

*bweight is measured in hundreds of grams, estriol is measured in milligrams bweight^ = 17 + 0.6(14) = 25.4 hundred grams, or 25.4 × 100 = 2,540 grams
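The unit conversions in the estriol cards can be checked with a short Python sketch. The coefficients 17 and 0.6 come from the fitted line on the card; the function name predict_bweight_grams is made up for illustration.

```python
# Fitted line from the card: bweight^ = 17 + 0.6 * estriol
# bweight is in hundreds of grams, estriol in mg/24h.
INTERCEPT = 17.0
SLOPE = 0.6


def predict_bweight_grams(estriol_mg: float) -> float:
    """Hypothetical helper: predicted birth weight, converted to grams."""
    hundreds_of_grams = INTERCEPT + SLOPE * estriol_mg
    return hundreds_of_grams * 100


no_estriol = predict_bweight_grams(0)    # intercept only
at_14_mg = predict_bweight_grams(14)     # 14 mg/24h
# Effect of one extra mg/24h, in grams (the slope times 100).
per_mg = predict_bweight_grams(1) - predict_bweight_grams(0)

print(round(no_estriol), round(at_14_mg), round(per_mg))
```

Because the line is linear, the one-milligram effect is the same at every starting estriol level.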

Suppose you are interested in studying the effects of education on wages. You gather four data points and use ordinary least squares (OLS) to estimate the following simple linear model: wage = β0 + β1educ + u. - Based on the data in the table, the explained sum of squares (SSE) is . . . - Based on the data in the table, the residual sum of squares (SSR) is . . . - Based on the data in the table, the total sum of squares (SST) is . . . - While you are skeptical of your OLS regression due to the low number of data points, you decide to calculate the R-squared of the regression to understand how well the independent variable educ explains the dependent variable wage. The resulting R^2 is . . .

*see notes for work - SSE = 135.2 - SSR = 1.8 - SST = 137 - R^2 = 0.9869
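The data table for this card is not reproduced here, so the sketch below starts from the SSE and SSR values in the answer and only checks how SST and R^2 follow from them. It uses this guide's naming, where SSE is the explained sum of squares and SSR is the residual sum of squares.

```python
# Values taken from the answer on the card (the underlying data table is not shown).
sse = 135.2  # explained sum of squares
ssr = 1.8    # residual sum of squares

# Total variation splits into explained plus residual variation.
sst = sse + ssr

# R^2 is the explained share of total variation; equivalently 1 - SSR/SST.
r_squared = sse / sst
r_squared_alt = 1 - ssr / sst

print(round(sst, 1), round(r_squared, 4))
```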

Suppose you are studying how the birth weight of a child is affected by the smoking habits of the mother during pregnancy. After gathering data, you run a simple ordinary least squares (OLS) regression that yields the following results: bwght^ = 115.89 − 0.623cigs. - According to your regression results, if a mother smokes one more cigarette per day on average, the child's birth weight is predicted to . . . - Suppose you would like to change the units of the dependent variable, bwght, from ounces to pounds. Let bwghtP represent the birth weight of a child in pounds. Instead of rerunning the regression with bwghtP instead of bwght as the explained variable, the results can be derived using knowledge of how a unit change can affect the regression results. The new regression line will be . . . - Suppose you decide not to change the units for birth weight (leaving it in ounces), but instead you would like to change the units of the independent variable, cigs. Instead of measuring smoking in terms of average cigarettes per day, you would like to measure smoking in terms of average number of packs per day. Let packs represent the average number of packs of cigarettes smoked by the mother per day during pregnancy. Instead of rerunning your regression with packs instead of cigs as the explanatory variable, the results can be derived using knowledge of how a unit change can affect the regression results. Assuming each pack contains 20 cigarettes, the new regression line will be . . .

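*see notes for work - decrease by 0.623 ounces (-0.623(1)) - bwghtP^ = 7.2431 - 0.0389cigs (both coefficients divided by 16, since 1 pound = 16 ounces) - bwght^ = 115.89 - 12.46packs (slope multiplied by 20, since cigs = 20 × packs; intercept unchanged)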
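The two unit changes can be reproduced with a few lines of Python. This is only a sketch of the rescaling rules, using the coefficients from the card; the variable names are illustrative.

```python
# Original fitted line: bwght^ = 115.89 - 0.623 * cigs  (bwght in ounces)
b0, b1 = 115.89, -0.623

OUNCES_PER_POUND = 16
CIGS_PER_PACK = 20

# Rescaling the dependent variable (ounces -> pounds) divides BOTH
# the intercept and the slope by 16.
b0_pounds = b0 / OUNCES_PER_POUND
b1_pounds = b1 / OUNCES_PER_POUND

# Rescaling the regressor (cigs -> packs, with cigs = 20 * packs) multiplies
# the slope by 20 and leaves the intercept unchanged.
b0_packs = b0
b1_packs = b1 * CIGS_PER_PACK

print(round(b0_pounds, 4), round(b1_pounds, 4))
print(round(b0_packs, 2), round(b1_packs, 2))
```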

Suppose the simple OLS linear regression model is given as: y = β0 + β1x + u. - Although impossible in reality, suppose you know that Var(u | x) = 3. What does this mean? - Given that the assumptions SLR.1 through SLR.4 hold, and the error term exhibits homoskedasticity with Var(u | x) = σ^2, which of the following represent the variance of β1^?

- The error term exhibits homoskedasticity, but you cannot determine whether the OLS estimates will be unbiased. (Homoskedasticity: the variance of the error term is constant across values of x. Heteroskedasticity: the error variance is not constant and depends on the explanatory variable.) - σ^2/∑(xi−x¯)^2 and σ^2/SSTx

Suppose the simple linear regression model is given as y = β0 + β1x + u. Assuming ∑_{i=1}^{n}(xi−x¯)^2 > 0, what best represents the ordinary least squares (OLS) slope parameter β1^?

- The sample correlation between xi and yi, multiplied by the ratio of the sample standard deviation of yi to the sample standard deviation of xi.
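A quick numerical check of this identity, as a sketch: the four data points below are made up for illustration and do not come from the course.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up illustrative data (not from the study guide).
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.5, 8.0]

x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

# Usual OLS slope formula: sample covariation over sample variation in x.
slope_direct = sxy / sxx

# Same slope written as correlation times the ratio of standard deviations.
corr_xy = sxy / sqrt(sxx * syy)
slope_via_corr = corr_xy * stdev(y) / stdev(x)

print(round(slope_direct, 6), round(slope_via_corr, 6))
```

Both numbers agree, and since standard deviations are positive, the slope always carries the sign of the sample correlation.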

Suppose you would like to analyze the effect of smoking on life expectancy, using the following simple linear regression model: lifeyrs = β0 + β1smoke + u. Which of the following factors are more likely correlated with the explanatory variable for smoking cigarettes, as well as influence the explained variable for life expectancy and thus are included in the error term? - alcohol abuse - other drug use - exercise habits - diet

- alcohol abuse - other drug use - exercise habits - diet Individuals who smoke more cigarettes are more likely to care less about their general health than those who do not smoke, making them more likely to abuse alcohol, use recreational drugs, exercise less, and have poor dietary habits. Further, each of these factors also influences life expectancy.

Suppose you are interested in studying the impact of crime on property values. - All else equal, you would expect an increase in violent crime rate to cause . . . in property value. - Suppose you run an ordinary least squares (OLS) regression of property value on violent crime rate. You get the following results: value^ = 10 + 0.5crime. The positive sign on the coefficient for crime implies that an increase in violent crime rate actually increases property value. Why might you suspect this simple regression would yield biased results?

- decrease - Property values likely vary with other factors that affect violent crime, such as inflation and population growth.

The Graduate Record Exam (GRE) is an exam most students must take to apply to graduate schools. Suppose you are interested in studying the effects of time spent studying for the GRE on the score of the math section of the test. You propose the following model: log(score) = β0 + β1log(study). - Based on the functional form of your simple OLS model, your model would best be classified as a . . . - Suppose you estimate the model using OLS, with the following results: log(score)^ = 10 + 0.33log(study). Given the functional form of your model, what is the interpretation of β1^?

- log-log - For a 1% increase in hours spent studying, the predicted exam score rises by 0.33%.
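The "1% in, 0.33% out" reading is an approximation that works well for small changes. The sketch below compares it with the exact change implied by a log-log line, using the 0.33 slope from the card; the helper name exact_pct_change is hypothetical.

```python
# Slope estimate from the card, read as an elasticity in the log-log model.
ELASTICITY = 0.33


def exact_pct_change(pct_increase_in_study: float) -> float:
    """Hypothetical helper: exact % change in the predicted score.

    In a log-log model, multiplying study by a factor c multiplies the
    predicted score by c ** elasticity.
    """
    ratio = (1 + pct_increase_in_study / 100) ** ELASTICITY
    return (ratio - 1) * 100


one_pct = exact_pct_change(1)    # close to 0.33
ten_pct = exact_pct_change(10)   # a little below 10 * 0.33

print(round(one_pct, 3), round(ten_pct, 3))
```

For a 1% change the approximation is nearly exact; for larger changes the exact figure drifts away from elasticity times the percentage change.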

Suppose you and a research partner are interested in studying the effects of height on salary. Your partner proposes using a simple OLS regression model to analyze the following population model: wage = β0 + β1^2height + u. - The proposed model is . . . in parameters. - The proposed model is . . . in variables.

- nonlinear (slope or intercept parameters appear in a nonlinear form) - linear (model contains only linear functions of the explanatory variable)

In the equation, y = β0 + β1x1 + β2x2 + u, β2 is a(n) . . . - intercept parameter - dependent variable - slope parameter - independent variable

- slope parameter

Suppose you want to evaluate the effectiveness of a job training program using wage = β0 + β1train + u as a model. You take 500 employees and divide them into two groups using a coin flip. If the coin lands on heads, the employee is given the training. If the coin lands on tails, the employee is not given the training. The simple regression gives: wage^ = 7.25 + 1.25train. - True or False: The control group consists of individuals who did not participate in the training program, while the treatment group is made up of individuals who participated in the training program. - The expected wage of an employee who participated in the training program is . . . per hour. - The expected wage of an employee who did not participate in the training program is . . . per hour. - The average treatment (or causal) effect of the training program is . . . per hour. - Which of the following best represents β1 when the explanatory variable in a simple linear regression model is binary? (Note: Assume SLR.4 (Zero Conditional Mean) holds true.)

- true (control group = not subject to treatment, treatment group = subject to treatment) - $8.50 (7.25 + 1.25(1)) - $7.25 (7.25 + 1.25(0)) - $1.25 (8.50 - 7.25) - β1 = E(wage | train = 1) − E(wage | train = 0)
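With a binary regressor, the OLS intercept is the control-group mean and the slope is the difference in group means. The sketch below illustrates this with five made-up wage observations chosen so the group means match the card (7.25 and 8.50); they are not the 500-employee sample.

```python
from statistics import mean

# Made-up illustrative data: train = 1 for trained, 0 for untrained.
train = [0, 0, 0, 1, 1]
wage = [7.00, 7.50, 7.25, 8.00, 9.00]

# Simple-regression OLS formulas.
x_bar, y_bar = mean(train), mean(wage)
sxy = sum((t - x_bar) * (w - y_bar) for t, w in zip(train, wage))
sxx = sum((t - x_bar) ** 2 for t in train)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Group means computed directly.
control_mean = mean(w for t, w in zip(train, wage) if t == 0)
treated_mean = mean(w for t, w in zip(train, wage) if t == 1)

print(round(intercept, 2), round(slope, 2))
print(round(control_mean, 2), round(treated_mean - control_mean, 2))
```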

Suppose you are interested in studying the effects of stock price on the salary of the company's CEO. You propose the following model: sal = β0 + β1stockp + u. You plan to run a simple OLS regression of sal on stockp. You collect a sample of four observations, each consisting of the 6-month average stock price for a company and the salary of the corresponding CEO. - CEO salary (dependent variable): 2, 4, 5, 6 - Average stock price (independent variable): 20, 45, 60, 80 - Is there variation in the explanatory variable, stockp? - Are the OLS slope and intercept estimates defined?

- yes (stockp varies in the sample and does not equal any one value for each observation) - yes (variation in stockp allows slope and intercept estimates to be defined)
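The role of variation in stockp can be seen by computing the estimates directly. The sketch below uses the four observations from the card; the helper ols_simple and the constant-price counterexample are illustrative additions.

```python
from statistics import mean


def ols_simple(x, y):
    """Hypothetical helper: (intercept, slope) from regressing y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    if sst_x == 0:
        # The slope formula divides by SST_x, so it needs variation in x.
        raise ValueError("no sample variation in x: OLS estimates are undefined")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    return y_bar - slope * x_bar, slope


stockp = [20, 45, 60, 80]  # average stock price, from the card
sal = [2, 4, 5, 6]         # CEO salary, from the card

intercept, slope = ols_simple(stockp, sal)
print(round(intercept, 3), round(slope, 4))

# Counterexample (made up): every company has the same stock price.
try:
    ols_simple([50, 50, 50, 50], sal)
except ValueError as err:
    print(err)
```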

One crucial assumption in the simple linear regression model is that the error term u has a mean of zero, conditional on the value of the explanatory variable x. Suppose you are using the following simple linear regression model to study the effect of education on salary. In this case, assume that besides education, natural ability level is the only other factor that influences salary, and therefore, u and ability level are equivalent: sal = β0 + β1educ + u. - If u has an expected value of 0, conditional on educ, then E(u | educ) = 0, and therefore, E(sal | educ) = . . . - True or False: Assuming the model is linear in parameters, and you obtain a random sample of observations with varying values of education, the simple OLS slope and intercept estimates will be unbiased.

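- β0 + β1educ - false (unbiasedness also requires the zero conditional mean assumption, E(u | educ) = 0, which the statement does not include)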

Suppose the following population regression function (PRF) holds: E(examscore | study) = 14 + 5study. According to this PRF, the average exam score of students who study 4 hours is . . .

14 + 5(4) = 34%

If the explained sum of squares is 35 and the total sum of squares is 49, what is the residual sum of squares? - 14 - 10 - 18 - 12

14 (SSR = SST − SSE = 49 − 35)

Find the degrees of freedom in a regression model that has 10 observations and 7 independent variables. - 3 - 2 - 17 - 4

2 (df = n − k − 1 = 10 − 7 − 1)

Suppose you collect data and find that xi and yi are positively correlated in your sample. You know that after using OLS to estimate the simple linear regression model, β1^ must be . . .

>0; This highlights an important limitation of the simple linear regression model with nonexperimental data. That is, the simple linear regression model is essentially a study of the correlation between the explanatory variable and the explained variable. For this reason, most economists do not use the simple linear regression model to infer causality.

True or false: An explanatory variable is said to be exogenous if it is correlated with the error term.

False

True or false: If two regressions use different sets of observations, then we can tell how the R^2s will compare, even if one regression uses a subset of regressors.

False

True or false: The coefficient of determination (R^2) decreases when an independent variable is added to a multiple regression model.

False

True or false: The key assumption for the general multiple regression model is that all factors in the unobserved error term be correlated with the explanatory variables.

False

True or false: When one randomly samples from a population, the total sample variation in xj decreases without bound as the sample size increases.

False

Suppose that you are interested in estimating the average impact a job training program has on wages. However, you recognize that there are some observed factors that influence wage, participation on the training program, or both. You may still get the unbiased estimate for the program effectiveness by: - Including factors that predict both wage and participation as controls and running a multiple linear regression - Excluding those observed factors from your model and running a simple linear regression - Including only the factors that predict wage but not participation as controls and running a multiple linear regression - Including only the factors that predict participation but not wage as controls and running a multiple linear regression

Including factors that predict both wage and participation as controls and running a multiple linear regression

Which of the following is true of R^2? - R^2 is also called the standard error of regression. - R^2 usually decreases with an increase in the number of independent variables in a regression. - A low R^2 indicates that the Ordinary Least Squares line fits the data well. - R^2 shows what percentage of the total variation in the dependent variable, Y, is explained by the explanatory variables.

R^2 shows what percentage of the total variation in the dependent variable, Y, is explained by the explanatory variables.

Suppose you have gathered data on wages and education level for n individuals. You would like to use your data to construct an ordinary least squares (OLS) simple linear regression model to study the effects of education on salary. Your OLS regression line is: wagei^ = β0^ + β1^educi. Given the OLS regression line, which of the following best represents the residual for the ith observation? - The actual wage plus the wage predicted by the OLS regression line - The difference between the actual wage and the wage predicted by the OLS regression line - The square of the wage predicted by the OLS regression line - The square of the actual wage

The difference between the actual wage and the wage predicted by the OLS regression line

True or false: A larger error variance makes it difficult to estimate the partial effect of any of the independent variables on the dependent variable.

True

True or false: The term "linear" in a multiple linear regression model means that the equation is linear in parameters.

True

Suppose you are using the following simple linear regression model to study the effect of education on salary. In this case, assume that besides education, natural ability level is the only other factor that influences salary, and therefore, u is equivalent to an individual's ability level: sal = β0 + β1educ + u. True or false: If u is mean independent of educ, then the average level of ability is the same regardless of the level of education.

True; If E(u | x) = E(u), then u is mean independent of x. In other words, the expected value of u at each level of x is the same as the expected value of u for the entire population.

Suppose the simple linear regression model is given by y = β0 + β1x + u, which describes the relationship between y and x. Identify the intercept parameter, slope parameter, response variable, control variable, and error term.

intercept parameter: β0 (also called constant term, usually not critical to regression analysis) slope parameter: β1 (change in y is equal to β1*change in x) response variable: y control variable: x error term: u (all other factors other than x that influence y)

The value of R^2 always . . . - lies between 0 and 1 - lies above 1 - lies below 0 - lies between 1 and 1.5

lies between 0 and 1

Since you are using OLS to obtain the slope and intercept parameter estimates, you know that your choice of β0^ and β1^ . . .

minimizes the sum of the squares of the differences between the actual wage and the wage predicted by the OLS regression line, for all observations in the sample (that is, it minimizes the sum of squared residuals ui^)

Exclusion of a relevant variable from a multiple linear regression model leads to the problem of . . . - multicollinearity - perfect collinearity - homoskedasticity - misspecification of the model

misspecification of the model

High (but not perfect) correlation between two or more independent variables is called . . . - heteroskedasticity - multicollinearity - homoskedasticity - micronumerosity

multicollinearity

The assumption that there are no exact linear relationships among the independent variables in a multiple linear regression model fails if . . . , where n is the sample size and k is the number of parameters. - n = k + 1 - n > k - n > 2 - n < k + 1

n < k + 1

[Var(βi^) = σ^2/SSTi] ignores the error variance increase because it treats both regressors as . . . - dependent - nonrandom - independent - random

nonrandom

Consider the following regression equation: y = β0 + β1x1 + β2x2 + u. What does β1 imply? - β1 measures the ceteris paribus effect of x1 on u. - β1 measures the ceteris paribus effect of x1 on y. - β1 measures the ceteris paribus effect of x1 on x2. - β1 measures the ceteris paribus effect of y on x1.

β1 measures the ceteris paribus effect of x1 on y.

Suppose the variable x2 has been omitted from the following regression equation, y = β0 + β1x1 + β2x2 + u. β1^ is the estimator obtained when x2 is omitted from the equation. The bias in β1^ is positive if . . . - β2 > 0 and x1 and x2 are negatively correlated - β2 = 0 and x1 and x2 are negatively correlated - β2 > 0 and x1 and x2 are positively correlated - β2 < 0 and x1 and x2 are positively correlated

β2 > 0 and x1 and x2 are positively correlated
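The sign rule can be illustrated with a small deterministic example. The data and coefficient values below are made up, and the error term is set to zero so that the bias from omitting x2 shows up exactly; with u = 0, the bias equals β2 times the slope from regressing x2 on x1.

```python
from statistics import mean


def slope(x, y):
    """Simple-regression OLS slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Made-up "true" model with no error term: y = 1 + 2*x1 + 3*x2.
BETA1, BETA2 = 2.0, 3.0
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]  # positively correlated with x1
y = [1.0 + BETA1 * a + BETA2 * b for a, b in zip(x1, x2)]

delta1 = slope(x1, x2)      # how x2 moves with x1 in the sample
short_slope = slope(x1, y)  # slope on x1 when x2 is omitted
bias = short_slope - BETA1

print(round(delta1, 6), round(short_slope, 6), round(bias, 6))
```

Flipping the sign of either β2 or the x1-x2 relationship flips the sign of the bias; flipping both leaves it positive.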

