Econometrics Final Exam


2 Vital properties of a good instrument

(1) Instrument Relevance: Zi should be correlated with the causal variable of interest Xi (the endogenous variable). (2) Instrument Exogeneity: Zi should be uncorrelated with the error term ui; if the instruments are not exogenous, then IV estimation is inconsistent. The idea of instrumental variables regression is that the instrument contains information about variation in Xi that is unrelated to the error term ui.
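
A minimal Python sketch of why both properties matter, using a tiny made-up data set. The numbers, the helper names (cov, ols_slope, iv_slope) and the true slope of 2 are all assumptions for illustration. Here v stands for an unobserved factor that moves both X and the error term u, while Z is built to be correlated with X but uncorrelated with v in the sample.

```python
def cov(a, b):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

def ols_slope(x, y):
    """Simple-regression slope of y on x."""
    return cov(x, y) / cov(x, x)

def iv_slope(z, x, y):
    """Simple IV estimator with one instrument z for one regressor x."""
    return cov(z, y) / cov(z, x)

# Hypothetical data: z is the instrument, v is an unobserved confounder.
z = [-1.0, -1.0, 1.0, 1.0]
v = [-1.0, 1.0, -1.0, 1.0]               # uncorrelated with z in this sample
x = [zi + vi for zi, vi in zip(z, v)]    # relevance: x moves with z
u = [3.0 * vi for vi in v]               # error term is correlated with x through v
y = [2.0 * xi + ui for xi, ui in zip(x, u)]  # assumed true slope is 2

ols_estimate = ols_slope(x, y)   # contaminated by the correlation between x and u
iv_estimate = iv_slope(z, x, y)  # uses only the variation in x that comes from z
```

In this constructed sample the IV estimate equals the assumed slope exactly, while the OLS slope is pushed upward because x and u move together.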

Standard assumptions for the multiple regression model

(1) Linear in parameters (2) Random sampling (3) No perfect collinearity (4) Zero conditional mean (5) Homoskedasticity

Violations of the random sampling assumption

(1) Self-selection bias: when selection is based on the participant's decision, the selection is likely to be correlated with the error (2) Sample selection based on the dependent variable: if sample selection is based on the dependent variable, sample selection is correlated with the error term u (3) Missing data: observations with missing information cannot be used

3 Main issues to be addressed in simple regression analysis

(1) Since there is never an exact or deterministic relationship between two variables, how do we allow for other factors to affect y? (2) What is the functional relationship between x and y? (3) How can we be sure we are capturing the ceteris paribus relationship between x and y?

Two special cases when simple regression will produce the same estimate as multiple regression

(1) The OLS coefficients on x2...xk are all 0 (2) X1 is uncorrelated with each x2...xk in the sample

Components of OLS Variances

(1) The error variance (2) The total sample variation in the explanatory variable (3) Linear relationships among the independent variables (the term Rj² in the variance)

No perfect collinearity

(1) There exists no set of numbers λ1 and λ2, not both zero, such that λ1·x1i + λ2·x2i = 0. If such numbers exist, an exact linear relationship exists and x1 and x2 are said to be collinear or linearly dependent: one variable is a constant multiple of the other (2) No independent variable is an exact linear function of two or more other independent variables

The smaller the standard error of 𝛽̂ 1 ...

(1) the smaller the error variance 𝜎² (the smaller the variance of unobserved and unknown random influences on Yi) (2) the larger is the variation in the X values (3) the larger the size of the sample (n) due to the fact that n is in the denominator of the standard error calculation

confidence coefficient

(1-𝛼) The confidence level expressed as a decimal value. For example, .95 is the confidence coefficient for a 95% confidence level.

Alternative Hypothesis

(H1) The hypothesis against which the null hypothesis is tested; it may state, for example, that the true βj is not equal to zero

Biased estimator

Bias(W) = E(W) - θ; the estimator W is biased when this difference is not zero

In the presence of heteroscedasticity, the following is true

1) The OLS estimator remains unbiased but is no longer efficient (minimum variance)

If multicollinearity is perfect in a regression model, then the regression coefficients of the explanatory variable are indeterminate.

1) A) Always True

To prevent a tradeoff between large error variance and high multicollinearity, an independent variable should be included in a regression model

1) A) When it affects y and is uncorrelated with all of the independent variables of interest.

Which of the following is true of BLUE?

1) C) An estimator is linear if and only if it can be expressed as a linear function of the data on the dependent variable.

The variable rdinterns is the expenditure on research and development (R&D) as a percentage of sales. Salary is measured in millions of dollars. The variable profmarg is profit as a percentage of sales. Let's suppose the equation below is estimated.

1) C) The absolute change in rdinterns for a percentage change in sales

In trying to test that females earn less than their male counterparts we estimated the following model: W= average earnings per day in dollars, D= 1 for females, and 0 otherwise. Here B1 measures:

1) C) The differential intercept coefficient for female earning

Imagine you regressed earnings of individuals on a constant, a binary variable (Male) which takes on the value 1 for males and is 0 otherwise, and another binary variable (Female) which takes on the value 1 for females and is 0 otherwise. Because females typically earn less than males, you would expect

1) C) none of the OLS estimators exist because there is perfect multicollinearity

Which of the following is an appropriate definition of a 95% confidence interval?

1) C) 95% of the time in repeated samples, the interval would contain the true value of the parameter

As Rj2 tends to 1, VIF tends to...

1) D) ∞

Suppose the variable x2 has been omitted from the following two-variable regression equation: y = β0 + β1x1 + β2x2 + u. β̃1 is the estimator obtained when x2 is omitted from the equation. The bias in β̃1 is positive if:

1) a) β2 > 0 and x1 and x2 are positively correlated

Consider the multiple regression model with two regressors X1 and X2, where both variables are determinants of the dependent variable. When omitting X2 from the regression, there will be omitted variable bias for β̂1

1) a) If X1 and X2 are correlated

Suppose we regress SAT score on parent's education and parent's income. If we run the regression again but also include the student's GPA as an additional regressor

1) a) The R2 for the regression will either stay the same or increase

Suppose you use regression to predict the height of a woman's current boyfriend by using her own height as the explanatory variable. Height was measured in feet from a sample of 100 women undergraduates, and their boyfriends, at Boston University. Now, suppose that the heights of women's boyfriend are converted to centimeters instead of feet. The impact of this conversion on the slope coefficient is: 1 foot = 30.48 centimeters

1) d) neither a nor b are correct

Adjusted R-squared equation

1- ((SSR/(n-k-1))/(SST/(n-1)))
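
A short Python sketch of this formula. The function names and the sample figures (n = 30, k = 2, SSR = 24, SST = 90) are assumptions chosen for illustration; SSR is the residual sum of squares, as in the formula above.

```python
def r_squared(ssr, sst):
    """Plain R-squared: 1 - SSR/SST."""
    return 1.0 - ssr / sst

def adjusted_r_squared(ssr, sst, n, k):
    """Adjusted R-squared: 1 - (SSR/(n-k-1)) / (SST/(n-1)).

    n is the number of observations, k the number of slope regressors.
    """
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for positive degrees of freedom")
    return 1.0 - (ssr / (n - k - 1)) / (sst / (n - 1))

# Hypothetical baseline model: k = 2 regressors.
base_r2 = r_squared(24.0, 90.0)
base_adj = adjusted_r_squared(24.0, 90.0, n=30, k=2)

# Hypothetical third regressor that barely lowers SSR (24.0 -> 23.9).
bigger_r2 = r_squared(23.9, 90.0)
bigger_adj = adjusted_r_squared(23.9, 90.0, n=30, k=3)
```

With these made-up numbers R-squared rises when the weak regressor is added, while adjusted R-squared falls, because the small drop in SSR does not offset the lost degree of freedom.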

Rejection rules for 5%, 10%, and 1%

5%: If the p-value is > 0.05, we fail to reject the null. 10%: If the p-value is > 0.10, we fail to reject the null. 1%: If the p-value is > 0.01, we fail to reject the null.

The error variance

A high error variance increases the sampling variance because there is more "noise" in the equation

two-tailed test

A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in either tail of its sampling distribution.

Based on data taken from a sample of 38 countries the following regression was obtained. Which of the following is the most accurate interpretation of the coefficient on GDPi?

A) Holding Pop constant, an additional million dollar spent by a nation on GDP leads to an increase in expenditure on education by 0.523 million dollars.

Consider the following equation, where salary = major league baseball salary, years = years in the major league, and rbisyr = runs batted in per year. Here the coefficient on years can be interpreted as follows: An additional year in the major league is associated with a .0904 percent increase in salary

A) TRUE

If the p-value associated with an F statistic in a multiple regression model is 0.00, then we can say that the regression is overall significant at the 1% level.

A) TRUE

In case of imperfect multicollinearity, the sampling variances/standard errors of the explanatory variables involved will be inflated making it difficult to estimate the partial effect of any of the independent variables on the dependent variable.

A) TRUE

In single parameter hypothesis testing, a two tailed test is used when we test the null hypothesis that a coefficient does not have any effect against the alternative that the coefficient has an effect

A) TRUE

Measurement error in the explanatory variable will bias the OLS estimator towards zero—also known as attenuation bias.

A) TRUE

Partialling out nature of multiple regression coefficient means that the multiple regression coefficient of any single variable, holding the others constant implies measuring the effect of that variable , net of any effect other independent variables in the model may have on it and the dependent variable.

A) TRUE

The confidence interval constructed for the slope estimator is a random variable

A) TRUE

The following simple model is used to determine the college grade point average of female and male athletes. The variable 'female' takes a value of 1 if the person is female and the variable 'sat' measures the SAT score of the individual. Here β2 measures the additional effect of one more point in SAT on the cumulative gpa of females relative to males

A) TRUE

The null hypothesis pertaining to the F stat in a regression output is as follows: the coefficients on all the explanatory variables are jointly equal to zero.

A) TRUE

When computing the t ratio for single parameter hypothesis testing (Null: β=0) if the point estimate obtained is exactly equal to zero (i.e., the hypothesized value of the population parameter), then the t value will be equal to 0

A) TRUE

When investigating the impact of participating in a social welfare program on an outcome of interest, self-selection bias occurs when individuals choose to participate in a program based on unobserved characteristics that may be correlated with the outcome of interest

A) TRUE

With regards to difference in differences method -- we can never observe both potential outcomes at the same time. In the treated group, the potential outcomes with treatment are factual (we can observe them), but the potential outcomes with no treatment are counterfactual (we cannot observe them). In diff-in-diff, we use data from the control group to impute untreated outcomes in the treated group. This is the "secret sauce" of diff-in-diff

A) TRUE

Why is adjusted R-squared better?

Adjusted R-squared adjusts for the number of explanatory variables in the model. Its value increases only when the new term improves the model fit more than expected by chance alone. The value can decrease when the term doesn't improve the model fit by a sufficient amount, i.e., the benefit from including it in the model is less than the cost. When more regressors are added, SSR (in the numerator) falls, but so do the degrees of freedom n-k-1, so SSR/(n-k-1) can move in either direction.

Interpretation of coefficients in Multiple Regression Model

All else held constant a 1 unit increase in X will increase/decrease y by Bx.

Dummy variable coefficient interpretation when Y is logged

As the dummy changes from 0 to 1, Y increases/decreases by approximately (coefficient × 100)%.

1) An explanatory variable is said to be exogenous if it is correlated with the error term.

B) FALSE

Suppose that you are interested in estimating the average impact of capital punishment on county level murders. After controlling for observed factors that influence murder and may be correlated with capital punishment, you find that the coefficient on # of executions (the variable that represents capital punishment) is 0.37 and the standard error is 1.23. Thus, we can infer that number of executions has a positive and statistically significant impact on murders at a 95% confidence level.

B) FALSE

BLUE Properties of OLS

Best Linear Unbiased Estimator: under the Gauss-Markov assumptions, Ordinary Least Squares has (1) Linearity (2) Unbiasedness (3) Minimum Variance

Interpretation of the slope coefficient in a multiple regression model

By how much does the dependent variable change if the j-th independent variable is increased by one unit, holding all other independent variables constant PARTIAL EFFECT

Dummy variable coefficient interpretation

By how much the mean value of the dependent variable for the category differs from its mean value for the omitted category. All comparisons are made relative to the omitted category, e.g.: being female changes one's average wage by 𝛿0 relative to males.

Instrumental variables regression uses instruments to

C) To isolate the movements in X that are uncorrelated with u

1) The Gauss Markov assumption E(ui |Xi) = 0 is more credible in a multiple regression model because: i) It explicitly allows to hold other factors fixed while examining effects of a particular independent variable on a dependent variable and thus reduces the probability of omitted variable bias ii) There is no collinearity between regressors iii) the sum of the residuals is no longer zero.iv) It can incorporate fairly general functional form relationships - for example quadratic relationships Which of the following is true?

C) i) and iv)

3. Which of the following correctly identifies an advantage of using adjusted R2 over R2?

C)Adjusted R2 imposes a penalty for adding new independent variables to the model if it does not improve the model fit more than expected by chance alone which is why it is always less than R2

Variance Inflation Factor (VIF)

Can be used to detect and assess multicollinearity. VIFj = 1/(1 - Rj²), so the speed with which the variance increases is reflected by the VIF. A common rule of thumb is that the VIF should not be larger than 10
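
A small Python sketch of how the VIF grows with Rj², using the standard definition VIFj = 1/(1 - Rj²). The function name and the example values of Rj² are assumptions for illustration.

```python
def vif(r2_j):
    """Variance inflation factor for regressor j.

    r2_j is the R-squared from regressing x_j on all other regressors.
    """
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R_j^2 must lie in [0, 1); at 1 the VIF is unbounded")
    return 1.0 / (1.0 - r2_j)

# Hypothetical values of R_j^2 and the implied inflation of Var(beta_j hat).
table = {r2: vif(r2) for r2 in (0.0, 0.5, 0.9, 0.99)}
```

An Rj² of 0.9 corresponds to the rule-of-thumb threshold of 10, and the factor grows without bound as Rj² approaches 1.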

Interpretation of coefficients in a double logged model

Coefficient on X is the elasticity of Y with respect to X; When X increases by 1% Y increase/decrease by (coefficient)%

t statistic

Defines a rule that determines when H0 is rejected in favor of H1

The multicollinearity problem

Dropping some independent variables may reduce multicollinearity but may lead to issues such as omitted variable bias. Only the sampling variance of the variables involved in multicollinearity will be inflated; the estimates of other effects may be precise

Violation of exogeneity (endogeneity)

The error term is correlated with the independent variable, so the zero conditional mean assumption does not hold. Sources: (1) Omitted Variable Bias (2) Measurement Error (3) Simultaneity or Reverse Causality

Change in Units of Dependent Variable (Y)

If Y is multiplied by c, all OLS coefficients are multiplied by c, intercept is multiplied by c and standard errors are multiplied by c

Omitted Variable Bias

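If we omit a variable from the regression that belongs in the model and is correlated with an included variable, the OLS estimator on the included variable is biased; the bias can be either positive or negative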
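
The direction of the bias can be illustrated with the in-sample identity β̃1 = β̂1 + β̂2·δ̃1, where β̃1 is the slope from regressing y on x1 alone, β̂1 and β̂2 are the multiple regression coefficients, and δ̃1 is the slope from regressing x2 on x1. The Python sketch below checks this identity on made-up data; the data and helper names are assumptions, and the multiple regression coefficients are obtained by partialling out rather than by a library call.

```python
def mean(a):
    return sum(a) / len(a)

def slope_of(y, x):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def resid_of(y, x):
    """Residuals from a simple OLS regression of y on x (with intercept)."""
    b = slope_of(y, x)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

def partial_slope(y, x_keep, x_other):
    """Multiple-regression coefficient on x_keep after partialling out x_other."""
    r = resid_of(x_keep, x_other)
    return sum(ri * yi for ri, yi in zip(r, y)) / sum(ri * ri for ri in r)

# Hypothetical sample in which x1 and x2 are positively correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 4.0, 8.2, 8.9, 13.5, 13.8]

beta1_hat = partial_slope(y, x1, x2)   # coefficient on x1 with x2 included
beta2_hat = partial_slope(y, x2, x1)   # coefficient on x2 with x1 included
delta1 = slope_of(x2, x1)              # slope from regressing x2 on x1
beta1_tilde = slope_of(y, x1)          # slope when x2 is omitted
implied_bias_term = beta2_hat * delta1 # sign depends on both factors
```

The sign of the bias term is the product of the sign of the coefficient on x2 and the sign of the correlation between x1 and x2, which matches the positive-bias case where both are positive.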

Change in Units of Independent Variable (X)

If xj is multiplied by c, its coefficient is divided by c and its standard error is divided by c
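
The unit-change rules for Y and X can be checked numerically. The Python sketch below uses made-up heights in feet and the conversion factor 30.48 cm per foot; the data and the helper name ols are assumptions, and only the intercept and slope are computed (standard errors are left out).

```python
def ols(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope

# Hypothetical heights in feet.
x = [5.0, 5.2, 5.5, 5.8, 6.0]
y = [5.6, 5.9, 5.8, 6.2, 6.3]
c = 30.48  # centimeters per foot

b0, b1 = ols(x, y)                                # both variables in feet
b0_ycm, b1_ycm = ols(x, [c * yi for yi in y])     # only Y converted to cm
b0_xcm, b1_xcm = ols([c * xi for xi in x], y)     # only X converted to cm
```

Rescaling Y multiplies both the intercept and the slope by c, while rescaling X divides the slope by c and leaves the intercept unchanged.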

Primary drawback of simple regression

It is very difficult to draw ceteris paribus conclusions about how x affects y. The key assumption that all other factors affecting y are uncorrelated with x is often unrealistic

Linearity of OLS Estimators

Linear function of the dependent variable, estimator for the slope is a weighted sum of the outcomes (y), this makes it a linear function of y.

Variance of OLS estimators

OLS estimator has the minimum sampling variance within the class of linear unbiased estimators.

Unbiasedness of OLS Estimators

Probability distribution has an expected value equal to the parameter it is supposed to be estimating. If we indefinitely draw random samples on Y from the population, compute an estimate each time, and then average these estimates over all random samples, we would obtain θ.

R-Squared

Proportion of the sample variation in Y explained by the OLS regression line. Does not decide unbiasedness of an OLS estimator. The proportion of the sample variation in Y jointly explained by all explanatory variables. Increases when another independent variable is added to the model making it a poor tool in understanding whether one variable or several variables should be added to the model. NEVER DECREASES

Linear relationships among the independent variables

Regress xj on all other independent variables (including the constant). The R-squared of this regression (Rj²) will be higher when xj is highly correlated with the other independent variables. High correlation between two or more independent variables is called multicollinearity. If the X variables are highly collinear, it is difficult to isolate their individual influence on Y, and the sampling variance of the slope estimator for xj will be higher.

SST =

SSE + SSR

R-Squared Equation

SSE/SST or 1-SSR/SST
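
A minimal Python sketch that computes the three sums of squares for a fitted simple regression and confirms that both forms of the R-squared equation agree. The data and helper name are made up for illustration; SSE denotes the explained sum of squares and SSR the residual sum of squares, as elsewhere in this set.

```python
def ols(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope

# Hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols(x, y)
y_hat = [b0 + b1 * xi for xi in x]
y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual variation

r2_from_sse = sse / sst
r2_from_ssr = 1.0 - ssr / sst
```

The decomposition SST = SSE + SSR holds for OLS with an intercept, which is why the two expressions for R-squared give the same number.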

OLS bias calculation

See chart

Omitted Variable Bias Expected Value

See chart

T-statistic equation

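t = (estimate - hypothesized value) / standard error; under the usual null that the coefficient is 0, t = estimate / standard error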
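
A short Python sketch of the t-ratio, applied to two coefficient/standard-error pairs quoted elsewhere in this set (0.37 with 1.23, and -.151 with 0.52). The function names are assumptions, and the p-value uses a standard normal approximation, which is only reasonable in large samples.

```python
from statistics import NormalDist

def t_stat(estimate, std_error, hypothesized=0.0):
    """t-ratio for H0: beta = hypothesized."""
    return (estimate - hypothesized) / std_error

def two_sided_p_normal(t):
    """Two-sided p-value using the standard normal as a large-sample approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))

t_executions = t_stat(0.37, 1.23)    # capital punishment example
t_totwork = t_stat(-0.151, 0.52)     # sleep on totwork example

p_executions = two_sided_p_normal(t_executions)
significant_at_5pct = p_executions < 0.05
```

Both t-ratios are well below 1.96 in absolute value, so neither coefficient is statistically significant at the 5% level under this approximation.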

F-Test

Tests whether the k explanatory variables have no effect and can be excluded from the regression. Joint test asking if explanatory variables are relevant; large values of F tell us explanatory variables are significant

interaction of dummy variable coefficient interpretation

The coefficient on the interaction term measures how much greater/lower the effect of X1 on Y is for the group with dummy X2 = 1 compared to the group with X2 = 0

Omitted (Base) Category

The category for which no dummy variable is assigned (Benchmark/reference category)

Interpretation of coefficients in a semi logged model

100 × the coefficient on X is the semi-elasticity of Y with respect to X; when X increases by 1 unit, Y changes by approximately (100*coefficient)%
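
The word "approximately" matters when the coefficient is not small. The Python sketch below compares the approximation 100·b with the exact percentage change 100·(exp(b) - 1) in a log-level model, using the coefficient 0.222 that appears in the gdp question in this set. The function names are assumptions for illustration.

```python
import math

def approx_pct_change(coef, dx=1.0):
    """Approximate % change in Y when X rises by dx, with log(Y) on the left."""
    return 100.0 * coef * dx

def exact_pct_change(coef, dx=1.0):
    """Exact % change in Y implied by the same log-level model."""
    return 100.0 * (math.exp(coef * dx) - 1.0)

approx = approx_pct_change(0.222)
exact = exact_pct_change(0.222)
gap = exact - approx
```

For small coefficients the two numbers are close; the gap widens as the coefficient grows in absolute value.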

Interpretation of unbiasedness of OLS estimators

The estimated coefficients may be smaller or larger than the true parameter β, depending on the sample that is drawn; on average, however, they will be equal to β.

independent variable

The experimental factor that is manipulated; the variable whose effect is being studied. The Independent variable, explanatory variable, or the regressor

r squared interpretation

The j regressors jointly explain r-sq% of total variation in y

dependent variable

The outcome factor, explained variable, response variable; the variable that may change in response to manipulations of the independent variable (y)

Null Hypothesis

The stated hypothesis (H0), the population parameter is equal to 0; ie after controlling for the other independent variables, there is no effect of Xj on y. Presumed to be true until the data strongly suggests otherwise

1. Difference-in-differences (DID) analysis is used widely to estimate the causal effects of health policies and interventions. A critical assumption in DID is "parallel trends": that pre-intervention trends in outcomes are the same between treated and comparison groups. True False

True

Residuals Equation

ûi = Yi - Ŷi

Instrumental Variables

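Used to obtain a consistent estimate of a coefficient when Xi is correlated with the error term. The instrument lets us split Xi into two parts: one part that may be correlated with the error term and one part that is not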

Reverse Causality (Simultaneity)

When one or more independent variables are jointly determined with the dependent variable. If Y is determined by the Xs and some of the Xs are in turn determined by Y, the distinction between dependent and explanatory variables is of dubious value. This often occurs when the explanatory variable under consideration is a "choice variable"

Measurement Error

When the independent variables we use in our model are measured with error. X is measured with error if the data includes inaccurate or misreported information. This will cause a big problem because it can be shown that x is now correlated with the error term.

Dummy Variable Trap

When using dummy variables, one category always has to be omitted: Golden Rule-For each qualitative regressor, the number of dummy variable introduced should be one less than the categories of that variable. If a qualitative variable has m categories, only use (m-1) dummies.
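
A minimal Python sketch of the (m-1) rule. The helper name make_dummies, the sample data and the choice of "male" as the base category are assumptions for illustration.

```python
def make_dummies(values, base):
    """Return one 0/1 column per category except the base (omitted) category."""
    categories = sorted(set(values))
    if base not in categories:
        raise ValueError("base category must appear in the data")
    return {
        cat: [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != base
    }

# Hypothetical qualitative regressor with m = 2 categories.
gender = ["male", "female", "female", "male", "female"]
dummies = make_dummies(gender, base="male")  # m - 1 = 1 dummy column

# The trap: including a dummy for every category makes the dummies sum to 1
# in every row, which duplicates the constant term exactly.
male = [1 if g == "male" else 0 for g in gender]
female = dummies["female"]
sums_to_constant = all(m + f == 1 for m, f in zip(male, female))
```

Because male + female equals the intercept column in every observation, a regression with a constant and both dummies has perfect collinearity, which is why one category is left out as the benchmark.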

Interpretation of coefficients in quadratic functional forms

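For Y = β0 + β1X + β2X², the effect of a change in X depends on the level of X: ∆Y/∆X ≈ β1 + 2β2X. The turning point is where ∆Y/∆X = 0, i.e., at X = -β1/(2β2)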
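
A small Python sketch of the marginal effect and turning point in a quadratic model Y = β0 + β1X + β2X². The coefficient values are hypothetical and the function names are assumptions for illustration.

```python
def marginal_effect(b1, b2, x):
    """Approximate change in Y for a one-unit change in X at the level x."""
    return b1 + 2.0 * b2 * x

def turning_point(b1, b2):
    """Value of X at which the marginal effect equals zero."""
    if b2 == 0:
        raise ValueError("no turning point when the quadratic term is zero")
    return -b1 / (2.0 * b2)

# Hypothetical estimates: positive linear term, negative quadratic term.
b1, b2 = 0.3, -0.006

x_star = turning_point(b1, b2)
effect_early = marginal_effect(b1, b2, 10.0)   # before the turning point
effect_late = marginal_effect(b1, b2, 40.0)    # after the turning point
```

With a positive β1 and a negative β2, the effect of X on Y is positive before the turning point and negative after it.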

Test statistic

a statistic used in hypothesis testing. It is some function of a random sample (formula). Its value changes as the sample changes. Given a test statistic we can define a rule that determines when H0 is rejected in favor of H1; helps determine whether a null hypothesis should be rejected

If the p-value of a given two tailed test statistic associated with a single coefficient estimate is 0.23 then, which of the following is true

a) The coefficient is statistically insignificant

1. Suppose the relationship between wage, years of education (educ), years of experience (exper), and participation in a job training program (train) is modeled as: wage= B0+B1educ+ B2exper+ B3train Which of the following is the most accurate interpretation of the coefficient B3? a. Holding education and experience constant, participating in the training program is predicted to increase the wage by $B3 b. Holding education and experience constant, participating in the training program is predicted to increase the wage by B3% c.Participating in the training program is predicted to increase the wage by B3% . d.Participating in the training program increases the wage by $B3

a. Holding education and experience constant, participating in the training program is predicted to increase the wage by $B3

7. In the following equation, gdp refers to gross domestic product, and FDI refers to foreign direct investment. log(gdp) = 2.65 + 0.527log(bankcredit) + 0.222FDI Which of the following statements is then true? a. If FDI increases by 1 percentage point, gdp increases by approximately 22.2%, the amount of bank credit remaining constant. b. If FDI increases by 1%, gdp increases by approximately 26.5%, the amount of bank credit remaining constant. c. If FDI increases by 1%, gdp increases by approximately 24.8%, the amount of bank credit remaining constant. d. If FDI increases by 1%, gdp increases by approximately 52.7%, the amount of bank credit remaining constant.

a. If FDI increases by 1 percentage point, gdp increases by approximately 22.2%, the amount of bank credit remaining constant.

1. Which of the following is true about an instrument: a. It should be substantially correlated with X but uncorrelated with U b. It should be strongly correlated with U c. It should always be a binary variable d. It should affect Y only via X and not have a direct effect on Y

a. It should be substantially correlated with X but uncorrelated with U d. It should affect Y only via X and not have a direct effect on Y

6. In a regression analysis if r-squared = 1, then a. SSE=SST b. SSR=SSE c. SSR=SST d. SSE=1

a. SSE=SST

8. Holzer et al 1993 did a study where they looked at the effect of job training grants on worker productivity. log (scrap)=4.99-.052grant-.455sales+.639 log (employ) Here: The regression slope on sales measures: a. The relative change in Y for an absolute change in x b. The absolute change in Y for a percentage change in x c. By how many units y changes for a unit change in x d. The relative change in Y for a relative change in x

a. The relative change in Y for an absolute change in x

9. The statement that "only the sampling variance of the explanatory variables involved in multicollinearity will be inflated; the estimates of other explanatory variables may be very precise" is correct. a. True b. False

a. True

1. In a regression model, which of the following will be described using a binary variable? a. Whether it rained on a particular day or it did not b. The concentration of dust particles in air c. The percentage of humidity in air on a particular day d. The volume of rainfall during a year

a. Whether it rained on a particular day or it did not

1. To prevent a tradeoff between large error variance and high multicollinearity, an independent variable can be included in a regression model:​ a. when it affects y and is uncorrelated with all of the independent variables of interest. b. when it does not affect y and is correlated with all of the independent variables of interest c. when it does not affect y and is uncorrelated with all of the independent variables of interest. d. when it affects y and is correlated with all of the independent variables of interest.

a. when it affects y and is uncorrelated with all of the independent variables of interest.

6. A larger error variance makes it difficult to estimate the partial effect of any of the independent variables on the dependent variable. a.True b.False

a.True

Zero Conditional Mean Assumption

all factors in the unobserved error term are uncorrelated with the explanatory variable

Standard error

an estimate of the standard deviation of the estimator; an index for the amount by which the regression coefficient is likely to vary across cases on an average. A measure of the precision with which the regression coefficient is measured

Consider the following equation that relates sleep (total minutes per week spent sleeping at night) to total weekly minutes spent working, education and age in years, and a gender dummy. The turnaround value for age in the equation is

b) 34 years

8.If the explained sum of squares (SSE) in a regression analysis is 66 and the total sum of squares (SST) is equal to 90, what is the value of the coefficient of determination R2? a. 1.2 b. .73 c. .27 d. .55

b. .73

10. The larger the total sample variation in xj, the larger is Var(β̂j) a. True b. False

b. False

10. The OLS regression line passes through the sample means of y and x. a. True b. False

a. True

9. In a regression of y on x1 and x2, the ceteris paribus effects of x1 when holding x2 constant implies measuring the effect of x1 on the mean value of y, net of any effect x1 may have on x2 a. True b. False

b. False

3. In a study relating college grade point average to time spent in various activities, you distribute a survey to several students. The students are asked how many hours they spend each week in four activities: studying, sleeping, working and leisure. Any activity is put into one of the four categories, so that, for each student, the sum of the hours in the four activities must be 168. You then estimate the following model. GPA= βo+β1study +β2sleep +β3work +β4leisure +u. Which of the following problems are you going to run into: a. Heteroscedasticity b. Perfect collinearity c. Multicollinearity d. Over-controlling of variables

b. Perfect collinearity

1. A _____ variable is used to incorporate qualitative information in a regression model a. dependent b. binary c. continuous d. binomial

b. binary

5. Which of the following is NOT a violation of the assumption of no perfect collinearity between regressors? a. one regressor is a constant multiple of another b. one regressor is a square of another regressor c. one regressor is expressed as an exact linear function of two or more regressors

b. one regressor is a square of another regressor

3. Rj 2 = 1 indicates that that explanatory variable j is a. uncorrelated to other regressors in the model b. perfectly correlated to other regressors in the model c. somewhat correlated with other regressors in the model d. highly correlated to other regressors in the model

b. perfectly correlated to other regressors in the model

1. Consider the equation below. A null hypothesis, H0: β2 = 0 states that: a. x2 has no effect on the expected value of β2 b. x2 has no effect on the expected value of y. c. β2 has no effect on the expected value of y. d. y has no effect on the expected value of x2

b. x2 has no effect on the expected value of y.

4. The value of Adjusted R-squared is always less than R-squared a. Incorrect b.Correct c. Dependent on n value d.Dependent on k value

b.Correct

1. Which of the following is true of dummy variables? a. A dummy variable always takes a value less than 1. b. A dummy variable takes a value of 1 or 10 c. A dummy variable takes a value of 0 or 1 d. a dummy variable takes a value higher than 1

c. A dummy variable takes a value of 0 or 1

1. When there are omitted variables in the regression, which are determinants of the dependent variable, then a. this has no effect on the estimator of your included variable because the other variable is not included. b. this will always bias the OLS estimator of the included variable. c. the OLS estimator is biased if the omitted variable is correlated with the included variable. d.you cannot measure the effect of the omitted variable, but the estimator of your included variable(s) is (are) unaffected.

c. the OLS estimator is biased if the omitted variable is correlated with the included variable.

4. Suppose that you are interested in estimating the ceteris paribus relationship between y and x1. For this you collect data on two control variables, x2 and x3. (For concreteness you might think of y as final exam score, x1 as class attendance and x2 as GPA up through previous semester and x3 as SAT or ACT score. Let B1 be the simple regression estimate from y on x1 and let B2 be the multiple regression estimate from y on x1, x2 and x3. We would expect B1 and B2 to be very similar if: a. x1 highly correlated to x2 and x3 in the sample and x2 and x3 have large partial effects on y b. x1 is almost uncorrelated to x2 and x3 in the sample and x2 and x3 have large partial effects on y c. x1 is almost uncorrelated to x2 and x3 in the sample and x2 and x3 have negligible partial effects on y d. x1 is highly correlated to to x2 and x3 in the sample and x2 and x3 have negligible partial effects on y

c. x1 is almost uncorrelated to x2 and x3 in the sample and x2 and x3 have negligible partial effects on y

2. Which of the following correctly identifies an advantage of using adjusted R2 over R2? a.Adjusted R2 corrects the bias in R2. b.Adjusted R2 is easier to calculate than R2. c.The penalty of adding new independent variables is better understood through adjusted R2 than R2. d.The adjusted R2 can be calculated for models having logarithmic functions while R2 cannot be calculated for such models.

c.The penalty of adding new independent variables is better understood through adjusted R2 than R2.

1. In trying to test that females earn less than their male counterparts, we estimate the following model. Ŷ = β1+β2D Where Y = average earnings per hour in $, D=1 for male and 0 otherwise Here β2 refers to a. Average hourly earnings of females b. Difference in the average hourly earnings of males and females c. Slope coefficient of the regression equation d. Average hourly earnings of males

b. Difference in the average hourly earnings of males and females

Why is the number of dummy variables to be entered into the regression model always equal to the number of groups (g) minus 1 (g-1)? a. To control for other variables in the model b. To increase the R-squared value c. To avoid the model misspecification d. To avoid the situation of perfect multicollinearity

d. To avoid the situation of perfect multicollinearity

7. In a multiple regression model, the zero conditional mean assumption is much more credible because a. the average value of the error term in the population is zero b. the explanatory variables are uncorrelated c. the independent variables and the error term are negatively correlated d. fewer things end up in the error.

d. fewer things end up in the error.

2. Which of the following is true of R^2? a. R^2 is also called the standard error of regression. b. A low R^2 indicates that the Ordinary Least Squares line fits the data well. c. R^2 usually decreases with an increase in the number of independent variables in a regression. d.R^2 shows what percentage of the total variation in the dependent variable, Y, is explained by the explanatory variables.

d.R^2 shows what percentage of the total variation in the dependent variable, Y, is explained by the explanatory variables.

Simple regression model

explains variable y in terms of x

Partial regression coefficient

gives the "direct" or net effect of a unit change in x1 on the mean value of y, net of any effect that the other xs may have on y

Why multiple regression analysis

it is more amenable to the ceteris paribus assumption because it allows us to explicitly control for many other factors that simultaneously affect the dependent variable. Multiple regression models accommodate many explanatory variables that may be correlated, so we can infer causality in cases where simple regression may be misleading. If we add more factors to our model that are useful in explaining y, then more variation in y can be explained.

The total sample variance in the explanatory variable

more sample variation leads to more precise estimates, total sample variation automatically increases with the sample size, increasing the sample size is thus a way to get precise estimates

7. Consider the estimated equation from your textbook below with the reported results from the regression (standard error in parentheses). The t-statistic for the slope under the conventional null hypothesis is approximately: sleep_hat = 3586.38 - .151totwork (38.91) (0.52) R-squared= .051 o 0.29 o 0.52 o 1.76 o 67.20

o 0.29

7. The statement "If there is sufficient evidence to reject a null hypothesis at the 5% significance level, then there is sufficient evidence to reject it at the 10% significance level" is: Please select the best answer of those provided below o Always true o Never true o Sometimes true o Not enough information

o Always true

7. H1: βj ≠ 0, where βj is a regression coefficient associated with an explanatory variable, represents a one‐sided alternative hypothesis. o True o False

o False

7. Null and alternative hypotheses are statements about sample parameters. o True o False

o False

1. Let's say you want to assess the effectiveness of a program for improving the eating habits of shift workers. Your key independent variable of interest is a dummy variable = 1 if the person participated in the program and 0 otherwise. You assess a variety of outcomes related to eating habits such as under- or over-eating, daily intake of fast food, consumption of food that is low in fibre or high in fat, salt and/or sugar etc. You are likely to run into the following issues concerning internal validity (that is sources of possible biases) of your study: a. Omitted Variable Bias b. Selection Bias c. Measurement error d. Multicollinearity o all of the above o a, b, d o a, b and c o a, and b

o a, and b

The significance level of a test is: o the probability of rejecting the null hypothesis when it is false. o one minus the probability of rejecting the null hypothesis when it is false. o probability of rejecting the null hypothesis when it is true. o one minus the probability of rejecting the null hypothesis when it is true.

o probability of rejecting the null hypothesis when it is true.

Critical Values of t

often denoted by -c and c, i.e., -t(α/2, n-k-1) and t(α/2, n-k-1); the area between -c and c equals the confidence level (e.g., 95%)

efficient estimator

one that has the lowest variance among all unbiased estimators of the same parameter

Single tailed test

suggests that the expected differences for all groups will occur in a single direction either above or below the mean.

The more the variation in X...

the more certain we can be about the slope of the estimated regression line from a given sample, implies a smaller standard error of our estimator

level of significance

the probability of rejecting the null hypothesis when it is true, 𝛼 (0 < 𝛼 < 1). Two-tailed critical values in large samples: 10% level (1.645), 5% level (1.96), 1% level (2.576)
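
The three numbers quoted here are the two-tailed critical values of the standard normal distribution. A minimal Python sketch that reproduces them with the standard library follows; the function name is an assumption, and with small samples the t distribution with n-k-1 degrees of freedom would give larger values.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """Large-sample (standard normal) critical value c with P(|Z| > c) = alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

critical_values = {
    alpha: two_tailed_critical_value(alpha)
    for alpha in (0.10, 0.05, 0.01)
}
```

A smaller significance level pushes the critical value outward, which is why rejection at the 5% level always implies rejection at the 10% level.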

The more certain we are about the slope of the line...

the smaller the error variance and the larger the likelihood of obtaining a sample that can yield an estimation

Sampling variance

the variance of an estimator associated with a sampling distribution

When can you compare r-squares of two different models?

when the two regressions use the same dependent variable and the same number of observations.

