Chapter 6 - Regression Analysis
Momentum p is the product of the mass m and the velocity v of an object; that is, p = mv. This is an example of a ____________________ relationship.
Deterministic
The nonzero slope coefficient test is used for a renowned financial application referred to as the capital asset pricing model (CAPM). The model y = α + βx + ɛ is essentially a simple linear regression model that uses α and β, in place of the usual β0 and β1, to represent the intercept and the slope coefficients, respectively. Which of the following is true about the intercept coefficient α, called the stock's alpha? Select all that apply. Multiple select question. -Abnormal returns are positive when α > 0 -Abnormal returns are negative when α < 0 -Measures how sensitive the stock's return is to changes in the level of the overall market -The CAPM theory predicts α to be zero
-Abnormal returns are positive when α > 0 -Abnormal returns are negative when α < 0. -The CAPM theory predicts α to be zero
In the presence of correlated observations, the OLS estimators are unbiased, but their estimated standard errors are inappropriate. Which of the following could happen as a result? -The model looks better than it really is with a spuriously high R2 -The F test may suggest that the predictor variables are individually and jointly significant when this is not true -All of the answers are correct -The t test may suggest that the predictor variables are individually and jointly significant when this is not true
-All of the answers are correct
Consider the following linear regression model, which links the response variable y with k predictor variables x1, x2,..., xk: y = β0 + β1x1 + β2x2 +... + βkxk + ε. If, for example, the slope coefficient β1 equals zero, what happens to the predictor variable x1, and what does this imply? Select all that apply. Multiple select question. -Implying that x1 does not influence y -Drops out of the equation -Then x1 influences y -Does not drop out of the equation
-Drops out of the equation -Implying that x1 does not influence y
The detection methods for multicollinearity are mostly informal. Which of the following indicate a potential multicollinearity issue? -Individually insignificant predictor variables -High R2 plus individually insignificant predictor variables -High R2 and significant F statistic coupled with insignificant predictor variables -Significant F statistic coupled with individually insignificant predictor variables
-High R2 and significant F statistic coupled with insignificant predictor variables
What is a good solution when confronted with multicollinearity? Select all that apply Multiple select question. -Add another variable -Obtain more data because the sample correlation may get weaker -Drop one of the collinear variables -Obtain more data because a bigger sample is always better
-Obtain more data because the sample correlation may get weaker -Drop one of the collinear variables
In the presence of changing variability, the estimated standard errors of the OLS estimators are inappropriate. What does this imply about using standard testing? -We should use F tests only -Standard t or F tests are not valid as they are based on these estimated standard errors. -Use standard t or F tests -We should use standard t tests only
-Standard t or F tests are not valid as they are based on these estimated standard errors.
Which of the following are the assumptions that underlie the classical linear regression model? Select all that apply. Multiple select question. -The regression model given by y = β0 + β1x1 + β2x2 +... + βkxk + ɛ is linear in the parameters β0, β1,..., βk. -There is an exact linear relationship among the predictor variables; or, in statistical terminology, there is perfect multicollinearity. -Conditional on x1, x2,..., xk, the error term ɛ is uncorrelated across observations; or, in statistical terminology, there is no serial correlation. -The error term ɛ is correlated with any of the predictor variables x1, x2,..., xk.
-The regression model given by y = β0 + β1x1 + β2x2 +... + βkxk + ɛ is linear in the parameters β0, β1,..., βk. -Conditional on x1, x2,..., xk, the error term ɛ is uncorrelated across observations; or, in statistical terminology, there is no serial correlation.
We can plot the residuals sequentially over time to look for correlated observations. If there is no violation, then what would you see? -The residuals should show no pattern around the vertical axis. -The residuals should show a normal pattern around the horizontal axis. -The residuals should show no pattern around the horizontal axis. -The residuals should show a normal pattern around the vertical axis.
-The residuals should show no pattern around the horizontal axis
An important first step before running a regression model is to compile a comprehensive list of potential predictor variables. How can we reduce the list to a smaller list of predictor variables? -The best approach may be to do nothing -We must include all relevant variables -Use the adjusted R2 criterion to reduce the list -We use R to make the necessary correction
-Use the adjusted R2 criterion to reduce the list
A crucial assumption in a linear regression model is that the error term is not correlated with the predictor variables. In general, when does this assumption break down? -When there are too many variables in the model -When important predictor variables are excluded. -The estimated standard errors of the OLS estimators are inappropriate -When the standard errors are distorted downward
-When important predictor variables are excluded.
The nonzero slope coefficient test is used for a renowned financial application referred to as the capital asset pricing model (CAPM). The model y = α + βx + ɛ is essentially a simple linear regression model that uses α and β, in place of the usual β0 and β1, to represent the intercept and the slope coefficients, respectively. Which of the following is true about the slope coefficient β, called the stock's beta? Select all that apply. Multiple select question. -When β equals 0, any change in the market return leads to an identical change in the given stock return. -A stock for which β > 1 is considered more "aggressive" or riskier than the market -Measures how sensitive the stock's return is to changes in the level of the overall market -When β equals 1, any change in the market return leads to an identical change in the given stock return.
-When β equals 1, any change in the market return leads to an identical change in the given stock return. -Measures how sensitive the stock's return is to changes in the level of the overall market -A stock for which β > 1 is considered more "aggressive" or riskier than the market
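The CAPM regression above can be sketched with ordinary least squares. The sketch below uses made-up excess-return numbers (not real market data) and a hypothetical "true" alpha and beta, then recovers them with numpy's least-squares solver:

```python
import numpy as np

# Hypothetical toy data: market excess returns (x) and a stock's excess
# returns (y). All numbers are illustrative assumptions, not real data.
x = np.array([0.01, -0.02, 0.03, 0.015, -0.01, 0.02, 0.005, -0.005])
alpha_true, beta_true = 0.002, 1.4
rng = np.random.default_rng(0)
y = alpha_true + beta_true * x + rng.normal(0, 0.001, size=x.size)

# OLS fit of y = alpha + beta*x: stack an intercept column next to x
X = np.column_stack([np.ones_like(x), x])
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.4f}")
# beta > 1 would flag the stock as more "aggressive" than the market;
# CAPM predicts alpha = 0, so a nonzero alpha suggests abnormal returns.
```

With beta set above 1, the fitted beta_hat comes back near 1.4, illustrating the "aggressive stock" case from the question.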
A dummy variable, also referred to as an indicator or a binary variable, takes on numerical values of 1 or 0 to describe two categories of a categorical variable. For a predictor variable that is a dummy variable, it is common to refer to the category that assumes a value of 0 as: Select all that apply. Multiple select question. Benchmark category Reference category Regression dummy Dummy category
Benchmark category Reference category
The assumption of constant variability of observations often breaks down in studies with cross-sectional data. Consider the model y = β0 + β1x + ɛ, where y is a household's consumption expenditure and x is its disposable income. It may be unreasonable to assume that the variability of consumption is the same across a cross-section of household incomes. This violation is called: Nonlinear Patterns Multicollinearity Changing variability Correlated Observations
Changing variability
If the value of the response variable is uniquely determined by the values of the predictor variables, we say that the relationship between the variables is: Regressive Deterministic Random Stochastic
Deterministic
If the linear regression model includes an intercept, the number of dummy variables representing a categorical variable should be one less than the number of categories of the variable. This solution helps avoid which problem? -Categorical variables -This is not an issue -Dummy variable trap -Category sum trap
Dummy variable trap
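The dummy variable trap can be seen directly in a design matrix. The sketch below uses a hypothetical three-category variable: encoding all three categories makes the dummy columns sum to 1 in every row, which is perfectly collinear with the intercept; dropping one (the reference category) avoids the trap:

```python
import numpy as np

# Hypothetical categorical variable with three categories.
region = np.array(["East", "West", "South", "East", "South"])
categories = ["East", "West", "South"]

# Full one-hot encoding: the 3 columns sum to 1 in every row, so together
# with an intercept the design matrix is perfectly collinear (the trap).
full = np.column_stack([(region == c).astype(float) for c in categories])
print(full.sum(axis=1))   # every row sums to 1

# Standard fix: drop one category ("East" here becomes the reference or
# benchmark category) and keep k - 1 = 2 dummies alongside the intercept.
dummies = full[:, 1:]     # columns for "West" and "South" only
print(dummies)
```

The effect of the dropped reference category is then absorbed into the intercept, and each remaining dummy coefficient measures the difference from that reference.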
True or false: R2 can decrease as we add more predictor variables to the linear regression model True False
False
In the case of a dummy variable categorizing a person's gender, we can define 1 for male and 0 for female. In this case, what would the reference category be? Male 1 Female 0
Female
What are some measures that summarize how well the sample regression equation fits the data? Goodness-of-fit Predictor variables Dummy variables Regression
Goodness-of-fit
For the linear regression model, y = β0 + β1x1 + β2x2 + ... + βkxk + ɛ, which of the following are the competing hypotheses used for a test of joint significance? Choose both the correct null and alternative hypotheses. Multiple select question. H0: β1 = β2 = ... = βk = 0 HA: At least one βi ≠ 0 H0: βj = βj0 HA: βj ≠ βj0
H0: β1 = β2 = ... = βk = 0 HA: At least one βi ≠ 0
A simple linear regression model is represented as y = β0 + β1x1 + ɛ. What do β0 and β1 (the Greek letters, read as betas) represent? (They must be shown in the correct order!) 1. Intercept, slope 2. Dependent, independent 3. Slope, intercept 4. Slope, dependent
Intercept, slope
The variance inflation factor (VIF) is another measure that can detect a high correlation between three or more predictor variables even if no pair of predictor variables has a particularly high correlation. What is the smallest possible value of VIF (indicating the absence of multicollinearity)? Zero VIF exceeds 5 or 10 VIF does not exceed 5 or 10 One
One
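The VIF idea can be sketched numerically: regress each predictor on the others and compute VIF_j = 1 / (1 − R²_j). The data below are simulated under the assumption that x3 is nearly a linear combination of x1 and x2, so its VIF is large even though no single pairwise correlation is extreme:

```python
import numpy as np

# Simulated predictors: x3 is close to a linear combination of x1 and x2.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.7 * x1 + 0.7 * x2 + rng.normal(scale=0.3, size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    predictor j on the remaining predictors (with an intercept)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

for j in range(3):
    print(j, round(vif(X, j), 2))   # VIF >= 1; values above ~5-10 flag trouble
```

When a predictor is unrelated to the others, R²_j is near 0 and the VIF is near its minimum of one, matching the answer above.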
What is the condition called when two or more predictor variables have an exact linear relationship? Model inadequacies Nonzero slope coefficient Nonlinear violation Perfect multicollinearity
Perfect multicollinearity
'______________________' plots are used to detect some of the common violations to the regression model assumptions. These graphical plots are easy to use and provide informal analysis of the estimated regression models.
Residual
If residual plots exhibit strong nonlinear patterns, the inferences made by a linear regression model can be quite misleading. In such instances, we should employ nonlinear regression methods based on simple transformations of the '___________________' and the predictor variables.
Response
When we assess a linear regression model, there are several tests we can use. What is the test called that determines whether the predictor variables x1, x2,..., xk have a joint statistical influence on y? Test for nonzero slope coefficient Test of joint significance Test for goodness of fit Test of individual significance
Test of joint significance
In order to select the preferred model, we examine several goodness-of-fit measures: Select all goodness-of-fit measures examined! Multiple select question. The standard coefficient The coefficient of determination The adjusted coefficient of determination The standard error of the estimate
The coefficient of determination The standard error of the estimate The adjusted coefficient of determination
We can use residual plots to gauge changing variability. The residuals are generally plotted against each predictor variable xj. Which of the following indicates there is no violation? -There is no way to indicate no violation -The predictor variable is randomly dispersed across the residuals -The residuals are randomly dispersed across the values of xj -The residuals are NOT randomly dispersed across the values of xj
The residuals are randomly dispersed across the values of xj
In order to avoid the possibility of R2 creating a false impression, virtually all software packages include adjusted R2. Unlike R2, adjusted R2 explicitly accounts for what? The number of samples taken Multicollinearity The sample size n The number of predictor variables k
The sample size n The number of predictor variables k
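The penalty adjusted R² applies can be shown with the standard formula adj R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The R² values below are made-up numbers chosen to illustrate how a tiny R² gain from an extra predictor can still lower adjusted R²:

```python
# Adjusted R^2 penalizes extra predictors:
#   adj_R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)
# The R2 values and sample size below are illustrative assumptions.

def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 30
r2_small = 0.750          # model with k = 2 predictors
r2_big = 0.755            # adding a 3rd predictor nudges R2 up slightly

print(adjusted_r2(r2_small, n, 2))
print(adjusted_r2(r2_big, n, 3))   # lower: the tiny R2 gain doesn't pay
```

Because the formula uses both n and k, adjusted R² explicitly accounts for the sample size and the number of predictor variables, as the answer states.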
Instead of se2,we generally report the standard deviation of the residual, denoted se, more commonly referred to as The descriptive statistic Goodness-of-fit The standard error of the estimate The standard deviation of the sample
The standard error of the estimate
What is used to evaluate how well the sample regression equation fits the data? The standard error of the estimate The goodness-of-fit measure The coefficient of determination, R² The dispersion of residuals
The standard error of the estimate The coefficient of determination, R²
We use analysis of variance (ANOVA) in the context of the linear regression model to derive R2. We denote the total variation in y as Σ(yi − ȳ)2, which is the numerator in the formula for the variance of y. What is this total variation called? Total sum of squares Total error Squared error Regression error
Total sum of squares
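The ANOVA decomposition can be sketched numerically. The observed and fitted values below are made-up numbers; for an OLS fit with an intercept, SST = SSR + SSE, and R² = 1 − SSE/SST:

```python
import numpy as np

# Hypothetical observed values and fitted values from some regression.
y     = np.array([3.0, 5.0, 7.0, 6.0, 9.0])
y_hat = np.array([3.5, 4.5, 6.5, 6.5, 9.0])

sst = np.sum((y - y.mean()) ** 2)   # total sum of squares (total variation)
sse = np.sum((y - y_hat) ** 2)      # error (residual) sum of squares
r2 = 1 - sse / sst                  # coefficient of determination

# For an OLS fit with an intercept, SST splits as SSR + SSE,
# so r2 also equals SSR / SST.
print(sst, sse, round(r2, 3))
```

Here SST = 20, SSE = 1, so R² = 0.95: the regression accounts for 95% of the total variation in y.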
True or false: Linearity is justified if the residuals are randomly dispersed across the values of a predictor variable. True False
True
We can plot the residuals sequentially over time to look for correlated observations. How are violations indicated? -When positive residuals are shown consistently over time and negative residuals are shown consistently over time -When all the residuals are negative -When positive residuals and negative residuals alternate over a few periods, sometimes positive or negative for a couple of periods. -There is no detection method
When positive residuals and negative residuals alternate over a few periods, sometimes positive or negative for a couple of periods.
If one or more of the relevant predictor variables are excluded, then the resulting OLS estimators are biased. The extent of the bias depends on the degree of the '____________' between the included and the excluded predictor variables.
correlation
We can use residual plots to gauge changing variability. The residuals are generally plotted against each predictor variable xj. There is a violation if the variability increases or '______________' over the values of xj.
decreases
When comparing models with the same response variable, we prefer the model with a smaller se. A smaller se implies that there is '_________________' dispersion of the observed values from the predicted values.
less
When confronted with multicollinearity, the best approach may be to do '_______________________' if the estimated model yields a high R2.
nothing
If we include as many dummy variables as there are categories, then their sum will equal '__________________'.
one
If a linear regression model uses only one predictor variable, then the model is referred to as a '______________' linear regression model
simple
In the presence of changing variability, the OLS estimators are '_______________________', but their estimated standard errors are inappropriate
unbiased
Suppose the competing hypotheses in testing for individual significance are H0: βj = 0 versus HA: βj ≠ 0. What would rejecting the null hypothesis imply? xj explains all the variation in y xj is not significant in explaining y xj is significant in explaining y We would accept the null hypothesis
xj is significant in explaining y
The simple linear regression model y = β0 + β1x + ɛ implies that if x goes up by one unit, we expect y to change by how much (irrespective of the value of x)? ɛ β1 x β0
β1
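This interpretation of the slope can be checked directly: for a fitted line, the predicted change from a one-unit increase in x is always b1, no matter where x starts. The coefficients below are made-up illustrative numbers:

```python
# For a fitted simple regression y_hat = b0 + b1*x, a one-unit increase
# in x changes the prediction by exactly b1, whatever the starting x.
# b0 and b1 are arbitrary illustrative values.
b0, b1 = 5.0, 2.5

def predict(x):
    return b0 + b1 * x

for x in (0.0, 10.0, -3.0):
    change = predict(x + 1) - predict(x)
    print(x, change)   # the change equals b1 = 2.5 at every x
```

This constant marginal effect is exactly what "irrespective of the value of x" means in the question.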