ECON 803 Midterm Review
An F-test statistic is:
((RRSS - URSS)/m) / (URSS/(T - k)) ~ F(m, T - k)
Suppose the researcher estimates a second regression (see below) where all factors but excess market return are excluded and then calculates an F-statistic to test the hypothesis that the coefficients of the excluded variables are jointly insignificant. The correct formula for the F-statistic is:
((4819 - 4150)/4150) * ((276 - 6)/4) ~ F(4, 276 - 6)
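As a quick arithmetic check, the statistic can be evaluated directly. This is a minimal sketch that reads RRSS = 4819, URSS = 4150, T = 276, k = 6 and m = 4 off the answer above; the variable names are illustrative only.

```python
# Restricted vs. unrestricted F-test, using the numbers quoted in the answer above.
rrss = 4819.0   # residual sum of squares, restricted regression
urss = 4150.0   # residual sum of squares, unrestricted regression
T = 276         # number of observations (as read from the answer)
k = 6           # regressors in the unrestricted model, including the constant
m = 4           # number of restrictions (excluded variables)

f_stat = ((rrss - urss) / m) / (urss / (T - k))
print(round(f_stat, 2))  # 10.88
```

The result would then be compared with the critical value from an F(4, 270) distribution.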
Put the following steps of the model-building process in the order in which it would be statistically most appropriate to do them: (i) Estimate model (ii) Conduct hypothesis tests on coefficients (iii) Remove or replace variables to eliminate autocorrelation, heteroscedasticity and multicollinearity (iv) Conduct diagnostic tests on the model residuals.
(a) (i) then (iv) then (iii) then (ii)
If a regression equation omits an important variable,
(a) The coefficients on other variables will be biased and inconsistent if the excluded variable is correlated with them; (b) The intercept estimate will be biased and inconsistent; (c) The standard errors will be biased upwards; (d) *All of the above.
Which of the following models can be estimated using ordinary least squares? (i) An AR(1) (ii) An ARMA(2,0) (iii) An MA(1) (iv) An ARMA(1,1).
(i) and (ii) only
Corr(yt, yt-1) = τ1 = θ1σ^2 / ((1 + θ1^2)σ^2). Which of the following statements are TRUE? (i) An MA(q) can be expressed as an AR(infinity) if it is invertible (ii) An AR(p) can be written as an MA(infinity) if it is stationary (iii) The unconditional (long run) mean of an ARMA process will depend only on the intercept and on the AR coefficients and not on the MA coefficients (iv) An AR(1) process will have zero pacf except at lag 1
(i), (ii), (iii), and (iv).
If the residuals of a regression on a large sample are found to be heteroscedastic which of the following might be a likely consequence? (i) The coefficient estimates are biased (ii) The standard error estimates for the slope coefficients may be too small (iii) Hypothesis tests may be wrong.
(ii) and (iii) only
If OLS is used in the presence of heteroscedasticity, which of the following will be likely consequences? (i) Coefficient estimates may be misleading (ii) Hypothesis tests could reach the wrong conclusions (iii) Forecasts made from the model could be biased (iv) Standard errors may be inappropriate.
(ii) and (iv) only
Suppose you had to guess at the most likely value of a one hundred-step-ahead forecast for the AR(2) model given in Question 14 - what would your forecast be?
-0.27
Consider the following AR(2) model. What is the optimal 2-step-ahead forecast for y if all information available is up to and including time t, if the values of y at time t, t-1 and t-2 are -0.3, 0.4 and -0.1, respectively, and the value of u at time t-1 is 0.3? yt = -0.1 + 0.75yt-1 - 0.125yt-2 + ut
-0.34
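Both AR(2) forecast answers (the 2-step-ahead value and the hundred-step-ahead guess) can be reproduced by iterating the model forward with future shocks set to zero. A minimal sketch, with illustrative variable names:

```python
# AR(2): y_t = c + phi1*y_{t-1} + phi2*y_{t-2} + u_t; future shocks are replaced by their mean of zero.
c, phi1, phi2 = -0.1, 0.75, -0.125
y_t, y_tm1 = -0.3, 0.4   # latest two observations; y_{t-2} and u_{t-1} are not needed

f1 = c + phi1 * y_t + phi2 * y_tm1   # 1-step-ahead forecast
f2 = c + phi1 * f1 + phi2 * y_t      # 2-step-ahead forecast
long_run = c / (1 - phi1 - phi2)     # unconditional mean, the limit of the forecasts
print(round(f1, 3), round(f2, 2), round(long_run, 2))  # -0.375 -0.34 -0.27
```

The values of y at t-2 and of u at t-1 given in the question are distractors: an AR(2) forecast made at time t only needs the two most recent observations.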
Assuming there are 1000 observations in your sample, what are the critical values of a two-sided hypothesis test of whether the true value of β is statistically different from zero given a 5% significance level in Example 1?
-1.96 and 1.96
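With 1000 observations the t-distribution is practically indistinguishable from the standard normal, so the critical values can be sketched from the normal quantile function. This assumes the normal approximation is acceptable and uses only the Python standard library:

```python
from statistics import NormalDist

alpha = 0.05                                  # significance level of the two-sided test
upper = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile at 97.5%
print(round(-upper, 2), round(upper, 2))      # -1.96 1.96
```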
Consider a series that follows an MA(1) with zero mean and a moving average coefficient of 0.5. What is the value of the autocorrelation function at lag 2?
0
Consider the following MA(3) process yt = 0.1 + 0.4ut-1 + 0.2ut-2 - 0.1ut-3 + ut What is the optimal forecast for yt, 3 steps into the future (i.e., for time t+2 if all information until time t-1 is available), if you have the following data? ut-1 = 0.3; ut-2 = -0.6; ut-3 = -0.3
0.07
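The MA(3) forecast follows from replacing every shock dated after t-1 with zero, so only the u(t-1) term survives at the t+2 horizon. A minimal sketch; the helper name `ma_forecast` and the offset-keyed dictionary are illustrative choices, not part of the question:

```python
# MA(3): y_t = mu + u_t + th1*u_{t-1} + th2*u_{t-2} + th3*u_{t-3}
mu = 0.1
theta = [0.4, 0.2, -0.1]                # th1, th2, th3
known = {-1: 0.3, -2: -0.6, -3: -0.3}   # shocks observed up to time t-1, keyed by offset from t

def ma_forecast(target):
    """Forecast y at time t+target; shocks dated after t-1 are replaced by zero."""
    return mu + sum(th * known.get(target - lag, 0.0)
                    for lag, th in enumerate(theta, start=1))

print(round(ma_forecast(2), 2))  # 0.07
```

At any horizon beyond t+2 every shock is unknown and the forecast collapses to the intercept, 0.1.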
Consider a series that follows an MA(1) with zero mean and a moving average coefficient of 0.5. What is the value of the autocorrelation function at lag 1?
0.40
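The two MA(1) autocorrelation answers (0.40 at lag 1, 0 at lag 2) come from the standard result that the acf of an MA(1) is θ/(1 + θ^2) at lag 1 and zero thereafter. A small sketch, with a hypothetical helper name:

```python
def ma1_acf(theta, lag):
    """Autocorrelation of y_t = u_t + theta*u_{t-1} at the given lag."""
    if lag == 0:
        return 1.0
    if abs(lag) == 1:
        return theta / (1 + theta ** 2)
    return 0.0   # an MA(1) has no memory beyond one period

print(ma1_acf(0.5, 1), ma1_acf(0.5, 2))  # 0.4 0.0
```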
Use the following to answer Questions 28 to 30. A researcher is interested in forecasting the house price index in Country Z. The observed price index values from 1996 to 2000 are 101, 103, 104, 107 and 111. The researcher uses two different forecasting models, A and B. The forecasts for the price index using Model A are 100.5, 102.4, 103.2, 106 and 111, whilst the forecasts using Model B are 100.8, 102.2, 104, 104.2 and 112.1. Which values are closest to the mean squared errors of the forecasts from Models A and B?
0.45 and 1.95, respectively
Which values are closest to the mean absolute errors of Models A and B?
0.58 and 0.98, respectively
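The MSE and MAE figures can be checked directly from the five observations and forecasts given in the question. A minimal sketch, assuming the observed series is 101, 103, 104, 107, 111:

```python
actual  = [101, 103, 104, 107, 111]
model_a = [100.5, 102.4, 103.2, 106, 111]
model_b = [100.8, 102.2, 104, 104.2, 112.1]

def mse(y, f):
    """Mean squared forecast error."""
    return sum((a - b) ** 2 for a, b in zip(y, f)) / len(y)

def mae(y, f):
    """Mean absolute forecast error."""
    return sum(abs(a - b) for a, b in zip(y, f)) / len(y)

print(round(mse(actual, model_a), 2), round(mse(actual, model_b), 2))  # 0.45 1.95
print(round(mae(actual, model_a), 2), round(mae(actual, model_b), 2))  # 0.58 0.98
```

Model A has the smaller error on both measures, which is why it is judged the better forecaster.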
Consider the following model estimated for a time series yt = 0.3 + 0.5 yt-1 - 0.4 et-1 + et where et is a zero mean error process. What is the (unconditional) mean of the series, yt ?
0.6
The roots of the process are
1 and 1/2
Suppose you have calculated the following regression results: yt = 1.25 + 0.64xt. The standard errors of alpha and beta are 1.22 and 0.58, respectively. Assume the hypothesis is to test whether the true value of β is statistically different from zero. What is the t-statistic value in Example 1?
1.10
Suppose you have 5-year annual data on the excess returns on a fund manager's portfolio ('fund ABC') and the excess returns on a market index (where r_ABC is the return on fund ABC, r_f is the risk-free rate and r_M is the return on the market index). The estimated intercept, alpha-hat, and slope, beta-hat, are 2.3 and 3.1, respectively. If the excess return on the market index is 12%, what would we estimate the excess return of Fund ABC to be?
39.5%
The graphs above are time series plots of residuals from two separate regressions. Which of these combinations is true?
A shows negative autocorrelation and B shows positive autocorrelation
Which of the following sets of characteristics would usually best describe an autoregressive process of order 3 (i.e., an AR(3))?
A slowly decaying acf, and a pacf with 3 significant spikes
What is the relationship, if any, between the normal and t-distributions?
A t-distribution with infinite degrees of freedom is a normal distribution
Which of the following is FALSE?
A t-distribution with infinite degrees of freedom is an F-distribution
A process, xt, which has a constant mean and variance, and zero autocovariance for all lags is best described as
A white noise process
French-Fama Factor Model of excess returns on DuPont stock from 1999-2004. Excess market return=(Mkt-RF), SMB=small firm factor, HML=high book-to-value factor, termspread=slope of yield curve, creditspread=(corporate yields - Treasury yields). The French-Fama Model explains
About 34% of the variation in excess returns on DuPont stock
For Questions 2 and 3, consider the following set of simultaneous equations: (1) Y1t = α0 + α1Y2t + α2Y3t + α4X1t + u1t; (2) Y2t = β0 + β1X1t + β3X2t + β4X3t + u2t; (3) Y3t = γ0 + γ1Y1t + u3t. Assume that the Y's are endogenous and the X's exogenous variables, and that the error terms are uncorrelated. 2. Which of the following statements is true of equation (3)?
According to the order condition, it is over-identified
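The order condition can be checked mechanically by counting the variables excluded from each equation and comparing with G - 1, where G is the number of equations. The sketch below takes the three equations exactly as printed in the question; the set names and the helper `order_condition` are illustrative. Note that the order condition is necessary but not sufficient for identification.

```python
# Order condition: compare the number of variables excluded from an equation with G - 1.
system_vars = {"Y1", "Y2", "Y3", "X1", "X2", "X3"}
equations = {
    1: {"Y1", "Y2", "Y3", "X1"},   # variables appearing in equation (1)
    2: {"Y2", "X1", "X2", "X3"},   # variables appearing in equation (2)
    3: {"Y3", "Y1"},               # variables appearing in equation (3)
}
G = len(equations)   # number of equations / endogenous variables

def order_condition(included):
    excluded = len(system_vars - included)
    if excluded > G - 1:
        return "over-identified"
    if excluded == G - 1:
        return "just identified"
    return "not identified"

for number, included in equations.items():
    print(number, order_condition(included))
```

Equation (3) excludes four variables (Y2, X1, X2, X3), which exceeds G - 1 = 2.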
Which of the following conditions must hold for the autoregressive part of an ARMA model to be stationary?
All roots of the characteristic equation must lie outside the unit circle
Consider the following picture and suggest the model from the following list that best characterises the process:
An ARMA(1,1) Explanation: The acf is clearly declining very slowly in this case, which is consistent with there being an autoregressive part to an appropriate model. The pacf is clearly significant for lags 1 and 2, but the question is: does it then become insignificant for lags 3 and 4, indicating an AR(2) process, or does it remain significant, which would be more consistent with a mixed ARMA process? Given the huge size of the sample that gave rise to this acf and pacf, even a pacf value of 0.001 would still be statistically significant. Thus an ARMA process is the most likely candidate.
To test if residuals are normally distributed, one uses a
Bera-Jarque test
If the residuals of a model containing lags of the dependent variable are autocorrelated, which one of the following could this lead to?
Biased and inconsistent coefficient estimates
Which of these is a test for residual autocorrelation?
Breusch-Godfrey test
Estimation of equation (2) on its own using OLS would result in
Coefficient estimates that are neither unbiased nor consistent.
Which one of the following is NOT a symptom of near multicollinearity?
Confidence intervals on parameter estimates are narrow
If a regression equation contains an irrelevant variable, the parameter estimates will be
Consistent and unbiased but inefficient
Which of these is an appropriate way to determine the order of an ARMA model required to capture the dynamic features of a given data?
Determining the number of parameters that minimises the information criteria
Which of these assumptions is violated when an equation is estimated using OLS when it is in fact part of a simultaneous structural system?
E(X'u) /= 0
French-Fama Factor Model of excess returns on DuPont stock from 1999-2004. Excess market return=(Mkt-RF), SMB=small firm factor, HML=high book-to-value factor, termspread=slope of yield curve, creditspread=(corporate yields - Treasury yields). Answer the questions below. Based on the French-Fama Model, at the 5% significance level, which factors (explanatory variables) are significantly different from zero?
Excess market return, SMB and HML are significantly different from zero.
Standard errors
Give us an idea of the precision of estimates of alpha and beta
The estimators a-hat and b-hat determined by OLS will be the Best Linear Unbiased Estimators (BLUE) if which of the following assumptions hold? (I) The errors have zero mean (II) The variance of the errors is constant and finite over all values of the independent variable(s) (III) The errors are linearly independent of one another (IV) There is no relationship between the error and corresponding independent variables
I, II, III, and IV.
Which of these statements is true? (I) The F-distribution has 2 degrees of freedom parameters (II) Asymptotically, the LM test and the Wald test are equivalent (III) The results from the LM and Wald tests may differ somewhat in small samples (IV) The F-distribution is a special case of the t-distribution.
I, II, and III
Which of these is a viable solution to the problem of multicollinearity? (I) Ignore it (II) Drop one of the collinear variables (III) Transform the highly correlated variables into an average (IV) Take the logs of the variables
I, II, and III only
Which of the following is the most accurate definition of the term 'the OLS estimator'?
It is a formula that, when applied to the data, will yield the parameter estimates
What would be the consequences for the OLS estimator if autocorrelation is present in a regression model but ignored?
It will be inefficient
Based on the MAE and MSE forecast evaluation metrics, which of these statements are true?
Model A outperforms Model B at forecasting the house price index
Characteristic eq: 1 - 3z + 2z^2 = 0 -> (1 - 2z)(1 - z) = 0 -> z = 1, 1/2. Is the following process stationary?
No
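The stationarity check can be sketched by solving the characteristic equation from the note and testing whether every root lies strictly outside the unit circle. The equation 1 - 3z + 2z^2 = 0 would correspond to a process such as yt = 3yt-1 - 2yt-2 + ut; that is an inference from the note, not something the card states.

```python
import cmath

# Characteristic equation 1 - 3z + 2z^2 = 0, written as a*z^2 + b*z + c = 0.
a, b, c = 2.0, -3.0, 1.0
disc = cmath.sqrt(b * b - 4 * a * c)
roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]

# Stationarity requires every root to lie strictly outside the unit circle.
stationary = all(abs(z) > 1 for z in roots)
print([abs(z) for z in roots], stationary)  # [1.0, 0.5] False
```

One root lies on the unit circle and the other inside it, so the process is non-stationary.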
Suppose you have 103 weekly observations on the excess return on Facebook and the S&P500 index. You estimate a linear regression with the excess return on Facebook as the dependent variable and the excess return on the market index as the independent variable. You obtain α=-.03 with SE(α)=.30, β=1.2 with SE(β)=.20, standard deviation of error=3.0. Recall that a significance level of 5% corresponds to a z-value of 1.96. Answer the questions below. Based on the CAPM model, the 95% confidence level for β includes
One, but not 0.
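The interval behind this answer is the point estimate plus or minus 1.96 standard errors. A minimal sketch using the figures from the question, with illustrative variable names:

```python
beta_hat, se_beta = 1.2, 0.20
z_crit = 1.96   # 5% two-sided critical value quoted in the question

lower = beta_hat - z_crit * se_beta
upper = beta_hat + z_crit * se_beta
print(round(lower, 3), round(upper, 3))            # 0.808 1.592
print(lower <= 1 <= upper, lower <= 0 <= upper)    # True False
```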
Which one of the following statements must hold for EVERY CASE concerning the residual sums of squares for the restricted and unrestricted regressions?
RRSS >= URSS.
The second stage in two-stage least squares estimation of a simultaneous system would be to
Replace the endogenous variables that are on the RHS of the structural equations with their reduced form fitted values
Two researchers have identical models, data, coefficients and standard error estimates. They test the same hypothesis using a two-sided alternative, but researcher 1 uses a 5% size of test while researcher 2 uses a 10% test. Which one of the following statements is correct?
Researcher 2 will have a higher probability of type I error
If our regression equation is y = Xβ + u, where we have T observations and k regressors, what will be the dimension of uu' using the standard matrix notation?
T x T
Which of these is NOT a viable 'solution' for heteroscedasticity?
Taking the first differences of the series.
Why is R^2 a commonly used and perhaps better measure of how well a regression model fits the data than the residual sum of squares (RSS)?
The RSS depends on the scale of the dependent variable whereas the R^2 does not
Which of these is not a consequence of ignoring autocorrelation if it is present?
The coefficient estimates derived using OLS are biased
Which one of the following is NOT an assumption of the classical linear regression model?
The dependent variable is not correlated with the disturbance terms
What is the most appropriate interpretation of the assumption cov(ut,uj) = 0 concerning the regression disturbance terms?
The errors are linearly independent of one another.
In a time-series regression of the excess return of a mutual fund on a constant and the excess return on a market index, which of the following statements should be true for the fund manager to be considered to have 'beaten the market' in a statistical sense?
The estimate for alpha should be positive and statistically significant
A recursive forecasting framework is one where
The initial estimation date is fixed but additional observations are added one at a time to the estimation period
The type I error associated with testing a hypothesis is equal to
The size of the test
Simultaneous equations bias is a situation where
There is a two-way causal relationship between the explanatory and explained variable
What is the long-run solution to the following dynamic econometric model? Δyt = b1 + b2ΔX2t + b3ΔX3t + ut
There is no long-run solution to this equation.
Which of the following would NOT be a potential remedy for the problem of multicollinearity between regressors?
Transforming the data into logarithms
Which of the following is a correct interpretation of a '95% confidence interval' for a regression parameter?
We are 95% sure that the interval contains the true value of the parameter
To test if residuals are homoscedastic, one could use the
White test
Type I error is made when
a null hypothesis is rejected when it is actually true
The power of a test
all of the above - Depends on alpha, the significance level - The assumed value of the true parameter - The sample size
Statistic            Distrib.   Value   df   Prob>Crit.Value
Breusch-Godfrey LM   Chi-sq     7.0     4    0.130
Breusch-Pagan        Chi-sq     19.0    5    0.002
5. These diagnostic statistics for the French-Fama regression above indicate that the residuals are:
heteroscedastic but not autocorrelated.
Consider a bivariate regression model with coefficient standard errors calculated using the usual formulae. Which of the following statements is/are correct regarding the standard error estimator for the slope coefficient? (i) It varies positively with the square root of the residual variance (s) (ii) It varies positively with the spread of X about its mean value (iii) It varies positively with the spread of X about zero (iv) It varies positively with the sample size T
i only
Question 3 refers to the following regression estimated on 64 observations: yt = b1 + b2X2t + b3X3t + b4X4t + ut. Which of the following null hypotheses could we test using an F-test? (i) b2 = 0 (ii) b2 = 1 and b3 + b4 = 1 (iii) b3b4 = 1 (iv) b2 - b3 - b4 = 1.
i, ii, iv only
Suppose that the value of R^2 for an estimated regression model is exactly one. Which of the following are true? (i) All of the data points must lie exactly on the line (ii) All of the residuals must be zero (iii) All of the variability of y about its mean has been explained by the model (iv) The fitted line will be horizontal with respect to all of the explanatory variables.
i,ii,iii only
If the F-statistic is larger than the critical F-statistic value chosen (e.g. at the 5% level) the researcher should conclude that the excluded variables are:
jointly significant.
If our regression equation is y = Xβ + u, where we have T observations and k regressors, what will be the dimension of β-hat using the standard matrix notation?
k x 1
Estimates of the slope and intercept with the classical regression model
minimize the sum of the squared residuals
A regression has a high R^2 with low t-statistics for its explanatory variables. Which problem is most likely?
multicollinearity
Which of the following is NOT correct with regard to the p-value attached to a test statistic?
p-values can only be used for two-sided tests
Our regression equation is y = Xβ + u, where we have T observations and k regressors. Estimation by ordinary least squares minimizes
u'u
What is the long-run solution to the following dynamic econometric model?
y = -(B1 + B5X2 + B6X3)/B4
What is the relevant encompassing model required to compare the two regression models?
yt = y1 + y2x2t + y3x3t + y4x4t + y5x5t + wt
The probability of a Type I error is
α, the significance level chosen
Suppose you have 103 weekly observations on the excess return on Facebook and the S&P500 index. You estimate a linear regression with the excess return on Facebook as the dependent variable and the excess return on the market index as the independent variable. You obtain α=-.03 with SE(α)=.30, β=1.2 with SE(β)=.20, standard deviation of error=3.0. Recall that a significance level of 5% corresponds to a z-value of 1.96. Answer the questions below. 1. Based on the CAPM model, at the 5% significance level, which is true?
β is significantly different from zero, whereas α is not.
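The conclusion follows from the two t-ratios, each compared with the 1.96 critical value quoted in the question. A minimal sketch; the dictionary layout is an illustrative choice:

```python
z_crit = 1.96
estimates = {"alpha": (-0.03, 0.30), "beta": (1.2, 0.20)}   # (coefficient, standard error)

# Test statistic for H0: coefficient = 0 is the estimate divided by its standard error.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
significant = {name: abs(t) > z_crit for name, t in t_stats.items()}

for name in estimates:
    print(name, round(t_stats[name], 2), significant[name])
```

The t-ratio is about -0.1 for α and about 6 for β, so only β is significantly different from zero at the 5% level.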
Test statistics for the LM test and the Wald test are usually constructed to follow a
χ^2 distribution and F-distribution, respectively