CFA Level II - Stats


The T-STATISTIC used to test the significance of the individual coefficients in a multiple regression is calculated using the same formula that is used with simple linear regression.

t = (estimated regression coefficient - hypothesized value) / standard error of bj, with df = n - k - 1. When a question says "test the statistical significance," that means the hypothesized value will be ZERO.
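A minimal sketch of this t-test in Python; the coefficient, standard error, n, and k are hypothetical values that mirror the ABC example that appears later in these notes:

```python
from scipy.stats import t

# Hypothetical values; they mirror the ABC example later in these notes
b_hat, b_null, std_err = 0.64, 0.0, 0.26   # estimate, hypothesized value, coefficient SE
n, k = 36, 1                               # observations, number of slope coefficients

t_stat = (b_hat - b_null) / std_err        # (estimate - hypothesized value) / SE
df = n - k - 1
t_crit = t.ppf(1 - 0.05 / 2, df)           # two-tailed 5% critical value

print(f"t = {t_stat:.2f}, critical = +/-{t_crit:.2f}, reject H0: {abs(t_stat) > t_crit}")
```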

To determine what type of model is best suited to meet your needs, follow these guidelines:

1. Determine your goal.
   • Are you attempting to model the relationship of a variable to other variables (e.g., cointegrated time series, cross-sectional multiple regression)?
   • Are you trying to model the variable over time (e.g., trend model)?
2. If you have decided on using a time series analysis for an individual variable, plot the values of the variable over time and look for characteristics that would indicate nonstationarity, such as non-constant variance (heteroskedasticity), non-constant mean, seasonality, or structural change. A structural change is indicated by a significant shift in the plotted data at a point in time that seems to divide the data into two or more distinct patterns. In that case, you have to run two different models, one incorporating the data before and one after that date, and test whether the time series has actually shifted. If the time series has shifted significantly, a single time series encompassing the entire period (i.e., both patterns) will likely produce unreliable results.
3. If there is no seasonality or structural shift, use a trend model.
   • If the data plot on a straight line with an upward or downward slope, use a linear trend model.
   • If the data plot in a curve, use a log-linear trend model.
4. Run the trend analysis, compute the residuals, and test for serial correlation using the Durbin-Watson test.
   • If you detect no serial correlation, you can use the model.
   • If you detect serial correlation, you must use another model (e.g., AR).
5. If the data has serial correlation, reexamine the data for stationarity before running an AR model. If it is not stationary, treat the data for use in an AR model as follows:
   • If the data has a linear trend, first-difference the data.
   • If the data has an exponential trend, first-difference the natural log of the data.
   • If there is a structural shift in the data, run two separate models as discussed above.
   • If the data has a seasonal component, incorporate the seasonality in the AR model as discussed below.
6. After first-differencing in step 5, if the series is covariance stationary, run an AR(1) model and test for serial correlation and seasonality.
   • If there is no remaining serial correlation, you can use the model.
   • If you still detect serial correlation, incorporate lagged values of the variable (possibly including one for seasonality, e.g., for monthly data, add the 12th lag of the time series) into the AR model until you have removed (i.e., modeled) any serial correlation.
7. Test for ARCH. Regress the square of the residuals on squares of lagged values of the residuals and test whether the resulting coefficient is significantly different from zero.
   • If the coefficient is not significantly different from zero, you can use the model.
   • If the coefficient is significantly different from zero, ARCH is present. Correct using generalized least squares.
8. If you have developed two statistically reliable models and want to determine which is better at forecasting, calculate their out-of-sample RMSE. (A rough code sketch of steps 3 through 6 follows this list.)
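As a rough illustration of steps 3 through 6, the sketch below (hypothetical simulated data; NumPy and statsmodels assumed available) fits a linear trend, checks the residuals with the Durbin-Watson statistic, then first-differences the series and refits it as an AR(1):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical series: a random walk with drift, so a simple trend model should fail the DW check
rng = np.random.default_rng(0)
y = np.cumsum(0.5 + rng.normal(size=120))

# Steps 3-4: fit a linear trend, then check the residuals with Durbin-Watson
t = np.arange(1, len(y) + 1)
trend_fit = sm.OLS(y, sm.add_constant(t)).fit()
print("trend model DW:", round(durbin_watson(trend_fit.resid), 2))   # well below 2 => serial correlation

# Steps 5-6: first-difference the data, then fit an AR(1) on the differenced series
dy = np.diff(y)
ar1_fit = sm.OLS(dy[1:], sm.add_constant(dy[:-1])).fit()
print("AR(1) on differences DW:", round(durbin_watson(ar1_fit.resid), 2))  # close to 2
```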

20.Qualitative dependent variables should be verified using: A.a dummy variable based on the logistic distribution. B.a discriminant model using a linear function for ranked observations. C.tests for heteroskedasticity, serial correlation, and multicollinearity.

C All qualitative dependent variable models must be tested for heteroskedasticity, serial correlation, and multicollinearity. Each of the alternatives is a potential example of a qualitative dependent variable model, but none are universal elements of all qualitative dependent variable models.

F-Statistic

An F-test assesses how well the set of independent variables, as a group, explains the variation in the dependent variable. That is, the F-statistic is used to test whether at least one of the independent variables explains a significant portion of the variation of the dependent variable. For example, if there are four independent variables in the model, the hypotheses are structured as: H0: b1 = b2 = b3 = b4 = 0 versus Ha: at least one bj ≠ 0. F = MSR / MSE = (RSS / k) / [SSE / (n - k - 1)]
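A minimal sketch of the F-test in Python; the sums of squares are hypothetical and reuse the value-stock example that appears later in these notes:

```python
from scipy.stats import f

# Hypothetical ANOVA figures (they reuse the value-stock example later in these notes)
RSS, SSE, k, n = 290.0, 170.0, 5, 60       # explained SS, unexplained SS, slopes, observations

F = (RSS / k) / (SSE / (n - k - 1))        # MSR / MSE
F_crit = f.ppf(0.95, dfn=k, dfd=n - k - 1) # one-tailed 5% critical value

print(f"F = {F:.2f}, critical = {F_crit:.2f}, reject H0: {F > F_crit}")
```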

Hypothesis Testing of Regression Coefficients

As with simple linear regression, the magnitude of the coefficients in a multiple regression tells us nothing about the importance of the independent variable in explaining the dependent variable. Thus, we must conduct hypothesis testing on the estimated slope coefficients to determine if the independent variables make a significant contribution to explaining the variation in the dependent variable.

test of significance for the CORRELATION coefficient

Assuming that the two populations are normally distributed: t = [correlation x sqrt(n - 2)] / sqrt(1 - correlation^2), with df = n - 2. Reject the null if t > +t critical or t < -t critical (think of the distribution: reject when the calculated t-value is in the tails). IF YOU CAN'T REJECT THE NULL, WE CONCLUDE THAT THE CORRELATION BETWEEN X AND Y IS NOT SIGNIFICANTLY DIFFERENT FROM ZERO AT THAT SIGNIFICANCE LEVEL.
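A minimal sketch of this test in Python; the sample correlation and sample size below are hypothetical:

```python
import math
from scipy.stats import t

r, n = 0.50, 30                                  # hypothetical sample correlation and sample size

t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_crit = t.ppf(1 - 0.05 / 2, n - 2)              # df = n - 2

print(f"t = {t_stat:.2f}, critical = +/-{t_crit:.2f}, reject H0 (correlation = 0): {abs(t_stat) > t_crit}")
```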

12.Which of the following situations is least likely to result in the misspecification of a regression model with monthly returns as the dependent variable? A.Failing to include an independent variable that is related to monthly returns. B.Using leading P/E from the previous period as an independent variable. C.Using actual inflation as an independent variable to proxy for expected inflation

B Using leading P/E from a prior period as an independent variable in the regression is unlikely to result in misspecification because it is not related to any of the six types of misspecifications previously discussed. We're not forecasting the past because leading P/E is calculated using beginning-of-period stock price and a forecast of earnings for the next period. Omitting a relevant independent variable from the regression and using actual instead of expected inflation (measuring the independent variable in error) are likely to result in model misspecification.

Suppose that you want to predict stock returns with GDP growth. Which variable is the independent variable?

Because GDP is going to be used as a predictor of stock returns, stock returns are being explained by GDP. Hence, stock returns are the dependent (explained) variable, and GDP is the independent (explanatory) variable.

Effect of Serial Correlation on Regression Analysis

Because of the tendency of the data to cluster together from observation to observation, positive serial correlation typically results in coefficient standard errors that are too small, even though the estimated coefficients are consistent. These small standard error terms will cause the computed t-statistics to be LARGER than they should be, which will cause too many Type I errors: the rejection of the null hypothesis when it is actually true. The F-test will also be unreliable because the MSE will be underestimated leading again to too many Type I errors

There are two advantages of a carefully crafted simulation:

• Better input quality. Superior inputs are likely to result when an analyst goes through the process of selecting a proper distribution for critical inputs, rather than relying on single best estimates. The distribution selected can additionally be checked for conformity with historical or cross-sectional data.
• Provides a distribution of expected value rather than a point estimate. The distribution of an investment's expected value provides an indication of risk in the investment. Note that simulations do not provide better estimates of expected value. (Expected values from simulations should be close to the expected value obtained using point estimates of individual inputs.)

three types of constraints

1) Book value constraints. Book value constraints are imposed on a firm's book value of equity. There are two types of restrictions on book value of equity that may necessitate risk hedging:
   • Regulatory capital requirements. Banks and insurance companies are required to maintain adequate levels of capital. Violations of minimum capital requirements are considered serious and could threaten the very existence of the firm.
   • Negative equity. In some countries, negative book value of equity may have serious consequences. For example, some European countries require firms to raise additional capital in the event that book value of equity becomes negative.
2) Earnings and cash flow constraints. Earnings or cash flow constraints can be imposed internally to meet analyst expectations or to achieve bonus targets. Sometimes, failure to meet analyst expectations could result in job losses for the executive team. In such cases, executives may find it important to pursue expensive risk hedging. Risk hedging in this context is then not related to the value of the firm, but rather to managerial employment contracts or compensation levels. Earnings constraints can also be imposed externally, such as a loan covenant. Violating such a constraint could be very expensive for the firm.
3) Market value constraints. Market value constraints seek to minimize the likelihood of financial distress or bankruptcy for the firm. In a simulation, we can explicitly model the entire distribution of key input variables to identify situations where financial distress would be likely. We can then explicitly incorporate the costs of financial distress in a valuation model for the firm.

19.Which of the following is not correct about the Dickey-Fuller unit root test for nonstationarity? A.The null hypothesis is that the time series has a unit root. B.A hypothesis test is conducted using critical values computed by Dickey and Fuller in place of conventional t-test values. C.If the test statistic is significant, we conclude that the times series is nonstationary.

C For a unit root test, the null hypothesis is that the time series has a unit root. For testing for unit roots, the Dickey-Fuller (DF) test computes the conventional t-statistic, which is then compared against the revised set of critical values computed by DF. If the test statistic is significant, we reject the null hypothesis (that the time series has a unit root), implying that a unit root is not present.

The Y variable is regressed against the X variable resulting in a regression line that is horizontal with the plot of the paired observations widely dispersed about the regression line. Based on this information, which statement is most likely accurate? The R2 of this regression is close to 100%. X is perfectly positively correlated to Y. The correlation between X and Y is close to zero.

C Perfect correlation means that all the observations fall on the regression line. An R2 of 100% means perfect correlation. When there is no correlation, the regression line is horizontal.

Constraints

Constraints are specific limits imposed by users of simulations as a risk assessment tool. A constraint is a condition that, if violated, would pose dire consequences for the firm. For example, one constraint might correspond to the company not being able to meet its contractual debt obligations—the cost of which can be substantial. Firms employ expensive hedging tools to ensure that such constraints are not violated. Decisions about whether and how to hedge risk are made after evaluating the cost of different hedging tools versus their effectiveness in preventing violation of such constraints

Nonlinear Relationships

Correlation measures the linear relationship between two variables. That's why in the first panel of Figure 3 the data points lie perfectly on a straight line when the two variables are perfectly positively correlated. For example, Y = 6 - 3X is a linear relationship. However, two variables could have a nonlinear relationship yet a zero correlation. Therefore, another limitation of correlation analysis is that it does not capture strong nonlinear relationships between variables.

Checking for correlation among variables:

In this step, we use historical data to determine whether any of the probabilistic input variables are systematically related. For example, net margins may not be completely random; margins may be systematically higher at higher revenues. When there is a strong correlation between variables, we can either 1) allow only one of the variables to vary (the other variable could then be algorithmically computed), or 2) build the rules of correlation into the simulation (this necessitates more sophisticated simulation packages). If we choose to pursue the first option (i.e., allow only one variable to fluctuate randomly), the random variable should be the one that has the HIGHEST impact on valuation.

limitations of using simulations as a risk assessment tool

1. Input quality. Regardless of the complexities employed in running simulations, if the underlying inputs are poorly specified, the output will be low quality (i.e., garbage in, garbage out). In fact, the detailed output provided in a simulation may give the decision maker a false sense of making an informed decision.
2. Inappropriate statistical distributions. Real world data often does not fit the stringent requirements of statistical distributions. If the underlying distribution of an input is improperly specified, the quality of that input will be poor.
3. Non-stationary distributions. Input variable distributions may change over time, so the distribution and parameters specified for a particular simulation may not be valid anymore. For example, based on past data, we conclude that earnings growth rate has a normal distribution with a mean of 3% and variance of 2.5%. However, the parameters may have changed to a mean of 2% and variance of 5%.
4. Dynamic correlations. Correlations between input variables may not be stable. To the extent that correlations between input variables change over time, it becomes far more difficult to model them. If we model the correlation between variables based on past data and such relationships amongst variables change, the output of the simulation will be flawed.

In-sample forecasts are within the range of data (i.e., time period) used to estimate the model, which for a time series is known as the sample or test period. In-sample forecast errors are (Yt - ^Yt), where t is an observation within the sample period. In other words, we are comparing how accurate our model is in forecasting the actual data we used to develop the model. The Predicted vs. Actual Capacity Utilization figure in our Trend Analysis example shows an example of values predicted by the model compared to the values used to generate the model.

Out-of-sample forecasts are made outside of the sample period. In other words, we compare how accurate a model is in forecasting the y variable value for a time period outside the period used to develop the model. Out-of-sample forecasts are important because they provide a test of whether the model adequately describes the time series and whether it has relevance (i.e., predictive power) in the real world. Nonetheless, an analyst should be aware that most published research employs in-sample forecasts only.

Models with QUALITATIVE DEPENDENT VARIABLES (i.e. bankrupt vs. not bankrupt)

• Probit and logit models. A probit model is based on the normal distribution, while a logit model is based on the logistic distribution. Application of these models results in estimates of the probability that the event occurs (e.g., probability of default). The maximum likelihood methodology is used to estimate coefficients for probit and logit models. These coefficients relate the independent variables to the likelihood of an event occurring, such as a merger, bankruptcy, or default.
• Discriminant models. Discriminant models are similar to probit and logit models but make different assumptions regarding the independent variables. Discriminant analysis results in a linear function similar to an ordinary regression, which generates an overall score, or ranking, for an observation. The scores can then be used to rank or classify observations. A popular application of a discriminant model makes use of financial ratios as the independent variables to predict the qualitative dependent variable bankruptcy. A linear relationship among the independent variables produces a value for the dependent variable that places a company in a bankrupt or not-bankrupt class.

Limitations of Trend Models

Recall from the previous two topic reviews that one of the assumptions underlying linear regression is that the residuals are uncorrelated with each other. A violation of this assumption is referred to as autocorrelation. In this case, the residuals are persistently positive or negative for periods of time and it is said that the data exhibit serial correlation. This is a significant limitation, as it means that the model is not appropriate for the time series and that we should not use it to predict future values. In the preceding discussion, we suggested that a log-linear trend model would be better than a linear trend model when the variable exhibits a constant growth rate. However, it may be the case that even a log-linear model is not appropriate in the presence of serial correlation. In this case, we will want to turn to an autoregressive model. Recall from the previous topic review that the Durbin Watson statistic (DW) is used to detect autocorrelation. For a time series model without serial correlation DW should be approximately equal to 2.0. A DW significantly different from 2.0 suggests that the residual terms are correlated.

Misspecification #2: Variable Should Be Transformed

Regression assumes that the dependent variable is linearly related to each of the independent variables. Typically, however, market capitalization is not linearly related to portfolio returns, but rather the natural log of market cap is linearly related. If we include market cap in the regression without transforming it by taking the natural log—if we use M and not ln(M)—we've misspecified the model. R = c0 + c1B + c2M + c3lnPB + c4FF + ε Other examples of transformations include squaring the variable or taking the square root of the variable. If financial statement data are included in the regression model, a common transformation is to standardize the variables by dividing by sales (for income statement or cash flow items) or total assets (for balance sheet items). You should recognize these as items from common-size financial statements.

comparing scenario analysis, decision trees, and simulations

Scenario analysis computes the value of an investment under a finite set of scenarios (e.g., best case, worst case, and most likely case). Because the full spectrum of outcomes is not considered in these scenarios, the combined probability of the outcomes that are considered is less than 1. Decision trees are an appropriate approach when risk is both discrete and sequential. For example, imagine that an investment's value varies based on the uncertain outcome of a number of discrete sequential events, and at time t=0 there are two possible choices: make the investment or not. If we make the investment, the cash flow at time t=1 can be either high (C1H) or low (C1L). If the cash flow is high, we can then decide to expand capacity (expand or don't expand). If we expand capacity, the cash flow can be EC2H or EC2L, but if we don't expand capacity, the cash flow will either be DC2H or DC2L. Like simulations, decision trees consider all possible states of the outcome and hence the sum of the probabilities is 1. If the various uncertain variables influencing the value of an investment are correlated, such correlations can be explicitly built into the simulations. We can also incorporate such correlations (albeit subjectively) into scenario analysis. It is usually not possible to model correlations in decision trees.

Decision trees and simulations can be used as complements to risk-adjusted valuation or as substitutes for such valuation. Scenario analysis, because it does not include the full spectrum of outcomes, can only be used as a complement to risk-adjusted valuation. If used as a substitute, the cash flows in an investment are discounted at risk-free rate and then the expected value obtained is evaluated in conjunction with the variability obtained from the analysis. Alternatively, we can discount the cash flows using risk-adjusted discount rate and then ignore the variability of values. Regardless of the tool used, care should be taken to not double count risk.

Simulations are appropriate when risk is continuous. Decision trees and scenario analysis are appropriate when risk is discrete. Decision trees are suitable when the risk is discrete as well as sequential

Spurious Correlation

Spurious correlation refers to the appearance of a causal linear relationship when, in fact, there is no relation. Certain data items may be highly correlated purely by chance. For example, suppose that you compute the correlation coefficient for historical stock prices and snowfall totals in Minnesota and get a statistically significant relationship—especially for the month of January. Obviously there is no economic explanation for this relationship, so this would be considered a spurious correlation

Misspecification #3: Incorrectly Pooling Data

Suppose the relationship between returns and the independent variables during the first three years is actually different from the relationship in the second three-year period (i.e., the regression coefficients are different from one period to the next). By pooling the data and estimating one regression over the entire period, rather than estimating two separate regressions over each of the subperiods, we have misspecified the model, and our hypothesis tests and predictions of portfolio returns will be misleading.

To detect seasonality, look at the AUTOCORRELATIONS OF THE LAGS AND THEIR CORRESPONDING T-STATISTICS AND COMPARE THEM TO THE CRITICAL VALUE FROM THE T-TABLE.

The bottom part of the table contains the residual autocorrelations for the first four lags of the time series. What stands out is the relatively large autocorrelation and t-statistic for the fourth lag. With 39 observations and two parameters, (b0 and b1), there are 37 degrees of freedom. At a significance level of 5%, the critical t-value is 2.026. The t-statistics indicate that none of the first three lagged autocorrelations is significantly different from zero. However, the t-statistic at Lag 4 is 5.4460, which means that we must reject the null hypothesis that the Lag 4 autocorrelation is zero and conclude that seasonality is present in the time-series. Thus, we conclude that this model is misspecified and will be unreliable for forecasting purposes. We need to include a seasonality term to make the model more correctly specified.

Misspecification #6: Measuring Independent Variables with Error

The free float (FF) independent variable is actually trying to capture the relationship between corporate governance quality and portfolio returns. However, because we can't actually measure "corporate governance quality," we have to use a proxy variable. Wang and Xu used free float to proxy for corporate governance quality. The presumption is that the higher the level of free float, the more influence the capital markets have on management's decision making process and the more effective the corporate governance structure. However, because we're using free float as a proxy, we're actually measuring the variable we want to include in our regression—corporate governance quality—with error. Once again our regression estimates will be biased and inconsistent and our hypothesis testing and predictions unreliable. Another common example when an independent variable is measured with error is when we want to use expected inflation in our regression but use actual inflation as a proxy

Correcting Heteroskedasticity

The most common remedy, and the one recommended in the CFA curriculum, is to calculate ROBUST STANDARD ERRORS (also called White-corrected standard errors or heteroskedasticity-consistent standard errors). These robust standard errors are then used to recalculate the t-statistics using the original regression coefficients. On the exam, use robust standard errors to calculate t-statistics if there is evidence of heteroskedasticity. A second method to correct for heteroskedasticity is the use of generalized least squares, which attempts to eliminate the heteroskedasticity by modifying the original equation. Example: an analyst determines using the Breusch-Pagan test that heteroskedasticity is present, so he also estimates the White-corrected standard error for the coefficient on inflation to be 0.31. The critical two-tail 5% t-value for 118 degrees of freedom is 1.98. Is inflation statistically significant at the 5% level? Answer: The t-statistic should be recalculated using the White-corrected standard error as: t = 0.6 / 0.31 = 1.94. This is less than the critical t-value of 1.98, which means that after correcting for heteroskedasticity, the null hypothesis that the inflation coefficient is zero cannot be rejected. Therefore, inflation is not statistically significant. Notice that because the coefficient estimate of 0.60 was not affected by heteroskedasticity, but the original standard error of 0.28 was too low, the original t-statistic of 2.14 was too high. After using the higher White-corrected standard error of 0.31, the t-statistic fell to 1.94.
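A rough sketch of this workflow in Python, assuming statsmodels is available; the data are simulated so that the error variance grows with x, which is an assumption made purely for illustration:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Simulated data whose error variance grows with x (heteroskedastic by construction)
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.6 * x + rng.normal(scale=0.5 * x)    # error std dev proportional to x
X = sm.add_constant(x)

ols_fit = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, _, _ = het_breuschpagan(ols_fit.resid, X)
print("Breusch-Pagan p-value:", round(lm_pvalue, 4))     # small p-value => heteroskedasticity

# Refit with White-corrected (heteroskedasticity-consistent) standard errors
robust_fit = sm.OLS(y, X).fit(cov_type="HC0")
print("OLS t-stats:   ", ols_fit.tvalues.round(2))
print("robust t-stats:", robust_fit.tvalues.round(2))    # same coefficients, different standard errors
```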

Based on the regression model stated previously, the regression process estimates an equation for a line through a scatter plot of the data that "best" explains the observed values for Y in terms of the observed values for X. This linear equation is often called the line of best fit, or regression line.

The regression line is the line for which the estimates of b0 and b1 are such that the sum of the squared differences (vertical distances) between the Y-values predicted by the regression equation (^Yi = b0 + b1Xi) and the actual Y-values, Yi, is minimized. The sum of the squared vertical distances between the estimated and actual Y-values is referred to as the SUM OF SQUARED ERRORS (SSE). Thus, the regression line is the line that minimizes the SSE. This explains why simple linear regression is frequently referred to as ordinary least squares (OLS) regression, and the values estimated by the regression equation, ^Yi, are called least squares estimates.

Unit Root Testing for Nonstationarity/ Dickey Fuller test

To determine whether a time series is covariance stationary, we can (1) run an AR model and examine autocorrelations, or (2) perform the Dickey Fuller test.

overestimating the regression / Adjusted R2

Unfortunately, R2 by itself may not be a reliable measure of the explanatory power of the multiple regression model. This is because R2 almost always increases as variables are added to the model, even if the marginal contribution of the new variables is not statistically significant. Consequently, a relatively high R2 may reflect the impact of a large set of independent variables rather than how well the set explains the dependent variable. This problem is often referred to as overestimating the regression

EXAMPLE: Test the statistical significance of the independent variable PR in the real earnings growth example at the 10% significance level. The results of that regression are reproduced in the following figure.

We are testing the following hypothesis: H0: PR = 0 versus Ha: PR ≠ 0 The 10% two-tailed critical t-value with 46 - 2 - 1 = 43 degrees of freedom is approximately 1.68. We should reject the null hypothesis if the t-statistic is greater than 1.68 or less than -1.68. t = .25 / .032 = 7.8 Therefore, because the t-statistic of 7.8 is greater than the upper critical t-value of 1.68, we can reject the null hypothesis and conclude that the PR regression coefficient is statistically significantly different from zero at the 10% significance level.

autoregressive conditional heteroskedasticity (ARCH)

When examining a single time series, such as an AR model, autoregressive conditional heteroskedasticity (ARCH) exists if the variance of the residuals in one period is dependent on the variance of the residuals in a previous period. When this condition exists, the standard errors of the regression coefficients in AR models and the hypothesis tests of these coefficients are invalid. ARCH is present when the variance of the error depends on the variance of previous errors. A zero autocorrelation of the error term at all lags suggests that an autoregressive model is a good fit to the data.
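A minimal sketch of the ARCH test described above (regressing squared residuals on their own first lag), assuming statsmodels; the residuals here are simulated white noise, so no ARCH should be detected:

```python
import numpy as np
import statsmodels.api as sm

# Simulated residuals from some time-series model (white noise, so ARCH should not be detected)
rng = np.random.default_rng(2)
resid = rng.normal(size=200)

# Regress squared residuals on their own first lag
sq = resid ** 2
arch_fit = sm.OLS(sq[1:], sm.add_constant(sq[:-1])).fit()

coef, pvalue = arch_fit.params[1], arch_fit.pvalues[1]
print(f"lag-1 coefficient = {coef:.3f}, p-value = {pvalue:.3f}")
# A coefficient significantly different from zero would indicate that ARCH is present
```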

autoregressive model (AR).

When the dependent variable is regressed against one or more lagged values of itself, the resultant model is called an autoregressive model (AR). For example, the sales for a firm could be regressed against the sales for the firm in the previous month. Consider: xt = b0 + b1xt-1 + εt, where: xt = value of time series at time t; b0 = intercept at the vertical axis (y-axis); b1 = slope coefficient; xt-1 = value of time series at time t - 1; εt = error term (or residual term or disturbance term); t = time; t = 1, 2, 3...T. In an autoregressive time series, past values of a variable are used to predict the current (and hence future) value of the variable.

Dummy variables

Used when the independent variable is binary in nature (either "on" or "off"). Dummy variables are often used to quantify the impact of qualitative events and are assigned a value of 0 or 1.

There are three primary assumption violations that you will encounter: (1) heteroskedasticity, (2) serial correlation (i.e., autocorrelation), and (3) multicollinearity.

Without getting into the math, suffice it to say that the coefficient standard error is calculated using the standard error of estimate (SEE), which is the standard deviation of the error term. Any violation of an assumption that affects the error term will ultimately affect the coefficient standard error. Consequently, this will affect the t-statistic and F-statistic and any conclusions drawn from hypothesis tests involving these statistics.

Unit root

if the coefficient on the lag variable is 1, the series is not covariance stationary. If the value of the lag coefficient is equal to one, the time series is said to have a unit root and will follow a random walk process. Since a time series that follows a random walk is not covariance stationary, modeling such a time series in an AR model can lead to incorrect inferences.

p-value

is the smallest level of significance for which the null hypothesis can be rejected. An alternative method of doing hypothesis testing of the coefficients is to compare the p-value to the significance level:
• If the p-value is LESS than the significance level, the null hypothesis CAN be rejected.
• If the p-value is GREATER than the significance level, the null hypothesis CANNOT be rejected.
P-VALUES TELL US EXACTLY THE SAME THING AS THE T-TESTS IN TERMS OF STATISTICAL SIGNIFICANCE OR INSIGNIFICANCE.

SEE

is the standard deviation of the regression error terms and is equal to the square root of the mean squared error (MSE): SEE = sqrt(MSE) = sqrt[SSE / (n - 2)] for a simple linear regression.

root mean squared error criterion (RMSE)

is used to compare the accuracy of autoregressive models in forecasting out-of-sample values. For example, a researcher may have two autoregressive (AR) models: an AR(1) model and an AR(2) model. To determine which model will more accurately forecast future values, we calculate the RMSE (the square root of the average of the squared errors) for the out-of-sample data. Note that the model with the lowest RMSE for in-sample data may not be the model with the lowest RMSE for out-of-sample data. For example, imagine that we have 60 months of historical unemployment data. We estimate both models over the first 36 of 60 months. To determine which model will produce better (i.e., more accurate) forecasts, we then forecast the values for the last 24 of 60 months of historical data. Using the actual values for the last 24 months as well as the values predicted by the models, we can calculate the RMSE for each model. The model with the lower RMSE for the out-of-sample data will have lower forecast error and will be expected to have better predictive power in the future. In addition to examining the RMSE criteria for a model, we will also want to examine the stability of regression coefficients, which we discuss below.
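As a rough sketch, the out-of-sample RMSE comparison might look like this in Python; the actual and forecast values below are hypothetical:

```python
import numpy as np

def rmse(actual, predicted):
    """Square root of the average squared forecast error."""
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)))

# Hypothetical out-of-sample actual values and forecasts from two competing AR models
actual    = [4.1, 4.3, 4.0, 3.8, 4.2]
ar1_preds = [4.0, 4.4, 4.1, 3.9, 4.0]
ar2_preds = [4.3, 4.6, 3.7, 3.5, 4.5]

print("AR(1) out-of-sample RMSE:", round(rmse(actual, ar1_preds), 3))
print("AR(2) out-of-sample RMSE:", round(rmse(actual, ar2_preds), 3))
# The model with the lower out-of-sample RMSE is expected to have the better predictive power
```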

AR(1) Model

is when the independent variable is the dependent variable lagged one period.

Cointegration

means that two time series are economically linked (related to the same macro variables) or follow the same trend and that relationship is not expected to change. If two time series are cointegrated, the error term from regressing one on the other is covariance stationary and the t-tests are reliable. This means that scenario 5 will produce reliable regression estimates, whereas scenario 4 will not. To test whether two time series are cointegrated, we regress one variable on the other using the following model: yt = b0 + b1xt + ε where: yt = value of time series y at time t xt = value of time series x at time t The residuals are tested for a unit root using the Dickey Fuller test with critical t-values calculated by Engle and Granger (i.e., the DF-EG test). If the test rejects the null hypothesis of a unit root, we say the error terms generated by the two time series are covariance stationary and the two series are cointegrated. If the two series are cointegrated, we can use the regression to model their relationship. Professor's Note: For the exam, remember that the Dickey Fuller test does not use the standard critical t-values we typically use in testing the statistical significance of individual regression coefficients. The DF-EG test further adjusts them to test for cointegration. As with the DF test, you do not have to know critical t-values for the DF-EG test. Just remember that like the regular DF test, if the null is rejected, we say the series (of error terms in this case) is covariance stationary and the two time series are cointegrated.

standard error of estimate (SEE)

measures the degree of variability of the actual Y-values relative to the estimated Y-values from a regression equation. The SEE gauges the "fit" of the regression line. The smaller the standard error, the better the fit. The SEE is the standard deviation of the error terms in the regression. As such, SEE is also referred to as the standard error of the residual, or standard error of the regression. There are multiple terms for SEE and you can expect to see any of these on the exam. Standard error, when used in the context of the whole regression (as opposed to for an individual coefficient), also refers to SEE.

Use scatter plots or the Durbin-Watson statistic to detect the presence of serial correlation.

DW = sum of [(Et - Et-1)^2] / sum of [Et^2], where Et = the residual for period t. If the sample size is very large: DW ≈ 2(1 - r), where r = correlation coefficient between residuals from one period and those from the previous period. You can see from the approximation that the Durbin-Watson test statistic is approximately equal to 2 if the error terms are homoskedastic and not serially correlated (r = 0). DW < 2 if the error terms are positively serially correlated (r > 0), and DW > 2 if the error terms are negatively serially correlated (r < 0). But how much below the magic number 2 is statistically significant enough to reject the null hypothesis of no positive serial correlation?
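A minimal Python sketch of the statistic and its behavior; the residual series below are simulated, not drawn from any real regression:

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared changes in the residuals / sum of squared residuals."""
    resid = np.asarray(resid)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(3)
e = rng.normal(size=500)                      # serially uncorrelated residuals
e_pos = e[1:] + 0.7 * e[:-1]                  # residuals with positive serial correlation

print("uncorrelated residuals DW:", round(durbin_watson(e), 2))        # near 2
print("positively correlated DW: ", round(durbin_watson(e_pos), 2))    # well below 2
```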

Assumptions of multiple regression mostly pertain to the error term, εi

• A linear relationship exists between the dependent and independent variables.
• The independent variables are not random, and there is no exact linear relation between any two or more independent variables.
• The expected value of the error term is zero.
• The variance of the error terms is constant.
• The error for one observation is not correlated with that of another observation.
• The error term is normally distributed.

assumptions of a multiple regression model

• A linear relationship exists between the dependent and independent variables. In other words, the model on the first page of this topic review correctly describes the relationship.
• The independent variables are not random, and there is no exact linear relation between any two or more independent variables.
• The expected value of the error term, conditional on the independent variable, is zero.
• The variance of the error terms is constant for all observations.
• The error term for one observation is not correlated with that of another observation [i.e., E(εiεj) = 0, j ≠ i].
• The error term is normally distributed.

Limitations of regression analysis

• Linear relationships can change over time. This means that the estimation equation based on data from a specific time period may not be relevant for forecasts or predictions in another time period. This is referred to as PARAMETER INSTABILITY.
• Even if the regression model accurately reflects the historical relationship between the two variables, its usefulness in investment analysis will be limited if other market participants are also aware of and act on this evidence.
• If the assumptions underlying regression analysis do not hold, the interpretation and tests of hypotheses may not be valid. For example, if the data is heteroskedastic (non-constant variance of the error terms) or exhibits autocorrelation (error terms are not independent), regression results may be invalid. We will discuss these issues in more detail in the next topic review.

dependent variable

•The dependent variable is the variable whose variation is explained by the independent variable. We are interested in answering the question, "What explains fluctuations in the dependent variable?" The dependent variable is also referred to as the EXPLAINED variable, the ENDOGENOUS variable, or the PREDICTED variable. GOES ON Y AXIS

Interpreting the Multiple Regression Results

•The intercept term is the value of the dependent variable when the independent variables are all equal to zero. •Each slope coefficient is the estimated change in the dependent variable for a one-unit change in that independent variable, holding the other independent variables constant. That's why the slope coefficients in a multiple regression are sometimes called partial slope coefficients

Effect of Heteroskedasticity on Regression Analysis

•The standard errors are usually unreliable estimates. •The coefficient estimates ARE NOT affected. •If the standard errors are too small, but the coefficient estimates themselves are not affected, the t-statistics will be too large and the null hypothesis of no statistical significance is rejected too often. The opposite will be true if the standard errors are too large. •The F-test is also unreliable.

testing statistical significance = testing the null hypothesis that the coefficient is zero versus the alternative that it is not

"testing statistical significance" ⇒ H0: Bj = 0 versus Ha : Bj ≠ 0

EX: Hypothesis test for significance of regression coefficients The estimated slope coefficient from the ABC example is 0.64 with a standard error equal to 0.26. Assuming that the sample has 36 observations, determine if the estimated slope coefficient is significantly different than zero at a 5% level of significance.

t = (0.64 - 0) / 0.26 = 2.46. The critical two-tailed t-value with df = 36 - 2 = 34 is ±2.03; because 2.46 > 2.03, the estimated slope coefficient is significantly different from zero.

Steps in simulations

1) Determine the probabilistic variables. 2) Define probability distributions for these variables. 3) Check for correlations among the variables. 4) Run the simulation. (A minimal code sketch of these steps follows.)
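A minimal Monte Carlo sketch of these steps in Python; the revenue and margin distributions are hypothetical (the margin distribution mirrors the discount-retailer example later in these notes), and step 3 is skipped by treating the inputs as independent:

```python
import numpy as np

rng = np.random.default_rng(4)
n_trials = 100_000

# Steps 1-2: probabilistic inputs with assumed (hypothetical) distributions
revenue    = rng.normal(loc=100.0, scale=15.0, size=n_trials)   # $ millions
net_margin = rng.normal(loc=0.03, scale=0.012, size=n_trials)

# Step 4: run the simulation and look at the distribution of outcomes, not a point estimate
earnings = revenue * net_margin
print("expected earnings:", round(earnings.mean(), 2))
print("5th / 95th percentiles:", np.percentile(earnings, [5, 95]).round(2))
```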

Example: Confidence interval for a predicted value Calculate a 95% prediction interval on the predicted value of ABC excess returns from the previous example. Suppose the standard error of the forecast is 3.67, and the forecasted value of S&P 500 excess returns is 10%

1) Find the predicted value by plugging the 10% into the equation: -2.3% + (0.64)(10%) = 4.1%.
2) The 5% two-tailed critical t-value with 34 degrees of freedom is 2.03.
3) Predicted value ± (critical t-value x standard error of the forecast): 4.1% ± (2.03 x 3.67%) = 4.1% ± 7.5%. THE STANDARD ERROR OF THE FORECAST WILL MOST LIKELY BE GIVEN.
4) This range can be interpreted as: given a forecasted value for S&P 500 excess returns of 10%, we can be 95% confident that the ABC excess returns will be between -3.4% and 11.6%.

HOW TO FIND THE CRITICAL F VALUE

1) The left-hand column gives the denominator degrees of freedom, df = n - k - 1. 2) The top row gives the numerator degrees of freedom, which equals the number of independent variables (slope coefficients), k.

Assumptions of the linear regression most of the major assumptions pertain to the regression model's residual term (ε).

1. A linear relationship exists between the dependent and the independent variable.
2. The independent variable is uncorrelated with the residuals.
3. The expected value of the residual term is zero [E(ε) = 0].
4. The variance of the residual term is constant for all observations.
5. The residual term is independently distributed; that is, the residual for one observation is not correlated with that of another observation.
6. The residual term is normally distributed.

There are THREE broad categories of model misspecification, or ways in which the regression model can be specified incorrectly, each with several subcategories:

1. The functional form can be misspecified.
   • Important variables are omitted.
   • Variables should be transformed.
   • Data is improperly pooled.
2. Explanatory variables are correlated with the error term in time series models.
   • A lagged dependent variable is used as an independent variable.
   • A function of the dependent variable is used as an independent variable ("forecasting the past").
   • Independent variables are measured with error.
3. Other time-series misspecifications that result in nonstationarity.

R^2 = COEFFICIENT OF DETERMINATION

= [total Variation (SST) - unexplained variation (SSE)] / total variation

HETEROSKEDASTIC

= non-constant variance of the error terms

20.A time-series model that uses quarterly data exhibits seasonality if the fourth autocorrelation of the error term: A.differs significantly from 0. B.does not differ significantly from 0 C.does not differ significantly from the first autocorrelation of the error term.

A If the fourth autocorrelation of the error term differs significantly from 0, this is an indication of seasonality.

An UNBIASED ESTIMATOR is one for which the expected value of the estimator is equal to the parameter you are trying to estimate. For example, because the expected value of the sample mean is equal to the population mean, the sample mean is an unbiased estimator of the population mean.

A CONSISTENT ESTIMATOR is one for which the accuracy of the parameter estimate increases as the sample size increases. As the sample size increases, the standard error of the sample mean falls, and the sampling distribution bunches more closely around the population mean. In fact, as the sample size approaches infinity, the standard error approaches zero

Misspecification #4: Using a Lagged Dependent Variable as an Independent Variable

A lagged variable in a time series regression is the value of a variable from a prior period. In our example, the dependent variable is portfolio return in month t, so a lagged dependent variable would be the portfolio return in the previous period, month t - 1 (which is denoted as Rt-1). R = d0 + d1B + d2lnM + d3lnPB + d4FF + d5Rt - 1 + ε If the error terms in the regression model (without the lagged dependent variable) are also serially correlated (which is common in time series regressions), then this model misspecification will result in biased and inconsistent regression estimates and unreliable hypothesis tests and return predictions

linear trend

A linear trend is a time series pattern that can be graphed using a straight line. A downward sloping line indicates a negative trend, while an upward-sloping line indicates a positive trend. The simplest form of a linear trend is represented by the following linear trend model: yt = b0 + b1(t) + εt where: yt = the value of the time series (the dependent variable) at time t b0 = intercept at the vertical axis (y-axis) b1 = slope coefficient (or trend coefficient) εt = error term (or residual term or disturbance term) t = time (the independent variable); t = 1, 2, 3...T

time series

A time series is a set of observations for a variable over successive periods of time (e.g., monthly stock market returns for the past ten years). The series has a trend if a consistent pattern can be seen by plotting the data (i.e., the individual observations) on a graph. For example, a seasonal trend in sales data is easily detected by plotting the data and noting the significant jump in sales during the same month(s) each year.

Statistical inferences based on ordinary least squares (OLS) estimates for an AR time series model may be invalid unless the time series being modeled is COVARIANCE STATIONARY.

A time series is covariance stationary if it satisfies the following three conditions:
1. Constant and finite expected value. The expected value of the time series is constant over time. (Later, we will refer to this value as the mean-reverting level.)
2. Constant and finite variance. The time series' volatility around its mean (i.e., the distribution of the individual observations around the mean) does not change over time.
3. Constant and finite covariance between values at any given lag. The covariance of the time series with leading or lagged values of itself is constant.

Correcting Serial Correlation

• Adjust the coefficient standard errors, which is the method recommended in the CFA curriculum, using the Hansen method. The Hansen method also corrects for conditional heteroskedasticity. These adjusted standard errors, which are sometimes called serial correlation consistent standard errors or Hansen-White standard errors, are then used in hypothesis testing of the regression coefficients. Only use the Hansen method if serial correlation is a problem. The White-corrected standard errors are preferred if only heteroskedasticity is a problem. If both conditions are present, use the Hansen method.
• Improve the specification of the model. The best way to do this is to explicitly incorporate the time-series nature of the data (e.g., include a seasonal term). This can be tricky.

adjusted R2 cont.

Adjusted R2 is less than or equal to R2. So while adding a new independent variable to the model will increase R2, it may either increase or decrease the adjusted R2. If the new variable has only a small effect on R2, the value of adjusted R2 may decrease. In addition, adjusted R2 may be less than zero if the R2 is low enough.

Outliers

Computed correlation coefficients, as well as other sample statistics, may be affected by outliers. Outliers represent a few extreme values for sample observations. Relative to the rest of the sample data, the value of an outlier may be extraordinarily large or small. Outliers can result in apparent statistical evidence that a significant relationship exists when, in fact, there is none, or that there is no relationship when, in fact, there is a relationship.

confidence intervals for PREDICTED VALUES

Confidence intervals for the predicted value of a dependent variable are calculated in a manner similar to the confidence interval for the regression coefficients

F statistic decision rule ( ALWAYS 1 TAILED!)

Decision rule: reject H0 if F (test statistic) > Fc (critical value). WHAT IT MEANS: rejection of the null hypothesis at a stated level of significance indicates that at least one of the coefficients is significantly different from zero, which is interpreted to mean that at least one of the independent variables in the regression model makes a significant contribution to the explanation of the dependent variable. Remember to use the F-test on the exam if you are asked to test all of the coefficients simultaneously.

Effect of Multicollinearity on Regression Analysis

Even though multicollinearity does not affect the consistency of slope coefficients, such coefficients themselves tend to be unreliable. Additionally, the standard errors of the slope coefficients are ARTIFICIALLY INFLATED . Hence, there is a greater probability that we will incorrectly conclude that a variable is not statistically significant (i.e., a Type II error). Multicollinearity is likely to be present to some extent in most economic models. The issue is whether the multicollinearity has a significant effect on the regression results.

Dummy variable example

For example, suppose we estimate the quarterly EPS regression model with ten years of data (40 quarterly observations) and find that b0 = 1.25, b1 = 0.75, b2 = -0.20, and b3 = 0.10: EPS = 1.25 + .75Q1 - .2Q2 + .1Q3 average fourth quarter EPS = 1.25 average first quarter EPS = 1.25 + 0.75 = 2.00 average second quarter EPS = 1.25 - 0.20 = 1.05 average third quarter EPS = 1.25 + 0.10 = 1.35 The intercept term, b0, represents the average value of EPS for the fourth quarter. The slope coefficient on each dummy variable estimates the difference in earnings per share (on average) between the respective quarter (i.e., quarter 1, 2, or 3) and the omitted quarter (the fourth quarter in this case). THINK OF THE OMITTED CLASS AS THE REFERENCE POINT.
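A minimal Python sketch of the arithmetic above; the coefficient values are taken from the example, and the helper function is purely illustrative:

```python
# Fitted coefficients from the quarterly EPS model above; the fourth quarter is the omitted reference
b0, b1, b2, b3 = 1.25, 0.75, -0.20, 0.10

def average_eps(q1, q2, q3):
    """EPS = b0 + b1*Q1 + b2*Q2 + b3*Q3, with all dummies set to zero for the fourth quarter."""
    return b0 + b1 * q1 + b2 * q2 + b3 * q3

for label, dummies in [("Q1", (1, 0, 0)), ("Q2", (0, 1, 0)), ("Q3", (0, 0, 1)), ("Q4", (0, 0, 0))]:
    print(label, "average EPS:", average_eps(*dummies))
```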

There are three approaches to specifying a distribution

• Historical data: Examination of past data may point to a distribution that is suitable for the probabilistic variable. This method assumes that the future values of the variable will be similar to its past.
• Cross-sectional data: When past data is unavailable (or unreliable), we may estimate the distribution of the variable based on the values of the variable for peers. For example, we can estimate the distribution of operating margin of a new natural-gas-fired power plant based on the known distribution of margins for other similar-sized natural gas plants.
• Pick a distribution and estimate the parameters: An advantage of the above two methods is that we not only get a good idea of the appropriate distribution, but we can also estimate the relevant parameters (e.g., the mean and standard deviation of a normally-distributed variable). When neither historical nor cross-sectional data provide adequate insight, subjective specification of a distribution along with related parameters is the appropriate approach. For example, we might specify (based on insights into the industry) that the net margin for a discount retailer has a normal distribution with a mean of 3% and standard deviation of 1.2%.

Regression Coefficient (either slope or intercept) Confidence Interval

Hypothesis testing for a regression coefficient may use the confidence interval for the coefficient being tested. For instance, a frequently asked question is whether an estimated slope coefficient is statistically different from zero. In other words, the null hypothesis is H0: b1 = 0 and the alternative hypothesis is Ha: b1 ≠ 0. If the confidence interval at the desired level of significance does not include zero, the null is rejected, and the coefficient is said to be statistically different from zero.

Random walk

If a time series follows a random walk process, the predicted value of the series (i.e., the value of the dependent variable) in one period is equal to the value of the series in the previous period plus a random error term. A time series that follows a simple random walk process is described in equation form as xt = xt-1 + εt, where the best forecast of xt is xt-1 and:
1. E(εt) = 0: The expected value of each error term is zero.
2. E(εt^2) = σ^2: The variance of the error terms is constant.
3. E(εi εj) = 0 if i ≠ j: There is no serial correlation in the error terms.

Random Walk with a Drift

If a time series follows a random walk with a drift, the intercept term is not equal to zero. That is, in addition to a random error term, the time series is expected to increase or decrease by a constant amount each period

Using ARCH Models

If a time-series model has been determined to contain ARCH errors, regression procedures that correct for heteroskedasticity, such as generalized least squares, must be used in order to develop a predictive model. Otherwise, the standard errors of the model's coefficients will be incorrect, leading to invalid conclusions

Use the DW statistic to determine if there is serial correlation: using the DW table, see whether the calculated statistic falls below the lower bound, between the bounds, or above the upper bound.

If below the lower bound => reject the null and conclude there IS serial correlation.
If between the bounds => inconclusive.
If above the upper bound => fail to reject; no evidence of serial correlation.

First Differencing

If we believe a time series is a random walk (i.e., has a unit root), we can transform the data to a covariance stationary time series using a procedure called first differencing. The first differencing process involves subtracting the value of the time series (i.e., the dependent variable) in the immediately preceding period from the current value of the time series to define a new dependent variable, y. Note that by taking first differences, you model the change in the value of the dependent variable
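A minimal NumPy sketch (a simulated random walk with drift, purely illustrative) of what first differencing does:

```python
import numpy as np

# Simulated random walk with drift (a unit-root series): x_t = 0.2 + x_{t-1} + e_t
rng = np.random.default_rng(5)
x = np.cumsum(0.2 + rng.normal(size=1_000))

# First difference: y_t = x_t - x_{t-1}; we now model the change in the series, not its level
y = np.diff(x)

print("level means (1st vs 2nd half):", round(x[:500].mean(), 1), round(x[500:].mean(), 1))
print("diff. means (1st vs 2nd half):", round(y[:500].mean(), 2), round(y[500:].mean(), 2))
# The level's mean keeps shifting over time (not covariance stationary); the differenced mean does not
```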

MSE = SSE / (n - 2)

MSR = RSS / k, where k = the number of slope parameters estimated and n = the number of observations. In general, the regression df = k and the error df = (n - k - 1). Because we are limited to simple linear regressions in this topic review (one independent variable), we use k = 1 for the regression df and n - 1 - 1 = n - 2 for the error df.

Covariance Stationarity

Neither a random walk nor a random walk with a drift exhibits covariance stationarity. In either case (with or without a drift), the mean-reverting level is b0 / (1 - b1) = b0 / 0 (division by zero is undefined), and as we stated earlier, a time series must have a finite mean-reverting level to be covariance stationary. Thus, a random walk, WITH OR WITHOUT a drift, is NOT covariance stationary, and exhibits what is known as a unit root (b1 = 1). For a time series that is not covariance stationary, the least squares regression procedure that we have been using to estimate an AR(1) model will not work without transforming the data.

Serial correlation, also known as autocorrelation, refers to the situation in which the residual terms are correlated with one another. Serial correlation is a relatively common problem with time series data.

Positive serial correlation exists when a POSITIVE regression error in one time period increases the probability of observing a POSITIVE regression error for the next time period. •Negative serial correlation occurs when a POSITIVE error in one period increases the probability of observing a NEGATIVE error in the next period.

EX: An analyst runs a regression of monthly value-stock returns on five independent variables over 60 months. The total sum of squares for the regression is 460, and the sum of squared errors is 170

R2 = (460 - 170) / 460 = 63%. Adjusted R2 = 1 - [(60 - 1) / (60 - 5 - 1)] x (1 - 0.63) = 59.6%. The R2 of 63% suggests that the five independent variables together explain 63% of the variation in monthly value-stock returns. Suppose the analyst now adds four more independent variables to the regression, and the R2 increases to 65.0%. Identify which model the analyst would most likely prefer: with nine independent variables, even though the R2 has increased from 63% to 65%, the adjusted R2 has decreased from 59.6% to 58.7%. The analyst would prefer the first model because the adjusted R2 is higher and the model has five independent variables as opposed to nine.
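A minimal Python sketch of the same arithmetic; SST, SSE, n, and k are taken from the example, and the helper functions are purely illustrative:

```python
def r_squared(sst, sse):
    """R^2 = (total variation - unexplained variation) / total variation."""
    return (sst - sse) / sst

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - [(n - 1) / (n - k - 1)] * (1 - R^2)."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r2)

SST, SSE, n = 460.0, 170.0, 60                                   # figures from the example above
r2 = r_squared(SST, SSE)
print("R^2:", round(r2, 3))                                      # ~0.630
print("adjusted R^2 with k = 5:", round(adjusted_r_squared(r2, n, 5), 3))    # ~0.596
print("adjusted R^2 with k = 9:", round(adjusted_r_squared(0.65, n, 9), 3))  # ~0.587
```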

total variation = explained variation + unexplained variation, or:

SST = RSS + SSE

Example: Using a linear trend model

Suppose you are given a linear trend model with b0 = 1.70 and b1 = 3.0. Calculate Yt for t = 1 and t = 2. Answer: When t = 1, Y1 = 1.7 + 3.0(1) = 4.7. When t = 2, Y2 = 1.7 + 3.0(2) = 7.7. Note that the difference between Y1 and Y2 is 3.0, the value of the trend coefficient b1.

DW Decision rules

Suppose you have a regression output which includes three independent variables that provide you with a DW statistic of 1.23. Also suppose that the sample size is 40. At a 5% significance level, determine if the error terms are serially correlated. dl = lower bound du = upper bound Answer: From a 5% DW table with n = 40 and k = 3, the upper and lower critical DW values are found to be dl = 1.34 and du = 1.66, respectively. Since DW < dl (i.e., 1.23 < 1.34), you should reject the null hypothesis and conclude that the regression has positive serial correlation among the error terms

Calculating the confidence interval for a COEFFICIENT example: The estimated SLOPE coefficient, b1, from the ABC regression is 0.64 with a standard error equal to 0.26. Assuming that the sample had 36 observations, calculate the 95% confidence interval for b1.

The critical two-tail t-values are ±2.03 (from the t-table with n - 2 = 34 degrees of freedom). We can compute the 95% confidence interval as: 0.64 ± (2.03)(0.26) = 0.11 to 1.17. In general: coefficient ± (critical t-value)(standard error of the COEFFICIENT). Because this confidence interval does not include zero, we can conclude that the SLOPE coefficient is significantly different from zero. YOU WON'T HAVE TO CALCULATE THE STANDARD ERROR OF THE REGRESSION COEFFICIENT.
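A minimal sketch reproducing this interval, pulling the critical t-value from scipy instead of a table:

```python
from scipy import stats

b1, se, n = 0.64, 0.26, 36
t_crit = stats.t.ppf(0.975, df=n - 2)      # about 2.03 with 34 degrees of freedom
lower, upper = b1 - t_crit * se, b1 + t_crit * se
print(round(lower, 2), round(upper, 2))    # about 0.11 to 1.17; zero is not in the interval
```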

Effects of model misspecification on the regression results

The effects of the model misspecification on the regression results are basically the same for all of the misspecifications: regression coefficients are biased and inconsistent, which means we can't have any confidence in our hypothesis tests of the coefficients or in the predictions of the model.

Example: Hypothesis test for significance of regression coefficients

The estimated slope coefficient from the ABC example is 0.64 with a standard error equal to 0.26. Assuming that the sample has 36 observations, determine whether the estimated slope coefficient is significantly different from ZERO at a 5% level of significance. t = (0.64 - 0) / 0.26 = 2.46 (we subtract 0 because we are testing whether the coefficient differs from zero; if we were testing whether it differs from 1, we would subtract 1). The critical two-tailed t-values are ±2.03 (from the t-table with df = 36 - 2 = 34). Because t > tcritical (i.e., 2.46 > 2.03), we reject the null hypothesis and conclude that the slope is different from zero. Note that the t-test and the confidence interval lead to the same conclusion: reject the null hypothesis and conclude that the slope coefficient is statistically significant.
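The same numbers expressed as a t-test (a sketch only, using the same scipy critical value as the confidence-interval example):

```python
from scipy import stats

b1_hat, b1_hypothesized, se, n = 0.64, 0.0, 0.26, 36
t_stat = (b1_hat - b1_hypothesized) / se       # about 2.46
t_crit = stats.t.ppf(0.975, df=n - 2)          # about 2.03
print(round(t_stat, 2), abs(t_stat) > t_crit)  # True -> reject H0: b1 = 0
```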

Detecting Multicollinearity

The most common way to detect multicollinearity is to look for the situation in which t-tests indicate that none of the individual coefficients is significantly different from zero while the F-test is statistically significant and the R2 is high. This suggests that the variables together explain much of the variation in the dependent variable, but the individual independent variables do not. The only way this can happen is when the independent variables are highly correlated with each other, so while their common source of variation is explaining the dependent variable, the high degree of correlation also "washes out" the individual effects.
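As a rough supplementary screen (our own suggestion, not part of these notes), you can inspect pairwise correlations among the independent variables; correlations near ±1 are consistent with the t-test/F-test pattern described above:

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=60)
x2 = x1 + rng.normal(scale=0.05, size=60)   # nearly a linear combination of x1
x3 = rng.normal(size=60)

X = np.column_stack([x1, x2, x3])
# Correlation matrix of the independent variables; x1 and x2 correlate near 1.0
print(np.round(np.corrcoef(X, rowvar=False), 2))
```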

Misspecification #5: Forecasting the Past

The proper specification of the model is to measure the dependent variable as returns during a particular month (say July 1996), and the independent variable ln(M) as the natural log of market capitalization at the beginning of July. Remember that market cap is equal to shares outstanding times price per share. If we measure market cap at the end of July and use it in our regression, we're naturally going to conclude that stocks with higher market cap at the end of July had higher returns during July. In other words, our model is misspecified because it is forecasting the past: we're using variables measured at the end of July to predict a variable measured during July.

simple linear regression

The purpose of simple linear regression is to explain the variation in a DEPENDENT variable in terms of the variation in a single INDEPENDENT variable. Here, the term "variation" is interpreted as the degree to which a variable differs from its mean value. Don't confuse variation with variance; they are related but are not the same.

Example: Hypothesis testing with dummy variables

The estimated coefficient b1 from the EPS regression model is 0.75 with a standard error equal to 0.15. Test whether first quarter EPS is equal to fourth quarter EPS at the 5% significance level. Answer: We are testing the following hypothesis: H0: b1 = 0 vs. HA: b1 ≠ 0. The t-statistic is 0.75 / 0.15 = 5.0, and the two-tail 5% critical value with 36 degrees of freedom is approximately 2.03. Therefore, we should reject the null and conclude that first quarter EPS is statistically significantly different from fourth quarter EPS at the 5% significance level.

SUM OF SQUARED ERRORS (SSE).

The sum of the squared vertical distances between the estimated and actual Y-values is referred to as the SUM OF SQUARED ERRORS (SSE).

20.Carla Preusser finds that the total assets under management by a popular hedge fund manager, and the number of lizards lying out in the sun in a nearby park, can be modeled as functions of time: f(t) = t1.8 and f(t) = t + 5, respectively. The correlation between the two models is 0.98. Two potential problems with using the lizards to predict total assets include: A.spurious correlation and the non-linear relationship in the total assets function. B.spurious correlation and the non-geometric relationship in the lizard function. C.outliers and non-linear relationship in the total assets function.

A. There is little to no chance that the relationship between total assets under management and lizards in a park is anything other than a coincidence; the correlation is spurious. The non-linear relationship in the total assets function also makes correlation a poor choice of measure.

The more common way to detect conditional heteroskedasticity is the Breusch-Pagan test, which calls for the regression of the squared residuals on the independent variables. If conditional heteroskedasticity is present, the independent variables will significantly contribute to the explanation of the squared residuals

This is a one-tailed test because heteroskedasticity is only a problem if the R2 and the BP test statistic are too large. EXAMPLE: With five years of monthly observations, n is equal to 60. The test statistic is: n × R2 = 60 × 0.08 = 4.8. The one-tailed critical value for a chi-square distribution with one degree of freedom and α equal to 5% is 3.841. Therefore, you should reject the null hypothesis and conclude that you have a problem with conditional heteroskedasticity. The Breusch-Pagan test is statistically significant at any reasonable level of significance, which indicates heteroskedasticity.
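A sketch of the decision step only, assuming the squared-residual regression has already been run and produced R2 = 0.08 with one independent variable (as in the example):

```python
from scipy import stats

n, r2_resid, k = 60, 0.08, 1
bp_stat = n * r2_resid                    # 4.8
chi2_crit = stats.chi2.ppf(0.95, df=k)    # about 3.841 with one degree of freedom
print(bp_stat, bp_stat > chi2_crit)       # True -> evidence of conditional heteroskedasticity
```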

Log-Linear Trend Models

Time series data, particularly financial time series, often display exponential growth (growth with continuous compounding). Positive exponential growth means that the random variable (i.e., the time series) tends to increase at some constant rate of growth. If we plot the data, the observations will form a convex curve. Negative exponential growth means that the data tends to decrease at some constant rate of decay, and the plotted time series will be a concave curve.

How to determine if a linear or log-linear trend model should be used:

To determine if a linear or log-linear trend model should be used, the analyst should plot the data. A linear trend model may be appropriate if the data points appear to be equally distributed above and below the regression line. Inflation rate data can often be modeled with a linear trend model. If, on the other hand, the data plots with a non-linear (curved) shape, then the residuals from a linear trend model will be persistently positive or negative for a period of time. In this case, the log-linear model may be more suitable. In other words, when the residuals from a linear trend model are serially correlated, a log-linear trend model may be more appropriate. By taking the log of the y variable, a regression line can better fit the data. Financial data (e.g., stock indices and stock prices) and company sales data are often best modeled with log-linear models. THE BOTTOM LINE IS THAT WHEN A VARIABLE GROWS AT A CONSTANT RATE, A LOG-LINEAR MODEL IS MOST APPROPRIATE. WHEN THE VARIABLE INCREASES OVER TIME BY A CONSTANT AMOUNT, A LINEAR TREND MODEL IS MOST APPROPRIATE.
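A hedged sketch of the decision described above, fitting both a linear and a log-linear trend with NumPy's polyfit; the simulated constant-growth-rate series is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(1, 101)
# Series that grows at a roughly constant RATE (exponential), with small noise
y = 50 * np.exp(0.03 * t) * np.exp(rng.normal(scale=0.02, size=t.size))

# Linear trend: y_t = b0 + b1 * t
b1_lin, b0_lin = np.polyfit(t, y, 1)
resid_lin = y - (b0_lin + b1_lin * t)

# Log-linear trend: ln(y_t) = b0 + b1 * t
b1_log, b0_log = np.polyfit(t, np.log(y), 1)
resid_log = np.log(y) - (b0_log + b1_log * t)

# Persistent runs of same-signed residuals are the warning sign for the linear fit
print(np.mean(np.sign(resid_lin[1:]) == np.sign(resid_lin[:-1])))  # close to 1 -> persistent
print(np.mean(np.sign(resid_log[1:]) == np.sign(resid_log[:-1])))  # closer to 0.5 -> random
```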

Recall that one of the assumptions of multiple regression is that the variance of the residuals is constant across observations. Heteroskedasticity occurs when the variance of the residuals is not the same across all observations in the sample. THIS HAPPENS WHEN there are subsamples that are more spread out than the rest of the sample.

UNCONDITIONAL HETEROSKEDASTICITY occurs when the heteroskedasticity is not related to the level of the independent variables, which means that it doesn't systematically increase or decrease with changes in the value of the independent variable(s). While this is a violation of the equal variance assumption, it usually causes NO MAJOR PROBLEMS with the regression. CONDITIONAL HETEROSKEDASTICITY is heteroskedasticity that is related to the level of (i.e., conditional on) the independent variables. For example, conditional heteroskedasticity exists if the variance of the residual term increases as the value of the independent variable increases, as shown in Figure 4. Notice in this figure that the residual variance associated with the larger values of the independent variable, X, is larger than the residual variance associated with the smaller values of X. Conditional heteroskedasticity does create significant problems for statistical inference.

A time series is covariance stationary if its mean, variance, and covariances with lagged and leading values do not change over time. Covariance stationarity is a requirement for using AR models

When working with two time series in a regression: (1) if neither time series has a unit root, then the regression can be used; (2) if only one series has a unit root, the regression results will be invalid; (3) if both time series have a unit root and are cointegrated, then the regression can be used; (4) if both time series have a unit root but are not cointegrated, the regression results will be invalid.
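The notes describe these decisions conceptually; purely as an illustration (our choice of tooling, not something the curriculum requires), here is a sketch using unit-root and cointegration tests from statsmodels on simulated data:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(3)
x = np.cumsum(rng.normal(size=200))    # a random walk: has a unit root
y = 2.0 * x + rng.normal(size=200)     # shares x's stochastic trend -> cointegrated with x

adf_p_x = adfuller(x)[1]               # high p-value: cannot reject a unit root in x
adf_p_y = adfuller(y)[1]               # high p-value: cannot reject a unit root in y
coint_p = coint(y, x)[1]               # low p-value: reject "no cointegration"

both_unit_root = adf_p_x > 0.05 and adf_p_y > 0.05
print(both_unit_root, coint_p < 0.05)  # if both True, a regression of y on x can be used
```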

A t-test may also be used to test the hypothesis that the true slope coefficient, b1, is equal to some hypothesized value. Letting ^b1 be the point estimate for b1, the appropriate test statistic with n - 2 degrees of freedom is:

t = (^b1 - b1) / standard error of ^b1. Reject H0 if t > +tcritical or t < -tcritical.

15.Which of the following will always have a finite mean-reverting level? A.A covariance-stationary time series. B.A random-walk-with-drift time series. C.A time series with unit root.

A. All random-walk time series have a unit root, and time series with a unit root do not have a finite mean-reverting level.

Probabilistic variables

are the uncertain input variables that influence the value of an investment. While there is no limit to the number of uncertain input variables, in practice some variables are either predictable (and hence can be derived/estimated) or have an insignificant influence on the value of the investment (and hence can be assumed to be constant).

Predicted values

are values of the dependent variable based on the estimated regression coefficients and a prediction about the value of the independent variable. They are the values that are predicted by the regression equation, given an estimate of the independent variable.

F-test

assesses how well a set of independent variables, as a group, explains the variation in the dependent variable. In multiple regression, the F-statistic is used to test whether at least one independent variable in a set of independent variables explains a significant portion of the variation of the dependent variable. F = MSR / MSE = (RSS / k) / [SSE / (n - k - 1)]. In multiple regression, the F-statistic tests all independent variables as a group. ALWAYS ONE-TAILED. The bottom line is that the F-test is not as useful when we only have one independent variable because it tells us the same thing as the t-test of the slope coefficient. Make sure you know that fact for the exam, and then concentrate on the application of the F-test in multiple regression.
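Using the earlier value-stock example (SST = 460, SSE = 170, n = 60, k = 5) to show the mechanics; the critical value comes from scipy's F distribution:

```python
from scipy import stats

sst, sse, n, k = 460, 170, 60, 5
rss = sst - sse
msr = rss / k                        # mean square regression = RSS / k
mse = sse / (n - k - 1)              # mean square error = SSE / (n - k - 1)
f_stat = msr / mse
f_crit = stats.f.ppf(0.95, dfn=k, dfd=n - k - 1)   # one-tailed critical value
print(round(f_stat, 1), f_stat > f_crit)            # True -> reject H0 that all slopes are zero
```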

Confidence Intervals for a Regression Coefficient The confidence interval for a regression coefficient in multiple regression is calculated and interpreted the same way as it is in simple linear regression.

estimated regression coefficient +/- (critical t-value)(coefficient standard error) Constructing a confidence interval and conducting a t-test with a null hypothesis of "equal to zero" will always result in the same conclusion regarding the statistical significance of the regression coefficient.

slope coefficient (B1)

for the regression line describes the change in Y for a one-unit change in X. It can be positive, negative, or zero, depending on the relationship between the regression variables. For example, an estimated slope coefficient of 2 would indicate that the dependent variable will change two units for every 1-unit change in the independent variable. The slope coefficient of 0.64 from the ABC regression can be interpreted to mean that when excess S&P 500 returns increase (decrease) by 1%, ABC excess returns increase (decrease) by 0.64%. The slope coefficient in a regression like this is called the stock's beta, and it measures the relative amount of systematic risk in ABC's returns. Notice that ABC is less risky than average because its returns tend to increase or decrease by less than the change in the market returns. A stock with a beta of one would have an average level of systematic risk, and a stock with a beta greater than one would have more than average systematic risk.

Analysis of variance (ANOVA)

is a statistical procedure for analyzing the total variability of the dependent variable

coefficient of determination (R2)

is defined as the percentage of the total variation in the dependent variable explained by the independent variable. For example, an R2 of 0.63 indicates that the variation of the independent variable explains 63% of the variation in the dependent variable. For simple linear regression (i.e., one independent variable), the coefficient of determination, R2, may be computed by simply squaring the correlation coefficient, r. In other words, R2 = r2 for a regression with one independent variable. This approach is not appropriate when more than one independent variable is used in the regression, as is the case with the multiple regression techniques presented in the next topic review

the RESIDUAL TERM

is the difference between the model predicted outcome of Y and what the outcome of Y actually was EX: the model said ABC should be -7.3% but it actually was 1.1% leaving the residuals as 8.4%

Intercept term (Bo)

is the line's intersection with the Y-axis at X = 0. It can be positive, negative, or zero. A property of the least squares method is that the intercept term may be expressed as: Bo = Mean of Y - (B1 x mean of X) the intercept is an estimate of the dependent variable when the independent variable takes on a value of zero. The intercept term of -2.3% can be interpreted to mean that when the excess return on the S&P 500 is zero, the return on ABC stock is -2.3%. The intercept term in this regression is called the stock's ex-post alpha. It is a measure of excess risk-adjusted returns. A negative ex-post alpha means that ABC underperformed the S&P 500 on a risk-adjusted basis over the time period.
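A sketch of the least squares property stated above (b0 = mean of Y minus b1 times mean of X), with made-up data whose true slope and intercept mimic the ABC example:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=50)                                     # e.g., excess S&P 500 returns
y = -0.023 + 0.64 * x + rng.normal(scale=0.02, size=50)     # e.g., ABC excess returns

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope = Cov(X, Y) / Var(X)
b0 = y.mean() - b1 * x.mean()                        # intercept = mean(Y) - b1 * mean(X)
print(round(b1, 2), round(b0, 3))                    # close to 0.64 and -0.023
```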

Regression model specification

is the selection of the explanatory (independent) variables to be included in the regression and the transformations, if any, of those explanatory variables.

Total sum of squares (SST)

measures the TOTAL variation in the dependent variable. SST is equal to the sum of the squared differences between the actual Y-values and the MEAN of Y: NOT the same as variance. Variance of the dependent variable = SST / (n -1)

Sum of squared errors (SSE)

measures the UNEXPLAINED variation in the dependent variable. It's also known as the sum of squared residuals or the residual sum of squares. SSE is the sum of the squared vertical distances between the actual Y-values and the predicted Y-values on the regression line. You don't have to memorize the formulas for the sums of squares. You do need to know what they measure and how you use them to construct an ANOVA table.

Regression sum of squares (RSS)

measures the variation in the dependent variable that is EXPLAINED by the independent variable. RSS is the sum of the squared distances between the predicted Y-values and the MEAN of Y
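A sketch tying the three sums of squares together (SST = RSS + SSE) on simulated data, using the same least squares slope/intercept relationship shown earlier:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=60)
y = 1.0 + 2.0 * x + rng.normal(size=60)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total variation
rss = np.sum((y_hat - y.mean()) ** 2)  # explained variation
sse = np.sum((y - y_hat) ** 2)         # unexplained variation
print(np.isclose(sst, rss + sse))      # True: SST = RSS + SSE
print(rss / sst)                       # R2, the share of variation explained
```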

Multicollinearity

refers to the condition when two or more of the independent variables, or linear combinations of the independent variables, in a multiple regression are highly correlated with each other. This condition distorts the standard error of estimate and the coefficient standard errors, leading to problems when conducting t-tests for statistical significance of parameters.

the T-tests test the individual coefficients to determine statistical significance

the F-Test determines the statistical significance of the variables AS A WHOLE.

trend model

the independent variable is time

independent variable

•The independent variable is the variable used to explain the variation of the dependent variable. The independent variable is also referred to as the EXPLANATORY variable, the EXOGENOUS variable, or the PREDICTING variable. GOES ON X AXIS

