MGSC372 Multiple Choice (PowerPoints 8-16)

seasonally adjusted data

It contains the time series components T, C, and I

the fundamental idea of the ANOVA concept

the fundamental idea is that if the null hypothesis is true (i.e. µ₁ = µ₂ = ... = µₚ), then the p populations are statistically identical. Consequently, whichever way we calculate the variance (between populations or within populations), we should get similar results (i.e. F = MSTR/MSE should be close to 1)

testing for residual correlation: if the residual for year t is positive...

there is a tendency for the residual for year t+1 to also be positive

identification of potential models

at the identification stage, you do not need to worry about the sign of the ACF or PACF, or about the speed at which an exponentially declining ACF or PACF approaches 0; these depend on the signs and actual values of the AR and MA coefficients. In some instances, the exponentially declining ACF alternates between positive and negative values. ACF and PACF plots from real data are never as clean as the idealized plots drawn in textbooks

residual term in an estimated multiple regression model

eᵢ = yᵢ − ŷᵢ; eᵢ = yᵢ − (β̂₀ + β̂₁x₁ + ... + β̂ₖxₖ)

Bartlett's test

follows a Chi-square distribution with p-1 degrees of freedom

detecting unequal variances

for each treatment, construct a box plot or dot plot for y and look for differences in variability. In this situation, we can test the hypothesis H₀: σ₁² = σ₂² = ... = σₚ² vs. H₁: at least one σⱼ² differs, using one of the 3 tests for the homogeneity of variance

tests for main effects are only relevant when...

no interaction exists between factors. for this reason, the test for interaction is generally performed first, and if it is present, a test on the main effects is not performed

why differencing?

non-stationary series have an autocorrelation function that declines slowly rather than dropping quickly (exponentially) to zero; you must difference such a series until it is stationary before you can identify the process

specific seasonal relative

(Y/4QCMA)*100

4 points to remember regarding the definition of the Durbin-Watson statistic (d)

1) 0 ≤ d ≤ 4 2) if residuals are uncorrelated, d≈2 3) if residuals are positively correlated, d<2 (if very highly positive, d≈0) 4) if residuals are negatively correlated, d>2 (if very highly negative, d≈4)
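
A minimal sketch of this check in Python, using statsmodels' durbin_watson on made-up residuals:

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

# hypothetical residuals from some fitted regression (uncorrelated by construction)
rng = np.random.default_rng(0)
resid = rng.normal(size=100)

d = durbin_watson(resid)
print(f"d = {d:.2f}")   # expect d near 2 here
# d < 2 would suggest positive autocorrelation; d > 2, negative autocorrelation
```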

2 methods for variable selection in regression analysis

1) Akaike's information criterion 2) Bayesian information criterion (both included on formula sheet) (note, for both formulas, r is the total number of parameters, including the constant term) the lower the value, the better the model!
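
As a sketch with fabricated data, statsmodels reports both criteria for a fitted OLS model, so candidate variable subsets can be compared directly:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=(2, 50))
y = 2 + 3 * x1 + rng.normal(size=50)        # x2 is irrelevant by construction

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
reduced = sm.OLS(y, sm.add_constant(x1)).fit()

# lower is better; BIC penalizes the extra parameter more heavily than AIC
print(f"full:    AIC = {full.aic:.1f}, BIC = {full.bic:.1f}")
print(f"reduced: AIC = {reduced.aic:.1f}, BIC = {reduced.bic:.1f}")
```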

2 guidelines for interpreting VIFs

1) Any VIF > 10 suggests severe multicollinearity. Start to suspect problems when VIF > 5 2) If all VIFs are less than 1/(1-R²) where R² is the coefficient of determination in the model with all independent variables present, then multicollinearity is not strong enough to affect the coefficient estimates (i.e. variables are more strongly correlated to Y than to each other)
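
A short sketch of guideline 1 with made-up, deliberately collinear data; variance_inflation_factor is the statsmodels helper:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + 0.1 * rng.normal(size=100)        # nearly collinear with x1
X = sm.add_constant(np.column_stack([x1, x2]))

for i in (1, 2):                            # column 0 is the constant
    print(f"VIF(x{i}) = {variance_inflation_factor(X, i):.1f}")   # expect >> 10
```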

5 different degrees of freedom in two way ANOVA with replication

1) Factor A: n₁-1 2) Factor B: n₂-1 3) Interaction: (n₁-1)(n₂-1) 4) Error: n₁n₂(r-1) 5) Total: n₁n₂r-1 where r = # of replications

3 tests for the homogeneity of variance

1) Hartley test 2) Bartlett's test 3) Modified Levene's test
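
A quick sketch on fabricated samples: scipy provides Bartlett's and Levene's tests (center="median" gives the modified, median-based form), and Hartley's statistic is simple enough to compute by hand:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
g1 = rng.normal(0, 1.0, size=20)
g2 = rng.normal(0, 1.0, size=20)
g3 = rng.normal(0, 2.5, size=20)                 # inflated variance

print(stats.bartlett(g1, g2, g3))                # assumes normal populations
print(stats.levene(g1, g2, g3, center="median")) # modified Levene's test

variances = [g.var(ddof=1) for g in (g1, g2, g3)]
print("Hartley F_max =", max(variances) / min(variances))  # reject H0 if large
```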

leverages are a measure of the distance between the _____1_____ for _____2_____ and the _____3_____ for _____4_____.

1) x values 2) an observation 3) mean of x values 4) all observations

4 steps showing how to isolate the cyclical and random components

1) Y/T = CSI 2) Y/TS = CI (deseasonalized % of trend) 3) a 3 period moving average (3MA) is taken; the averaging process reduces irregularity, leaving just C 4) CI/C = I

9 pieces of ANOVA notation: Yij, Tj, nj, Ybarj, n, p, Y double bar, µj, σ

1) Yij = the ith observation from the jth treatment (population) 2) Tj = the total of the jth sample 3) nj = the size of the jth sample 4) Y-barj = Tj/nj = the mean of the jth sample 5) n = ∑nj = the total # of observations 6) p = # of treatments (populations) 7) Y double bar = the overall (grand) mean of all the data combined 8) µj = the mean for the jth treatment (population) 9) σ = √MSE = the common standard deviation for all treatments (populations)

5 time series components

1) Yt = data at time t 2) Tt = trend 3) Ct = cyclical component 4) St = seasonal component 5) It (or Rt) = irregular (or random) component

3 assumptions of ANOVA

1) all populations are normally distributed 2) all populations have the same variance 3) independent random samples from each population

4 steps to calculating seasonal indices

1) arrange the specific seasonal relatives [(Y/4QCMA)*100] according to their respective seasons 2) find the median for each season 3) determine the "adjustment factor" = (# of seasons*100)/sum of medians 4) obtain seasonal indices by multiplying each median by the adjustment factor (the seasonal indices should add to 400 if using quarterly data, 1200 if monthly data)
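
A pandas sketch of the four steps on a made-up quarterly series; the shift(-2) centers the right-labeled rolling averages on an actual quarter:

```python
import numpy as np
import pandas as pd

idx = pd.period_range("2018Q1", periods=24, freq="Q")
y = pd.Series(100 + np.arange(24) + [10, -5, -10, 5] * 6, index=idx, dtype=float)

ma4 = y.rolling(4).mean()                    # 4-quarter moving average
cma = ma4.rolling(2).mean().shift(-2)        # center it on an actual quarter (4QCMA)

ssr = y / cma * 100                          # step 1: specific seasonal relatives
medians = ssr.groupby(ssr.index.quarter).median()   # step 2: median per season
adjust = 4 * 100 / medians.sum()             # step 3: adjustment factor
indices = medians * adjust                   # step 4: seasonal indices
print(indices, indices.sum())                # indices sum to 400 for quarterly data
```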

Cook's D: if the calculated percentile value of the F distribution with (k+1, n-[k+1]) degrees of freedom is... (3 levels)

1) between 0 and .3: not influential 2) between .3 and .5: mildly influential 3) greater than .5: influential

3 steps to detecting non-normal populations and resolving them

1) for each treatment, construct a histogram, normal probability plot, or other graphical display that detects skewness 2) conduct formal tests for normality (e.g. Anderson-Darling) for which the H₀ is that the probability distribution is normal (note: the normality assumption will often not be exactly satisfied, so these tests are of limited use in practice) 3) if the distribution departs greatly from normality, a normalizing transformation such as ln(y) or √y may be necessary

3 steps of autoregressive integrated moving average (ARIMA) modelling

1) identification 2) estimation 3) diagnostic checking

because D is calculated using _____1_____ and _____2_____ it considers whether an observation is unusual with respect to both x and y values

1) leverage values 2) standardized residuals

3 measures of forecast error

1) mean absolute error (MAE) 2) mean squared error (MSE) (also called mean squared deviation - MSD) 3) mean absolute percentage error (MAPE) see formula sheet for all!
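
A minimal sketch of all three measures on made-up actual/forecast values:

```python
import numpy as np

actual = np.array([112.0, 118.0, 132.0, 129.0])
forecast = np.array([110.0, 120.0, 128.0, 131.0])
err = actual - forecast

mae = np.mean(np.abs(err))                  # mean absolute error
mse = np.mean(err ** 2)                     # mean squared error (MSD)
mape = np.mean(np.abs(err / actual)) * 100  # mean absolute percentage error
print(f"MAE = {mae:.2f}, MSE = {mse:.2f}, MAPE = {mape:.2f}%")
```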

_____1______ departures from the _____2______ assumption will generally not invalidate the results of the regression analysis; regression analysis is robust with regard to this assumption.

1) moderate 2) normality (i.e. if the data is not badly skewed and has one major central peak, we can be confident using the model)

2 types of time series models

1) multiplicative, Yt = Tt*Ct*St*It (the one we care about) 2) additive, Yt = Tt+Ct+St+It

residual analysis: 3 things that can be detected through

1) normality of error terms 2) constant variance 3) independence of the error terms

2 possible causes of outliers within the model

1) omission of an important variable 2) omission of higher order terms

2 plots to detect lack of fit

1) plot residuals ei on the vertical axis against each of the independent variables x₁, x₂, ..., xₖ on the horizontal axis (residual plot) 2) plot the residuals on the vertical axis against the predicted value ŷ on the horizontal axis (residuals vs fits)

It is important to choose a model that is consistent with the _____1_____ associated with the phenomenon under consideration, even if the _____2_____ is lower.

1) scientific fact 2) R²

2 notes on ratio-to-moving-average

1) the first 2 and last 2 values of a time series are lost in the totalling process 2) the ratio-to-MA values are not pure seasonal indices because they contain the irregular component in addition to the seasonal component.

2 reasons to transform y

1) to make y values satisfy model assumptions 2) to make the deterministic portion of the model a better approximation to the mean value of the transformed variable

(similar to ANOVA for regression) the total variation comes from 2 sources:

1) treatment (between) 2) error (within)

Factor A: _______1_______ Factor B: _______2_______

1) treatments 2) blocks

4 notes on lags

1) we can plot the pairs (yₜ, yₜ₋₁) to investigate a possible first lag relationship 2) the Durbin-Watson statistic is used to detect significant autocorrelation of lag 1 3) the autocorrelation coefficient for the kth lag is rₖ 4) the question we seek to answer: what lag length k might be interesting to investigate?

4 guidelines for identifying models using ACF and PACF plots

1) you are entitled to treat any non-significant values as 0 (i.e. ignore values that lie within the confidence intervals on the plot) 2) you do not HAVE to ignore them, particularly if they continue the pattern of the statistically significant values 3) an occasional autocorrelation will be significant by chance alone. 4) you can ignore a statistically significant autocorrelation if it is isolated, preferably at a high lag, and if it does not occur at a seasonal lag

in the context of ANOVA, what are 2 reasons why t test can be more flexible than an ANOVA F test?

1) you may choose a one sided alternative instead 2) you may want to run a t test assuming unequal variance if you're not sure that your 2 populations have the same std. deviation σ

2 stabilizing transformations

1) √y 2) ln(y)

4 assumptions of the multiple linear regression model

1. *E(error) = 0* 2. *Normality* - values are normally distributed 3. *No multicollinearity* - independent variables are not highly correlated with each other 4. *Homoscedasticity* - similar variance of error terms across values of the independent variable

3 reasons why multicollinearity is a problem

1. The std. errors of the regression coefficients are inflated 2. estimated regression coefficients must be interpreted as the average change in the dependent variable per unit change in an independent variable *when all other variables are held constant* 3. inferential statistics on the regression parameters are not reliable (t tests, CIs for βᵢ: large std. errors mean large CIs and small t-stats, so the researcher will accept too many H₀s)

4 ways to detect multicollinearity

1. high correlation between pairs of variables (correlation matrix) 2. estimated regression coefficients change when a variable is added or removed 3. there are conflicting results between F and t tests (e.g. overall F is significant but individual t values are not) 4. variance inflation factor (VIF) > 5

4 advantages of experimental variables

1. user controls experiment 2. variable values can be assigned so independent variables are not correlated and multicollinearity can be eliminated 3. cause and effect relationships can be inferred 4. randomization can be controlled by assigning a range of values to the independent variable

observations with large leverage may...

exert considerable influence on the fitted value, and thus the regression model

influential observation

one whose removal would significantly affect the regression equation

Does AIC or BIC choose a more parsimonious model and why?

BIC chooses the more parsimonious model as it imposes a higher penalty for including extra parameters (ln[n] in the numerator of the second part of the expression!)

Case 1: if all populations are approximately normal, use...

Bartlett's test

this methodology can be used to decide whether to use a pure AR or MA model or an ARMA model

Box-Jenkins

formula for Cook's D (not on the formula sheet!)

Dᵢ = (eᵢ²/[(k+1)·MSE]) · [hᵢ/(1−hᵢ)²] where: k = the number of x coefficients in the regression model, eᵢ = the ith residual, hᵢ = the ith leverage value
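
A sketch of this formula on fabricated data, checked against statsmodels' built-in Cook's distance:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=(40, 2))
y = x @ np.array([1.0, -2.0]) + rng.normal(size=40)

fit = sm.OLS(y, sm.add_constant(x)).fit()
infl = fit.get_influence()

k = 2                                       # number of x coefficients
e = fit.resid                               # residuals e_i
h = infl.hat_matrix_diag                    # leverages h_i
D = (e ** 2 / ((k + 1) * fit.mse_resid)) * (h / (1 - h) ** 2)

assert np.allclose(D, infl.cooks_distance[0])   # agrees with the built-in values
```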

First difference

Dₜ = Yₜ − Yₜ₋₁

other commonly used MAs besides 3MA

Henderson's MA (not on final)

Hypothesis test for equal variances

H₀: σ₁²/σ₂² = 1
H₁: σ₁²/σ₂² ≠ 1 (where σ₁² is the larger and σ₂² the smaller)
TS: F = s²(larger)/s²(smaller) = MSE(larger)/MSE(smaller)
CV: from the F tables
Accept H₀ if F ≤ CV; reject H₀ if F > CV
Rejecting the hypothesis of equal variances means that the variances are unequal, i.e. heteroscedasticity is present
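
A sketch of this test on two fabricated samples, taking the critical value from scipy's F distribution instead of printed tables:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
s1 = rng.normal(scale=3.0, size=25)
s2 = rng.normal(scale=1.0, size=20)

v1, v2 = s1.var(ddof=1), s2.var(ddof=1)
if v1 >= v2:                                 # larger variance goes on top
    F, dfn, dfd = v1 / v2, len(s1) - 1, len(s2) - 1
else:
    F, dfn, dfd = v2 / v1, len(s2) - 1, len(s1) - 1

cv = stats.f.ppf(0.95, dfn, dfd)             # critical value at alpha = .05
print(f"F = {F:.2f}, CV = {cv:.2f}, reject H0: {F > cv}")
```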

ANOVA test of hypothesis

H₀: µ₁ = µ₂ = ... = µₚ
H₁: not all µⱼ are equal (at least one mean is different)
TS: F = MSTR/MSE = MSB/MSW
CV: Fα(p−1, nₜ−p) [one way]
Do not reject H₀ if F ≤ CV; reject H₀ if F > CV
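
A sketch on three fabricated treatment groups; scipy's f_oneway reports the F statistic with a p-value rather than a critical-value lookup:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
g1 = rng.normal(10, 2, size=12)
g2 = rng.normal(10, 2, size=12)
g3 = rng.normal(13, 2, size=12)              # one shifted mean

F, p = stats.f_oneway(g1, g2, g3)            # one-way ANOVA F test
print(f"F = {F:.2f}, p = {p:.4f}")           # small p => not all means are equal
```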

Values of I in ARIMA

I refers to the degree of differencing. I=0: differencing is not necessary. I=1: first difference. I=2: second difference, Zₜ = Dₜ − Dₜ₋₁ = (Yₜ − Yₜ₋₁) − (Yₜ₋₁ − Yₜ₋₂) = Yₜ − 2Yₜ₋₁ + Yₜ₋₂. Note: it is rare to take differences of order higher than 2 (i.e. 0, 1, or 2 is usually sufficient)
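
A tiny pandas sketch of first and second differencing on a made-up series with a quadratic trend:

```python
import pandas as pd

y = pd.Series([3.0, 5.0, 8.0, 12.0, 17.0, 23.0])   # quadratic trend

d1 = y.diff()          # D_t = Y_t - Y_{t-1}
d2 = y.diff().diff()   # Z_t = Y_t - 2Y_{t-1} + Y_{t-2}
print(d1.tolist())     # [nan, 2, 3, 4, 5, 6]   -- still trending, I=1 not enough
print(d2.tolist())     # [nan, nan, 1, 1, 1, 1] -- constant, so I=2 suffices
```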

Portmanteau tests' goal

Instead of studying the correlation coefficients rₖ one at a time, the idea of this class of tests is to consider a whole set of rₖ values (e.g. r₁ through r₁₂) all at once.

rationale for the Bonferroni correction

It is used to reduce the chances of obtaining false-positive results (i.e. type 1 errors) when multiple pairwise tests are performed on a single set of data.

Case 2: if all populations are clearly not normal, use...

Levene's test

Ljung-Box Q statistic formula (not on formula sheet)

Q = n(n+2) ∑ [rₖ²/(n−k)] where k = time lag, rₖ = the ACF at lag k, and n = # of observations
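
A sketch using statsmodels' acorr_ljungbox on made-up residuals standing in for the residuals of a fitted model:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(6)
resid = rng.normal(size=200)              # hypothetical model residuals

print(acorr_ljungbox(resid, lags=[12]))   # Q and p-value for lags 1..12 jointly
# p > .05 => do not reject H0 (model adequate)
```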

Ri² in VIF formula

Rᵢ² is the multiple coefficient of determination in the multiple regression model that expresses xᵢ as a function of all the independent variables except xᵢ

R² when multicollinearity is present

R² and the predictive power of the regression model remain unaffected by multicollinearity (i.e. a model with a high R² and significant F test can be used for prediction purposes)

Between/Within/Total

SSB, SSW, SSTO, where SSTO = SSB + SSW
MSB = SSB/(p−1), MSW = SSW/(nₜ−p)
F = MSB/MSW = [SSB/(p−1)] / [SSW/(nₜ−p)]

Treatment/Error/Total

SSTR, SSE, SSTO, where SSTO = SSTR + SSE
MSTR = SSTR/(p−1), MSE = SSE/(nₜ−p)
F = MSTR/MSE = [SSTR/(p−1)] / [SSE/(nₜ−p)]

the moving average (MA) model of order q

Yₜ = θ₀ + eₜ − θ₁eₜ₋₁ − θ₂eₜ₋₂ − ... − θq·eₜ₋q where eₜ is the error term for period t. By convention, the θ terms in the model are preceded by a minus sign

the autoregressive (AR) model of order p

Yₜ = φ₀ + φ₁Yₜ₋₁ + φ₂Yₜ₋₂ + ... + φₚYₜ₋ₚ + eₜ where Yₜ₋₁, Yₜ₋₂, ... are time-lagged values and eₜ is the residual error term.

ratio-to-moving-average

Yₜ = T·C·S·I and 4QCMA = T·C; therefore Y/4QCMA = S·I. The ratio-to-moving-average includes only the seasonal and irregular components of the time series

autoregressive moving average (ARMA) model of order (p,q)

a combination of the AR and MA models: Yₜ = φ₀ + φ₁Yₜ₋₁ + φ₂Yₜ₋₂ + ... + φₚYₜ₋ₚ + eₜ − θ₁eₜ₋₁ − θ₂eₜ₋₂ − ... − θq·eₜ₋q. e.g. ARMA(1,1): Yₜ = φ₀ + φ₁Yₜ₋₁ + eₜ − θ₁eₜ₋₁
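
A sketch of fitting an ARMA(1,1) in statsmodels (an ARIMA with d = 0) to a roughly AR(1) series simulated for illustration:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(9)
y = np.zeros(200)
for t in range(1, 200):                      # simulate a rough AR(1) series
    y[t] = 0.6 * y[t - 1] + rng.normal()

fit = ARIMA(y, order=(1, 0, 1)).fit()        # order = (p, d, q) with d = 0
print(fit.params)                            # phi and theta estimates
```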

leverage

a measure of how influential an observation is; the larger the leverage value, the more influence the observed y has on its predicted value

time series

a sequence of observations collected from a process at fixed (and usually equally spaced) points in time.

ANOVA definition

a statistical test of significance for the equality of several (2 or more) population ("treatment") means. H₀: µ₁ = µ₂ = ... = µₚ; H₁: not all µⱼ are equal

stationary time series

a time series is said to be stationary if there is no systematic change in the mean (i.e. no trend), no systematic change in the variance, and if strictly periodic variations have been removed; a longitudinal measure in which the process generating returns is identical over time

stabilizing transformation

a transformation that reduces heteroscedasticity

correlation plots: why?

after a time series has been made stationary by differencing, the next step in fitting an ARIMA model is to determine whether AR or MA terms, or a combination of the two, are needed to correct any autocorrelation that remains in the differenced series. By looking at ACF and PACF plots of the differenced series, you can tentatively identify the numbers of AR or MA terms that are needed.

cyclical

alternating periods of expansion and contraction of more than one year's duration

the Bonferroni correction

an adjustment made to α values when several statistical tests are performed simultaneously on a single data set. To perform the correction, divide the α value by the number of comparisons being made (e.g. if k hypotheses are being tested, the new level of significance is α/k); the statistical power of the study is then calculated based on this modified α
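
The correction itself is one line; a trivial sketch assuming 10 comparisons:

```python
alpha, k = 0.05, 10      # e.g. 10 pairwise comparisons on one data set
alpha_adj = alpha / k    # each individual test is run at this stricter level
print(alpha_adj)         # 0.005
```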

lag

an interval of time between observations in a time series (e.g. a one quarter lag between data points)

outlier

an observation with a residual greater than 3 standard deviations (>3σ), or equivalently with a standardized residual >3.

Cook's Distance (D)

an overall measure of the impact of the ith observation on the n fitted values

Tukey's multiple comparisons

another, more sophisticated test for equality of means. In Minitab output, means that do not share a letter are significantly different

a Bonferroni CI

construct k CIs each with confidence level 1-(α/k), note that there will be one CI for each population (formula is included on formula sheet!)

longitudinal (time series) data analysis

data are collected by making repeated observations on a process over time, and the past behavior of these series can be examined to try to predict future behavior

standardized residuals

denoted zi for the ith observation, this is the residual for the observation ei divided by the standard error of the estimate s (s=√MSE) "Standardized residuals normalize your data in regression analysis, [it is] a measure of the strength of the difference between observed and expected values." (see formula sheet!)

a test for heteroscedasticity

divide the sample of observations into groups based on the value of ŷ (e.g. above vs. below a certain value) and compare the spread of the residuals in each group

if there is strong evidence of residual correlation...

doubt is cast on the least squares results and any inferences drawn from them

one possible action you can take if an outlier represents a real and plausible value in your dataset?

exclude the data from the statistical analysis and write an exception report to explain the absence of the value from the dataset

normal probability plot

graphs the *residuals* against the *expected values of the residuals* under the assumption of normality. if the assumption is true, then a residual should approximately equal its expected value, resulting in a straight line graph (points should lie within 95% confidence limits of the straight line representing the normal distribution)

resolving non-stationarity with regard to the mean

if a time series plot is not stationary with regard to the mean (i.e. it exhibits a trend), try differencing the series.

resolving non-stationarity with regard to the variance

if a time series plot is not stationary with regard to the variance, try a transformation (e.g. logarithmic, square root, etc.)

note on logarithmic transformations

if any values in the dataset are equal to zero, a logarithmic transformation cannot be used, as ln(0) is undefined. Use a square root transformation instead

outliers are often caused by...

improper measurement

what to look for in residual plots and residual vs fits plots

in each plot, look for trends, dramatic changes in variability, and/or more than 5% of residuals lying outside 2 standard deviations of 0; any of these patterns indicate a problem with model fit

computing leverage

in regression analysis it is known that the predicted value for the ith observation, ŷᵢ, can be written as a linear combination of the n observed values (y₁, y₂, ..., yₙ). Thus for each i = 1 to n: ŷᵢ = hᵢ₁y₁ + hᵢ₂y₂ + ... + hᵢᵢyᵢ + ... + hᵢₙyₙ where hᵢᵢ is the leverage of the ith observation, meaning it measures the influence of the observed value yᵢ on its own predicted value ŷᵢ
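
A numpy sketch of the same idea via the hat matrix H = X(XᵀX)⁻¹Xᵀ, whose diagonal entries are the leverages (X here is a made-up design matrix):

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])  # constant + 2 predictors

# hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverages h_ii
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverages = np.diag(H)
print(leverages.sum())   # trace(H) = k + 1, the number of parameters (here 3)
```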

note about the 3MA

it is the simplest possible MA and is not very efficient at removing the irregular component

trend

long-term growth or decline in a time series (linear is the only one we care about) Tt = β₀ + β₁t

the Bonferroni correction: the probability of identifying at least 1 significant result due to chance increases as...

more hypotheses are tested. For example, for a researcher testing 20 hypotheses with α = .05: P(at least 1 sig. result) = 1 − P(no sig. results) = 1 − (1 − .05)²⁰ = .6415. This implies a 64.15% chance of identifying at least one significant result by chance alone, even if none of the effects is real
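
The corrected arithmetic in one line of Python:

```python
p_at_least_one = 1 - (1 - 0.05) ** 20   # complement of "no significant results"
print(round(p_at_least_one, 4))         # 0.6415
```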

observations with large D values may be...

outliers

in the AR model, forecasts are based exclusively on...

previous values of the variable to be forecasted

seasonal

repetitive patterns completing themselves within a year (i.e. seen in monthly and quarterly data)

irregular (random)

residual movements in the time series after all other factors have been removed, follows no recognizable pattern

Modified Levene's test

similar to a one way ANOVA based on the absolute deviations of the observations of each sample from their medians. If the p-value < α, reject H₀ and conclude that the variances are unequal

Hartley test

test statistic = max sᵢ²/min sᵢ² (the largest sample variance over the smallest); reject H₀: σ₁² = σ₂² = ... = σₚ² for large values of the test statistic

the idea of ANOVA

testing the equality of means extended to more than 2 groups

the rationale for finding the 4 quarter centered moving average (4QCMA)

the 4QCMA contains only the trend T and the cyclical component C. The totalling process removes the seasonal effect and dividing the 8QCMT by 8 (the averaging process) removes the random component

finding the 4 quarter centered moving average (4QCMA)

the 8QCMT is divided by 8 to obtain the 4QCMA. This can be directly compared to the original data value for the corresponding time period

F tests for ANOVA and regression

the ANOVA test of hypothesis (H₀: µ₁ = µ₂ = ... = µₚ) is equivalent to the regression test of hypothesis (H₀: β₁ = β₂ = ... = βₚ₋₁ = 0); while the formulas for computing the F statistic are different, they result in the same F value

partial correlation

the amount of correlation between 2 variables which is not explained by their mutual correlations with a given specified set of other variables

partial autocorrelation

the amount of correlation between a variable and a lag of itself that is not explained by correlations at all lower order lags

autocorrelation

the correlation of a series with its own past history

centering seasonal data

the four-quarter moving totals (4QMT) are not "centered," meaning they do not correspond to the same time scale as the original data. The data are centered by adding 2 adjacent 4QMTs to get an 8 quarter centered moving total (8QCMT). The 8QCMT's midpoint falls on an actual quarter of the original series (e.g. combining the totals for quarters 1-4 and 2-5 centers the result on quarter 3)
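
A pandas sketch of the centering arithmetic on 8 made-up quarters (rolling totals are right-labeled, hence the shifts):

```python
import pandas as pd

y = pd.Series([120.0, 80.0, 90.0, 110.0, 130.0, 85.0, 95.0, 115.0])  # 8 quarters

qmt4 = y.rolling(4).sum()          # 4-quarter moving totals (right-labeled)
cmt8 = qmt4 + qmt4.shift(-1)       # add adjacent 4QMTs -> 8QCMT
cma4 = (cmt8 / 8).shift(-1)        # divide by 8 and align with an actual quarter
print(cma4)                        # first 2 and last 2 values are lost (NaN)
```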

any interpretation of main effects depends on the interaction and therefore...

the main effects must be retained in the model even if their p values are not significant

replications

the number of data points for each factor/block combination

lag length

the number of periods back that our series reaches

multicollinearity specifically affects...

the reliability of the least-squares estimate of the regression coefficients

the first lag

the series of values yₜ₋₁

computing partial correlation

the square root of the coefficient of partial determination (the coefficient of partial determination is the % reduction in error variance that is achieved by adding X₃ to the regression model of Y on X₁ and X₂)

t statistics for autocorrelation functions (ACF) and partial autocorrelation functions (PACF)

the t-statistic can be used to test whether a particular lag equals zero. A t statistic with an absolute value greater than 1.25 for lags 1-3, or greater than 2 for lags 4+, indicates a lag not equal to zero.

treatment variation

the variation *between* groups

error variation

the variation within the group

residual plot: if the cluster of error points becomes more spread out as predicted Y values increase...

then heteroscedasticity is present

residual plot: if there is a tight cluster of points above zero and a more spread out cluster below...

then there is a failure of normality

residual plot: if a band of points forms around zero...

then there is no evidence that any of the 3 assumptions have been violated

a note on stabilizing transformations

there are many possible transformations of the form Y^k where k can be any real exponent (see Box-Cox)

multicollinearity: if 2 independent variables are moderately or highly correlated...

there are variables in the model that are performing the same job in explaining the variation in the dependent variable (i.e. they are not needed in the model together.)

ACF and PACF plots of ARIMA processes

they typically show exponential declines in both ACF and PACF plots

residual plot: if the cluster of points forms a wavy pattern that moves above and below zero...

this is a sign of nonlinearity

Ljung-Box Q statistic (extension of Box-Pierce test)

this is a test for overall model adequacy (a sort of lack-of-fit test); it tests to see if the entire set is significantly different from the zero set (a member of the class of tests known as "Portmanteau")

interpreting Cook's D

to interpret, compare D to the F distribution with (k+1, n-[k+1]) degrees of freedom to determine the corresponding percentile. Generally, if the percentile value is >50%, the observation has a major influence on the fitted values and should be examined

Hypothesis test with Ljung-Box Q statistic

use the Ljung-Box Q statistic to test the null hypothesis that the autocorrelations for all lags up to lag k are equal to 0. H₀: model is adequate; H₁: model is not adequate. If the p-value associated with the Q statistic is significant (p < .05), the model is considered *inadequate*; if all p-values > .05, do not reject H₀

the Durbin-Watson test

used for time series data to detect serial correlations in residuals

residual plot

used to test the assumptions of the multiple regression model: plots residuals against the independent variable X or against the predicted Y values

residual analysis

using residuals to detect departure from assumptions

experimental data

values of the independent variable are controlled

observational data

values of the independent variable are uncontrolled

if the ANOVA is significant but the data fail to pass a test of normality or of equal variances...

we must deem any conclusions drawn from the ANOVA invalid

when does heteroscedasticity occur?

when regression results produce error terms that are of significantly varying degrees across different observations

leverage values provide information about...

whether an observation has *unusual predictor values* compared to the rest of the data

if Yt is non-stationary in the mean, Dt...

will often be stationary (occasionally, second differencing will be necessary to achieve stationarity)

defining variables for two way ANOVA, factorial design: experimental design with 2 factors A and B, where A has 3 levels and B has 2

x₁ : 1 if Factor A level 1, 0 if not x₂ : 1 if Factor A level 2, 0 if not x₃ : 1 if Factor B level 1, 0 if not (see complete formula on formula sheet!)

reciprocal transformation of the independent variable

y = β₀ + β₁x₁* + ε where x₁* = 1/x₁, i.e. y = β₀ + β₁(1/x₁) + ε

note on ACF and PACF plots

you must learn to pick out what is essential in any given plot; always check the ACF and PACF plots of the residuals in case your identification is wrong

error term in a multiple regression model

εᵢ = yᵢ − E(yᵢ); εᵢ = yᵢ − (β₀ + β₁x₁ + ... + βₖxₖ)

