Unit 2: Autoregressive and Moving Average (ARMA) Model
property of lag 0 ACF for white noise
The autocovariance function of white noise is nonzero only at lag 0, where it equals the variance; it is 0 at all other lags.
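In symbols (a minimal sketch, using the standard white noise notation with variance sigma^2):
\[
\gamma_Z(h) = \begin{cases} \sigma^2, & h = 0 \\ 0, & h \neq 0 \end{cases}
\qquad\Rightarrow\qquad
\rho_Z(h) = \begin{cases} 1, & h = 0 \\ 0, & h \neq 0. \end{cases}
\]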
MLE formula to minimize
To estimate the AR and MA coefficients, we maximize the likelihood function; equivalently, we minimize the negative log-likelihood.
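As a hedged sketch of what gets minimized: the Gaussian likelihood written in terms of one-step prediction errors (the form used in Brockwell and Davis; the course notes may parameterize it slightly differently) gives
\[
-2\ln L(\phi,\theta,\sigma^2) \;=\; n\ln(2\pi\sigma^2) \;+\; \sum_{j=1}^{n}\ln r_{j-1} \;+\; \frac{1}{\sigma^2}\sum_{j=1}^{n}\frac{(X_j-\hat{X}_j)^2}{r_{j-1}},
\]
where \(\hat{X}_j\) is the one-step predictor of \(X_j\) and \(\sigma^2 r_{j-1}\) is its mean squared error.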
why do we prefer that processes are invertible?
We prefer invertible ARMA processes in practice because, if we can invert an MA process to an AR process, we can find the value of Zt, which is not observable, from all past values of Xt, which are observable. If a process is non-invertible, then in order to find the value of Zt we would have to know all future values of Xt.
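In symbols (a sketch consistent with the standard definition, not taken verbatim from the notes): for an invertible process the noise can be recovered from the present and past of the observed series,
\[
Z_t \;=\; \sum_{j=0}^{\infty} \pi_j X_{t-j}, \qquad \text{with } \sum_{j=0}^{\infty} |\pi_j| < \infty .
\]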
can Yule Walker be used to estimate order of AR(p)?
YES
For ARMA parameter estimation, do you de-mean? Are p and q fixed?
Yes and Yes
Can MA be used on ARMA models?
Yes, if causal
invertible process
a process whose white noise Zt can be written in terms of present and past values of Xt
stochastic
a random process - an expected value + variation
ARMA stands for
auto-regressive moving average
why do we perform a histogram/Q-Q plot with ARMA/ARIMA?
b/c we assume normality
what is a causal process?
a process whose Xt can be written as a linear combination of present and past white noise values Zt (an MA(infinity) representation); equivalently, phi(z) is nonzero for all |z| <= 1, so a causal AR process can be "inverted" to an MA(infinity) form
What's a downside to AICC?
computationally very expensive
stationary solution to ARMA equation if this condition met
phi(z) cannot equal 0 for any value with | z | = 1 (no roots of the AR polynomial on the unit circle)
How can we use the Yule Walker equations?
for finding the phi coefficients of the AR model
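A minimal R sketch of Yule-Walker estimation for an AR(p); the function and method name are base R, while the simulated series and its coefficients are made up for illustration:

```r
set.seed(1)
x <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)   # example AR(2) series

# Yule-Walker estimation of the phi coefficients (order fixed at p = 2)
fit.yw <- ar(x, method = "yule-walker", order.max = 2, aic = FALSE)
fit.yw$ar            # estimated AR coefficients
fit.yw$var.pred      # estimated noise variance
fit.yw$asy.var.coef  # asymptotic variance of the estimates (usable for confidence intervals)
```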
forecasting ARIMA model
forecasting an ARIMA model differs from forecasting an ARMA model because ARMA processes are assumed causal, and an ARIMA process with d > 0 cannot be causal (its AR polynomial has roots on the unit circle)
Sample PACF
good for identifying the order of an AR process, and hence a 'good' candidate model
what are the conditions of an ARMA process?
{Xt} is stationary and satisfies the ARMA equation Xt - phi1 Xt-1 - ... - phip Xt-p = Zt + theta1 Zt-1 + ... + thetaq Zt-q for every t, where Zt is white noise
What conditions make a time series ideal for Yule Walker, asides from the requirements of white noise w/ mean zero and constant variance?
large length of time series or large sample size
as phi increases, is it more or less likely that a process approaches causality?
less likely; for an AR(1), causality requires |phi| < 1, so a larger phi pushes the process toward (or past) the boundary. In the example, the top time series is stationary while the bottom is NOT
What does the last equation here mean?
means that for lags h > p, the sample PACF is AN (asymptotically normally distributed) with mean 0 and variance 1/n, where n is the sample size
what does side = 1 mean?
means we're generating an MA process: in R's filter() function, sides = 1 with method = "convolution" applies the filter to the current and past values only
what method do you use for an AR process?
method = "recursive" (in R's filter() function), which applies the autoregressive recursion
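A hedged R sketch of both cases with stats::filter(); the coefficient values are made up for illustration:

```r
set.seed(1)
z <- rnorm(200)    # white noise Z_t

# MA(1): X_t = Z_t + 0.6 Z_{t-1}
# method = "convolution" with sides = 1 uses the current and past values only
x.ma <- stats::filter(z, filter = c(1, 0.6), method = "convolution", sides = 1)

# AR(1): X_t = 0.7 X_{t-1} + Z_t
# method = "recursive" applies the autoregressive recursion to the input noise
x.ar <- stats::filter(z, filter = 0.7, method = "recursive")
```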
are all stationary processes causal?
no
another approach for evaluating a prediction is
seeing if the observed values fall within the prediction bands
how do you find the psi coefficients?
solve a system of linear equations obtained by matching coefficients in phi(z) * psi(z) = theta(z)
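In R, ARMAtoMA() solves this coefficient-matching system; a small sketch with made-up coefficients:

```r
# psi (MA(infinity)) coefficients of a causal ARMA(2, 1) model
psi <- ARMAtoMA(ar = c(0.5, -0.2), ma = 0.4, lag.max = 10)
psi
```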
what part of the notation refers to the autoregressive part?
the phi(B) polynomial applied to Xt, where B is the backshift operator
what happens to the co-variance between two points in a stationary process as the lag approaches infinity?
the covariance approaches 0
What's the d in the ARIMA model?
d is the number of differences applied to the series; differencing can remove trend
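A tiny hedged example in R (hypothetical series) showing that one difference (d = 1) removes a linear trend:

```r
set.seed(1)
t <- 1:200
x <- 2 + 0.5 * t + rnorm(200)     # linear trend plus noise
dx <- diff(x, differences = 1)    # first difference: the linear trend drops out
plot.ts(dx)                       # looks roughly like stationary noise around 0.5
```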
prediction accuracy measures
y* denotes the predicted values; capital Y-bar is the mean response.
- MSPE is appropriate for evaluating the prediction accuracy of the best linear prediction, but it depends on the scale of the data and is sensitive to outliers.
- MAE is not appropriate for evaluating the prediction accuracy of the best linear prediction; it also depends on the scale, but it is robust to outliers.
- MAPE is likewise not appropriate for evaluating the prediction accuracy of the best linear prediction, but it does not depend on the scale and is robust to outliers.
- PM, the precision measure, is appropriate for evaluating the prediction accuracy of the best linear prediction and does not depend on the scale. It is reminiscent of the R-squared used in linear regression and can be interpreted as the proportion between the variability in the prediction errors and the variability in the new data.
While MAE and MAPE are commonly used to evaluate prediction accuracy, the precision measure is recommended.
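A hedged R sketch of these measures as defined above; the vectors y.obs and y.pred are hypothetical placeholders for new observations and their predictions:

```r
set.seed(1)
y.obs  <- rnorm(20, mean = 10)          # hypothetical new observations
y.pred <- y.obs + rnorm(20, sd = 0.5)   # hypothetical predictions

mspe <- mean((y.pred - y.obs)^2)                                 # mean squared prediction error
mae  <- mean(abs(y.pred - y.obs))                                # mean absolute error
mape <- mean(abs(y.pred - y.obs) / abs(y.obs))                   # mean absolute percentage error
pm   <- sum((y.pred - y.obs)^2) / sum((y.obs - mean(y.obs))^2)   # precision measure (PM)
```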
ARMA order and ACF below confidence bands
we do not expect the sample autocorrelation to be exactly 0 for lags larger than the order of the MA process, only to be approximately 0 (i.e., within the confidence bands). *only for stationary processes
What do we use to estimate MA parameters, the theta parameters?
we use the Innovations algorithm
ARMA autocovariance function
gamma(h) = sigma^2 * sum_j psi_j psi_{j+|h|}, where the psi's are the coefficients of the linear (MA(infinity)) representation of the process
Can ARIMA be used on non-stationary processes?
yes
are all causal processes stationary?
yes
is there a significant difference between predicting 10 days ahead all at once and predicting 10 days ahead one day at a time (updating with each new observation)?
yes
if auto-regressing, will you have to filter your data?
yes; since this AR model has order p = 2, the response must use indices (p+1):end, because the first p observations are only available as lagged predictors
Can Yule Walker determine confidence intervals and statistical significance of parameters?
yes!
identifying order
AR(p) from PACF; MA(q) from ACF. For an AR process, it can be shown that the PACF is 0 for lags larger than the order p. Thus we can identify the lag (order) of the AR process using the PACF plot. To summarize: if we have an AR process, we can identify p using the PACF, and if we have an MA process, we can identify q using the ACF. Unfortunately, there are no such simple rules for ARMA(p,q) processes in general. (See the R sketch below.)
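A hedged R sketch illustrating these rules on simulated series; the orders and coefficients are made up:

```r
set.seed(1)
x.ar <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)   # AR(2)
x.ma <- arima.sim(model = list(ma = c(0.5, 0.4)), n = 500)    # MA(2)

par(mfrow = c(2, 2))
acf(x.ar);  pacf(x.ar)   # PACF cuts off after lag 2, identifying p = 2
acf(x.ma);  pacf(x.ma)   # ACF cuts off after lag 2, identifying q = 2
```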
ARIMA limitations
ARIMA modeling is easy to implement, but it captures the non-stationarity in the trend by assuming similarity among prior observations, and thus it can mispredict if there are large changes in the trend. On the other hand, the trend-plus-ARIMA estimation approach is more difficult to implement, particularly if we are also interested in obtaining confidence bands, but it can capture long-memory trends. Prediction using ARIMA can perform well if only short prediction horizons are considered.
PACF and confounding
Another important measure of dependence for a time series, one that accounts for (removes) confounding, is the partial autocorrelation function.
In the ED data example, what methods do we use to test for correlation?
Box-Pierce and Ljung-Box
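A hedged R sketch of these tests on model residuals; the simulated series, fitted orders, and lag choice are illustrative, not the course's ED data:

```r
set.seed(1)
x   <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 500)
fit <- arima(x, order = c(1, 0, 1))
res <- residuals(fit)

# fitdf = number of fitted AR + MA parameters (here 1 + 1 = 2)
Box.test(res, lag = 10, type = "Box-Pierce", fitdf = 2)
Box.test(res, lag = 10, type = "Ljung-Box",  fitdf = 2)
```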
AR coefficient estimation via linear regression
Constraint - must assume normality
Confounding
Correlation between two observations of a time series at two time points can result from a mutual linear dependence on observations at other time points; this is called confounding.
what's a less computationally expensive order selection method than AICC?
the extended autocorrelation function (EACF)
coefficient correlation example
From this example, and other examples, we see that although Xt is modeled as a moving average, i.e., a linear combination of white noise, Xt and Xt-1 still depend on the same sequence of white noise. Thus, they are correlated, and this correlation is reflected in the ACF plots.
PACF above bands and order
If we were to use the PACF plot to identify the lag of the AR process, we would identify an AR process of order one. However, the property of the PACF for an AR process does not hold for non-stationary processes. See the non-stationary and stationary plots, both AR(2): the non-stationarity makes the process look like order 1 when it is actually order 2.
Are the expectation and covariance of the ARIMA process uniquely determined?
Importantly, the expectation and the covariance of the ARIMA process are not uniquely determined.
How does ARIMA differ from ARMA?
In ARMA, d = 0, while in ARIMA, d > 0. Also, a causal ARMA process will not be causal as an ARIMA process because there are d roots on the unit circle.
statistical properties of MLE
Last, we can use the properties of the MLEs (maximum likelihood estimators), in particular that they are asymptotically normally distributed regardless of the distributional assumption on the noise Zt. The MLEs are asymptotically unbiased: for large sample size, the expectation of the MLEs is close to the true parameter values. The covariance matrix of the estimators depends directly on the covariance of the time series. This asymptotic distribution can be used for statistical inference on the AR and MA coefficients, as well as for the asymptotic distributions of the ACF and PACF.
Consequence of ACF estimate versus exact value
Moreover, its estimate is not exact; it is a random variable with a distribution, as explained in more detail in a later lesson. Because of this, we do not expect the sample autocorrelation to be exactly 0 for lags larger than the order of the MA process, only to be approximately 0. Thus, the ACF plot indicates that the order of the MA process is 1, which is the order of the process we actually simulated.
consider the sample model
Mt is the trend, St is the seasonality, Zt is white noise. In practice: because there are many orders to determine, those of the ARIMA (p, d, q) along with those of a seasonal ARIMA (capital P, D, Q), such a model is difficult to implement.
To apply ARMA, do the random variables need a normal distribution?
NO. Note that the distribution of the random variables does not need to be normal.
Can Yule Walker be used for MA and ARMA?
NO!
Can the ACF and PACF plots predict p and q for an ARMA process?
No. The PACF plot here suggests AR order p = 3 and the ACF plot here suggests MA order q = 4; however, the true process is ARMA(2, 2).
would you expect to see periodic patterns in PACF?
No; by design, periodic (seasonal) patterns should have been removed before examining the PACF
If you fit a time series with AR alone and then ARMA, will both models have the same AR(p) that minimizes AIC?
No; in the ED data example the AR alone minimized AIC with p order 6 but with ARMA the p order was 5.
ARMA notation
Phi is a polynomial of order p with coefficients given by the AR portion of the model. Theta is a polynomial of order q with coefficients given by the MA portion
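In symbols (a sketch consistent with the standard textbook convention; sign conventions on the theta's can vary):
\[
\phi(B)X_t = \theta(B)Z_t,\qquad
\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p,\qquad
\theta(z) = 1 + \theta_1 z + \cdots + \theta_q z^q,
\]
where B is the backshift operator, \(B X_t = X_{t-1}\).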
AR linear regression process
1) Compute the sample PACF of the data and check that the suggested order aligns with the order chosen for the model. 2) Estimate the coefficients via regression with that order. 3) Check that the residuals' sample ACF/PACF stays inside the confidence bands, i.e., the residuals behave like AR(0) white noise. (See the R sketch below.)
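A hedged R sketch of estimating AR coefficients by linear regression, with the order assumed known (here p = 2) and a simulated series standing in for real data:

```r
set.seed(1)
x <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)
p <- 2
n <- length(x)

# Response uses indices (p+1):n; predictors are the p lagged values
y  <- x[(p + 1):n]
X1 <- x[p:(n - 1)]
X2 <- x[(p - 1):(n - 2)]
fit.lm <- lm(y ~ X1 + X2)        # the intercept absorbs a nonzero mean
coef(fit.lm)                     # compare with the true AR coefficients

# Residual diagnostics: sample ACF/PACF should stay inside the bands (AR(0) behavior)
acf(residuals(fit.lm)); pacf(residuals(fit.lm))
```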
process used in IBM example
1) Check the log(time series) with ACF/PACF. 2) Check the first difference of the log series (difference order = 1) with ACF/PACF; the lags should give guidance on the AR order needed. 3) Check the fitted ARMA model with ACF/PACF of the residuals, histogram, Q-Q plot, and the Ljung-Box and Box-Pierce tests.
process to estimate ARMA parameters
1. The AR coefficients, denoted phi1 to phip, and the MA coefficients, denoted theta1 to thetaq, are unknown parameters. If the process has non-zero mean, then mu, the mean of the process, is also a parameter, along with the variance of the white noise Zt. 2. Commonly, we estimate the mean parameter mu first, and 3. subtract mu from the process, so that we can assume a zero-mean process. 4. Thus, we only estimate the AR and MA coefficients and the variance parameter.
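A hedged R sketch of this workflow using arima(), which handles the mean through its intercept term and estimates the AR/MA coefficients and noise variance by maximum likelihood; the series and orders are made up:

```r
set.seed(1)
x   <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 500) + 10   # nonzero mean mu = 10
fit <- arima(x, order = c(1, 0, 1), include.mean = TRUE, method = "ML")
fit$coef      # ar1, ma1, intercept (the estimate of mu)
fit$sigma2    # estimated white noise variance
```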