Time Series Midterm GA Tech

How do you estimate a trend?

1. Moving average 2. Parametric Regression (Linear, Quadratic, etc.) 3. Non-parametric Regression.
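
As a quick illustration, here is a minimal R sketch of all three estimators on simulated data; the window width, polynomial degree, and smoother choice are illustrative, not prescribed:

```r
# Minimal sketch: three ways to estimate a trend (illustrative data and tuning choices).
t <- 1:200
y <- 0.002 * t^2 + rnorm(200, sd = 3)                # quadratic trend plus noise
ma   <- stats::filter(y, rep(1/21, 21), sides = 2)   # 1. moving average, width 21
quad <- fitted(lm(y ~ t + I(t^2)))                   # 2. parametric (quadratic) regression
nonp <- fitted(loess(y ~ t))                         # 3. non-parametric (local polynomial)
plot(t, y)
lines(t, ma, col = "red"); lines(t, quad, col = "blue"); lines(t, nonp, col = "green")
```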

What is the general approach to time series analysis?

1. Plot the time series and check for: a trend, a seasonal component, any apparent sharp changes in behavior, and any outlying observations. 2. Remove trend and seasonal components to get stationary residuals. 3. Choose a model to fit the residual process. 4. Carry out forecasting by forecasting the residuals and then inverting the transformations carried out in step 2.
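
For concreteness, a minimal R sketch of the four steps on a built-in dataset; the log transform, stl decomposition, and AR(1) residual model are illustrative choices rather than the card's prescription:

```r
# Minimal sketch of the general workflow (illustrative modeling choices).
y <- log(AirPassengers)                   # step 1: plot(y) shows trend and seasonality
dec <- stl(y, s.window = "periodic")      # step 2: split into trend/seasonal/remainder
resid <- dec$time.series[, "remainder"]   # stationary residuals
fit <- arima(resid, order = c(1, 0, 0))   # step 3: model the residual process
pred <- predict(fit, n.ahead = 12)$pred   # step 4: forecast the residuals ...
# ... then invert step 2: add back trend/seasonal forecasts and exponentiate
```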

Seasonality Estimation Methods include

1. Seasonal averages. 2. Parametric regression: fit a mean for each seasonality group (e.g., month) using linear regression, or use a cosine-sine curve to fit the seasonal component.

What does the Seasonal Means Model do?

A linear regression approach to estimating the seasonal effects s_t is to consider an ANOVA-type model where we group the data by seasonality group. If the periodicity is d, then we have d seasonal groups or effects. Thus, we treat seasonality as a categorical variable. To fit such a model, we need to set up indicator (dummy) variables that record the group to which each data point belongs. When fitting a categorical variable with d groups, we include only d − 1 dummy variables if the model has an intercept. This approach therefore reduces to estimating a linear regression model, which is convenient when we also have a trend: we can estimate the trend and the seasonality in one regression model (see the sketch below). However, this regression model can have many predicting variables, particularly if we have multiple seasonalities (for example, month and week of the year) and/or the periodicity d is large. For week of the year, we would need 52 different dummy variables, for example.
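
A minimal sketch of this model in R on simulated monthly data; the data, names, and trend shape are illustrative. factor() makes R build the d − 1 dummy variables automatically:

```r
# Minimal sketch: seasonal means (ANOVA-type) model with a linear trend in one regression.
t <- 1:120
month <- factor(rep(1:12, length.out = 120))   # seasonal group as a categorical variable
y <- 0.05 * t + 2 * sin(2 * pi * t / 12) + rnorm(120)
fit <- lm(y ~ t + month)                       # trend plus d - 1 seasonal dummies
summary(fit)                                   # month2..month12 are seasonal effects vs month1
```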

A linear process is

A moving average (MA) process with the sum of the absolute values of the MA coefficients being finite.

Describe the white noise process

A white noise process Xt has constant mean = 0, constant variance σ² (sigma squared), and uncorrelated observations.

Statistical Inference consists of what?

Confidence intervals and Hypothesis Testing

A Time Series can be characterized by:

Constant or non-constant variability over time. Constant, linear or non-linear trend over time. Cyclical patterns which may happen at regular or irregular time periods.

What are the basic components of a time series?

Data: Y_t, where t indexes time, e.g., minute, hour, day, month. Model: Y_t = m_t + s_t + X_t, where m_t is a trend component; s_t is a seasonality component with known periodicity d such that s_t = s_{t+d}, meaning that the pattern is repeated for each seasonal cycle; and X_t is a stationary component, i.e., its probability distribution does not change when shifted in time. Approach: m_t and s_t are first estimated and subtracted from Y_t, leaving the stationary process X_t to be modeled using time series modeling approaches.

What is a caution of doing regression analysis on a time series?

The observations are correlated over time, so the independence assumption underlying standard regression inference does not hold.

How do you eliminate a trend (no seasonality)?

Estimate the trend (which allows us to gain more insight) and remove it, or difference the data to remove the trend directly, as in the sketch below.
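
A minimal sketch of the differencing route; the data are illustrative:

```r
# Minimal sketch: first differencing removes a linear trend directly.
t <- 1:200
y <- 2 + 0.1 * t + rnorm(200)   # linear trend plus noise
dy <- diff(y)                   # X_t - X_{t-1}; the trend is gone, mean(dy) is about 0.1
plot.ts(dy)
```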

What do moments of a distribution do?

They fully characterize the distribution.

How does the width selection affect the trend? (Pretty much the same as describing the bias variance tradeoff)

If the width is large, the estimated trend is smooth but may miss real features (bias). If the width is small, the estimate follows the data closely but is not smooth (variance).

Method of Moments

In English: you basically substitute the sample moments (mean, variance, etc.) for the parameters of whatever parametric distribution you want (gamma, Poisson) to estimate what that distribution should look like. Formal definition: let X be a random variable following some distribution. The kth moment of the distribution is defined as µ_k = E(X^k). For example, µ_1 = E(X) and µ_2 = Var(X) + E(X)². The sample moments of observations X_1, X_2, ..., X_n, independent and identically distributed (iid) from some distribution, are defined as µ̂_k = (1/n) Σ_{i=1}^n X_i^k. For example, µ̂_1 = X̄ is the familiar sample mean, and µ̂_2 = σ̂² + X̄², where σ̂ is the standard deviation of the sample. The method of moments estimator simply equates the moments of the distribution with the sample moments (µ_k = µ̂_k) and solves for the unknown parameters. Note that this implies the distribution must have finite moments.
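
As a worked example, a minimal sketch of method-of-moments estimation for a gamma distribution; the parameter values and sample are illustrative. Matching mean = a/b and variance = a/b² to the sample moments gives b̂ = mean/var and â = mean²/var:

```r
# Minimal sketch: method of moments for a gamma(shape = a, rate = b) distribution.
set.seed(1)
x <- rgamma(5000, shape = 3, rate = 2)
m <- mean(x); v <- var(x)
b_hat <- m / v       # matches b = mean / variance; ~ 2 here
a_hat <- m^2 / v     # matches a = mean^2 / variance; ~ 3 here
c(a_hat, b_hat)
```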

Seasonality is an _________ component

In the basic time series model, discussed in the previous slides, seasonality is an additive component and can be estimated from the detrended data if there is also a trend component.

R-squared

In the output, we also have the R-squared, also called the coefficient of determination, which indicates how much of the variability is explained by the predicting variables in the linear model. For the example ANOVA-type model in the class, the R-squared is very large (99.7%), indicating that seasonality explains most of the variability in the monthly average temperature over the years.

Durbin-Levinson Algorithm

In this algorithm, we obtain the coefficients in the α_n vector, which are used to compute the prediction and the prediction error. The algorithm does not require inverting the matrix Γ_n, but it does require computing the coefficient vectors α_1, α_2, and so on, recursively, up to the α_n vector.
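
A minimal sketch of the recursion; the function name is an assumption, and the autocovariances γ(0), ..., γ(n) must be supplied by the user:

```r
# Minimal sketch of the Durbin-Levinson recursion (illustrative helper, not course code).
# gamma = c(gamma(0), gamma(1), ..., gamma(n)); returns alpha_n and the prediction error.
durbin_levinson <- function(gamma) {
  n <- length(gamma) - 1
  phi <- numeric(n)
  v <- gamma[1]                                    # v_0 = gamma(0)
  for (k in 1:n) {
    if (k == 1) {
      phi_kk <- gamma[2] / gamma[1]
    } else {
      phi_kk <- (gamma[k + 1] - sum(phi[1:(k - 1)] * gamma[k:2])) / v
      phi[1:(k - 1)] <- phi[1:(k - 1)] - phi_kk * rev(phi[1:(k - 1)])
    }
    phi[k] <- phi_kk
    v <- v * (1 - phi_kk^2)                        # update the prediction error variance
  }
  list(alpha_n = phi, pred_error_var = v)
}

# Check on AR(1) autocovariances with phi = 0.5: first coefficient ~ 0.5, rest ~ 0.
durbin_levinson(0.5^(0:3) / (1 - 0.5^2))
```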

What does the width selection in moving average reflect?

It reflects the bias-variance tradeoff.

How does Seasonal Averaging Work

It takes all time series values corresponding to each seasonal group k, from 1 to d, and averages them to get the w_k.

What is a realization?

It's one observed instance: one row, one observation. It's the small portion of a distribution that you actually observe. If a distribution is a column, the realization is the row.

How do we observe multivariate data in terms of a distribution?

Joint, marginal, and conditional distribution

In the following R command, the item being predicted is on the left or right? lm(temp~x1+x2)

Left. We are predicting the time series. In R, we generally have the output variable, in this case temp, followed by a "~" (tilde) and then the variables predicting it.

What is Consistency in Statistics?

Let θn be an estimate of θ based on a sample of size n. θn is consistent in probability if θn converges in probability to θ; that is, for any ε > 0, P(|θn − θ| > ε) → 0 as n → ∞. In plain terms, the estimate concentrates ever more tightly around the true value as the sample size grows; the mathematical statement is just above.
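
A tiny demonstration under assumed normal data: the sample mean gets closer to the true mean (2 here) as n grows:

```r
# Minimal sketch: consistency of the sample mean (illustrative normal data).
set.seed(1)
sapply(c(10, 1000, 100000), function(n) mean(rnorm(n, mean = 2)))
# the three estimates approach 2 as the sample size increases
```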

Why is local polynomial regression preferred over kernel regression?

Local polynomial smoothing usually performs better than kernel regression on the boundaries and is not biased by the selected design of the time points.

Trend

Long term increase or decrease in the data over time.

Statistical Inference refers to:

Making statistical statements about an unknown parameter of a distribution, for example, if it is larger than a given value.

Statistical Estimation refers to:

Obtaining an approximation of the parameter of a distribution given the data.

Expectation in statistics?

The expectation E(X) is defined as the sum of the products of all values with their respective probabilities. This definition is valid for discrete distributions and, with the sum replaced by an integral, for continuous distributions. The expectation is synonymous with the mean value.

Describe the IID process

The IID process has independent observations, constant variance, and constant mean = 0. Both the IID and white noise processes are stationary, since conditions 1, 2, and 3 hold, assuming that σ² (sigma squared) is finite.

What is the classic transformation of poisson to normal? What does this also do?

The classic transformation used for count data is the square root of (count + 3/8). It also stabilizes the variability.
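
A quick demonstration; the lambda values are illustrative. The raw Poisson variance grows with lambda, while the transformed variance stays near 1/4:

```r
# Minimal sketch: sqrt(count + 3/8) roughly stabilizes the variance of Poisson counts.
set.seed(1)
for (lambda in c(2, 10, 50)) {
  x <- rpois(100000, lambda)
  cat(lambda, var(x), var(sqrt(x + 3/8)), "\n")   # raw variance grows; transformed ~ 0.25
}
```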

How do you deseasonalize or remove the season using the seasonal averaging method?

Take the average of all values belonging to a given seasonal group k, aggregated over every occurrence of that season; call that w_k. Then take the overall average of the w_k; call that w_a. The ratio w_k / w_a is a seasonal index. Dividing each raw value of the time series by the index of the season it belongs to yields a deseasonalized time series. (Under the additive decomposition used in the course, the analogous step is to subtract the centered seasonal average w_k − w_a from each observation.)
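
A minimal sketch of the index method; the function and variable names are illustrative assumptions:

```r
# Minimal sketch: deseasonalize by dividing each value by its seasonal index.
deseasonalize <- function(y, d = 12) {
  season <- rep_len(1:d, length(y))   # seasonal group of each observation
  w <- tapply(y, season, mean)        # w_k: average within each seasonal group
  index <- w / mean(w)                # w_k / w_a, the seasonal index
  y / index[season]                   # divide raw values by their season's index
}
```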

But why do we need yet another set of statistical modeling tools to model time series data?

The main reason is that time series response data are correlated. This correlation results in a much smaller effective number of degrees of freedom than would otherwise be assumed under independence. Moreover, because of the correlation, the data concentrate on a smaller part of the probability space, where the data align. Ignoring dependence leads to inefficient estimates of the parameters in a model, to poor predictions, and to unrealistically small standard errors; in other words, to narrow confidence intervals and thus improper statistical inferences.

The stationarity property of a time series means that

The probability distribution of the time series process does not change when shifted in time. The mean of the time series process is constant over time. The variability in the time series process is finite.

What is the fourth moment when estimating a (sample of) distribution?

The sample kurtosis

What is the first moment when estimating a (sample of) distribution?

The sample mean

What is the third moment when estimating a (sample of) distribution?

The sample skewness

What is the second moment when estimating a (sample of) distribution?

The sample variance

What is a caution of doing model selection on a time series?

The variables are correlated.

What is the difference between white noise process and IID process?

The white noise process assumes uncorrelated data. That means Xt are uncorrelated, whereas the IID process assumes independence.

An ARMA model can include both or just one of the two components: it can simply be an AR process or an MA process. True or False?

True
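
For illustration, simulating all three cases with base R's arima.sim; the orders and coefficients are arbitrary choices:

```r
# Minimal sketch: an ARMA model with both components, or just one of them.
set.seed(1)
x_arma <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 200)  # ARMA(1,1)
x_ar   <- arima.sim(model = list(ar = 0.5), n = 200)            # pure AR(1)
x_ma   <- arima.sim(model = list(ma = 0.4), n = 200)            # pure MA(1)
```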

Seasonality can be estimated by fitting a linear regression model with seasonality represented by a categorical variable. True or False?

True

The summary R command provides information about the model fit, True or False?

True

The autocovariance function of a random walk depends on the particular time values s and t, and not just on the time separation or lag. True or False?

True

var(U) = cov(U,U). True or False

True

The autocovariance function can be defined for a stationary process. True or False?

True.

There are multiple approaches that can be used to estimate the seasonality. True or False?

True.

Trend in a time series can be estimated using the following approach:

Using non-parametric regression with time being the predictor if we assume no given shape to the trend. Using multiple linear regression if we assume a linear or polynomial trend. Using nonlinear regression if we assume a known nonlinear trend up to a set of unknown parameters.

How do you decompose a joint distribution?

We can decompose the joint distribution into the product of the conditional distribution of x given y and the marginal distribution of y: f(x, y) = f(x|y) f(y), where f(x|y) denotes the conditional density. This is also equal to the product of the conditional distribution of y given x and the marginal of x: f(x, y) = f(y|x) f(x).

How do you eliminate seasonality when there is no trend?

We can remove the seasonality using two approaches: 1) estimate the seasonality and remove it, or 2) difference the data at lag d, which directly removes the seasonality (see the sketch below). To estimate the seasonality, we can use seasonal averaging or use parametric regression in two ways.
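
A minimal sketch of the differencing approach; the monthly periodicity is illustrative:

```r
# Minimal sketch: lag-d differencing removes the seasonal component directly.
y <- 3 * sin(2 * pi * (1:120) / 12) + rnorm(120)  # seasonal pattern, no trend
dy <- diff(y, lag = 12)                           # X_t - X_{t-12}
plot.ts(dy)                                       # the periodic pattern is gone
```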

In order to account for trend and seasonality or periodicity, we decompose a time series into three components mt, st, and Xt. What do those components mean?

m_t is the trend, s_t is the seasonality, and X_t is the time series process after accounting for trend and seasonality. X_t is often assumed to be a stationary component. Most classic time series models assume that the time series process is stationary, and thus we need to first estimate the trend and seasonality components, subtract them from the time series Y_t, and then model X_t.

Is the moving average a non parametric approach?

Yes.

What is parametric statistics?

You observe realizations from a set of random variables with density function f(x; theta), where the parameter theta is assumed unknown.

In the class, for Trend Estimation, we compare a fitted trend with a constant line using the ___________ command in R

abline

A stationary process is defined by its

autocovariance or autocorrelation function

If all the ACF values are positive, there is some ___________ in the estimation.

bias

A ____ kernel gives a moving average estimator

box

The seasonal means model treats seasonality as a ____________ type variable.

categorical

cyclical trend

data exhibits rises and falls that are not of a fixed period

What is estimation in parametric statistics?

Evaluate the unknown parameter theta using the set of observations x1, ..., xn and the distribution of the random variables we observe from. You're basically estimating the unknown part of the density function. You'll use approaches like the method of moments and maximum likelihood estimation to solve this in a time series.

periodicity

exact repetition in a regular pattern (seasonal series are often called periodic, although they do not exactly repeat themselves)

In the ANOVA model, the adjusted R-squared increases with the addition of the other set of harmonics, an indication that perhaps including higher-frequency harmonics improves the ____________________.

fit of the seasonality

monotonically increasing

formal def: for all real x1 < x2, f(x1) ≤ f(x2).

Xt is weakly stationary if ...

it has constant mean for all time points t, it has a finite variance (or more specifically, a finite second moment), and the covariance function does not change when shifted in time; that is, the dependence between X_r and X_s is the same as for the shifted X_{r+t} and X_{s+t}.

Seasonality

influenced by seasonal factors (e.g. quarter of the year, month, or day of the week).

Local Polynomial Regression

involves fitting a polynomial within a moving window at each time point.

One of the limitations of using the standard linear regression model under normality to model count data

is that the variance of the error terms is non-constant, and thus a departure from the model assumptions.

Moving Average is a particular case of ______________

kernel regression

The moving average is a particular case of _______________

kernel regression when we use a kernel with equal weights.
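
A minimal sketch with base R's ksmooth; the bandwidth is an illustrative choice:

```r
# Minimal sketch: a box kernel in kernel regression reproduces a moving average.
t <- 1:100
y <- 0.05 * t + rnorm(100)
ma <- ksmooth(t, y, kernel = "box", bandwidth = 10, x.points = t)
plot(t, y); lines(ma, col = "red")   # equal-weight local averaging
```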

The _________ command is used for local polynomial regression

loess

In the Anova Type model for seasonal estimation, the estimated regression coefficients provide the _______ for each individual seasonal group.

mean

What are some examples of classic estimators in time series estimation?

mean, variance, skewness, kurtosis

overall monotonically increasing

means that the trend is increasing in general; there are no continual drops in the trend.

Professor Serban uses the _______ library and the ______ command for splines regression where ______ denotes the splines around the time series.

mgcv, gam, s
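
A minimal usage sketch; the simulated data are illustrative:

```r
# Minimal sketch: splines regression for the trend with mgcv's gam and s().
library(mgcv)
t <- 1:200
y <- sin(t / 30) + rnorm(200, sd = 0.3)
fit <- gam(y ~ s(t))                 # s() puts a spline basis over time
plot(t, y); lines(t, fitted(fit), col = "blue")
```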

Name some other methods of nonparametric regression besides kernel regression and local polynomial regression.

non-parametric regression using splines, wavelets, or orthogonal basis functions.

A time series is a linear process if it can be represented as a linear combination ...

of a white noise process, where the coefficients in the linear combination have the property that the sum of their absolute values is finite.

Dependence

positive (successive observations are similar) or negative (successive observations are dissimilar)

The observed values of a stochastic process are referred to as a __________ of the stochastic process

realization

In general, a collection of random variables, {x_t}, indexed by t, is referred to as a

stochastic process

Stationarity

the probability distribution does not change when shifted in time.

Two classic examples of stationary processes are

the white noise process and the IID process.

What do MLE and Method of Moments do?

They help answer the question: which are the best parameters/coefficients for my model?

Heteroskedasticity

varying variance with time.

The Innovations Algorithm

In this algorithm, we do not obtain the coefficients in the α_n vector; instead, we get the predictions directly. We first compute the elements of the K matrix, in which the (i, j) element is the expectation of X_i times X_j. This is a one-time computation. Then we get the one-step prediction using a formula involving a weighted sum of the differences between past data, X_{n+1−j}, and their predictions; those differences are also called innovations, hence the name of the algorithm. The weights are the θ_{n,j}, which are functions of the elements in the K matrix.
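
A minimal sketch of the recursion; the function name is an assumption, and K must be the (n+1) × (n+1) matrix with (i, j) entry E(X_i X_j):

```r
# Minimal sketch of the innovations algorithm for one-step prediction (illustrative code).
innovations <- function(K, x) {
  n <- length(x)
  v <- numeric(n + 1); v[1] <- K[1, 1]      # v[1] stores v_0
  theta <- matrix(0, n, n)
  xhat <- numeric(n + 1)                    # xhat[1] = 0: predictor of X_1
  for (m in 1:n) {
    for (k in 0:(m - 1)) {                  # fill theta_{m, m-k}
      s <- 0
      if (k >= 1) for (j in 0:(k - 1))
        s <- s + theta[k, k - j] * theta[m, m - j] * v[j + 1]
      theta[m, m - k] <- (K[m + 1, k + 1] - s) / v[k + 1]
    }
    j <- 0:(m - 1)
    v[m + 1] <- K[m + 1, m + 1] - sum(theta[m, m - j]^2 * v[j + 1])
    xhat[m + 1] <- sum(theta[m, 1:m] * (x[m:1] - xhat[m:1]))  # weighted innovations
  }
  list(prediction = xhat[n + 1], mse = v[n + 1])
}
```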

To define the best linear predictor.

we minimize the mean squared error with respect to these coefficients.
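
A minimal sketch: with illustrative AR(1) autocovariances, minimizing the mean squared error reduces to solving the linear system Γ_n a = γ_n:

```r
# Minimal sketch: coefficients of the best linear predictor of X_{n+1} from X_1, ..., X_n.
n <- 5; phi <- 0.6
gam <- phi^(0:n) / (1 - phi^2)        # gamma(0), ..., gamma(n) for an AR(1) process
Gamma_n <- toeplitz(gam[1:n])         # covariance matrix of (X_1, ..., X_n)
a <- solve(Gamma_n, gam[2:(n + 1)])   # solve Gamma_n a = (gamma(1), ..., gamma(n))'
a                                     # ~ (0.6, 0, ..., 0): only the most recent value matters
```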

Periodicity

when rises and falls are more or less exactly consistent in both magnitude and interval. It's pretty rare to observe this without some manual manipulation being present (e.g. a factory follows a strict schedule of how much output is produced each day of the week)

Cyclicity

when the data does exhibit patterns of rise and fall, but the lengths of the time intervals aren't necessarily consistent. Typically it is exhibited across a broader time scale than seasonality. To continue the rain example, El Nino occurs every several years and results in higher overall rain levels, on top of typical seasonality.

Seasonality

when you observe consistent trends in the data that occur at approximately the same time intervals. An example would be rainfall in a region, where the amount is expected to be directly tied to time of year, but the exact magnitude at various times can still differ a fair bit.

Linear predictors

which are linear combinations of past data.

what does as_vector do in R?

It takes a matrix and stacks its columns into one vector.

Bias

θn is unbiased if E(θn) = θ: on average, the estimator equals the true parameter. If E(θn) ≠ θ, the estimator is biased.

