A series that needs to be differenced in order to be made stationary is an “integrated” (I) series.
Lags of the stationarized series are called “autoregressive” (AR) terms, and lags of the forecast errors are called “moving average” (MA) terms.
ARIMA is used mainly for forecasting.
ARIMA is a generalized random walk model that is fine-tuned to eliminate all residual autocorrelation.
It is also a generalized exponential smoothing model that can incorporate long-term trends and seasonality.
It can equally be viewed as a stationarized regression model that uses lags of the dependent variable and/or lags of the forecast errors as regressors.
A time series can be stationarized by using transformations such as differencing, logging and deflating.
A time series is “stationary” if all of its statistical properties, such as mean, variance and autocorrelation, are constant in time.
ARIMA Model CONSTRUCTION
Terms:
ACF = Autocorrelation Function
PACF = Partial Autocorrelation Function
ACF
The autocorrelation function (ACF) gives the correlation of a time series with its own lagged values. Intuitively, a stationary time series is defined by its mean, variance and ACF. A useful result is that any function of a stationary time series is also a stationary time series.
PACF
In time series analysis, the partial autocorrelation function
(PACF) gives the partial correlation of a time series with its own lagged values, controlling for the values of the time series at all shorter lags. It contrasts with the autocorrelation function, which does not control for other lags.
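The two definitions above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `acf` and `pacf` are hypothetical, the ACF uses the usual sample estimator, and the PACF is obtained with the Durbin-Levinson recursion, which is one common way to compute it. The simulated AR(1) series and its coefficient 0.7 are assumptions chosen only for the demonstration.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (normalizer)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelation for lags 0..max_lag.

    Uses the Durbin-Levinson recursion on the sample ACF, so the value
    at lag k is the correlation left after controlling for lags 1..k-1.
    """
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # AR coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        result.append(phi_kk)
    return result


# Demonstration on a simulated AR(1) series (assumed coefficient 0.7).
random.seed(42)
y = [0.0]
for _ in range(1999):
    y.append(0.7 * y[-1] + random.gauss(0.0, 1.0))

print("ACF :", [round(v, 3) for v in acf(y, 4)])
print("PACF:", [round(v, 3) for v in pacf(y, 4)])
```

For an AR(1) series like this one, the ACF typically decays gradually with the lag, while the PACF is close to zero after lag 1, because the shorter lags have already been controlled for.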
Terminology in ARIMA
An ARIMA model can be (almost) completely summarized by three
numbers:
p = the number of autoregressive terms
p is the number of autoregressive terms (AR part). It allows us to
incorporate the effect of past values into our model. Intuitively, this would be similar to stating that it is likely to be warm tomorrow if it has been warm the past 3 days.
d = the number of nonseasonal differences
d is the number of nonseasonal differences needed for
stationarity. Intuitively, this would be similar to stating that it is likely to be the same temperature tomorrow if the difference in temperature over the last three days has been very small.
q = the number of moving-average terms
q is the number of lagged forecast errors in the prediction
equation (MA part). This allows us to set the error of our model as a linear combination of the error values observed at previous time points.
These are the three integers (p, d, q) that are used to parametrize ARIMA models. Hence, this is called an “ARIMA (p, d, q)” model.
ARIMA “filtering box”
Working of ARIMA (p, d, q)
First apply differencing, then fit ARMA (p, q).
Nonseasonal differencing: used for removing trends. Order d = {0, 1, 2}
Order d = how many times to perform lag-1 differencing?
d=0: no differencing (no trends)
d=1: perform differencing once (linear trend)
d=2: double differencing
Seasonal differencing: used for removing seasonality. Order D = {0=none, 1=once}
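The differencing step above can be sketched in a few lines of plain Python. The helper names `difference` and `apply_differencing`, the `season` parameter and the toy series are all hypothetical and chosen only for illustration; the sketch shows that one lag-1 difference flattens a linear trend, two flatten a quadratic one, and one seasonal difference removes a repeating pattern.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def apply_differencing(series, d=0, D=0, season=12):
    """Apply D seasonal differences (lag = season), then d lag-1 differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=season)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


linear = [3 + 2 * t for t in range(8)]   # linear trend
quadratic = [t * t for t in range(8)]    # quadratic trend
seasonal = [10, 20, 30, 40] * 3          # pattern repeating every 4 steps

print(apply_differencing(linear, d=1))               # constant series
print(apply_differencing(quadratic, d=1))            # still trending
print(apply_differencing(quadratic, d=2))            # constant series
print(apply_differencing(seasonal, D=1, season=4))   # all zeros
```

Each difference shortens the series by its lag, which is why the ARMA (p, q) part is fitted to fewer observations than the original series contains.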
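Let us now look at the ARIMA forecasting equation: the predicted value of the stationarized series is a constant plus a weighted sum of its last p values (AR terms) and of the last q forecast errors (MA terms).
Advantages: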
It has a strong underlying mathematical theory, which makes it
easy to compute prediction intervals, and it is flexible in capturing a wide range of time series patterns.
Drawbacks:
There are no explicit seasonal indices, the coefficients are hard to interpret, it is hard to
explain “how the model works”, and there is a danger of overfitting or mis-identification if the model is not used with care.
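The prediction intervals mentioned under Advantages can be illustrated with the simplest ARIMA model, the random walk ARIMA (0, 1, 0). The sketch below is only an illustration under stated assumptions: no drift, normally distributed errors, and a hypothetical helper name `random_walk_intervals`. Under those assumptions the forecast for every horizon h is the last observed value, and the 95% interval half-width is 1.96 times the error standard deviation times the square root of h.

```python
import math


def random_walk_intervals(series, horizon, z=1.96):
    """Forecasts and prediction intervals for a driftless random walk.

    Returns a list of (forecast, lower, upper) tuples for h = 1..horizon.
    Assumes zero-mean, normally distributed one-step errors.
    """
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    # Error standard deviation estimated from the first differences
    # (mean assumed to be zero, i.e. no drift).
    sigma = math.sqrt(sum(e * e for e in diffs) / len(diffs))
    last = series[-1]
    return [
        (last, last - z * sigma * math.sqrt(h), last + z * sigma * math.sqrt(h))
        for h in range(1, horizon + 1)
    ]


observed = [0, 1, 0, 1, 0]  # toy data: every step is +1 or -1
for h, (point, low, high) in enumerate(random_walk_intervals(observed, 4), 1):
    print(f"h={h}: forecast={point}, interval=({low:.2f}, {high:.2f})")
```

The intervals widen as the horizon grows, which reflects the accumulating uncertainty of an integrated series.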