LAB – 9
ARMA Model fitting and Forecasting
STA - 631
JEREMY MATHEW JOSE
2240217
Introduction
Time series analysis is a statistical approach for modeling and forecasting future values based on
historical data. The ARMA (AutoRegressive Moving Average) model is commonly applied to
stationary time series, incorporating both past observations (AR) and past errors (MA) to
enhance predictive accuracy. This study aims to fit an ARMA model to the AirPassengers
dataset, perform residual diagnostics, and assess its forecasting effectiveness.
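For reference, the ARMA(p, q) model can be written in standard textbook notation as below. The symbols are not defined elsewhere in this report and are introduced here only for illustration: X_t is the stationary series, c a constant, phi_i the AR coefficients, theta_j the MA coefficients, and epsilon_t white noise with variance sigma^2.

```latex
X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j},
\qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2)
```

In this notation a pure MA(5) model with zero mean corresponds to p = 0, q = 5 and c = 0.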
Objective:
Choose a suitable time series dataset to fit a predictive ARMA model. Perform a residual
analysis to assess model adequacy. Evaluate whether the model is suitable for future forecasting.
Illustrate three-step-ahead forecasts for the fitted ARMA model using the forecast package in R.
CODE:
library(forecast)
## Warning: package 'forecast' was built under R version 4.2.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(tseries)
## Warning: package 'tseries' was built under R version 4.2.3
data("AirPassengers")
ts_data <- log(AirPassengers)
plot(ts_data, main="Log of AirPassengers Data",
ylab="Log(Passengers)", xlab="Year")
# Inference: The log transformation is applied to stabilize the
# variance. The plot shows the trend and seasonality in the
# series.
adf.test(ts_data)
## Warning in adf.test(ts_data): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: ts_data
## Dickey-Fuller = -6.4215, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
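# Inference: The ADF test gives a strongly negative test statistic
# (-6.4215) with a p-value of at most 0.01, rejecting the null
# hypothesis of a unit root. Because adf.test() includes a constant
# and a linear trend, this indicates stationarity around a trend
# rather than in level, so the series is differenced before fitting.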
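A possible cross-check, not part of the original analysis, is to test the differenced series that is actually modeled. The sketch below assumes the KPSS test from tseries is an acceptable complement to ADF; the names `log_ap`, `dlog_ap`, `adf_diff` and `kpss_diff` are introduced here for illustration.

```r
library(tseries)

data("AirPassengers")
log_ap  <- log(AirPassengers)
dlog_ap <- diff(log_ap)   # same series as data_ts in the lab

# ADF: the null hypothesis is a unit root (non-stationarity)
adf_diff <- adf.test(dlog_ap)

# KPSS: the null hypothesis is level stationarity, so the roles are reversed
kpss_diff <- kpss.test(dlog_ap, null = "Level")

print(adf_diff)
print(kpss_diff)
```

Agreement between the two tests (ADF rejects, KPSS does not) would be stronger evidence of stationarity than either test alone.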
data_ts <- diff(ts_data)
fit <- auto.arima(data_ts, seasonal=FALSE, approximation=FALSE,
stepwise=FALSE)
summary(fit)
## Series: data_ts
## ARIMA(0,0,5) with zero mean
##
## Coefficients:
##          ma1      ma2      ma3      ma4     ma5
##       0.1302  -0.0032  -0.3010  -0.6104  0.2508
## s.e.  0.0868   0.0670   0.0503   0.0608  0.0809
##
## sigma^2 = 0.008727: log likelihood = 137.03
## AIC=-262.07 AICc=-261.45 BIC=-244.29
##
## Training set error measures:
##                      ME       RMSE       MAE MPE MAPE     MASE        ACF1
## Training set 0.02077617 0.09176947 0.0719861 NaN  Inf 2.085115 -0.01829788
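# Inference: auto.arima() selects ARIMA(0,0,5) with zero mean, i.e. a
# pure moving average model of order 5, MA(5), for the differenced
# log series. It has the lowest AICc (-261.45) among the non-seasonal
# candidates searched. AIC (-262.07) and BIC (-244.29) are meaningful
# only relative to other models fitted to the same data.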
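To see how the selected model compares with simpler alternatives, the information criteria can be tabulated side by side. A minimal sketch, assuming a few hand-picked candidate orders (the `candidates` list is hypothetical and not from the original lab):

```r
library(forecast)

data("AirPassengers")
data_ts <- diff(log(AirPassengers))

# Hypothetical candidate (p, q) orders, chosen only for illustration
candidates <- list(c(1, 0), c(0, 1), c(1, 1), c(0, 5))

# Fit each ARMA(p, q) without a mean term (matching the zero-mean
# auto.arima fit) and collect the information criteria
comparison <- do.call(rbind, lapply(candidates, function(pq) {
  m <- Arima(data_ts, order = c(pq[1], 0, pq[2]), include.mean = FALSE)
  data.frame(p = pq[1], q = pq[2], AIC = m$aic, AICc = m$aicc, BIC = m$bic)
}))

# Smaller values indicate a better trade-off between fit and complexity
print(comparison[order(comparison$AICc), ])
```

The criteria are comparable here because every candidate is fitted to the same differenced series.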
res <- residuals(fit)
par(mfrow=c(2,2))
plot(res, main="Residuals of ARMA Model", ylab="Residuals",
xlab="Time")
# Inference: Residuals appear randomly scattered around zero,
# suggesting no visible autocorrelation or trends.
acf(res, main="ACF of Residuals")
# Inference: The ACF plot shows that residuals have little to no
# autocorrelation, suggesting that the model has captured the
# short-term dependence well.
qqnorm(res); qqline(res)
# Inference: The QQ plot shows that residuals approximately follow a
# normal distribution, though slight deviations at the tails are
# possible.
hist(res, main="Histogram of Residuals", xlab="Residuals",
col="lightblue", breaks=20)
# Inference: The histogram is consistent with the QQ plot, supporting
# approximate normality of the residuals.
Box.test(res, type="Ljung-Box")
##
## Box-Ljung test
##
## data: res
## X-squared = 0.04889, df = 1, p-value = 0.825
# Inference: The high p-value (0.825) indicates no significant
# autocorrelation in the residuals at lag 1 (the Box.test default,
# hence df = 1). This supports model adequacy at that lag only.
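A wider residual check can be sketched as follows, assuming 24 lags (two seasonal cycles of monthly data, an arbitrary choice) and `fitdf = 5` to account for the five estimated MA coefficients. The model is refitted with `Arima()` so the block runs on its own; `lb` is a name introduced here.

```r
library(forecast)

data("AirPassengers")
data_ts <- diff(log(AirPassengers))

# Refit the reported ARIMA(0,0,5) with zero mean
fit <- Arima(data_ts, order = c(0, 0, 5), include.mean = FALSE)
res <- residuals(fit)

# Ljung-Box over 24 lags; degrees of freedom become 24 - 5 = 19
lb <- Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 5)
print(lb)
```

Since the model was fitted with seasonal=FALSE on monthly data, lag 12 is a natural place to look for leftover autocorrelation.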
shapiro.test(res)
##
## Shapiro-Wilk normality test
##
## data: res
## W = 0.98495, p-value = 0.12
# Inference: The Shapiro-Wilk p-value (0.12) is above 0.05, so
# the null hypothesis of normality is not rejected; the residuals
# do not deviate significantly from normality.
ar1_fit <- Arima(data_ts, order=c(1,0,0))
forecast_vals <- forecast(ar1_fit, h=3)
par(mfrow=c(1,1))
plot(forecast_vals, main="Three-Step-Ahead Forecast for AR(1) Model")
# Inference: The plot shows a three-step-ahead forecast from a
# separately fitted AR(1) model (not the ARIMA(0,0,5) selected
# above), with 80% and 95% prediction intervals on the differenced
# log scale.
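To express the forecasts in passenger counts (thousands per month), the differencing and log transform can be undone. A minimal sketch, assuming point forecasts only; interval back-transformation is not shown, and the names `last_log`, `log_level_fc` and `passengers_fc` are introduced here for illustration.

```r
library(forecast)

data("AirPassengers")
log_ap  <- log(AirPassengers)
data_ts <- diff(log_ap)

ar1_fit <- Arima(data_ts, order = c(1, 0, 0))
fc <- forecast(ar1_fit, h = 3)

# Undo the differencing: add cumulative forecast changes to the
# last observed log value
last_log <- as.numeric(tail(log_ap, 1))
log_level_fc <- last_log + cumsum(as.numeric(fc$mean))

# Undo the log transform to get approximate passenger counts
passengers_fc <- exp(log_level_fc)
print(passengers_fc)
```

Exponentiating a log-scale point forecast gives a value closer to the median than the mean of the forecast distribution, so these numbers are approximate.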
Conclusion:
The ARIMA(0,0,5) model captures the short-term dependence in the log-transformed and
differenced AirPassengers dataset. The residual diagnostics (Ljung-Box p = 0.825 at lag 1,
Shapiro-Wilk p = 0.12) show no evidence against uncorrelated, approximately normal residuals,
which supports its use for short-term forecasting. The three-step-ahead forecast was produced
with a separately fitted AR(1) model on the differenced log scale, so it illustrates the
forecasting procedure rather than the forecasts of the selected ARIMA(0,0,5) model.