Prac III (1) - VVJ

The document presents an analysis of unemployment time series data from July 1975 to September 1979 using SARIMA modeling techniques. It details the identification of seasonal patterns, the selection of model parameters, and the evaluation of model adequacy through residual analysis. Four SARIMA models are compared: SARIMA (2,1,0) × (1,1,0)12 has the lowest residual variance, while SARIMA (1,1,0) × (1,1,0)12 has the lowest AIC and BIC and is selected as the best fit.

SHIVAJI UNIVERSITY KOLHAPUR

DEPARTMENT OF STATISTICS
M. Sc-II Sem.-IV Statistics
Modeling and analysis of univariate time series
Name: Vaishnavi Vishwas Jadhav Practical No:
PRN: 2023000320 Date

Q.1
Aim: To analyze the time series using ARIMA/SARIMA techniques and to check the adequacy of
the fitted model through residual analysis.

Solution:

The time series plot of the monthly unemployment data from July 1975 to September 1979 is as
follows:

[Figure: Time Series Plot of UNEMPLYD; y-axis: UNEMPLYD (0 to 120000), x-axis: Index (1 to 50)]

The unemployment data exhibit a strong seasonal pattern with regular peaks, but no clear
long-term increasing or decreasing trend. The regular repetition suggests a predictable
seasonal effect, and no major outliers disturb the pattern.
ACF -

[Figure: Autocorrelation Function for UNEMPLYD, lags 1 to 13, with 5% significance limits for
the autocorrelations]

The ACF plot shows significant autocorrelations at multiple lags. A strong positive
autocorrelation at lag 1 suggests that unemployment in one period depends heavily on the
previous period. The positive autocorrelation at lag 12 suggests a cycle that repeats every 12
periods, which for monthly data is a yearly cycle.
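As a rough illustration of how such a plot is read, the following Python sketch computes sample autocorrelations and compares them with a simple large-sample 5% band of ±1.96/√n. The band formula is an assumption made for simplicity (statistical packages may use lag-dependent limits), and the series is a hypothetical 12-period sine wave of length 51, not the UNEMPLYD data.

```python
import math

def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_1 .. r_max_lag of a sequence."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    acf = []
    for k in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t + k] - mean)
                  for t in range(n - k))
        acf.append(num / denom)
    return acf

# Hypothetical data: a pure 12-period cycle with 51 points (same length
# as the series in this practical, but NOT the actual observations).
n = 51
toy = [math.sin(2 * math.pi * t / 12) for t in range(n)]

acf = sample_acf(toy, 13)
limit = 1.96 / math.sqrt(n)  # simple large-sample band (an assumption)

for lag, r in enumerate(acf, start=1):
    flag = "significant" if abs(r) > limit else ""
    print(f"lag {lag:2d}: r = {r:+.3f} {flag}")
```

For a series with a 12-period cycle, the autocorrelation at lag 12 is strongly positive and the one at lag 6 is strongly negative, which is the kind of signature looked for in the ACF above.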

PACF -

[Figure: Partial Autocorrelation Function for UNEMPLYD, lags 1 to 13, with 5% significance
limits for the partial autocorrelations]

The PACF plot shows a strong positive spike at lag 1 and a moderate negative spike at lag 2.
The positive lag 1 spike indicates that the current value is positively correlated with the
previous value. The negative lag 2 spike suggests a possible oscillatory pattern: the value at
time t is positively related to the value at t−1 but, after allowing for it, negatively related
to the value at t−2.

The time series plot and the ACF clearly show seasonality with a 12-month cycle, so a SARIMA
model is more suitable than a non-seasonal ARIMA model.

Parameters for SARIMA model:

The SARIMA model is specified by six orders and a seasonal period, written (p, d, q) × (P, D, Q)s.

p: The PACF cuts off after lag 2, indicating an autoregressive (AR) component of order 2 (p = 2).

d: The series may contain a trend, so first differencing (d = 1) is applied to achieve
stationarity.

q: The ACF does not exhibit a sharp cutoff, suggesting at most a weak MA component, so q = 0 is
tried first.

P: There is no significant spike at lag 12 in the PACF, suggesting no seasonal AR component
(P = 0).

D: The ACF shows a strong seasonal pattern, indicating the need for first seasonal differencing
(D = 1).

Q: The seasonal behaviour of the ACF suggests a possible seasonal MA effect, but it is weak, so
Q = 0 is tried first.

s: The seasonal period is 12 months, matching the yearly seasonality observed in the ACF.

Model I: SARIMA (2,1,0) × (0,1,0)12

[Figure: ACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
autocorrelations]

[Figure: PACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
partial autocorrelations]

Final Estimates of Parameters

Type          Coef   SE Coef       T      P
AR 1       -0.2151    0.1565   -1.37  0.178
AR 2       -0.3729    0.1564   -2.38  0.023
Constant     855.8     926.2    0.92  0.362
The AR(1) coefficient is not statistically significant at the 5% level (P = 0.178), so there is
little evidence for an AR(1) term. The AR(2) coefficient is significant at the 5% level
(P = 0.023), so the AR(2) component contributes meaningfully. The constant is not significant
(P = 0.362).
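The T column in the table is the ratio of each coefficient to its standard error. The short Python sketch below recomputes it from the reported values; the cut-off of 2 used to flag significance is only a rule of thumb assumed here, not the exact t-test behind the P column.

```python
# Coefficients and standard errors copied from the Model I table.
estimates = {
    "AR 1": (-0.2151, 0.1565),
    "AR 2": (-0.3729, 0.1564),
    "Constant": (855.8, 926.2),
}

def t_ratio(coef, se):
    """T statistic for H0: coefficient = 0."""
    return coef / se

t_values = {name: round(t_ratio(c, s), 2) for name, (c, s) in estimates.items()}

# Rule of thumb (an assumption): |T| above about 2 is significant at 5%.
significant = {name: abs(t) > 2.0 for name, t in t_values.items()}

for name in estimates:
    print(name, t_values[name], significant[name])
```

Only AR 2 passes the rule of thumb, in agreement with the P-values in the table.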

Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 51, after differencing 38
Residuals: SS = 1140814193 (back forecasts excluded)
           MS = 32594691  DF = 35

The model applies first-order differencing to remove trend and seasonal differencing of order
12 to address seasonality.
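The observation count and the residual mean square in the output can be checked by hand. The Python sketch below applies one regular and one seasonal difference to a placeholder series of length 51 (the values are hypothetical; only the length matters here) and then recomputes DF and MS from the reported SS, assuming DF is the number of differenced observations minus the three estimated parameters.

```python
def difference(series, lag=1):
    """Apply the lag-`lag` difference: y_t = x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Placeholder values standing in for the 51 monthly observations.
series = list(range(51))

regular = difference(series, lag=1)      # d = 1 drops 1 observation
seasonal = difference(regular, lag=12)   # D = 1 with s = 12 drops 12 more
n_effective = len(seasonal)

n_params = 3                # AR 1, AR 2 and the constant in Model I
df = n_effective - n_params
ss = 1140814193             # residual SS from the output
ms = ss / df

print(n_effective, df, round(ms))
```

This reproduces the 38 observations after differencing, DF = 35 and MS of about 32594691 shown in the output.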

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag             12     24     36     48
Chi-Square    17.5   32.3   36.3      *
DF               9     21     33      *
P-Value      0.042  0.054  0.319      *

The Ljung-Box test rejects the hypothesis of uncorrelated residuals at lag 12 (P = 0.042) and
is borderline at lag 24 (P = 0.054), indicating mild residual autocorrelation. The model may
therefore require further refinement.
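The DF and P-Value rows can be approximately reproduced from the Chi-Square row. The Python sketch below assumes that DF equals the lag minus the number of estimated parameters (three for Model I) and computes the upper-tail chi-square probability with a standard recurrence for integer degrees of freedom. Because the printed statistics are rounded to one decimal, the results should only be expected to be close to the reported P-values.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df.

    Uses Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) * exp(-x/2) / Gamma(v/2 + 1),
    started from the closed forms for v = 1 (erfc) and v = 2 (exp).
    """
    if df % 2 == 1:
        q, v = math.erfc(math.sqrt(x / 2)), 1
    else:
        q, v = math.exp(-x / 2), 2
    while v < df:
        q += (x / 2) ** (v / 2) * math.exp(-x / 2) / math.gamma(v / 2 + 1)
        v += 2
    return q

n_params = 3  # AR 1, AR 2 and the constant in Model I
for lag, stat in [(12, 17.5), (24, 32.3), (36, 36.3)]:
    df = lag - n_params
    print(lag, df, round(chi2_sf(stat, df), 3))
```

A P-value below 0.05 at a given lag means the residual autocorrelations up to that lag are jointly too large to be treated as white noise.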

Conclusion:

The SARIMA (2,1,0) × (0,1,0)12 model captures some seasonal and trend components, but
residual analysis indicates mild autocorrelation at lag 12. The AR(2) term is significant, while
AR(1) and the constant are not. The Ljung-Box test suggests the model may need refinement.
Further improvements can be tested using alternative SARIMA parameter specifications.

Model II: SARIMA (2,1,0) × (0,1,1)12

[Figure: ACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
autocorrelations]

[Figure: PACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
partial autocorrelations]

Final Estimates of Parameters

Type          Coef   SE Coef       T      P
AR 1       -0.0932    0.1635   -0.57  0.572
AR 2       -0.2529    0.1639   -1.54  0.132
SMA 12      0.7342    0.2398    3.06  0.004
Constant     613.9     242.4    2.53  0.016

Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 51, after differencing 38
Residuals: SS = 630089924 (back forecasts excluded)
           MS = 18532057  DF = 34

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag             12     24     36     48
Chi-Square    12.5   30.7   34.3      *
DF               8     20     32      *
P-Value      0.130  0.059  0.360      *

The SARIMA (2,1,0) × (0,1,1)12 model improves on Model I. The seasonal MA term at lag 12 is
significant (P = 0.004), confirming a seasonal effect, and the constant is also significant
(P = 0.016). Neither AR term is significant (P = 0.572 and 0.132), so the autoregressive
component may not be necessary. The residual mean square falls from 32594691 to 18532057, and
the Ljung-Box P-values (0.130, 0.059 and 0.360) exceed 0.05 at every reported lag, although the
lag 24 value is borderline. This model is a reasonable choice, but further refinements could be
explored.

Model III: SARIMA (2,1,0) × (1,1,0)12

[Figure: ACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
autocorrelations]

[Figure: PACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
partial autocorrelations]

Final Estimates of Parameters

Type          Coef   SE Coef       T      P
AR 1        0.2123    0.1722    1.23  0.226
AR 2        0.3117    0.1653    1.89  0.068
SAR 12     -0.9970    0.0738  -13.51  0.000
Constant     405.8     465.8    0.87  0.390

Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 51, after differencing 38
Residuals: SS = 280222757 (back forecasts excluded)
           MS = 8241846  DF = 34

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag             12     24     36     48
Chi-Square     7.9   12.4   12.4      *
DF               8     20     32      *
P-Value      0.443  0.902  0.999      *

The SARIMA (2,1,0) × (1,1,0)12 model improves substantially on the previous two models. The
seasonal AR term at lag 12 is highly significant (P = 0.000), indicating strong seasonal
dependence in the data. The AR(2) term is close to significance at the 5% level (P = 0.068),
suggesting a possible short-term autoregressive effect, while AR(1) (P = 0.226) and the
constant (P = 0.390) are not significant. The residual mean square, 8241846, is the lowest
among the tested models, implying a better fit. The Ljung-Box P-values (0.443, 0.902 and 0.999)
give no evidence of residual autocorrelation, supporting the model's adequacy. This model is
the best so far, but minor refinements could be explored.

Model IV: SARIMA (1,1,0) × (1,1,0)12

[Figure: ACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
autocorrelations]

[Figure: PACF of Residuals for UNEMPLYD, lags 1 to 9, with 5% significance limits for the
partial autocorrelations]

Final Estimates of Parameters

Type          Coef   SE Coef       T      P
AR 1        0.2847    0.1708    1.67  0.104
SAR 12     -0.9959    0.0841  -11.84  0.000
Constant     449.8     480.7    0.94  0.356

Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 51, after differencing 38
Residuals: SS = 306891510 (back forecasts excluded)
           MS = 8768329  DF = 35

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag             12     24     36     48
Chi-Square    11.0   13.6   13.6      *
DF               9     21     33      *
P-Value      0.279  0.886  0.999      *

The SARIMA (1,1,0) × (1,1,0)12 model performs well and captures the seasonal structure
effectively. The seasonal AR term at lag 12 is highly significant (P = 0.000), confirming
strong seasonal dependence. The AR(1) term is not significant at the 5% level (P = 0.104), so
the short-term autoregressive component may not be essential. The residual mean square
(8768329) is slightly higher than that of SARIMA (2,1,0) × (1,1,0)12 (8241846), indicating a
marginally worse fit in that respect. The Ljung-Box P-values (0.279, 0.886 and 0.999) show no
significant residual autocorrelation, supporting the model's adequacy.

Conclusion:

To decide on the best model, we compare the AIC and BIC of the four models. The values obtained
in R are as follows:

Model                             AIC       BIC
SARIMA (2,1,0) × (0,1,0)12   769.5356  774.4483
SARIMA (2,1,0) × (0,1,1)12   758.4213  764.9717
SARIMA (2,1,0) × (1,1,0)12   747.5776  754.1279
SARIMA (1,1,0) × (1,1,0)12   745.9788  750.8916

SARIMA (1,1,0) × (1,1,0)12 has the lowest AIC (745.9788) and BIC (750.8916) of the four models.
The extra AR term in SARIMA (2,1,0) × (1,1,0)12 raises both criteria (747.5776 and 754.1279),
so it does not improve the model enough to justify the added parameter. Hence, SARIMA
(1,1,0) × (1,1,0)12 is the best-fitting model for the given data on the monthly number of
unemployed workers in the building trade in Germany from July 1975 to September 1979.
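The model choice can be reproduced from the tabulated criteria alone. The Python sketch below picks the model with the smallest AIC and BIC and, as a consistency check, backs out the number of parameters k implied by the gap between the two criteria. The check assumes the usual definitions AIC = −2 log L + 2k and BIC = −2 log L + k ln(n) with n = 38 differenced observations; since the original R code is not shown, the parameter counting is an inference rather than a documented fact.

```python
import math

# AIC and BIC values copied from the comparison table.
models = {
    "SARIMA (2,1,0) x (0,1,0)12": (769.5356, 774.4483),
    "SARIMA (2,1,0) x (0,1,1)12": (758.4213, 764.9717),
    "SARIMA (2,1,0) x (1,1,0)12": (747.5776, 754.1279),
    "SARIMA (1,1,0) x (1,1,0)12": (745.9788, 750.8916),
}

best_aic = min(models, key=lambda m: models[m][0])
best_bic = min(models, key=lambda m: models[m][1])

# Under the assumed definitions, BIC - AIC = k * (ln(n) - 2).
n = 38
implied_k = {m: (bic - aic) / (math.log(n) - 2)
             for m, (aic, bic) in models.items()}

print("lowest AIC:", best_aic)
print("lowest BIC:", best_bic)
for m, k in implied_k.items():
    print(m, round(k, 3))
```

Both criteria select SARIMA (1,1,0) × (1,1,0)12, and the implied k values come out very close to 3, 4, 4 and 3. That pattern would be consistent with counting the AR and seasonal coefficients plus the innovation variance, without a constant term.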
