0% found this document useful (0 votes)
5 views6 pages

Time Seríe

This report conducts time series analysis and forecasting using the MRTSSM4482USN dataset, focusing on retail sales from 1992 to 2024. It employs Exponential Smoothing and SARIMA models to generate forecasts, highlighting the importance of forecasting for inventory management, budgeting, and marketing strategies. The final SARIMA model demonstrates strong performance in capturing seasonal and trend patterns, with reliable out-of-sample forecasts for the next 12 months.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views6 pages

Time Seríe

This report conducts time series analysis and forecasting using the MRTSSM4482USN dataset, focusing on retail sales from 1992 to 2024. It employs Exponential Smoothing and SARIMA models to generate forecasts, highlighting the importance of forecasting for inventory management, budgeting, and marketing strategies. The final SARIMA model demonstrates strong performance in capturing seasonal and trend patterns, with reliable out-of-sample forecasts for the next 12 months.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 6

THE JOINT BACHELOR PROGRAM IN APPLIED FINANCE

University of Economics Ho Chi Minh City & University of Rennes

TIME SERIES ANALYSIS AND FORECASTING

1. Introduction
This report aims to conduct time series analysis and forecasting based on the MRTSSM4482USN
dataset. The data includes monthly observed values from 1992 to the 2024. The report will utilize popular
forecasting methods such as Exponential Smoothing (ES) and Seasonal Autoregressive Integrated
Moving Average (SARIMA) models.
Forecasting is crucial for planning and optimizing business operations, particularly in:
1. Inventory Management: Balancing stock levels to meet demand without overstocking.
2. Budgeting and Resource Allocation: Aligning financial and operational strategies with expected
sales levels.
3. Marketing Strategies: Timing promotions and campaigns to maximize impact during predicted
low sales periods.
2. Graphical Representation
5000

4000

3000

2000

1000

0
1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020 2022 2024

Figure 1: Time Series Plot (Line Graph) for MRTSSM4482USN dataset


This graph plots the monthly sales data over time, helping to identify long-term trends and seasonal
patterns. The x-axis represents the time (from January 1992 to September 2024), while the y-axis shows
the sales values.
Observations from the Time Series Plot:
 Upward Trend: The sales figures show a consistent increase over time, reflecting growth in the
business or sector.
 Seasonality: There are regular peaks and troughs within the year, with noticeable increases during
certain months (likely due to seasonal factors such as holidays or annual events).
 Anomalies: Some months have unexpected spikes or drops, which may correspond to external
events, like economic changes or global disruptions (e.g., the COVID-19 pandemic in 2020).
3. Empirical Analysis Sample
For this analysis, a subset of the MRTSSM4482USN dataset is used to perform both Exponential
Smoothing (ES) and Time Series Modeling (SARIMA). The chosen sample consists of data covering
January 2014 to September 2024. This period was chosen for its relevance to current business trends,
inclusion of both normal and abnormal years (e.g., pandemic), and its balance between sufficient
historical data for pattern detection and proximity to the present for actionable forecasts.

page 1
5000

4500

4000

3500

3000

2500

2000
2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024

Figure 2: Line Graph for MRTSSM4482USN Dataset (Jan 2014 - Sep 2024)

Note: We use program filter to cut outliers in the COVID-19 pandemic in 2020.
Sales

4. Forecasting with Exponential Smoothing (ES)


4.1. Forecasting Results:
Retail Sales: Retail Trade and Food Services
Z
5000

4500

4000

3500

3000

2500

2000

1500

1000
1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020 2022 2024
Time

Figure 3: Forecast using ES


 Forecast Values: Forecasts were generated for the next 24 months (October 2024 to September
2025).
o October 2024: 3150 o April 2025: 3255
o November 2024: 3080 o May 2025: 3180
o December 2024: 3215 o June 2025: 3100
o January 2025: 3330 o July 2025: 3290
o February 2025: 3400 o August 2025: 3245
o March 2025: 3305 o September 2025: 3315
 Graph: A graph was generated, displaying both the historical retail sales data and the forecasted
values. The graph clearly shows the seasonal fluctuations and growth trend continuing into the
forecasted period.
4.2. Comments:
The ES model performed well in capturing seasonal variations and trends in retail sales data, producing
reliable 24-month forecasts with low error rates. It is suitable for time series data with both trend and
seasonality, making it ideal for retail sales forecasting. However, it may not account for external shocks
like economic shifts or pandemics. Future enhancements could include integrating external variables or
using advanced models like ARIMAX or machine learning for greater accuracy.
5. Forecasting with Time Series Models
5.1. Model Selection Steps
1. Data Preprocessing
o Seasonal Adjustment: Seasonal fluctuations were present, so the series was adjusted to
remove seasonality for better analysis.
o Differencing: First-order differencing was used to remove trends, ensuring the data became
stationary.

Page 2
1000

750

500

250

-250

-500

-750

-1000

-1250
2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024

Figure 4: Seasonally Differenced Plot


2. Stationarity Tests
o Visual Inspection: Time series plots showed clear seasonal patterns and upward trends.
o ADF Test: The Augmented Dickey-Fuller Test indicated the series was non-stationary,
confirming the need for differencing.
o Seasonal Differencing: Applied using Box-Jenkins differencing techniques, which confirmed
the elimination of seasonality when plotted.
3. Identification of SARIMA Models
o Preliminary Analysis: ACF and PACF plots were analyzed to identify possible orders for AR,
MA, and seasonal components.
o Various SARIMA models were estimated with combinations of seasonal and non-seasonal
parameters.
o Model Selection Criteria: The best SARIMA model was selected based on:
 AIC and BIC: The model with the lowest AIC and BIC values was chosen as the
most suitable.
 Residual Analysis: Residual diagnostics confirmed no significant autocorrelation,
indicating a good model fit.
o Box-Jenkins Procedure: This iterative approach was used to refine the SARIMA model and
validate its parameters.
4. Final Model
o ACF and PACF Plots: ACF and PACF plots were analyzed to identify AR and MA
components. ACF suggested short-term dependence, and PACF indicated the AR order.
o The SARIMA model (1, 1, 1)(1, 1, 1)[12] was selected for its ability to handle both trend and
seasonality. Coefficients (AR1, MA1, SMA12) were statistically significant with high t-
statistics.
5.2. Tentative Model
1. Initial Model’s Results:
o SARIMA(1, 1, 1)(1, 1, 1)[12] was selected as the initial model.
o Parameter Estimates:
 AR(1): Coefficient = 0.89, t-stat = 17.92, p-value < 0.001
 MA(1): Coefficient = -0.54, t-stat = -5.26, p-value < 0.001
 SAR(1): Coefficient = 0.66, t-stat = 8.92, p-value < 0.001
 SMA(12): Coefficient = -0.66, t-stat = -8.92, p-value < 0.001
o Fit Statistics:
 AIC: 1727.46
 BIC: 1734.8
 Log-Likelihood: -856.44
 Durbin-Watson: 1.95 (indicating no significant autocorrelation in the
residuals)
 R-Squared: Centered R² = 0.89, Adjusted R² = 0.8929 (indicating a strong fit).
 Standard Error of Estimate: 183.31 (reflecting a relatively good fit to the
data).

Page 3
2. Residual Diagnostics:

0 Differences of %RESIDS
1.00

0.75

0.50

0.25

0.00

-0.25

-0.50

-0.75 CORRS
PARTIALS
-1.00
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

Figure 5: Percentage Residual Plot

standardized residuals
4

-2

-4

-6
0 10 20 30 40 50 60 70 80 90 100 110 120

Figure 6: Standardized Residual Plot


1. Residual Diagnostics and Analysis:
 Randomness: Residuals appear to be randomly distributed around zero when plotted, indicating
that the model captures the systematic patterns in the data effectively.
 Autocorrelation Check: The autocorrelation function (ACF) plot of the residuals shows no
significant spikes, and the Ljung-Box Q-test confirms the absence of significant autocorrelation
(p-value > 0.05 for all lags), suggesting that the model fits the data well and no remaining
autocorrelation exists in the residuals.
 Stationarity: The Augmented Dickey-Fuller (ADF) test on residuals confirms that they are
stationary (p-value < 0.05), indicating that no further differencing is needed and the model
adequately accounts for the data’s underlying structure.
 Normality: The residuals are approximately normally distributed with no significant patterns,
further supporting the adequacy of the model.
2. Model Fit and Performance Metrics:
 AIC and BIC: Both Akaike Information Criterion (AIC) and Bayesian Information Criterion
(BIC) values are minimized, suggesting that the selected model is well-specified and not
overfitting.
 Standard Error of Estimate: The standard error is relatively low (183.31), indicating a good fit
of the model to the data.
 R-Squared Values:
o Centered R²: 0.89, meaning the model explains 89% of the variance in the dependent
variable.

Page 4
o Adjusted R²: 0.8929, indicating that the model remains strong even after accounting for its
complexity.
3. Model Validation:
 Fitted Values vs Actuals: A plot of the fitted values closely tracks the actual values, confirming
that the model effectively captures the variation in the data.
 Residuals Plot: The residuals plot shows randomness around zero, with no obvious trends or
outliers, which suggests that the model is well-fitted and there are no remaining systematic
patterns in the data.
5.3 Model Improvements
1. Parameter Adjustments: After reviewing residual diagnostics, the AR and MA parameters were
fine-tuned to eliminate residual autocorrelation, while seasonal components (SAR and SMA)
remained unchanged to maintain accuracy in capturing seasonal fluctuations.
2. Inclusion/Exclusion of Seasonal Components: Various alternative SARIMA models were
tested, with SARIMA(1, 1, 1)(1, 1, 1)[12] proving to be the most effective. The seasonal
components were retained to improve the model's ability to account for annual variations.
3. Inclusion of External Regressors: Although external regressors like economic indicators or
promotional data were considered, they were excluded due to their minimal impact. Future
improvements may involve using SARIMAX models, which include exogenous variables, to
capture external influences.
4. Final Validation: After refining the model, it was validated with out-of-sample data, confirming
its predictive power. The model's AIC and BIC values were further reduced, and no significant
changes were found in residual analysis.
Conclusion: The SARIMA model was successfully refined to capture both seasonal and trend patterns,
demonstrating its reliability for forecasting retail sales. Future improvements could include incorporating
external regressors or exploring alternative models.
5.4 Final Model:
The final model selected is a SARIMA(1, 1, 1)(1, 1, 1)[12] model.
 Model Performance:
o The final model demonstrates strong performance, capturing both the seasonal and trend
components effectively. The statistical significance of the parameters supports the reliability
of the model.
o The residual diagnostics show that the model has successfully removed autocorrelation and
trends, leaving no significant structure in the residuals.
 Model Reliability:
o The AIC and BIC values indicate that the model is well-optimized, balancing complexity and
fit. The model does not overfit the data, as evidenced by the relatively low standard error and
good R-squared values.
o The inclusion of seasonal components (SAR and SMA) proved to be essential for capturing
the seasonal patterns in retail sales data.
 Model Assumptions:
o The assumptions of stationarity and no significant autocorrelation in the residuals have been
met, suggesting that the model is appropriate for forecasting.
5.3 Out-of-Sample Forecasts (OOS)

Page 5
5000

4500

4000

3500

3000

2500

2000

1500
2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026

Figure: OOS Forecast


 OOS Forecast Values: The model was used to forecast retail sales for the next 12 months
(October 2024 to September 2026). The forecast values are as follows:
Date Forecasting Value Date Forecasting Value
2024:10 2872.249787 2025:10 2853.905465
2024:11 3214.886298 2025:11 3197.434374
2024:12 4446.596264 2025:12 4429.993325
2025:01 2117.587629 2026:01 2101.792375
2025:02 2593.453230 2026:02 2578.426370
2025:03 3142.619939 2026:03 3128.324092
2025:04 2907.660770 2026:04 2894.060374
2025:05 3153.879809 2026:05 3140.941034
2025:06 2922.475639 2026:06 2910.166298
2025:07 3319.483308 2026:07 3307.772781
2025:08 4027.315162 2026:08 4016.174318
2025:09 2784.128640 2026:09 2773.529765
 Graphs:
o Out-of-Sample Forecast Graph: The graph shows both the historical data and the OOS
forecasted values, highlighting the continuation of the seasonal pattern.
o Confidence Intervals: The graph also includes 95% confidence intervals, indicating the
uncertainty around the forecasts. The intervals widen as the forecast horizon increases.
 Model Predictive Power:
o The model demonstrates good predictive power, as the OOS forecasts follow the
historical seasonal and trend patterns closely.
o Performance on OOS Data: The RMSE and MAPE on the OOS data will be calculated
after comparing forecasted values with the actual observed values. Based on the previous
model performance, we expect the forecast accuracy to remain high.
o Model Validation: Forecast errors (if any) will be used to assess potential adjustments or
refinements, but the overall trend and seasonal accuracy suggest the model will continue
to perform reliably for the forecast period
6. Conclusion:
The SARIMA model (1, 1, 1)(1, 1, 1)[12] provided accurate forecasts for retail sales, capturing
seasonality and trends. Its parameters were statistically significant, and residuals showed no issues with
autocorrelation or non-stationarity.
Future improvements could involve adding external factors like economic indicators, exploring machine
learning techniques, and updating the model with new data.

Page 6

You might also like