The report focuses on demand forecasting of electric vehicles (EVs) in India using time series analysis. It outlines the objectives, data cleaning, decomposition, smoothing methods, and tests for stationarity, while also exploring the relationships between different fuel types through Granger causality tests. The findings aim to provide insights for policy-makers and industry stakeholders to inform future strategies in the automotive sector.

Report on Mini Project

Time Series Analysis (DJ19DSC5012) AY: 2023-24

Demand Forecasting of Electric Vehicles (EVs) in India

SHOBIT GUPTA 60009220032

SAIYYAM LODAYA 60009220067

SAYANTAN MUKHERJEE 60009220131

Guided By
Prof. Shruti Mathur

TABLE OF CONTENTS

1. Introduction
2. Data Description
3. Objective
4. Data Cleaning
5. Data Decomposition
6. Smoothing Methods
7. Testing Stationarity
8. Justification Why It Is a Time Series Problem
9. Implementation and Interpretation for Forecast
10. Reasons for Selecting the Time Series Model
11. Comparative Result Analysis
12. Google Colab Link
13. Conclusion
14. Future Scope
15. References

CHAPTER 1
INTRODUCTION
The automotive industry in India is undergoing a significant transformation, characterized by the
rapid adoption of various fuel types, particularly Electric Vehicles (EVs). This project delves into
the analysis of trends in vehicle registrations across different fuel types, including Electric Vehicles,
Hybrid, CNG, and Petrol, through the lens of time series methods.

Spanning several years, the dataset provides a comprehensive view of the evolving preferences and
market dynamics within the automotive sector. By leveraging robust statistical and machine
learning models, this study aims to forecast future vehicle registrations. These forecasts are
intended to inform policy-makers, market analysts, and industry stakeholders, enabling them to
make data-driven decisions that will shape the future landscape of the automotive industry in India.

Through this analysis, we seek to uncover the underlying patterns and factors driving the shift
towards more sustainable and alternative fuel vehicles, thereby contributing to a more informed
understanding of the market trends and aiding in the formulation of strategic policies and market
strategies.

CHAPTER 2
DATA DESCRIPTION

The original dataset contained many columns, namely:

1. CNG ONLY
2. DIESEL
3. DIESEL/HYBRID
4. DUAL DIESEL/CNG
5. ELECTRIC(BOV)
6. ETHANOL
7. LNG
8. LPG ONLY……

The dataset consists of:

• Date: timestamp of the recorded observations.
• Fuel Categories: counts for Electric Vehicles (EV), Hybrid, CNG, Petrol, and others.

CHAPTER 3

OBJECTIVE

The primary objectives of this project on demand forecasting of Electric Vehicles (EVs) in India using time series methods are as follows:
1. Data Analysis and Cleaning: To thoroughly examine and preprocess the
dataset, ensuring that it is free from inconsistencies, missing values, and
outliers, thereby preparing it for accurate analysis.
2. Identifying Trends in Data: To detect and analyze the underlying trends
in vehicle registrations across various fuel types (Electric Vehicles,
Hybrid, CNG, Petrol), which will help understand the long-term
movement and patterns.
3. Checking for Seasonality: To investigate the presence of seasonal
patterns in the data, which could indicate periodic fluctuations in vehicle
registrations due to factors such as festivals, new model launches, or
policy changes.
4. Data Smoothing: To apply techniques that smooth the time series data,
thereby reducing noise and making the trends and patterns more
discernible.
5. Model Parameter Identification: To determine the optimal parameters
for the statistical and machine learning models that will be employed for
time series forecasting.
6. Model Fitting: To fit the selected models to the historical data and ensure
they accurately capture the underlying trends and patterns.
7. Accuracy Matrix Calculation: To evaluate the performance of the fitted
models using appropriate accuracy metrics, ensuring that the models are
robust and reliable.
8. Forecasting Future Registrations: To use the fitted models to forecast
future vehicle registrations, providing valuable insights that can inform
policy decisions, market strategies, and industry planning.

CHAPTER 4

DATA CLEANING

Final dataset after cleaning:
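The cleaning steps described in the objectives (parsing timestamps, renaming fuel columns, handling missing values) can be sketched as follows. This is an illustrative sketch, not the report's exact code: the raw column names follow the original dataset, while the renaming and the interpolation choice are assumptions.

```python
import pandas as pd

# Illustrative cleaning sketch; renaming and interpolation are assumptions.
raw = pd.DataFrame({
    "DATE": ["2022-01", "2022-02", "2022-03", "2022-04"],
    "ELECTRIC(BOV)": [100.0, None, 140.0, 160.0],
    "CNG ONLY": [500.0, 520.0, 510.0, 530.0],
})

df = raw.copy()
df["DATE"] = pd.to_datetime(df["DATE"])            # parse timestamps
df = df.set_index("DATE").sort_index()             # time-ordered index
df = df.rename(columns={"ELECTRIC(BOV)": "EV", "CNG ONLY": "CNG"})
df["EV"] = df["EV"].interpolate()                  # fill the missing count
```

After these steps the frame has a clean datetime index and no missing counts, which the decomposition and smoothing chapters below rely on.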

CHAPTER 5
DATA DECOMPOSITION

Any Time Series data can be decomposed into three key components:
1. Trend: The long-term movement in the data over a period of time. This
component shows the general direction in which the data is moving,
whether it's upward, downward, or stagnant.
2. Seasonal: The repeating pattern or cycle in the data at regular intervals,
often driven by seasonal factors such as months, quarters, or even days of
the week. This component helps identify periodic fluctuations.
3. Residual: Also known as the random or irregular component, it
represents the noise in the data that cannot be explained by the trend or
seasonal components. This includes random variations and outliers.
By decomposing the time series data into these three components over a period
of one year, we can gain valuable insights into the seasonality and trends present
in the data. This decomposition helps in isolating the effects of each component,
making it easier to analyze and forecast future patterns accurately.

CHAPTER 6

DATA SMOOTHING
Smoothing is a crucial method in time series analysis used to remove
irregularities in data, allowing for the identification of significant patterns. By
applying smoothing techniques, we aim to reduce noise and highlight the
underlying trends and seasonal components in the data. The primary methods
used for smoothing data are as follows:
1. Simple Exponential Smoothing (SES): This method applies an
exponentially decreasing weight to past observations. It is useful for data
without trends or seasonal patterns. SES forecasts future values as a
weighted average of past observations.
2. Double Exponential Smoothing (DES): Also known as Holt’s method,
DES extends Simple Exponential Smoothing by incorporating a trend
component. This method is particularly effective for data with a linear
trend, providing more accurate forecasts by accounting for the trend's
impact.
3. Triple Exponential Smoothing (TES): Also known as Holt-Winters
method, TES further extends the double exponential smoothing by
including a seasonal component. This technique is ideal for data with both
trend and seasonal variations, allowing for more precise forecasting by
considering all three components: level, trend, and seasonality.
By employing these smoothing techniques, we can enhance the accuracy of our
time series models and improve our ability to forecast future trends in vehicle
registrations.

(SES)

(DES)

(TES)

CHAPTER 7
TESTING STATIONARITY

Stationarity is a fundamental property in time series analysis, referring to a series whose statistical properties, such as mean and variance, remain constant over time. It is crucial to ensure that a time series is stationary because many statistical models assume stationarity for accurate forecasting and analysis.
To test for stationarity, we use the Augmented Dickey-Fuller (ADF) test. The
ADF test helps determine the presence of unit roots in the data, which indicates
non-stationarity. By applying the ADF test, we can assess whether the time
series data needs to be differenced or transformed to achieve stationarity.

Checking for Unit Roots

Checking for unit roots is a crucial step in time series analysis because not all
models are suitable for time series data that contains unit roots. To ensure the
applicability of various models, it is essential to test for stationarity using the
Augmented Dickey-Fuller (ADF) test.

In the ADF test, the null hypothesis states that a unit root is present, indicating
non-stationarity in the time series. We reject the null hypothesis if the p-value
is less than 0.05 and the test statistic is less than the critical values. This
rejection confirms that the time series data is stationary, allowing us to proceed
with modeling and forecasting with confidence.

Granger Causality
Granger Causality is a method used to determine if one time series can
predict another. By testing for Granger Causality, we can identify
whether changes in one variable are a cause of changes in another
over time. This is particularly useful in understanding the
relationships and dependencies between different factors influencing
vehicle registrations.
For instance, in our analysis of vehicle registrations by fuel type, we
might test whether the registration numbers of petrol vehicles
Granger-cause the registrations of electric vehicles. If the test
indicates causality, it suggests that changes in petrol vehicle
registrations could be used to forecast changes in electric vehicle
registrations.
To conduct a Granger Causality test, we typically follow these steps:
1. Formulate the hypothesis, where the null hypothesis states that the time series X does not Granger-cause Y.
2. Estimate a vector autoregressive (VAR) model for the time series data.
3. Use statistical tests to check the significance of the lagged values of X in predicting Y.
4. If the p-value is less than the chosen significance level (usually 0.05), we reject the null hypothesis and conclude that X Granger-causes Y.

Output of the causality tests:


DIESEL

Granger Causality
number of lags (no zero) 1
ssr based F test: F=1.3499 , p=0.2475 , df_denom=123, df_num=1
ssr based chi2 test: chi2=1.3828 , p=0.2396 , df=1
likelihood ratio test: chi2=1.3753 , p=0.2409 , df=1
parameter F test: F=1.3499 , p=0.2475 , df_denom=123, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test: F=0.3266 , p=0.7220 , df_denom=120, df_num=2
ssr based chi2 test: chi2=0.6804 , p=0.7116 , df=2
likelihood ratio test: chi2=0.6785 , p=0.7123 , df=2
parameter F test: F=0.3266 , p=0.7220 , df_denom=120, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test: F=1.1713 , p=0.3238 , df_denom=117, df_num=3
ssr based chi2 test: chi2=3.7241 , p=0.2928 , df=3
likelihood ratio test: chi2=3.6693 , p=0.2995 , df=3
parameter F test: F=1.1713 , p=0.3238 , df_denom=117, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test: F=0.8889 , p=0.4731 , df_denom=114, df_num=4
ssr based chi2 test: chi2=3.8362 , p=0.4286 , df=4
likelihood ratio test: chi2=3.7775 , p=0.4369 , df=4
parameter F test: F=0.8889 , p=0.4731 , df_denom=114, df_num=4

Petrol

Granger Causality
number of lags (no zero) 1
ssr based F test: F=2.0136 , p=0.1584 , df_denom=123, df_num=1
ssr based chi2 test: chi2=2.0627 , p=0.1509 , df=1
likelihood ratio test: chi2=2.0460 , p=0.1526 , df=1
parameter F test: F=2.0136 , p=0.1584 , df_denom=123, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test: F=1.4924 , p=0.2290 , df_denom=120, df_num=2
ssr based chi2 test: chi2=3.1091 , p=0.2113 , df=2
likelihood ratio test: chi2=3.0711 , p=0.2153 , df=2
parameter F test: F=1.4924 , p=0.2290 , df_denom=120, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test: F=3.5055 , p=0.0176 , df_denom=117, df_num=3
ssr based chi2 test: chi2=11.1457 , p=0.0110 , df=3
likelihood ratio test: chi2=10.6729 , p=0.0136 , df=3
parameter F test: F=3.5055 , p=0.0176 , df_denom=117, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test: F=2.1964 , p=0.0738 , df_denom=114, df_num=4
ssr based chi2 test: chi2=9.4790 , p=0.0502 , df=4
likelihood ratio test: chi2=9.1315 , p=0.0579 , df=4
parameter F test: F=2.1964 , p=0.0738 , df_denom=114, df_num=4

CNG

Granger Causality
number of lags (no zero) 1
ssr based F test: F=8.0891 , p=0.0052 , df_denom=123, df_num=1
ssr based chi2 test: chi2=8.2864 , p=0.0040 , df=1
likelihood ratio test: chi2=8.0253 , p=0.0046 , df=1
parameter F test: F=8.0891 , p=0.0052 , df_denom=123, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test: F=16.8951 , p=0.0000 , df_denom=120, df_num=2
ssr based chi2 test: chi2=35.1982 , p=0.0000 , df=2
likelihood ratio test: chi2=31.0123 , p=0.0000 , df=2
parameter F test: F=16.8951 , p=0.0000 , df_denom=120, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test: F=15.9203 , p=0.0000 , df_denom=117, df_num=3
ssr based chi2 test: chi2=50.6183 , p=0.0000 , df=3
likelihood ratio test: chi2=42.4478 , p=0.0000 , df=3
parameter F test: F=15.9203 , p=0.0000 , df_denom=117, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test: F=11.2967 , p=0.0000 , df_denom=114, df_num=4
ssr based chi2 test: chi2=48.7541 , p=0.0000 , df=4
likelihood ratio test: chi2=41.0672 , p=0.0000 , df=4
parameter F test: F=11.2967 , p=0.0000 , df_denom=114, df_num=4

Hybrid

Granger Causality
number of lags (no zero) 1
ssr based F test: F=1.8869 , p=0.1721 , df_denom=123, df_num=1
ssr based chi2 test: chi2=1.9329 , p=0.1644 , df=1
likelihood ratio test: chi2=1.9182 , p=0.1661 , df=1
parameter F test: F=1.8869 , p=0.1721 , df_denom=123, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test: F=2.5946 , p=0.0789 , df_denom=120, df_num=2
ssr based chi2 test: chi2=5.4054 , p=0.0670 , df=2
likelihood ratio test: chi2=5.2918 , p=0.0709 , df=2
parameter F test: F=2.5946 , p=0.0789 , df_denom=120, df_num=2

Granger Causality

number of lags (no zero) 3
ssr based F test: F=9.9630 , p=0.0000 , df_denom=117, df_num=3
ssr based chi2 test: chi2=31.6773 , p=0.0000 , df=3
likelihood ratio test: chi2=28.2105 , p=0.0000 , df=3
parameter F test: F=9.9630 , p=0.0000 , df_denom=117, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test: F=9.0546 , p=0.0000 , df_denom=114, df_num=4
ssr based chi2 test: chi2=39.0778 , p=0.0000 , df=4
likelihood ratio test: chi2=33.9348 , p=0.0000 , df=4
parameter F test: F=9.0546 , p=0.0000 , df_denom=114, df_num=4

Summary:

1. DIESEL: No Granger causality is detected between DIESEL and EV up to 4 lags. The p-values for all lags are well above the significance threshold (typically 0.05), indicating that DIESEL does not Granger-cause EV.

2. Petrol: Granger causality is detected at lag 3, where the p-value for the F-test is 0.0176, below 0.05, indicating significant Granger causality. The other test statistics (ssr-based chi2 and likelihood ratio test) also confirm this with low p-values. At lag 4, the p-values approach significance but are slightly above 0.05 (e.g., 0.0738 for the F-test, 0.0502 for the chi2 test), so we cannot confirm strong causality at lag 4.

3. CNG: Strong Granger causality is detected between CNG and EV across all lags. At lag 1, the p-value for the F-test is 0.0052, which is significant. At lag 2, the p-value drops to 0.0000, indicating extremely strong causality. The trend continues at lags 3 and 4, with p-values remaining at 0.0000, indicating that CNG is a significant Granger cause of EV.

4. Hybrid: Granger causality is detected at lag 3, where the p-value for the F-test is 0.0000, which is highly significant. Granger causality is also detected at lag 4, with p-values for all tests at 0.0000, indicating strong causality at higher lags as well.

Summary of Inferences:

DIESEL does not appear to Granger-cause EV.

Petrol Granger-causes EV at lag 3.

CNG shows strong causality for EV across all lags (1 to 4), with lag 2 showing the most robust causality.

Hybrid vehicles Granger-cause EV starting at lag 3 and continuing strongly at lag 4.

This suggests that the CNG and Hybrid series have a much stronger temporal influence on EV, while Petrol shows a moderate effect at lag 3. Diesel shows no significant influence.

CHAPTER 8

JUSTIFICATION FOR TIME SERIES

This problem qualifies as a time series analysis due to several key factors:

• Continuous Time Intervals: The data spans continuous time intervals, allowing us to observe changes and patterns over a sustained period. This continuity is essential for capturing long-term trends and seasonal variations accurately.

• Evident Trends and Seasonality: The data exhibits clear trends and seasonality, indicating predictable patterns that occur at regular intervals. Analyzing these patterns helps us understand the underlying factors driving changes in vehicle registrations.

• Critical Forecasting for Decision-Making: Forecasting future vehicle registrations is crucial for strategic planning and informed decision-making. Accurate predictions enable policymakers, industry stakeholders, and market analysts to develop effective strategies and respond proactively to emerging trends.
By leveraging time series analysis, we can gain valuable insights into
the evolving automotive industry in India, particularly the adoption
and growth of Electric Vehicles (EVs) and other alternative fuel types.

CHAPTER 9

IMPLEMENTATION & FORECASTING

Methods and Algorithms


Various methods and algorithms can be applied for implementing a time series model
and interpreting its forecast. From our implementation, we observed that the forecasts
generated by the models are capable of capturing irregularities in the data and
providing relatively accurate predictions.
Implemented Methods
1. Vector AutoRegression (VAR) Model:
o Captured relationships between multiple series, such as Electric Vehicle
(EV) and hybrid vehicle registrations.
o Forecasted future trends for a 12-month period, providing valuable
insights into the potential trajectory of vehicle registrations.
Key Visualizations
• Historical vs. Forecasted EV Sales: visualizing past and predicted sales to compare and validate the model's accuracy.
• Impulse Response Functions: analyzing the response of one variable to shocks in another variable, helping to understand the dynamic interactions and dependencies within the system.
These methods and visualizations are crucial for interpreting the forecast results and
understanding the underlying patterns and influences within the time series data.

IRF IN VAR MODEL:

1. Response of EV to a shock in CNG

irf.plot(impulse=0, response=2, orth=False)  # CNG's effect on EV

• Shows how EV sales respond to a sudden change (shock) in CNG vehicle sales.
• The graph indicates a significant initial positive response.
• This means that when CNG sales increase suddenly, EV sales tend to increase in the short term as well.
• The effect appears to persist for several periods before stabilizing.
• This suggests a complementary relationship between the CNG and EV markets.

2. Response of EV to a shock in Petrol

irf.plot(impulse=1, response=2, orth=False)  # Petrol's effect on EV

• Shows how EV sales respond to a sudden change in petrol vehicle sales.
• The response appears to be negative initially.
• This suggests that an increase in petrol vehicle sales tends to decrease EV sales in the short term.
• The effect gradually diminishes over time.
• This indicates a competitive relationship between the petrol and EV markets.

3. Response of EV to a shock in Hybrid

irf.plot(impulse=3, response=2, orth=False)  # Hybrid's effect on EV

• Shows how EV sales respond to a sudden change in hybrid vehicle sales.
• The response appears to be mixed: an initial positive response, with some fluctuation in subsequent periods.
• This suggests a complex relationship between the hybrid and EV markets.
• Hybrid sales might serve as a bridge to EV adoption.

4. Cumulative IRF Effects (from irf.plot_cum_effects(orth=False))

• Shows the accumulated impact of shocks over time.
• Helps understand the long-term relationships between variables.
• Key observations:
  o CNG has a persistent positive cumulative effect.
  o Petrol shows a negative cumulative effect.
  o Hybrid shows a mixed but generally positive cumulative effect.

FORECASTING:
Principal Component Analysis (PCA) is often done before fitting a Vector AutoRegression
(VAR) model for a few important reasons:
1. Dimensionality Reduction: Time series data can often have many variables, making
the model complex and computationally expensive. PCA reduces the number of
variables by transforming the original set into a smaller set of uncorrelated variables
called principal components, capturing most of the variance in the data.
2. Multicollinearity Mitigation: In time series data, especially those with many related
variables, multicollinearity can be a problem. It can lead to less reliable estimates of
model parameters. PCA helps to mitigate multicollinearity by creating principal
components that are orthogonal (uncorrelated) to each other.
3. Noise Reduction: PCA can help to remove noise from the data by focusing on the
principal components that explain the most variance and ignoring those that contribute
less. This results in a cleaner dataset that can improve model performance.
4. Improved Model Efficiency: By reducing the number of variables, PCA simplifies
the model, which can lead to faster computation and easier interpretation of results.
Applying PCA before fitting a VAR model allows for a more robust and efficient analysis by
addressing these challenges, ultimately leading to more accurate forecasts and insights.
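A numpy-only sketch of this step (scikit-learn's `PCA` would be equivalent): five highly correlated synthetic series are reduced to two uncorrelated principal components, which could then be fed to the VAR.

```python
import numpy as np

# Five nearly identical (highly collinear) synthetic series.
rng = np.random.default_rng(4)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.05 * rng.normal(size=(200, 1)) for _ in range(5)])

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)                   # center each series
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T                    # first two principal components
explained = (S[:2] ** 2).sum() / (S ** 2).sum()   # variance captured
```

Because the five inputs are nearly collinear, the first two components capture almost all of the variance, and the component scores are uncorrelated by construction, which is exactly the multicollinearity mitigation described above.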

FINAL OUTPUT:

CHAPTER 10

REASONS FOR SELECTION OF MODEL

Reasons for Using VAR


1. Accommodates Multivariate Time Series: VAR is
designed to handle multiple time series simultaneously.
This is particularly beneficial when analyzing complex
systems with interdependent variables, such as vehicle
registrations across different fuel types.
2. Effective in Capturing Relationships: VAR models are adept at capturing the dynamic relationships between different time series. In the context of our project, this means effectively modeling the interactions and dependencies between registrations of Electric Vehicles (EVs), Hybrid vehicles, CNG, and Petrol vehicles.
3. Outperforms Simpler Models: When evaluated using
accuracy metrics, VAR models often outperform simpler
univariate models. This improved performance is due to
VAR's ability to incorporate information from multiple
time series, leading to more accurate and reliable
forecasts.

Error Metrics for the VAR model:

CHAPTER 11
COMPARATIVE ANALYSIS

Error Metrics for Model Comparison


To evaluate and compare the performance of the candidate models' forecasts, we used several error metrics. These metrics provide a quantitative assessment of forecast accuracy and help identify the best-performing model. The error parameters used are:
1. Mean Error (ME): Measures the average error between the forecasted
and actual values. It helps indicate the direction of bias in the forecasts.
2. Root Mean Squared Error (RMSE): Calculates the square root of the
average squared differences between forecasted and actual values. RMSE
gives a higher weight to larger errors, making it sensitive to outliers.
3. Mean Absolute Error (MAE): Represents the average of the absolute
differences between forecasted and actual values. MAE is less sensitive to
outliers compared to RMSE.
4. Mean Percentage Error (MPE): Measures the average percentage error
between forecasted and actual values. It indicates the relative accuracy of
the forecasts in percentage terms.
5. Mean Absolute Percentage Error (MAPE): Calculates the average of
the absolute percentage errors between forecasted and actual values.
MAPE is widely used because it is easy to interpret and understand.
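The five metrics above can be computed directly; a numpy-only sketch on a small illustrative forecast (the values are made up for demonstration):

```python
import numpy as np

# Illustrative actual vs. forecasted values.
actual = np.array([100.0, 120.0, 150.0, 130.0])
forecast = np.array([110.0, 115.0, 140.0, 135.0])

err = actual - forecast
me = err.mean()                              # Mean Error (direction of bias)
rmse = np.sqrt((err ** 2).mean())            # Root Mean Squared Error
mae = np.abs(err).mean()                     # Mean Absolute Error
mpe = (err / actual).mean() * 100            # Mean Percentage Error
mape = np.abs(err / actual).mean() * 100     # Mean Absolute Percentage Error
```

Note how ME can be zero even when every forecast is wrong (over- and under-forecasts cancel), which is why it is read alongside MAE and RMSE rather than on its own.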

Model Comparison and Selection
In this project, three advanced time series models were implemented and
compared:
1. AutoRegressive Integrated Moving Average with Exogenous Variables (ARIMAX):
   o This model extends ARIMA by including exogenous variables, allowing it to factor in external influences on the time series.
   o Useful for capturing complex relationships and providing robust forecasts.
2. Generalized AutoRegressive Conditional Heteroskedasticity (GARCH):
   o This model is particularly effective for time series data with changing variance or volatility, often observed in financial data.
   o Demonstrated strength in modeling and forecasting volatility.
3. Vector AutoRegression (VAR):
   o The VAR model accommodates multivariate time series, capturing the dynamic relationships between multiple variables such as EV, Hybrid, CNG, and Petrol vehicle registrations.
   o Effective in capturing interdependencies and providing accurate forecasts.
Model Evaluation and Selection
The forecasts generated by these models were evaluated based on several error
metrics, including Mean Error (ME), Root Mean Squared Error (RMSE), Mean
Absolute Error (MAE), Mean Percentage Error (MPE), and Mean Absolute
Percentage Error (MAPE).
Based on this comparison, the VAR model was selected for its superior
performance in capturing the relationships between different fuel categories and
providing accurate predictions. The VAR model's ability to handle multivariate
time series data and outperform simpler models in accuracy metrics made it the
best choice for our forecasting needs.

Error Metrics for ARIMA:

Error Metrics for ARCH:

CHAPTER 12 & 13
CONCLUSION & COLAB

1] This project successfully demonstrated the application of time series analysis techniques to forecast trends in vehicle registrations by fuel type. The decomposition of the time series data revealed clear patterns of growth in electric and hybrid vehicle adoption.

2] By implementing the VAR model, we captured interdependencies between fuel types, enhancing forecasting accuracy. The forecasts not only aligned closely with historical data but also offered actionable insights for stakeholders such as policymakers, manufacturers, and market analysts.

3] The project underscored the importance of rigorous data preprocessing, stationarity testing, and model evaluation. It also showcased the potential of time series models in addressing real-world challenges like planning for the future of sustainable transportation.

4] In conclusion, the analysis affirmed the growing dominance of electric and hybrid vehicles while offering a robust methodology to predict future trends. The VAR model proved to be a reliable tool for multivariate time series forecasting in this domain.

GOOGLE COLAB LINK 👇


https://colab.research.google.com/drive/1JeX06L1I8ps7mp4XfgQv4ZguKJBYh1NE?usp=sharing#scrollTo=SQzKEorjOpy0

CHAPTER 14 & 15

FUTURE SCOPE & REFERENCES


This type of time series data can be used for:

• Incorporating Additional Features: Future analyses could integrate economic variables such as fuel prices, GDP, or government incentives to enhance the predictive power of the models.

• Deep Learning Models: Advanced machine learning techniques, like Long Short-Term Memory (LSTM) networks or Transformers, can be explored for capturing complex non-linear relationships and long-term dependencies in the data.

• Geographical Insights: Analyzing regional or country-level trends in vehicle registrations could provide a more granular understanding of adoption patterns.

• Scenario Analysis: Simulating the impact of policy changes (e.g., subsidies for EVs or increased fuel taxes) could offer insights into possible future trends.

• Integration with Climate Data: Linking vehicle registration trends with environmental factors, such as CO2 emissions, can contribute to sustainability research.

• Real-Time Forecasting: Extending this project to enable live forecasting using real-time data streams could improve its practical utility for decision-makers.

REFERENCES:

• Dataset: Fuel Type and Vehicle Registration Trends. Source: Parivahan.gov.in
• Jason Brownlee, "Regression Metrics for Machine Learning." Source: Machine Learning Mastery
• Davide Burba, "An Overview of Time Series Forecasting Models." Source: Towards Data Science
• Statsmodels Documentation: Statistical Models for Time Series. Source: Statsmodels
• Seaborn and Matplotlib Documentation for Visualization. Source: Matplotlib and Seaborn
• "Smoothing Techniques in Time Series Analysis." Source: Statistical Software for Excel