MATH387 Final Project
Authors:
Anthony Hollerich
Augusto Gonzalez Bonorino
framework as an alternative to traditional econometric models that often fall
short in this task due to the nuances of the VIX’s statistical properties. The
paper is organized into 5 sections. After a brief literature review on previous
attempts to model implied volatility, particularly the VIX, section 3 describes
the multiswitch problem that motivated efforts to merge Markov chains with
regression models, and introduces the MSAR model. Section 5 presents the
dataset used to fit a baseline Markov model and our MSAR implementation;
we discuss the performance and forecasting power of both models. Finally,
section 6 offers concluding thoughts and suggests potential improvements.
2 Literature Review
The VIX index, calculated by the Chicago Board Options Exchange (CBOE)
from S&P 500 index option prices, represents a critical measure of market
volatility. It has garnered significant interest in the literature due to its
role as a forward-looking volatility predictor, offering insights into market
sentiment and investor expectations. Scholars have emphasized the VIX’s
utility in financial risk management and market analysis [2].
However, traditional linear models like multivariate regressions or ARIMA
face challenges in accurately predicting the VIX [3] due to their inherent
linear nature. These models, while effective in many time-series analyses,
struggle with the VIX’s distinct regimes characterized by shifts in mean and
variance. This limitation has prompted the exploration of more advanced
methodologies capable of capturing such non-linear and multivariate dynamics
of time-series with inherent "turning points". Results from simulations
showcasing these limitations are provided in the Appendix.
Markov Switching Regressions (MSRs) [4] emerge as a promising alter-
native, leveraging the properties of Markov chains to accommodate regime
shifts in an autoregressive time-series [5]. MSRs offer a framework where
the parameters of interest, such as the mean and variance, can vary across
different states. This approach aligns with the observable characteristics of
the VIX, which exhibits periods of relative stability interspersed with high
volatility. By fitting a system of linear equations for each state and esti-
mating a transition matrix, MSRs enhance the predictive accuracy for such
complex time-series data.
In integrating macroeconomic variables, we follow the precedent set by
Prasad et al. [6]. Their application of machine learning techniques, such
as LightGBM and XGBoost, identified key economic indicators that significantly
predict the VIX. These findings guide our selection of exogenous
covariates, enhancing the model’s robustness and relevance to current eco-
nomic conditions.
Despite the advantages of MSRs, it’s crucial to acknowledge their limi-
tations. The MSR framework, like traditional regressions, is susceptible to
endogeneity and relies heavily on assumptions like the Markov property, sta-
tionarity, and time-invariant transition probabilities [7]. Moreover, the com-
plexity and stability of the model can be challenged as the number of regimes
increases, posing potential issues in model specification and interpretation.
Overall, this review underscores the evolving landscape of volatility pre-
diction. By transitioning from traditional linear models to more sophisticated
approaches like MSRs, we address the dynamic nature of financial markets.
However, this advancement does not come without new challenges and as-
sumptions, which must be carefully considered in empirical applications.
3 Multiswitch Problem
The concept of multiswitch problems emerges from the need to model sys-
tems that exhibit multiple regimes or states, each governed by distinct equa-
tions. Consequently, the statistical moments (mean and variance) of the
dependent variable of interest are conditionally dependent on the state of
the world. Traditional linear models often struggle in these contexts, as they
inherently assume a singular, consistent pattern throughout the observed
data. Mathematically, this can be conceptualized using a piecewise function,
where different linear models apply depending on the current state or regime.
A simplified two-state model could be represented as:
$$
Y_t =
\begin{cases}
\alpha_1 + \beta_1 X_t + \epsilon_{1t}, & \text{if State 1} \\
\alpha_2 + \beta_2 X_t + \epsilon_{2t}, & \text{if State 2}
\end{cases}
\tag{1}
$$
where $Y_t$ is the observed dependent variable at time $t$, $X_t$ is a vector of
independent variables, $\alpha$ and $\beta$ are the parameters unique to each state, and
$\epsilon$ denotes the state-dependent error term of each equation.
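As a minimal sketch of equation (1), the snippet below evaluates a two-state switching regression for a scalar regressor. The parameter values and the function name `switching_regression` are hypothetical, chosen only for illustration; they are not estimates from this paper.

```python
# Hypothetical (alpha, beta) pairs for each state; not estimated values.
PARAMS = {1: (1.0, 0.5), 2: (4.0, -0.2)}


def switching_regression(x, state, eps=0.0, params=PARAMS):
    """Return Y_t = alpha_s + beta_s * x + eps for the given state s."""
    alpha, beta = params[state]
    return alpha + beta * x + eps


# The same regressor value maps to a different outcome in each state.
y_state1 = switching_regression(2.0, state=1)  # 1.0 + 0.5 * 2.0
y_state2 = switching_regression(2.0, state=2)  # 4.0 - 0.2 * 2.0
print(y_state1, y_state2)
```

A single linear model fitted to data pooled from both states would average over the two parameter sets, which is the failure mode described above.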
Early proposals for solving the multiswitch problem relied on determinis-
tic solutions, such as indicator functions. Among these, Quandt’s D-method
and lambda-method are notable. The D-method involves splitting the sample
into different regimes based on an exogenously determined variable, D.
Mathematically, it’s expressed as [4]:
variable, typically denoted as $s_t$, is crucial as it dictates the current regime
and, consequently, the parameters of the model.
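Consider the structure of a basic Markov Switching Autoregressive (MSAR)
model:
$$
Y_t =
\begin{cases}
\alpha_0 + \beta Y_{t-1} + \epsilon_t, & \text{if } s_t = 0, \\
\alpha_1 + \beta Y_{t-1} + \epsilon_t, & \text{if } s_t = 1.
\end{cases}
\tag{4}
$$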
In this formulation, the model parameters vary depending on the state
variable $s_t$, which is governed by a Markov process. This state-dependency
is the essence of the Markov switching model, enabling the model to adapt
its behavior based on the prevailing regime and historical patterns.
The integration of the Markov chain into the regression model is achieved
through the state variable $s_t$. The Markov chain property posits that the
probability of being in a certain state at time $t$ is dependent solely on the
state at time $t-1$, encapsulating the memoryless characteristic of Markov
processes. This memoryless assumption is supported by the financial litera-
ture in Implied Volatility (IV) forecasting that finds the VIX to be accurately
represented by a first-order model since "no second lags turned out to be
statistically significant" [1].
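The transition between states is quantified by a transition probability
matrix $P$, which is central to the estimation of the model:
$$
P =
\begin{pmatrix}
P(s_t = 0 \mid s_{t-1} = 0) & P(s_t = 1 \mid s_{t-1} = 0) \\
P(s_t = 0 \mid s_{t-1} = 1) & P(s_t = 1 \mid s_{t-1} = 1)
\end{pmatrix}
=
\begin{pmatrix}
P_{00} & P_{01} \\
P_{10} & P_{11}
\end{pmatrix}
$$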
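To illustrate how the transition matrix drives regime probabilities, the following sketch propagates a probability vector forward one step at a time and computes the long-run (stationary) regime probabilities of a two-state chain. The matrix entries are hypothetical, and the layout follows the convention above, where entry (i, j) is the probability of moving from state i to state j.

```python
# Hypothetical row-stochastic matrix: P[i][j] = P(s_t = j | s_{t-1} = i).
P = [[0.9, 0.1],
     [0.2, 0.8]]


def step(probs, P):
    """One-step-ahead regime probabilities: new[j] = sum_i probs[i] * P[i][j]."""
    n = len(P)
    return [sum(probs[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary_two_state(P):
    """Closed-form stationary distribution of a two-state chain."""
    p01, p10 = P[0][1], P[1][0]
    pi0 = p10 / (p01 + p10)
    return [pi0, 1.0 - pi0]


probs = [1.0, 0.0]      # start in regime 0 with certainty
for _ in range(50):     # repeated propagation approaches the stationary mix
    probs = step(probs, P)

print(probs, stationary_two_state(P))
```

Because each update depends only on the previous probability vector, the sketch also makes the memoryless property concrete.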
3.1.1 Data Requirements and Model Estimation
As in any Markov chain framework, stationarity is a crucial prerequisite for
the effective application of Markov switching models. In cases where the
time series data exhibit non-stationarity, such as a unit root, differencing the
series is a necessary step before applying the model. This ensures that the
underlying assumptions of the model are satisfied, leading to meaningful and
interpretable results.
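The differencing step can be sketched as follows. The example uses a simulated random walk (a textbook unit-root process) rather than the paper's data, and compares the lag-1 autocorrelation before and after differencing as a rough diagnostic; a formal unit-root test would be used in practice. The helper names are illustrative.

```python
import random


def first_difference(series):
    """Return the first differences y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


# Simulated random walk: y_t = y_{t-1} + e_t, which has a unit root.
rng = random.Random(0)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

diffed = first_difference(walk)
print(round(lag1_autocorr(walk), 3), round(lag1_autocorr(diffed), 3))
```

The level series shows autocorrelation close to one, while the differenced series behaves like uncorrelated noise.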
Markov switching models are estimated from the data by employing the
following steps:
understanding of regime changes, thereby enhancing the modeling and pre-
diction of complex time series behaviors.
Should the QQQ demonstrate a positive percent change in the preceding
week and the VIX register a value above 25, 1 is added to the corresponding
cell in the matrix, in this case the first row and the first column of P .
Following the accumulation of data points over 520 weeks, the transition
matrix (P) is computed. This matrix is derived by dividing the count in each
cell by the total number of data points in its row, so that each row sums to
one. An illustrative example of P is provided below:
$$
P =
\begin{pmatrix}
0.3 & 0.7 \\
0.4 & 0.6
\end{pmatrix}
$$
Then, after P is fully created, the program determines the current state of
the QQQ Previous Week Change variable. If the QQQ declined the previous
week, then the VIX would have a 40 percent chance of being below 25 and a 60
percent chance of being above 25 the following week. This process is repeated for
each of the five variables. Note that the example above uses a simplified matrix
to demonstrate how the transition matrices are calculated; the true matrices
in the program are 8x8 matrices with more specific cutoffs to generate a
more accurate prediction. After the transition matrices have been created
for each of the five variables, a weighted average is taken to create the final
VIX prediction.
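A minimal sketch of this counting procedure for a single indicator is shown below. The weekly observations are made-up values for illustration, the 2x2 layout mirrors the simplified example rather than the 8x8 matrices used in the program, and the function name `count_matrix` is hypothetical.

```python
def count_matrix(observations, vix_cutoff=25.0):
    """Build a row-normalized 2x2 matrix from (qqq_change, vix) pairs.

    Rows: 0 = QQQ rose the previous week, 1 = QQQ declined.
    Columns: 0 = VIX above the cutoff, 1 = VIX at or below the cutoff.
    """
    counts = [[0, 0], [0, 0]]
    for qqq_change, vix in observations:
        row = 0 if qqq_change > 0 else 1
        col = 0 if vix > vix_cutoff else 1
        counts[row][col] += 1
    matrix = []
    for row_counts in counts:
        total = sum(row_counts)
        # Leave an all-zero row if a state was never observed.
        matrix.append([c / total if total else 0.0 for c in row_counts])
    return matrix


# Made-up (QQQ % change in the previous week, VIX level) pairs.
weeks = [(1.2, 18.0), (0.4, 27.5), (-2.1, 31.0), (-0.8, 22.0),
         (0.9, 16.5), (-1.5, 29.0), (2.0, 19.0), (-0.3, 24.0)]
P = count_matrix(weeks)
print(P)  # P[1] holds the VIX probabilities following a QQQ decline
```

Looking up the row that matches the indicator's current state then gives that indicator's probabilities for the following week.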
The $R^2$ value of this method is 0.315. The model follows the direction of
the changes and many of the spikes in the testing data; however, it fails to
account for the sharpest spikes, most notably the COVID spike. This is because
the model had no reliable past data against which to compare the indicator
values at the beginning of the COVID pandemic, since nothing comparable had
happened in the previous 10 years. Figure 1 depicts the results of the baseline
model: the blue line is the actual VIX data, and the orange line is the
predicted VIX.
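For reference, the coefficient of determination reported above can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up numbers, not the paper's predictions, and the function name is illustrative.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [15.0, 18.0, 30.0, 22.0]      # made-up VIX levels
predicted = [16.0, 19.0, 24.0, 21.0]   # made-up model output
print(round(r_squared(actual, predicted), 3))
```

In this toy example a single large miss on the spike accounts for most of the residual sum of squares, which is the same mechanism that penalizes the baseline model around sharp spikes.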
Figure 1: Baseline model predictions
We then fitted a multivariate MSAR model. The dataset used
consists of data for the VIX (^VIX) and the following macroeconomic variables
that the literature found to be relevant predictors: gold prices (GC=F),
crude oil prices (CL=F), the USD Index (DX-Y.NYB), and the financial stress
index (STLFSI4). The weekly sample spans 13 years of data, from
01/17/2010 to 01/01/2023. Data for the financial stress index was fetched
from FRED, and the remaining variables were obtained from Yahoo Finance.
Finally, to correct for non-stationarity and/or large outliers, we applied a
logarithmic transformation to the first three covariates and first differences
to the VIX data, leaving the FSI data unchanged. Figure 2 shows the resulting
series.
Interestingly, transforming the dependent variable eliminates a lot of
information relevant for estimating transition probabilities. In short,
by normalizing the time-series we obfuscate the difference between regimes,
and consequently the estimated filtered probabilities are almost equivalent.
Given that our primary objective is to estimate transition probabilities to
predict future direction of the VIX, we proceed by fitting the model to the
raw weekly VIX data instead.
4.1 Prediction
The model is implemented in Python via the statsmodels [12] package, which
provides an API to access a variety of econometric and time-series models.
We estimate an MSAR model with the weekly VIX as the endogenous variable,
the matrix of transformed variables as our exogenous variables, two
regimes, a constant trend, an autoregressive order of one, and non-switching
autoregressive parameters. Table 1 describes the summary results
from the fitted model.
The Markov Switching Model results reveal two distinct regimes with significantly
different baseline levels, as evidenced by the intercepts for Regime
0 (14.8930) and Regime 1 (19.8009). The non-switching parameters, particularly
x3, show a substantial impact across both regimes, indicating that the USD
index has a significant influence on the dependent variable. The autoregressive
component, ar.L1, with a coefficient of 0.83, suggests that the close value
of the VIX is highly influenced by its immediate past value, a common characteristic
in financial time series data. We believe that our large sample size
(677) and careful selection of covariates help explain the strong statistical
significance of the exogenous macroeconomic variables.
Figure 2: Transformed time-series
Figure 3 shows the in-sample predictions of our model. It is clear from
the plot that the MSAR model is able to capture the dynamics of the VIX’s
time series, but in-sample predictions only provide a limited understanding
of the model’s capability to generalize.
One limitation we found with statsmodels implementation of MSAR is
Dep. Variable:     Close                   No. Observations:   676
Model:             MarkovAutoregression    Log Likelihood      -1740.139
Date:              Mon, 11 Dec 2023        AIC                 3500.277
Time:              12:20:15                BIC                 3545.439
Sample:            01-17-2010              HQIC                3517.763
                   - 01-01-2023
Covariance Type:   approx

Regime 0 parameters
             coef   std err         z   P>|z|    [0.025    0.975]
const     14.8930     1.257    11.852   0.000    12.430    17.356

Regime 1 parameters
             coef   std err         z   P>|z|    [0.025    0.975]
const     19.8009     0.675    29.330   0.000    18.478    21.124

Non-switching parameters
             coef   std err         z   P>|z|    [0.025    0.975]
x1        16.9873     5.496     3.091   0.002     6.215    27.760
x2        -5.8513     2.375    -2.463   0.014   -10.507    -1.196
x3       160.9426    26.129     6.159   0.000   109.730   212.155
x4         7.6144     0.483    15.773   0.000     6.668     8.561
sigma2     7.9047     0.778    10.156   0.000     6.379     9.430
ar.L1      0.8310     0.026    32.436   0.000     0.781     0.881
Figure 3: MSAR in-sample predictions
for this purpose, available to the reader in the Appendix and supplementary
materials. The obtained forecasts are as follows:
Time Step    Forecasted VIX
1            18.0565
2            17.7305
3            17.7540
4            17.7523
5            17.7524
The MSAR model forecasts a drop in the value of the VIX in upcoming
weeks. This means that we expect a downward movement of implied volatility
in the markets while still remaining in the same high-volatility regime. The
model’s prediction of a continued presence in Regime 1 aligns with the current
market expectation of an unsettled market, potentially driven by macroeconomic
factors, geopolitical tensions, or other systemic risks. Hence, these
results can be interpreted as a potential future decrease in implied volatility
that gradually makes its way toward the low-volatility regime.
In light of these predictions, investors might adopt more conservative
strategies, prioritizing assets that are traditionally seen as less volatile or
hedging their portfolios to mitigate potential risks. For options traders, this
could mean increased demand for hedging instruments, leading to higher
premiums on options contracts. On a broader scale, these forecasts could in-
fluence financial institutions and policy makers in their risk assessment and
management strategies. However, it’s important to acknowledge the model’s
limitations and the inherently unpredictable nature of financial markets. Ex-
ternal shocks and unforeseen events can rapidly alter market dynamics, un-
derscoring the need for continual monitoring and adaptation of predictive
models to new data and changing conditions.
understanding of market fluctuations, a key element in our analytical frame-
work. The integration of macroeconomic variables and the examination of
their influence on the VIX further enriched our analysis, providing a multi-
faceted view of market behavior.
The findings of our study are both significant and timely. The identifica-
tion of two regimes, one characterized by higher volatility, presents a critical
insight into the market’s underlying mechanisms. Our model forecasts a
decline in the VIX that nonetheless remains within the higher-volatility regime,
suggesting a period marked by continued economic uncertainty. This projection is particularly
relevant given the current global economic landscape, which is fraught with
challenges ranging from geopolitical tensions to shifting monetary policies.
For investors and market analysts, this implies a need for heightened vig-
ilance and a potential reevaluation of risk management strategies. It also
underscores the value of predictive modeling in navigating the complex and
often turbulent financial markets.
While our study contributes valuable perspectives to the field of financial
forecasting, it is not without its limitations. The MSAR model, despite its
robustness, is inherently sensitive to the parameters and assumptions under-
pinning it. Our reliance on historical data, while extensive, cannot fully ac-
count for the capricious nature of financial markets, where unforeseen events
can dramatically alter trajectories. Additionally, the lack of out-of-sample
predictions in our current framework points to an area ripe for further de-
velopment. Future research could focus on enhancing the model’s predictive
power, perhaps by incorporating real-time data analysis or exploring alterna-
tive econometric techniques. The continuous evolution of financial markets
necessitates an adaptable and forward-looking approach to modeling and
analysis.
References
[1] Katja Ahoniemi. Modeling and forecasting the VIX index. International
Finance, 2008.
[4] Stephen M. Goldfeld and Richard E. Quandt. A Markov model for
switching regressions. Journal of Econometrics, 1(1):3–15, 1973.
[6] Akhilesh Prasad, Priti Bakhshi, and Arumugam Seetharaman. The impact
of the U.S. macroeconomic variables on the CBOE VIX index. Journal
of Risk and Financial Management, 15(3), 2022.
[8] Peter Carr. Why is VIX a fear gauge? Risk and Decision Analysis,
6(2):179–185, 2017.
[9] Ghulam Sarwar. Is VIX an investor fear gauge in BRIC equity markets?
Journal of Multinational Financial Management, 22(3):55–65, 2012.
[10] Ashok Bantwa. A study on India volatility index (VIX) and its performance
as risk management tool in Indian stock market. Paripex -
Indian Journal of Research, 6(1), January 2017.
[11] Gaurav Jadhao and Abhijeet Chandra. Application of VIX and entropy
indicators for portfolio rotation strategies. Research in International
Business and Finance, 42:1367–1371, 2017.
Simulations
The simulated dataset was crafted based on an Autoregressive (AR) process
of order 4 (AR(4)), with a white noise error component characterized by a
standard deviation of 0.5. This dataset incorporates two covariates, x1 and
x2, both of which follow a normal distribution. Central to our simulation is
the inclusion of three distinct regimes, each signifying a change in the mean
value of the dependent variable. The ’y’ variable in our dataset is thus a
product of an autoregressive process that is influenced by its four preceding
values (or four lags). This process is further modulated by the covariates
x1 and x2, and is subject to shifts in mean value, contingent upon the time
index, while maintaining a fixed variance. This simulation framework is
designed to test candidate models, providing an insightful understanding of
their capability to model time-series with properties that characterize Implied
Volatility.
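A data-generating process of this kind could be sketched as below. The AR coefficients, covariate loadings, regime level shifts, and break points are all hypothetical placeholders, since the exact values are not listed in this appendix; only the AR(4) structure, the two normal covariates, the three regimes tied to the time index, and the noise standard deviation of 0.5 follow the description above.

```python
import random


def simulate(n=300, seed=0):
    """Simulate y_t with AR(4) dynamics, two covariates and three regimes."""
    rng = random.Random(seed)
    phi = [0.4, 0.2, 0.1, 0.05]      # hypothetical AR(4) coefficients
    gamma = (0.5, -0.3)              # hypothetical covariate loadings
    shifts = [0.0, 3.0, -2.0]        # hypothetical regime-specific level shifts
    breaks = (n // 3, 2 * n // 3)    # regime changes depend on the time index

    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    regimes, y = [], []
    for t in range(n):
        regime = 0 if t < breaks[0] else (1 if t < breaks[1] else 2)
        # AR part uses up to four preceding values (fewer at the start).
        ar = sum(phi[k] * y[t - 1 - k] for k in range(min(4, t)))
        noise = rng.gauss(0.0, 0.5)  # white noise with standard deviation 0.5
        y.append(shifts[regime] + ar + gamma[0] * x1[t] + gamma[1] * x2[t] + noise)
        regimes.append(regime)
    return y, x1, x2, regimes


y, x1, x2, regimes = simulate()
print(len(y), sorted(set(regimes)))
```

Keeping the regime labels alongside the simulated series makes it possible to compare each candidate model's inferred regimes with the true ones.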
Simple Markov Switching Regression (MSR) provides an improvement
over traditional linear models by allowing the researcher to estimate a set of
linear equations conditional on the regime. This type of conditional mean
model allows us to accurately capture the change in expected value of our
regressions, which serves in estimating transition probabilities, but fails to
capture the variance within the regimes. We observe this limitation in the
in-sample predictions of the model.
Finally, Markov Switching Autoregression (MSAR) is able to capture
the complex behavior of the time-series describing the variable of interest
within each regime. This upgrade is observed by the superior in-sample fit
of the MSAR model compared to all other models we experimented with.
Because of this superior performance on the simulated scenario, we decided
to implement it in our main analysis to model future direction of the VIX.
Forecasting
import numpy as np


def MSAR_forecast(model, data, steps=1):
    """
    Forecast using a Markov Switching Autoregression model.

    model: fitted statsmodels MarkovAutoregression results object.
    data: DataFrame whose last row holds the latest exogenous observations.
    steps: number of periods to forecast.
    """
    predictions = []
    # Extract the regime-specific intercepts from the model
    mu1 = model.params['const[0]']
    mu2 = model.params['const[1]']
    # Estimated coefficients of the exogenous variables
    betas = model.params.drop(['const[0]', 'const[1]',
                               'sigma2', 'ar.L1',
                               'p[0->0]', 'p[1->0]']).values
    transMat = model.regime_transition
    # Latest observation of the exogenous variables used
    last_obs = data.iloc[-1, [1, 3, 4, 6]].values
    # Initial regime probabilities
    state_probs = model.filtered_marginal_probabilities.iloc[-1]
    reshaped_transition_matrix = transMat.reshape(2, 2)
    # Forecasting loop
    for _ in range(steps):
        # Calculate regime predictions
        regime1_pred = mu1 + np.dot(betas, last_obs)
        regime2_pred = mu2 + np.dot(betas, last_obs)
        # Update state probabilities
        state_probs = np.dot(reshaped_transition_matrix, state_probs)
        # Assumed step (not present in the listing): combine the regime
        # predictions, weighted by the updated regime probabilities
        predictions.append(state_probs[0] * regime1_pred
                           + state_probs[1] * regime2_pred)
    return predictions