CHAPTER 3: IMPLEMENTING RISK FORECAST
• In the previous course (FRM 1), risk measures were calculated under the assumption
that the distribution of profit and loss was known. In practice, however, the profit and
loss distribution is estimated from historical observations of the asset's returns.
There are two main approaches to forecasting VaR and ES: non-parametric
and parametric.
• Non-parametric risk forecasting refers to historical simulation, which uses the
empirical distribution of the data to compute risk forecasts. It makes no assumption about
the distribution or its parameters.
• Parametric methods are based on estimating the underlying distribution of returns
and then obtaining risk forecasts from the estimated distribution. For most
applications, the first step in the process is forecasting the covariance matrix.
3.1. Some concepts
• The sample has size T
• The data are split into two parts; the first part, used to estimate parameters,
is called the estimation window, $W_E$
• K: number of assets
• ω: K × 1 vector of portfolio weights
• $y_k = \{y_{t,k}\}_{t=1}^{T}$: T × 1 vector of returns on asset k
• y: T × K matrix of historical returns
• Σ: K × K covariance matrix of assets
• v: the portfolio value
3.2. Forecasting
• We consider the concepts of in-sample analysis, out-of-sample analysis, and forecasting
• If we use information from time t = 1 to time t = T to produce results about time T, we
are doing in-sample analysis
• However, we want to forecast risk
• That means we use information from t = 1 to t = T to produce a forecast of what
might happen later, perhaps at t = T + 1
Training and Testing Samples
When working with forecasting, we usually use the following rules:
Static (Expanding) and Dynamic Forecast
Assume we have five samples and want to forecast five future values. A static forecast
can be used, as described below.
With the same five samples, the five future values can instead be produced by a
dynamic forecast, as described below.
In general we have $y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-m})$. To forecast $h$ steps ahead we have
$$\hat{y}_{t+1} = f(y_t, y_{t-1}, \ldots, y_{t-m+1})$$
$$\hat{y}_{t+2} = f(\hat{y}_{t+1}, y_t, \ldots, y_{t-m+2})$$
$$\hat{y}_{t+3} = f(\hat{y}_{t+2}, \hat{y}_{t+1}, y_t, \ldots, y_{t-m+3})$$
and so on.
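The recursion can be sketched in R as follows. This is only an illustration: the function name `dynamic_forecast`, the rule `f_ar2`, and its coefficients are hypothetical choices made for the example, not part of the course material.

```r
# Dynamic (recursive) multi-step forecast: each new forecast is appended to the
# path and fed back into f() in place of the not-yet-observed value.
# `f` takes the m most recent values (most recent first) and returns one number.
dynamic_forecast <- function(y, f, m, h) {
  path <- y
  out <- numeric(h)
  for (i in seq_len(h)) {
    n <- length(path)
    lags <- path[n:(n - m + 1)]   # y_t, y_{t-1}, ..., y_{t-m+1}
    out[i] <- f(lags)
    path <- c(path, out[i])       # the forecast stands in for the future value
  }
  out
}

# Hypothetical AR(2)-style rule, chosen only for illustration
f_ar2 <- function(lags) 0.5 * lags[1] + 0.2 * lags[2]

y <- c(1.0, 2.0)
fc <- dynamic_forecast(y, f_ar2, m = 2, h = 3)
print(fc)
```

From the second step onward the inputs to `f` include earlier forecasts, so forecast errors can accumulate with the horizon.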
Rolling forecast
As with the static forecast, we can use a rolling window, described as follows
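A minimal sketch of the difference between an expanding and a rolling estimation window is given below. It assumes that "expanding" means the window grows by one observation at each step while a rolling window keeps a fixed length. The function `window_forecast` and the forecast rule (the window mean) are hypothetical placeholders for whatever model is actually fitted.

```r
# One-step-ahead forecasts from an expanding or a rolling estimation window.
# The forecast rule (the window mean) is a placeholder for any fitted model.
window_forecast <- function(y, w_e, type = c("expanding", "rolling")) {
  type <- match.arg(type)
  n <- length(y)
  fc <- numeric(n - w_e)
  for (last in w_e:(n - 1)) {
    # expanding: always start at observation 1; rolling: keep w_e observations
    first <- if (type == "expanding") 1 else last - w_e + 1
    fc[last - w_e + 1] <- mean(y[first:last])   # forecast of y[last + 1]
  }
  fc
}

y <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)   # made-up data
fc_exp  <- window_forecast(y, w_e = 5, type = "expanding")
fc_roll <- window_forecast(y, w_e = 5, type = "rolling")
print(rbind(expanding = fc_exp, rolling = fc_roll))
```

On trending data like this, the rolling window tracks recent levels more closely because old observations drop out of the estimation window.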
Modeling process
(i) Initial steps:
• Consider the context before working with data.
• Determine expected findings and model requirements (e.g., ARIMA for short-term
forecasts).
• Set a baseline based on results from other models.
• Plot the time series.
(ii) Estimation:
• Fit an initial model, exploring simpler and more complex models.
• Check residuals for issues using the Ljung-Box test of residual autocorrelations.
• Analyze residual plots for outliers or anomalies.
(iii) Forecasting:
• Check for normality.
• Extrapolate patterns based on dependence.
• Compare to baseline estimates.
Example 1
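Assume that the time series $y_t$ with mean $\mu$ is fitted by the following MA(2) model
$$y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}$$
where $\varepsilon_t$ is white noise with variance $\sigma^2$.
• 1-step forecast:
$$y_{n+1} = \mu + \varepsilon_{n+1} + \theta_1 \varepsilon_n + \theta_2 \varepsilon_{n-1}$$
$$\hat{y}_{n+1} = \mu + \theta_1 \varepsilon_n + \theta_2 \varepsilon_{n-1}$$
hence
$$y_{n+1} - \hat{y}_{n+1} = \varepsilon_{n+1}$$
$$\mathrm{var}(y_{n+1} - \hat{y}_{n+1}) = \sigma^2$$
• 2-step forecast:
$$y_{n+2} = \mu + \varepsilon_{n+2} + \theta_1 \varepsilon_{n+1} + \theta_2 \varepsilon_n$$
$$\hat{y}_{n+2} = \mu + \theta_2 \varepsilon_n$$
hence
$$y_{n+2} - \hat{y}_{n+2} = \varepsilon_{n+2} + \theta_1 \varepsilon_{n+1}$$
$$\mathrm{var}(y_{n+2} - \hat{y}_{n+2}) = \sigma^2 (1 + \theta_1^2)$$
• 3-step forecast: all shocks entering $y_{n+3}$ are unknown at time $n$, so $\hat{y}_{n+3} = \mu$, hence
$$y_{n+3} - \hat{y}_{n+3} = \varepsilon_{n+3} + \theta_1 \varepsilon_{n+2} + \theta_2 \varepsilon_{n+1}$$
$$\mathrm{var}(y_{n+3} - \hat{y}_{n+3}) = \sigma^2 (1 + \theta_1^2 + \theta_2^2)$$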
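The forecast-error variances in this example can be checked numerically. The sketch below uses a hypothetical helper `ma_forecast_var` and made-up parameter values. It relies only on the standard result that the h-step error of an MA(q) model is a sum of the first h shocks weighted by 1, θ1, θ2, and so on.

```r
# h-step forecast-error variance of an MA(q) model with shock variance sigma2:
# sigma2 * (1 + theta_1^2 + ... + theta_{h-1}^2), with at most q theta terms.
ma_forecast_var <- function(theta, sigma2, h) {
  psi2 <- c(1, theta^2)        # squared MA weights, with weight 1 on the newest shock
  k <- min(h, length(psi2))    # beyond q + 1 steps the variance stays flat
  sigma2 * sum(psi2[seq_len(k)])
}

theta  <- c(0.5, 0.3)   # hypothetical theta_1, theta_2
sigma2 <- 2             # hypothetical shock variance
v <- sapply(1:4, function(h) ma_forecast_var(theta, sigma2, h))
print(v)
```

For an MA(2) model the variance stops growing after three steps, because the forecast has already reverted to the mean µ.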
Example 2
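Consider the ARMA(2,1) model
$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1}$$
Future shocks are replaced by their expected value, zero.
• 1-step forecast:
$$\hat{y}_{n+1} = \delta + \phi_1 y_n + \phi_2 y_{n-1} + \underbrace{\hat{\varepsilon}_{n+1}}_{=0} + \theta_1 \hat{\varepsilon}_n$$
• 2-step forecast:
$$\hat{y}_{n+2} = \delta + \phi_1 \hat{y}_{n+1} + \phi_2 y_n + \underbrace{\hat{\varepsilon}_{n+2}}_{=0} + \theta_1 \underbrace{\hat{\varepsilon}_{n+1}}_{=0}$$
• 3-step forecast:
$$\hat{y}_{n+3} = \delta + \phi_1 \hat{y}_{n+2} + \phi_2 \hat{y}_{n+1} + \underbrace{\hat{\varepsilon}_{n+3}}_{=0} + \theta_1 \underbrace{\hat{\varepsilon}_{n+2}}_{=0}$$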
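The same recursion can be written as a short R loop. This is a sketch under stated assumptions: the function name `arma21_forecast` and all numerical values are hypothetical, and the parameters and the last residual are taken as already estimated.

```r
# Recursive forecasts for
#   y_t = delta + phi1*y_{t-1} + phi2*y_{t-2} + e_t + theta1*e_{t-1}.
# Future shocks are replaced by their expectation (zero), so the last
# estimated residual e_n enters only the one-step forecast.
arma21_forecast <- function(y_n, y_nm1, e_n, delta, phi1, phi2, theta1, h) {
  prev  <- c(y_n, y_nm1)   # (most recent value, value before it)
  shock <- e_n
  out <- numeric(h)
  for (i in seq_len(h)) {
    out[i] <- delta + phi1 * prev[1] + phi2 * prev[2] + theta1 * shock
    prev  <- c(out[i], prev[1])   # forecasts replace unobserved future values
    shock <- 0                    # no residual information beyond time n
  }
  out
}

fc <- arma21_forecast(y_n = 1, y_nm1 = 2, e_n = 0.5,
                      delta = 0.1, phi1 = 0.5, phi2 = 0.2, theta1 = 0.4, h = 3)
print(fc)
```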
3.3. Historical simulation for one asset
• Consider a portfolio of one stock. Historical simulation assumes that one of the
observations in the estimation window will be the next day's return.
• VaR is then one of the observations in the estimation window, multiplied by the monetary
value of the asset holding. The portfolio value $V_t$ is
$$V_t = \text{number of stocks owned} \times P_t$$
• VaR is then calculated as
$$\mathrm{VaR}_t = -\left[(T \times \alpha)\text{-th smallest return}\right] \times V_t$$
The following figures show the VaR of S&P 500 returns
at the 99% confidence level (α = 1%)
After sorting historical returns we obtain
Procedure historical simulation
To calculate VaR using historical simulation we do the following steps.
• Decide on a probability α
• Have the sample of returns y with length T , e.g., 1000
• Sort y from the smallest to the largest, and call the result ys
• Take the (T × α)-th smallest value of ys, denoted $ys_{T \times \alpha}$
• We have
$$\mathrm{VaR}_t = -ys_{T \times \alpha} \times P_{t-1} \times \text{number of stocks}$$
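• The expected shortfall is minus the average of the $T \times \alpha$ smallest returns, scaled in the
same way as VaR:
$$\mathrm{ES}_t = -\frac{1}{T \times \alpha} \sum_{i=1}^{T \times \alpha} ys_i \times P_{t-1} \times \text{number of stocks}$$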
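The procedure can be sketched on made-up data that needs no download. The helper `hs_var_es` and the return series are hypothetical. As an assumption of this sketch, both VaR and ES are reported as positive monetary losses (minus the return quantity, times the position value), and T × α is rounded up to an integer.

```r
# Historical-simulation VaR and ES for a single return series.
hs_var_es <- function(y, alpha, value) {
  ys <- sort(y)                       # smallest (worst) returns first
  op <- ceiling(alpha * length(ys))   # index of the (T x alpha)-th smallest return
  list(VaR = -ys[op] * value,         # loss at the alpha quantile
       ES  = -mean(ys[1:op]) * value) # average loss at or beyond that quantile
}

y <- (-50:49) / 1000   # 100 made-up returns from -5% to 4.9%
res <- hs_var_es(y, alpha = 0.05, value = 1000)
print(res)
```

ES is never smaller than VaR here, since it averages the VaR observation together with the returns that are worse than it.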
For multiple assets
If the portfolio holds multiple assets, we find VaR with the following steps.
• Take a matrix of historical returns on the assets in the portfolio
• Denote by y the T × K matrix of returns
• w is the K × 1 vector of portfolio weights
• Then we get the time series vector of portfolio returns
$$y_p = y \, w$$
• Then we can simply treat the portfolio as if it were a single asset and apply historical
simulation
Example
We download stock prices of Microsoft and IBM from the internet, then calculate the
VaR of Microsoft and of a portfolio of Microsoft and IBM.
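library("tseries")  # get.hist.quote()
library("zoo")      # coredata() for the zoo objects returned by get.hist.quote()
# Download adjusted closing prices
p1 = get.hist.quote(instrument = "msft", start = "2015-01-01",
                    end = "2020-12-31", quote = "AdjClose")
p2 = get.hist.quote(instrument = "ibm", start = "2015-01-01",
                    end = "2020-12-31", quote = "AdjClose")
# Daily log returns as plain numeric matrices
y1 = coredata(diff(log(p1)))
y2 = coredata(diff(log(p2)))
portfolio = 1000          # portfolio value
w = matrix(c(0.4, 0.6))   # portfolio weights (Microsoft, IBM)
alpha = 0.01              # VaR probability
T = length(y1)            # sample size (this masks R's shorthand T for TRUE)
op = ceiling(alpha*T)     # index of the (T x alpha)-th smallest return, rounded up
## VaR of Microsoft, scaled by the portfolio value like the portfolio VaR below
VaRM = -sort(y1)[op] * portfolio
## VaR of portfolio
y = cbind(y1, y2) %*% w   # T x 1 vector of portfolio returns
VaR_p = -sort(y)[op] * portfolio
## Expected shortfall of portfolio
ES = -mean(sort(y)[1:op]) * portfolio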
Importance of sample size and issue
• The most extreme observations fluctuate a lot more than less extreme ones.
Therefore, the larger the sample size, the more precise the HS estimate.
• If there is a structural break in the data (as in the 2008 crisis), the VaR forecast takes longer
to adjust to the structural change in risk.
• No model assumptions are needed.
• In the absence of structural breaks, HS tends to perform well.
• It captures nonlinear dependence directly, but performs badly when the data have structural
breaks.
3.4. Parametric methods
We need to calculate VaR and ES on day t conditional on
• a return density on day t, $f(\cdot)$, with distribution function $F(\cdot)$
• and a parameter vector θ, for example (µ, σ) for the normal distribution or (µ, σ, ν) for the
Student-t distribution
Recall that
$$P\left(V \le -\mathrm{VaR}^{\alpha}(V)\right) = \int_{-\infty}^{-\mathrm{VaR}^{\alpha}(V)} f(x)\,dx = \alpha$$
The profit and loss when we own one stock is
$$V_t = P_t - P_{t-1}$$
From the definition of the simple return,
$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}}$$
so that $V_t = R_t P_{t-1}$.
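So we have
$$\begin{aligned}
\alpha &= P\left(V_t \le -\mathrm{VaR}^{\alpha}(V_t)\right) \\
&= P\left(R_t P_{t-1} \le -\mathrm{VaR}^{\alpha}(V_t)\right) \\
&= P\left(\frac{R_t}{\sigma} \le -\frac{\mathrm{VaR}^{\alpha}(V_t)}{\sigma P_{t-1}}\right)
\end{aligned}$$
Denoting by $F_R$ the distribution function of $R_t/\sigma$, we have
$$\mathrm{VaR}^{\alpha}(V_t) = -\sigma F_R^{-1}(\alpha) P_{t-1} \qquad (*)$$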
If we use the log return
$$Y_t = \log\left(\frac{P_t}{P_{t-1}}\right)$$
we similarly obtain
$$\mathrm{VaR}^{\alpha}(V_t) = -\left(e^{F_Y^{-1}(\alpha)\sigma} - 1\right) P_{t-1} \qquad (**)$$
Note that
$$e^{F_Y^{-1}(\alpha)\sigma} - 1 \approx F_Y^{-1}(\alpha)\sigma$$
and the distribution function $F_Y \approx F_R$. Therefore, formula (**) can be
approximated by (*), meaning that the VaR for continuously compounded returns is
approximately the same as the VaR using simple returns.
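A quick numerical check of this approximation is sketched below. It assumes, for illustration only, that the standardized returns are standard normal, so that both inverse distribution functions are `qnorm`. The function names and the values of σ, α, and the previous price are hypothetical.

```r
# VaR from formula (*) (simple returns) and formula (**) (log returns),
# assuming standardized returns are N(0, 1) so that F^{-1} is qnorm().
var_simple <- function(sigma, alpha, p_prev) {
  -sigma * qnorm(alpha) * p_prev
}
var_log <- function(sigma, alpha, p_prev) {
  -(exp(qnorm(alpha) * sigma) - 1) * p_prev
}

sigma <- 0.01; alpha <- 0.01; p_prev <- 100   # made-up daily inputs
v_s <- var_simple(sigma, alpha, p_prev)
v_l <- var_log(sigma, alpha, p_prev)
print(c(simple = v_s, log = v_l, difference = v_s - v_l))
```

With daily volatility of about 1% the two values differ only in the second decimal. The gap widens as σ grows, since exp(x) − 1 ≈ x holds only for small x.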
3.5. VaR with time dependent mean and volatility
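Let $r_t$ denote the log-return of an asset or portfolio between periods $t - 1$ and $t$, and let
$\mathcal{F}_{t-1}$ denote the information generated by these returns up to time $t - 1$.
$$r_t = \mu_t + a_t$$
where
$$\mu_t = E(r_t \mid \mathcal{F}_{t-1})$$
and
$$\sigma_t^2 := \mathrm{var}(r_t \mid \mathcal{F}_{t-1}) = \mathrm{var}(a_t \mid \mathcal{F}_{t-1})$$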
Note that $\mu_t$ and $\sigma_t$ are known at time $t - 1$. The mean $\mu_t$ is assumed to follow a
stationary time series model such as ARMA(p, q):
$$\mu_t = \phi_0 + \sum_{i=1}^{p} \phi_i r_{t-i} + \sum_{j=1}^{q} \gamma_j a_{t-j}$$
The coefficients $\phi_i$ and $\gamma_j$ are estimated from the observations of $r_t$.
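Note that $a_t$ is the random component of the log-return. Its variance $\sigma_t^2$ is time-varying
and follows a GARCH($L_1$, $L_2$) model, i.e.,
$$a_t = \sigma_t \epsilon_t$$
and
$$\sigma_t^2 = \omega + \sum_{i=1}^{L_1} \alpha_i a_{t-i}^2 + \sum_{j=1}^{L_2} \beta_j \sigma_{t-j}^2$$
where $\omega > 0$, $\alpha_i \ge 0$, $\beta_j \ge 0$, and $\epsilon_t$ is an i.i.d. random variable with zero mean and variance 1,
usually $\epsilon_t \sim N(0, 1)$.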
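For the common GARCH(1,1) case the variance recursion can be sketched as below. The function `garch11_sigma2`, the shocks, the parameter values, and the starting variance are all made up for illustration. In practice the parameters would be estimated, for example by maximum likelihood.

```r
# Conditional variance path of a GARCH(1,1):
#   sigma2_t = omega + alpha * a_{t-1}^2 + beta * sigma2_{t-1}
# Returns sigma2_1, ..., sigma2_{n+1}; the last element is the
# one-step-ahead variance forecast given the n observed shocks.
garch11_sigma2 <- function(a, omega, alpha, beta, sigma2_1) {
  n <- length(a)
  s2 <- numeric(n + 1)
  s2[1] <- sigma2_1
  for (i in seq_len(n)) {
    s2[i + 1] <- omega + alpha * a[i]^2 + beta * s2[i]
  }
  s2
}

a  <- c(0.1, -0.2, 0.05)   # made-up shocks a_1, a_2, a_3
s2 <- garch11_sigma2(a, omega = 0.01, alpha = 0.1, beta = 0.8, sigma2_1 = 0.04)
print(s2)
print(sqrt(s2[length(s2)]))   # one-step-ahead volatility forecast
```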
Estimating Risk Measures
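Suppose that today is date t and we wish to estimate the portfolio VaR and ES over the
period t to t + 1. We use the ARMA and GARCH models to forecast the value of $r_{t+1}$ as
follows
$$r_{t+1} = \hat{\mu}_{t+1} + \hat{\sigma}_{t+1} \epsilon_{t+1}$$
where $\hat{\mu}_{t+1}$ and $\hat{\sigma}_{t+1}$ are the estimates of the next period's mean log-return and
volatility. So we have
$$\widehat{\mathrm{VaR}}^{\alpha}_{t+1}(r_{t+1}) = -\hat{\mu}_{t+1} + \hat{\sigma}_{t+1} \mathrm{VaR}^{\alpha}(\epsilon_{t+1})$$
and the expected shortfall
$$\widehat{\mathrm{ES}}^{\alpha}_{t+1}(r_{t+1}) = -\hat{\mu}_{t+1} + \hat{\sigma}_{t+1} \mathrm{ES}^{\alpha}(\epsilon_{t+1})$$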
Special case
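If the residuals $\epsilon_t \sim N(0, 1)$ and the initial value of the portfolio is $P_0$, we have
$$\mathrm{VaR}^{\alpha}(\epsilon_{t+1}) = -\Phi^{-1}(\alpha)$$
and
$$\mathrm{ES}^{\alpha}(\epsilon_{t+1}) = \frac{\phi(\Phi^{-1}(\alpha))}{\alpha}$$
where $\phi$ and $\Phi$ are the density and distribution functions of the standard normal
distribution. So we obtain
$$\widehat{\mathrm{VaR}}^{\alpha}_{t+1}(r_{t+1}) = -\left(\hat{\mu}_{t+1} + \hat{\sigma}_{t+1} \Phi^{-1}(\alpha)\right) P_0$$
and
$$\widehat{\mathrm{ES}}^{\alpha}_{t+1}(r_{t+1}) = \left(-\hat{\mu}_{t+1} + \hat{\sigma}_{t+1} \frac{\phi(\Phi^{-1}(\alpha))}{\alpha}\right) P_0$$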
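The normal special case translates directly into base R, where `qnorm` is the standard normal inverse distribution function and `dnorm` its density. The sketch below uses a hypothetical helper `normal_var_es` and made-up forecasts of the mean and volatility. The last lines check the closed-form ES of the standardized residual against numerical integration of the tail.

```r
# One-step VaR and ES under conditionally normal returns,
# given forecasts mu (mean) and sigma (volatility) and position value p0.
normal_var_es <- function(mu, sigma, alpha, p0) {
  q <- qnorm(alpha)   # Phi^{-1}(alpha)
  list(VaR = -(mu + sigma * q) * p0,
       ES  = (-mu + sigma * dnorm(q) / alpha) * p0)
}

# Made-up forecasts: 0.05% mean, 1.2% volatility, position of 1000
res <- normal_var_es(mu = 0.0005, sigma = 0.012, alpha = 0.01, p0 = 1000)
print(res)

# Check: ES of a standard normal equals -(1/alpha) * integral of x * phi(x)
# over the lower tail up to Phi^{-1}(alpha).
tail_int  <- integrate(function(x) x * dnorm(x), -Inf, qnorm(0.01))$value
es_closed <- dnorm(qnorm(0.01)) / 0.01
print(c(numeric = -tail_int / 0.01, closed_form = es_closed))
```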
Time aggregation of VaR with mean
• If returns are i.i.d., then the mean and the variance aggregate at the same rate
• The mean and variance over T days are equal to T times the mean and variance over one day
• The T-period VaR is therefore
$$\begin{aligned}
\mathrm{VaR}^{\alpha}(T\ \text{day}) &= -\sigma(T\ \text{day}) F^{-1}(\alpha) - \mu(T\ \text{day}) \\
&= -\sqrt{T}\, \sigma(1\ \text{day}) F^{-1}(\alpha) - T \mu(1\ \text{day})
\end{aligned}$$
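A small sketch of this scaling rule follows, assuming for illustration that F is the standard normal distribution. The function `var_T_day` and the input values are hypothetical, and the result is expressed in return units rather than money.

```r
# T-day VaR (in return units) under i.i.d. returns:
# volatility scales with sqrt(horizon), the mean scales with horizon.
var_T_day <- function(mu_1d, sigma_1d, alpha, horizon) {
  -sqrt(horizon) * sigma_1d * qnorm(alpha) - horizon * mu_1d
}

v1  <- var_T_day(mu_1d = 0.001, sigma_1d = 0.01, alpha = 0.01, horizon = 1)
v10 <- var_T_day(mu_1d = 0.001, sigma_1d = 0.01, alpha = 0.01, horizon = 10)
print(c(one_day = v1, ten_day = v10, sqrt_rule = sqrt(10) * v1))
```

With a zero mean the familiar square-root-of-time rule holds exactly. With a positive mean the T-day VaR is smaller than the square root of T times the one-day VaR, because the mean grows faster than the volatility.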