Expected Shortfall
Daniel Isaksson
Acknowledgements
I would like to thank my colleagues at SAS Institute for all the supportive and
encouraging conversations. In particular I would like to thank my supervisor
Jimmy Skoglund for introducing me to the topic of robust portfolio optimiza-
tion, for always being a helping hand and a great source of inspiration. I
would also like to thank my supervisor Professor Henrik Hult at the Royal
Institute of Technology for valuable support throughout this thesis project.
Contents

1 Introduction
1.1 Introduction to Portfolio Optimization
1.1.1 Markowitz Mean-Variance Optimization Problem
1.1.2 Reference Portfolio and Benchmark Solution
1.1.3 Stylized Assumptions in Markowitz Optimization Problem
3.2.1 Statistical Uncertainty in the Portfolio Optimization Problem
Chapter 1
Introduction
The goal of this thesis project is to introduce and solve an alternative, robust portfolio optimization problem to Markowitz' classical mean-variance optimization problem. Classical robust optimization focuses on parameter uncertainty for a given distribution; the contribution of this thesis project is to extend the robust optimization to also include uncertainty in the log-return distribution. The alternative portfolio optimization problem is solved with both elliptical and asymmetric log-return distributions, applied to a reference portfolio consisting of stocks and a bond index from the Swedish market.
The thesis project is organized as follows. In Chapter 1, I introduce gen-
eral concepts of portfolio optimization and give a brief historical background
on the development of modern portfolio optimization. A reference portfo-
lio is then constructed and the traditional portfolio optimization problem by
Markowitz [12] is solved to obtain a benchmark solution from historical data.
The chapter concludes with arguments on why the Markowitz mean-variance optimization problem is of limited practical use in financial mathematics due to its narrow area of application.
In Chapter 2 I introduce risk measures and, based on a risk measure commonly used in financial risk management called Expected Shortfall, I formulate a problem better suited for modern portfolio optimization. With theorems provided by Rockafellar and Uryasev [15], the portfolio optimization problem is approximated as a convex linear program that can be solved with standard optimization algorithms. I then show that the Markowitz mean-variance optimization problem is a special case of the portfolio optimization problem with Expected Shortfall, connected through a risk aversion coefficient.
Chapter 3 deals with optimization under uncertainty and presents the concept of robust optimization, which is central when solving optimization problems with uncertain parameters. My contribution to the literature on robust portfolio optimization is that I first perform a case study under different elliptical distributions and then study robust portfolio optimization
with an asymmetric hybrid generalized Pareto-Empirical-Generalized Pareto
distribution. To analyze the statistical uncertainty in the results, the boot-
strap procedure is applied to calculate standard errors for the holdings and
the Expected Shortfall.
In Chapter 4 I analyze the results obtained throughout the thesis project
and compare them to the benchmark Markowitz solution. I analyze the prop-
erties of the optimization problem and conclude in which areas it is particularly applicable and advantageous over the Markowitz mean-variance optimization problem. The thesis ends with remarks on areas
where further investigation can be done.
of an asset, S_t being the asset price at time t, are assumed to have the Markov property from day to day and are often assumed to be weakly dependent and close to independent and identically distributed. Therefore, by letting V_1 be a function of the assets' log-returns, V_1 = f(R_1) for some multivariate function f, one may construct approximately independent copies of V_1 as {f(R_{-n+1}), ..., f(R_0)}. Hence, investors try to predict the future portfolio value by analyzing historical log-returns. An alternative approach, which is equally good, is to work directly with the log-returns and maximize the expected future portfolio log-return, since this is equivalent to maximizing the expected future portfolio value. This approach is used in this thesis project.
Furthermore, since the amount of initial capital V0 is only a scaling factor,
it is common practice to set V0 = 1 so that the solution is expressed as
proportions of the total capital invested rather than expressed as monetary
units.
1.1.1 Markowitz Mean-Variance Optimization Problem
Modern portfolio optimization was first introduced by Markowitz [12] in
1952 with what is often referred to as Markowitz mean-variance optimiza-
tion problem. The idea is to maximize the expected return, subject to the
constraint that the variance of the portfolio must be smaller than some pre-
determined tolerance level T . Assume an investor has the possibility to invest
in n assets and let R = (R1 , . . . , Rn ) be a random vector of log-returns at
time t = 1 corresponding to each of the assets as defined by (1.1). The
covariance matrix of returns for the n assets is further given by Σ, which is assumed to be symmetric and positive definite and thus also invertible. Let w = (w_1, ..., w_n) be a vector of holdings, or weights, of the initial
capital invested in each asset. If the future portfolio log-return is denoted
by X = wT R, then Markowitz mean-variance optimization problem can be
stated
\begin{aligned}
\max_{w} \quad & E[X] \\
\text{Subject to} \quad & w^T \Sigma w \le T, \\
& \sum_{i=1}^{n} w_i = V_0.
\end{aligned}
Depending on the value of T the optimal solution will differ and plotting the
expected future portfolio log-return versus its standard deviation for different
values of T gives what is known as the efficient frontier.
Markowitz mean-variance optimization problem can be stated in various
ways and an alternative formulation is the mean-variance trade-off formula-
tion
\begin{aligned}
\max_{w} \quad & w^T \mu - \frac{c}{2V_0}\, w^T \Sigma w \\
\text{Subject to} \quad & \sum_{i=1}^{n} w_i = V_0
\end{aligned}
\qquad (1.2)
where µ is the mean log-return vector and the trade-off parameter c > 0
is a dimensionless constant that should be interpreted as a risk aversion
coefficient. The trade-off problem is convex and has analytical solution
w = \frac{V_0}{c}\, \Sigma^{-1}\!\left( \mu - \frac{\max\{\mathbf{1}^T \Sigma^{-1} \mu - c,\, 0\}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}\, \mathbf{1} \right).
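As an illustration, the closed-form expression above can be evaluated directly. The following is a minimal Python sketch, assuming the estimated mean vector and covariance matrix are available as numpy arrays (the computations in this thesis project were carried out in Matlab).

import numpy as np

def tradeoff_weights(mu, Sigma, c, V0=1.0):
    # Evaluate the closed-form solution of the trade-off problem (1.2) stated above.
    ones = np.ones(len(mu))
    Sigma_inv = np.linalg.inv(Sigma)
    lam = max(ones @ Sigma_inv @ mu - c, 0.0) / (ones @ Sigma_inv @ ones)
    return (V0 / c) * Sigma_inv @ (mu - lam * ones)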
allowed. The optimization problem then becomes
\begin{aligned}
\max_{w} \quad & w^T \mu - \frac{c}{2V_0}\, w^T \Sigma w \\
\text{Subject to} \quad & \sum_{i=1}^{n} w_i = V_0, \\
& w_i \ge 0, \quad i = 1, \ldots, n.
\end{aligned}
\qquad (1.3)
The answer depends on the investor. In this thesis project the reference portfolio consists of linear assets, namely stocks and a bond index from the Swedish market. The assets included in the reference portfolio are listed in Table 1.1 and are chosen to reflect some of the largest companies on the Swedish market, diversified over a large range of business areas. The idea is to include assets of both high and low volatility in different business sectors that are more and less dependent on the current state of the economy. An investor
should then have good control of deciding what risk level he is willing to be
exposed to, depending on how the portfolio weights are chosen. The OMRX
Total Bond Index, henceforth abbreviated the Total Bond Index, is included
in the reference portfolio to have a position which can be considered close to
riskless with small expected log-return.
The historical time period used for collecting asset price data is chosen
to be January 2, 2007 until January 22, 2016. The time period includes the
global financial crisis, with its peak approximately between 2007 and 2009 according to the time line in [8], where risky assets typically are more correlated, and some years afterwards where the market starts to rise again and the assets are less correlated.
Table 1.1: Swedish assets included in the reference portfolio and their corre-
sponding business areas.
Asset name Business Area
AstraZeneca Health Care
Ericsson A Technology
Hennes & Mauritz B Retail
ICA Gruppen Retail
Nordea Bank Banks
SAS Travel & Leisure
SSAB A Basic Resources
Swedish Match Personal & Household Goods
TeliaSonera Telecommunications
Volvo Industrial Goods & Services
Total Bond Index Government bonds
The historical data is obtained from the Nasdaq OMX web page and
consists of daily prices for each of the assets in the reference portfolio¹.
Figure 1.1 depicts the historical price developments for the stocks to the left
and the Total Bond Index to the right.
Figure 1.1: Price development for the stocks (left) and the bond index (right) in the reference portfolio during January 2, 2007 – January 22, 2016.
¹Seven data points for the OMRX Total Bond Index were missing in the historical data set and were linearly interpolated. The data is automatically adjusted for asset splits.

From Figure 1.1 one can make several observations, for instance that some companies managed the financial crisis better than others, such as
Nordea Bank, and that some assets have historically had a positive trend,
such as Swedish Match, while other assets have had a negative trend, such as
SAS. There are furthermore some assets that have changed very little in value
during the time period of interest, for instance TeliaSonera, while other assets
have been more volatile, for instance ICA Gruppen. All these behaviors were
desired in the construction of the reference portfolio. Ultimately, which assets to include in a portfolio is up to the investor, and the reference portfolio in this thesis project could have been selected differently. The general conclusions of the thesis project do not, however, directly depend on the reference portfolio itself, other than that the weight vector and the Expected Shortfall would change if other assets were used.
The optimization problem (1.3) is a quadratic programming problem and can be solved with standard solving algorithms. The solution when applied to the reference portfolio is presented in Table 1.2, with risk aversion coefficient c = 5.33 (it will become clear later why this particular value was chosen), V_0 = 1 and empirically estimated parameters \hat{\mu} and \hat{\Sigma}. The solution was calculated using the function quadprog in Optimization Toolbox in Matlab version 9.0 on a Core i5 CPU 2.60 GHz laptop with 8 GB of RAM using the interior point method. See Appendix A for numerical values of the empirically estimated parameters \hat{\mu} and \hat{\Sigma}.
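For readers without Matlab, a rough Python equivalent of the quadratic program (1.3) can be set up with scipy; the sketch below assumes \hat{\mu} and \hat{\Sigma} are given as numpy arrays and is not the implementation used in this thesis project.

import numpy as np
from scipy.optimize import minimize

def markowitz_long_only(mu, Sigma, c=5.33, V0=1.0):
    # Solve problem (1.3): maximize w'mu - c/(2*V0) * w'Sigma*w
    # subject to sum(w) = V0 and w >= 0.
    n = len(mu)
    objective = lambda w: -(w @ mu - c / (2.0 * V0) * w @ Sigma @ w)
    constraints = ({'type': 'eq', 'fun': lambda w: np.sum(w) - V0},)
    bounds = [(0.0, None)] * n
    w0 = np.full(n, V0 / n)  # start from the equally weighted portfolio
    res = minimize(objective, w0, method='SLSQP', bounds=bounds, constraints=constraints)
    return res.x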
As can be seen from Table 1.2, most assets in the portfolio will not be
invested in and about 78% of the initial capital is invested in the Total
Bond Index, about 14% is invested in Swedish Match, approximately 7%
is invested in ICA Gruppen and the remaining 1% is invested in Hennes &
Mauritz B. With the benchmark solution wBM , the expected daily future
portfolio log-return is

E[w_{BM}^T R] = w_{BM}^T \hat{\mu} = 2.0242 \cdot 10^{-4} \qquad (1.4)

corresponding to a yearly expected portfolio log-return of 251 \cdot w_{BM}^T \hat{\mu} \approx 5.08\% (there are about 251 trading days in a year on the Swedish market), and the daily portfolio variance is

\sigma_p^2 = w_{BM}^T \hat{\Sigma}\, w_{BM} = 7.8354 \cdot 10^{-6}. \qquad (1.5)
Chapter 2
and later specify two risk measures widely used in financial risk management.
Generally, let ρ(X) be a function measuring the risk of a stochastic vari-
able X. Different risk measures have different properties and below is a list
of such mathematical properties that are considered to be useful or desirable,
together with brief explanations on how they can be interpreted.
The reader is referred to the book [5, Ch. 6] by Hult, Lindskog, Hammarlid
and Rehn for a more thorough presentation on general risk measure theory
and more comments on the above properties, as well as more on why variance
is considered a bad risk measure in finance.
A risk measure with the properties translation invariance and monotonic-
ity is said to be a monetary measure of risk, and a risk measure considered to
replace variance in Markowitz mean-variance optimization problem should
satisfy at least these two properties. A risk measure that in addition to
translation invariance and monotonicity also satisfies convexity is said to be
a convex risk measure. The convex risk measure family is thus a subset of
the monetary risk measure family. Finally, a third risk measure family is the
coherent risk measure, where ρ(X) satisfies the properties translation invari-
ance, monotonicity, positive homogeneity and subadditivity. It is easy to see
that a risk measure satisfying positive homogeneity also satisfies normaliza-
tion by setting λ = 0. It can further be shown that positive homogeneity and convexity together imply subadditivity, but not the reverse. Hence, a
coherent risk measure is also a convex risk measure but the opposite does not
generally hold, so the coherent risk measure family is a subset of the convex
risk measure family and thus also a subset of the monetary risk measure fam-
ily. When choosing an appropriate risk measure for a portfolio optimization
problem replacing Markowitz mean-variance optimization problem one may
therefore consider both convex and coherent risk measures to be at least as
good as monetary risk measures. Below, two risk measures commonly used in financial risk management, and considered well suited for financial mathematics, are presented.
2.1.1 Value-at-Risk
The first risk measure presented is the Value-at-Risk, abbreviated VaR.
Value-at-Risk satisfies translation invariance, monotonicity and positive ho-
mogeneity and is hence a monetary risk measure. The Value-at-Risk at level
p ∈ (0, 1) of a stochastic variable X is defined as
2.1.2 Expected Shortfall
A better risk measure than Value-at-Risk in the sense that it takes into
account all losses located in the whole (1 − p)-level quantile tail of the loss
distribution is Expected Shortfall. For a stochastic variable X, the Expected
Shortfall, abbreviated ES, at level p ∈ (0, 1) is defined as
ES_p(X) = \frac{1}{p} \int_0^p \mathrm{VaR}_u(X)\, du = \frac{1}{1-q} \int_q^1 \mathrm{VaR}_u(L)\, du, \qquad (2.1)

which can equivalently be written as

ES_p(X) = \frac{1}{p} \int_0^p F_L^{-1}(1-u)\, du.
This is a natural decomposition since the sum of all contributions is exactly the total risk. Expected Shortfall is (positively) homogeneous of degree 1 and a decomposition of the total portfolio Expected Shortfall is

ES_p(X) = \sum_{i=1}^{n} U_i \frac{\partial ES_p(X)}{\partial U_i}

with w_j being the asset weight and F_j(S(t)) the value of financial instrument j. The derivative \partial ES_p(X)/\partial U_i should then be interpreted as the Euler Allocation for asset i.
Skoglund and Chen show in [18] that if the log-returns are assumed to
have multivariate normal distribution, the Euler Allocations can be calcu-
lated analytically as
\frac{\partial ES_p(X)}{\partial U} = \lambda\!\left(\Phi^{-1}(q)\right) \frac{\partial \sigma_p}{\partial U}. \qquad (2.3)

Here \lambda(x) is the hazard function for the normal distribution, defined as

\lambda(x) = \frac{\phi(x)}{1 - \Phi(x)}, \qquad (2.4)

and

\frac{\partial \sigma_p}{\partial U} = (U^T \Sigma U)^{-1/2}\, \Sigma U.
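A small numerical sketch of (2.3)-(2.4) in Python, assuming the position vector U and the covariance matrix Σ are given as numpy arrays, could look as follows.

import numpy as np
from scipy.stats import norm

def normal_euler_allocations(U, Sigma, q=0.99):
    # Euler Allocations under multivariate normal log-returns, following (2.3)-(2.4).
    hazard = lambda x: norm.pdf(x) / (1.0 - norm.cdf(x))   # lambda(x) in (2.4)
    dsigma_dU = Sigma @ U / np.sqrt(U @ Sigma @ U)          # (U' Sigma U)^(-1/2) Sigma U
    return hazard(norm.ppf(q)) * dsigma_dU                  # (2.3)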
If the log-returns are modeled by their empirical distribution, the deriva-
tive ∂ESp (X)/∂U does not exist since the portfolio distribution is discrete.
In this case, we may use the results of Tasche [21] and approximate the derivative as

D_{ES_p(X)}(w_i) = \frac{\sum_{d=1}^{D} L_{i,d}\, \mathbf{1}(L_d > L_s)}{\sum_{d=1}^{D} \mathbf{1}(L_d > L_s)}. \qquad (2.5)
In this thesis project the investment horizon is one day and the risk-free
interest rate can be approximated to zero, which implies that Li = −Ri ,
i.e. the log-returns with switched sign. With linear assets, the portfolio loss scenarios are then obtained by multiplying the weight vector with the loss vector, L = w^T L = -w^T R. Furthermore, by ordering the portfolio losses such that L_1 > L_2 > \cdots > L_D, the portfolio Value-at-Risk at level p, defined as the p-level loss quantile, is
Proposition 2.2 For any random variable X with continuous and strictly increasing distribution function F_X, it holds that F_{-X}^{-1}(p) = -F_X^{-1}(1-p) for all p ∈ (0, 1).
I refer the reader to Hult et al.[5, pp. 170-172] for complete proofs.
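To make the scenario-based quantities above concrete, the following is a hedged Python sketch of the empirical portfolio Value-at-Risk, Expected Shortfall and Tasche-style contributions in the spirit of (2.5); R is assumed to be a D × n matrix of historical or simulated log-returns, and the indexing conventions are those of this sketch rather than the thesis.

import numpy as np

def empirical_var_es(R, w, p=0.01):
    # Portfolio loss scenarios L = -w'R, sorted so that L_1 > L_2 > ... > L_D.
    asset_losses = -R                               # shape (D, n)
    L = asset_losses @ w
    order = np.argsort(L)[::-1]
    k = max(int(np.ceil(p * len(L))), 1)            # number of tail scenarios
    tail = order[:k]
    var_p = L[order[k - 1]]                         # empirical p-level loss quantile
    es_p = L[tail].mean()                           # average loss in the tail
    contributions = w * asset_losses[tail].mean(axis=0)   # sum approximately to es_p
    return var_p, es_p, contributions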
a reasonable initial constraint would be that the entire initial capital V0
must be invested in the market. Additionally, it is common to constrain the
asset allocations to long positions, i.e. wi ≥ 0, i = 1, . . . , n. The portfolio
optimization problem can then be stated mathematically as
The above problem formulation will be the foundation of this thesis project. Next I remark on a special case when (2.7) has the same solution as the benchmark solution to Markowitz mean-variance optimization problem, and then I state a few general model assumptions that are used when later solving (2.7) numerically. In Section 2.3 I show that the portfolio optimization problem can be approximated and solved by ordinary linear programming.
In Section 2.1.3 it was said that a natural decomposition of the total portfolio risk is obtained by calculating the Euler Allocations, since then the risk contributions sum up to exactly the total risk. This property can, however, be achieved for any risk decomposition method by simply normalizing. Tasche proves in [20] that the Euler decomposition is the only risk decomposition method that is consistent with (local) portfolio optimization, which motivates the use of the Euler decomposition in this thesis project.
In portfolio optimization the Euler Allocations have a second field of
application in addition to measuring the risk contribution from a specific
asset to the total portfolio risk. The famous Sharpe ratio, introduced by
Sharpe [17], relates the expected portfolio (log)-return to the risk as
S = \frac{w^T \mu}{\rho} \qquad (2.8)
and the larger the Sharpe ratio, the better the investment. Similarly to the decomposition of the portfolio risk, a natural decomposition of the Sharpe ratio is to define
S_i^* = \frac{w_i \mu_i}{w_i \frac{\partial \rho}{\partial w_i}}. \qquad (2.9)
The following remark points out a special case where portfolio optimization
problem (2.7) can be simplified to a modified version of Markowitz mean-
variance problem (1.3). This is important since the solution to (2.7) should
in this case be the same as the benchmark solution in Table 1.2.
Consider a scenario where the vector of log-returns R is assumed to be
elliptically distributed with mean µ and covariance matrix Σ. Then R can
be represented as
R \overset{d}{=} \mu + W A Z
where I have used both Proposition 2.1 and Proposition 2.2. Now, since
minimizing −f (w) corresponds to maximizing f (w), portfolio optimization
problem (2.7) with elliptically distributed log-returns can be written
\begin{aligned}
\max_{w} \quad & w^T \mu - \frac{c}{2V_0} \sqrt{w^T \Sigma w} \\
\text{Subject to} \quad & w^T \mu \ge \theta, \\
& \sum_{i=1}^{n} w_i = V_0, \\
& w_i \ge 0, \quad i = 1, \ldots, n,
\end{aligned}

where

c = -\frac{2V_0}{p} \int_0^p F_{W Z_1}^{-1}(u)\, du. \qquad (2.11)
Note the similarity between the above special case and Markowitz mean-
variance optimization problem (1.3). Optimization problem (2.7) can in the
special case considered hence be rewritten as an optimization problem with
solution equivalent to that of the standard quadratic optimization problem.
Assumption 1 The investor’s initial capital V0 = 1. This was already assumed when
calculating the benchmark solution and is used to simplify the analysis of the solutions, since the asset weights w_i can be seen as proportions of the whole initial capital rather than monetary amounts. The
solution is then easily interpreted for arbitrary initial capital V0 .
Assumption 2 The portfolio lives for one day. This means that the goal is to invest
optimally for one day ahead, which is convenient since the historical
data consists of daily log-returns calculated using (1.1).
Assumption 3 p = 0.01 (q = 0.99). Banks often use the level p = 0.005, but other common levels are p = 0.01 and p = 0.05. The smaller the value of p, the more unlikely it is to be exposed to a loss larger than the calculated Expected Shortfall. p = 0.01 means that with a portfolio life of one day, one would expect the loss to be larger than the Expected Shortfall in 1 out of every 100 days. Plugging p = 0.01 into (2.11), with Z_1 standard normally distributed and W = 1, yields

c = -\frac{2}{0.01} \int_0^{0.01} \Phi^{-1}(u)\, du = \frac{2\, \phi(\Phi^{-1}(0.01))}{0.01} = 5.33, \qquad (2.12)
where φ(x) denotes the probability density function of the standard
normal distribution. This explains why the risk aversion coefficient
c = 5.33 was used earlier in the benchmark solution.
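The value c = 5.33 in (2.12) is easy to verify numerically. The following Python sketch computes both the closed form and a Monte Carlo approximation of the integral in (2.11) with W = 1 and Z_1 standard normal.

import numpy as np
from scipy.stats import norm

p, V0 = 0.01, 1.0
c_closed = 2.0 * norm.pdf(norm.ppf(p)) / p             # closed form in (2.12)
u = np.random.uniform(0.0, p, size=1_000_000)           # U uniform on (0, p)
c_monte_carlo = -2.0 * V0 * norm.ppf(u).mean()          # -(2 V0 / p) * integral_0^p Phi^{-1}(u) du
print(round(c_closed, 2), round(c_monte_carlo, 2))      # both approximately 5.33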
ES_{rel} = \frac{ES}{\text{Portfolio market value}}.
With Assumption 1 the portfolio market value is 1, implying that in this thesis project the relative Expected Shortfall equals the Expected Shortfall calculated when solving the portfolio optimization problem (2.7), which simplifies the analysis.
where

[x]^+ = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}
and f (R) is the probability density function of the log-return vector R. The
theorems regarding the characterization are central in this thesis project and
are stated below.
Theorem 2.3 As a function of \alpha, H_p(w, \alpha) is convex and continuously differentiable. The ES_p of the loss associated with any w \in \mathcal{W} can be determined from the formula

ES_p(X) = \min_{\alpha \in \mathbb{R}} H_p(w, \alpha). \qquad (2.14)

In this formula, the set consisting of the values \alpha for which the minimum is attained, namely

A_p(w) = \arg\min_{\alpha \in \mathbb{R}} H_p(w, \alpha),

is a nonempty and closed bounded interval (perhaps reducing to a single point), and the VaR_p of the loss is given by

VaR_p(X) = \text{left endpoint of } A_p(w). \qquad (2.15)

In particular, one always has

VaR_p(X) \in \arg\min_{\alpha \in \mathbb{R}} H_p(w, \alpha), \qquad ES_p(X) = H_p(w, VaR_p(X)).
The theorem relies on the assumption that the cumulative loss distribution
function FL is continuous.
Proof. H_p(w, \alpha) is convex from its definition (2.13) and has derivative

\frac{\partial}{\partial \alpha} H_p(w, \alpha) = 1 + \frac{1}{1-q}\left(F_L(w, \alpha) - 1\right) = \frac{1}{1-q}\left(F_L(w, \alpha) - q\right).

Therefore, the values of \alpha that minimize H_p(w, \alpha), i.e. the set A_p(w), are those for which F_L(w, \alpha) = q. These values form a non-empty and closed interval since F_L(w, \alpha) is continuous and nondecreasing with limits 0 as \alpha \to -\infty and 1 as \alpha \to \infty. This proves (2.15). Furthermore,

\min_{\alpha \in \mathbb{R}} H_p(w, \alpha) = H_p(w, \alpha_q(w)) = \alpha_q(w) + \frac{1}{1-q} \int_{R \in \mathbb{R}^m} \left[L(w, R) - \alpha_q(w)\right]^+ f(R)\, dR,
Theorem 2.4 Minimizing the ESp of the loss associated with all w ∈ W
is equivalent to minimizing Hp (w, α) over all (w, α) ∈ W × R, in the sense
that
\min_{w \in \mathcal{W}} ES_p(X) = \min_{(w, \alpha) \in \mathcal{W} \times \mathbb{R}} H_p(w, \alpha),
where, moreover, a pair (w∗ , α∗ ) achieves the second minimum if and only
if w∗ achieves the first minimum and α∗ ∈ Ap (w∗ ). In particular, therefore,
in circumstances where the interval Ap (w∗ ) reduces to a single point (as is
typical), the minimization of H_p(w, \alpha) over (w, \alpha) \in \mathcal{W} \times \mathbb{R} produces a pair
(w∗ , α∗ ), not necessarily unique, such that w∗ minimizes ESp and α∗ gives
the corresponding VaRp .
Furthermore, Hp (w, α) is convex with respect to (w, α), and ESp is con-
vex with respect to w, when L(w, R) is convex with respect to w, in which
case, if the constraints are such that W is convex, the joint minimization is
an instance of convex programming.
Proof. The initial claims in the first part follow directly from Theorem 2.3 and from realizing that the minimization of H_p(w, \alpha) with respect to (w, \alpha) \in \mathcal{W} \times \mathbb{R} can be carried out by first minimizing over \alpha \in \mathbb{R} for fixed w and then minimizing over w \in \mathcal{W}.
Proving the claim in the second part starts with the observation that H_p(w, \alpha) is convex with respect to (w, \alpha) when [L(w, R) - \alpha]^+ is convex. Since the composition of a convex nondecreasing function with a convex function is convex, this is true when L(w, R) is convex with respect to w. The convexity of ES_p follows from
the fact that minimizing an extended real-valued convex function of two
variables with respect to one of these variables results in a convex function
of the remaining variable or by recalling that Expected Shortfall is a coherent
risk measure and thus satisfies the convexity property.
From Theorem 2.3 and 2.4 it follows that instead of minimizing Expected
Shortfall directly through its definition (2.1) one can equivalently minimize
Hp (w, α) defined by (2.13). With convex loss function L(w, R) this is partic-
ularly nice since then the optimization problem becomes a convex program.
By sampling D samples from the probability density function f(R) of the return vector, Rockafellar and Uryasev then argue that the integral in (2.13) can be approximated by the sum

H_p(w, \alpha) \approx \tilde{H}_p(w, \alpha) = \alpha + \frac{1}{(1-q)D} \sum_{d=1}^{D} \left[L(w, R_d) - \alpha\right]^+
problem formulation (2.7) can be approximated as

\begin{aligned}
\min_{w,\alpha} \quad & \alpha + \frac{1}{(1-q)D} \sum_{d=1}^{D} z_d \\
\text{Subject to} \quad & w^T \mu \ge \theta, \\
& z_d \ge 0, \quad d = 1, \ldots, D, \\
& -L(w, R_d) + \alpha + z_d \ge 0, \quad d = 1, \ldots, D, \\
& \sum_{i=1}^{n} w_i = V_0, \\
& w_i \ge 0, \quad i = 1, \ldots, n,
\end{aligned}
\qquad (2.16)
where n is as always the number of assets available in the reference portfolio.
The above problem formulation is a standard result in portfolio optimization
with Expected Shortfall and is for instance presented by Skoglund and Chen
in [19, pp. 156-157]. When L(w, R) is linear the problem is a linear program that can be solved with standardized numerical algorithms such as the Simplex or interior point method [3, Ch. 5,10]. This is for instance the case for the reference portfolio used in this thesis project, where L(w, R) = -w^T R is linear.
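Since the loss function is linear for the reference portfolio, problem (2.16) can be assembled as a standard linear program. The sketch below uses scipy.optimize.linprog and assumes R is a D × n numpy array of simulated log-return scenarios; it is an illustration, not the Matlab implementation used in this thesis project.

import numpy as np
from scipy.optimize import linprog

def solve_es_portfolio(R, mu, theta, q=0.99, V0=1.0):
    # Assemble and solve the linear program (2.16) with L(w, R_d) = -w^T R_d.
    D, n = R.shape
    # Decision vector x = [w_1..w_n, alpha, z_1..z_D].
    c = np.concatenate([np.zeros(n), [1.0], np.full(D, 1.0 / ((1.0 - q) * D))])
    # Inequalities A_ub @ x <= b_ub:
    #   -mu^T w <= -theta              (expected log-return constraint)
    #   -R_d^T w - alpha - z_d <= 0    (loss-excess constraints)
    A_ub = np.zeros((1 + D, n + 1 + D))
    A_ub[0, :n] = -mu
    A_ub[1:, :n] = -R
    A_ub[1:, n] = -1.0
    A_ub[1:, n + 1:] = -np.eye(D)
    b_ub = np.concatenate([[-theta], np.zeros(D)])
    A_eq = np.zeros((1, n + 1 + D))    # budget constraint sum(w) = V0
    A_eq[0, :n] = 1.0
    bounds = [(0.0, None)] * n + [(None, None)] + [(0.0, None)] * D
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[V0],
                  bounds=bounds, method="highs")
    return res.x[:n], res.x[n], res.fun   # weights, VaR estimate, Expected Shortfall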
Table 2.1: Solution to (2.16) with log-returns simulated from the multivariate normal distribution with empirically estimated parameters \hat{\mu} and \hat{\Sigma}.
Asset name Weight
AstraZeneca 0
Ericsson A 0
Hennes & Mauritz B 0.0066
ICA Gruppen 0.0693
Nordea Bank 0
SAS 0
SSAB A 0
Swedish Match 0.1422
TeliaSonera 0
Volvo 0
Total Bond Index 0.7819
Expected Shortfall 0.0072
where \Phi^{-1}(x) is the normal quantile function, and portfolio variance to Expected Shortfall as

ES_p(X) = \sigma_p\, \lambda\!\left(\frac{VaR_p(X)}{\sigma_p}\right). \qquad (2.18)
where λ(x) is again the hazard function for the normal distribution defined in
(2.4). Hence, by recalling the portfolio variance for the benchmark solution
calculated in (1.5), the benchmark Value-at-Risk becomes

VaR_{0.01}(w_{BM}^T R) = 0.0065
special case of multivariate normal distributed log-returns, portfolio opti-
mization problem (2.7) is indeed identical to Markowitz mean-variance opti-
mization problem (1.3). Furthermore, since the two optimization problems
are connected only by the risk aversion coefficient c, calculated as (2.11)
for all elliptical distributions, it is clear that Markowitz mean-variance opti-
mization problem is a special case of portfolio optimization with Expected
Shortfall for all elliptical distributions. Since portfolio optimization with Expected Shortfall does not rely on the stylized assumptions that Markowitz mean-variance optimization problem requires, (2.7) is a more general problem formulation that can be applied to more general investment situations, for instance with non-linear assets included or with the log-returns modeled by asymmetric distributions. In this thesis project I focus on the modeling of asset log-returns with non-elliptical distributions.
Chapter 3
3.1 Model Uncertainty
Markowitz mean-variance optimization problem assumes the expected log-
return vector µ and covariance matrix Σ to be known and the same holds
for the portfolio optimization problem (2.7). However in real applications
the parameters are often estimated from historical market data, as in this
thesis project. Since there is only a limited amount of historical data avail-
able on the market, the expected log-return vector has to be estimated by
the empirical mean vector and similarly the covariance matrix must also be
approximated. One cannot be certain that the approximated parameters
equal the true market parameters, and hence there is model uncertainty in the parameters when solving the portfolio optimization problem, yielding uncertainty in the optimal solution.
A long established problem in portfolio optimization is referred to as the
problem of error maximization, discussed by Scherer in [16, pp. 185-186].
The problem originates from the fact that the optimization algorithm tends to select
assets with the best properties, in this case high log-return and low variance
and correlation, and not select assets with the worst properties. These are
the assets where estimation errors in µ and Σ are likely to be largest, with
strong dependence on outliers in the data. Hence the optimal solution will
have strong dependence on parameter uncertainty, where positive estimation
error leads to over-weighted assets and negative estimation error leads to
under-weighted assets.
In addition to parameter uncertainty, the multivariate distribution of the
empirical log-returns can generally only be approximated, leading to uncer-
tainty in distribution in the model. Different distributions are more or less successful in modeling historical data: some are better at modeling the tails of the empirical distribution, while others are better at capturing the behavior in the center of the empirical distribution. A perfect distribution
fit on the entire empirical distribution is generally not possible to find and
hence uncertainty in distribution must be regarded as a relevant factor in
the model.
In the 1990s, two approaches were developed to tackle model uncertainty.
One approach is called robust statistics, which involves removing or down-weighting what are thought to be outliers in the empirical data set. The
second approach is the concept of robust optimization, which will be consid-
ered in this thesis project. Robust optimization can intuitively be thought of
as attempting to optimize the worst-case scenario given a confidence region.
Traditionally, parameter uncertainty in Markowitz mean-variance optimiza-
tion problem has received a lot of attention from the robust optimization
community but less has been said about distribution uncertainty. This the-
sis project contributes to the robust optimization community by studying
worst-case scenario based robust optimization of portfolio optimization prob-
lem (2.16) under different distribution models. First I will perform a case
study on robust optimization under elliptical distributions and then move
on to study robust optimization under asymmetric log-return distributions.
A third possibility for model uncertainty is that the model itself is wrong.
For instance, there might exist liquidity risk in assets included in the portfolio which is not covered by the model, invalidating the translation invariance property of the risk measure, or it could be that the covariance matrix is dependent on time and the state of the economy. These types of model uncertainties
can be harder to evaluate directly without changing the entire problem for-
mulation and will not be considered in this thesis project.
\begin{aligned}
\min_{w} \quad & f(w; X, a) \\
\text{Subject to} \quad & g_k(w; X, b) \le g_{k,0}, \quad k = 1, \ldots, m,
\end{aligned}

\begin{aligned}
\min_{w} \quad & f(w; X, a) \\
\text{Subject to} \quad & g_k(w; X, b) \le g_{k,0}, \quad k = 1, \ldots, m, \\
& a \in \mathcal{A}, \quad b \in \mathcal{B}, \quad F_X(x) \in \mathcal{F}.
\end{aligned}
are convex functions and \mathcal{A}, \mathcal{B}, \mathcal{F} are convex sets, the robust optimization problem is particularly nice since any local optimum found is also the global optimal solution.
There are different approaches on how to solve a robust optimization
problem, and one of the most commonly used approaches in portfolio opti-
mization is the worst-case scenario based robust optimization approach in-
troduced by Tütüncü and König in [22]. They argue it is a good idea to solve
the robust optimization problem by first finding the worst-case scenarios for
the parameters given the uncertainty sets and then considering the resulting optimization problem with these worst-case scenario parameters, solving it with ordinary optimization algorithms. For instance, if the worst-case
scenario is attained for the smallest a in the uncertainty set A and for the
largest b in the uncertainty set B then the worst-case scenario based robust
optimization problem would be
\min_{w,\, a \in \mathcal{A}} f(w; X, a)
under the additional constraint that X is drawn from the worst-case distribu-
tion, measured in some way, that is included in the distribution uncertainty
set F.
Since the worst-case scenario based robust optimization approach is widely
used in robust portfolio optimization problems, it is used in this thesis project
as well. Section 4.2 discusses in greater detail the effect of interpreting robust optimization as finding the worst-case scenario and then optimizing the resulting worst-case scenario optimization problem.
\begin{aligned}
\min_{w} \max_{\Sigma \in \mathcal{S}} \quad & w^T \Sigma w \\
\text{Subject to} \quad & w^T \mu \ge R_{\min}, \\
& \sum_{i=1}^{n} w_i = 1, \\
& w_i \ge w_{\min}
\end{aligned}
and aims at finding the portfolio weights that minimize the total portfolio
variance with the worst-case covariance matrix. The authors assume that
the investor is uncertain about the covariance matrix and has several possible candidates. The robust problem is discussed with box and ellipsoidal
constraints on the covariance matrix, mathematically as
\begin{aligned}
\min_{w} \max_{\Sigma \in \mathcal{S}} \quad & w^T \Sigma w \\
\text{Subject to} \quad & \min_{\mu \in \mathcal{M}} w^T \mu \ge R_{\min}, \\
& \sum_{i=1}^{n} w_i = 1, \\
& w_i \ge w_{\min}
\end{aligned}
and the objective is to find the portfolio weights that minimize the risk given
worst-case scenarios in both expected log-return and covariance.
The worst-case scenario based robust version of problem (2.16) is now
easy to formulate in a similar manner. Let the covariance matrix \hat{\Sigma} and the mean log-return vector \hat{\mu} be estimates from historical data of the uncertain parameters \Sigma and \mu, and assume we know that the true parameters are somewhere in the uncertainty sets \mathcal{M}, \mathcal{S}. The robust version of portfolio
optimization problem (2.16) with linear loss function is then given by

\begin{aligned}
\min_{w,\alpha} \quad & \alpha + \frac{1}{(1-q)D} \sum_{d=1}^{D} z_d \\
\text{Subject to} \quad & \min_{\mu \in \mathcal{M}} w^T \mu \ge \theta, \\
& z_d \ge 0, \quad d = 1, \ldots, D, \\
& \min_{\mu \in \mathcal{M}} \max_{\Sigma \in \mathcal{S}}\; w^T R_d + \alpha + z_d \ge 0, \quad d = 1, \ldots, D, \\
& \sum_{i=1}^{n} w_i = V_0, \\
& w_i \ge 0, \quad i = 1, \ldots, n
\end{aligned}
\qquad (3.1)
development for the assets in the reference portfolio and locate a time period
where the assets seem the most correlated. Typically this occurs during times
of financial crisis. Looking at the asset price development in Figure 1.1, it seems as if the assets in the reference portfolio are the most correlated during 2007-2009, where most asset prices have a negative trend, which fits well with the time line for the financial crisis presented in [8]. Therefore, the worst-case scenario covariance matrix originating from a box uncertainty set is defined on the basis of the financial crisis and is taken as the covariance matrix estimated from historical data from the first trading day of 2007 until the last trading day of 2009, i.e.

\Sigma_{\max}^{(box)} = \hat{\Sigma} \text{ estimated between January 2, 2007 and December 30, 2009.} \qquad (3.3)
The two last constraints are posed to ensure that \Sigma_{\max} is not too far from \hat{\Sigma} and so that all variances are positive. In this thesis project the second
moment of the Wishart distribution, Q, is estimated by simulating 10,000 covariance matrices from the Wishart distribution and then calculating the covariance matrix for all pairwise elements in the simulated covariance ma-
trices. The solution to (3.5) might not be positive semi-definite which is a
requirement for the portfolio optimization problem to be solvable. In that
case Rebonato and Jäckel present in [14] a general methodology for finding the closest symmetric and positive semi-definite matrix to a given non positive semi-definite matrix, which can be used to obtain a feasible covariance matrix. The
method involves spectral decomposition of the matrix and setting negative
eigenvalues to zero. Throughout the remaining part of the thesis project I refer to the worst-case parameters \mu_{\min}^{(ellipsoidal)} and \Sigma_{\max}^{(ellipsoidal)} given by (3.4) and (3.5) respectively as "ellipsoidal uncertainty parameters". See Appendix A for numerical values.
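A simplified Python sketch of the spectral-decomposition repair described above (ignoring the rescaling step of the full Rebonato-Jäckel method) could look as follows.

import numpy as np

def nearest_psd(Sigma):
    # Symmetrize and zero out negative eigenvalues to obtain a positive semi-definite matrix.
    S = (Sigma + Sigma.T) / 2.0
    eigval, eigvec = np.linalg.eigh(S)      # spectral decomposition
    eigval = np.clip(eigval, 0.0, None)     # set negative eigenvalues to zero
    return eigvec @ np.diag(eigval) @ eigvec.T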
Regarding the distribution uncertainty set F, the thesis project first con-
siders elliptical distributions with the historically popular normal distribu-
tion and Student’s t distribution with different degrees of freedom, i.e.
1, \ldots, D drawn from the multivariate N(\mu_{\min}, \Sigma_{\max}) distribution. The left column in Table 3.1 presents the solution with D = 15,000 simulated log-return vector samples and box uncertainty parameters. The right column of Table 3.1 presents the solution with ellipsoidal uncertainty parameters.
The last row in each column corresponds to the Expected Shortfall when
investing according to the corresponding column. Since the robust portfolio
optimization problem is still convex and linear, the problem can be solved by
the same algorithm as in Section 2.4 for the non-robust problem formulation.
is also interesting to study larger degrees of freedom to observe if the robust solution converges to the robust solution with normal distributed log-returns. Therefore I solve (3.1) with ν ∈ {2.1, 3.58, 10, 20} in this
thesis project. With box uncertainty parameters, the solutions for varying
degrees of freedom are presented in Table 3.2. With ellipsoidal uncertainty
parameters the corresponding solutions are presented in Table 3.3.
Similarly to the case with normal distributed log-returns, to avoid intro-
ducing statistical uncertainty in the solutions, it is possible to solve Markowitz
mean-variance optimization problem (1.3) with the same worst-case parame-
ters and risk aversion coefficient c calculated using (2.11) with the Student’s
t quantile function. Those solutions are presented in Appendix C.2 as refer-
ence solutions to validate the accuracy of the simulated solutions.
yielding poor estimates with large variation. If instead uh is too small then
more observations can be used to estimate the parameters but modeling
the excesses with the generalized Pareto distribution becomes questionable.
One approach to estimate the parameters is to pick some uh1 far out in the
empirical distribution, say the 90% quantile of the empirical distribution,
and estimate the parameters with Maximum Likelihood estimation. Then a
new threshold uh2 is chosen farther out in the empirical distribution, say the
91% quantile, and the parameters are estimated once again with Maximum
Likelihood estimation. The procedure is repeated and the threshold uh is
chosen as the first candidate for which the parameters to the generalized
Pareto distribution are stable from that point on. This method relies on the
assumption that the parameters will converge before we are not too far out
in the distribution tail. With data originating from the historical log-returns
in this thesis project the method could not be used because the parameters
did not converge until very far out in the distribution tail. Instead, the threshold was chosen as the 90% empirical quantile, i.e. u_h = F_n^{-1}(0.90), for each asset in the reference portfolio. (Skoglund and Nyström show in [13] that Maximum Likelihood estimation of generalized Pareto distribution parameters is not sensitive to the choice of threshold value.) In Section 4.3 I mention another strategy that could be used to choose better values of u_h.
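The threshold-stability procedure described above can be sketched in Python with scipy's generalized Pareto distribution. The function below is an illustration, not the code used in this thesis project; it fits the excesses for a range of candidate thresholds so that the stability of the estimated parameters can be inspected.

import numpy as np
from scipy.stats import genpareto

def gpd_fits_over_thresholds(x, quantiles=np.arange(0.90, 0.99, 0.01)):
    # Fit a generalized Pareto distribution to the excesses above a range of thresholds.
    x = np.asarray(x, dtype=float)
    fits = []
    for p in quantiles:
        u = np.quantile(x, p)
        excesses = x[x > u] - u
        shape, loc, scale = genpareto.fit(excesses, floc=0.0)   # ML fit, location fixed at 0
        fits.append((p, u, shape, scale))
    return fits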
Note that the generalized Pareto distribution models excesses above a
high threshold uh which means that we are looking at the upper tail of the
log-return distribution, but Expected Shortfall only depends on the lower
tail of the log-return distribution. Luckily, if the lower tail of the log-return distribution consists solely of negative values below some negative threshold u_l, one may simply look at the absolute values: the "excess" of an observation in the lower tail is then the positive distance from the observation to u_l, and one can model the lower tail of the log-return distribution with a generalized Pareto distribution as well. Levine makes use of this strategy in
[9] when modeling the lower tail of the total monthly return distribution for
the A-rated 7- to 10-year corporate bond component of Citi’s U.S. Broad In-
vestment Grade Bond Index from January 1980 to August 2008. Similarly to
the upper threshold uh , the lower thresholds are chosen as the 10% empirical
quantile, i.e. ul = Fn−1 (0.10), for each asset in the reference portfolio.
Since generalized Pareto distributions can only be used to model the tails
of a distribution a third distribution must be used as a bridge to connect the
two tails with the center of the distribution. In this thesis project, this is
done with the empirical distribution.
A mathematical expression for the entire underlying distribution will
be derived by considering two scenarios and then combining them. Consider
first the scenario where the excesses above some threshold uh of a distribution
consisting of independent and identically distributed random variables are modeled with a generalized Pareto distribution G^h_{\gamma,\beta}(x - u_h). For x \ge u_h
the cumulative distribution function of the underlying distribution is given
by
P(X \le x) = P(X \le u_h) + P(u_h \le X \le x) = P(X \le u_h) + \left(1 - P(X \le u_h)\right) G^h_{\gamma,\beta}(x - u_h).

By combining the two scenarios, the cumulative distribution function for the entire underlying distribution is given by

F(x) = \begin{cases} F_n(u_l)\left(1 - G^l_{\gamma,\beta}(u_l - x)\right), & x \le u_l \\ F_n(x), & u_l < x < u_h \\ F_n(u_h) + \left(1 - F_n(u_h)\right) G^h_{\gamma,\beta}(x - u_h), & x \ge u_h, \end{cases} \qquad (3.7)
where the shape and scale parameters γ and β may differ between the two generalized Pareto distribution functions G^l_{\gamma,\beta}(x) and G^h_{\gamma,\beta}(x). The cumula-
tive distribution function (3.7) is asymmetric and will be referred to in this
thesis project as the hybrid GPD-Empirical-GPD distribution.
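A direct Python transcription of (3.7), assuming the two generalized Pareto fits are given as (shape, scale) pairs and the centre is handled by the empirical distribution of the sample, could look as follows; it is a sketch under the notation above.

import numpy as np
from scipy.stats import genpareto

def hybrid_cdf(x, sample, u_l, u_h, lower_fit, upper_fit):
    # Evaluate the hybrid GPD-Empirical-GPD distribution function (3.7).
    # lower_fit and upper_fit are (shape, scale) pairs fitted to the excesses
    # below u_l (in absolute value) and above u_h, respectively.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    F = np.array([np.mean(sample <= t) for t in x])          # empirical CDF F_n in the centre
    Fn_ul, Fn_uh = np.mean(sample <= u_l), np.mean(sample <= u_h)
    lo, hi = x <= u_l, x >= u_h
    F[lo] = Fn_ul * (1.0 - genpareto.cdf(u_l - x[lo], lower_fit[0], scale=lower_fit[1]))
    F[hi] = Fn_uh + (1.0 - Fn_uh) * genpareto.cdf(x[hi] - u_h, upper_fit[0], scale=upper_fit[1])
    return F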
between the univariate random variables is inherited. A copula is constructed
by combining probability and quantile transforms. If X is a random variable
with continuous distribution function F then the probability transform states
that F (X) has uniform distribution on (0, 1), i.e. F (X) is U (0, 1). If U is a
random variable with uniform distribution on (0, 1) and G is some distribu-
tion function then the quantile transform states that G−1 (U ) has distribution
function G, i.e. P (G−1 (U ) ≤ x) = G(x). Using the probability transform it
follows that for a random vector U = (U1 , . . . , Un ) whose components have
uniform distribution on (0, 1) a random vector X = (X1 , . . . , Xn ) whose
components have marginal distributions F1 , . . . , Fn can be constructed as
X = (F_1^{-1}(U_1), \ldots, F_n^{-1}(U_n)). \qquad (3.8)
Now, to describe the dependence between the components of X one may
instead describe the dependence between the components of U and use (3.8).
The distribution function C(u1 , . . . , un ) is called a copula and using the
quantile transform it follows that
C(F_1(x_1), \ldots, F_n(x_n)) = P(U_1 \le F_1(x_1), \ldots, U_n \le F_n(x_n))
= P(F_1^{-1}(U_1) \le x_1, \ldots, F_n^{-1}(U_n) \le x_n)
= F(x_1, \ldots, x_n)
where F is the joint multivariate distribution of X.
Since the components of U are transformed to components in X using
the marginal distributions of X it is important to realize that the measure of
dependence between the random variables must be invariant to this transfor-
mation. Ordinary linear correlation is not invariant to this transformation, which makes it necessary to find a measure that is. One measure that has
the desired property is the rank correlation Kendall’s tau. Kendall’s tau is
defined on the random vector (X1 , X2 ) as
\tau(X_1, X_2) = P\left((X_1 - X_1')(X_2 - X_2') > 0\right) - P\left((X_1 - X_1')(X_2 - X_2') < 0\right)
distributed random variables. Hence, to model the tails of the empirical dis-
tributions to each asset in the reference portfolio with generalized Pareto
distribution it is of great importance that the data has the right properties.
When initially defining log-returns in (1.1) it was mentioned that log-returns
are often assumed to be weakly dependent and close to independent and
identically distributed. In financial time series analysis of risk factors (here
the log-returns), stylized facts are effects that are commonly observed in the
data. Josefsson summarizes the stylized facts in [7] as
The stylized facts are problematic since they imply that the empirical log-
returns cannot be considered independent and identically distributed and
hence the excesses in the distribution tails cannot be modeled directly with
generalized Pareto distributions. Some sample preparation has to be done,
typically by filtering the data through a time series model. A popular model
used for filtering financial log-return data is the Generalized Autoregressive
Conditional Heteroscedasticity, GARCH, model.
The general GARCH(p,q) process is defined in the following way. Let
{Zt } be independent and identically distributed N (0, 1). Then {Xt } is called
a GARCH(p,q) process if
X_t = \mu + \sigma_t Z_t, \qquad t \in \mathbb{Z}
which fits stylized fact 2. Also, the volatility is dependent on time and on
previous volatilities, which fits stylized facts 4 and 6. This is an intuitive
motivation for why GARCH models often fit historical log-return time series well. Furthermore, by applying Hölder's inequality when calculating the kurtosis of \sigma_t Z_t,

k(\sigma_t Z_t) = \frac{E[(\sigma_t Z_t)^4]}{E[(\sigma_t Z_t)^2]^2} = \frac{E[\sigma_t^4]}{E[\sigma_t^2]^2}\, k(Z_t) \ge k(Z_t),
it is seen that it is greater than the kurtosis of Zt . Hence, even though
the residuals are assumed to be normal distributed, the GARCH(p,q) model
takes into account that log-returns often have fatter tails than the normal
distribution. This fits well with stylized fact 5.
There are methods for fitting general GARCH(p,q) models to time series data, which require us to find p, q, \alpha_0, \ldots, \alpha_p, \beta_1, \ldots, \beta_q. In this thesis project I settle for fitting GARCH(1,1) models to the historical log-returns, which makes the process easier. GARCH(1,1) models are used since they are often good enough to filter financial data, and the parameters can for instance be estimated by Maximum Likelihood.
When the parameters have been estimated the standardized residuals
{Zt } are obtained as
Z_t = \frac{R_t - \hat{\mu}}{\hat{\sigma}_t}
and if the GARCH filtration is successful then the stylized facts should be ac-
counted for, meaning that the standardized residuals should be independent
and identically distributed.
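For illustration, once the GARCH(1,1) parameters \alpha_0, \alpha_1, \beta_1 and \mu have been estimated, the filtering step can be sketched in Python as below; this is a simplified sketch and the parameter estimation itself is not shown.

import numpy as np

def garch11_filter(r, mu, alpha0, alpha1, beta1):
    # Run the GARCH(1,1) recursion sigma_t^2 = alpha0 + alpha1*eps_{t-1}^2 + beta1*sigma_{t-1}^2
    # and return the standardized residuals Z_t = (R_t - mu) / sigma_t.
    # Assumes alpha1 + beta1 < 1 so that the unconditional variance exists.
    eps = np.asarray(r, dtype=float) - mu
    sigma2 = np.empty_like(eps)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)   # start from the unconditional variance
    for t in range(1, len(eps)):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return eps / np.sqrt(sigma2), np.sqrt(sigma2)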
In Appendix D I analyze the standardized residuals of the filtered empirical log-returns and conclude that the distribution tails are asymmetric and
heavier than for the standard normal distribution assumed in the definition
of the GARCH process. In practice this is not a problem since the standard-
ized residuals can be modeled by any suitable distribution. This motivates
the idea of modeling the tails with generalized Pareto distributions.
procedure and if a Student’s t copula were to be chosen instead, it would
outperform the normal copula model only by a small margin. In Appendix
D, Table D.1, I present the parameters for the GARCH(1,1) models used for
filtering the historical log-returns to standardized residuals. With the help of Figure D.1 I argue, on the basis of the stylized facts for financial time series, why the GARCH(1,1) models have filtered the empirical log-return data well into standardized independent and identically distributed residuals. Secondly I
use Figure D.2 to describe why the standardized residuals are not well mod-
eled with the standard normal distribution which motivates the use of hybrid
GPD-Empirical-GPD distributions instead. In Table D.2 I present values of
estimated thresholds and parameters for the generalized Pareto distributions
when fitted to the standardized residuals and in Figure D.3 the estimated
generalized Pareto distribution functions are plotted to show that the models
approximate the true data well.
In Section 3.1.2.1 and Section 3.1.2.2 it was easy to simulate log-returns
by generating random variables from the multivariate normal and Student’s t
distribution. With hybrid GPD-Empirical-GPD marginal distributions and
dependence between the standardized residuals given by a normal copula, the simulation of log-returns is not as straightforward. Below I provide an algorithm that can be used to simulate log-returns in this case.
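A rough Python sketch of one simulation step is given below; the helper hybrid_ppf (one quantile function per asset for the standardized residuals) and the conversion from Kendall's tau to linear correlation via ρ = sin(πτ/2) for elliptical copulas are assumptions of this sketch, and sigma_next holds the GARCH(1,1) one-day volatility forecasts.

import numpy as np
from scipy.stats import norm

def simulate_log_return(C_tau, sigma_next, mu, hybrid_ppf, rng):
    # Simulate one day-ahead log-return vector from a normal copula with hybrid
    # GPD-Empirical-GPD marginals for the standardized residuals.
    rho = np.sin(np.pi * np.asarray(C_tau) / 2.0)                 # Kendall's tau -> linear correlation
    y = np.linalg.cholesky(rho) @ rng.standard_normal(len(mu))    # correlated standard normals
    u = norm.cdf(y)                                                # copula uniforms
    z = np.array([hybrid_ppf[i](u[i]) for i in range(len(mu))])   # standardized residuals
    return np.asarray(mu) + np.asarray(sigma_next) * z             # simulated log-returns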
above algorithm is repeated 15,000 times to obtain the appropriate sample size D. Note that log-returns with worst-case scenario parameters are just as easy to generate, by simply replacing C_\tau with a corresponding worst-case scenario rank correlation matrix in step 1 and replacing the forecasted standard deviation \sigma_{t+1}^i and the empirically estimated \mu_i with the corresponding worst-case scenario forecasted standard deviation and worst-case scenario expected log-return in step 4. To solve the robust portfolio optimization problem
(3.1) the simulated log-returns can then be directly inserted in the same
optimization solving algorithm as when solving with elliptically distributed
log-returns since the optimization problem itself is unchanged. Table 3.4
presents the optimal weight vector and portfolio Expected Shortfall when
solving (3.1) with both box and ellipsoidal uncertainty parameters.
Table 3.4: Solutions to (3.1) applied to the reference portfolio with log-
returns modeled by a normal copula with hybrid GPD-Empirical-GPD
marginals. Left column with box uncertainty parameters and right column
with ellipsoidal uncertainty parameters.
Box uncertainty Ellipsoidal uncertainty
Asset name Weight Weight
AstraZeneca 0 0
Ericsson A 0 0
Hennes & Mauritz B 0.0123 0.0037
ICA Gruppen 0.1855 0
Nordea Bank 0 0
SAS 0 0
SSAB A 0 0
Swedish Match 0.3201 0.0080
TeliaSonera 0 0
Volvo 0 0
Total Bond Index 0.4822 0.9884
Expected Shortfall 0.0233 0.0056
popular strategy to evaluate the sensitivity of a solution in the presence of
statistical uncertainty is the bootstrap strategy, discussed in [6] by James,
Witten, Hastie and Tibshirani.
The idea behind bootstrapping is to estimate the accuracy of some esti-
mated quantity κ when it is not possible or practical to create new samples
from the original population. This could for instance be the case when the original distribution is unknown, or when generating new data samples is too expensive, too time consuming or in other ways impossible. With the reference portfolio in this thesis project it is impossible to
obtain more historical daily log-return data for the time period of interest and
simulating log-returns from the normal copula with hybrid GPD-Empirical-
GPD marginal distributions is quite time consuming. Therefore the boot-
strap procedure is appropriate to use to estimate standard errors for the
portfolio weights. In the (non-parametric) bootstrap procedure one obtains
new distinct artificial data sets by repeatedly sampling with replacement
from the original data set. That is, from the original data set X we repeatedly draw and replace elements to construct new data sets \tilde{X}^{(1)}, \ldots, \tilde{X}^{(B)}, for some large B, where in each artificial data set \tilde{X}^{(b)} a specific element in X may occur multiple times or not at all. From each artificial data set the desired quantity is calculated, so that we have B samples \kappa^{(1)}, \ldots, \kappa^{(B)} originating from the bootstrapped data sets. An estimate of the standard error can then
be calculated as
SE_B(\kappa) = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( \kappa^{(b)} - \frac{1}{B} \sum_{b'=1}^{B} \kappa^{(b')} \right)^2 }. \qquad (3.10)
Small standard error implies that the original data set X is large enough to
produce a precise solution.
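A generic Python sketch of this procedure, where statistic stands for the mapping from a resampled data set to the quantity of interest (for instance the optimal weight vector from (3.1)), might look as follows.

import numpy as np

def bootstrap_se(data, statistic, B=1000, seed=None):
    # Non-parametric bootstrap estimate of the standard error of `statistic`, following (3.10).
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    reps = np.array([statistic(data[rng.integers(0, n, size=n)]) for _ in range(B)])
    return reps.std(axis=0, ddof=1)   # ddof=1 matches the 1/(B-1) normalization in (3.10)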
In this application, I let B = 1000. With D = 15,000 simulated log-returns in the original data set \bar{R}, this means that I should draw with replacement 15,000 samples of log-return vectors, calculate the optimal solution to (3.1) and repeat the procedure 1000 times. The standard errors for each asset weight w_i can then be calculated with (3.10). Table 3.5 summa-
rizes each of the standard errors when conducting the described procedure
for both box and ellipsoidal uncertainty parameters so that standard errors
are estimated for every weight and the Expected Shortfall in Table 3.4.
Table 3.5: Standard errors calculated with the bootstrap procedure when
solving (3.1) with original log-returns simulated from the normal copula with
hybrid GPD-Empirical-GPD marginal distributions. Left column with box
uncertainty parameters. Right column with ellipsoidal uncertainty parame-
ters.
Box uncertainty Ellipsoidal uncertainty
Asset name Standard error Standard error
AstraZeneca 0 0.0019
Ericsson A 0 0
Hennes & Mauritz B 0.0148 0.0016
ICA Gruppen 0.0111 0.0024
Nordea Bank 0 0.0004
SAS 0 0
SSAB A 0 0
Swedish Match 0.0078 0.0037
TeliaSonera 0 0.0001
Volvo 0 0.0001
Total Bond Index 0.0092 0.0018
Expected Shortfall 6.6754 · 10−4 1.7177 · 10−4
Chapter 4
In this chapter the results found in the previous chapter are analyzed. I also comment on the interpretation of robust optimization as worst-case scenario based optimization. The thesis project ends with comments on alternative approaches that could have been taken and on areas of further investigation.
often have restrictions on the portfolio weights in different asset classes and
industries.
small variations in parameters can have huge impact on the optimal solution
and the risk exposure.
With µmin and Σmax held fixed, the optimal robust solutions seem to be
almost independent of the underlying log-return distribution. This can for
instance be seen by comparing the left (right) column of Table 3.1 with Table
3.2 (3.3) or equivalently by comparing the reference solutions in Appendix C.
The reference solutions suggest that the optimal solution is almost indepen-
dent of the multivariate elliptical distribution used for modeling log-returns.
The multivariate distribution model only has significant impact on the Expected Shortfall, where Student's t distributed log-returns yield larger risk than normal distributed log-returns. These two observations should come as no surprise. The constraint w^T \mu \ge \theta depends only on the mean vector and not on the full log-return distribution, which makes the optimal weight vector insensitive to the choice of distribution. The risk, on the contrary, is directly dependent on the distri-
bution model, since fatter tails imply a greater probability of encountering
extremely negative (and positive) log-returns which increases the Expected
Shortfall. The Student’s t distribution with low degrees of freedom have fat-
ter tails than the normal distribution and therefore Student’s t distributed
log-returns imply larger Expected Shortfall. Additionally, as the degrees of
freedom increases the Expected Shortfall should converge to the Expected
Shortfall for normal distributed log-returns. This behavior is verified by solv-
ing (3.1) with Student’s t distributed log-returns and plot Expected Shortfall
as function of the degrees of freedom. Figure 4.1 depicts this study with a
reference level of Expected Shortfall for normal distributed log-returns in-
cluded to observe the convergence of Expected Shortfall.
[Figure 4.1: Expected Shortfall as a function of the degrees of freedom for Student's t distributed log-returns, with the normal distribution level shown as a reference.]
Until now the threshold for acceptable expected daily log-return has been
held constant to θ = 2.0242 · 10−4 (5.08% expected yearly log-return). Fig-
ure 4.2 presents robust solutions to (3.1) with box uncertainty parameters
and both multivariate normal and Student’s t distributed log-returns for
varying θ. As can be seen, the solutions are almost identical with small dif-
ferences originating from statistical uncertainty and small differences in the
risk aversion coefficient. Hence the choice of distribution has little impact
on the optimal weights regardless of the level of θ. As long as θ is feasible¹, the optimal solution is affected very little by different elliptical distributions and portfolio optimization with Expected Shortfall is insensitive to which elliptical distribution model is used.
Figure 4.2: Left: Optimal robust solutions to (3.1) with box uncertainty parameters and multivariate normal distributed log-returns as a function of θ. Right: Corresponding solutions with multivariate Student's t distributed log-returns and 2.1 degrees of freedom.
¹If θ is too large then the constraint w^T µ ≥ θ cannot be satisfied and the portfolio optimization problem has no solution.
Figure 4.3: Expected Shortfall for problem (3.1) for different elliptically distributed log-returns as a function of the threshold for expected portfolio log-return θ.
Two interesting observations can be made. The first is that, given a log-return distribution, Expected Shortfall increases marginally when the threshold for acceptable expected yearly portfolio log-return is less than 3%, and afterwards it increases more rapidly. This behavior should be specific to the reference portfolio used in this thesis project, and for another reference portfolio the behavior could differ. Looking at Figure 4.2 one notices that an expected yearly portfolio log-return of 3% seems to be a breakpoint in the optimal weight vector as well. It is stable and almost unchanged for expected yearly portfolio log-returns less than 3% but then changes continuously for larger values. This explains the behavior of Expected Shortfall. I conclude
that for the specific reference portfolio used in this thesis project the investor
can increase the acceptable expected yearly portfolio log-return up to 3% in
the robust portfolio optimization problem with box uncertainty parameters
without increasing the Expected Shortfall since the optimal asset allocations
are unchanged. With acceptable expected yearly portfolio log-return greater
than 3% the investor must accept that the Expected Shortfall will increase
more rapidly. The second interesting observation from Figure (4.3) is that
there seems to be some structure in the ratio ES Student’s t /ES Normal and the
figure looks like a folding fan. Since the optimal weight vector is almost un-
changed between different elliptical distributions, the dependence between
ES Student’s t and ES Normal for a given threshold θ must be possible to explain
from how Expected Shortfall is calculated analytically for elliptical distribu-
tions. Recalling that the portfolio log-return under elliptical distributions
can be written as (2.10), the portfolio Expected Shortfall is

ES_p(w^T R) = -w^T \mu + \sqrt{w^T \Sigma w} \, ES_p(W Z_1),    (4.1)
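Since the optimal weight vector is essentially unchanged across the elliptical models, the curves in Figure 4.3 differ mainly through the scalar factor ES_p(W Z_1) in (4.1). The following is a minimal sketch of how that factor can be evaluated, assuming the standard closed-form expressions for the Expected Shortfall of the standard normal and standard Student's t distributions; these closed forms are not restated in the thesis text above and are used here only for illustration.

```python
import numpy as np
from scipy.stats import norm, t

def es_std_normal(p):
    """ES_p of a standard normal loss variable (p = tail probability, e.g. 0.01)."""
    q = norm.ppf(1.0 - p)
    return norm.pdf(q) / p

def es_std_t(p, nu):
    """ES_p of a standard Student's t loss variable with nu > 1 degrees of freedom."""
    q = t.ppf(1.0 - p, nu)
    return t.pdf(q, nu) / p * (nu + q ** 2) / (nu - 1.0)

p = 0.01
for nu in (2.1, 3.58, 10, 20):
    ratio = es_std_t(p, nu) / es_std_normal(p)
    print(f"nu = {nu:5.2f}: ES_t / ES_normal = {ratio:.3f}")
```

For p = 0.01 the ratio grows as ν decreases, which is consistent with the folding-fan pattern seen in Figure 4.3.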
ing historical log-return tails, and therefore the Expected Shortfall calculated with the multivariate normal distribution will be smaller than the true Expected Shortfall. Modeling the tails with generalized Pareto distributions addresses this problem: the tails are modeled with fatter tails, which results in a larger Expected Shortfall. In this sense, the Expected Shortfall in Table 3.4 should be thought of as closer to reality than the Expected Shortfall in Table 3.1.
It is not only the Expected Shortfall that changes when simulating log-returns from a normal copula with asymmetric marginals instead of simulating from a multivariate normal distribution. The optimal asset weights change notably as well, as seen by comparing the solutions in Table 3.1 and Table 3.4. The general structure of the investments is very similar between the tables, but quite large differences can be seen in the amount of capital invested in individual assets. This behavior was not seen when changing between elliptical distributions. For instance, with box uncertainty parameters, switching to simulation with the normal copula results in an investment increase of approximately 4% in ICA Gruppen and 1% in Hennes & Mauritz B, and a decrease of approximately 3% in Swedish Match and 2% in the Total Bond Index. These can be significant changes: if $1 million is invested, an additional $40,000 is shifted into ICA Gruppen. The optimal solutions with box uncertainty parameters differ more than those with ellipsoidal uncertainty parameters, but this is likely because, with ellipsoidal uncertainty parameters, the optimal solution suggests investing almost all initial capital in one asset, and this decision cannot change much without violating the constraint w^T µ ≥ θ. In Section 4.1.3 I make an attempt to explain the differences in optimal solution by considering changes in each asset's risk contribution to the entire portfolio's Expected Shortfall.
With elliptically distributed log-returns, the optimal solutions to (3.1) could be compared to the analytical solutions in Appendix C, where (1.3) is solved with a risk aversion coefficient calculated by (2.11), in order to find an appropriate sample size. In that case it was concluded that with D = 15,000 log-return samples the simulated solutions are close to the analytical solutions. With hybrid GPD-Empirical-GPD distributed log-returns the solutions cannot be compared with analytical solutions. Instead, standard errors for the optimal solutions in Table 3.4 were calculated using the bootstrap method, with results presented in Table 3.5. I begin by analyzing the solutions obtained with box uncertainty parameters. By comparing the magnitudes of the standard errors with the magnitudes of the asset weights, I see that most uncertainty lies in the Hennes & Mauritz B investment. The standard error is larger than the investment itself, so the true optimal weight might differ considerably from the one suggested, and it is not possible to reject at a 95% confidence level that the true optimal investment is zero. In this sense, a larger sample size D of log-returns is desirable to lower the standard error of the optimal weight on that particular asset. On the other hand, the portfolio Expected Shortfall has a very small standard error, so the uncertainty in the Hennes & Mauritz B investment seems to have little influence on the uncertainty in Expected Shortfall. In this sense a larger sample size seems unnecessary. I conclude that solving (3.1) with 15,000 log-return samples simulated from a normal copula with hybrid GPD-Empirical-GPD marginal distributions yields acceptable accuracy, but increasing the sample size somewhat could improve the accuracy of the Hennes & Mauritz B investment if this is important to the investor.
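As a rough illustration of the procedure, the bootstrap standard errors can be computed along the following lines. This is a minimal sketch; solve_portfolio is a hypothetical stand-in for the routine that solves (3.1) on a given log-return sample and returns the optimal weights and the Expected Shortfall.

```python
import numpy as np

def bootstrap_standard_errors(log_returns, solve_portfolio, n_boot=200, seed=0):
    """Nonparametric bootstrap of the optimal weights and Expected Shortfall.

    log_returns     : (D, n_assets) array of simulated log-return samples
    solve_portfolio : hypothetical callable returning (weights, expected_shortfall)
    """
    rng = np.random.default_rng(seed)
    D = log_returns.shape[0]
    weights, shortfalls = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, D, size=D)          # resample rows with replacement
        w, es = solve_portfolio(log_returns[idx])
        weights.append(w)
        shortfalls.append(es)
    return np.std(weights, axis=0, ddof=1), np.std(shortfalls, ddof=1)
```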
When it comes to the results in the right columns of Tables 3.4 and 3.5, where ellipsoidal uncertainty parameters have been used, the standard errors indicate that it is not possible to reject that the true optimal weights for Hennes & Mauritz B and Swedish Match are zero. This means that the optimal solution with ellipsoidal uncertainty parameters could be to invest everything in the Total Bond Index. The result is easily interpreted: with large volatility and correlations and small expected log-returns on the risky assets in a portfolio, a conservative investor who is forced to invest would put all his capital in the least risky asset, which is the Total Bond Index. The standard error for the portfolio Expected Shortfall is furthermore small, and I conclude that with ellipsoidal uncertainty parameters, 15,000 log-return samples simulated from a normal copula with hybrid GPD-Empirical-GPD marginal distributions is large enough to obtain accurate solutions regardless of which parameters are used.
In Section 4.1.1 I concluded that portfolio optimization with Expected Shortfall is sensitive to uncertainties in the parameters µ and Σ. Now, after analyzing the optimal robust solutions with different types of log-return distribution models, I extend the general conclusion and say something about sensitivity to the distribution itself. It has already been seen that the optimal weight vector is quite insensitive to different elliptical log-return distributions. However, changing from an elliptical to an asymmetric distribution changes the optimal weight vector significantly. I therefore conclude that portfolio optimization with Expected Shortfall is sensitive to distribution uncertainty, provided that both elliptical and asymmetric distributions are included in the distribution uncertainty set F. This is an interesting conclusion, since the robust optimization community has not put much focus on distribution uncertainty. I can thus conclude that problem (2.7) is indeed important, since not only parameter uncertainty but also distribution uncertainty affects the optimal solution.
The major drawback of simulating log-returns from a normal copula with hybrid GPD-Empirical-GPD marginal distributions is that the investor is required to use an algorithm such as Algorithm 1. The algorithm is potentially more time consuming than simulating log-returns from a simple multivariate elliptical distribution, since the data must be treated in four steps rather than just one. With many risk factors the additional steps 2-4 in Algorithm 1 can take a relatively long time. Therefore, this approach requires careful implementation for high-frequency trading optimization, but it works perfectly fine in this application, where trades are made once a day, and can be used in favor of the simpler multivariate distribution approach. The reasoning can be taken further: if it is of great importance to model the log-returns well, then portfolio optimization with Expected Shortfall is appropriate, since the log-returns can be modeled by copulas with different marginal distributions. The cost is that the optimization problem becomes computationally heavier, is more time consuming and introduces statistical uncertainty since simulations have to be made. If a fast optimization procedure is important, then Markowitz mean-variance optimization problem might be better to consider. An exception is if the reference portfolio includes non-linear assets, in which case portfolio optimization with Expected Shortfall must be used or the assets must be approximated by linear functions.
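For readers who want to experiment with the approach, a minimal sketch of the general normal-copula simulation idea is given below. It is not Algorithm 1 verbatim: the copula correlation matrix and the per-asset quantile functions (here the hypothetical marginal_ppfs, standing in for the fitted hybrid GPD-Empirical-GPD inverses) are assumed to be available already.

```python
import numpy as np
from scipy.stats import norm

def simulate_normal_copula(corr, marginal_ppfs, n_samples, seed=0):
    """Simulate joint samples with a normal copula and arbitrary marginals.

    corr          : (n, n) correlation matrix of the copula
    marginal_ppfs : list of n hypothetical quantile functions, e.g. the
                    hybrid GPD-Empirical-GPD inverses fitted per asset
    """
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(corr)                       # corr = L @ L.T
    z = rng.standard_normal((n_samples, len(marginal_ppfs))) @ L.T
    u = norm.cdf(z)                                    # uniform marginals via the copula
    return np.column_stack([ppf(u[:, i]) for i, ppf in enumerate(marginal_ppfs)])
```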
Table 4.1: Euler Allocations calculated under the assumption of normal
distributed log-returns with box uncertainty parameters and the expected
yearly log-return in percent.
Multivariate normal distribution - Box uncertainty
Asset name        ∂ES_p(X)/∂w_i        µ_min^(box) (% per year)
AstraZeneca 0.0139 3.3057
Ericsson A 0.0243 −7.7406
Hennes & Mauritz B 0.0201 4.6686
ICA Gruppen 0.0355 6.0308
Nordea Bank 0.0293 0.4732
SAS 0.0283 −30.1588
SSAB A 0.0314 −27.2710
Swedish Match 0.0461 7.2086
TeliaSonera 0.0196 −4.8182
Volvo 0.0306 −2.3136
Total Bond Index −0.0004 3.3135
Table 4.2: Euler Allocations calculated under the assumption of normal dis-
tributed log-returns with ellipsoidal uncertainty parameters and the expected
yearly log-return in percent.
Multivariate normal distribution - Ellipsoidal uncertainty
Asset name        ∂ES_p(X)/∂w_i        µ_min^(ellipsoidal) (% per year)
AstraZeneca −0.0035 −11.4241
Ericsson A −0.0160 −42.7518
Hennes & Mauritz B −0.0131 −20.5889
ICA Gruppen −0.0072 −12.5113
Nordea Bank −0.0251 −39.4785
SAS −0.0193 −80.5988
SSAB A −0.0293 −73.9089
Swedish Match −0.0021 −6.9987
TeliaSonera −0.0121 −29.2233
Volvo −0.0260 −46.4973
Total Bond Index 0.0053 5.2738
Now recall that the optimal solution with box uncertainty parameters, according to Table 3.1, is to invest approximately 50% of the initial capital in the Total Bond Index, approximately 35% in Swedish Match and the remaining 15% in ICA Gruppen. Looking at the expected yearly log-returns, one might wonder why Hennes & Mauritz B is not invested in, since it has a larger expected log-return than the Total Bond Index. The answer lies in the Euler Allocations, where it is seen that the Total Bond Index works as a hedge against all other assets in the reference portfolio, hence decreasing the total portfolio Expected Shortfall.
With ellipsoidal uncertainty parameters, the optimal solution according to Table 3.1 is to invest approximately 99% in the Total Bond Index and the remaining 1% split between ICA Gruppen, Swedish Match and Hennes & Mauritz B, all three having negative expected log-return. Investing in assets with negative expected log-return might seem counter-intuitive, but it is explained by the fact that these assets act as hedges against the investment in the Total Bond Index, as seen in Table 4.2. Hence, by investing 1% in assets with negative expected log-return, the investor allows the expected total portfolio log-return to decrease a little, while still remaining larger than θ, and is thereby able to decrease the portfolio's Expected Shortfall.
Equations (2.5) and (2.6) can be used to approximate the Euler Allocations when the log-returns are simulated from the normal copula with hybrid GPD-Empirical-GPD marginal distributions. With p = 0.01 and sample size D = 15,000 it follows that VaR_p(X) = L_{151}, the 151st largest simulated loss. The Euler Allocation for asset i is then the mean of the 150 largest losses for that asset. With box uncertainty parameters the Euler Allocations are presented in Table 4.3, and with ellipsoidal uncertainty parameters they are presented in Table 4.4. Recall that the optimal weight vectors are presented in Table 3.4.
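Reading (2.5) and (2.6) as the usual scenario-based estimator of the Expected Shortfall contributions, the computation can be sketched as follows; the interpretation of the tail scenarios as those with the largest portfolio losses is an assumption on my part.

```python
import numpy as np

def euler_allocations(asset_losses, weights, p=0.01):
    """Approximate the per-asset Euler Allocations dES_p/dw_i from simulated losses.

    asset_losses : (D, n) simulated losses per asset (negative log-returns)
    weights      : (n,) portfolio weights
    """
    portfolio_losses = asset_losses @ weights
    k = int(np.ceil(p * asset_losses.shape[0]))      # 150 scenarios for D = 15,000, p = 0.01
    worst = np.argsort(portfolio_losses)[-k:]         # indices of the k largest portfolio losses
    derivatives = asset_losses[worst].mean(axis=0)    # mean asset loss in the portfolio tail
    return derivatives                                # w^T derivatives ~ portfolio ES (Euler)
```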
Table 4.3: Euler Allocations calculated for log-returns simulated from a nor-
mal copula with hybrid GPD-Empirical-GPD marginal distributions with
box uncertainty parameters and the expected yearly log-return in percent.
Normal copula - Box uncertainty
Asset name        DES_p(X)(w_i)        µ_min^(box) (% per year)
AstraZeneca 0.0097 3.3057
Ericsson A 0.0164 −7.7406
Hennes & Mauritz B 0.0177 4.6686
ICA Gruppen 0.0360 6.0308
Nordea Bank 0.0246 0.4732
SAS 0.0185 −30.1588
SSAB A 0.0281 −27.2710
Swedish Match 0.0518 7.2086
TeliaSonera 0.0122 −4.8182
Volvo 0.0213 −2.3136
Total Bond Index −0.0003 3.3135
Table 4.4: Euler Allocations calculated for log-returns simulated from a nor-
mal copula with hybrid GPD-Empirical-GPD marginal distributions with
ellipsoidal uncertainty parameters and the expected yearly log-return in per-
cent.
Normal copula - Ellipsoidal uncertainty
Asset name        DES_p(X)(w_i)        µ_min^(ellipsoidal) (% per year)
AstraZeneca −0.0008 −11.4241
Ericsson A −0.0102 −42.7518
Hennes & Mauritz B −0.0097 −20.5889
ICA Gruppen −0.0040 −12.5113
Nordea Bank −0.0203 −39.4785
SAS −0.0147 −80.5988
SSAB A −0.0211 −73.9089
Swedish Match −0.0013 −6.9987
TeliaSonera −0.0089 −29.2233
Volvo −0.0219 −46.4973
Total Bond Index 0.0057 5.2738
Conclusions similar to those drawn from the Euler Allocations with multivariate normal distributed log-returns can be drawn from Table 4.3 and Table 4.4 as well. More interesting is to see whether the changes in Euler Allocations can explain the differences in optimal solutions between Table 3.1 and Table 3.4. With ellipsoidal uncertainty parameters it is hard to tell whether the differences in solutions stem from the change of underlying distribution or are an artifact of statistical uncertainty, so I focus on the solutions with box uncertainty parameters. As was previously noted, the investment in Swedish Match decreases by 3% when going from the multivariate normal model to the normal copula model. Furthermore, the Euler Allocation increases compared to the multivariate normal model. It seems reasonable that if the Euler Allocation increases, i.e. the risk contribution increases, the consequence is that a risk-averse investor invests less capital in that asset. The same principle holds for the investment changes in Hennes & Mauritz B and the Total Bond Index as well. Surprisingly, the investment in ICA Gruppen increases by 4% when switching to the normal copula, even though the risk contribution increases for that asset. This behavior goes against the intuitive conclusion that an increase in risk contribution makes an asset less attractive.
I conclude that combining Euler Allocations and expected log-returns, with the marginal Sharpe ratio in mind, is a useful tool for analyzing the structure behind optimal portfolio solutions, but it should be used together with other methods to understand the entire structure.
4.2 Comments on Worst-Case Scenario Based Ro-
bust Optimization
This thesis project has, as is commonly done in optimization, interpreted robust optimization as worst-case scenario based optimization, presented in Section 3.1.1. This section discusses the effect of this interpretation on the optimization problem.
Firstly, worst-case scenario based robust optimization should be seen as a conservative optimization approach. When the worst-case scenarios from the uncertainty sets M, S are used in the optimization problem, the investor takes a more negative view of the market than the empirically estimated parameters suggest. The investor expects the log-returns of the assets in his portfolio to be worse than they have been historically and believes that the assets are more correlated than the covariance matrix indicates. In short, the investor has a conservative view of the market and optimizes his asset allocations according to that view.
One negative aspect of worst-case scenario based robust optimization is that the approach is very sensitive to outliers in the historical data. As an example, with a 95% confidence interval used as uncertainty set, the worst-case scenario is more extreme when outliers are present in the historical data than it would be with no outliers. This results in an even more conservative view of the market. One way to address this problem is to search the data for outliers and remove them prior to constructing the 95% confidence interval.
A positive aspect of the worst-case scenario based robust optimization approach is that the solution is quite stable when the problem is re-solved periodically as more historical data becomes available. This is positive in the sense that the investor does not need to re-balance his asset allocations very often, and trading fees can be kept low.
An alternative to worst-case scenario based robust optimization could be to grid the uncertainty sets, solve the optimization problem for each combination of realizations in the uncertainty sets, and then analyze how the solution changes as a function of the parameter location within the uncertainty sets. This approach would however be very time consuming and perhaps computationally infeasible with many assets in the portfolio and fine grids of the uncertainty sets.
is constructed as in (3.2) by subtracting 20% of each element in the vector. The result is then compared to the lower limit of the 95% confidence interval to see whether the worst-case scenario is reasonable or not. An alternative approach is to define µ_min^(box) directly as the lower limit of the 95% confidence interval. With the particular historical data in this thesis project, this approach resulted in all elements of µ_min^(box) being negative. This yields a robust optimization problem without solutions, since the constraint w^T µ_min ≥ θ cannot be satisfied with long positions. In light of this, the alternative approach was not used in the thesis project, but it could be more intuitive in other applications or with other reference portfolios.
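A minimal sketch of this comparison is given below. The 20% worsening of each element of µ̂ follows the description above (and is consistent with the numerical values in Appendix A); the Gaussian large-sample confidence interval for the mean is a standard choice and an assumption here, and the covariance treatment in (3.3) is not shown.

```python
import numpy as np
from scipy.stats import norm

def box_worst_case_mu(returns, shrink=0.20, conf=0.95):
    """Worst-case expected log-return as in (3.2) versus a CI lower limit.

    returns : (T, n) array of historical daily log-returns
    """
    mu_hat = returns.mean(axis=0)
    se = returns.std(axis=0, ddof=1) / np.sqrt(returns.shape[0])
    z = norm.ppf(0.5 + conf / 2.0)                    # 1.96 for a 95% interval
    mu_min_box = mu_hat - shrink * np.abs(mu_hat)     # every element made 20% worse
    ci_lower = mu_hat - z * se                        # alternative definition discussed above
    return mu_min_box, ci_lower
```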
All solutions in this thesis project are based on the same reference portfolio, with assets listed in Table 1.1. It would however be interesting to analyze how different reference portfolios, consisting of other types of financial assets, affect the optimal solution. For instance, how is the optimal solution affected if the number of financial assets included in the reference portfolio increases? It seems reasonable that larger reference portfolios would decrease the Expected Shortfall, since the optimization algorithm has more assets to choose from. However, as has been seen in all solutions throughout this thesis project, the optimal solution often turns out to be an investment in a small subgroup of the possible assets. Therefore, including more assets in the reference portfolio does not necessarily decrease the risk and could at worst result in a less time-efficient program, which can be crucial for the investor. What can be said is that including more assets in the reference portfolio at least does not increase the financial risk. Perhaps the most interesting area for further investigation would be to extend the reference portfolio to include other types of financial assets, such as foreign currencies or non-linear financial derivatives such as options. This would introduce possibilities for more complex hedging opportunities than are possible with linear assets. It is furthermore interesting because non-linear portfolios make full use of the risk measure Expected Shortfall, whereas Markowitz mean-variance optimization problem cannot handle non-linear financial assets; in that case the non-linear assets have to be approximated by linear functions. Finally, it would be interesting from a practical point of view to include non-linear assets, since it is common that investors hold non-linear assets in their portfolios.
In this thesis project, the p-value has been held constant at 1% but can of course be altered as well. Decreasing the p-value will, however, only increase the Expected Shortfall, unless the investor decreases the threshold for acceptable expected log-return to keep the risk constant. Hence the investor has to find a p-level at which he feels comfortable with both the expected portfolio log-return and the risk he is exposed to.
Another interesting area for further investigation is to study the effect of different copulas on the optimal solution. In this thesis project, the normal copula was used together with hybrid GPD-Empirical-GPD marginal distributions, but the dependency structure between different log-returns could just as well have been modeled by another copula. However, as was mentioned earlier, according to Skoglund and Nyström the choice of copula has a second-order effect on the model accuracy when modeling market risk factors. If a Student's t copula were used instead, the results would probably only be marginally improved.
The last area I will mention where further investigation could be made is the choice of the upper and lower thresholds for the generalized Pareto distributions. In this thesis project the thresholds were chosen as the 90% and 10% empirical quantiles respectively, but no investigation was made to establish whether these thresholds were good representatives of where the tails begin. A better and more consistent approach to finding the thresholds is to use a parametric estimation method such as Hill's tail-index estimator [4]. This could improve the results a bit further but would also make the program more time consuming.
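For reference, the standard Hill estimator [4] for the upper tail can be sketched as follows; the choice of the number of order statistics k is itself a modeling decision and is left open here.

```python
import numpy as np

def hill_tail_index(sample, k):
    """Hill estimator of the tail index alpha for the upper tail.

    sample : 1-D array of positive observations, e.g. standardized residual losses
    k      : number of upper order statistics to use
    """
    x = np.sort(sample)
    tail = x[-k:]                                           # the k largest observations
    threshold = x[-(k + 1)]                                 # the (k+1):th largest as threshold
    gamma_hat = np.mean(np.log(tail) - np.log(threshold))   # Hill estimate of the GPD shape
    return 1.0 / gamma_hat                                  # tail index alpha = 1 / gamma
```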
The different choices in a portfolio optimization problem, whether they concern which problem formulation, which models or which parameters to use, are ultimately up to the investor, and a trade-off between model accuracy and time efficiency or computational feasibility always has to be considered.
Appendix A
Optimization Parameters
When solving the portfolio optimization problem (2.16) the parameters µ and
Σ are needed for instance in the process of simulating log-returns. Since the
portfolio lives for one day, µ is the expected daily log-return and Σ is the co-
variance matrix for daily log-returns. In the thesis project I begin by solving
the portfolio optimization problem with empirically estimated parameters
from historical data and then solve the robust portfolio optimization prob-
lem (3.1) after manipulating the empirical parameters to obtain two different
kinds of worst-case scenario parameters. The worst-case scenario box uncer-
tainty parameters are defined by (3.2) and (3.3) and the worst-case scenario
ellipsoidal uncertainty parameters are defined by (3.4) and (3.5). In this
Appendix I clarify the parameters by writing out their explicit numerical
values.
The numerical values of the empirical parameters, estimated from daily log-return data for the assets in the reference portfolio over the period January 2, 2007 to January 22, 2016, are
\hat{\mu} = 10^{-4} · (1.6463, −2.5699, 2.3250, 3.0034, 0.2357, −10.0129, −9.0541, 3.5899, −1.5997, −0.7681, 1.6501)^T

with the elements ordered as the assets in Table 1.1 (AstraZeneca first, the Total Bond Index last), and

\hat{\Sigma} = 10^{-4} ·
[  2.2076   0.8564   0.7462   0.5434   0.8873   0.9597   0.9827   0.6127   0.7489   0.8089  −0.0201
   0.8564   6.7548   1.4816   0.8375   2.3592   1.8613   2.5425   0.8313   1.3073   2.5546  −0.0688
   0.7462   1.4816   2.6549   0.8594   1.9553   1.7679   2.1897   0.7987   1.2000   2.0837  −0.0606
   0.5434   0.8375   0.8594   3.2819   1.2290   1.2987   1.4293   0.4836   0.8364   1.3421  −0.0424
   0.8873   2.3592   1.9553   1.2290   4.9191   2.7372   3.5812   1.0444   1.8956   3.3907  −0.1052
   0.9597   1.8613   1.7679   1.2987   2.7372  13.5689   3.0979   0.7071   1.5030   2.9935  −0.0821
   0.9827   2.5425   2.1897   1.4293   3.5812   3.0979   8.3021   1.0916   1.9876   4.4259  −0.1217
   0.6127   0.8313   0.7987   0.4836   1.0444   0.7071   1.0916   2.4675   0.7539   1.0733  −0.0271
   0.7489   1.3073   1.2000   0.8364   1.8956   1.5030   1.9876   0.7539   2.6576   1.9369  −0.0538
   0.8089   2.5546   2.0837   1.3421   3.3907   2.9935   4.4259   1.0733   1.9369   5.8128  −0.1094
  −0.0201  −0.0688  −0.0606  −0.0424  −0.1052  −0.0821  −0.1217  −0.0271  −0.0538  −0.1094   0.0195 ].
The numerical values of the box uncertainty parameters defined by (3.2) and
(3.3) are
µ_min^(box) = 10^{-4} · (1.3170, −3.0839, 1.8600, 2.4027, 0.1885, −12.0155, −10.8649, 2.8719, −1.9196, −0.9217, 1.3201)^T

Σ_max^(box) = 10^{-4} ·
[  3.3115   1.2960   0.9321   0.8164   1.0944   1.7779   1.2788   0.8737   0.9871   0.9052  −0.0278
   1.2960   8.4445   2.3581   1.4760   4.2318   3.2969   4.3982   1.5645   2.1447   4.3579  −0.0883
   0.9321   2.3581   3.9946   1.3956   3.1897   3.1002   3.6072   1.2078   1.7751   3.4598  −0.0772
   0.8164   1.4760   1.3956   5.1444   1.9984   2.4270   2.4435   0.7163   1.3759   2.4078  −0.0562
   1.0944   4.2318   3.1897   1.9984   8.9431   5.1257   6.1330   1.8146   3.0073   6.0763  −0.1361
   1.7779   3.2969   3.1002   2.4270   5.1257  18.4093   5.4142   1.4777   2.9255   5.3976  −0.1051
   1.2788   4.3982   3.6072   2.4435   6.1330   5.4142  13.4753   1.8367   2.9929   7.6285  −0.1678
   0.8737   1.5645   1.2078   0.7163   1.8146   1.4777   1.8367   3.7902   1.1495   1.7437  −0.0398
   0.9871   2.1447   1.7751   1.3759   3.0073   2.9255   2.9929   1.1495   4.6483   3.0786  −0.0602
   0.9052   4.3579   3.4598   2.4078   6.0763   5.3976   7.6285   1.7437   3.0786   9.6670  −0.1439
  −0.0278  −0.0883  −0.0772  −0.0562  −0.1361  −0.1051  −0.1678  −0.0398  −0.0602  −0.1439   0.0234 ].
The numerical values of the ellipsoidal uncertainty parameters defined by (3.4) and (3.5) are

µ_min^(ellipsoidal) = 10^{-4} · (−4.5514, −17.0326, −8.2027, −4.9846, −15.7285, −32.1111, −29.4458, −2.7883, −11.6428, −18.5248, 2.1011)^T

Σ_max^(ellipsoidal) = 10^{-4} ·
[  4.5494   1.7649   1.5377   1.1198   1.8286   1.9776   2.0250   1.2627   1.5434   1.6669  −0.0413
   1.7649  13.9200   3.0532   1.7258   4.8617   3.8357   5.2395   1.7130   2.6940   5.2644  −0.1417
   1.5377   3.0532   5.4712   1.7710   4.0294   3.6433   4.5124   1.6459   2.4729   4.2941  −0.1248
   1.1198   1.7258   1.7710   6.7631   2.5327   2.6763   2.9455   0.9966   1.7237   2.7658  −0.0873
   1.8286   4.8617   4.0294   2.5327  10.1372   5.6408   7.3801   2.1522   3.9064   6.9874  −0.2168
   1.9776   3.8357   3.6433   2.6763   5.6408  27.9625   6.3842   1.4572   3.0972   6.1689  −0.1693
   2.0250   5.2395   4.5124   2.9455   7.3801   6.3842  17.1087   2.2496   4.0960   9.1208  −0.2508
   1.2627   1.7130   1.6459   0.9966   2.1522   1.4572   2.2496   5.0849   1.5537   2.2118  −0.0558
   1.5434   2.6940   2.4729   1.7237   3.9064   3.0972   4.0960   1.5537   5.4767   3.9915  −0.1109
   1.6669   5.2644   4.2941   2.7658   6.9874   6.1689   9.1208   2.2118   3.9915  11.9787  −0.2254
  −0.0413  −0.1417  −0.1248  −0.0873  −0.2168  −0.1693  −0.2508  −0.0558  −0.1109  −0.2254   0.0402 ].
Appendix B
Convergence as Function of
Sample Size D
This appendix studies how large the sample size D of simulated log-returns must be for the solutions to the portfolio optimization problem (2.16) to be precise and have small statistical uncertainty. While holding all other parameters constant, let us first increase the sample size until the solution weight vector no longer has visible oscillations, and then find the sample size D* where further increments no longer improve the solution accuracy much. Figure B.1 presents the solution weight vector to problem (2.16) as a function of the sample size D. The simulated log-returns come from a multivariate normal distribution with empirically estimated parameters µ̂ and Σ̂.
Figure B.1: The optimal weight vector as function of the sample size D of
simulated log-returns.
From the figure it seems as if the solution converges quite fast, perhaps as early as at D* = 1000 samples. However, if the convergence of Expected Shortfall as a function of sample size is investigated, the behavior is different. Figure B.2 presents approximate 95% confidence intervals for Expected Shortfall, obtained by solving (2.16) 100 times for each of a number of increasing sample sizes. With multivariate normal distributed log-returns the Expected Shortfall seems to have converged, with a narrow confidence interval, at approximately D* = 10,000, and for Student's t distributed log-returns with 2.1 degrees of freedom the threshold is approximately D* = 15,000 samples. Both observations are far above the initial one of D* = 1000. For an investor, accuracy in both the optimal weight vector and the Expected Shortfall is important, and the sample size must be chosen sufficiently large for both to be accurate estimates of the true values. Therefore, when solving the portfolio optimization problem I should use a sample size of at least 10,000 when simulating log-returns from the multivariate normal distribution and 15,000 when simulating from the Student's t distribution. For simplicity I use D = 15,000 in both cases throughout this thesis project.
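A minimal sketch of the procedure behind Figure B.2 is given below; simulate_returns and solve_portfolio are hypothetical stand-ins for the log-return simulator and the routine that solves (2.16).

```python
import numpy as np

def es_confidence_interval(D, simulate_returns, solve_portfolio, n_repeat=100):
    """Approximate a 95% confidence interval for Expected Shortfall at sample size D.

    simulate_returns : hypothetical callable, simulate_returns(D) -> (D, n) log-returns
    solve_portfolio  : hypothetical callable, solve_portfolio(samples) -> (weights, ES)
    """
    es_values = np.array([solve_portfolio(simulate_returns(D))[1] for _ in range(n_repeat)])
    lower, upper = np.percentile(es_values, [2.5, 97.5])
    return lower, upper

# Repeat for increasing D (e.g. 1,000 up to 20,000) until the interval is acceptably narrow.
```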
Appendix C
Reference Solutions
Table C.1: Solutions to (1.3) with multivariate normal distributed log-
returns. Left column with box uncertainty parameters and right column
with ellipsoidal uncertainty parameters.
Box uncertainty Ellipsoidal uncertainty
Asset name Weight Weight
AstraZeneca 0 0.0015
Ericsson A 0 0.0004
Hennes & Mauritz B 0 0.0006
ICA Gruppen 0.1479 0.0015
Nordea Bank 0 0.0004
SAS 0 0
SSAB A 0 0.0002
Swedish Match 0.3506 0.0029
TeliaSonera 0 0.0004
Volvo 0 0.0004
Total Bond Index 0.5015 0.9917
Expected Shortfall 0.0213 0.0052
Table C.2: The risk aversion coefficient calculated numerically using (2.11)
with the Student’s t quantile function for different degrees of freedom.
Degrees of freedom ν 2.1 3.58 10 20
Risk aversion coefficient c 25.2278 11.5312 6.7265 5.9538
Note that as the degrees of freedom increase, the risk aversion coefficient decreases. This behavior is expected, and c should converge to the risk aversion coefficient for normal distributed log-returns, i.e. to c = 5.33.
With the same box uncertainty parameters that were used to obtain the
solutions in Table 3.2, the corresponding reference solutions to Markowitz
mean-variance optimization problem are presented in Table C.3. Table C.4
presents in turn the reference solutions to the simulated solutions with ellip-
soidal uncertainty parameters in Table 3.3.
Table C.3: Solutions to (1.3) with multivariate Student’s t distributed log-
returns with different degrees of freedom and box uncertainty parameters.
ν = 2.1 ν = 3.58 ν = 10 ν = 20
Asset name Weight Weight Weight Weight
AstraZeneca 0 0 0 0
Ericsson A 0 0 0 0
Hennes & Mauritz B 0 0 0 0
ICA Gruppen 0.1479 0.1479 0.1479 0.1479
Nordea Bank 0 0 0 0
SAS 0 0 0 0
SSAB A 0 0 0 0
Swedish Match 0.3506 0.3506 0.3506 0.3506
TeliaSonera 0 0 0 0
Volvo 0 0 0 0
Total Bond Index 0.5015 0.5015 0.5015 0.5015
Expected Shortfall 0.0914 0.0447 0.0265 0.0235
Appendix D
G_{\gamma,\beta}(x) = 1 - \left(1 + \frac{\gamma x}{\beta}\right)^{-1/\gamma}, \quad x \geq 0,

for some shape parameter γ > 0 and scale parameter β > 0, and

G_{\gamma,\beta}(x) = 1 - e^{-x/\beta}, \quad x \geq 0,

if γ = 0.
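For completeness, the same distribution family is available in scipy.stats as genpareto, whose shape parameter c plays the role of γ and whose scale plays the role of β; the small check below assumes that mapping.

```python
import numpy as np
from scipy.stats import genpareto

gamma_, beta_ = 0.2, 1.0
x = np.linspace(0.0, 5.0, 6)

# G_{gamma,beta}(x) = 1 - (1 + gamma*x/beta)**(-1/gamma) for gamma > 0
manual = 1.0 - (1.0 + gamma_ * x / beta_) ** (-1.0 / gamma_)
scipy_cdf = genpareto.cdf(x, c=gamma_, scale=beta_)
print(np.allclose(manual, scipy_cdf))   # True
```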
Table D.1: Maximum Likelihood estimated parameters to the GARCH(1,1)
models used for filtering historical log-returns to standardized residuals.
Asset name Parameter estimate Standard error t statistic
AstraZeneca α0 = 7.28056 · 10−6 5.95667 · 10−7 12.2225
α1 = 0.0931352 0.00721758 12.9039
β1 = 0.875473 0.00857937 102.044
Ericsson A α0 = 0.000272561 2.47196 · 10−5 11.0261
α1 = 0.165719 0.0223635 7.41023
β1 = 0.411872 0.0546948 7.53037
Hennes & Mauritz B α0 = 1.24182 · 10−5 1.52487 · 10−6 8.14376
α1 = 0.0763801 0.00970769 7.868
β1 = 0.874378 0.0140866 62.0714
ICA Gruppen α0 = 1.56923 · 10−5 2.08814 · 10−6 7.51496
α1 = 0.168988 0.0114856 14.713
β1 = 0.794041 0.0133304 59.566
Nordea Bank α0 = 4.66648 · 10−6 1.23123 · 10−6 3.7901
α1 = 0.0646265 0.00657526 9.82874
β1 = 0.923464 0.00792685 116.498
SAS α0 = 4.38621 · 10−5 5.27758 · 10−6 8.31103
α1 = 0.118331 0.0067498 17.5311
β1 = 0.858769 0.00767209 111.934
SSAB A α0 = 6.72565 · 10−6 8.38453 · 10−7 8.0215
α1 = 0.041997 0.00414664 10.1279
β1 = 0.948355 0.00437622 216.707
Swedish Match α0 = 1.22517 · 10−5 1.64289 · 10−6 7.45739
α1 = 0.103236 0.0101418 10.1793
β1 = 0.847721 0.0149104 56.8544
TeliaSonera α0 = 4.23416 · 10−6 6.64369 · 10−7 6.37321
α1 = 0.079082 0.0052433 15.0825
β1 = 0.909637 0.00517679 175.715
Volvo α0 = 7.43774 · 10−6 1.28612 · 10−6 5.78307
α1 = 0.0647348 0.00877252 7.37927
β1 = 0.921605 0.00983831 93.6752
Total Bond Index α0 = 2 · 10−7 9.32652 · 10−8 2.14442
α1 = 0.0865343 0.0102208 8.46645
β1 = 0.813981 0.00897679 90.6761
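A minimal sketch of the filtering step is given below: given the estimated GARCH(1,1) parameters, the conditional variance recursion is run over the log-return series and the standardized residuals are obtained by dividing by the conditional volatility. Initializing at the unconditional variance and assuming a zero conditional mean are simplifications on my part.

```python
import numpy as np

def garch_filter(returns, alpha0, alpha1, beta1):
    """Filter log-returns to standardized residuals with a GARCH(1,1) volatility model.

    sigma2_t = alpha0 + alpha1 * r_{t-1}^2 + beta1 * sigma2_{t-1}
    """
    r = np.asarray(returns, dtype=float)
    sigma2 = np.empty_like(r)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)     # unconditional variance as a starting value
    for t in range(1, len(r)):
        sigma2[t] = alpha0 + alpha1 * r[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return r / np.sqrt(sigma2)                      # standardized residuals z_t

# Example with the AstraZeneca estimates from Table D.1:
# z = garch_filter(r_astrazeneca, 7.28056e-6, 0.0931352, 0.875473)
```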
Figure D.1: Each sub figure consists of realizations and histograms of the standardized residuals and sample autocorrelation
functions of squared log-returns and squared standardized residuals.
The same conclusion can be drawn from Figure D.2, which shows Quantile-Quantile plots of the empirical residual quantile function against the standard normal quantile function. A good fit should produce a straight line, while an inverted S-shaped curve indicates that the tails are heavier than those of the standard normal distribution. As can be seen, inverted S-shapes appear in every plot, indicating that the standardized residuals have distributions with fatter tails than the standard normal distribution. Notice also that the S-shapes are asymmetric, meaning that the two tails of the residual distribution are of different size and length. Therefore, an elliptical distribution with fat tails, for instance the Student's t distribution, would not capture the entire empirical residual distribution, since elliptical distributions are symmetric. Generalized Pareto distributions are thus appropriate for modeling the tails of the residual distributions, since they allow for an asymmetric distribution.
Figure D.2: Each sub figure consists of Quantile-Quantile plots of the empirical quantile function for the standardized residuals
versus the standard normal quantile function.
Table D.2 presents Maximum Likelihood estimated parameters for the fitted generalized Pareto distributions, with tail thresholds chosen as u_h = F_n^{-1}(0.90) and u_l = F_n^{-1}(0.10) respectively for each asset in the reference portfolio.
Figure D.3: Empirical cumulative distribution functions for the standardized residuals for each asset in the reference portfolio
together with estimated generalized Pareto distributions in the distribution tails.
Bibliography
[1] Peter J Brockwell and Richard A Davis. Introduction to time series and
forecasting. Springer Science & Business Media, 2006.
[2] Umberto Cherubini, Elisa Luciano, and Walter Vecchiato. Copula meth-
ods in finance. John Wiley & Sons, 2004.
[3] Igor Griva, Stephen G Nash, and Ariela Sofer. Linear and nonlinear optimization. SIAM, 2009.
[4] Bruce M. Hill. A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3(5):1163–1174. Institute of Mathematical Statistics, 1975.
[5] Henrik Hult, Filip Lindskog, Ola Hammarlid, and Carl Johan Rehn.
Risk and portfolio analysis: Principles and methods. Springer Science
& Business Media, 2012.
[6] Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
An introduction to statistical learning, volume 112. Springer, 2013.
[7] Marcus Josefsson. A copula evt-based approach for measuring tail re-
lated risk: Applied on the Swedish market. Master’s thesis, Royal Insti-
tute of Technology, 2004.
[9] Damon Levine. Modeling tail behavior with extreme value theory. Risk
Management, (17):14–18, September 2009.
[10] Filip Lindskog, Alexander Mcneil, and Uwe Schmock. Kendall’s tau for
elliptical distributions. Springer, 2003.
[11] Miguel Sousa Lobo and Stephen Boyd. The worst-case risk of a portfolio. Unpublished manuscript. Available from http://faculty.fuqua.duke.edu/%7Emlobo/bio/researchfiles/rsk-bnd.pdf, 2000.
[12] Harry Markowitz. Portfolio selection. The Journal of Finance, 7(1):77–
91. American Finance Association, Wiley, 1952.
[13] Kaj Nyström and Jimmy Skoglund. Efficient filtering of financial time
series and extreme value theory. Journal of Risk, 7(2):63–84, 2005.
[14] Riccardo Rebonato and Peter Jäckel. The most general methodology to
create a valid correlation matrix for risk management and option pricing
purposes. Available at SSRN 1969689, 1999.
[16] Bernd Scherer. Portfolio Construction and Risk Budgeting, 5 ed. Risk
Books, London, Great Britain, 2015.
[17] William F Sharpe. The Sharpe ratio. The Journal of Portfolio Management, 21(1):49–58. Institutional Investor Journals, 1994.
[18] Jimmy Skoglund and Wei Chen. Risk contributions, information and
reverse stress testing. The Journal of Risk Model Validation, 3(2):61–77.
Incisive Media Plc, 2009.
[19] Jimmy Skoglund and Wei Chen. Financial Risk Management - Applica-
tions in Market, Credit, Asset and Liability Management and Firmwide
Risk. Wiley Finance Series, 2015.
[23] Kai Ye, Panos Parpas, and Berç Rustem. Robust portfolio optimiza-
tion: a conic programming approach. Computational Optimization and
Applications, 52(2):463–481. Springer, 2012.