MASM006 FINANCIAL MATHEMATICS
(5) BROWNIAN MOTION
So far, we have modelled the behaviour of a share price using discrete time
stochastic processes with a small timestep δt. As we take smaller and smaller
timesteps δt, our model becomes closer to a process varying continuously in
time.
We now adopt a mathematically more sophisticated viewpoint, and work
directly with stochastic processes in continuous time. Our basic example
of such a process is Brownian motion. The stochastic processes we
shall need are all obtained by applying various transformations to Brownian
motion. The behaviour of these new processes is related to the behaviour
of the Brownian motion driving them by means of stochastic differential
equations. In order to work with these, we will need the tools of stochastic
calculus, in particular, Itô’s Lemma. This will enable us to derive the
Black-Scholes differential equation.
A continuous time stochastic process is simply a family of random
variables (Xt )t≥0 , one for each real number t ≥ 0. The variable t corresponds
to time, so that as t varies, Xt represents the path taken by some randomly
changing quantity X.
We have seen that for (discrete time) random walks (Sn )n≥0 , the
probability distribution of Sn is approximately normal for large n, with mean
and variance proportional to n. (This is a consequence of the Central Limit
Theorem.) A Brownian motion is a continuous time stochastic process in
which this normality is built in:
Definition
A Brownian motion (or Wiener process) is a continuous time
stochastic process (Wt )t≥0 with the following properties:
(i) there is a constant σ > 0 such that, for any real numbers s, t ≥ 0, the
random variable Wt+s − Ws has normal distribution N(0, σ²t);
(ii) given any sequence of times 0 ≤ t0 ≤ t1 ≤ . . . ≤ tn , the random
variables W_{t_r} − W_{t_{r−1}} , for 1 ≤ r ≤ n, are independent;
(iii) W0 = 0;
(iv) Wt is a continuous function of t with probability 1 (i.e. in almost all
possible paths).
The quantity σ² is called the variance of the Brownian motion. If σ = 1,
the process is called a standard Brownian motion.
Notice that, by (i), the increments Wt+s − Ws of a Brownian motion are
stationary (their distribution depends only on the length t of the time
interval, not on s) and have expectation 0. This means that the process is
“driftless”. In particular, we have E[Wt ] = 0 for all t. Also, the
increments are independent on disjoint time intervals, by (ii).
We can easily start from some value a, instead of the initial value 0
imposed in (iii), and build in a drift of µ, by taking the process
Xt = a + µt + Wt .
Simply making the above definition does not, of course, guarantee the
existence of a stochastic process with these properties. However, one can
construct such a process by making rigorous the idea of taking limits of a
sequence of random walks. For a sketch of this, see A. Etheridge, A Course
in Financial Calculus, §3.2.
We should really say that the process (Wt )t≥0 is a Brownian motion with
respect to a particular probability measure P. Just as in the discrete binomial
model, where it was convenient to replace the actual market probability
p with the risk-neutral probability p∗ , we will allow ourselves the freedom
to reweight the probabilities, keeping the same set of possible paths for
the process. Thus (Wt )t≥0 may be a Brownian motion with respect to the
original probability measure P, but not with respect to some other probability
measure Q; for instance, we may have EQ [Wt ] ≠ 0. We shall say some more
about the formalism of probability measures below.
Simulating Brownian Motion
To simulate Brownian motion in MATLAB, we must of course use an
approximation in discrete time. If we fix a small timestep δt and write Sn
for our approximation to Wnδt , then we should take
S0 = 0;    Sn = S_{n−1} + σ√(δt) ξn    for n ≥ 1,
where the ξi are i.i.d. random variables from a standard normal distribution
N (0, 1).
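For example, the following MATLAB fragment (an illustrative sketch; the values of σ, δt and the horizon T are arbitrary choices) simulates and plots one approximate path:

% Simulate one approximate path of Brownian motion on [0, T].
sigma = 1; T = 1; dt = 1e-3;            % arbitrary illustrative parameters
N = round(T/dt);                        % number of timesteps
xi = randn(1, N);                       % i.i.d. N(0,1) variables xi_1, ..., xi_N
S = [0, cumsum(sigma*sqrt(dt)*xi)];     % S(n+1) approximates W at time n*dt
t = (0:N)*dt;
plot(t, S); xlabel('t'); ylabel('W_t');

Repeating the simulation produces a different path each time, since the ξn are drawn afresh. A drifted path started at a, as in Xt = a + µt + Wt above, can be plotted as a + mu*t + S for chosen values of a and mu.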
Some Properties of Brownian Motion
(1) (Invariance under scaling.)
If (Wt )t≥0 is a standard Brownian motion, then so is (cW_{t/c²} )t≥0 for any
c ≠ 0. This says that the process (Wt )t≥0 has a fractal-like property: a
typical path of the process will look similar if it is scaled up. For instance,
the process (10W_{t/100} ) should look similar to the original process (Wt )t≥0
itself. This can be illustrated using MATLAB.
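For instance, the following sketch (parameter choices are arbitrary) plots a simulated path of Wt on [0, 1] together with the scaled path 10W_{t/100} on [0, 100]:

% Illustrate scaling invariance with c = 10: if V_t = c*W_{t/c^2},
% then V at time c^2*t equals c*W_t, so we replot (c^2*t, c*W_t).
c = 10; dt = 1e-4; N = 1e4;
W = [0, cumsum(sqrt(dt)*randn(1, N))];   % standard Brownian path on [0, 1]
t = (0:N)*dt;
subplot(2, 1, 1); plot(t, W);            % W_t on [0, 1]
subplot(2, 1, 2); plot(c^2*t, c*W);      % 10*W_{t/100} on [0, 100]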
Notice that we are not claiming that the scaled-up path is identical to
the original path, only that it is qualitatively similar. This is because the
invariance property asserts that the processes (cWt/c2 )t≥0 and (Wt )t≥0 have
the same probability distributions, not that the actual paths they take will
be the same.
Proof of (1): We check that the process (Vt )t≥0 given by Vt = cW_{t/c²} satisfies
the 4 properties in the definition of Brownian motion (with σ = 1). To do
so, we use the fact that (Wt )t≥0 itself satisfies these properties.
(i) For s, t > 0, we have that
W_{(t+s)/c²} − W_{s/c²} ∼ N(0, t/c²),
since (Wt )t≥0 is a standard Brownian motion, so that
Vt+s − Vs = c(W_{(t+s)/c²} − W_{s/c²}) ∼ N(0, c² · t/c²) = N(0, t).
(ii) Given a sequence of times 0 ≤ t0 ≤ t1 ≤ . . . ≤ tn , the random variables
W_{t_{r+1}/c²} − W_{t_r/c²} are independent, since (Wt )t≥0 is a Brownian motion. Hence
the random variables V_{t_{r+1}} − V_{t_r} = c(W_{t_{r+1}/c²} − W_{t_r/c²}) are independent.
(iii) V0 = cW0 = 0.
(iv) As the functions t ↦ t/c² and W ↦ cW are continuous, and t ↦ Wt
is continuous with probability 1, their composite t ↦ cW_{t/c²} = Vt is also
continuous with probability 1.
(2) (Wt is nowhere differentiable.)
More precisely, the function t ↦ Wt is nowhere differentiable, with
probability 1.
We will not give a rigorous proof of this, but the following heuristic
argument at least makes it clear that we cannot expect Wt to be differentiable.
(This should also be clear from the jaggedness of the simulated paths.)
To say that the function is differentiable at some point t = t0 means that
the limit
lim_{δt→0} (W_{t_0+δt} − W_{t_0}) / δt
exists. But W_{t_0+δt} − W_{t_0} ∼ N(0, δt), so that |W_{t_0+δt} − W_{t_0}| is typically of
size √(δt). Thus |(W_{t_0+δt} − W_{t_0})/δt| will typically be of size √(δt)/δt = (δt)^{−1/2},
which tends to ∞ as δt → 0.
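A quick numerical illustration of this scaling (a MATLAB sketch with arbitrary sample sizes): as δt shrinks, the typical size of the difference quotient grows like (δt)^{−1/2}.

% The average size of |W_{t0+dt} - W_{t0}| / dt grows like dt^(-1/2).
for dt = [1e-2, 1e-4, 1e-6]
    dW = sqrt(dt)*randn(1, 1e5);         % samples of W_{t0+dt} - W_{t0} ~ N(0, dt)
    fprintf('dt = %g: mean |dW|/dt = %8.1f, dt^(-1/2) = %8.1f\n', ...
            dt, mean(abs(dW))/dt, dt^(-0.5));
end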
(3) (Wt eventually hits any given value.)
More precisely, for any a > 0, we have
P[Wt < a for all t] = 0,
and for any a < 0, we have
P[Wt > a for all t] = 0.
Proof of (3). Suppose that a > 0. (The case a < 0 is similar.) We begin by
defining the first hitting time
Ta = sup{t | W_{t′} < a for all t′ < t}.
Here sup means supremum (that is, least upper bound), so by the continuity
of t ↦ Wt , Ta is the first time t for which Wt = a. If there is no t with
Wt = a then Wt < a for all t (by continuity, and using W0 = 0 < a), and by
convention Ta = ∞. We must show that P[Ta = ∞] = 0.
We will show in a moment that
P[Ta < t] = 2P[Wt > a]. (1)
Assuming (1) for now, set Xt = Wt/√t for t > 0. As Wt ∼ N(0, t) we have
Xt ∼ N(0, 1), and
P[Wt > a] = P[Xt > a/√t] → 1/2 as t → ∞.
Thus P[Ta < t] → 1 as t → ∞, so that P[Ta ≥ t] → 0 as t → ∞. This shows
that P[Ta = ∞] = 0, so that P[Wt < a for all t] = 0, as required.
It remains to prove (1). The idea is that, after hitting a at time Ta ,
the Brownian motion does not “remember” how it reached a, and at any
subsequent time is equally likely to be above or below a. To express
this formally, we use a conditional probability argument. From the rule
P[A | B] P[B] = P[A ∩ B], we have
P[Wt − W_{Ta} > 0 | t > Ta ] P[t > Ta ] = P[Wt − W_{Ta} > 0 and t > Ta ]
= P[Wt − W_{Ta} > 0],
since if Wt > W_{Ta} then automatically t > Ta by continuity. Now the
increments in Wt after time Ta are independent of those up to time Ta : more
precisely, the stochastic process (W′τ )τ≥0 defined by W′τ = W_{τ+Ta} − W_{Ta} is
a standard Brownian motion. (This works because Ta is a stopping time:
for any t we can determine whether or not t ≤ Ta by examining the path of
the Brownian motion up to time t.) Thus for t > Ta we have
P[Wt − W_{Ta} > 0 | t > Ta ] = P[W′_{t−Ta} > 0] = 1/2.
Thus
(1/2) P[Ta < t] = P[Wt − W_{Ta} > 0] = P[Wt > a],
giving (1).
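Identity (1) can also be checked numerically. The following sketch (with arbitrary choices of a, t, the timestep and the number of paths) estimates P[Ta < t] from simulated paths and compares it with 2P[Wt > a]; the discrete-time approximation slightly underestimates the hitting probability.

% Monte Carlo check of P[Ta < t] = 2*P[Wt > a] for standard Brownian motion.
a = 1; t = 1; dt = 1e-3; M = 1e4;        % arbitrary illustrative parameters
N = round(t/dt);
hits = 0;
for m = 1:M
    W = cumsum(sqrt(dt)*randn(1, N));    % one approximate path on [0, t]
    if max(W) >= a, hits = hits + 1; end % did the path reach level a?
end
fprintf('estimated P[Ta < t] = %.3f, 2*P[Wt > a] = %.3f\n', ...
        hits/M, erfc(a/sqrt(2*t)));      % 2*(1 - Phi(a/sqrt(t))) = erfc(a/sqrt(2t))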
Formal Framework for Probability
Before discussing further properties of Brownian motion, we need to be
a bit more precise in the way we formulate probabilistic statements. A
probability triple (Ω, F, P) consists of
• a sample space Ω; this is the set of all possible outcomes (for example,
all possible histories of the stock market).
• a collection F of events (subsets of Ω to which a probability can be
assigned). The point here is that there can be “weird” sets of outcomes
to which it is not possible to assign a probability; such sets will not be
members of F.
We assume that F is a σ-algebra (also called a σ-ring or σ-field),
i.e., F satisfies the following properties
◦ Ω ∈ F;
◦ F is closed under complements:
if A ∈ F then A′ = {ω ∈ Ω | ω ∉ A} ∈ F;
◦ F is closed under countable unions: if A1 , A2 , . . . is an infinite
sequence of sets in F, then
⋃_{n=1}^∞ An ∈ F.
(In particular, F is closed under finite unions: take all but finitely
many of the An to be the empty set.) Using complements, it then
follows that F is also closed under countable intersections.
• a probability measure P, assigning to each A ∈ F a probability P[A]
so that the following probability axioms hold:
◦ 0 ≤ P[A] ≤ 1 for all A ∈ F;
◦ P[Ω] = 1;
◦ P[A ∪ B] = P[A] + P[B] if A and B are disjoint;
◦ for an increasing sequence A1 ⊆ A2 ⊆ A3 ⊆ . . . of sets in F,
" ∞ #
[
P[An ] → P Am as n → ∞.
m=1
A random variable X is then just a function X: Ω −→ R (where R is the
set of real numbers) for which P[a ≤ X ≤ b] is defined whenever a ≤ b (more
formally, the set
{ω ∈ Ω | a ≤ X(ω) ≤ b}
is in F). We then say that X is F-measurable.
A filtration {Ft }t≥0 is a family of σ-algebras, one for each real number
t ≥ 0, such that Fs ⊆ Ft whenever s ≤ t, and Ft ⊆ F for all t. (Think
of t as time, and Ft as all events which are already known by time t.) We
write (FtX )t≥0 for the natural filtration associated to the stochastic process
(Xt )t≥0 : an event belongs to FtX if and only if we can determine whether it
occurs by examining the path (Xs )0≤s≤t of the process up to time t.
We say that the stochastic process (Xt )t≥0 is adapted to the filtration
(Ft )t≥0 if, for each t ≥ 0, the random variable Xt is Ft -measurable. If we
think of Ft as the history of the system up to time t, then (Xt )t≥0 is adapted
to (Ft )t≥0 if and only if the value taken by Xt is determined by time t.
Examples:
(1) (Xt )t≥0 is adapted to its natural filtration (FtX )t≥0 , by the definition of
FtX .
(2) The stochastic processes (Yt )t≥0 , (Mt )t≥0 , (Zt )t≥0 , defined by
Yt = ∫_0^t Xs ds,    Mt = max_{0≤s≤t} Xs ,    Zt = Xt³ − X_{t/2} ,
are all adapted to the natural filtration (FtX )t≥0 , but the stochastic process
(Vt )t≥0 , defined by
Vt = X2t + Xt
is not. (The value of Vt is not determined until time 2t is reached.)
Conditional Expectations
Let (Ft )t≥0 be a filtration of the σ-algebra F. For a random variable
X, we write E[X | Ft ] for the conditional expectation of X relative to Ft .
Then E[X | Ft ] is a random variable which is Ft -measurable. If we think
of Ft as the history up to time t, then we should interpret E[X | Ft ] as the
expectation of X given the history up to time t; it is a random variable whose
value is determined at time t.
Just as for discrete time stochastic processes, we have the 3 rules for
conditional expectations:
(1) (Time 0 Rule).
E[Z | F0 ] = E[Z].
(2) (Tower Law). For t > s
E[E[Z | Ft ] | Fs ] = E[Z | Fs ].
In particular E[E[Z | Ft ]] = E[Z].
(3) (Taking Out a Known Factor). If the random variable Y is Ft -measurable
then
E[Y Z | Ft ] = Y E[Z | Ft ].
Martingales
We define continuous time martingales just as in the discrete time case,
except that we build in some extra flexibility by explicitly mentioning the
filtration.
Definition
A continuous time stochastic process (Mt )t≥0 is called a martingale with
respect to the filtration (Ft )t≥0 if
(i) E[Mt | Fs ] = Ms for all t > s;
(ii) E[ |Mt | ] < ∞ for all t.
As before the main point is (i), which says that a martingale is “driftless”.
Condition (ii) is a technical restriction: a martingale is not allowed to “get
too big too quickly”.
The terms E[Mt | Fs ] and E[ |Mt | ] are defined with respect to a particular
probability measure P on the underlying sample space Ω. When we need to
emphasize this, we say that (Mt )t≥0 is a martingale with respect to P (and
with respect to the filtration (Ft )t≥0 ), and we write the above conditions as
EP [Mt | Fs ] = Ms for all t > s; EP [ |Mt | ] < ∞ for all t.
Just as in the discrete case, we can often start with a stochastic process
which is not a martingale (with respect to the given probability measure P),
and turn it into a martingale by “reweighting the probabilities”, that is, by
replacing P by a suitably chosen new probability measure P∗ .
The following result should not come as a surprise, given our examples of
discrete time martingales.
Lemma.
Let (Wt )t≥0 be a standard Brownian motion, and let (Ft )t≥0 be its associated
filtration. Then
(i) (Wt )t≥0 is a martingale with respect to (Ft )t≥0 ;
(ii) the stochastic process (Wt² − t)t≥0 is a martingale with respect to (Ft )t≥0
(but (Wt²)t≥0 is not).
Proof. (i) We have to show that E[Wt | Fs ] = Ws whenever t > s, and that
E[ |Wt | ] < ∞ for all t.
Now
E[Wt | Fs ] = E[(Wt − Ws ) + Ws | Fs ] = E[Wt − Ws | Fs ] + Ws .
Since the increments in the Brownian motion are stationary, and the
increments after time s are independent of the history up to time s, we
have
E[Wt − Ws | Fs ] = E[Wt − Ws ] = 0.
Thus
E[Wt | Fs ] = 0 + Ws = Ws ,
as required.
We must also check that E[ |Wt | ] < ∞. But Wt ∼ N (0, t), and for a
random variable X with distribution N (0, t), we have
E[ |X| ] = √(2t/π).
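For completeness, here is a short calculation of this expectation (not in the original notes), using the N(0, t) density:
\[
  \mathbb{E}[\,|X|\,]
  = \int_{-\infty}^{\infty} |x|\,\frac{1}{\sqrt{2\pi t}}\,e^{-x^2/(2t)}\,dx
  = \frac{2}{\sqrt{2\pi t}}\int_{0}^{\infty} x\,e^{-x^2/(2t)}\,dx
  = \frac{2}{\sqrt{2\pi t}}\Bigl[-t\,e^{-x^2/(2t)}\Bigr]_{0}^{\infty}
  = \frac{2t}{\sqrt{2\pi t}} = \sqrt{\frac{2t}{\pi}} < \infty.
\]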
(ii) Writing Mt = Wt² − t and expanding Wt² = ((Wt − Ws ) + Ws )², we have for t > s that
E[Mt | Fs ] = E[(Wt − Ws )² + 2Ws (Wt − Ws ) + Ws² − t | Fs ]
= E[(Wt − Ws )² | Fs ] + 2Ws E[Wt − Ws | Fs ] + Ws² − t.
Now E[Wt − Ws | Fs ] = 0 since we have already shown that (Wt )t≥0 is
a martingale. The behaviour of the Brownian motion from time s on is
again a Brownian motion: more precisely, the process (W′τ )τ≥0 defined by
W′τ = W_{τ+s} − Ws is itself a standard Brownian motion. Thus
E[(Wt − Ws )² | Fs ] = E[(W′_{t−s})²] = Var[W′_{t−s}] = t − s.
Putting all this together,
E[Mt | Fs ] = (t − s) + 0 + Ws² − t = Ws² − s = Ms .
This shows that the process (Mt )t≥0 satisfies the first condition in the
definition of a martingale. It also shows that
E[Wt² | Fs ] = Ws² + (t − s) ≠ Ws² ,
so that the process (Wt²)t≥0 cannot be a martingale (it has a positive drift).
Finally, we check that
E[ |Mt | ] = E[ |Wt² − t| ] ≤ E[Wt² + t] = E[Wt²] + t = t + t = 2t < ∞.
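As a quick sanity check (an illustrative MATLAB sketch, with arbitrary parameter choices), simulation confirms that E[Wt²] = t, consistent with Wt² − t being driftless:

% Illustrative check that E[W_t^2] = t (so W_t^2 - t has mean zero).
t = 2; M = 1e5;
Wt = sqrt(t)*randn(1, M);          % M independent samples of W_t ~ N(0, t)
fprintf('sample mean of W_t^2 - t: %g (should be close to 0)\n', mean(Wt.^2) - t);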
As an application of the martingale property for the standard Brownian
motion (Wt )t≥0 , we calculate the covariances of this process. Recall that for
any random variables X and Y we have
Cov[X, Y ] = E[XY ] − E[X]E[Y ].
Lemma
Let (Wt )t≥0 be a standard Brownian motion. Then
Cov[Ws , Wt ] = min(s, t).
Proof. Let (Ft )t≥0 be the natural filtration associated to (Wt )t≥0 . Without
loss of generality, we suppose s ≤ t. We know E[Ws ] = E[Wt ] = 0. Using the
fact that Ws is Fs -measurable, we calculate
Cov[Ws , Wt ] = E[Ws Wt ] − 0
= E[E[Ws Wt | Fs ]] (Tower Law.)
= E[Ws E[Wt | Fs ]] (Taking out Ws .)
= E[Ws Ws ] ( as (Wt )t≥0 is a martingale.)
= Var[Ws ]
= s.
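This too can be verified by simulation (an illustrative MATLAB sketch, with arbitrary values of s, t and sample size), drawing Ws and the independent increment Wt − Ws directly from their normal distributions:

% Illustrative Monte Carlo check that Cov[W_s, W_t] = min(s, t).
s = 0.5; t = 2; M = 1e5;            % arbitrary illustrative parameters
Ws = sqrt(s)*randn(1, M);           % W_s ~ N(0, s)
Wt = Ws + sqrt(t - s)*randn(1, M);  % independent increment W_t - W_s ~ N(0, t - s)
C = cov(Ws, Wt);                    % 2x2 sample covariance matrix
fprintf('sample Cov[Ws, Wt] = %.3f, min(s,t) = %.3f\n', C(1,2), min(s, t));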
Nigel Byott
February 2007