02 Markov Chain
1 Markov chain
Definition 1. A (discrete time) Markov chain with state space X is a sequence X_0, X_1, . . . of X-valued random variables such that for all states i, j, k_0, k_1, . . . and all times t = 0, 1, 2, . . . ,

$$P(X_{t+1} = j \mid X_t = i, X_{t-1} = k_{t-1}, \ldots, X_0 = k_0) = P(X_{t+1} = j \mid X_t = i) = p_{i,j},$$

where p_{i,j} depends only on the states i, j and not on the time t or the previous states k_{t-1}, k_{t-2}, . . . .
A very important property of a Markov chain is that it is Markovian, i.e., only the current
state affects the probability distribution of the future values of the process. History does not
matter.
In fact, a Poisson process is also Markovian. The number of arrivals in the next moment, N(t + δ), given the current number of arrivals N(t), depends only on the current value and not on the history. It is a continuous-time Markov chain. In this chapter, we will focus on discrete-time Markov chains.
2 Transition matrix
The main ingredient of a Markov chain is the transition probability p_{i,j}, which gives the probability of a transition from state i to state j. In a Markov chain, this probability does not change over time. We can collect all transition probabilities in matrix form and obtain the transition matrix P:
$$P = (p_{i,j})_{i,j} = \begin{pmatrix} p_{1,1} & \cdots & p_{1,M} \\ \vdots & \ddots & \vdots \\ p_{M,1} & \cdots & p_{M,M} \end{pmatrix}.$$
If the initial state is random with probability q_0 = (q_{0,1}, . . . , q_{0,M}), then the unconditional probability at period 1 is given by

$$P(X_1 = i) = q_{1,i} = \sum_{j=1}^{M} P(X_1 = i \cap X_0 = j) = \sum_{j=1}^{M} P(X_0 = j)\,P(X_1 = i \mid X_0 = j) = \sum_{j=1}^{M} q_{0,j}\, p_{j,i}.$$
The above equation can be interpreted easily: the probability that the Markov chain is in state i at time n = 1 is the sum over j of the probabilities that it is in state j at time n = 0 and then transits to state i. In matrix form,
$$q_1 = \left( \sum_{j=1}^{M} q_{0,j}\, p_{j,1}, \;\cdots,\; \sum_{j=1}^{M} q_{0,j}\, p_{j,M} \right) = (q_{0,1}, \cdots, q_{0,M}) \begin{pmatrix} p_{1,1} & \cdots & p_{1,M} \\ \vdots & \ddots & \vdots \\ p_{M,1} & \cdots & p_{M,M} \end{pmatrix} = q_0 P.$$
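As a sanity check, this one-step update is easy to compute numerically. The following is a minimal sketch; the 2-state transition matrix and initial distribution are made-up values for illustration only:

```python
import numpy as np

# Hypothetical 2-state transition matrix (each row sums to one).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
# Hypothetical initial distribution q0.
q0 = np.array([0.3, 0.7])

# Unconditional distribution at period 1: q1 = q0 P (row vector times matrix).
q1 = q0 @ P
print(q1)  # [0.62 0.38]
```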
Theorem 2.1. Let p_{ij}^{(n)} = P(X_{n+k} = j | X_k = i) be the n-step transition probabilities. Then the Chapman-Kolmogorov equations are

$$p_{ij}^{(n+m)} = \sum_{k=1}^{M} p_{ik}^{(n)}\, p_{kj}^{(m)} \qquad \text{for all } n, m \ge 0 \text{ and all } i, j.$$
Proof. To see this, condition on the state at time n. By the Markovian property of X_n, the information X_0 = i is not useful when we already know X_n = k, i.e.,

$$P(X_{m+n} = j \mid X_n = k \cap X_0 = i) = P(X_{m+n} = j \mid X_n = k).$$
Now gathering terms we get

$$p_{ij}^{(m+n)} = \sum_{k=1}^{M} P(X_{m+n} = j \cap X_n = k \mid X_0 = i) = \sum_{k=1}^{M} P(X_{m+n} = j \mid X_n = k)\, P(X_n = k \mid X_0 = i) = \sum_{k=1}^{M} p_{kj}^{(m)}\, p_{ik}^{(n)}.$$
In matrix form, the Chapman-Kolmogorov equations give

$$P^{(2)} = P^{(1+1)} = P\,P = P^2, \qquad \text{and more generally } P^{(n)} = P^n.$$
If the initial state is random with probability q_0, then the unconditional probability at period n is given by q_n = q_0 P^n. As n becomes very large, q_n may converge to a stationary probability.
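A sketch of the same computation for general n, reusing the made-up matrix above; np.linalg.matrix_power gives P^n, consistent with the Chapman-Kolmogorov relation P^{(n+m)} = P^{(n)} P^{(m)}:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])   # hypothetical transition matrix
q0 = np.array([0.3, 0.7])    # hypothetical initial distribution

# Unconditional distribution at period n: q_n = q_0 P^n.
for n in [1, 2, 5, 10, 50]:
    qn = q0 @ np.linalg.matrix_power(P, n)
    print(n, qn)
# For this particular chain, q_n settles down near (5/6, 1/6) as n grows.
```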
Theorem 2.2. The limiting probabilities of a Markov chain exist and do not depend on the initial state if the chain is aperiodic and irreducible. Moreover, they are given by the solution of

$$\pi_j = \sum_i \pi_i\, p_{ij}, \qquad \sum_j \pi_j = 1,$$

or in matrix form πP = π.
The meaning of periodicity and reducibility will be discussed in the next section. To find the stationary probability, notice that by taking the transpose of the equation πP = π,

$$P'\pi' = \pi' \implies (P' - I)\pi' = 0,$$

which has the form of the eigenvalue equation Av = λv. Therefore, the vector π', if it exists, is an eigenvector of the matrix P' corresponding to the eigenvalue λ = 1, up to the normalization that the entries of π sum to one.
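In code, this amounts to extracting the eigenvector of P' associated with the eigenvalue 1 and rescaling it so that its entries sum to one. A minimal sketch, again with a made-up matrix:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])              # hypothetical transition matrix

# Eigen-decomposition of the transpose P'.
eigvals, eigvecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(eigvals - 1.0))    # index of the eigenvalue closest to 1
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()                      # normalize so that the entries sum to one
print(pi)                               # approximately [0.833 0.167]
```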
Example (Gambler's ruin). Consider a gambling game in which on any turn you win $1
with probability p = 0.4 or lose $1 with probability 1 − p = 0.6. Suppose initially you have $2
and you will quit the game either if your fortune reaches $4 or $0. Let Xn be the amount you
have after n plays.
In this example, the state space, i.e., the amount of money you can have at any time n, is X = {0, 1, 2, 3, 4}. The transition matrix is given by
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.6 & 0 & 0.4 & 0 & 0 \\ 0 & 0.6 & 0 & 0.4 & 0 \\ 0 & 0 & 0.6 & 0 & 0.4 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
Given the initial probability q_0 = (0, 0, 1, 0, 0), the unconditional probabilities at different n are plotted in the diagram below:
We know that the player will end up with either $0 or $4. The probabilities of the other states converge to zero very quickly. Although the unconditional probability seems to stabilize in the above diagram, the limiting probability actually does not exist. In fact, if we compute the eigenvectors of P', we will find two eigenvectors π = (1, 0, 0, 0, 0) and π = (0, 0, 0, 0, 1) that correspond to the eigenvalue λ = 1. With a different initial probability, the unconditional probability will also converge to a different value. (Students are encouraged to play around with the R file I provided.)
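The R file itself is not reproduced here; the sketch below is a Python analogue of the same experiment, using the transition matrix and q_0 stated in this example:

```python
import numpy as np

# Gambler's ruin chain on the state space {0, 1, 2, 3, 4}.
P = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.6, 0.0, 0.4, 0.0, 0.0],
              [0.0, 0.6, 0.0, 0.4, 0.0],
              [0.0, 0.0, 0.6, 0.0, 0.4],
              [0.0, 0.0, 0.0, 0.0, 1.0]])
q0 = np.array([0.0, 0.0, 1.0, 0.0, 0.0])   # start with $2

# Unconditional probabilities q_n = q_0 P^n: the mass accumulates on the
# absorbing states 0 and 4, while the transient states die out.
for n in [1, 2, 5, 10, 50]:
    print(n, np.round(q0 @ np.linalg.matrix_power(P, n), 4))

# The eigenvalue 1 of P' has a two-dimensional eigenspace, spanned by
# (1,0,0,0,0) and (0,0,0,0,1), so no unique limiting probability exists.
eigvals, _ = np.linalg.eig(P.T)
print(np.round(np.sort(eigvals.real)[::-1], 4))
```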
Example (Brand preference). Suppose there are three types of laundry detergent, and let X_n be the brand chosen on the nth purchase. Customers who try these brands are satisfied and choose the same brand again with probability 0.8, 0.6, and 0.4, respectively. When they change, they pick one of the other two brands at random. Suppose also that q_0 = (0, 0.5, 0.5).
In this example, the transition matrix is given by
$$P = \begin{pmatrix} 0.8 & 0.1 & 0.1 \\ 0.2 & 0.6 & 0.2 \\ 0.3 & 0.3 & 0.4 \end{pmatrix}.$$
The unconditional probability can be viewed as the market share of each brand, and is plotted
below.
We see that the market share stabilizes. After normalization, the eigenvector of P′ corresponding
to λ = 1 is
$$\pi = \left( \frac{6}{11}, \frac{3}{11}, \frac{2}{11} \right),$$
which is also the limiting probability of the Markov chain.
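As a quick check that πP = π indeed holds for this vector, the first component of πP is

$$\sum_i \pi_i\, p_{i,1} = \tfrac{6}{11}(0.8) + \tfrac{3}{11}(0.2) + \tfrac{2}{11}(0.3) = \tfrac{4.8 + 0.6 + 0.6}{11} = \tfrac{6}{11},$$

and the remaining two components can be verified in the same way.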
3 Classification of states
3.1 Accessibility and class
A Markov chain is a model of (random) transitions among states. From the transition matrix
P we can deduce a lot of properties of the states. In this section, we try to classify the states
into different groups or types according to their properties.
Definition 3. 1. State j is said to be accessible from state i if p_{ij}^{(n)} > 0 for some n ≥ 0.
2. Two states i and j that are accessible to each other are said to communicate, and we write i ↔ j.
3. Two states that communicate are said to be in the same class.
4. The Markov chain is said to be irreducible if there is only one class, i.e., if all states
communicate with each other.
The above definitions are all related to each other. Accessibility means that it is possible (i.e., with probability greater than zero) to move from one state to another after some number of steps. If p_{ij}^{(n)} = 0 for all n, then

$$P(\text{ever be in } j \mid \text{start in } i) = P\left( \bigcup_{n=0}^{\infty} \{X_n = j\} \,\Big|\, X_0 = i \right) \le \sum_{n=0}^{\infty} P(X_n = j \mid X_0 = i) = \sum_{n=0}^{\infty} p_{ij}^{(n)} = 0.$$
Therefore, if state j is not accessible from state i, then starting in i, the process will never end
up in j .
Example (Gambler's ruin). Continuing with the example, the relations among states in
the state space X = {0, 1, 2, 3, 4} are
0←1↔2↔3→4
It is easy to see that every state is accessible from states {1, 2, 3}. However, after we get to states
{0} or {4}, we cannot get to other states anymore. Therefore, the states {1, 2, 3} communicate
with each other and form one class. {0} and {4} do not communicate with each other or with other states, and each of them forms a class on its own. There are in total three classes: {1, 2, 3}, {0}, and {4}.
Since there is more than one class, this Markov chain is reducible. This also explains why
Theorem 2.2 does not apply: the limiting probabilities do not exist.
Example (Brand preference). In this example, it is possible to switch from any state to
any state. Therefore, all three states are accessible from each other and they all communicate.
There is only one class. The Markov chain is irreducible.
Property. 1. State i communicates with itself, i.e., i ↔ i.
2. If i ↔ j, then j ↔ i.
3. If i ↔ j and j ↔ k, then i ↔ k.
4. Any two classes are either disjoint or identical, so the classes partition the state space.
Proof. The first two properties are trivial; they are true by definition. To show the third point, notice that i ↔ j implies that p_{ij}^{(n)} > 0 for some n, and j ↔ k implies that p_{jk}^{(m)} > 0 for some m. By the Chapman-Kolmogorov equations,

$$p_{ik}^{(n+m)} = \sum_{r} p_{ir}^{(n)}\, p_{rk}^{(m)} \ge p_{ij}^{(n)}\, p_{jk}^{(m)} > 0,$$

so k is accessible from i. We can show following the same steps that p_{ki}^{(s)} > 0 for some s. Therefore, i ↔ k. The fourth point is a logical consequence of the third point.
3.2 Recurrence and transience
Definition 4. For any states i and j, let f_{i,j} be the probability that, starting in state i, the process will ever enter state j. State i is said to be recurrent if f_{i,i} = 1 and transient if f_{i,i} < 1.
A recurrent state is one such that, once we enter the state, we will eventually re-enter the same state in the future, i.e.,

$$f_{i,i} = P(i \to \cdots \to i) = 1.$$
In contrast, a transient state is one for which, with positive probability, the process will never revisit the same state after it leaves the state. This can happen when the process jumps to another class. Remember that if two states, say i and j, are in different classes, they do not communicate. Therefore, if i → j, then j ↛ i. If the process enters j after leaving i, it will never visit i again.
Example (Gambler's ruin). If the player has $0 or $4, they will quit and their fortune
will remain unchanged thereafter. Therefore, obviously {0} and {4} are two recurrent states.
For the states {1, 2, 3}, since there is positive probability that Xn reaches 0 or 4 after leaving
the state and before re-entering the same state, they are transient states.
Example (Brand preference). Since it is always possible to re-enter any state, all states
are recurrent in this example.
The above examples already give some hints about the properties of recurrent/transient
states, which we will discuss below.
Property. 1. If state i is recurrent, then starting in state i, the process will re-enter state i infinitely often.
2. If state i is recurrent and i ↔ j, then state j is also recurrent, i.e., recurrence is a class property.
3. Consequently, transience is also a class property.
4. If state i is transient, then starting in state i, the number of time periods that the process will be in state i has a geometric distribution with finite mean 1/(1 − f_{i,i}).
5. In a finite-state Markov chain, not all states can be transient; at least one state must be recurrent.
Proof. 1. It is easy to verify: since the state i is recurrent, starting in i, the process will eventually return to i, i.e., f_{i,i} = P(i → · · · → i) = 1. Due to the Markovian property of a Markov chain, the probability that the process will eventually return to i twice is

$$P(i \to \cdots \to i \to \cdots \to i) = f_{i,i}^2 = 1.$$

By induction, the probability that the process will return to i infinitely many times is still one.
2. Since i ↔ j, there is a positive probability that starting in i, the process will reach j. Therefore, the probability that starting in i, the process reaches j before returning to i is g_{ij} = P(i → · · · → j → · · · → i) > 0. The probability that the process starts in i and returns to i without passing through j for n times is (1 − g_{ij})^n. From the above, we know that starting in i, the probability that the process never reaches j is therefore lim_{n→∞} (1 − g_{ij})^n = 0, implying that the process will eventually reach j, i.e., f_{i,j} = 1. The probability that the above happens twice, i.e., i → · · · → j → · · · → i → · · · → j → · · · → i, is f_{i,j}^2 = 1, implying that f_{j,j} = 1.
4. First, since recurrence is a class property, starting in i, the process will never come back to i if it jumps to another class. The probability of this happening is 1 − f_{i,i}. If we define E as the event that the process, starting in i, jumps to another class before reaching i again, then the probability of success is P(E) = 1 − f_{i,i}. Each visit to i begins a Bernoulli trial, and the number of time periods that the process will be in i is the number of trials until the first success, which follows a geometric distribution with p = P(E) = 1 − f_{i,i} and hence has finite mean 1/(1 − f_{i,i}).
5. Since the number of time periods that the process will be in a transient state follows a geometric distribution, the probability that it is finite equals one. It means that after a finite amount of time, say T_i, the Markov chain will never go back to state i. Suppose that the state space is X = {1, . . . , M}; then if all states were transient, the process would never go back to any of the states after T_1 + · · · + T_M periods. However, the process must visit some state in every period. Therefore, at least one class of states is recurrent.
The above tells us about the properties of recurrent and transient states. We know that
the process will eventually go to a recurrent class. Now, we focus on the transient states and
consider the question: on average, how many time periods will the process stay in a transient
state? To answer this question, suppose the transient states are collected in T = {1, . . . , t}. We
define the matrix of transition probabilities from one transient state to another transient state,

$$P_T = \begin{pmatrix} p_{11} & \cdots & p_{1t} \\ \vdots & \ddots & \vdots \\ p_{t1} & \cdots & p_{tt} \end{pmatrix}.$$
We can compute the mean time spent in transient states using the following theorem.
Theorem 3.1. Let P_T specify only the transition probabilities from transient states into transient states, and let S = (s_{ij})_{i,j=1,...,t} denote the matrix that collects the expected number of time periods that the Markov chain spends in state j, given that it starts in state i. Then

$$S = (I - P_T)^{-1}.$$
Finally, we also want to find the probability that a Markov chain ever makes a transition into state j given that it starts in i. We already know that if i is recurrent, then f_{i,i} = 1, and f_{i,j} = 0 if j is transient. The focus is on the case when both i and j are transient.
Theorem 3.2. The probability that the Markov chain ever makes a transition into a transient
state j given that it starts in a transient state i is given by
$$f_{i,j} = \frac{s_{ij} - \delta_{i,j}}{s_{jj}}, \qquad \delta_{i,j} = \begin{cases} 1, & i = j; \\ 0, & i \ne j. \end{cases}$$
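As a sketch of how both theorems can be applied numerically, consider the transient states {1, 2, 3} of the gambler's ruin chain; the submatrix P_T is read off the transition matrix given earlier:

```python
import numpy as np

# Transition probabilities among the transient states {1, 2, 3} of the
# gambler's ruin chain (win $1 w.p. 0.4, lose $1 w.p. 0.6).
PT = np.array([[0.0, 0.4, 0.0],
               [0.6, 0.0, 0.4],
               [0.0, 0.6, 0.0]])

# Expected number of periods spent in transient state j starting from i.
S = np.linalg.inv(np.eye(3) - PT)
print(np.round(S, 3))

# Probability of ever visiting transient state j starting from transient
# state i: f_ij = (s_ij - delta_ij) / s_jj.
F = (S - np.eye(3)) / np.diag(S)
print(np.round(F, 3))
```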
3.3 Positive recurrence
In the last part, we introduced the concepts of recurrence and transience, and discussed the properties of the transient states. In this part, we will further separate recurrent states into positive recurrent or null recurrent states.
Definition 5. If state j is recurrent, let m_j be the expected number of transitions that it takes
the Markov chain when starting in state j to return to that state. Then, the recurrent state j
is said to be positive recurrent if mj < ∞ and null recurrent if mj = ∞.
Let T_j = min{n > 0 : X_n = j} be the number of transitions until the Markov chain first
makes a transition into state j , then mj = E [Tj |X0 = j]. The relation between mj and the
long-run proportion of time in state j , πj , is stated below.
Theorem 3.3. If the Markov chain is irreducible and recurrent, then for any initial state πj =
1/mj .
Proof. Suppose the process starts in state i, and that it takes T_1 transitions to reach j for the first time. Then, denote the additional amount of time for the process to return to j for the k-th time as T_k. At the time when the process returns to j for the n-th time, the proportion of time the process has spent in state j is given by

$$\pi_{j,n} = \frac{n}{\sum_{k=1}^{n} T_k}.$$

As n → ∞,

$$\pi_j = \lim_{n\to\infty} \frac{n}{\sum_{k=1}^{n} T_k} = \lim_{n\to\infty} \frac{1}{n^{-1}\sum_{k=1}^{n} T_k} = \lim_{n\to\infty} \frac{1}{n^{-1} T_1 + n^{-1}\sum_{k=2}^{n} T_k}.$$

Since the Markov chain is irreducible and recurrent, states i and j communicate and it takes finite time for the process to reach j from i. Therefore, n^{-1} T_1 → 0. Next, by the Markovian property of the Markov chain, the T_k are independent and identically distributed for all k ≥ 2. By the Law of Large Numbers,

$$n^{-1} \sum_{k=2}^{n} T_k = \frac{n-1}{n} \cdot \frac{1}{n-1} \sum_{k=2}^{n} T_k \xrightarrow{p} m_j.$$
Therefore, πj = 1/mj .
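The relation π_j = 1/m_j can be checked by simulation. The sketch below uses the brand preference chain, whose long-run proportion for brand 1 is 6/11 ≈ 0.545; the number of simulated returns and the random seed are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])       # brand preference chain

# Estimate the mean return time m_j to state j = 0 by simulating many returns.
j, n_returns = 0, 20_000
state, steps, returns = j, 0, 0
while returns < n_returns:
    state = rng.choice(3, p=P[state])
    steps += 1
    if state == j:
        returns += 1

m_j = steps / n_returns
print(m_j, 1.0 / m_j)    # 1/m_j should be close to 6/11 ≈ 0.545
```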
Property. 1. If state i is positive recurrent and i ↔ j, then state j is also positive recurrent, i.e., positive recurrence is a class property.
2. Null recurrence is also a class property.
3. An irreducible finite-state Markov chain must be positive recurrent.

Proof. 1. Suppose i is positive recurrent and i ↔ j. Then, there exists an n such that p_{i,j}^{(n)} > 0. Now let π_i > 0 be the long-run proportion of time that the chain is in i. Then π_i p_{i,j}^{(n)} represents the long-run proportion of time that the process is in i and then in j after n transitions. Equivalently, it is the long-run proportion of time that the process is in j and was in i exactly n transitions before, which is smaller than or equal to the long-run proportion of time that the chain is in j. Finally,

$$\pi_j \ge \pi_i\, p_{i,j}^{(n)} > 0.$$
2. It follows directly from the above. Since recurrence is a class property, if i is recurrent and
i ↔ j , then j is also recurrent. Since positive recurrence is also a class property, i and
j can either be both positive or null recurrent. Therefore, null recurrence is also a class
property.
3. We already know that an irreducible finite-state Markov chain must be recurrent and that positive/null recurrence is a class property. If every state were null recurrent, the long-run proportion of each state would be π_i = 1/m_i = 0, which is impossible since there are only finitely many states. Therefore, the chain can only be positive recurrent.
Theorem 3.4. The long-run proportions of an irreducible Markov chain are the unique solution
of the equations
$$\pi_j = \sum_i \pi_i\, p_{ij}, \qquad \sum_j \pi_j = 1.$$
Proof. Noting that π_i is the long-run proportion of transitions out of state i, the long-run proportion of transitions from state i to j is π_i p_{ij}. Therefore, the long-run proportion of time in state j is obtained by adding up the long-run proportions of transitions from all states i into j, i.e.,

$$\pi_j = \sum_i \pi_i\, p_{ij}.$$
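Numerically, the system can also be solved directly as a set of linear equations by replacing one balance equation with the normalization constraint. A sketch, again with the brand preference matrix:

```python
import numpy as np

P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])       # brand preference chain
M = P.shape[0]

# Balance equations written as (P' - I) pi' = 0, with the last equation
# replaced by the normalization sum_j pi_j = 1.
A = P.T - np.eye(M)
A[-1, :] = 1.0
b = np.zeros(M)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(pi)    # approximately [6/11, 3/11, 2/11]
```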
The long-run proportions are also called the stationary probabilities. The reason is that if
the initial state is chosen according to πj , then the probability of being in state j at any time n
is also equal to πj . You may also notice that they are the same as the limiting probabilities, if
they exist. The relations between them will be discussed next.
3.4 Periodicity
Definition 6. The period of a state i is the largest integer that divides all the n ≥ 1 for which p_{i,i}^{(n)} > 0. A Markov chain is said to be aperiodic if all of its states have period one, and it is periodic otherwise.
Example (Gambler's ruin). In this example, one can only either earn one dollar or lose
one dollar unless one has already quit the game. We have for example
$$p_{22} = 0, \quad p_{22}^{(2)} > 0, \quad p_{22}^{(3)} = 0, \quad p_{22}^{(4)} > 0, \ldots$$
Then, we say the period of state 2 is 2, and that the process is periodic.
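A small sketch that recovers the period of state 2 numerically, by taking the greatest common divisor of all step counts n (up to an arbitrary horizon) for which p_22^(n) > 0:

```python
import numpy as np
from math import gcd
from functools import reduce

# Gambler's ruin transition matrix on {0, 1, 2, 3, 4}.
P = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.6, 0.0, 0.4, 0.0, 0.0],
              [0.0, 0.6, 0.0, 0.4, 0.0],
              [0.0, 0.0, 0.6, 0.0, 0.4],
              [0.0, 0.0, 0.0, 0.0, 1.0]])

# Collect the n with positive n-step return probability to state 2,
# then take their greatest common divisor.
i = 2
ns = [n for n in range(1, 21) if np.linalg.matrix_power(P, n)[i, i] > 1e-12]
print(ns)                 # [2, 4, 6, ..., 20]
print(reduce(gcd, ns))    # period of state 2: 2
```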
Example (Brand preference). In this example, pii > 0 for all i. Obviously, the process
is aperiodic.
Theorem 3.5. 1. A periodic Markov chain does not have limiting probabilities.
2. The limiting probabilities of an irreducible, aperiodic chain always exist and do not depend
on the initial state.
3. The limiting probabilities, when they exist, will equal the long-run proportions.
Combining the second and third parts of the above theorem, the limiting probabilities of an irreducible, positive recurrent, aperiodic Markov chain are always the same as the long-run proportions. In other words, as long as we observe the random process for a long enough period of time, we will be able to estimate the limiting probabilities by simply computing the long-run proportion of time spent in each state. Such a process is said to be ergodic.
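A sketch of this idea for the brand preference chain: simulate one long path and compare the empirical proportion of time spent in each state with (6/11, 3/11, 2/11); the path length and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])       # brand preference chain

# Simulate a single long trajectory and count the visits to each state.
n_steps = 100_000
state = 0
counts = np.zeros(3)
for _ in range(n_steps):
    state = rng.choice(3, p=P[state])
    counts[state] += 1

print(counts / n_steps)   # close to [0.545, 0.273, 0.182]
```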