Stochastic Processes

The document discusses stochastic processes, focusing on the classification of states and chains in Markov processes. It defines key concepts such as recurrent and transient states, mean return time, and the classification of states into null and positive recurrence. Additionally, it explores the communication between states and the implications of these classifications on the behavior of Markov chains.


LTCC:

Stochastic Processes

Maria De Iorio

Department of Statistical Science

University College London

m.deiorio@ucl.ac.uk

Outline

1 Summary

2 Classification of States

3 Classification of chains

4 Invariant distribution

Summary


summary of previous lecture

P(X_n = i for some n ≥ 1 | X_0 = i) = 1  ⇒  state i is recurrent;
P(X_n = i for some n ≥ 1 | X_0 = i) < 1  ⇒  state i is transient.

f_ij(n) := prob. that the 1st visit to j from i occurs at time n ≥ 1
f_ij := Σ_{n∈N} f_ij(n) = prob. of ever visiting j from i  [N := {1, 2, . . .}]
f_jj = prob. of ever returning to j
T_j := 1st hitting time of state j
P(T_j = n | X_0 = i) = f_ij(n), n ≥ 1
j recurrent ⇔ f_jj = 1 ⇔ Σ_n p_jj^(n) = ∞
j transient ⇔ f_jj < 1 ⇔ Σ_n p_jj^(n) < ∞ ⇒ p_ij^(n) → 0, ∀ i ∈ S
Classification of States

Definition 1 (mean return time)

Define the mean return time to state j as

µ_j := E(T_j | X_0 = j) = Σ_{n=1}^∞ n f_jj(n)
Classification of States

Remark 2 (a note on mean return time and recurrence/transience)

For j transient, f_jj < 1 ⇒

P(T_j < ∞ | X_0 = j) = Σ_{n∈N} f_jj(n) = f_jj < 1

i.e. the probability of returning in finite time is strictly less than 1.
Therefore the return time is infinite with positive probability:

P(T_j = ∞ | X_0 = j) = 1 − f_jj > 0

The mean return time to a transient state j is

µ_j = E(T_j | X_0 = j) = ∞

because T_j = ∞ with positive probability, which forces the expectation to diverge.
However, for j recurrent, µ_j may or may not diverge.
Classification of States

Definition 3 (null and positive recurrence)

Let j be recurrent. Then j is

null-recurrent if µ_j = ∞,
positive (non-null) recurrent if µ_j < ∞.

Remark 4

A Markov chain returns to a recurrent state w.p. 1, and does so infinitely many times. However, unlike for a positive recurrent state, the mean return time of a null-recurrent state diverges.

Example 5 (random walk, . . . again)

1 For p ≠ q, show that all states are transient.
2 For p = q = 1/2, show that all states are null-recurrent.

Solution We'll prove this for state 0 and then invoke a theorem later on to extend the result to all states (recurrence is a class property!).
Classification of States

Recall gambler's ruin. Player A and player B play a series of games.

P(A wins) = p
P(B wins) = 1 − p =: q

Let X_n = A's lead over B after n games. X_0 = 0, S = Z. We first want to find f_00 and see whether it is < or = 1. Note X_n = 0 [chain enters state 0 at time n] iff A and B win an equal number (n/2) of games, so n must be even. There are

(n choose n/2)

many ways of this occurring, each with probability (pq)^(n/2)
[A wins n/2 times: p^(n/2); B wins n/2 times: q^(n/2)].
I.e. the chain returns to state 0 at time n w.p.

p_00^(n) = (n choose n/2) (pq)^(n/2), n even

Recall the generating-function relation p̃_ij(s) = δ_ij + f̃_ij(s) p̃_jj(s). Then

f̃_00(s) = 1 − 1/p̃_00(s)   [i = j = 0]
Classification of States

f̃_00(s) = 1 − 1/p̃_00(s)

Now

p̃_00(s) := Σ_{n=0}^∞ s^n p_00^(n) = Σ_{n even} (n choose n/2) (√(pq) s)^n

Recall:

Σ_{m=0}^∞ (2m choose m) x^m = 1/√(1 − 4x)

Then

p̃_00(s) = Σ_{n even} (n choose n/2) (√(pq) s)^n = Σ_{m=0}^∞ (2m choose m) (√(pq) s)^(2m)
        = Σ_{m=0}^∞ (2m choose m) (pq s²)^m = 1/√(1 − 4pq s²)   [exercise?!]¹

¹ a nice alternative here is to use Stirling's approximation to n!
Classification of States

Hence f̃_00(s) = 1 − √(1 − 4pq s²), and the probability of ever returning is

f_00 = Σ_{n=1}^∞ f_00(n) = lim_{s→1−} f̃_00(s) = 1 − √(1 − 4pq) = 1 − |p − q|

which = 1 if p = q, and < 1 if p ≠ q.

Hence state 0 is transient for p ≠ q and recurrent for p = q. We
now need to show null recurrence for p = q.
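The closed form just derived can be cross-checked numerically. The sketch below (not from the slides; the helper name `first_return_probs` is mine) recovers f_00(n) from the first-return decomposition p_00^(n) = Σ_{k=1}^n f_00(k) p_00^(n−k) and compares the truncated sum Σ_n f_00(n) with 1 − √(1 − 4pq):

```python
from math import comb, sqrt

def first_return_probs(p, n_max):
    """f00(n), n = 1..n_max, via f00(n) = p00(n) - sum_{k<n} f00(k) p00(n-k)."""
    q = 1.0 - p
    # n-step return probabilities p00(n): nonzero only for even n
    p00 = [1.0] + [comb(n, n // 2) * (p * q) ** (n // 2) if n % 2 == 0 else 0.0
                   for n in range(1, n_max + 1)]
    f00 = [0.0] * (n_max + 1)                      # f00[0] is unused
    for n in range(1, n_max + 1):
        f00[n] = p00[n] - sum(f00[k] * p00[n - k] for k in range(1, n))
    return f00[1:]

p = 0.3
print(sum(first_return_probs(p, 400)))             # ≈ 0.6
print(1 - sqrt(1 - 4 * p * (1 - p)))               # 1 - |p - q| = 0.6
```

For p = 0.3 the two numbers agree to many decimals; for p = q = 1/2 the truncated sum creeps towards 1 only slowly, a first hint of null recurrence.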

Classification of States

Set p = q = 1/2 and consider the mean return time:

µ_0 := E(T_0 | X_0 = 0)
     = Σ_{n=1}^∞ n f_00(n)
     = lim_{s→1−} d/ds f̃_00(s)        [f̃_00(s) = Σ_n s^n f_00(n)]
     = lim_{s→1−} d/ds (1 − √(1 − s²))   [from previous slide, with 4pq = 1]
     = lim_{s→1−} s/√(1 − s²) = ∞
Therefore, state 0 is null-recurrent for p = q.
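The divergence of µ_0 can also be seen numerically. A sketch (assuming the standard closed-form first-return law f_00(2m) = (2m choose m)/((2m − 1) 4^m) for the symmetric walk, which the slides do not derive):

```python
from math import comb

def partial_mean_return(n_max):
    """sum_{n <= n_max} n * f00(n) for p = q = 1/2, using the closed-form
    first-return law f00(2m) = C(2m, m) / ((2m - 1) * 4**m)."""
    return sum(2 * m * comb(2 * m, m) / ((2 * m - 1) * 4**m)
               for m in range(1, n_max // 2 + 1))

for N in (100, 1000, 10000):
    print(N, partial_mean_return(N))   # grows like c * sqrt(N): no finite limit
```

The partial sums keep growing roughly like √N, so the series Σ n f_00(n) diverges even though Σ f_00(n) = 1.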
Remark 6 (random walk summary)

p < q: transient; X_n → −∞
p > q: transient; X_n → +∞
p = q: null-recurrent; X returns to its initial state infinitely often, although the mean return time is infinite
Classification of States

Theorem 7

A recurrent state j is null iff p_jj^(n) → 0 as n → ∞.
A recurrent state j is null ⇒ p_ij^(n) → 0, ∀ i ∈ S.
I.e., a recurrent state j is positive (non-null) iff p_jj^(n) ↛ 0.

Proof. See Ergodic theorem, later!

Remark 8

Recall, we now have 3 different classes of states:

j transient ⇔ Σ_n p_jj^(n) < ∞ ⇒ p_ij^(n) → 0, ∀ i ∈ S
j recurrent ⇔ Σ_n p_jj^(n) = ∞
  null-recurrent: p_ij^(n) → 0, ∀ i ∈ S
  positive recurrent: p_jj^(n) ↛ 0

For a null-recurrent state j, the n-step return probs p_jj^(n) → 0 but do not go to zero quickly enough for Σ_n p_jj^(n) to converge.
Classification of States

Sometimes it is only possible to return to a state on an even number of steps (c.f. random walk) or, more generally, on a multiple of d-many steps. This motivates the following state classification.

Definition 9

Denote the set of all numbers of steps at which a return to state i is possible by D_i := {n ≥ 1 : p_ii^(n) > 0}. Let

d_i := gcd D_i if D_i ≠ ∅   [greatest common divisor]
d_i := 0 if D_i = ∅

Then state i is said to have period d_i, and is called periodic if d_i > 1 and aperiodic if d_i = 1.

d_i is the greatest common divisor of the epochs at which return is possible, i.e. p_ii^(n) = 0 unless n is a multiple of d_i.
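In practice the period of a state of a finite chain can be approximated directly from this definition: collect the return epochs up to a finite horizon and take their gcd. A sketch (not part of the slides; the horizon `n_max` must be large enough for the chain at hand):

```python
from math import gcd

def period(P, i, n_max=50):
    """gcd of D_i = { n <= n_max : (P^n)_ii > 0 }; returns 0 if D_i is empty."""
    size = len(P)
    Pn = [row[:] for row in P]                     # current power, P^1
    d = 0
    for n in range(1, n_max + 1):
        if Pn[i][i] > 0:
            d = gcd(d, n)                          # gcd(0, n) == n
        Pn = [[sum(Pn[r][k] * P[k][c] for k in range(size))   # Pn <- Pn P
               for c in range(size)] for r in range(size)]
    return d

cycle3 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # deterministic 3-cycle: d = 3
flip = [[0, 1], [1, 0]]                      # deterministic 2-cycle: d = 2
print(period(cycle3, 0), period(flip, 0))    # -> 3 2
```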

Classification of States

Example 10 (random walk: D_i = {2, 4, 6, . . .}, d_i = 2)

[chain diagram: · · · ↔ i − 1 ↔ i ↔ i + 1 ↔ · · ·]

The states all have period 2 and form a single irreducible (later!) class.
If p ≠ q the states are all transient and the walk will eventually drift to +∞ (p > q) or −∞ (p < q).
If p = q = 0.5 the states are null-recurrent, so the walk will return to its initial state infinitely often, although the mean time between returns will be infinite.
Classification of chains

Definition 11

State i communicates with state j, written i → j, if p_ij^(n) > 0 for some n ≥ 0.
States i and j intercommunicate if i → j and j → i, and we write i ↔ j.
(Caveat: unfortunately, some texts define i ↔ j if i and j communicate.)

If two states i and j do not intercommunicate, then either

p_ij^(n) = 0 for all n ≥ 0,   or   p_ji^(n) = 0 for all n ≥ 0,

or both relations are true.
Classification of chains

Lemma 12

The binary relation · ↔ · is an equivalence relation on the state space S. I.e. ∀ i, j, k ∈ S:

1 i ↔ i [reflexive]
2 i ↔ j ⇒ j ↔ i [symmetric]
3 i ↔ j, j ↔ k ⇒ i ↔ k [transitive]
Classification of chains

Proof

1 p_ii^(0) = 1 > 0, ∀ i ∈ S.

2 i ↔ j ⇒ ∃ m, n ≥ 0 s.t. p_ij^(m), p_ji^(n) > 0 ⇒ j ↔ i.

3 i ↔ j, j ↔ k ⇒ ∃ m, n ≥ 0 s.t. p_ij^(m), p_jk^(n) > 0. Chapman-Kolmogorov:

p_ik^(m+n) = Σ_{ℓ∈S} p_iℓ^(m) p_ℓk^(n) ≥ p_ij^(m) p_jk^(n) > 0

I.e., a sum of non-negative terms is greater than or equal to any one of its terms.
Hence i → k. By symmetry (repeat with i and k swapped) k → i.
Classification of chains

We can now partition the totality of states into equivalence classes:

the states in an equivalence class are those which intercommunicate with each other.

Note: it may be possible, starting in one class, to jump into another class with positive probability, but the chain can then never return to the original class; otherwise the two classes would form a single class.

A Markov chain is irreducible if the equivalence relation induces only one class, i.e. if all the states intercommunicate.
Classification of chains

Example: consider the transition probability matrix

P = [ P1 0 ; 0 P2 ] =

1/2  1/2   0    0    0
1/4  3/4   0    0    0
 0    0    0    1    0
 0    0   1/2   0   1/2
 0    0    0    1    0

This Markov chain divides into the 2 classes composed of states {1, 2} and {3, 4, 5}.
If X_0 lies in the first class, then the state of the system remains in the first class, with transition matrix P1. Similarly if X_0 lies in the second class.
Basically we have two completely unrelated processes labelled together.
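The intercommunication classes of a finite chain can be computed from the zero pattern of P alone, e.g. with a transitive-closure sweep. A sketch (not from the slides; states are relabelled 0..4, and the partially legible second row is taken as [1/4, 3/4], which does not affect the classes):

```python
def communicating_classes(P):
    """Equivalence classes of i <-> j; only the zero pattern of P matters."""
    n = len(P)
    # reach[i][j]: is j reachable from i in 0 or more steps?
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                             # Floyd-Warshall closure
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    classes, seen = [], set()
    for i in range(n):
        if i not in seen:
            cls = {j for j in range(n) if reach[i][j] and reach[j][i]}
            classes.append(cls)
            seen |= cls
    return classes

P = [[1/2, 1/2, 0,   0,   0  ],
     [1/4, 3/4, 0,   0,   0  ],
     [0,   0,   0,   1,   0  ],
     [0,   0,   1/2, 0,   1/2],
     [0,   0,   0,   1,   0  ]]
print(communicating_classes(P))   # -> [{0, 1}, {2, 3, 4}]
```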

Classification of chains

Example: random walk with 2 absorbing states.

With states 0, 1, . . . , a:

1  0  0  0  · · ·  0  0  0      (state 0)
q  0  p  0  · · ·  0  0  0      (state 1)
0  q  0  p  · · ·  0  0  0      (state 2)
                 . . .
0  · · ·  · · ·    q  0  p      (state a − 1)
0  · · ·  · · ·    0  0  1      (state a)

We have 3 classes: {0}, {1, 2, . . . , a − 1}, {a}. It is possible to reach the first and third from the second class, but it is not possible to return to the second from either the first or the third class.
Classification of chains

We now bring together the ideas of classifying the states and that of intercommunicating equivalence classes.

Theorem 13

i ↔ j ⇒
1 d_i = d_j
2 i transient ⇔ j transient; i recurrent ⇔ j recurrent
3 i null-recurrent ⇔ j null-recurrent

Sketch Proof

1 Recall D_i := {n ≥ 1 : p_ii^(n) > 0}, d_i := gcd D_i when D_i ≠ ∅. Then, as i ↔ j, ∃ m, n ≥ 0 s.t. p_ij^(m), p_ji^(n) > 0. Chapman-Kolmogorov:

p_ii^(m+n) = Σ_{ℓ∈S} p_iℓ^(m) p_ℓi^(n) ≥ p_ij^(m) p_ji^(n) > 0

Hence ∃ ℓ ≥ 1 s.t. p_ii^(ℓ) > 0, i.e. ℓ ∈ D_i, so D_i ≠ ∅.
Classification of chains

Let ℓ ∈ D_i. Now,

p_jj^(m+n+ℓ) = Σ_{r∈S} Σ_{u∈S} p_jr^(n) p_ru^(ℓ) p_uj^(m) ≥ p_ji^(n) p_ii^(ℓ) p_ij^(m) > 0, ∀ ℓ ∈ D_i

Hence m + n + ℓ ∈ D_j. Similarly m + n + 2ℓ ∈ D_j. This means that ∃ k1, k2 ∈ N s.t.

m + n + ℓ = k1 d_j    [since d_j is the gcd of D_j]
m + n + 2ℓ = k2 d_j

⇒ ℓ = (k2 − k1) d_j. So for every ℓ ∈ D_i, d_j divides ℓ, i.e. d_j is a common divisor of D_i. But d_i is the greatest common divisor of D_i. Therefore d_j ≤ d_i. By symmetry (swap i and j and repeat all previous arguments), we also have d_i ≤ d_j, proving 1.
Classification of chains

For 2 we follow a similar initial step as before:

i ↔ j ⇒ ∃ m, n ≥ 1 s.t. p_ij^(m), p_ji^(n) > 0

As before, use Chapman-Kolmogorov to show

p_ii^(m+n+r) ≥ p_ij^(m) p_jj^(r) p_ji^(n)    (1)

Similarly, for r ≥ m + n,

p_jj^(r) ≥ p_ji^(n) p_ii^(r−n−m) p_ij^(m)    (2)

Setting k = m + n (with suitable constants c1, c2 > 0):

(1) ⇒ p_jj^(r) ≤ c1 p_ii^(r+k)    (3)
(2) ⇒ p_jj^(r) ≥ c2 p_ii^(r−k)    (4)

But this means that Σ_r p_jj^(r) and Σ_r p_ii^(r) converge or diverge together.

For 3, recall a recurrent state i is null iff p_ii^(n) → 0. In this case, (3) ⇒ p_jj^(n) → 0. Similarly, if p_jj^(n) → 0 then (4) ⇒ p_ii^(n) → 0.
Classification of chains

Remark 14

Intercommunication decomposes the state space into equivalence classes of aperiodic/periodic and transient/(null- or positive-)recurrent states, i.e. states in the same class behave similarly.

Example 15

Recall that in the random walk example we showed that state 0 is:
transient if p ≠ q;
null-recurrent if p = q.

Also recall, for the random walk, that i ↔ j ∀ i, j ∈ S. Hence all states are transient for p ≠ q; all states are null-recurrent for p = q; and we have solved Example 5!
Classification of chains

Definition 16

A set C ⊆ S of states is:
1 closed if p_ij = 0, ∀ i ∈ C, j ∉ C
2 irreducible if i ↔ j, ∀ i, j ∈ C

Remark 17

Once a chain takes a value in a closed set C of states, it never leaves C subsequently. A closed set containing exactly one state is called absorbing.
By definition, the equivalence classes induced by the · ↔ · operator are irreducible. We can call a closed set of classes C aperiodic (or null, etc.) if all states in C have this property. Furthermore, if the entire state space is, say, recurrent, then it is irreducible and we can talk of a recurrent, irreducible chain, etc.
Classification of chains

Theorem 18 (decomposition theorem)

The state space S can be uniquely partitioned as

S = T ∪ (∪_ℓ C_ℓ)

where T is a set of transient states and the C_ℓ are irreducible closed sets of recurrent states.

'Proof' Omitted. Assume C_r not closed for some r and see p224 G&S.

If X_0 ∈ C_ℓ, then the chain never leaves C_ℓ and we might take C_ℓ to be the whole state space.
If X_0 ∈ T, then the chain either stays in T forever or eventually moves to one of the C_ℓ, where it subsequently remains.
When S is finite, the chain cannot stay in the transient states forever.
For a finite Markov chain, not all states can be transient and there can be no null-recurrent states. Thus, if S is finite and irreducible, it must be positive recurrent.
Classification of chains

Example 19

Let S = {1, 2, 3, 4, 5, 6} and

P =

1/2  1/2   0    0    0    0
1/4  3/4   0    0    0    0
1/4  1/4  1/4  1/4   0    0
1/4   0   1/4  1/4   0   1/4
 0    0    0    0   1/2  1/2
 0    0    0    0   1/2  1/2

{1, 2} and {5, 6} are irreducible closed sets and therefore contain positive (non-null) recurrent states. States 3 and 4 are transient because 3 → 4 → 6 but return from 6 is impossible.
All states have period 1 because pii (1) > 0 for all i.
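For a finite chain this decomposition can be computed directly: a communicating class is recurrent iff it is closed, otherwise its states are transient. A sketch (not from the slides) applied to the Example 19 matrix with states relabelled 0..5:

```python
def classify(P):
    """Return (recurrent_classes, transient_states) for a finite chain."""
    n = len(P)
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                             # transitive closure
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    recurrent, transient, seen = [], set(), set()
    for i in range(n):
        if i in seen:
            continue
        cls = {j for j in range(n) if reach[i][j] and reach[j][i]}
        seen |= cls
        # a class is closed iff no probability leaks out of it
        closed = all(P[a][b] == 0 for a in cls for b in range(n) if b not in cls)
        (recurrent.append(cls) if closed else transient.update(cls))
    return recurrent, transient

P = [[1/2, 1/2, 0,   0,   0,   0  ],
     [1/4, 3/4, 0,   0,   0,   0  ],
     [1/4, 1/4, 1/4, 1/4, 0,   0  ],
     [1/4, 0,   1/4, 1/4, 0,   1/4],
     [0,   0,   0,   0,   1/2, 1/2],
     [0,   0,   0,   0,   1/2, 1/2]]
print(classify(P))   # -> ([{0, 1}, {4, 5}], {2, 3})
```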

Classification of chains

Lemma 20

If the state space is finite, then at least one state is recurrent and all recurrent states are positive (non-null).

Proof Let the state space S be finite (|S| < ∞). Assume all states are transient; then p_ij^(n) → 0, ∀ i, j ∈ S. But this means that the finite sum satisfies

lim_{n→∞} Σ_{j∈S} p_ij^(n) = 0

which is a contradiction, since Σ_{j∈S} P(X_n = j | X_0 = i) = 1. Therefore a recurrent class exists.
Classification of chains

Now, assume ∃ a null-recurrent class C_0 ≠ ∅. Then, for i ∈ C_0:

p_ij^(n) → 0, j ∈ C_0

and, since recurrent classes are closed:

p_ij^(n) = 0, j ∈ S \ C_0

and again the following finite sum leads to a contradiction:

lim_{n→∞} Σ_{j∈S} p_ij^(n) = 0,   S = C_0 ∪ (S \ C_0)

Remark 21

A good reason for classifying states and classes of states (as well as entire chains) in this way is that it allows us to talk about what happens to the Markov chain, or at least to its probability distribution, in the long run.
Invariant distribution

Remark 22 (brief recap on notation)

The probability distribution of the chain at time n is written as the row vector

p^(n) = (p_j^(n))_{j∈S},   p_j^(n) = P(X_n = j)

and we had, from Chapman-Kolmogorov, that

p^(n) = p^(0) P^n

i.e., the distribution at time n is the initial distribution (a row vector) multiplied by the n-step transition matrix.

Definition 23 (invariant distribution)

The row vector π = (π_j)_{j∈S} is called an invariant distribution of the chain if

π is a probability vector: π_j ≥ 0, ∀ j ∈ S, and Σ_{j∈S} π_j = 1;
π = πP, i.e. π_j = Σ_{i∈S} π_i p_ij, ∀ j ∈ S.

(Unfortunately, an invariant distribution is sometimes called a stationary distribution.)
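Numerically, an invariant distribution of a finite irreducible chain can be found by solving the linear system π(P − I) = 0 together with the normalisation Σ_j π_j = 1. A sketch (not from the slides; the helper name is mine):

```python
import numpy as np

def invariant_distribution(P):
    """Solve pi = pi P with sum(pi) = 1 for an irreducible finite chain."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    A = P.T - np.eye(n)            # (P^T - I) pi^T = 0, written column-wise
    A[-1, :] = 1.0                 # replace one redundant equation by sum = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# two-state chain: 0 -> 1 w.p. a, 1 -> 0 w.p. b  =>  pi = (b, a) / (a + b)
a, b = 0.2, 0.3
P = [[1 - a, a], [b, 1 - b]]
print(invariant_distribution(P))   # ≈ [0.6 0.4]
```

Replacing one row of (P^T − I) works because for an irreducible chain that matrix has rank |S| − 1, so one balance equation is redundant.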
Invariant distribution

Remark 24

As the name invariant suggests, π remains unchanged by P:

π = πP = πP² = · · · = πP^n

Note that if the initial distribution is invariant, p^(0) = π, then

p^(n) = πP^n = π = p^(0)

i.e. the distribution of the chain remains the same over all time.

Definition 25

If, for every initial distribution p^(0), we have that p^(n) → π, then π is an equilibrium distribution.

An invariant distribution need not exist, and if it does exist, it need not be unique. However, for an irreducible chain:

a unique invariant distribution exists ⇔ the chain is positive recurrent
Invariant distribution

Before the formal result, note the following example where π is not unique.

Example 26 (gambler's ruin)

[chain diagram: states 0, 1, . . . , a + b; states 0 and a + b are absorbing (self-loop prob. 1); from the interior the chain moves up w.p. p and down w.p. q]

Any π of the form [π_0, 0, 0, . . . , 0, 0, 1 − π_0], where 0 ≤ π_0 ≤ 1, is an invariant distribution (exercise: check). I.e. π is not unique.
The states {1, . . . , a + b − 1} are transient; the states 0 and a + b are separate irreducible, positive recurrent, aperiodic classes.
Note also that, since the chain cannot stay in the finitely many transient states {1, . . . , a + b − 1} forever, the limiting distribution lim_{n→∞} p^(n) is either
[1, 0, 0, . . . , 0] or [0, 0, 0, . . . , 1]
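The "exercise: check" can be done numerically. A sketch (with a + b = 5 chosen purely for illustration):

```python
import numpy as np

p, q, n = 0.4, 0.6, 6                  # states 0..5, i.e. a + b = 5 (assumed)
P = np.zeros((n, n))
P[0, 0] = P[-1, -1] = 1.0              # absorbing barriers
for i in range(1, n - 1):
    P[i, i + 1], P[i, i - 1] = p, q    # up w.p. p, down w.p. q

for pi0 in (0.0, 0.25, 1.0):
    pi = np.zeros(n)
    pi[0], pi[-1] = pi0, 1 - pi0       # pi = [pi0, 0, ..., 0, 1 - pi0]
    print(np.allclose(pi @ P, pi))     # True: every such pi is invariant
```

The check works for any π_0 because π puts mass only on the two absorbing states, each of which maps to itself under P.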

Remark 27

On the other hand, the urn model is irreducible, positive recurrent (and aperiodic), so a unique invariant distribution exists. (We'll return to that example later on.)
Invariant distribution

Theorem 28 (p 227, G&S)

An irreducible chain has an invariant distribution π = πP iff all states are positive recurrent. Furthermore, π is then the unique invariant distribution and is given by π = (π_j)_{j∈S} with π_j = µ_j^(−1), ∀ j ∈ S, where µ_j is the mean return time to state j. [Intuitively, if on average a chain visits state j once every µ_j time steps, then the prob. that the chain is in state j (in the long run) should be 1/µ_j.]
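Theorem 28 can be sanity-checked on a small chain by computing µ_j independently via first-step analysis (h = 1 + Qh on S \ {j}) and comparing with 1/π_j. A sketch (not from the slides; the two-state chain is my choice):

```python
import numpy as np

def mean_return_time(P, j):
    """mu_j = 1 + sum_{k != j} p_jk h_k, where h_k = E[steps to hit j | X0 = k]
    solves the first-step system (I - Q) h = 1 on the states other than j."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    others = [k for k in range(n) if k != j]
    Q = P[np.ix_(others, others)]                   # dynamics restricted off j
    h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    return 1.0 + P[j, others] @ h

P = [[0.8, 0.2], [0.3, 0.7]]             # invariant distribution pi = (0.6, 0.4)
print(mean_return_time(P, 0), 1 / 0.6)   # both 5/3
print(mean_return_time(P, 1), 1 / 0.4)   # both 5/2
```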

Invariant distribution

proof of Theorem 28: plan

1 show all j transient ⇒ π = 0
2 existence of π = (µ_j^(−1))_{j∈S}
3 π = (µ_j^(−1))_{j∈S} ⇒ all states are positive recurrent

1: An irreducible chain implies all states are transient or all states are recurrent. Assume all states are transient and that an invariant distribution π exists, i.e.

π_j = Σ_{i∈S} π_i p_ij^(n)

But transience implies p_ij^(n) → 0, ∀ i, j ⇒ π = 0, which contradicts the property that π is a probability vector. Hence, all states must be recurrent.
Invariant distribution

2: [Want ∃! π s.t. π_j = µ_j^(−1).] Suppose we take the initial distribution to be the invariant π:

π_j := P(X_0 = j), ∀ j ∈ S

Then, by definition (T_j = first hitting time of j),

π_j µ_j = P(X_0 = j) E(T_j | X_0 = j)
        = P(X_0 = j) Σ_{n=1}^∞ P(T_j ≥ n | X_0 = j)   [E Y = Σ_{n=1}^∞ P(Y ≥ n)]
        = Σ_{n=1}^∞ P(T_j ≥ n, X_0 = j)

Since T_j := min{n ≥ 1 : X_n = j}, we have P(T_j ≥ 1) = 1.
We now look at each term in the sum.
The first term on the RHS (i.e. for n = 1) is

P(T_j ≥ 1, X_0 = j) = P(X_0 = j)
Invariant distribution
 
For n = 1: P(T_j ≥ 1, X_0 = j) = P(X_0 = j).
Now, for n ≥ 2:

P(T_j ≥ n, X_0 = j)
= P(X_0 = j, X_{1:n−1} ≠ j)
= P(X_{1:n−1} ≠ j) − P(X_0 ≠ j, X_{1:n−1} ≠ j)    [P(A, B) = P(B) − P(Aᶜ, B)]
= P(X_{0:n−2} ≠ j) − P(X_0 ≠ j, X_{1:n−1} ≠ j)    [stationarity: X_0 ∼ π invariant]
=: α_{n−2} − α_{n−1}                              [α_n := P(X_{0:n} ≠ j)]

I.e.

π_j µ_j = P(X_0 = j) + Σ_{n=2}^∞ (α_{n−2} − α_{n−1})
        = P(X_0 = j) + α_0 − lim_{n→∞} α_n    [the sum telescopes: α_0 − α_1 + α_1 − α_2 + · · ·]

But lim_{n→∞} α_n = P(X_m ≠ j, ∀ m) = prob. the chain never visits j. But X is recurrent. Therefore this prob. is 0 and we have:

π_j µ_j = P(X_0 = j) + P(X_0 ≠ j) = 1    [α_0 := P(X_0 ≠ j)]
Invariant distribution

3: [Want: π_j = µ_j^(−1) ⇒ all states are positive recurrent.]
We have shown ∃! π s.t. π_j = µ_j^(−1). From Defn. 3, we want to show µ_j < ∞, i.e. that π_j > 0. Suppose that π_j = 0 for some j. Then

0 = π_j = Σ_{k∈S} π_k p_kj^(n) ≥ π_i p_ij^(n), ∀ i ∈ S, n ≥ 0

This implies π_i = 0 whenever p_ij^(n) > 0 for some n, i.e. ∀ i → j. But the chain is irreducible, i.e. i ↔ j, ∀ i, j ∈ S. Therefore

π_j = 0 for some j ⇒ π_i = 0, ∀ i ∈ S ⇒ Σ_{j∈S} π_j = 0 ≠ 1

which contradicts Defn. 23. Hence µ_j < ∞ and therefore all states are positive recurrent.
Invariant distribution

Theorem 29

For an irreducible, aperiodic Markov chain,

p_ij^(n) → 1/µ_j as n → ∞, ∀ i, j ∈ S

Proof omitted; see, e.g., G&S pp 232-235.

Remark 30

If a Markov chain is transient or null-recurrent, then

µ_j = ∞ and p_ij^(n) → 0

If a Markov chain is positive recurrent, then

p_ij^(n) → π_j = 1/µ_j

I.e. (informally) if, on average, a chain visits state j once every µ_j steps, then
the probability that the chain is in state j (in the long run) should be 1/µ_j.
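Theorem 29 is easy to observe numerically: for an irreducible aperiodic chain, every row of P^n converges to π, so p_ij^(n) forgets the start state i. A sketch (the 3-state chain is an arbitrary choice of mine):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
Pn = np.linalg.matrix_power(P, 100)
print(Pn)          # the three rows agree: each row is (approximately) pi
print(Pn[0] @ P)   # and pi P = pi, i.e. the limit is the invariant distribution
```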

Invariant distribution

Theorem 31

For an irreducible Markov chain with period d,

p_jj^(nd) → d/µ_j as n → ∞, ∀ j ∈ S

and p_jj^(m) = 0 if d is not a divisor of m.
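Theorem 31 can be illustrated with the simple random walk on a 4-cycle (a sketch, not from the slides): there d = 2 and, by symmetry, π is uniform so µ_j = 4, giving p_jj^(2n) → 2/4 = 1/2.

```python
import numpy as np

P = np.zeros((4, 4))
for i in range(4):
    P[i, (i + 1) % 4] = P[i, (i - 1) % 4] = 0.5   # random walk on a 4-cycle

print(np.linalg.matrix_power(P, 10)[0, 0])   # -> 0.5 = d/mu_0 with d = 2, mu_0 = 4
print(np.linalg.matrix_power(P, 11)[0, 0])   # -> 0.0: odd m is not a multiple of d
```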

Theorem 32 (ergodic theorem, general)

For any aperiodic state j of a Markov chain,

p_jj^(n) → 1/µ_j as n → ∞

and (recall, f_ij is the prob. of ever reaching j from i) for any other state i,

p_ij^(n) → f_ij/µ_j, i ≠ j

C.f. p235 G&S.