
GT Part2

The document outlines a syllabus for a course on dynamic games of complete information, covering topics such as Nash Equilibrium, sequential rationality, and subgame perfection. It includes examples of dynamic games, their extensive form representation, and the relationship between extensive and normal forms. The course emphasizes the importance of attendance and participation, with structured classes and office hours for student support.


EC 005 Part II: Dynamic Games of Complete

Information

Soumendu Sarkar
soumendu@econdse.org

February 18, 2025


Syllabus: PART B

▶ Sequential games preliminaries (Tadelis Chapter 7) [2 lectures]


▶ Nash Equilibrium of dynamic games of complete information
(Tadelis Chapter 7) [1 lecture]
▶ Sequential rationality and subgame perfection (Tadelis
Chapter 8) [1 lecture]
▶ One stage deviation and backward induction algorithm
(Tadelis Chapter 8,9) [2 lectures]
▶ How good are SPNE predictions? [1 lecture]
▶ Repeated games (Tadelis Chapter 10) [3 lectures]
▶ Bargaining (Tadelis, Chapter 11) [4 lectures]
Logistics

▶ Classes:
▶ Mondays 1015-1130 (B); 1415-1530 (A)
▶ Wednesdays 1130-1245 (B); 1530-1645 (A)
▶ Fridays 0900-1015 (B); 1300-1415 (A)
▶ Structure: 5 minutes recap; 1 hour primary lecture with only
clarificatory questions allowed; 10 minutes sum-up/substantial
Q&A
▶ Attendance in classes and tutorials is crucial
▶ Second midsem in late March/early April
▶ Office hours (Room 120): Mondays 1245-1345 (A); Fridays
1200-1300 (B)
Introduction: recap

▶ Games in strategic/normal form


▶ Examples of static/one-shot simultaneous-move games
(complete information): BoS, PD, Meeting point, Matching
Pennies, Stag hunt
▶ Nash equilibrium in pure and mixed strategies (actions?)
▶ Dominance, iterated elimination of strictly dominated
strategies
▶ Rationality and common knowledge
Introduction

▶ Dynamic games: multiple time periods


▶ Sequence of moves rather than simultaneous moves
▶ Each component of sequence corresponds to a period/interval
of time
▶ Different sets of players may get to make decisions at different
points of time
▶ Set of actions a player may choose from may differ over time
Introduction contd.

▶ Now strategies are not the same as actions, but a plan of actions,


one for each period in which a player makes a decision
▶ Dynamic games of complete information: history of past
actions is observed by players when making decision in current
period
▶ Payoffs may be realized in multiple periods
▶ Discounting of payoffs may be essential to allow for
intertemporal comparison of payoffs
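As a minimal illustration of intertemporal comparison, a per-period payoff stream can be collapsed into a present value with a discount factor δ ∈ (0, 1); the function name below is illustrative, not the text's notation:

```python
def discounted_payoff(stream, delta):
    """Present value of a finite per-period payoff stream: sum of u_t * delta**t."""
    return sum(u * delta ** t for t, u in enumerate(stream))

# a constant payoff of 1 for three periods, discounted at delta = 0.5
print(discounted_payoff([1, 1, 1], 0.5))  # 1.75 = 1 + 0.5 + 0.25
```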
Introduction contd.

▶ The structure of dynamic games adds new layers to


interpretation of rationality
▶ Players are supposed to be sequentially rational, assuming
that opponents are sequentially rational as well
▶ Some Nash equilibria no longer remain reasonable, new ideas
of perfection or refinements are introduced
Our tasks

▶ Writing a dynamic game: the extensive form


▶ Strategy in dynamic games: relating the extensive form to the
normal form
▶ Sequential rationality and equilibrium in dynamic games of
complete information:
▶ Nash
▶ Backward induction
▶ Subgame perfection
▶ One stage deviation principle
▶ Simple applications; repeated games; bargaining
Example 1: a model of entry

▶ A firm called Entrant is facing the choice of entering a market


▶ If it enters the market, another firm, called Incumbent,
observes its entry and chooses to either accommodate or fight
▶ Accommodate means incumbent maintaining a moderately
high price, allowing the entrant to earn positive profit
▶ Fight means incumbent slashing prices to make the entrant
suffer losses
Example 2: Stackelberg Leadership

▶ A simple dynamic version of Cournot duopoly


▶ Two firms L and F with identical marginal cost c
▶ Inverse demand function P = 1 − Q where Q is total output
▶ In period 1 firm L chooses output level qL
▶ In period 2 firm F observes qL and chooses qF
▶ Both firms realize profits at the end of period 2
Example 3: investment followed by quantity competition

▶ Two firms with known identical marginal costs


▶ In period 1 Firm 1 faces a choice between making a fixed
cost-reducing investment and not making an investment
▶ After this choice has been made, firms simultaneously
compete in quantities in period 2
Example 4: Bargaining

▶ Two players are bargaining over the share of a rupee


▶ If player 1 makes an offer in period 1 player 2 can accept or
reject
▶ If player 2 accepts then the game ends with each player
getting the agreed shares
▶ If player 2 rejects, then he makes offers in period 2 that player
1 can accept or reject
▶ If player 1 accepts the game ends with each player getting
agreed shares, else the game proceeds to period 3 where 1
makes a new offer...
Example 5: Repeated Prisoner’s Dilemma

▶ Two players are playing a Prisoner’s Dilemma game


repeatedly, choosing between cooperating and
non-cooperation
▶ Cooperation is better than non-cooperation, but
non-cooperation is better when the opponent is cooperating
▶ Payoffs are realized at the end of each period
▶ Variants: finite vs infinite repetitions
Example 6: Repeated Cournot

▶ Firms 1 and 2 with identical marginal costs c


▶ They face the same market in every period repeatedly
▶ Inverse demand function P = 1 − Q where Q is total output
▶ Both firms realize profits at the end of each period according
to their profit functions
▶ Variants: finite vs infinite repetitions
Example 7: job market signalling

▶ A firm may be uncertain about the productivity of its


prospective employee, which determines its own profits
▶ A prospective employee can be of high productivity or low
productivity
▶ He can acquire costly education to signal his productivity
▶ Education can be of two qualities– high or low
▶ Once an employee chooses an education level, the firm
observes it and decides on a wage level
▶ The game ends with the firm earning profits net of wages, and
the employee earning wages net of cost of education acquired
Extensive form representation

▶ A game in extensive form is represented as

⟨I , χ, A, p, α, H, H, ι, ρ, u⟩ (1)

▶ Alternatively, graphically, in the form of a game tree, provided


the sets of actions available to a player are finite
Note aside: graph theory
▶ A graph is a collection of two sets, V , a set of vertices, and
E , a set of edges
▶ The set of edges E is essentially a subset of V × V , the
Cartesian product of V with itself
▶ If a, b ∈ V and (a, b) ∈ E , then it means that a and b are
directly connected
▶ A path is a sequence of directly connected nodes
▶ A graph is connected if there exists a path between every pair
of nodes
▶ A graph is complete if every pair of nodes is directly connected
▶ Every path beginning and ending at the same node is called a
cycle
▶ Every graph containing a cycle is cyclical, every graph which is
not cyclical is acyclical
▶ A connected acyclical graph is a tree
Game Tree of Example 1

E: not enter → (0, 2); enter → I: accommodate → (1, 1), fight → (−1, −1)


▶ Two of the five nodes are assigned to players E and I
respectively
▶ Each of these nodes are connected to two more nodes, via
edges
▶ Each of these edges are assigned an action available to the
player
▶ The remaining nodes are not connected to any nodes further
down, but are assigned payoff numbers
▶ The first set of nodes are decision nodes: the player
assigned to such a node must make a decision that leads to
either a payoff or another decision node
▶ The second set of nodes are called terminal nodes
Game Tree of Example 5

[Game tree: the twice-played Prisoner’s Dilemma. Each player chooses C or N in each period; ellipses enclose the decision nodes that share an information set]
▶ In this diagram, apart from decision nodes, terminal nodes,
and edges representing actions, we see ellipses containing two
decision nodes assigned to a player
▶ The choices available to a player at each of the decision nodes
inside an ellipse are the same
▶ Such ellipses represent information sets of players
▶ If there are two or more nodes in an information set, the fact
that the same choice of actions are available at each of these
nodes denote that the corresponding player is unable to
distinguish between these nodes given the information about
sequence of actions observed so far
▶ Every decision node not enclosed in an ellipse is a singleton
information set — depicting that the corresponding player
perfectly knows the sequence of previous play
Game tree of Example 7

[Game tree: nature draws type tH (probability q) or tL (probability 1 − q); the employee chooses education eH or eL ; the firm, with information sets pooling the two types, chooses wage wH or wL ]
Game tree = extensive form

▶ I : (finite) set of players which may include nature


▶ χ: (finite) set of nodes
▶ A: (finite) set of all possible actions in the game
▶ p : χ \ x0 → χ assigns a predecessor node to every node
except the initial one
▶ If x = p(y ), then y = s(x), successor of x
▶ T : terminal set of nodes without a successor
▶ α : χ \ x0 → A assigns every non-initial node an action that
links it to its predecessor; if p(x) = p(x ′ ) then α(x) ̸= α(x ′ )
Extensive form contd.

▶ H: collection of information sets


▶ H : χ \ T → H: assignment of non-terminal nodes to
information sets such that if c(x) is the set of actions leading
from node x to its successor, then c(x) = c(x ′ ) for all
x ′ ∈ H(x)
▶ ι : H → I : assignment of information sets to players
▶ ρ: assignment of probabilities to nature’s actions
▶ u : T → RI : assignment of payoff vectors to terminal nodes
Information sets

▶ Information sets are the most important components of an


extensive form game
▶ These are collections of decision nodes such that
▶ same choice of decisions are available at every node in an
information set (requirement)
▶ perfect recall (assumption):
▶ no node in an information set is a predecessor or successor of
another node in the set
▶ Consider a pair of nodes x, x ′ in an information set of a player
i, Hi (x). If a is a past action taken by the player at decision
node x ′′ on the path to x, then there must be x ′′′ ∈ Hi (x ′′ )
such that a is on the path from x ′′′ to x ′ .
Perfect vs imperfect information

▶ A game in extensive form having only singleton information


sets is a game of perfect information
▶ Examples 1, 2, and 4 are games of perfect information
▶ A game having at least one non-singleton information set is a
game of imperfect information
▶ Examples 3, 5, and 6 are games of imperfect information
▶ Every one-shot/static/simultaneous move game is a game of
imperfect information, e.g., the one-shot prisoner’s dilemma
Game tree for one-shot Prisoner’s dilemma

[Game tree: player 1 chooses C or N; player 2’s two decision nodes lie in a single information set, at which he chooses C or N]
Complete vs Incomplete Information

▶ A game of incomplete information involves payoff uncertainty


for players
▶ These are usually modelled by nature selecting a state of the
world using a probability distribution, e.g., one shot auction
games
▶ A game of complete information may be a game of imperfect
information (e.g., the one-shot Prisoner’s dilemma)
▶ A game of incomplete information is always a game of
imperfect information
A static game of incomplete information in extensive form

[Game tree: nature draws type t1 (probability p) or t2 (probability 1 − p); player 1 then chooses C or N, and player 2, in one information set, chooses C or N]
Strategies in extensive form

▶ Recall that a strategy of a player is a complete contingent


action plan
▶ In extensive form, the information sets of a player represent
these contingencies where he must take a decision
▶ Consequently, a (pure) strategy for a player in an extensive
form game, denoted si , maps his information sets to the set of
actions that are available to him in these sets:

si : Hi → A, si (Hi ) ∈ c(Hi ).

▶ The collection of all possible strategies of a player i is referred


to as his strategy set, denoted Si
Mixed and Behavioral Strategies

▶ A mixed strategy is a probability distribution over pure


strategies
▶ A behavioral strategy specifies probability distributions over
feasible actions for each information set
▶ A mixed strategy can induce a behavioral strategy and
vice-versa, if the game is one of perfect recall
Equivalence of normal and extensive forms

▶ Every tuple of strategies in an extensive form game induces a


path of play originating in an initial node and ending in a
terminal node
▶ Therefore, we can associate the payoff vector assigned to this
terminal node to the tuple of strategies and present the game
in the familiar normal/matrix form
▶ As long as each strategy in normal form clearly assigns an
action to each contingency, we can convert it into an
extensive form
Normal form of Example 1

Player I
accommodate fight
Player E enter 1, 1 -1, -1
not enter 0, 2 0, 2
Normal form of Example 5

▶ Both players have 5 information sets


▶ Both players have choice of two actions in each of their
information sets
▶ Consequently, there are 2^5 = 32 strategies for each player in
this game
Nash equilibrium

▶ The idea of Nash equilibrium generalizes easily from static to


dynamic games after acknowledging the general definition of
strategies
▶ A Nash equilibrium for an extensive form game is a collection
of strategies s∗ = (s∗1 , . . . , s∗I ) such that for every player i ∈ I ,

ui (s∗ ) ≥ ui (si , s∗−i ) for all si ∈ Si

▶ Nash equilibria of Example 1: (enter, accommodate) and (not enter, fight)
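These equilibria can be checked mechanically; a minimal brute-force sketch over the 2×2 normal form of Example 1 (names and encoding are illustrative):

```python
# Brute-force check of pure Nash equilibria in the entry game (Example 1).
E_strats = ["enter", "not enter"]          # Entrant
I_strats = ["accommodate", "fight"]        # Incumbent
payoff = {                                 # (Entrant, Incumbent) payoffs
    ("enter", "accommodate"): (1, 1),
    ("enter", "fight"): (-1, -1),
    ("not enter", "accommodate"): (0, 2),
    ("not enter", "fight"): (0, 2),
}

def is_nash(sE, sI):
    """No player gains by a unilateral deviation (weak inequalities)."""
    uE, uI = payoff[(sE, sI)]
    no_dev_E = all(payoff[(d, sI)][0] <= uE for d in E_strats)
    no_dev_I = all(payoff[(sE, d)][1] <= uI for d in I_strats)
    return no_dev_E and no_dev_I

equilibria = [s for s in payoff if is_nash(*s)]
print(equilibria)  # [('enter', 'accommodate'), ('not enter', 'fight')]
```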


Exercise 7.1

▶ Imagine an extensive-form game in which player i has K


information sets.
1. If the player has the same number m of possible actions in
each information set, how many pure strategies does he have?
2. If the player has mk actions in information set
k ∈ {1, 2, . . . , K }, how many pure strategies does the player
have?
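Since a pure strategy makes one independent choice per information set, the count is a product; a small sketch (function name illustrative):

```python
from math import prod

def n_pure_strategies(actions_per_infoset):
    """Number of pure strategies = product over information sets of |actions|."""
    return prod(actions_per_infoset)

print(n_pure_strategies([3] * 4))    # part 1: m = 3, K = 4 gives 3**4 = 81
print(n_pure_strategies([2, 3, 4]))  # part 2: heterogeneous m_k gives 24
```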
Exercise 7.2

▶ Consider a two-player game in which player 1 can choose A or


B. The game ends if he chooses A while it continues to player
2 if he chooses B. Player 2 can then choose C or D, with the
game ending after C and continuing again with player 1 after
D. Player 1 can then choose E or F , and the game ends after
each of these choices.
1. Model this as an extensive-form game tree. Is it a game of
perfect or imperfect information?
2. How many terminal nodes does the game have? How many
information sets?
3. How many pure strategies does each player have?
4. Imagine that the payoffs following choice A by player 1 are (2,
0), those following C by player 2 are (3, 1), those following E
by player 1 are (0, 0), and those following F by player 1 are (1,
2). What are the Nash equilibria of this game? Does one strike
you as more appealing than the other? If so, explain why.
Exercise 7.2

1: A → (2, 0); B → 2: C → (3, 1); D → 1: E → (0, 0); F → (1, 2)
Exercise 7.2

Player 2
C D
Player 1 AE 2, 0 2, 0
AF 2, 0 2,0
BE 3, 1 0, 0
BF 3, 1 1, 2
Exercise 7.3 (Tic-tac-toe)

▶ The extensive-form representation of a game can be
cumbersome even for very simple games. Consider the game of
tic-tac-toe, in which two players mark X or O in a 3x3 matrix.
Player 1 moves first, then player 2, and so on. If a player gets
three of his kind in a row, column, or one of the diagonals
then he wins, and otherwise it is a tie. For this question
assume that even after a winner is declared, the players must
completely fill the matrix before the game ends.
1. Is this a game of perfect or imperfect information? Why?
2. How many information sets does player 2 have after player 1’s
first move?
3. How many information sets does player 1 have after each of
player 2’s moves for the first time?
4. How many information sets does each player have in total?
5. How many terminal nodes does the game have?
Exercise 7.4 (Centipede)
▶ Imagine a two-player game that proceeds as follows. A pot of
money is created with $6 in it initially. Player 1 moves first,
then player 2, then player 1 again, and finally player 2 again.
At each player’s turn to move, he has two possible actions:
grab (G) or share (S). If he grabs he gets 2/3 of the current pot
of money, the other player gets 1/3 of the pot, and the game
ends. If he shares then the size of the current pot is multiplied
by 3/2 and the next player gets to move. At the last stage at
which player 2 moves, if he chooses S then the pot is still
multiplied by 3/2, player 2 gets 1/3 of the pot, and player 1 gets
2/3 of the pot.
1. Model this as an extensive-form game tree.
2. How many pure strategies does each player have?
3. Find the Nash equilibria of this game.
4. Now imagine that at the last stage at which player 2 moves, if
he chooses to share then the pot is equally split among the
players. Does your answer to part 3 change?
Exercise 7.4

1 -S→ 2 -S→ 1 -S→ 2 -S→ (81/4, 81/8)
G: (4, 2)   G: (3, 6)   G: (9, 9/2)   G: (27/4, 27/2)
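The payoffs at each node can be reconstructed from the pot dynamics; a sketch in exact arithmetic:

```python
from fractions import Fraction as F

# Centipede payoffs: the pot starts at $6, a grab splits it 2/3 (mover)
# to 1/3 (other player), and sharing multiplies the pot by 3/2.
pot = F(6)
grab_payoffs = []            # (player 1, player 2) if the mover grabs
for mover in [1, 2, 1, 2]:
    mine, yours = F(2, 3) * pot, F(1, 3) * pot
    grab_payoffs.append((mine, yours) if mover == 1 else (yours, mine))
    pot *= F(3, 2)
# if player 2 shares at the last stage: 1 gets 2/3, 2 gets 1/3 of the pot
final = (F(2, 3) * pot, F(1, 3) * pot)
print(grab_payoffs, final)
```

This reproduces (4, 2), (3, 6), (9, 9/2), (27/4, 27/2) at the grab nodes and (81/4, 81/8) at the final sharing node.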
Exercise 7.4

Player 2
GG GS SG SS
Player 1 GG 4, 2 4, 2 4, 2 4, 2
GS 4, 2 4, 2 4, 2 4, 2
SG 3, 6 3, 6 9, 9/2 9, 9/2
SS 3, 6 3, 6 27/4, 27/2 81/4, 81/8
Exercise 7.4

Player 2
GG GS SG SS
Player 1 GG 4, 2 4, 2 4, 2 4, 2
GS 4, 2 4, 2 4, 2 4, 2
SG 3, 6 3, 6 9, 9/2 9, 9/2
SS 3, 6 3, 6 27/4, 27/2 243/16, 243/16
Exercise 7.5 (Veto)

▶ Two players must choose among three alternatives, a, b, and


c. Player 1’s preferences are given by a ≻1 b ≻1 c while player
2’s preferences are given by c ≻2 b ≻2 a. The rules are that
player 1 moves first and can veto one of the three alternatives.
Then player 2 chooses one of the remaining two alternatives.
1. Model this as an extensive-form game tree (choose payoffs
that represent the preferences).
2. How many pure strategies does each player have?
3. Find all the Nash equilibria of this game.
Exercise 7.5

1 vetoes a, b, or c; 2 then chooses among the remaining two:
va : b → (2, 2), c → (1, 3); vb : c → (1, 3), a → (3, 1); vc : a → (3, 1), b → (2, 2)
Exercise 7.5

Player 2
bca bcb baa bab cca ccb caa cab
va 2, 2 2, 2 2, 2 2, 2 1, 3 1, 3 1, 3 1, 3
Player 1 vb 1, 3 1, 3 3, 1 3, 1 1, 3 1, 3 3, 1 3, 1
vc 3, 1 2, 2 3, 1 2, 2 3, 1 2, 2 3, 1 2, 2
Exercise 7.7 (Roommates voting)
▶ Three roommates need to vote on whether they will adopt a
new rule and clean their apartment once a week or stick to
the current once-a-month rule. Each votes “yes” for the new
rule or “no” for the current rule. Players 1 and 2 prefer the
new rule while player 3 prefers the old rule.
1. Suppose the players require a unanimous vote to adopt the
new rule. Player 1 votes first, then player 2, and then player 3,
the latter two observing the previous votes. Draw this as an
extensive-form game and find the Nash equilibria.
2. Suppose that the players require a majority vote now. Again
player 1 votes first, then player 2, and then player 3, the latter
two observing the previous votes. Draw this as an
extensive-form game and find the Nash equilibria.
3. Suppose the game is as in part 2, but the votes of earlier
movers are not observed by the later movers — and the votes
are counted after all have voted. Draw this as an
extensive-form game and find the Nash equilibria. In what way
is this result different from the result in 2?
Exercise 7.7 Part 1

1 votes y or n; 2 observes and votes; 3 observes both and votes.
Payoffs: (1, 1, 0) at yyy; (0, 0, 1) at every other terminal node
Exercise 7.7 Part 2

1 votes y or n; 2 observes and votes; 3 observes both and votes.
Payoffs: (1, 1, 0) whenever at least two vote y; (0, 0, 1) otherwise
Exercise 7.7 Part 3

As in part 2, but players 2 and 3 do not observe the earlier votes, so each has a single information set.
Payoffs: (1, 1, 0) whenever at least two vote y; (0, 0, 1) otherwise
Exercise 7.7 Part 3

Player 3 votes yes:
Player 2
yes no
Player 1 yes 1, 1, 0 1, 1, 0
no 1, 1, 0 0, 0, 1

Player 3 votes no:
Player 2
yes no
Player 1 yes 1, 1, 0 0, 0, 1
no 0, 0, 1 0, 0, 1
Exercise 7.8
▶ Consider the following two-stage game: In the first stage one
brother (player 2) has two $10 bills and can choose one of two
options: he can give his younger brother (player 1) $20 or give
him one of the $10 bills. This money will then be used to buy
snacks at the show they will see, and each $1 of snacks
purchased yields one unit of payoff for a player who uses it. In
the second stage the show they will see is determined by the
following Battle of the Sexes game:
Player 1
O F
Player 2 O 16, 12 0, 0
F 0, 0 12, 16
1. Present the entire game in extensive form (a game tree).
2. Write the (pure) strategy sets for both players.
3. Present the entire game in one matrix.
4. Find the Nash equilibria of the entire game (pure and mixed
strategies).
Exercise 7.8

2 chooses O10 or O20 ; then 1 chooses O or F and 2, in one information set per branch, chooses O or F. Payoffs:
O10 : (O, O) → (26, 22), (O, F) → (10, 10), (F, O) → (10, 10), (F, F) → (22, 26)
O20 : (O, O) → (16, 32), (O, F) → (0, 20), (F, O) → (0, 20), (F, F) → (12, 36)
Exercise 7.8

2
O10 OO O10 OF O10 FO O10 FF O20 OO O20 OF O20 FO O20 FF
OO 26, 22 26, 22 10, 10 10, 10 16, 32 0, 20 16, 32 0, 20
1 OF 26, 22 26, 22 10, 10 10, 10 0, 20 12, 36 0, 20 12, 36
FO 10,10 10, 10 22, 26 22, 26 16, 32 0, 20 16, 32 0, 20
FF 10,10 10, 10 22, 26 22, 26 0, 20 12, 36 0, 20 12, 36
Exercise 7.9
▶ A student stole the DVD player from the student lounge. The
dean of students (player 1) suspects the student (player 2)
and begins investigation. Concrete evidence will be available
to the dean only with probability 1/2. The student knows that
the investigation is on but does not know whether the dean
has collected evidence or not. The game proceeds as follows:
The dean realizes whether or not he has evidence and can
then choose whether to accuse the student (A) or bounce the
case (B ) and forget it. Once accused, the student can either
confess (C) or deny (D). If the dean bounces the case then
both players get 0. If the dean accuses the student and the
student confesses, the dean gains 2 and the student loses 2. If
the dean accuses the student and the student denies, then
payoffs depend on the evidence: If the dean has no evidence it
gives him a loss of 4, while the student gets a payoff of 4. If
the dean has evidence he gains 4, while the student is put on
probation and loses 4.
Exercise 7.9 contd.

1. Draw the game tree that represents the extensive form of this
game.
2. Write down the matrix that represents the normal form of the
extensive form you drew in (a).
3. Solve for the Nash equilibria of the game.
Exercise 7.9

Nature draws te or tn with probability 1/2 each; the dean observes his type and chooses A or B (B → (0, 0)); once accused, the student, in a single information set S, chooses C or D:
te : C → (2, −2), D → (4, −4); tn : C → (2, −2), D → (−4, 4)
Exercise 7.9

Student
C D
AA 2, 2; -2 4, -4; 0
Dean AB 2, 0; -1 4, 0; -2
BA 0, 2; -1 0, -4; 2
BB 0, 0; 0 0, 0; 0
Exercise 7.9

▶ Action A is dominant for type te ; no dominant action for tn


▶ tn mixes between A and B to make the student indifferent
between choosing C and D
▶ Solve (1/2)(−2) + (1/2)p(−2) = (1/2)(−4) + (1/2)p(4), or p = 1/3
▶ Student mixes between C and D to make tn indifferent
between A and B
▶ Solve 2q + (1 − q)(−4) = 0, or q = 2/3
▶ Nash equilibrium in behavioral strategies:
((1, 0), (1/3, 2/3); (2/3, 1/3))
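The two indifference conditions can be verified in exact arithmetic; a minimal sketch (variable names are illustrative):

```python
from fractions import Fraction as F

# Verify the indifference conditions behind the behavioral-strategy
# equilibrium ((1, 0), (1/3, 2/3); (2/3, 1/3)) of Exercise 7.9.
p = F(1, 3)   # probability that type t_n accuses
q = F(2, 3)   # probability that the student confesses

# student: expected payoff of confessing vs denying (t_e always accuses)
u_C = F(1, 2) * (-2) + F(1, 2) * p * (-2)
u_D = F(1, 2) * (-4) + F(1, 2) * p * 4
print(u_C, u_D)   # -4/3 -4/3: the student is indifferent

# type t_n of the dean: accusing vs bouncing (bouncing yields 0)
u_A = 2 * q + (1 - q) * (-4)
print(u_A)        # 0: t_n is indifferent
```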
Exercise 7.10 (Perfect and imperfect recall)

▶ Consider the following game:


2 chooses a or b; 1 then chooses L or R (after a) or W or E (after b); 2, in two bottom information sets, then chooses t or z (after L, R) or x or y (after W, E). Payoffs:
a: (L, t) → (2, 1), (L, z) → (0, 0), (R, t) → (0, 0), (R, z) → (1, 2)
b: (W, x) → (2, 1), (W, y) → (0, 0), (E, x) → (0, 0), (E, y) → (1, 2)
Exercise 7.10 contd.

1. What are the pure-strategy sets for each player?


2. Show that for any behavioral strategy for player 1 there is a
mixed strategy that leads to the same distribution over the
terminal nodes regardless of the strategy chosen by player 2.
3. Show that for any behavioral strategy for player 2 there is a
mixed strategy that leads to the same distribution over the
terminal nodes regardless of the strategy chosen by player 1.
4. Now imagine that the game does not have perfect recall so
that player 2’s bottom two information sets are now one large
information set. Can you find an example showing that the
claim in 3 is no longer true?
Are all Nash reasonable?

▶ (enter, accommodate) and (not enter, fight) are the two Nash
equilibria of the entry game in Example 1
▶ Consider (not enter, fight); the latter describes I’s strategy in
the information set reached when E plays enter
▶ In this information set, I’s optimal action is to play
accommodate rather than fight
▶ Therefore, I cannot commit to fight if his information set were
reached, and (not enter, fight) is unreasonable
▶ In contrast, E has first mover advantage/ power of
commitment/ credibility when he chooses to enter
▶ E’s entering the market forces I to accommodate, therefore
(enter, accommodate) is reasonable
Sequential Rationality

▶ The idea of reasonable equilibria in dynamic games is


formalized as the notion of sequential rationality
▶ This notion generalizes the idea of best response strategies
from static games

Definition
Given strategies s−i of other players, a strategy of player i is
sequentially rational if it is a best response to s−i in each of i’s
information sets
Backward Induction

▶ Backward induction is an algorithm to find sequentially


rational equilibrium strategies in dynamic games of perfect
information with finite terminal nodes
▶ Consider the decision nodes in the game tree which have the
terminal nodes as successors
▶ Substitute a payoff vector that gives the player corresponding
to this decision node the highest payoff over all actions
available here; remove the terminal nodes and edges
connecting the decision node to the terminal node
▶ Repeat above step for the reduced game tree
▶ Since there are finitely many terminal nodes, the process
terminates after finitely many iterations, leaving a single payoff vector
Backward induction contd.

▶ This payoff vector corresponds to a Nash


equilibrium in sequentially rational strategies
▶ The actions chosen at each information set in these steps,
along the path from the initial node to the corresponding
terminal node of the original game tree, constitute the Nash
equilibrium in sequentially rational strategies for every player
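The algorithm above can be sketched recursively; here a decision node is encoded as (player, {action: child}) and a terminal node as a payoff tuple (an illustrative encoding, not the text's notation):

```python
def backward_induction(node):
    """Payoff vector reached under sequentially rational play."""
    if not isinstance(node[1], dict):        # terminal node: a payoff tuple
        return node
    player, children = node                  # decision node
    # the mover picks the action whose continuation payoff is best for her
    return max((backward_induction(child) for child in children.values()),
               key=lambda pay: pay[player])

# Example 1 (entry game): player 0 = Entrant, player 1 = Incumbent
entry = (0, {"not enter": (0, 2),
             "enter": (1, {"accommodate": (1, 1), "fight": (-1, -1)})})
print(backward_induction(entry))   # (1, 1): enter, then accommodate
```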
Game Tree of Example 1

E: not enter → (0, 2); enter → I: accommodate → (1, 1), fight → (−1, −1)

Game Tree of Example 1

After replacing I’s decision node with its optimal payoff: E: not enter → (0, 2); enter → (1, 1)
Game Tree of Example 1

E: not enter → (0, 2); enter → I: accommodate → (1, 1), fight → (−1, −1); backward induction selects (enter, accommodate), with payoff (1, 1)


Example 2: Stackelberg leadership
▶ Profit of firm i ∈ {L, F } is

πi (qi , qj ) = (1 − qi − qj − c)qi

▶ A strategy of L is simply an action in R+


▶ A strategy of F maps possible values of qL to possible values
of qF : qF (qL )
▶ Infinitely many such strategies can be defined
▶ Best response function of F is also a strategy: for each qL , it
picks the qF which maximizes F ’s profit
▶ It is obtained by setting the derivative of πF with respect to
qF equal to zero:

qF = (1 − qL − c)/2
▶ Question: how many Nash equilibria are there in this game?
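Before answering, it helps to know the sequentially rational benchmark; it can be found numerically by grid search over the leader's output, with the follower best-responding (a sketch with an assumed cost c = 0.1, function names illustrative):

```python
# Sequentially rational play in the Stackelberg game (c = 0.1 assumed).
c = 0.1

def br_F(qL):
    """Follower's best response qF = (1 - qL - c)/2, truncated at zero."""
    return max((1 - qL - c) / 2, 0.0)

def leader_profit(qL):
    qF = br_F(qL)
    return (1 - qL - qF - c) * qL

# grid-search the leader's problem against the follower's best response
grid = [i / 1000 for i in range(1001)]
qL = max(grid, key=leader_profit)
print(round(qL, 3), round(br_F(qL), 3))  # 0.45 0.225
```

This matches the closed forms qL = (1 − c)/2 and qF = (1 − c)/4.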
Nash equilibria of Stackelberg game

▶ A strategy pair (q̂L , q̂F (·)) is a Nash equilibrium of the


Stackelberg game if

πL (q̂L , q̂F (q̂L )) ≥ πL (qL , q̂F (qL )), ∀qL ∈ R+ (2)


πF (q̂L , q̂F (q̂L )) ≥ πF (q̂L , qF (q̂L )), for all strategies qF (·)
or, πF (q̂L , q̂F (q̂L )) ≥ πF (q̂L , qF ), ∀qF ∈ R+ (3)

▶ The first condition requires that when F plays the strategy


q̂F (·), L does not get higher profit by choosing any output
other than q̂L
▶ The second condition only requires that F plays the best
response to q̂L , not necessarily to every qL
▶ Any qL ≤ q M (the monopoly output) can be sustained in a
Nash equilibrium!
Cournot output in Stackelberg model Nash equilibrium

[Figure: the constant follower strategy qF (qL ) = q C sustains the Cournot outcome (q C , q C ); L’s isoprofit curve πL (qL , qF ) = πLC passes through it, with best-response curves qL∗ (qF ) and qF∗ (qL )]
Arbitrary output of leader in Stackelberg model

[Figure: a follower strategy qF (qL ) that coincides with the best response qF∗ only at q̄L sustains the arbitrary leader output q̄L ; L’s isoprofit curve πL (qL , qF ) = πL (q̄L , qF∗ (q̄L )) passes through (q̄L , qF∗ (q̄L ))]


Sequentially rational equilibrium in Stackelberg model

[Figure: the outcome (qLS , qFS ), where L’s isoprofit curve πL (qL , qF ) = πLS is tangent to F ’s best-response curve qF∗ (qL )]


Example: power of commitment

▶ Player 1 (government) wants to influence choice of player 2


▶ Player 2 chooses an action a2 ∈ {0, 1} and receives a transfer
t ∈ {0, 1} from 1, who observes a2
▶ 2’s objective is to maximize the expected transfer net of the
cost of his action, which is 0 for a2 = 0 and 1/2 for a2 = 1
▶ Player 1’s objective is to minimize 2(a2 − 1)² + t
▶ Before 2 chooses his action, 1 can announce a transfer rule
t(a2 )
▶ Result: 1’s announcement impacts 2’s action if and only if 1
can commit to it!
Game tree with commitment

1 announces a transfer rule t(0)t(1) ∈ {t00 , t01 , t10 , t11 }; 2 then chooses a2 ∈ {0, 1}. Payoffs (u1 , u2 ):
t00 : a2 = 0 → (−2, 0), a2 = 1 → (0, −1/2)
t01 : a2 = 0 → (−2, 0), a2 = 1 → (−1, 1/2)
t10 : a2 = 0 → (−3, 1), a2 = 1 → (0, −1/2)
t11 : a2 = 0 → (−3, 1), a2 = 1 → (−1, 1/2)
Game tree without commitment

Without commitment the announcement does not bind: effectively 2 chooses a2 , and 1 then chooses t. Payoffs (u1 , u2 ):
a2 = 0: t = 0 → (−2, 0), t = 1 → (−3, 1)
a2 = 1: t = 0 → (0, −1/2), t = 1 → (−1, 1/2)
Subgame perfection

▶ Note that the backward induction algorithm is not


well-defined for games of imperfect information
▶ Subgame perfection is a more general algorithm to find out
sequentially rational Nash equilibria
▶ In games of perfect information, subgame perfection and
backward induction are the same
Subgames

▶ Given an extensive form game G , consider any decision node


x ∈ χ(G ) which belongs to a singleton information set
▶ Note: There is at least one such node in every extensive form
game
▶ The subgame of G beginning at x, denoted G (x), contains x
and all its successors
▶ If x ′ ∈ G (x) and x ′′ ∈ H(x ′ ) then x ′′ ∈ G (x)
▶ The assignments of actions, information sets, and payoffs on
the terminal nodes for every player in G (x) are the same as in
G but restricted to the nodes in G (x)
▶ Note: G is a subgame of itself
Subgame Perfect Nash Equilibrium

Definition
A strategy profile s in an extensive form game G is a subgame
perfect Nash equilibrium if the corresponding restrictions of the
strategy profile constitute Nash equilibria for each subgame of G
Subgame perfection algorithm
▶ Given an extensive form game, consider the subgames with
the smallest number of nodes:
▶ If there is only one player with a move in any of these
subgames, find an optimal action and replace the initial node
of this subgame with the corresponding payoff vector; remove
remaining nodes and edges of this subgame
▶ If there are multiple players with a move in any of these
subgames, find a Nash equilibrium strategy profile for these
players, and replace the initial node of this subgame with the
corresponding payoff vector; remove remaining nodes and
edges of this subgame
▶ Repeat the previous step in the reduced game
▶ As the game is finite, the algorithm terminates after finite
iterations, yielding a subgame perfect Nash equilibrium payoff
vector
▶ The strategies yielding this payoff vector constitute a subgame
perfect Nash equilibrium
Remarks

▶ Note that in games of perfect information, subgame


perfection is the same as backward induction
▶ In a game of imperfect information G where the only subgame
is G itself, subgame perfection cannot refine Nash equilibria
any further: all Nash equilibria are then subgame perfect
Example 3: investment followed by Cournot competition

▶ Firms 1 and 2 have identical constant marginal cost of 2 per


unit
▶ 1 can install a new technology with a marginal cost of zero by
paying a fixed cost f
▶ 2 observes whether 1 invests in new technology
▶ Once the investment decision is observed, the two firms will
simultaneously choose output levels q1 and q2 as in Cournot
competition
Example contd.

▶ Suppose demand is p(q1 , q2 ) = 14 − q1 − q2


▶ Firm 1’s payoff is (12 − q1 − q2 )q1 if it does not invest, and
(14 − q1 − q2 )q1 − f if it invests
▶ Firm 2’s payoff is (12 − q1 − q2 )q2 regardless of firm 1’s
decision to invest
▶ We apply the subgame perfection algorithm
▶ There are three subgames, one for each investment decision
by firm 1, and the entire game itself
Game tree

[Figure: firm 1 chooses invest or not invest; in each subgame firm 1 chooses q1 ∈ [0, ∞) and firm 2, without observing q1 , chooses q2 ∈ [0, ∞)]
Example concl.
▶ In the subgame corresponding to 1 not investing, the reaction
function of firm 1 is q1 = 6 − q22 , and that of firm 2 is
q2 = 6 − q21
▶ Solving gives the Nash equilibrium output levels for this
subgame (4, 4) with payoff (16, 16)
▶ In the subgame corresponding to 1 investing, the reaction
function of firm 1 is q1 = 7 − q22 , and that of firm 2 is
q2 = 6 − q21
▶ Solving gives the Nash equilibrium output levels for this subgame (16/3, 10/3) with payoffs (256/9 − f , 100/9)
▶ Firm 1 invests if 256/9 − f > 16, i.e. f < 112/9

▶ (invest, 4, 16/3; 4, 10/3) is a subgame perfect Nash equilibrium for any f ≤ 112/9
▶ (not invest, 4, 16/3; 4, 10/3) is a subgame perfect Nash equilibrium for any f ≥ 112/9
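The two Cournot subgames can be checked numerically. A minimal Python sketch using the standard Cournot best-response algebra for inverse demand p = a − q1 − q2 (the function name `cournot_eq` and the use of exact fractions are ours, not from the slides):

```python
from fractions import Fraction as F

def cournot_eq(c1, c2, a=14):
    """Cournot-Nash outputs and profits for inverse demand p = a - q1 - q2
    and constant marginal costs c1, c2, from the closed form
    qi = (a - 2*ci + cj)/3 obtained by solving the two reaction functions."""
    q1, q2 = F(a - 2 * c1 + c2, 3), F(a - 2 * c2 + c1, 3)
    p = a - q1 - q2
    return q1, q2, (p - c1) * q1, (p - c2) * q2

# Subgame after "not invest": both marginal costs equal 2
q1, q2, pi1, pi2 = cournot_eq(2, 2)     # outputs (4, 4), payoffs (16, 16)

# Subgame after "invest": firm 1's marginal cost is 0 (gross of the fixed cost f)
Q1, Q2, Pi1, Pi2 = cournot_eq(0, 2)     # outputs (16/3, 10/3), payoffs (256/9, 100/9)

f_threshold = Pi1 - pi1                 # firm 1 invests iff f < 112/9
```

Exact rational arithmetic avoids any rounding in the threshold f = 112/9.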
Exercise 8.1 (Mixed strategy SPNE)

[Game tree: player 1 first chooses Y or N; N ends the game with payoffs (1.5, 1.5); after Y, players 1 and 2 simultaneously choose O/F and o/f , with payoffs (2, 1) after (O, o), (0, 0) after (O, f ) and (F, o), and (1, 2) after (F, f ).]
Exercise 8.1

Player 2
o f
Player 1 YO 2, 1 0, 0
YF 0, 0 1, 2
NO 1.5, 1.5 1.5, 1.5
NF 1.5, 1.5 1.5, 1.5
Exercise 8.1

▶ Suppose 1 and 2 mix in the smallest subgame with probabilities (p, 1 − p) and (σ, 1 − σ) resp.
▶ In equilibrium of the subgame, p = 2/3 and σ = 1/3, with payoff (2/3, 2/3)
▶ Since 2/3 < 1.5, player 1 plays N; ((0, 0, 2/3, 1/3), (1/3, 2/3)) is the SPNE in mixed strategies
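The mixing probabilities can be verified from the indifference conditions of the (O, F) × (o, f ) subgame; a small sketch in exact arithmetic (the helper name `expected` is ours):

```python
from fractions import Fraction as F

# Subgame payoff bimatrix: rows O, F for player 1; columns o, f for player 2
U1 = [[2, 0], [0, 1]]
U2 = [[1, 0], [0, 2]]

p, s = F(2, 3), F(1, 3)   # candidate weights on O and on o

def expected(U, p_row, p_col):
    """Expected payoff in U when row plays (p_row, 1-p_row), column (p_col, 1-p_col)."""
    return sum(U[i][j] * ri * cj
               for i, ri in enumerate((p_row, 1 - p_row))
               for j, cj in enumerate((p_col, 1 - p_col)))

assert expected(U1, 1, s) == expected(U1, 0, s)   # 1 indifferent between O and F
assert expected(U2, p, 1) == expected(U2, p, 0)   # 2 indifferent between o and f
assert expected(U1, p, s) == F(2, 3)              # subgame payoff 2/3 < 1.5, so 1 plays N
```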
Past year question

▶ Player 1, the agenda-setter, chooses a point a1 ∈ [0, 1]. Player


2, the voter, can either accept a1 or reject it, maintaining the
status-quo, a0 ∈ [0, 1]. The end outcome, denoted x, is
therefore, either a0 or a1 .
1. If player 2’s utility from x is given by u2 (x) = −(x − 1/3)2 , and player 1’s utility is u1 (x) = x, find a subgame perfect Nash equilibrium of this 2-stage game.
2. How will your answer change if player 1’s utility changes to u1 (x) = −(x − 2/5)2 ? (7)
Past year question

▶ Maya and Elif approach a King with an infant, each claiming


that the infant is her child. The true mother values the child
at 100 whereas the impostor values the child at 50 – each of
them knows the other’s value as well. The king knows the
“values” that the true mother and the impostor ascribe to the
child, but he cannot distinguish the true mother from the
impostor. He announces the following steps:
1. He asks Maya if the child is hers. If she answers “no”, the
child is given to Elif. If she answers “yes”, then next step.
2. He asks Elif if the child is hers. If she answers “no”, the child
is given to Maya. If she answers “yes”, Elif will pay the king
75, and receive the child, and Maya will pay the king 10.
▶ Show that the game designed by the king guarantees that in
either state of the world, the only subgame perfect equilibrium
is the one under which the true mother gets her child and
neither woman pays anything at all.
One Stage Deviation Principle

▶ Given a finite extensive form game, sometimes we want to check if a particular strategy profile is a subgame perfect Nash equilibrium
▶ If we assume these games are multi-stage games with
observed action, the following result is useful for this purpose:

Theorem
A strategy profile s in a finite horizon multi-stage game with observed actions is subgame perfect if and only if no player can profitably deviate from s in a single stage and conform to s thereafter.
Multistage games with observed action

▶ At any stage, all players move simultaneously (this allows for an action like “doing nothing”)
▶ A history at any stage t is a profile of past actions that leads
to the stage, observed by all players at t
▶ History at zero, h0 is the null set ∅
▶ The profile of actions taken by the agents in the stage
immediately following history at stage t, ht , is at
▶ For all t ≥ 1, ht = (ht−1 , at−1 ) = (∅, a0 , a1 , . . . , at−1 )
▶ The set of possible histories is denoted H
▶ Note: each history induces a subgame
▶ Strategies are now defined as mappings from histories to actions, si : H → Ai
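These definitions translate directly into code: a history is the tuple of past action profiles, and a pure strategy is a function from histories to actions. A minimal sketch, illustrated with the grim-trigger rule that reappears in the repeated games section below (action labels are illustrative):

```python
def grim_trigger(history):
    """Play C as long as every past action profile was (C, C); otherwise play N."""
    return "C" if all(profile == ("C", "C") for profile in history) else "N"

h0 = ()                              # the null history at stage 0
h2 = (("C", "C"), ("C", "N"))        # a possible history at stage 2
```

Here `grim_trigger(h0)` returns "C" while `grim_trigger(h2)` returns "N".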
Proof of OSD Principle: Sufficiency
▶ To prove that every subgame perfect Nash equilibrium
strategy profile s must satisfy OSD
▶ Suppose not: there is a player i, a history ht̂ , and an action a
such that the strategy si′ , defined as follows, is a profitable
deviation for i at the subgame beginning at history ht̂ given
s−i :

si′ (ht ) = si (ht ) for all ht ̸= ht̂ ;  si′ (ht̂ ) = a ̸= si (ht̂ )

▶ Then the restriction of s in the subgame following history ht̂


cannot be a Nash equilibrium of the subgame
▶ This contradicts that s is a subgame perfect Nash equilibrium
strategy profile
Game Tree example

[Game tree: at history ht̂ , strategy si′ deviates from si at a single stage and coincides with si elsewhere.]
Proof of OSD Principle: Necessity

▶ To prove that every strategy profile s satisfying OSD must be subgame perfect
▶ Suppose not: there is a player i and another strategy si′ ̸= si
such that the restriction of si′ to the subgame beginning at ht
is a better response to s−i in this subgame than si
▶ Let t̂ be the largest number such that si′ (ht̂ ) ̸= si (ht̂ )
▶ Notice that t̂ > t by OSD
Proof of OSD Principle: Necessity contd.

▶ Consider another strategy si′′ constructed as follows:

si′′ (ht ) = si′ (ht ) for all t ≤ t̂ ;  si′′ (ht ) = si (ht ) for t > t̂
▶ Notice that for t > t̂, si (ht ) = si′ (ht ) = si′′ (ht ) and
si′ (ht̂ ) = si′′ (ht̂ )
▶ Thus, si′ and si′′ are identical in the subgame beginning at
history ht̂
▶ But si′′ is a one stage deviation from si in the subgame
beginning at history ht̂ , so it cannot be better than si in this
subgame unless si (ht̂ ) = si′′ (ht̂ )
▶ This implies si (ht̂ ) = si′ (ht̂ )
An example

[Game tree: si′ deviates from si from history ht onwards, with its last difference at history ht̂ ; si′′ follows si′ up to ht̂ and si thereafter.]
Proof of OSD Principle: Necessity concl.

▶ If t̂ = t + 1, then there was only one possible deviation to check and we are done
▶ If t̂ > t + 1, then we may need more iterations of the above
procedure
▶ For instance, if t̂ = t + 2, then the first iteration shows
si′ (ht+2 ) = si (ht+2 ), so then t̂ = t + 1, and the next iteration
shows si′ (ht+1 ) = si (ht+1 ) and so on
An example

[Game tree: after one iteration si′ agrees with si at ht̂ , and the argument is repeated at the earlier histories between ht and ht̂ .]
Check

▶ A backward induction solution in an MSGOA satisfies OSD
▶ A subgame perfect solution in an MSGOA satisfies OSD
Games with infinite stages

▶ The OSD principle goes through in games with infinite stages provided the game is continuous at infinity:

For all i,  sup_{h,h̃ : ht = h̃t} |ui (h) − ui (h̃)| → 0 as t → ∞.

▶ Continuity at infinity will be satisfied if overall payoffs are sums of discounted per period payoffs git (at ), and per period payoffs are uniformly bounded, i.e. there is B such that

For all i,  max_{t,at} |git (at )| < B.
Theorem
A strategy profile s in an infinite horizon multistage game with
observed actions satisfying continuity at infinity is subgame perfect
if and only if no player can profitably deviate from s in a single
stage and conform to s thereafter.
Proof sketch
▶ The argument of the previous proof shows that every subgame
perfect strategy profile must satisfy OSD
▶ The argument also shows that a strategy profile satisfying
OSD cannot be improved upon by any finite sequence of
deviations in any subgame
▶ Suppose not: s is a strategy profile satisfying OSD which is
not subgame perfect: there is stage t, history ht , and player i
who could improve his payoff by ϵ > 0 using strategy si′
different from si in the subgame beginning ht
▶ By continuity at infinity, there is a t ′ such that the strategy si′′ which agrees with si′ until t ′ and with si from t ′ onwards must improve on si by at least ϵ/2
▶ This contradicts that no strategy profile satisfying OSD can
be improved upon by any finite sequence of deviations in any
subgame
Critiques of backward induction: a centipede game

[Game tree: players 1, 2, . . . , I move in sequence, each choosing A or D; if player i is the first to choose D, every player gets 1/i, giving payoff vectors (1, . . . , 1), (1/2, . . . , 1/2), . . . , (1/I, . . . , 1/I); if all choose A, every player gets 2.]
Critiques of backward induction

▶ Consider the I-player centipede game above


▶ Backward induction leads to unique prediction (A, . . . , A)
▶ (A, . . . , A) looks like a plausible prediction if I is small
▶ The payoff 2 requires all I players to play A: if players choose
A with independent probability p < 1, the probability that
I − 1 players will choose A can be very small if I is large
▶ The prediction also relies heavily on the common knowledge
assumption
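The probability remark is a one-liner: if each player continues independently with probability p, the chance that all of the other I − 1 players continue is p^(I−1), which vanishes as I grows. A sketch:

```python
def prob_others_continue(p, I):
    """Probability that the remaining I - 1 players all play A,
    when each plays A independently with probability p."""
    return p ** (I - 1)

# Even with p = 0.99, 500 players make coordination on A very unlikely
small = prob_others_continue(0.99, 500)   # about 0.0066
```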
Critiques of backward induction (Rosenthal 1981)

[Game tree: players 1 and 2 alternate over five decision nodes with continue/stop actions A1 , . . . , A5 and D1 , . . . , D5 ; stopping at D1 , . . . , D5 yields (1, 0), (0, 1), (3, 0), (2, 4), (6, 3) respectively, and continuing at every node yields (5, 5).]
Critique of backward induction

▶ In the centipede game above, the unique backward induction strategy profile involves players playing D at every decision node
▶ If player 2 observes that 1 has chosen A instead, should he
pick A or D?
▶ Seems like if there is a chance of 1 responding with A, then 2
might want to gamble by playing A
▶ How should players form beliefs about opponents’ future
behaviour?
▶ Critics argue that reasonable theory should not rule out any
behaviour after observing an event which is theoretically
impossible
▶ Another way to address this issue is to update beliefs about
future events based on current observations
Critique of subgame perfection (Rabin 1988)

[Game tree: player 1 chooses L, ending the game with payoffs (6, 0, 6), or R; player 2 then chooses R, ending with (8, 6, 8), or L; after L, players 1 and 3 play a simultaneous coordination game with actions F and G, with payoffs (7, 10, 7) after (F, G) and (G, F), and (0, 0, 0) after (F, F) and (G, G).]
Critique of subgame perfection

▶ Subgame perfection supposes that players expect the same Nash equilibria in all subgames
▶ The coordination game between 1 and 3 in the third stage has three Nash equilibria: two where 1 and 3 coordinate successfully with a payoff of (7, 10, 7) and a mixed strategy equilibrium with a payoff of (7/2, 5, 7/2)
▶ If 1 and 3 coordinate, then 2 plays L and 1 plays R
▶ If 1 and 3 cannot coordinate, 2 plays R and 1 plays R
▶ But if 1 believes that he cannot coordinate in stage 3, but he
believes that 2 thinks otherwise, 1 can play L!
Repeated games: Prisoner’s Dilemma

▶ Suppose the per-period payoffs depend only on current actions gi (at ) and are given by
Player 2
C N
Player 1 C 1, 1 -1, 2
N 2, -1 0,0
Average discounted payoff

▶ Suppose players discount future payoffs with a common discount factor δ
▶ Utility of a sequence {a0 , . . . , aT } is normalized to

((1 − δ)/(1 − δ^(T+1))) ∑_{t=0}^{T} δ^t gi (at )

▶ The above expression is the average discounted payoff:


present value multiplied by a factor such that it would equal 1
if per period payoffs were 1
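The normalization can be written out directly; a sketch (the function name is ours) that returns exactly 1 for a constant stream of per-period payoffs equal to 1:

```python
def average_discounted_payoff(payoffs, delta):
    """(1 - delta)/(1 - delta**(T+1)) * sum_t delta**t * g_t
    for a finite stream g_0, ..., g_T."""
    n = len(payoffs)                                   # n = T + 1 periods
    pv = sum(delta ** t * g for t, g in enumerate(payoffs))
    return (1 - delta) / (1 - delta ** n) * pv
```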
Finite repetitions

▶ If the game is played only once, then non-cooperation is the dominant strategy for both players
▶ Thus the unique Nash equilibrium of the stage game is (N, N)
▶ If the game is repeated a finite number of times, then playing
(N, N) in every period is the unique subgame perfect Nash
equilibrium
Infinite repetitions

▶ If the game is repeated infinitely many times, playing (N, N) in every stage remains a subgame perfect equilibrium
▶ Further it is the only equilibrium where play in any stage does
not depend on the history of the previous stage
▶ But if δ > 1/2 then the following describes a subgame perfect equilibrium strategy:
play C in the first period, and in every period where the action profile in the previous period was (C, C); if the action profile in the previous period was different, play N in every period thereafter
Proof
▶ Two classes of subgames: A, where both players play C, and
B, where some player has played N
▶ A player conforming to the strategies in every subgame in A
has an average discounted payoff 1
▶ A player deviating at some t and conforms to the (class B)
strategies thereafter, has average discounted payoff

(1 − δ)(1 + δ + δ^2 + . . . + δ^(t−1) + 2δ^t + 0 + . . .)
= (1 − δ)[ (1 − δ^t)/(1 − δ) + 2δ^t ]
= 1 − δ^t (2δ − 1)

▶ This expression is less than 1 when δ > 1/2
▶ Since no player can gain by deviating from the proposed
strategy profile in a single period and conforming thereafter, it
is a subgame perfect equilibrium
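The closed form 1 − δ^t (2δ − 1) can be reproduced by summing the deviation stream directly; a short sketch (the function name is ours):

```python
def deviation_payoff(delta, t):
    """Average discounted payoff from cooperating through period t - 1,
    deviating to N at period t (stage payoff 2) and earning 0 thereafter."""
    return (1 - delta) * (sum(delta ** s for s in range(t)) + 2 * delta ** t)

# Matches 1 - delta**t * (2*delta - 1); below 1 exactly when delta > 1/2
assert deviation_payoff(0.9, 3) < 1 < deviation_payoff(0.3, 3)
```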
Remarks

▶ Folk theorem: there can be many other subgame perfect equilibria in the game provided the discount factor is sufficiently large
▶ Note that the nature and range of subgame perfect equilibria
can be different depending on the number of repetitions
A finitely repeated game

▶ Consider a game comprising two repetitions of the following stage game:
Player 2
L M R
Player 1 U 0, 0 3, 4 6,0
M 4, 3 0, 0 0, 0
D 0, 6 0, 0 5, 5
▶ This stage game has three Nash equilibria: (M, L), (U, M) and a mixed strategy equilibrium ((3/7, 4/7, 0), (3/7, 4/7, 0)), with payoffs (4, 3), (3, 4) and (12/7, 12/7)
▶ The efficient payoff (5, 5) is not an equilibrium
A finitely repeated game

▶ In the two stage game, the following strategy profile is subgame perfect if δ > 7/9:
play (D, R) in the first stage; if the first stage outcome is (D, R), play (M, L) in the second stage; if the first stage outcome is not (D, R), play ((3/7, 4/7, 0), (3/7, 4/7, 0)) in the second stage
▶ Notice that the second stage actions prescribed are Nash equilibria of the stage game
▶ If player 1 deviates in the first stage, he gains 1 in the first stage but his second period payoff goes down from 4 to 12/7: so he will not deviate if 1 < δ(4 − 12/7), or δ > 7/16
▶ If player 2 deviates in the first stage, he gains 1 in the first stage but his second period payoff goes down from 3 to 12/7: so he will not deviate if 1 < δ(3 − 12/7), or δ > 7/9
Infinitely repeated game with observed action

▶ The simultaneous move game which is repeated is called the stage game
▶ Suppose stage game involves a finite set of players I , with
finite action spaces Ai (typical element ai ), and payoff
functions gi : A → R, where A = A1 × . . . × AI
▶ Mixed strategies in the stage game are denoted αi
▶ We assume that players discount future payoffs using common
discount factor δ < 1
Aggregate and continuation payoffs

▶ For any strategy profile σ and history ht , a player’s expected payoff from period t onwards is his continuation payoff:

Eσ [ (1 − δ) ∑_{τ=t}^{∞} δ^(τ−t) gi (σ τ (hτ )) ]

▶ Player i’s objective is to maximize the normalized sum

ui = Eσ [ (1 − δ) ∑_{t=0}^{∞} δ^t gi (σ t (ht )) ]

where σ denotes a mixed strategy profile of the entire game and E an expectation operator
Observation

▶ If α∗ is a Nash equilibrium of the stage game, the strategies “play αi∗ in every period” constitute a subgame perfect Nash equilibrium
▶ If there are m Nash equilibria, then for any map j from set of
j(t)∗
periods to indices, “play αi in period t” is a subgame
perfect Nash equilibrium
▶ Check the above statements using the OSD principle
Preliminaries

▶ Player i’s reservation utility or minmax value is

v i = min_{α−i} max_{αi} gi (αi , α−i )

▶ v i is the lowest payoff that his opponents can force him to realize by any choice of α−i provided he foresees it and plays his best response
▶ Let m−i be the minmax strategy profile for i’s opponents that attains the minimum above
▶ Let mi be a strategy for player i such that gi (mi , m−i ) = v i
An example

▶ To find the minmax strategies of players in the following game:


Player 2
L R
Player 1 U -2, 2 1, -2
M 1, -2 -2, 2
D 0, 1 0, 1
Example contd.

▶ Given any mixed strategy of player 2, (q, 1 − q), the payoffs of player 1 from his three strategies, U, M and D, are 1 − 3q, 3q − 2 and 0
▶ Notice that

max(1 − 3q, 3q − 2, 0) = 1 − 3q for 0 ≤ q < 1/3;  0 for 1/3 ≤ q < 2/3;  3q − 2 for 2/3 ≤ q ≤ 1

▶ Hence the minmax value of player 1 is 0
▶ The minmax strategy of player 2 is any q ∈ [1/3, 2/3]
Example contd.

▶ Given any mixed strategy of player 1, (pU , pM , 1 − pU − pM ), the payoffs of player 2 from his two strategies, L and R, are 2(pU − pM ) + (1 − pU − pM ) and (1 − pU − pM ) − 2(pU − pM )
▶ Notice

max{2(pU − pM ) + (1 − pU − pM ), (1 − pU − pM ) − 2(pU − pM )} = 2(pU − pM ) + (1 − pU − pM ) if pU ≥ pM , and (1 − pU − pM ) − 2(pU − pM ) if pU < pM

▶ The above function is minimized when pU = pM and 1 − pU − pM = 0, which gives us pU = pM = 1/2
▶ Thus the minmax value of player 2 is also zero
▶ The minmax strategy of player 1 against 2 is (1/2, 1/2, 0)
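Player 1's minmax value can be recovered by brute force over player 2's mixed strategies (q, 1 − q); a grid-search sketch (the function name is ours):

```python
def best_response_payoff(q):
    """Player 1's best-response payoff against player 2's mix (q, 1 - q)."""
    return max(1 - 3 * q,   # U: -2q + 1*(1 - q)
               3 * q - 2,   # M:  1*q - 2*(1 - q)
               0)           # D: payoff 0 regardless of q

# Minmax value: player 2 picks q to minimize player 1's best response
minmax1 = min(best_response_payoff(q / 1000) for q in range(1001))   # equals 0
```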
An observation

▶ Player i’s payoff is at least v i in any Nash equilibrium of the stage game since Nash equilibrium strategies are best responses, and v i is the minimum best response payoff
▶ Player i’s payoff is at least v i in any Nash equilibrium of the
repeated game as well
▶ Consider any Nash profile σ̂ of the repeated game: even the
“naive” strategy of choosing ai (ht ) to maximize
gi (ai , σ̂−i (ht )) (i.e., choosing the best response to the Nash
strategy profile of opponents in each period) yields at least v i
Feasible payoffs

▶ The set of feasible payoffs includes the payoff vectors arising out of pure strategies and mixed strategies that are independent of each other
▶ If we allow for correlated mixing probabilities as well, then the
set of feasible payoffs is the convex hull of pure strategy
payoffs, denoted V
▶ Recall that the convex hull of a set A is the smallest convex
set containing A
▶ The feasible payoffs where every player gets at least his
minmax value are called individually rational
Illustration: Set of feasible payoffs

[Figure: in the (v1 , v2 ) plane, the convex hull of the pure strategy payoff vectors (−2, 2), (0, 1) and (1, −2).]

Illustration: Feasible individually rational payoffs

[Figure: the subset of the convex hull above where v1 ≥ 0 and v2 ≥ 0, the two minmax values.]
Sidenote: independent vs correlated mixing

▶ If players 1 and 2 use independent mixed strategies (p, 1 − p) and (q, 1 − q) respectively in the Prisoner’s Dilemma game, then the probability distribution over the four possible outcomes is given by the following matrix:
Player 2
C N
Player 1 C pq p(1 − q)
N (1 − p)q (1 − p)(1 − q)
▶ For example, if p = 1/2 and q = 1/3 then the numbers in these boxes are 1/6, 1/3, 1/3 and 1/6, starting from the top left box clockwise
▶ The payoffs that can arise in the game are convex
combinations of the four payoffs (1, 1), (−1, 2), (2, −1), and
(0, 0) with probability weights as in the matrix above
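A small sketch computing this joint distribution under independent mixing (the function name is ours):

```python
from fractions import Fraction as F

def outcome_distribution(p, q):
    """Joint probabilities of (C,C), (C,N) / (N,C), (N,N) under independent
    mixes (p, 1-p) for player 1 and (q, 1-q) for player 2."""
    return [[p * q, p * (1 - q)],
            [(1 - p) * q, (1 - p) * (1 - q)]]

dist = outcome_distribution(F(1, 2), F(1, 3))   # [[1/6, 1/3], [1/6, 1/3]]
```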
Sidenote: independent vs correlated mixing

▶ But allowing only independent mixing probabilities implies that not all convex combinations are feasible
▶ For instance, no two independent mixing distributions (p, 1 − p), (q, 1 − q) can induce the following probability distribution over outcomes:
Player 2
C N
Player 1 C 1/2 0
N 0 1/2
▶ How do we allow players to coordinate in a
simultaneous-move game over their mixing probabilities?
Sidenote: independent vs correlated mixing

▶ Suppose at the beginning of the game players are able to observe a public signal that turns green or red with equal probability
▶ If players condition their pure strategies on the signal (say, play C if green, and N if red), then we can observe the above probability distribution over outcomes
▶ Note: There is a notion of correlated equilibrium (like Nash equilibrium) which is a probability distribution p : S → [0, 1] over strategies such that no individual player can gain from a permutation of his strategies given these probabilities, i.e. for all i and any permutation di : Si → Si ,

∑_{s∈S} p(s)ui (s) ≥ ∑_{s∈S} p(s)ui (di (si ), s−i )
Assumptions required for convex feasible set

▶ We will assume that at the beginning of each period t, players are able to observe a public signal ω t
▶ We then redefine history at period t as

ht = (a0 , . . . , at−1 ; ω 0 , . . . , ω t )

▶ Pure strategies are now mappings from histories to actions, si : H → Ai
Two Folk Theorems

Theorem (Nash Folk Theorem)


For every feasible payoff vector v ∈ V such that vi > v i for all
players i, there exists a δ < 1 such that for all δ ∈ (δ, 1), there is a
Nash equilibrium of G (δ) (the repeated game with discount factor
δ) with payoffs v .

Theorem (Perfect Folk Theorem)


Let α∗ be an equilibrium of the stage game, with payoffs e. For
every feasible payoff vector v ∈ V such that vi > ei for all players
i, there exists a δ < 1 such that for all δ ∈ (δ, 1), there is a
subgame perfect Nash equilibrium of G (δ) (the repeated game
with discount factor δ) with payoffs v .
Proof of Nash Folk Theorem

▶ Suppose there is a pure action profile a such that g (a) = v


▶ Consider the following strategies for each i:
Play ai in period 0 and continue to play ai as long as either
(i) the action profile realized in previous period is a, or (ii)
the action profile realized in previous period differs from a
in at least two components; if only one player i deviates
from a in some period, every other j plays mji for the rest
of the game
▶ If i deviates from a, he gets at most maxai gi (ai , a−i ) in that
period and v i from next period onwards
Proof of Nash Folk Theorem
▶ Thus the maximum payoff that i gets from unilaterally deviating at t:

(1 − δ)(vi + δvi + · · · + δ^(t−1) vi + δ^t max_{ai} gi (ai , a−i ) + δ^(t+1) v i + δ^(t+2) v i + · · · )
= (1 − δ)[ ((1 − δ^t)/(1 − δ)) vi + δ^t max_{ai} gi (ai , a−i ) + (δ^(t+1)/(1 − δ)) v i ]
= (1 − δ^t )vi + δ^t (1 − δ) max_{ai} gi (ai , a−i ) + δ^(t+1) v i

▶ The value of δ for which this payoff is equal to vi is given by

(1 − δ^t )vi + δ^t (1 − δ) max_{ai} gi (ai , a−i ) + δ^(t+1) v i = vi
or, −δ^t vi + δ^t (1 − δ) max_{ai} gi (ai , a−i ) + δ^(t+1) v i = 0
or, (1 − δ) max_{ai} gi (ai , a−i ) + δ v i = vi
Proof of Nash Folk Theorem

▶ Call this discount level δ i , notice that δ i < 1


▶ For all δ̂ > δ i the deviation payoff is less than the conforming
payoff
▶ Take δ = maxi δ i to complete the argument
Proof of Nash Folk Theorem

▶ If v cannot be realized using a pure strategy, consider a


strategy contingent on a public signal a(ω) with expected
payoff v
▶ The maximum continuation payoff when deviating in period t
is (1 − δ) maxai gi (ai , a−i ) + δv i
▶ The continuation payoff from conforming in period t is at
least (1 − δ) minai gi (ai , a−i ) + δvi
▶ Call value of δ at which these two expressions are equal, δ i
▶ Take δ = maxi δ i to complete the argument
Proof of Perfect Folk Theorem

▶ Suppose there is a pure action profile a such that g (a) = v


▶ Consider the following strategies for each i:
Play ai in period 0 and continue to play ai as long as the
action profile realized in previous period is a; if at least one
player deviated from a in some period, every i plays αi∗ for
the rest of the game
▶ This strategy profile is Nash equilibrium for all δ such that

(1 − δ) max gi (ai , a−i ) + δei < vi


ai
Proof of Perfect Folk Theorem

▶ Notice that the inequality above holds strictly for δ = 1, so it


must hold for a range of δ < 1
▶ To check that the suggested profile is subgame perfect, notice
that in any subgame off the equilibrium path players are
playing α∗ which is a Nash equilibrium for the subgame by our
earlier observation
▶ If there is no pure strategy a such that g (a) = v , we use
correlated mixing strategies as in the last proof
Illustration: Prisoner’s dilemma

▶ In our last example, the minmax payoff for both players is 0, max_{ai} gi (ai , a−i ) = 2, min_a gi (a) = −1, and the unique Nash equilibrium payoff in the stage game is (0, 0)
▶ Thus the level of δ above which it is possible to sustain a payoff of (1, 1) every period in a Nash equilibrium of the repeated game is given by 2(1 − δ) = 1, or δ = 1/2
▶ Since the minmax payoffs are also the unique Nash equilibrium payoffs, any δ > 1/2 also sustains (1, 1) every period in a subgame perfect Nash equilibrium of the repeated game
▶ A per period payoff (1/2, 1/2), which does not correspond to any pure strategy in the stage game, can be sustained in a Nash/subgame perfect equilibrium for levels of δ above the one defined by 2(1 − δ) = δ/2 − (1 − δ), or δ = 6/7
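Both thresholds come from the same comparison of deviation and conforming payoffs with reversion to the minmax value 0; a sketch in exact arithmetic, specialized to this Prisoner's Dilemma (the function `critical_delta` and its parameterization are ours):

```python
from fractions import Fraction as F

def critical_delta(v, max_g=2, min_g=None):
    """Critical delta for sustaining per-period payoff v when the punishment
    payoff is the minmax value 0.  Pure path: solve (1 - d)*max_g = v.
    Correlated path: solve (1 - d)*max_g = (1 - d)*min_g + d*v."""
    if min_g is None:
        return F(max_g - v) / max_g
    return F(max_g - min_g) / (max_g - min_g + v)

assert critical_delta(1) == F(1, 2)                  # sustain (1, 1): delta > 1/2
assert critical_delta(F(1, 2), min_g=-1) == F(6, 7)  # sustain (1/2, 1/2): delta > 6/7
```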
Folk Theorems for finitely repeated games

▶ Recall that in the Prisoner’s Dilemma example above, (N, N)


was the unique Nash equilibrium in the stage game
▶ Playing (N, N) every period is the unique subgame perfect
Nash equilibrium of the finitely repeated Prisoner’s Dilemma
▶ In fact, this is the only Nash equilibrium of the finitely
repeated Prisoner’s Dilemma game
▶ However, in another example involving a stage game with
multiple Nash equilibria, an outcome that Pareto-dominated
Nash equilibria was sustained in SPNE of the finitely repeated
game through a scheme of rewards and punishments
▶ Can we generalize this insight further and sustain any feasible
individually rational payoff in a finitely repeated game through
such schemes?
(Limit) Folk Theorems for finitely repeated games

Theorem (Nash Folk)


Suppose for every player i there is a Nash equilibrium of the stage
game α∗ (i) with gi (α∗ (i)) > v i . The set of Nash equilibrium
payoffs of the T -period repeated game converges to the set of
feasible individually rational payoffs as T → ∞ and δ → 1.

Theorem (Perfect Folk)


Suppose for some T ′ and δ ′ , there exists a set of I + 1 subgame
perfect Nash equilibrium strategies for the stage game repeated T ′
times, viz., α∗ , α∗ (1), . . . , α∗ (I ), such that for every i,
ui (α∗ ) − ui (α∗ (i)) > 0 for δ > δ ′ . The set of subgame perfect
Nash equilibrium payoffs of the T -period repeated game converges
to the set of feasible individually rational payoffs as T → ∞ and
δ → 1.
Exercise 9.1

▶ Consider the following simultaneous-move game that is played twice (the players observe the first-period outcome prior to the second-period play):
Player 2
L C R
Player 1 T 10, 10 2, 12 0, 13
M 12, 2 5, 5 0, 0
B 13, 0 0, 0 1, 1
▶ Find all the pure-strategy subgame-perfect equilibria with no
discounting (δ = 1).
▶ For each of the equilibria you found in (a), find the smallest
discount factor that supports it.
Exercise 9.2

▶ Two players are playing two consecutive games. First, they


play the following Centipede Game:
[Game tree: player 1 chooses C or N (N ends with (1, 1)); then player 2 chooses c or n (n ends with (0, 3)); then player 1 chooses C or N (N ends with (2, 2)); then player 2 chooses c or n (n ends with (1, 4), c ends with (3, 3)).]

▶ After the Centipede Game they play the following
coordination game:
Player 2
a b
Player 1 A 1, 1 0, 0
B 0, 0 3, 3
Exercise 9.2 Contd.

▶ What are the Nash equilibria of each stage-game?


▶ How many pure strategies does each player have in the
multistage game?
▶ Find all the pure-strategy subgame-perfect equilibria with
extreme discounting (δ = 0).
▶ Now let δ = 1. Find a subgame-perfect equilibrium for the two-stage game in which the players receive the payoffs (2, 2) in the first stage-game.
▶ What is the lowest value of δ for which the subgame-perfect
equilibrium you found above survives?
▶ For δ greater than the value you found above, are there other
outcomes of the first-stage Centipede Game that can be
supported as part of a subgame-perfect equilibrium?
Exercise 10.2

▶ Consider the following stage game being infinitely repeated


with discount factor δ < 1 :
Player 2
L C R
Player 1 T 6, 6 -1, 7 -2, 8
M 7, -1 4, 4 -1, 5
B 8, -2 5, -1 0, 0
▶ For which values of δ can the players support the pair of
actions (M, C) played in every period?
▶ For which values of δ can the players support the pair of
actions (T , L) played in every period? Why is your answer
different from the above part?
Exercise 10.3

▶ Consider the following stage game being infinitely repeated


with discount factor δ < 1 :
Player 2
m f
Player 1 M 4, 4 -1, 5
F 5, -1 1, 1
▶ Assume that the players wish to choose a “length-T
punishment” strategy: if there is a deviation from (a1 , a2 )
then players will play (F , f ) for T periods and then resume
playing (a1 , a2 ).
▶ Let δT be the critical discount factor so that if δ > δT , then
the adequately defined strategies will implement the desired
path of play with length-T punishment as the threat.
Exercise 10.3 contd.

▶ Let T = 1. What is the critical value δ1 to support the pair of


actions (M, m) played in every period?
▶ Let T = 2. What is the critical value δT to support the pair
of actions (M, m) played in every period?
▶ Compare the two critical values obtained above and explain
why they are different.
Exercise 10.8 (Tacit collusion)

▶ Two firms, which have zero marginal cost and no fixed cost,
produce some good, each producing qi ≥ 0, i ∈ {1, 2}. The
demand for this good is given by p = 200 − Q, where
Q = q1 + q2 .
▶ Solve for the static stage-game Cournot-Nash equilibrium.
▶ Consider the case of Cournot competition, in which each firm
chooses qi and this game is infinitely repeated with a discount
factor δ < 1. For which values of δ can you support the firms
equally splitting monopoly profits in each period as a
subgame-perfect equilibrium that uses grim-trigger strategies
(i.e., after one deviates from the proposed split, they resort to
the static Cournot-Nash equilibrium thereafter)?
Exercise 10.8 contd.

▶ Consider the case of Bertrand competition, in which each firm


chooses a price pi ≥ 0, where the lowest-price firm gets all the
demand and in case of a tie they split the market. Solve for
the static stage-game Bertrand-Nash equilibrium.
▶ For which values of δ can you support the firms equally
splitting monopoly profits in each period as a subgame-perfect
equilibrium that uses grim-trigger strategies (i.e., after one
deviates from the proposed split, they resort to the static
Bertrand-Nash equilibrium thereafter)?
▶ Now try to support the firms equally splitting monopoly profits
as a subgame-perfect equilibrium in which after a deviation
firms would resort to the static Bertrand competition for only
two periods. For which values of δ will this work? Why is this
answer different than your answer in the part above?
Exercise 10.12

▶ Suppose the following game is being infinitely repeated:


Player 2
C D
Player 1 N 0, 0 0, 0
T 1, 1 -1, 2
▶ Draw the convex hull of average payoffs.
▶ Are the average payoffs (v̄1 , v̄2 ) = (−0.4, 1.1) in the convex
hull of average payoffs? Can they be supported by a pair of
strategies that form a subgame-perfect equilibrium for a large
enough discount factor δ?
▶ Show that there is a pair of subgame-perfect equilibrium strategies for the two players that yields average payoffs that approach (v̄1 , v̄2 ) = (1/3, 4/3) as δ approaches 1.
Bargaining

▶ Bargaining usually refers to a situation of conflicting interests


of multiple agents
▶ Can be modelled as cooperative as well as non-cooperative
game
▶ One-period bargaining model also known as ultimatum game:
▶ Two players are bargaining over the share of a rupee
▶ If player 1 makes an offer in period 1 player 2 can accept or
reject
▶ If player 2 accepts then the game ends with each player getting
the agreed shares
▶ If player 2 rejects, then the game ends with zero payoffs for
both
Equilibria of the ultimatum game

▶ For any x ∈ [0, 1], (offer x, accept any offer of x or more) is a


Nash equilibrium
▶ (offer 0, reject all offers) is also a Nash equilibrium
▶ Unique subgame perfect equilibrium: (offer 0, accept all offers), with outcome (1, 0) shares
2 periods

▶ If player 1 makes an offer in period 1 player 2 can accept or


reject
▶ If player 2 accepts then the game ends with each player
getting the agreed shares
▶ If player 2 rejects, then player 2 makes an offer in period 2
player 1 can accept or reject
▶ If player 1 accepts then the game ends with each player
getting the agreed shares
▶ If player 1 rejects, then the game ends with zero payoffs for
both
SPNE of 2 period bargaining

▶ By an argument similar to that of the ultimatum game, the


unique SPNE outcome of the period-2 game is (0, 1)
▶ In other words, player 2 can get at least 1 in period 2 by
rejecting player 1’s period 1 offer
▶ Getting 1 in period 2 is worth δ < 1 in period 1
▶ Therefore, if 1 offers 2 a share of δ or more in period 1, 2
must accept it
▶ The highest payoff that 1 can get therefore is 1 − δ
▶ The unique SPNE ((offer δ in period 1, accept any positive
share in period 2), (accept any share of δ or more in period 1,
offer 0 in period 2)) yields an outcome (1 − δ, δ) in period 1
itself
SPNE of finite period bargaining

▶ The unique SPNE outcome of three period bargaining game:


(1 − δ(1 − δ), δ(1 − δ))
▶ The unique SPNE outcome of four period bargaining game:
(1 − δ(1 − δ(1 − δ)), δ(1 − δ(1 − δ))
▶ The unique SPNE outcome of the T period bargaining game:

(1 − δ + δ^2 − . . . + (−1)^(T−1) δ^(T−1) ,  δ − δ^2 + . . . + (−1)^T δ^(T−1))
= ( (1 − (−δ)^T)/(1 + δ) ,  δ(1 − (−δ)^(T−1))/(1 + δ) )

▶ As T → ∞, this outcome approaches (1/(1 + δ), δ/(1 + δ))
SPNE of infinite period bargaining

▶ Question: Suppose bargaining extends to potentially infinite


number of periods. Since it is not possible to apply backward
induction, how do we then figure out reasonable equilibrium
strategies?
▶ Is the limiting equilibrium outcome of the finite period game
obtainable?
SPNE strategies in the infinite period game

▶ Consider the following strategy:
offer the other player δ/(1 + δ) whenever it is one’s turn to offer; accept any offer with a share of δ/(1 + δ) or more whenever the other player is making an offer
▶ It is straightforward to check that such a pair of strategies satisfies OSD and hence constitutes a SPNE
▶ Rubinstein (1982): the above strategies constitute the unique SPNE of the infinite period bargaining game
Proof of uniqueness

Lemma
In no SPNE does 1 get more than 1/(1 + δ).
▶ If 1 gets more than 1/(1 + δ) when making an offer, 2 must be getting less than δ/(1 + δ).
▶ Player 2 can reject and get at least 1/(1 + δ) next period by playing the SPNE strategies.
▶ If 1 gets more than 1/(1 + δ) when 2 is making an offer, 2 can offer her slightly less, and 1 will accept the new offer since she can earn at most 1/(1 + δ) if she rejects and makes an offer herself.

Corollary
In no SPNE does 2 get less than δ/(1 + δ).
Proof of uniqueness contd.
Lemma
In no SPNE does 2 get more than δ/(1 + δ).
▶ If 2 gets more than δ/(1 + δ) when making an offer, 1 must be getting less than 1/(1 + δ).
▶ If 1 is getting less than δ/(1 + δ), she can reject and get at least 1/(1 + δ) next period by playing the SPNE strategies.
▶ If 1 gets more than δ/(1 + δ) when 2 is making an offer, 2 can offer her slightly less, and 1 will accept the new offer since she can earn at most 1/(1 + δ) if she rejects and makes an offer herself.
▶ If 2 gets more than δ/(1 + δ) when 1 is making an offer, 1 can offer him slightly less, and 2 will accept the new offer since he can earn at most 1/(1 + δ) if he rejects and makes an offer himself.

Corollary
In no SPNE does 1 get less than 1/(1 + δ).
