Decision Making under Uncertainty
Abdul Quadir
XLRI
December 10, 2019
Introduction
I Recall that the division manager from the previous lecture has two
possible actions: go for R&D or maintain the status quo.
I Thus, A = {g , s}.
I To begin with, suppose there are only two outcomes:
successful with profit 10 million, obsolete with 0 profit.
I Thus, X = {0, 10}.
I Observe that the outcome of R&D is now uncertain.
I We therefore introduce random (stochastic) outcomes.
I This means that outcomes follow some probability distribution.
Lotteries
I Assume that a successful product line is more likely to be
created if the manager goes for R&D.
I Suppose the probability of a successful product line is 0.75 and
of an unsuccessful one is 0.25 if the manager chooses g.
I If the manager chooses s, then it is 50-50 between success and failure.
I This could be depicted in the following decision tree:
Decision Maker
  g -> N
         0.75 -> 10
         0.25 -> 0
  s -> N
         0.5 -> 10
         0.5 -> 0
Lotteries
I Think of the decision maker as choosing between lotteries.
I A lottery is defined by its random payoff.
I For instance, choosing g is like choosing a lottery that pays 0
with probability 0.25 and pays 10 million with probability 0.75.
I We can introduce a neutral player called ‘Nature’ which
chooses the probabilities for the decision maker.
I Consider a decision problem with n possible outcomes,
$X = \{x_1, x_2, \ldots, x_n\}$.
I A simple lottery over outcomes X is defined as a probability
distribution $p = (p(x_1), p(x_2), \ldots, p(x_n))$, where $p(x_k) \ge 0$
and $\sum_{k=1}^{n} p(x_k) = 1$.
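The definition of a simple lottery can be checked mechanically. The following is a minimal sketch, assuming a lottery is stored as a Python dictionary from outcomes to probabilities; the function name `is_simple_lottery` is hypothetical and chosen only for illustration.

```python
import math

def is_simple_lottery(p):
    """Return True if every probability is >= 0 and the probabilities sum to 1."""
    probs = list(p.values())
    return all(q >= 0 for q in probs) and math.isclose(sum(probs), 1.0)

# Lottery that follows action g in the R&D example (payoffs in millions).
lottery_g = {10: 0.75, 0: 0.25}

# Not a lottery: the probabilities sum to 1.1.
not_a_lottery = {10: 0.75, 0: 0.35}

print(is_simple_lottery(lottery_g))
print(is_simple_lottery(not_a_lottery))
```

The tolerance-based comparison (`math.isclose`) is a design choice to avoid rejecting valid lotteries because of floating-point rounding.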
Example: Lotteries
I For the R&D example, Nature chooses the lottery p(10) = 0.75
and p(0) = 0.25 after the choice of g.
I Similarly, Nature chooses the lottery p(10) = 0.5 and
p(0) = 0.5 after the choice of s.
I Therefore, the lottery that Nature chooses is conditional on
the action taken by the DM.
I Thus, the conditional probability that $x_k \in X$ occurs is given
by $p(x_k \mid a)$, where $p(x_k \mid a) \ge 0$ and
$\sum_{k=1}^{n} p(x_k \mid a) = 1$ for all $a \in A$.
I If every action is associated with a single outcome, then we
can assign probability 1 to that outcome.
I This is known as a degenerate lottery.
How to Evaluate Random Outcomes?
I In the R&D example, which action would the manager choose?
I Observe that it is very easy to compare the lotteries that
follow g and s.
I Since both have the same set of outcomes, the choice of g yields
a higher chance of getting the payoff 10.
I Let us tweak the outcome set a little by assuming there is a
fixed cost of 1 to undertake R&D.
Decision Maker
  g -> N
         0.75 -> 9
         0.25 -> −1
  s -> N
         0.5 -> 10
         0.5 -> 0
Expected Payoff
I We need a methodology to compare lotteries.
I The good news is that such a methodology already exists.
I It is known as expected utility theory.
I It was developed by John von Neumann and Oskar
Morgenstern in 1944.
I The details of their theory are beyond the scope of this course.
I The intuition rests on the idea of averages.
I If, on average, the lottery we choose provides a higher
payoff, then we view it as the better choice.
I Technically, let $u(x)$ be the player's payoff function over
outcomes in $X = \{x_1, x_2, \ldots, x_n\}$, and let $p = (p_1, p_2, \ldots, p_n)$ be a
lottery over X such that $p_k = \Pr[x = x_k]$. Then we define the
player's expected payoff from the lottery p as
$$E[u(x) \mid p] = \sum_{k=1}^{n} p_k u(x_k) = p_1 u(x_1) + p_2 u(x_2) + \cdots + p_n u(x_n).$$
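The expected payoff formula translates directly into a short function. This is a sketch under the assumption that a lottery is a dictionary from outcomes to probabilities and that the payoff function defaults to u(x) = x; the name `expected_payoff` is illustrative only.

```python
def expected_payoff(lottery, u=lambda x: x):
    """Sum of p_k * u(x_k) over the outcomes of a lottery {outcome: probability}."""
    return sum(prob * u(outcome) for outcome, prob in lottery.items())

# Original R&D example (no fixed cost), payoffs in millions, with u(x) = x.
lottery_g = {10: 0.75, 0: 0.25}
lottery_s = {10: 0.5, 0: 0.5}

print(expected_payoff(lottery_g))  # 7.5
print(expected_payoff(lottery_s))  # 5.0
```

With identical outcome sets, the higher probability of the payoff 10 under g shows up as a higher expected payoff.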
Example
I What is the expected payoff of each action in the modified R&D
example?
v(g) = E[u(x)|g] = 0.75 × 9 + 0.25 × (−1) = 6.5
v(s) = E[u(x)|s] = 0.5 × 10 + 0.5 × 0 = 5
I Warning: the payoff is no longer an ordinal notion; the magnitudes of u(x) now matter.
I For our earlier example, we know that 10 ≻ 9 ≻ 0 ≻ −1, where
u(x) = x.
I However, consider the following payoff function, which preserves the same ranking:
u(10) = 10, u(9) = 9, u(0) = 0 and u(−1) = −8.
I Thus, we get
v(g) = E[u(x)|g] = 0.75 × 9 + 0.25 × (−8) = 4.75.
I Since v(s) = 5 is unchanged, the manager now prefers s, even though the ranking of outcomes is the same.
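The warning about cardinal payoffs can be reproduced numerically. The sketch below assumes lotteries and payoff functions are stored as dictionaries; the names `u_identity` and `u_alt` are hypothetical labels for the two payoff functions in the example.

```python
def expected_payoff(lottery, u):
    """Expected payoff of a lottery {outcome: probability} under payoff table u."""
    return sum(prob * u[outcome] for outcome, prob in lottery.items())

# Modified R&D example: a fixed cost of 1 is paid when choosing g.
lottery_g = {9: 0.75, -1: 0.25}
lottery_s = {10: 0.5, 0: 0.5}

# Two payoff functions with the same ranking 10 > 9 > 0 > -1.
u_identity = {10: 10, 9: 9, 0: 0, -1: -1}
u_alt = {10: 10, 9: 9, 0: 0, -1: -8}

for name, u in [("u(x) = x", u_identity), ("u(-1) = -8", u_alt)]:
    v_g = expected_payoff(lottery_g, u)
    v_s = expected_payoff(lottery_s, u)
    better = "g" if v_g > v_s else "s"
    print(f"{name}: v(g) = {v_g}, v(s) = {v_s}, preferred action: {better}")
```

The ranking of outcomes is identical in both cases, yet the preferred action switches from g to s, which is why expected payoffs cannot be treated as purely ordinal.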
Rational Decision Making under Uncertainty
I We require that the DM know the probability of each
outcome conditional on the action he or she has taken.
I The DM then maximizes the expected payoff over his or her
actions.
I Technically, a∗ ∈ A is chosen if and only if
v(a∗) = E[u(x)|a∗] ≥ E[u(x)|a] = v(a) for all a ∈ A.
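The rationality condition amounts to picking the actions with the largest expected payoff. A minimal sketch, assuming each action maps to its conditional lottery and u(x) = x by default; `best_actions` is a hypothetical helper name.

```python
def best_actions(lotteries, u=lambda x: x):
    """Return (optimal actions, expected payoff of every action).

    lotteries maps each action a to its conditional lottery {x_k: p(x_k | a)}.
    """
    values = {
        action: sum(prob * u(outcome) for outcome, prob in lottery.items())
        for action, lottery in lotteries.items()
    }
    top = max(values.values())
    # Several actions may tie; all of them satisfy the optimality condition.
    optimal = [action for action, value in values.items() if value == top]
    return optimal, values

# Modified R&D example with u(x) = x.
lotteries = {
    "g": {9: 0.75, -1: 0.25},
    "s": {10: 0.5, 0: 0.5},
}
optimal, values = best_actions(lotteries)
print(optimal, values)
```

Returning a list rather than a single action reflects the weak inequality in the definition: every maximizer is a rational choice.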
Example: Getting MBA Degree
I Suppose you are working at some firm.
I You are now contemplating getting an MBA degree from XLRI.
I The cost of getting an MBA degree is 10 lacs rupees (including
opportunity cost).
I Your future value is the stream of income you get over the next
decade.
I This income depends on the state of the labor
market.
I The income values from the MBA and from your current status are given below for
each state of the labor market:
MBA    Current Status    State
32     12                strong
16     8                 average
12     4                 weak
Example: Getting MBA Degree
I Suppose that after extensive research you are convinced that the
probabilities of a strong, average and weak labor market are
0.25, 0.5, and 0.25, respectively.
I Given this information, what would you decide?
I Let us draw decision tree:
You
  Get MBA -> N
      0.25 (strong)  -> 32 − 10
      0.5  (average) -> 16 − 10
      0.25 (weak)    -> 12 − 10
  Don't -> N
      0.25 (strong)  -> 12
      0.5  (average) -> 8
      0.25 (weak)    -> 4
Example: Getting MBA Degree
v (Get MBA) = 0.25 × 22 + 0.5 × 6 + 0.25 × 2 = 9
v (Don’t Get) = 0.25 × 12 + 0.5 × 8 + 4 × 0.25 = 8
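These two values can be recomputed from the income table and the state probabilities. The following sketch assumes the table is stored as nested dictionaries and that the cost of 10 is subtracted only when the MBA is taken; the variable names are illustrative.

```python
# Probabilities of each labor market state.
probs = {"strong": 0.25, "average": 0.5, "weak": 0.25}

# Income over the next decade for each action and state (same units as the cost).
income = {
    "Get MBA": {"strong": 32, "average": 16, "weak": 12},
    "Don't Get": {"strong": 12, "average": 8, "weak": 4},
}
cost = {"Get MBA": 10, "Don't Get": 0}

# Expected payoff of each action: sum over states of p(state) * (income - cost).
v = {
    action: sum(probs[state] * (income[action][state] - cost[action]) for state in probs)
    for action in income
}
print(v)  # {'Get MBA': 9.0, "Don't Get": 8.0}
```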
Application: Value of Information
I For our MBA example, we have decided to go for the MBA.
I Now suppose an all-knowing oracle comes to you just
before you are about to resign from your job and go for the MBA.
I The oracle says, "I know what the state of the labor market will
be, and I can tell you for a price".
I What would you do in this situation?
I You will try to ask the following two questions:
I Is this information valuable?
I How much is it worth?
I The answer to the first question is quite easy.
I The information is valuable if it may cause you to change
the decision that you would have taken without it.
Application: Value of Information
I To answer this, look at the decision problem:
I If you learn that the labor market is strong, then you will not
change your decision.
I But if it is weak or average, you will not go for the MBA, since 4 > 2 and 8 > 6.
I Thus, it is quite clear that the oracle's information is
valuable.
I To answer the second question, you compare the
expected payoff that you get without the oracle's information
and the expected payoff with the oracle's information.
I We know that the expected payoff of getting the MBA degree is 9
without seeking the oracle's advice.
I What is the expected payoff in anticipation of receiving the
oracle's advice?
I The correct way to compute it is to find your best action in each
state of the labor market, as if you had received the oracle's advice,
and then take the expectation over the states.
Application: Value of Information
I This decision problem is depicted in the following decision
tree:
N
  0.25 (strong)  -> You: Get MBA -> 22;  Don't Get -> 12
  0.5  (average) -> You: Get MBA -> 6;   Don't Get -> 8
  0.25 (weak)    -> You: Get MBA -> 2;   Don't Get -> 4
Application: Value of Information
I Thus, the expected payoff before hearing the oracle's advice,
with the intention of using it, is
E[u] = 0.25 × 22 + 0.5 × 8 + 0.25 × 4 = 10.5
I Therefore, the increment in your expected payoff because of
the oracle's advice is 10.5 − 9 = 1.5.
I Hence, you would be willing to pay up to 1.5 for the oracle's advice.
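The value of the oracle's information is the gap between choosing after the state is known and choosing before. A minimal sketch, assuming net payoffs (income minus the MBA cost) are stored by state and action; the variable names are illustrative.

```python
probs = {"strong": 0.25, "average": 0.5, "weak": 0.25}

# Net payoffs by state and action (the MBA cost of 10 is already subtracted).
payoff = {
    "strong": {"Get MBA": 22, "Don't Get": 12},
    "average": {"Get MBA": 6, "Don't Get": 8},
    "weak": {"Get MBA": 2, "Don't Get": 4},
}
actions = ["Get MBA", "Don't Get"]

# Without the oracle: commit to one action, then take the expectation over states.
v_without = max(sum(probs[s] * payoff[s][a] for s in probs) for a in actions)

# With the oracle: pick the best action in each state, then take the expectation.
v_with = sum(probs[s] * max(payoff[s].values()) for s in probs)

value_of_information = v_with - v_without
print(v_without, v_with, value_of_information)  # 9.0 10.5 1.5
```

The only difference between the two computations is the order of the maximum and the expectation, which is why information can never have negative value in this setting.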
Strategic Decision Making
Reading: Chapter 3 Dixit and Skeath
Key Elements of Games
I Players: Who is interacting?
I Strategies: What are their options?
I Payoffs: What are their incentives?
I Information: What do they know?
I Rationality: How do they think?
Games with Sequential Moves
I There are many games where players move in sequence.
I Consider the following game of removing coins:
I There are 21 coins.
I Two players move sequentially and remove 1, 2 or 3 coins.
I They have to remove at least one coin.
I The winner is the player who removes the last coin(s).
I Analyze it from the end back to the beginning.
I If there are 1, 2 or 3 coins left, the player who moves next will
win.
I Suppose there are 4 coins. Then the player who moves next
must leave either 1, 2 or 3 coins, and his opponent will win.
I So being left with 4 coins is a loss for the player who moves next.
I Thus, with 5, 6 or 7 coins left, the player who moves next will
win by leaving 4 coins.
Games with Sequential Moves
I With 8 coins, the player who moves next must leave 5, 6, or 7 coins for the
opponent.
I Thus, the player who left 8 coins will win.
I Thus, the positions with 0, 4, 8, 12, 16, . . . coins are target
positions: each player wants to leave the opponent with such a number of coins.
I Since 21 is not divisible by 4, the player who moves first will
win, starting by removing 1 coin to leave 20.
I In sequential games, we use a backward induction
argument to solve the problem.
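The backward induction argument for the coin game can be sketched as a small table-filling loop. This assumes the rules stated above (remove 1 to 3 coins, whoever takes the last coin wins); the function name `winning_positions` is hypothetical.

```python
def winning_positions(total, max_take=3):
    """wins[n] is True if the player about to move with n coins left can force a win."""
    # wins[0] is False: with no coins left, the previous player took the last coin.
    wins = [False] * (total + 1)
    for n in range(1, total + 1):
        # The mover wins if some legal move leaves the opponent in a losing position.
        wins[n] = any(not wins[n - k] for k in range(1, min(max_take, n) + 1))
    return wins

wins = winning_positions(21)

# Target positions: numbers of coins you want to leave for the opponent.
targets = [n for n, mover_wins in enumerate(wins) if not mover_wins]
print(targets)  # [0, 4, 8, 12, 16, 20]

# Winning first move with 21 coins: leave the opponent on a target position.
first_move = next(k for k in (1, 2, 3) if not wins[21 - k])
print(first_move)  # 1
```

The loop works from small piles up to 21, which mirrors reasoning from the end of the game back to its beginning.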
Entry Game
I Google Pay is contemplating entering the online banking market,
and Paytm can either fight the entry or accommodate it.
G (Google Pay)
  Out -> (0, 20)
  In  -> P (Paytm)
           F -> (−5, 0)
           A -> (10, 10)
Payoffs are listed as (Google Pay, Paytm).
Game Trees
Game trees consist of
I nodes
I initial nodes
I decision nodes
I terminal nodes
I branches
I player labels
I action labels
I payoffs
I information set (to be seen later).
A pure strategy of a player specifies an action choice at each
decision node of the player.
Example
G (Google Pay)
  Out -> (0, 20)
  In  -> P (Paytm)
           F -> (−5, 0)
           A -> (10, 10)
I Google Pay's strategies are {In, Out}.
I Paytm's strategies are {A, F}.
Example: Strategies
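Player 1 (first node):  S -> (1, 0);  C -> Player 2
Player 2:               s -> (0, 2);  c -> Player 1 (second node)
Player 1 (second node): S -> (3, 1);  C -> (2, 4)
Payoffs are listed as (Player 1, Player 2).
I The strategies for Player 1 are {SS, SC, CS, CC}.
I The strategies for Player 2 are {c, s}.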
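Since a pure strategy picks one action at each of a player's decision nodes, the strategy sets can be enumerated as a Cartesian product. A minimal sketch, assuming the actions at each node are given as lists; `pure_strategies` is a hypothetical helper name.

```python
from itertools import product

def pure_strategies(actions_at_nodes):
    """One action per decision node, listed in the order of the player's nodes."""
    return ["".join(choice) for choice in product(*actions_at_nodes)]

# Player 1 moves at two decision nodes, each with actions S and C.
player1 = pure_strategies([["S", "C"], ["S", "C"]])
# Player 2 moves at one decision node with actions c and s.
player2 = pure_strategies([["c", "s"]])

print(player1)  # ['SS', 'SC', 'CS', 'CC']
print(player2)  # ['c', 's']
```

Note that a strategy such as SC specifies an action at Player 1's second node even though choosing S at the first node ends the game before that node is reached.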
How to Solve Sequential Games?
I What should Paytm do if Google Pay enters?
I Given what it knows about Paytm's response to entry, what
should Google Pay do?
I This solution is known as backward induction equilibrium,
rollback equilibrium, or subgame perfect equilibrium.
I In a backward induction equilibrium, each player plays optimally
at every decision node in the game tree.
I If Google Pay enters, Paytm gets 10 from A and 0 from F, so it accommodates;
anticipating this, Google Pay gets 10 from In and 0 from Out, so it enters.
I (In, A) is the unique backward induction equilibrium of the
entry game.
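Rollback on a finite game tree can be sketched as a short recursion. The tree encoding below (dictionaries with the keys `name`, `player`, `moves` and `payoffs`) and the function name `rollback` are assumptions made for illustration, not a standard format; payoffs are ordered as (Google Pay, Paytm).

```python
def rollback(node):
    """Return (payoffs reached under optimal play, {decision node name: chosen action})."""
    if "payoffs" in node:  # terminal node
        return node["payoffs"], {}
    plan = {}
    best_action, best_payoffs = None, None
    for action, child in node["moves"].items():
        payoffs, sub_plan = rollback(child)
        plan.update(sub_plan)  # keep optimal choices at every later node
        # The mover compares only his or her own payoff; ties keep the first action.
        if best_payoffs is None or payoffs[node["player"]] > best_payoffs[node["player"]]:
            best_action, best_payoffs = action, payoffs
    plan[node["name"]] = best_action
    return best_payoffs, plan

# Entry game: player 0 is Google Pay (G), player 1 is Paytm (P).
entry_game = {
    "name": "G", "player": 0,
    "moves": {
        "Out": {"payoffs": (0, 20)},
        "In": {
            "name": "P", "player": 1,
            "moves": {"F": {"payoffs": (-5, 0)}, "A": {"payoffs": (10, 10)}},
        },
    },
}

payoffs, plan = rollback(entry_game)
print(payoffs, plan)  # (10, 10) {'P': 'A', 'G': 'In'}
```

The plan records a choice at every decision node, including nodes that would not be reached, which matches the requirement that each player plays optimally at every node.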
Example: Backward Induction Equilibrium
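Player 1 (first node):  S -> (1, 0);  C -> Player 2
Player 2:               s -> (0, 2);  c -> Player 1 (second node)
Player 1 (second node): S -> (3, 1);  C -> (2, 4)
Payoffs are listed as (Player 1, Player 2).
The unique backward induction equilibrium is (SS, s).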
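The rollback behind this result can be written out node by node, starting from the last decision node. The steps below assume payoffs are ordered as (Player 1, Player 2), with $u_1$ and $u_2$ denoting the two players' payoffs.

```latex
\begin{align*}
\text{Player 1, second node:}\quad & u_1(S) = 3 > 2 = u_1(C) && \Rightarrow\ \text{choose } S,\ \text{reaching } (3, 1) \\
\text{Player 2:}\quad & u_2(s) = 2 > 1 = u_2(c) && \Rightarrow\ \text{choose } s,\ \text{reaching } (0, 2) \\
\text{Player 1, first node:}\quad & u_1(S) = 1 > 0 = u_1(C) && \Rightarrow\ \text{choose } S,\ \text{reaching } (1, 0)
\end{align*}
```

Player 1 therefore plays S at both of his nodes and Player 2 plays s, so the game ends immediately with payoffs (1, 0), even though both players would do better at (2, 4).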
Power of Commitment
I Remember that (In, A) is the unique backward induction
equilibrium of the entry game. Paytm's payoff is 10.
I Suppose Paytm commits to fight if entry occurs.
I What would Google Pay do?
I Entering would now yield −5, so Google Pay would stay out, and Paytm would be better off with a payoff of 20 instead of 10.
I Is this commitment credible?