Assignment #3:
Group 15
Ajay Guru(201651005)|Heet Sankesara(201651018)|Mahima Arora(201651055)
January 31, 2019
Question 1
Suppose that an agent is situated in the 4x3 environment shown in the figure. Beginning in the
start state, it must choose an action at each time step. The interaction with the environment
terminates when the agent reaches one of the goal states, marked +1 or -1. We assume that
the environment is fully observable, so the agent always knows where it is. In every state the
agent may take one of four actions: Up, Down, Left, and Right. However, the environment is
stochastic, meaning that an action may not lead to the desired state. Each action achieves the
intended effect with probability 0.8; the rest of the time, the action moves the agent at right
angles to the intended direction, with equal probability (0.1) for each perpendicular direction.
Furthermore, if the agent bumps into a wall, it stays in the same square. The immediate reward
for moving to any nonterminal state s is r(s) = -0.04, and the rewards for moving to the terminal
states are +1 and -1 respectively. Find the value function corresponding to the optimal policy
using value iteration. Then find the value functions corresponding to the optimal policy for each
of the following reward settings: r(s) = -2, r(s) = 0.1, r(s) = 0.02, and r(s) = 1.
The optimal value function satisfies the Bellman optimality equation:
\[
V^*(s) = \max_a \sum_{s'} T(s, a, s')\left[R(s, a, s') + \gamma V^*(s')\right] \tag{1}
\]
where $T(s, a, s')$ is the probability of transitioning from state $s$ to state $s'$ under action $a$, $R(s, a, s')$ is the corresponding reward, and $\gamma$ is the discount factor.
Value iteration:
\[
V_0^*(s) = 0 \quad \forall s
\]
While $V_{n+1}^* \neq V_n^*$:
\[
\forall s: \quad V_{n+1}^*(s) = \max_a \sum_{s'} T(s, a, s')\left[R(s, a, s') + \gamma V_n^*(s')\right]
\]
Policy extraction:
\[
\pi^*(s) = \operatorname{argmax}_a \sum_{s'} T(s, a, s')\left[R(s, a, s') + \gamma V^*(s')\right] \tag{2}
\]
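Equation (2) translates directly into a greedy sweep over the converged values. The sketch below reuses the transitions() and ACTIONS helpers assumed above.

```python
def extract_policy(V, living_reward=-0.04, gamma=1.0):
    """Greedy policy extraction per equation (2)."""
    policy = {}
    for s in V:
        if s in TERMINALS:
            continue
        policy[s] = max(ACTIONS, key=lambda a: sum(
            p * (living_reward + gamma * V[s2]) for p, s2 in transitions(s, a)))
    return policy

# Example: V = value_iteration(living_reward=-2, gamma=0.9)
#          pi = extract_policy(V, living_reward=-2, gamma=0.9)
```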
[Figures: value functions under the optimal policy for (a) r(s) = -2, (b) r(s) = 0.1, (c) r(s) = 0.02, (d) r(s) = 1]
Question 2
[Gbike bicycle rental] You are managing two locations for Gbike. Each day, some number of
customers arrive at each location to rent bicycles. If you have a bike available, you rent it out
and earn INR 10 from Gbike. If you are out of bikes at that location, then the business is lost.
Bikes become available for renting the day after they are returned. To help ensure that bikes
are available where they are needed, you can move them between the two locations overnight,
at a cost of INR 2 per bike moved.
Assumptions: Assume that the numbers of bikes requested and returned at each location are
Poisson random variables. The expected numbers of rental requests are 3 and 4, and the expected
numbers of returns are 3 and 2, at the first and second locations respectively. No more than 20 bikes can be parked at
either of the locations. At most 5 bikes may be moved from one location to the other in a single
night. Consider the discount rate to be 0.9.
Formulate the continuing finite MDP, where the time steps are days, the state is the number of
bikes at each location at the end of the day, and the actions are the net numbers of bikes moved
between the two locations overnight.
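As a sketch of this formulation, the snippet below encodes the state bounds, the feasible actions, and the expected one-day reward using truncated Poisson probabilities. The truncation bound POISSON_MAX and all helper names are our own modeling choices, not part of the problem statement.

```python
from math import exp, factorial

MAX_BIKES, MAX_MOVE = 20, 5          # parking limit and overnight move limit
RENT_REWARD, MOVE_COST = 10, 2       # INR per rental / per bike moved
GAMMA = 0.9                          # discount rate given in the problem
REQ, RET = (3, 4), (3, 2)            # Poisson means: requests and returns per location
POISSON_MAX = 11                     # truncate Poisson tails beyond this count

def poisson(n, lam):
    """P(N = n) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** n / factorial(n)

def feasible_actions(state):
    """Net bikes moved from location 1 to 2 (negative = the other direction)."""
    return [a for a in range(-MAX_MOVE, MAX_MOVE + 1)
            if 0 <= state[0] - a <= MAX_BIKES and 0 <= state[1] + a <= MAX_BIKES]

def expected_reward(state, action):
    """Expected one-day reward: move cost tonight plus rental income tomorrow."""
    reward = -MOVE_COST * abs(action)
    after = (state[0] - action, state[1] + action)   # bikes on hand next morning
    for loc in (0, 1):
        for req in range(POISSON_MAX):
            reward += poisson(req, REQ[loc]) * RENT_REWARD * min(req, after[loc])
    return reward
```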
Question 3
Write a program for policy iteration and re-solve the Gbike bicycle rental problem with the following
changes. One of your employees at the first location rides a bus home each night and lives
near the second location. She is happy to shuttle one bike to the second location for free.
Each additional bike still costs INR 2, as do all bikes moved in the other direction. In addition,
you have limited parking space at each location. If more than 10 bikes are kept overnight at a
location (after any moving of bikes), then an additional cost of INR 4 must be incurred to use a
second parking lot (independent of how many bikes are kept there).
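One way to sketch this, reusing the helper names assumed in the Question 2 snippet (poisson, feasible_actions, REQ, RET, and the constants): fold the modified move and parking costs and the Poisson dynamics into a single Bellman backup, then alternate policy evaluation and improvement. The structure below is our own; the free shuttle and the INR 4 second-lot fee follow the problem statement. The nested Poisson loops are slow but direct; a real solution would precompute the marginals.

```python
def move_cost(state, action):
    """INR 2 per bike moved, with one free bike from location 1 to 2, plus an
    INR 4 fee for each location keeping more than 10 bikes overnight."""
    moved = max(action - 1, 0) if action > 0 else abs(action)  # first 1->2 bike is free
    after = (state[0] - action, state[1] + action)
    return MOVE_COST * moved + sum(4 for n in after if n > 10)

def expected_update(V, state, action):
    """One Bellman backup: expected reward plus discounted next-state value."""
    total = -move_cost(state, action)
    after = (state[0] - action, state[1] + action)
    for q1 in range(POISSON_MAX):
        for q2 in range(POISSON_MAX):
            p_req = poisson(q1, REQ[0]) * poisson(q2, REQ[1])
            rent1, rent2 = min(q1, after[0]), min(q2, after[1])
            income = RENT_REWARD * (rent1 + rent2)
            for r1 in range(POISSON_MAX):
                for r2 in range(POISSON_MAX):
                    p = p_req * poisson(r1, RET[0]) * poisson(r2, RET[1])
                    s2 = (min(after[0] - rent1 + r1, MAX_BIKES),
                          min(after[1] - rent2 + r2, MAX_BIKES))
                    total += p * (income + GAMMA * V[s2])
    return total

def policy_iteration(theta=1e-4):
    states = [(i, j) for i in range(MAX_BIKES + 1) for j in range(MAX_BIKES + 1)]
    V = {s: 0.0 for s in states}
    policy = {s: 0 for s in states}      # moving nothing is always feasible
    while True:
        while True:                      # policy evaluation
            delta = 0.0
            for s in states:
                v = expected_update(V, s, policy[s])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < theta:
                break
        stable = True                    # policy improvement
        for s in states:
            best = max(feasible_actions(s), key=lambda a: expected_update(V, s, a))
            if best != policy[s]:
                policy[s], stable = best, False
        if stable:
            return policy, V
```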
Optimal Policy: [figure]