0% found this document useful (0 votes)
25 views20 pages

ML 5

re

Uploaded by

Sathish Koppoju
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

REINFORCEMENT LEARNING


What is Reinforcement Learning?
• Reinforcement learning is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and seeing the results of those actions. For each good action the agent gets positive feedback, and for each bad action it gets negative feedback or a penalty.
• In reinforcement learning, the agent learns automatically from feedback without any labeled data, unlike supervised learning.
• Since there is no labeled data, the agent is bound to learn from its own experience only.
• RL solves a specific type of problem where decision making is sequential and the goal is long-term, such as game playing, robotics, etc.


• The agent interacts with the environment and explores it by itself. The primary goal of an agent in reinforcement learning is to improve its performance by getting the maximum positive rewards.
• The agent learns through a process of hit and trial, and based on its experience it learns to perform the task in a better way. Hence, we can say that reinforcement learning is a type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within it. How a robotic dog learns the movement of its arms is an example of reinforcement learning.
• It is a core part of artificial intelligence, and many AI agents work on the concept of reinforcement learning. Here we do not need to pre-program the agent, as it learns from its own experience without any human intervention.
Example: Suppose there is an AI agent present within a maze environment, and its goal is to find the diamond. The agent interacts with the environment by performing some actions, and based on those actions the state of the agent changes, and it also receives a reward or penalty as feedback.
• The agent continues doing these three things (take an action, change state or remain in the same state, and get feedback), and by doing these actions it learns and explores the environment.
• The agent learns which actions lead to positive feedback or rewards and which actions lead to negative feedback or a penalty. As a positive reward the agent gets a positive point, and as a penalty it gets a negative point.
Terms used in Reinforcement Learning:
• Agent: An entity that can perceive and explore the environment and act upon it.
• Environment: The situation in which an agent is present or by which it is surrounded. In RL, we assume a stochastic environment, which means it is random in nature.
• Action: Actions are the moves taken by an agent within the environment.
• State: A state is the situation returned by the environment after each action taken by the agent.
• Reward: Feedback returned to the agent from the environment to evaluate the action of the agent.
• Policy: A policy is the strategy applied by the agent to choose the next action based on the current state.
• Value: The expected long-term return with the discount factor, as opposed to the short-term reward.
• Q-value: Mostly similar to the value, but it takes one additional parameter, the current action (a).
Key features of Reinforcement Learning:
• In RL, the agent is not instructed about the environment or about which actions need to be taken.
• It is based on the hit and trial process.
• The agent takes the next action and changes state according to the feedback from the previous action.
• The agent may get a delayed reward.
• The environment is stochastic, and the agent needs to explore it in order to get the maximum positive rewards.
Approaches to implement Reinforcement Learning:
There are mainly three ways to implement reinforcement learning in ML, which are:

1. Value-based:
The value-based approach is about finding the optimal value function, which is the maximum value at a state under any policy. The agent therefore expects the long-term return at any state s under policy π.

2. Policy-based:
The policy-based approach is to find the optimal policy for the maximum future rewards without using the value function. In this approach, the agent tries to apply such a policy that the action performed in each step helps to maximize the future reward. The policy-based approach has mainly two types of policy:
• Deterministic: The same action is produced by the policy (π) at any given state.
• Stochastic: In this policy, probability determines the produced action.

3. Model-based:
In the model-based approach, a virtual model is created for the environment, and the agent explores that environment to learn it. There is no particular solution or algorithm for this approach because the model representation is different for each environment.
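The two policy types listed above can be sketched in a few lines of Python. This is only an illustration: the state names, the lookup table and the action probabilities are made up for the example and do not come from the notes.

```python
import random

ACTIONS = ["up", "down", "left", "right"]


def deterministic_policy(state):
    """Return the same action every time a given state is seen."""
    # Hypothetical lookup table: state -> action.
    table = {"s0": "up", "s1": "right"}
    return table.get(state, "up")


def stochastic_policy(state, rng):
    """Sample an action from a probability distribution over actions."""
    # Hypothetical probabilities; a real policy would depend on `state`.
    probs = {"up": 0.7, "down": 0.1, "left": 0.1, "right": 0.1}
    weights = [probs[a] for a in ACTIONS]
    return rng.choices(ACTIONS, weights=weights, k=1)[0]
```

Calling `deterministic_policy("s0")` repeatedly always gives the same action, while `stochastic_policy("s0", random.Random())` may give a different action on each call.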
Elements of Reinforcement Learning:
There are four main elements of reinforcement learning, which are given below:
1. Policy
2. Reward signal
3. Value function
4. Model of the environment

2) Reward signal: The goal of reinforcement learning is defined by the reward signal. At each state, the environment sends an immediate signal to the learning agent, and this signal is known as a reward signal. These rewards are given according to the good and bad actions taken by the agent. The agent's main objective is to maximize the total reward it receives. The reward signal can change the policy: if an action selected by the agent leads to a low reward, then the policy may change to select other actions in the future.

3) Value function: The value function gives information about how good the situation and action are and how much reward an agent can expect. A reward indicates the immediate signal for each good and bad action, whereas a value function specifies the good state and action for the future. The value function depends on the reward because, without reward, there could be no value. The goal of estimating values is to achieve more rewards.

4) Model: The last element of reinforcement learning is the model, which mimics the behaviour of the environment. With the help of the model, one can make inferences about how the environment will behave. For example, if a state and an action are given, then a model can predict the next state and reward.
The model is used for planning, which means it provides a way to take a course of action by considering all future situations before actually experiencing those situations. The approaches for solving RL problems with the help of the model are termed the model-based approach. In comparison, an approach that does not use a model is called a model-free approach.
How does Reinforcement Learning work?
To understand the working process of RL, we need to consider two main things:
• Environment: It can be anything, such as a room, a maze, a football ground, etc.
• Agent: An intelligent agent, such as an AI robot.
Let's take an example of a maze environment that the agent needs to explore. Consider the below image:

S1          S2          S3          S4 (diamond)
S5          S6 (wall)   S7          S8 (fire pit)
S9 (robot)  S10         S11         S12

In the above image, the agent is at the very first block of the maze. The maze consists of an S6 block, which is a wall, S8, a fire pit, and S4, a diamond block.
The agent cannot cross the S6 block, as it is a solid wall. If the agent reaches the S4 block, it gets the +1 reward; if it reaches the fire pit, it gets the -1 reward point. It can take four actions: move up, move down, move left, and move right.
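A minimal sketch of this maze as code may make the rules concrete. It assumes the layout read off the figure (robot starting at S9), that bumping into the wall or the edge of the grid leaves the agent in place with zero reward, and that reaching the diamond or the fire pit ends the episode; the function names `step` and `run_path` are illustrative only.

```python
# Maze layout from the notes: S6 is a wall, S8 the fire pit, S4 the diamond.
GRID = [
    ["S1", "S2", "S3", "S4"],
    ["S5", "S6", "S7", "S8"],
    ["S9", "S10", "S11", "S12"],
]
WALL, FIRE, DIAMOND, START = "S6", "S8", "S4", "S9"
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
POS = {name: (r, c) for r, row in enumerate(GRID) for c, name in enumerate(row)}


def step(state, action):
    """Return (next_state, reward, done) for one move in the maze."""
    r, c = POS[state]
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    # Assumption: leaving the grid or hitting the wall keeps the agent in place.
    if not (0 <= nr < len(GRID) and 0 <= nc < len(GRID[0])) or GRID[nr][nc] == WALL:
        return state, 0, False
    nxt = GRID[nr][nc]
    if nxt == DIAMOND:
        return nxt, 1, True
    if nxt == FIRE:
        return nxt, -1, True
    return nxt, 0, False


def run_path(actions, start=START):
    """Follow a list of actions and return (final_state, total_reward)."""
    state, total = start, 0
    for action in actions:
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return state, total
```

For instance, `run_path(["up", "up", "right", "right", "right"])` walks from S9 along the left and top edges to the diamond.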
The agent can take any path to reach the final point, but it needs to do so in the fewest possible steps. Suppose the agent considers the path S9-S5-S1-S2-S3; it will then get the +1 reward point.
The agent will try to remember the preceding steps that it has taken to reach the final step. To memorize the steps, it assigns the value 1 to each previous step. Consider the below step:

V=1 (S1)    V=1 (S2)    V=1 (S3)    S4 (diamond)
V=1 (S5)    S6 (wall)   S7          S8 (fire pit)
V=1 (S9)    S10         S11         S12

Now the agent has successfully stored the previous steps by assigning the value 1 to each previous block. But what will the agent do if it starts moving from a block that has a value-1 block on both sides? Consider the below diagram:

[Diagram: the same maze with V = 1 on the blocks of the stored path and the robot placed between two blocks of equal value]
It will be a difficult condition for the agent to decide whether it should go up or down, as each block has the same value. So the above approach is not suitable for the agent to reach the destination. Hence, to solve the problem, we will use the Bellman equation, which is the main concept behind reinforcement learning.


The Bellman Equation:
The Bellman equation was introduced by the mathematician Richard Ernest Bellman in the 1950s, and hence it is called the Bellman equation. It is associated with dynamic programming and is used to calculate the value of a decision problem at a certain point by including the values of other states.
It is a way of calculating the value functions in dynamic programming, and it leads to modern reinforcement learning.


The key elements used in the Bellman equation are:
• The action performed by the agent is referred to as "a".
• The state reached by performing the action is "s".
• The reward/feedback obtained for each good and bad action is "R".
• The discount factor is gamma, "γ".

The Bellman equation can be written as:

V(s) = max [R(s, a) + γV(s')]

Where,
V(s) = value calculated at a particular point.
R(s, a) = reward at a particular state s for performing an action a.
γ = discount factor.
V(s') = the value of the state s' reached from s. When the values are filled in backward from the goal, this is the state whose value was calculated in the previous step.

In the above equation, we take the max over all the candidate values because the agent always tries to find the optimal solution.
So now, using the Bellman equation, we will find the value at each state of the given environment. We start from the block that is next to the target block.

For the 1st block:
V(s3) = max [R(s, a) + γV(s')]; here V(s') = 0 because there is no further state to move to.
V(s3) = max [R(s, a)] => V(s3) = max [1] => V(s3) = 1.

For the 2nd block:
V(s2) = max [R(s, a) + γV(s')]; here γ = 0.9 (let's say), V(s') = 1, and R(s, a) = 0 because there is no reward at this state.
V(s2) = max [0.9(1)] => V(s2) = max [0.9] => V(s2) = 0.9.

For the 3rd block:
V(s1) = max [R(s, a) + γV(s')]; here γ = 0.9, V(s') = 0.9, and R(s, a) = 0 because there is no reward at this state either.
V(s1) = max [0.9(0.9)] => V(s1) = max [0.81] => V(s1) = 0.81.

For the 4th block:
V(s5) = max [R(s, a) + γV(s')]; here γ = 0.9, V(s') = 0.81, and R(s, a) = 0 because there is no reward at this state either.
V(s5) = max [0.9(0.81)] => V(s5) = max [0.73] => V(s5) = 0.73.

For the 5th block:
V(s9) = max [R(s, a) + γV(s')]; here γ = 0.9, V(s') = 0.73, and R(s, a) = 0 because there is no reward at this state either.
V(s9) = max [0.9(0.73)] => V(s9) = max [0.66] => V(s9) = 0.66.
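The five calculations above follow one pattern, so they can be reproduced with a short loop. The sketch below is an illustration under the notes' own assumptions (γ = 0.9, a reward of 1 only for the step onto the diamond); because only a single path is considered, the max over actions reduces to one term.

```python
GAMMA = 0.9  # discount factor used in the notes

# Path walked backward from the goal, as in the worked example.
path = ["s3", "s2", "s1", "s5", "s9"]

values = {}
next_value = 0.0  # V(s') for the diamond: no further state to move to
reward = 1.0      # reward for the step from s3 onto the diamond
for state in path:
    # Bellman backup with a single candidate action.
    values[state] = reward + GAMMA * next_value
    next_value = values[state]
    reward = 0.0  # no reward at the remaining states

for state, value in values.items():
    print(state, round(value, 2))
```

Rounded to two decimals, the loop gives the same numbers as the hand calculation: 1, 0.9, 0.81, 0.73 and 0.66.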

Consider the below image:

V=0.81 (S1)   V=0.9 (S2)   V=1 (S3)   S4 (diamond)
V=0.73 (S5)   S6 (wall)    S7         S8 (fire pit)
V=0.66 (S9)   S10          S11        S12

Now consider the block next to the fire pit (S7). From here the agent has three options to move: if it moves to the wall block, it will feel a bump; if it moves to the fire pit, it will get the -1 reward. But here we are taking only positive rewards, so it will move upwards only. The complete block values will be calculated using this formula. Consider the below image:

V=0.81 (S1)   V=0.9 (S2)    V=1 (S3)     S4 (diamond)
V=0.73 (S5)   S6 (wall)     V=0.9 (S7)   S8 (fire pit)
V=0.66 (S9)   V=0.73 (S10)  V=0.81 (S11) V=0.73 (S12)
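The complete table above can be reproduced by repeatedly applying the Bellman equation to every block until the values stop changing (value iteration). The sketch below is an illustration, not the notes' own procedure: it assumes deterministic moves, a reward of +1 for stepping onto the diamond and -1 for stepping onto the fire pit, and it simply skips moves into the wall or off the grid.

```python
GAMMA = 0.9  # discount factor used in the notes

GRID = [
    ["S1", "S2", "S3", "S4"],
    ["S5", "S6", "S7", "S8"],
    ["S9", "S10", "S11", "S12"],
]
WALL, FIRE, DIAMOND = "S6", "S8", "S4"
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def entry_reward(state):
    """Reward for stepping onto a block."""
    if state == DIAMOND:
        return 1.0
    if state == FIRE:
        return -1.0
    return 0.0


def neighbors(r, c):
    """Blocks reachable in one move; wall and off-grid moves are skipped."""
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != WALL:
            yield GRID[nr][nc]


def value_iteration(sweeps=50):
    """Apply V(s) = max[R + gamma * V(s')] to every block, `sweeps` times."""
    V = {name: 0.0 for row in GRID for name in row}
    for _ in range(sweeps):
        for r, row in enumerate(GRID):
            for c, s in enumerate(row):
                if s in (WALL, FIRE, DIAMOND):
                    continue  # blocked or terminal: value stays 0
                V[s] = max(entry_reward(n) + GAMMA * V[n] for n in neighbors(r, c))
    return V


values = value_iteration()
```

Rounding `values` to two decimals gives the same numbers as the completed grid, for example 0.9 for S7 and 0.73 for S12.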
Markov Decision Process:
A Markov Decision Process, or MDP, is used to formalize reinforcement learning problems. If the environment is completely observable, then its dynamics can be modeled as a Markov process. In an MDP, the agent constantly interacts with the environment and performs actions; at each action, the environment responds and generates a new state.

MDP is used to describe the environment for RL, and almost all RL problems can be formalized using an MDP.

An MDP contains a tuple of four elements (S, A, Pa, Ra):
• A set of finite states S
• A set of finite actions A
• Rewards Ra received after transitioning from state s to state s' due to action a
• Probability Pa of transitioning from state s to state s' under action a
MDP uses the Markov property: the next state depends only on the current state and action, not on the earlier history.
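One possible way to hold the four elements of the tuple in code is sketched below. The class layout, the field names and the two-state toy example are assumptions made for illustration; the notes only define the tuple (S, A, Pa, Ra).

```python
import random
from dataclasses import dataclass


@dataclass
class MDP:
    states: list      # S: finite set of states
    actions: list     # A: finite set of actions
    transition: dict  # Pa: (s, a) -> {s_next: probability}
    reward: dict      # Ra: (s, a, s_next) -> reward

    def sample_next(self, s, a, rng):
        """Draw the next state from Pa and look up the reward in Ra."""
        dist = self.transition[(s, a)]
        s_next = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        return s_next, self.reward.get((s, a, s_next), 0.0)


# Hypothetical two-state example, not taken from the notes.
toy = MDP(
    states=["s0", "s1"],
    actions=["go"],
    transition={("s0", "go"): {"s1": 0.8, "s0": 0.2}, ("s1", "go"): {"s1": 1.0}},
    reward={("s0", "go", "s1"): 1.0},
)
```

Note that `sample_next` uses only the current state and action, which is exactly what the Markov property allows.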
Reinforcement Learning Algorithms:
Reinforcement learning algorithms are mainly used in AI applications and gaming applications. The mainly used algorithms are:

Q-Learning:
• Q-learning is an off-policy RL algorithm, which is used for temporal difference learning. Temporal difference learning methods are a way of comparing temporally successive predictions.
• It learns the value function Q(s, a), which tells how good it is to take action "a" at a particular state "s".
• The flowchart below explains the working of Q-learning:

[Flowchart: working of Q-learning]
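The core of Q-learning is a single update of the Q-table after each step. The sketch below shows the standard tabular update; the learning rate of 0.5, the discount factor of 0.9 and the maze-style state names are assumptions chosen for the example.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)
ACTIONS = ["up", "down", "left", "right"]


def q_learning_update(Q, s, a, r, s_next, done):
    """Move Q(s, a) toward r + gamma * max over a' of Q(s_next, a')."""
    # Off-policy: the best next action is used, whatever the agent does next.
    best_next = 0.0 if done else max(Q[(s_next, b)] for b in ACTIONS)
    td_target = r + GAMMA * best_next
    Q[(s, a)] += ALPHA * (td_target - Q[(s, a)])
    return Q[(s, a)]


Q = defaultdict(float)  # unseen (state, action) pairs start at 0
first = q_learning_update(Q, "S3", "right", 1.0, "S4", True)
second = q_learning_update(Q, "S2", "right", 0.0, "S3", False)
```

After the first update the step onto the goal has a positive Q-value, and the second update passes a discounted share of it back to the preceding state.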
State Action Reward State Action (SARSA):
• SARSA stands for State Action Reward State Action, which is an on-policy temporal difference learning method. The on-policy control method selects the action for each state while learning, using a specific policy.
• The goal of SARSA is to calculate Qπ(s, a) for the selected current policy π and all pairs of (s, a).
• The main difference between the Q-learning and SARSA algorithms is that, unlike Q-learning, the maximum value for the next state is not required for updating the Q-value in the table; the value of the action actually chosen next is used instead.
• In SARSA, the new action and reward are selected using the same policy that determined the original action.
• SARSA is so named because it uses the quintuple (s, a, r, s', a'), where
  s: original state
  a: original action
  r: reward observed while following the states
  s' and a': new state-action pair.
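The SARSA update can be sketched in the same style as a Q-table update, using the quintuple (s, a, r, s', a') directly. The learning rate, discount factor and state names below are illustrative assumptions.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.5, gamma=0.9):
    """Move Q(s, a) toward r + gamma * Q(s_next, a_next)."""
    # On-policy: uses the action a_next that the policy actually selected,
    # not the maximum over all next actions.
    next_q = 0.0 if done else Q.get((s_next, a_next), 0.0)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * next_q - old)
    return Q[(s, a)]


# Hypothetical Q-table in which only one action at S3 has been rewarded so far.
Q = {("S3", "right"): 0.5, ("S3", "down"): 0.0}
after_down = sarsa_update(Q, "S2", "right", 0.0, "S3", "down", False)
after_right = sarsa_update(Q, "S2", "right", 0.0, "S3", "right", False)
```

When the next action chosen is "down", nothing is propagated back, whereas a Q-learning update would have used the larger value of "right" regardless of the action taken.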
Deep Q Neural Network (DQN):
• As the name suggests, DQN is Q-learning using neural networks.
• For an environment with a big state space, it will be a challenging and complex task to define and update a Q-table.
• To solve such an issue, we can use a DQN algorithm, where, instead of defining a Q-table, a neural network approximates the Q-values for each action and state.
Temporal Difference (TD):
Temporal difference learning is an agent learning from an environment through episodes with no prior knowledge of the environment. This means temporal difference takes a model-free or unsupervised learning approach. You can consider it learning from trial and error.
We'll discuss the Greek letters used: gamma (γ), lambda (λ), alpha (α) and delta (δ).
• Gamma (γ): the discount rate. A value between 0 and 1. The higher the value, the less you are discounting.
• Lambda (λ): the credit assignment variable. A value between 0 and 1. The higher the value, the more credit you can assign to states and actions further back.
• Alpha (α): the learning rate. How much of the error should we accept, and therefore adjust our estimates towards? A value between 0 and 1. A higher value adjusts aggressively, accepting more of the error, while a smaller one adjusts conservatively, making smaller moves towards the actual values.
• Delta (δ): a change or difference in value.
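How gamma, alpha and delta fit together can be seen in the simplest temporal difference method, one-step TD, which corresponds to λ = 0. The sketch below is an illustration with assumed values for α and γ and a made-up pair of states.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One-step TD update of the state-value table V; returns the TD error."""
    # delta: difference between the new estimate and the current one.
    delta = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    # alpha decides how much of that error is accepted.
    V[s] = V.get(s, 0.0) + alpha * delta
    return delta


V = {"S3": 1.0}  # hypothetical current estimates
delta = td0_update(V, "S2", 0.0, "S3")
```

With a larger `alpha` the estimate for S2 would jump closer to the discounted value of S3 in a single step; with a smaller one it moves there gradually over many visits.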
Confidence Interval:
One common way to describe the uncertainty associated with an estimate is to give an interval within which the true value is expected to fall, along with the probability with which it is expected to fall into this interval. Such estimates are called confidence interval estimates.

Definition: An N% confidence interval for some parameter p is an interval that is expected with probability N% to contain p.
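As a small illustration of the definition, the sketch below computes an approximate 95% confidence interval for a mean. It assumes the normal approximation (critical value 1.96), which is a simplification; for small samples a t-distribution value would be more appropriate. The sample numbers are made up.

```python
import math
import statistics


def confidence_interval_95(samples):
    """Approximate 95% confidence interval for the mean of `samples`."""
    n = len(samples)
    mean = statistics.mean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    z = 1.96  # normal-approximation critical value for 95%
    return mean - z * sem, mean + z * sem


# Hypothetical measurements, e.g. total reward over five episodes.
low, high = confidence_interval_95([1.0, 2.0, 3.0, 4.0, 5.0])
```

The interval is centred on the sample mean and becomes narrower as more samples are collected.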

Reinforcement learning and supervised learning are both part of machine learning, but the two types of learning are far apart from each other. RL agents interact with the environment, explore it, take actions, and get rewarded, whereas supervised learning algorithms learn from a labeled dataset and, on the basis of that training, predict the output.
The difference table between RL and supervised learning is given below:

Difference between Reinforcement Learning and Supervised Learning:

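Reinforcement learning | Supervised learning
RL works by interacting with the environment. | Supervised learning works on an existing labeled dataset.
The RL algorithm works like the human brain works when making some decisions. | Supervised learning works like a human learning things under the supervision of a guide.
There is no labeled dataset present. | A labeled dataset is present.
No previous training is provided to the learning agent. | Training is provided to the algorithm so that it can predict the output.
RL helps to take decisions sequentially. | In supervised learning, a decision is made when the input is given.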
Reinforcement Learning Applications:

1. Robotics:
RL is used in robot navigation, robo-soccer, walking, juggling, etc.

2. Control:
RL can be used for adaptive control, such as factory processes and admission control in telecommunication; a helicopter pilot is another example of reinforcement learning.

3. Game playing:
RL can be used in game playing, such as tic-tac-toe, chess, etc.

4. Chemistry:
RL can be used for optimizing chemical reactions.

5. Business:
RL is now used for business strategy planning.

6. Manufacturing:
In various automobile manufacturing companies, robots use deep reinforcement learning to pick goods and put them in containers.

7. Finance sector:
RL is currently used in the finance sector for evaluating trading strategies.
