9 Environment
• Nonstationary Environments
Generalization performance
Only cats and dogs?
• Alexa
(‘Please turn off the coffee machine’ vs. ‘coffee machine off’)
Why?
Data Distribution
Why
Empirical Sample
Fixing it
• Validation set
(hold out separate data that is not used for training)
• Chernoff bound
\[ \Pr\left\{ \frac{1}{m} \sum_{i=1}^{m} l(f(x_i), y_i) - \mathbf{E}\left[ l(f(x), y) \right] > \epsilon \right\} \le \exp\left( -2 m \epsilon^2 \right) \]
• Why does it work?
• Validation set was never used for training
(often violated)
• Loss bounded within [0,1] (otherwise rescale)
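A rough sketch of how the bound can be used: evaluate exp(-2mε²) for a given validation-set size, or invert it to find how many held-out points are needed. The function names and the failure probability `delta` are illustrative assumptions, not from the slides; the loss is assumed to lie in [0, 1].

```python
import math

def chernoff_bound(m, eps):
    """Upper bound on Pr[validation loss - expected loss > eps] for m held-out points."""
    return math.exp(-2.0 * m * eps ** 2)

def samples_needed(eps, delta):
    """Smallest m for which the bound is at most delta (delta is an assumed failure probability)."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * eps ** 2))

m = samples_needed(eps=0.05, delta=0.01)
print(m, chernoff_bound(m, 0.05))
```

The required m grows like 1/ε², so halving the tolerance needs roughly four times as much held-out data.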
Fixing it
• Web search
• Training - page relevance data for the US market
• Testing - recommend pages for Canada (UK, Australia)
• Speech recognition
• Training - West coast accent
• Testing - Southern drawl, Texan, non-native speaker
• Language
• Training - ‘James, bring me a soda’
• Testing - ‘John, bring me a pop’ (or coke, etc.)
Covariate Shift
• Medical
• Training - University students + old men with prostate cancer
• Testing - Potentially sick old men
• Reinforcement Learning
• Training - Data gathered with current policy
• Testing - Environment reacting to updated policy
• Databases
• Training - DB tuned to 2017 usage pattern
• Testing - DB deployed on AWS in 2018
What is happening? q(x, y) = q(x) p(y|x)
Training data
• Training Risk
\[ \underset{w}{\text{minimize}} \int dx\, p(x) \int dy\, p(y \mid x)\, l(f(x, w), y) \]
or rather
\[ \underset{w}{\text{minimize}}\; \frac{1}{m} \sum_{i=1}^{m} l(f(x_i, w), y_i) \]
• Test Risk is different
Test data
\[ \int dx\, q(x) \int dy\, p(y \mid x)\, l(f(x, w), y) \]
Fixing it (covariate shift correction)
• Basic algebra
\[ \int dx\, q(x) f(x) = \int dx\, p(x) \underbrace{\frac{q(x)}{p(x)}}_{\alpha(x)} f(x) = \int dx\, p(x)\, \alpha(x) f(x) \]
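The identity can be checked numerically. The sketch below assumes p = N(0, 1) for training and q = N(1, 1) for test, so that the density ratio is q(x)/p(x) = exp(x - 0.5); these distributions and the test function f(x) = x² are illustrative choices, not from the slides.

```python
import math
import random

random.seed(0)

def alpha(x):
    # Density ratio q(x)/p(x) for the assumed p = N(0, 1) and q = N(1, 1).
    return math.exp(x - 0.5)

def f(x):
    return x * x

n = 200_000
train = [random.gauss(0.0, 1.0) for _ in range(n)]  # draws from p
test = [random.gauss(1.0, 1.0) for _ in range(n)]   # draws from q

plain = sum(f(x) for x in train) / n                 # estimates E_p[f] = 1
weighted = sum(alpha(x) * f(x) for x in train) / n   # estimates E_q[f] = 2
direct = sum(f(x) for x in test) / n                 # E_q[f] from q samples
print(plain, weighted, direct)
```

The weighted average over training draws should land near the direct test average, although with visibly more noise, since large weights on rare points inflate the variance.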
courses.d2l.ai/berkeley-stat-157
Training set (images): dog, cat, dog, dog
Test set (images): dog, cat, dog, dog
The classifier might perform a lot
worse during test time
Why?
Simple regression problem
[Figure, built up over several slides: data points marked +; axis: covariates; regions labeled train, eval, train]
Optimal estimate not in class of linear functions
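A minimal sketch of the effect, under assumptions that are not from the slides: the true function is x², the linear model is fit by least squares on covariates in [0, 1], and it is evaluated on covariates in [2, 3].

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ a * x + b (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def mse(a, b, xs, ys):
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def target(x):
    # Assumed ground truth; deliberately not a linear function.
    return x * x

train_x = [i / 100 for i in range(101)]        # covariates in [0, 1]
eval_x = [2 + i / 100 for i in range(101)]     # covariates in [2, 3]
train_y = [target(x) for x in train_x]
eval_y = [target(x) for x in eval_x]

a, b = fit_line(train_x, train_y)
train_mse = mse(a, b, train_x, train_y)
eval_mse = mse(a, b, eval_x, eval_y)
print(train_mse, eval_mse)
```

The line is a fine local approximation where it was trained, but p(y|x) is unchanged and the error still explodes once the covariates move.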
Training error may be misleading (e.g. faces)
≫
Train on IMDB
Test on weird prof
No Protection against Bias Theorem
(Figure label: 0 probability mass)
TL;DR Testing where we have
insufficient amounts of training
data may yield strange results.
Logistic Regression
Recall - Multiclass Classification
\[ -\log p(y \mid o) = \log \sum_i \exp(o_i) - o_y \]
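A small sketch of this loss in code. Subtracting the largest logit before exponentiating is a standard stability trick added here as an assumption; it does not change the value.

```python
import math

def nll(o, y):
    """-log p(y | o) = log sum_i exp(o_i) - o_y, with the max subtracted for stability."""
    m = max(o)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in o))
    return log_sum_exp - o[y]

logits = [2.0, 0.5, -1.0]   # hypothetical outputs o for three classes
print(nll(logits, 0))
```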
Two Classes
• Classes 1 and -1
\[ p(y = 1 \mid o) = \mathrm{softmax}(o) = \frac{\exp(o_1)}{\exp(o_{-1}) + \exp(o_1)} \]
• Shift invariance o_i ← o_i + c
\[ p(y = 1 \mid o) = \frac{\exp(o_1 + c)}{\exp(o_{-1} + c) + \exp(o_1 + c)} = \frac{\exp(o_1)}{\exp(o_{-1}) + \exp(o_1)} \]
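A quick numerical check of the shift invariance; the logit values and the helper name `p_pos` are arbitrary choices for illustration. Shifting both logits by c = -o_{-1} leaves a probability that depends only on the difference o_1 - o_{-1}.

```python
import math

def p_pos(o_neg, o_pos):
    """p(y = 1 | o) for logits o_{-1} = o_neg and o_1 = o_pos."""
    return math.exp(o_pos) / (math.exp(o_neg) + math.exp(o_pos))

o_neg, o_pos, c = 0.3, 1.7, 5.0
base = p_pos(o_neg, o_pos)
shifted = p_pos(o_neg + c, o_pos + c)          # same shift c on both logits
anchored = p_pos(0.0, o_pos - o_neg)           # shift by c = -o_{-1}
print(base, shifted, anchored)
```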
Two Classes
• Choose o−1 = 0
\[ p(y = 1 \mid o) = \frac{\exp(o_1)}{\exp(0) + \exp(o_1)} = \frac{1}{1 + \exp(-o_1)} \]
• Negative log-likelihood
\[ \lim_{x \to \infty} \log(1 + \exp(-x)) = \log 1 = 0 \]
\[ \lim_{x \to -\infty} \log(1 + \exp(-x)) + x = \lim_{x \to -\infty} \log(1 + \exp(x)) = 0 \]
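The two limits also suggest how to evaluate log(1 + exp(-x)) without overflow: use the expression directly for x ≥ 0 and the rewritten form -x + log(1 + exp(x)) for x < 0. The function name below is an illustrative assumption.

```python
import math

def logistic_loss(x):
    """log(1 + exp(-x)), computed without overflow for large |x|."""
    if x >= 0:
        return math.log1p(math.exp(-x))
    # For x < 0 use log(1 + exp(-x)) = -x + log(1 + exp(x)).
    return -x + math.log1p(math.exp(x))

for x in (-50.0, 0.0, 50.0):
    print(x, logistic_loss(x))
```

For strongly negative x the loss is essentially -x (linear), and for strongly positive x it is essentially 0, matching the asymptotes above.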
Logistic Regression Summary
• Data
(xi, yi) where xi ∈ 𝒳 and yi ∈ {± 1}
• Objective
\[ \underset{w}{\text{minimize}} \sum_{i=1}^{m} \log\left(1 + \exp(-y_i f(x_i, w))\right) + \mathrm{penalty}(w) \]
• Propensity scoring
\[ \int dx\, q(x) f(x) = \int dx\, p(x) \underbrace{\frac{q(x)}{p(x)}}_{\alpha(x)} f(x) = \int dx\, p(x)\, \alpha(x) f(x) \]
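• Logistic regression
(classifier r separates training draws, y = 1, from test draws, y = −1)
\[ r(y = 1 \mid x) = \frac{1}{1 + \exp(-f(x))} \implies \alpha(x) = \frac{r(y = -1 \mid x)}{r(y = 1 \mid x)} = \exp(-f(x)) \]
Covariate Shift Correction Redux
\[ \sum_i l(x_i, y_i, g(x_i, w)) \longrightarrow \sum_i \exp(-f(x_i)) \cdot l(x_i, y_i, g(x_i, w)) \]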
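A minimal sketch of estimating the weights with a classifier. Assumptions, none from the slides: training draws come from p = N(0, 1) and carry label +1, test draws come from q = N(1, 1) and carry label -1, both sets have the same size, and f(x) = w·x + b is fit by plain gradient descent. With r(y = 1 | x) = 1/(1 + exp(-f(x))), the ratio r(y = -1 | x)/r(y = 1 | x) equals exp(-f(x)), which for these Gaussians should approach exp(x - 0.5).

```python
import math
import random

random.seed(0)

n = 5000
train = [random.gauss(0.0, 1.0) for _ in range(n)]  # assumed p = N(0, 1), label +1
test = [random.gauss(1.0, 1.0) for _ in range(n)]   # assumed q = N(1, 1), label -1
data = [(x, 1.0) for x in train] + [(x, -1.0) for x in test]

# Fit f(x) = w * x + b by gradient descent on the mean of log(1 + exp(-y f(x))).
w, b, lr = 0.0, 0.0, 1.0
for _ in range(200):
    gw = gb = 0.0
    for x, y in data:
        s = -y / (1.0 + math.exp(y * (w * x + b)))  # d/df of log(1 + exp(-y f))
        gw += s * x
        gb += s
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

def alpha(x):
    # r(y=-1|x) / r(y=1|x) = exp(-f(x)); the exact ratio here is exp(x - 0.5).
    return math.exp(-(w * x + b))

print(w, b, alpha(0.0))
```

Training points that look like test data (here, larger x) receive larger weights in the reweighted loss.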
Label shift
Training set
Test set
Why would anyone do this?
Label Shift
• Medical diagnosis
• Train on data with few sick patients
• Test on data during flu season where q(flu) > p(flu)
while flu symptoms p(symptoms|flu) are still the same
• Speech recognition
• Train on newscast data before election
• Test on newscast after election (new topics, names,
discussions, but still same language)
Label Shift q(x, y) = q(y) p(x|y)
• Need to reweight according to the ratio q(y)/p(y) to get
\[ \int dy\, q(y) \int dx\, p(x \mid y)\, l(f(x), y) = \int dy\, p(y) \frac{q(y)}{p(y)} \int dx\, p(x \mid y)\, l(f(x), y) \]
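A small sketch of the reweighting, assuming the test label distribution q(y) is already known (the slides do not say how to obtain it). The helper names, labels, and loss values are hypothetical; p(y) is estimated from training label frequencies.

```python
from collections import Counter

def label_weights(train_labels, q):
    """Return q(y) / p(y), with p(y) estimated from training label frequencies."""
    counts = Counter(train_labels)
    n = len(train_labels)
    return {y: q[y] / (counts[y] / n) for y in counts}

def reweighted_risk(losses, labels, weights):
    """Average of per-example losses, each scaled by the weight of its label."""
    return sum(weights[y] * l for l, y in zip(losses, labels)) / len(losses)

train_labels = ["healthy"] * 9 + ["flu"]             # p(flu) = 0.1
q = {"healthy": 0.5, "flu": 0.5}                     # assumed test label distribution
losses = [0.1] * 9 + [1.0]                           # hypothetical per-example losses

weights = label_weights(train_labels, q)
risk = reweighted_risk(losses, train_labels, weights)
print(weights, risk)
```

The rare class is upweighted, so the reweighted training risk matches what the same per-class losses would cost under the test label mix.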
Digital manipulation
to dodge recognition
• Modify data slightly so as to obtain the wrong class
\[ \underset{\delta}{\text{maximize}}\; l(f(x + \delta), y) \quad \text{subject to } \|\delta\| \le \epsilon \]
Different norms
Different datasets
Different papers …
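For one concrete choice of norm the worst-case perturbation has a closed form. The sketch below assumes a linear score w·x + b with logistic loss and an ℓ∞ constraint ‖δ‖∞ ≤ ε; under those assumptions the loss is maximized by δ = -ε·y·sign(w). The model parameters and input are hypothetical.

```python
import math

def loss(w, b, x, y):
    """Logistic loss log(1 + exp(-y * (w . x + b))) for a label y in {-1, +1}."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.log1p(math.exp(-y * score))

def worst_case_delta(w, y, eps):
    """Maximiser of the loss over ||delta||_inf <= eps for a linear score."""
    sign = lambda v: (v > 0) - (v < 0)
    return [-eps * y * sign(wi) for wi in w]

w, b = [2.0, -1.0, 0.5], 0.1        # hypothetical model parameters
x, y, eps = [1.0, 0.5, -0.2], 1, 0.25

delta = worst_case_delta(w, y, eps)
x_adv = [xi + di for xi, di in zip(x, delta)]
print(loss(w, b, x, y), loss(w, b, x_adv, y))
```

Each coordinate moves by at most ε, yet the margin y·(w·x + b) drops by ε·‖w‖₁, which is why small per-pixel changes can add up in high dimensions.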
Why does this work?
‘Unnatural’ data
• While TRUE
• Mail host extends dataset and trains new classifier
• Spammer’s e-mails are rejected
• Spammer finds a modification that succeeds
• Examples
• Add highly scoring words (or sentences) to email
• Add highly scoring sentences (and vary them)
• Change or forge header (‘Dear Alex, …’)
Invariances
Adversarially Robust Networks
\[ L(x, y, f) = \sup_{\delta \in \Delta} \eta(\delta)\, l(f(x + \delta), y) \]
Interaction with Environment
Batch: build model
Online
Bandits
Stateful Systems
no memory memory
Reinforcement Learning & Control
• Take action
• Environment reacts
• Observe stuff
• Update model
Repeat
• environment (cooperative, adversary, doesn’t care)
• memory (goldfish, elephant)
• state space (tic tac toe, chess, car)
• past observations (server log, generated during training)
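The loop above can be sketched with the simplest setting from the list: an environment that does not care and has no memory, i.e. a bandit. The reward probabilities, the exploration rate, and the epsilon-greedy rule are illustrative assumptions, not something the slides prescribe.

```python
import random

random.seed(0)

true_means = [0.2, 0.5, 0.8]      # hypothetical reward probabilities, hidden from the agent
estimates = [0.0] * 3             # the agent's model: running mean reward per action
counts = [0] * 3
epsilon = 0.1                     # assumed exploration rate

for step in range(5000):
    # Take action: explore with probability epsilon, otherwise act greedily.
    if random.random() < epsilon:
        action = random.randrange(3)
    else:
        action = max(range(3), key=lambda a: estimates[a])
    # Environment reacts: it has no memory and ignores past actions.
    reward = 1.0 if random.random() < true_means[action] else 0.0
    # Observe the reward and update the model.
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print(counts, estimates)
```

Because the agent's own choices decide which data it sees, the training distribution changes as the model is updated, which is the nonstationarity this section is about.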
Reinforcement Learning & Control
• Games
• Chess, Go, Backgammon (fully observed)
• Poker, Starcraft, ATARI (partially observed, random)
• Parallelism
• Computational advertising, recommender systems (multiple agents & independent parallel games)
• Load balancing & scheduling (multiple agents)
• Actions
• Continuous decisions (driving, flying, robots in general, HVAC)
• Discrete (elevator, work allocation)
• Simulations
• MuJoCo style
• Only reality (server center)
Training ≠ Testing
• Nonstationary Environments