9 Environment

The document discusses the concepts of training versus testing in deep learning, highlighting the differences in empirical and covariate distributions. It introduces logistic regression and methods to correct for covariate shift, as well as the implications of label shift. The content is part of a course on deep learning at UC Berkeley.


Introduction to Deep Learning

9. Environment and Covariate Shift

STAT 157, Spring 2019, UC Berkeley

Alex Smola and Mu Li


courses.d2l.ai/berkeley-stat-157
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Training ≠ Testing

• Generalization performance: $p_{\mathrm{emp}}(x, y) \neq p(x, y)$
  (the empirical distribution lies)

• Covariate shift: $p(x) \neq q(x)$
  (the covariate distribution lies)

• Logistic regression: $\log(1 + \exp(-y f(x)))$
  (tools to fix shift)

• Covariate shift correction: $\frac{1}{2}\left(p(x)\,\delta(1, y) + q(x)\,\delta(-1, y)\right)$

• Label shift: $p(y) \neq q(y)$
  (the label distribution lies)

• Nonstationary Environments
Generalization performance
Only cats and dogs?

• Images, too (e.g. He et al., 2015, ResNet paper)

• Alexa
(‘Please turn off the coffee machine’ vs. ‘coffee machine off’)
Why?

• Data Distribution p(x,y)


• Dataset drawn from p(x,y)
• Training minimizes empirical risk (plus regularization)
  $$\operatorname*{minimize}_{w} \; \frac{1}{m} \sum_{i=1}^{m} l(f(x_i, w), y_i)$$
• At test time expected risk matters
  (all the other data we could have seen)
  $$\mathbf{E}_{(x, y) \sim p}\left[ l(f(x, w), y) \right]$$
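The gap between the two risks can be seen on a toy problem. This is a minimal sketch, assuming a one-dimensional linear model f(x, w) = w·x with squared loss; the data model, the helper names, and the sample sizes are illustrative assumptions, not taken from the slides. The expected risk is approximated with a very large fresh sample.

```python
import numpy as np

def squared_loss(pred, y):
    return (pred - y) ** 2

def empirical_risk(w, x, y):
    """(1/m) * sum_i l(f(x_i, w), y_i) for the assumed model f(x, w) = w * x."""
    return float(np.mean(squared_loss(w * x, y)))

def sample(rng, m):
    """Draw m pairs from an assumed p(x, y): y = 2x + Gaussian noise."""
    x = rng.normal(size=m)
    y = 2.0 * x + rng.normal(scale=0.5, size=m)
    return x, y

rng = np.random.default_rng(0)
x_train, y_train = sample(rng, 20)

# Least-squares minimizer of the empirical risk (no regularization).
w_hat = float(x_train @ y_train / (x_train @ x_train))
train_risk = empirical_risk(w_hat, x_train, y_train)

# Monte Carlo stand-in for the expected risk E_{(x,y)~p}[l(f(x, w), y)].
x_big, y_big = sample(rng, 200_000)
expected_risk = empirical_risk(w_hat, x_big, y_big)

print(f"empirical risk {train_risk:.3f}, expected risk {expected_risk:.3f}")
```

Because `w_hat` was tuned on the 20 training points, `train_risk` describes those points only; `expected_risk` is the number that matters at test time.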
[Figures: the data distribution; the data distribution with an empirical sample; the empirical sample alone]
Fixing it

• Validation set
(hold out separate data that is not used for training)
• Chernoff bound
  $$\Pr\left\{ \frac{1}{m} \sum_{i=1}^{m} l(f(x_i), y_i) - \mathbf{E}[l(f(x), y)] > \epsilon \right\} \leq \exp(-2 m \epsilon^2)$$
• Why does it work?
• Validation set was never used for training
(often violated)
• Loss bounded within [0,1] (otherwise rescale)
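The bound can be turned into a validation set size. A minimal sketch, assuming the loss is bounded in [0, 1] and the validation set was never used for training; the function names and the confidence level `delta` are introduced here for illustration.

```python
import math

def deviation_bound(m, eps):
    """exp(-2 m eps^2): bound on Pr{empirical risk - expected risk > eps}."""
    return math.exp(-2.0 * m * eps ** 2)

def samples_needed(eps, delta):
    """Smallest m for which exp(-2 m eps^2) <= delta."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * eps ** 2))

m = samples_needed(eps=0.05, delta=0.05)
print(m, deviation_bound(m, 0.05))
```

Solving exp(−2mε²) ≤ δ for m gives m ≥ log(1/δ) / (2ε²), so halving ε quadruples the number of held-out examples needed.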
[Figure: data distribution with empirical sample]


Fixing it

• Input noise (more on this later)


• Dropout (noise within the layers)
• Smoothing the function f (e.g. weight decay)
covariate
[Figures: training set; the data seen at test time]
Why would anyone do this?
Covariate Shift

• Web search
• Training - page relevance data for the US market
• Testing - recommend pages for Canada (UK, Australia)
• Speech recognition
• Training - West coast accent
• Testing - Southern drawl, Texan, non-native speaker
• Language
• Training - ‘James, bring me a soda’
• Testing - ‘John, bring me a pop’ (or coke, etc.)
Covariate Shift

• Medical
• Training - University students + old men with prostate cancer
• Testing - Potentially sick old men
• Reinforcement Learning
• Training - Data gathered with current policy
• Testing - Environment reacting to updated policy
• Databases
• Training - DB tuned to 2017 usage pattern
• Testing - DB deployed on AWS in 2018
What is happening? $q(x, y) = q(x)\, p(y \mid x)$

• Training Risk (training data)
  $$\operatorname*{minimize}_{w} \int dx\, p(x) \int dy\, p(y \mid x)\, l(f(x, w), y)$$
  or rather
  $$\operatorname*{minimize}_{w} \; \frac{1}{m} \sum_{i=1}^{m} l(f(x_i, w), y_i)$$
• Test Risk is different (test data)
  $$\int dx\, q(x) \int dy\, p(y \mid x)\, l(f(x, w), y)$$
Fixing it (covariate shift correction)

• Basic algebra
  $$\int dx\, q(x) f(x) = \int dx\, p(x) \underbrace{\frac{q(x)}{p(x)}}_{\alpha(x)} f(x) = \int dx\, p(x)\, \alpha(x) f(x)$$
• Need to find density ratio, but we don’t have either one.


• Estimating p and q directly is really hard and requires
specialized tools. Can we recycle classifiers?
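The identity above can be checked numerically when both densities are known. A minimal sketch, assuming p = N(0, 1) and q = N(0.5, 1) and the test function f(x) = x²; these choices are illustrative assumptions, since in practice neither density is available.

```python
import numpy as np

def gaussian_pdf(x, mean):
    """Density of a unit-variance Gaussian with the given mean."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)

def f(x):
    return x ** 2

rng = np.random.default_rng(0)
x_p = rng.normal(loc=0.0, size=200_000)            # samples from p only

alpha = gaussian_pdf(x_p, 0.5) / gaussian_pdf(x_p, 0.0)   # alpha(x) = q(x) / p(x)

naive_estimate = float(np.mean(f(x_p)))            # estimates E_p[f] = 1
weighted_estimate = float(np.mean(alpha * f(x_p))) # estimates E_q[f]
exact_under_q = 0.5 ** 2 + 1.0                     # mean^2 + variance

print(naive_estimate, weighted_estimate, exact_under_q)
```

Only samples from p are used, yet the weighted average targets the expectation under q.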
Fairness and Bias
(covariate shift in action & news)

[Figure: training set of images labeled dog, cat, dog, dog]

[Figure: test set of images labeled dog, cat, dog, dog]
The classifier might perform a lot worse during test time

Why?
Simple regression problem

[Figure, built up over several slides: points marked + plotted against the covariates axis, with train and eval regions marked along it; annotation: "Optimal estimate not in class of linear functions"]
Training error may be misleading (e.g. faces)


Train on IMDB
Test on weird prof
No Protection against Bias Theorem

• Estimator performs better (or worse) on some data


• We can always find a distribution q that is much worse
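  $$\mathbf{E}_{q(x)} \mathbf{E}_{y|x}[l(y, f(x))] \geq \frac{\mathrm{Var}_{p(x)}\left[ \mathbf{E}_{y|x}[l(y, f(x))] \right]}{\mathbf{E}_{p(x)} \mathbf{E}_{y|x}[l(y, f(x))]} + \mathbf{E}_{p(x)} \mathbf{E}_{y|x}[l(y, f(x))]$$

[Figure: loss plotted against probability mass, with levels μ and μ+c marked]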
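One way to see that such a q exists is to tilt p toward the covariates with high loss. The sketch below assumes a discrete covariate with three values and made-up conditional losses, and uses the construction q(x) ∝ p(x)·E[l | x]; that construction is an assumption added here for illustration, the slide only states that a bad q can be found.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])        # assumed training distribution over 3 covariate values
loss = np.array([0.1, 0.2, 0.9])     # assumed conditional loss E_{y|x}[l(y, f(x))] per value

mean_p = float(p @ loss)                         # E_p[loss]
var_p = float(p @ (loss - mean_p) ** 2)          # Var_p[loss]

q = p * loss / mean_p                # tilt p toward high-loss covariates
risk_q = float(q @ loss)             # E_q[loss]
bound = var_p / mean_p + mean_p      # right-hand side of the inequality

print(risk_q, bound, mean_p)
```

For this q the two sides coincide, because E_q[loss] = E_p[loss²] / E_p[loss], so the test risk exceeds the training risk whenever the loss varies at all across covariates.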
TL;DR Testing where we have insufficient amounts of training data may yield strange results.
Logistic Regression

Recall - Multiclass Classification

Calibrated Scale

• Output matches probabilities (nonnegative, sums to 1)
  $$p(y \mid o) = \mathrm{softmax}(o) = \frac{\exp(o_y)}{\sum_i \exp(o_i)}$$
• Negative log-likelihood
  $$-\log p(y \mid o) = \log \sum_i \exp(o_i) - o_y$$

Classification

• Multiple classes, typically multiple outputs
• Score should reflect confidence …
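The softmax and its negative log-likelihood can be written in a few lines. A minimal sketch in NumPy; subtracting the largest score before exponentiating is a standard numerical precaution added here, not something shown on the slide.

```python
import numpy as np

def softmax(o):
    """p(y | o) for every class y, from a vector of scores o."""
    e = np.exp(o - np.max(o))    # subtract the max so exp cannot overflow
    return e / e.sum()

def nll(o, y):
    """-log p(y | o) = log sum_i exp(o_i) - o_y."""
    m = np.max(o)
    return float(m + np.log(np.sum(np.exp(o - m))) - o[y])

scores = np.array([1.0, 2.0, 0.5])
print(softmax(scores), nll(scores, 1))
```

Working with `nll` directly avoids taking the logarithm of a probability that may have underflowed to zero.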
Two Classes

• Classes 1 and -1
  $$p(y = 1 \mid o) = \mathrm{softmax}(o) = \frac{\exp(o_1)}{\exp(o_{-1}) + \exp(o_1)}$$

• Shift invariance $o_i \leftarrow o_i + c$
  $$p(y = 1 \mid o) = \frac{\exp(o_1 + c)}{\exp(o_{-1} + c) + \exp(o_1 + c)} = \frac{\exp(o_1)}{\exp(o_{-1}) + \exp(o_1)}$$
Two Classes

• Choose $o_{-1} = 0$
  $$p(y = 1 \mid o) = \frac{\exp(o_1)}{\exp(0) + \exp(o_1)} = \frac{1}{1 + \exp(-o_1)}$$

• Negative log-likelihood
  $$-\log p(y \mid o) = \log(1 + \exp(-y\, o_1))$$
  $$\lim_{x \to \infty} \log(1 + \exp(-x)) = \log 1 = 0$$
  $$\lim_{x \to -\infty} \left[ \log(1 + \exp(-x)) + x \right] = \lim_{x \to -\infty} \log(1 + \exp(x)) = 0$$

Logistic loss function
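Both limits are easy to confirm numerically. A minimal sketch using `np.logaddexp`, which evaluates log(exp(a) + exp(b)) without overflow; the function names are chosen here for illustration.

```python
import numpy as np

def logistic_loss(o, y):
    """-log p(y | o) = log(1 + exp(-y * o)) for a label y in {-1, +1}."""
    return float(np.logaddexp(0.0, -y * o))

def sigmoid(o):
    """p(y = 1 | o) = 1 / (1 + exp(-o))."""
    return 1.0 / (1.0 + np.exp(-o))

print(logistic_loss(50.0, 1))     # confident and correct: loss near 0
print(logistic_loss(-50.0, 1))    # confident and wrong: loss near 50
```

For large margins the loss vanishes, and for large negative margins it grows linearly, which matches the two limits above.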
Logistic Regression Summary

• Data
  $(x_i, y_i)$ where $x_i \in \mathcal{X}$ and $y_i \in \{\pm 1\}$

• Objective
  $$\operatorname*{minimize}_{w} \; \sum_{i=1}^{m} \log(1 + \exp(-y_i f(x_i, w))) + \mathrm{penalty}(w)$$

• Conditional Probability Estimate
  $$p(y = 1 \mid o) = \frac{1}{1 + \exp(-o)}$$
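The summary can be sketched as plain gradient descent. The code assumes a linear model f(x, w) = xᵀw and the penalty (λ/2)‖w‖²; both are illustrative choices, since the slides leave f and the penalty unspecified, and the toy data is synthetic.

```python
import numpy as np

def objective(w, X, y, lam):
    """sum_i log(1 + exp(-y_i f(x_i, w))) + (lam / 2) * ||w||^2."""
    margins = y * (X @ w)
    return float(np.sum(np.logaddexp(0.0, -margins)) + 0.5 * lam * (w @ w))

def gradient(w, X, y, lam):
    margins = y * (X @ w)
    coeff = -y / (1.0 + np.exp(margins))   # derivative of each loss term w.r.t. f(x_i, w)
    return X.T @ coeff + lam * w

def fit(X, y, lam=1.0, lr=0.1, steps=500):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * gradient(w, X, y, lam) / len(y)   # step on the per-example average
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
noise = 0.3 * rng.normal(size=200)
y = np.where(X[:, 0] + 0.5 * X[:, 1] + noise > 0, 1.0, -1.0)

w = fit(X, y)
prob_pos = 1.0 / (1.0 + np.exp(-(X @ w)))          # p(y = 1 | o) with o = f(x, w)
accuracy = float(np.mean(np.sign(X @ w) == y))
print(w, accuracy)
```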
covariate correction
Covariate shift correction

• Propensity scoring
  $$\int dx\, q(x) f(x) = \int dx\, p(x) \underbrace{\frac{q(x)}{p(x)}}_{\alpha(x)} f(x) = \int dx\, p(x)\, \alpha(x) f(x)$$

• Need to find density ratio, but we don’t have either one.

• Key idea: train a classifier between p and q
  $$r(x, y) = \frac{1}{2}\left[ p(x)\,\delta(y, 1) + q(x)\,\delta(y, -1) \right]$$
Covariate shift correction

• Conditional class probability
  $$r(y = 1 \mid x) = \frac{p(x)}{p(x) + q(x)} \quad \text{and hence} \quad \alpha(x) = \frac{q(x)}{p(x)} = \frac{r(y = -1 \mid x)}{r(y = 1 \mid x)}$$

• Logistic regression
  $$r(y = 1 \mid x) = \frac{1}{1 + \exp(-f(x))} \implies \alpha(x) = \frac{r(y = -1 \mid x)}{r(y = 1 \mid x)} = \exp(-f(x))$$
  (with label 1 for samples from p; if the labels are swapped so that 1 marks samples from q, the ratio becomes $\exp(f(x))$)
Covariate Shift Correction Redux

• Training and test data

• Split as if it were a binary classification problem
  (labels -1 and 1 for training and test respectively)
• Train with logistic regression to get f
  (with this labeling, $1 / (1 + \exp(-f(x)))$ is the probability that x came from the test set, so $q(x) / p(x) = \exp(f(x))$)
• Use binary classifier output to reweight data
• Solve original problem but weighted
  $$\sum_i l(x_i, y_i, g(x_i, w)) \longrightarrow \sum_i \exp(f(x_i)) \cdot l(x_i, y_i, g(x_i, w))$$
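The recipe can be sketched end to end on synthetic data. The example assumes one-dimensional covariates with p = N(0, 1) for training and q = N(1, 1) for test, equally many samples from each, a linear f(x) = w·x + b, and a stand-in per-example loss x²; all of these are illustrative assumptions. The weighted sum is also divided by the sum of the weights, a common normalization that the slide does not show.

```python
import numpy as np

def fit_logistic(x, y, lr=0.5, steps=2000):
    """Fit f(x) = w * x + b by gradient descent on mean log(1 + exp(-y f(x)))."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        margins = y * (w * x + b)
        coeff = -y / (1.0 + np.exp(margins))   # derivative of each loss term w.r.t. f(x_i)
        w -= lr * float(np.mean(coeff * x))
        b -= lr * float(np.mean(coeff))
    return w, b

rng = np.random.default_rng(0)
n = 20_000
x_train = rng.normal(loc=0.0, size=n)   # drawn from p
x_test = rng.normal(loc=1.0, size=n)    # drawn from q

# Label training points -1 and test points +1, then fit the classifier f.
x_all = np.concatenate([x_train, x_test])
y_all = np.concatenate([-np.ones(n), np.ones(n)])
w, b = fit_logistic(x_all, y_all)

# Weights exp(f(x_i)) for the training points.
weights = np.exp(w * x_train + b)

# Reweight a per-example loss measured on the training set.
stand_in_loss = x_train ** 2
unweighted = float(np.mean(stand_in_loss))
reweighted = float(np.sum(weights * stand_in_loss) / np.sum(weights))
test_value = float(np.mean(x_test ** 2))

print(w, b, unweighted, reweighted, test_value)
```

For these two Gaussians the true log ratio is log q(x)/p(x) = x − 0.5, so a good fit has w near 1 and b near −0.5, and the reweighted training average moves toward the value observed on the test covariates.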
Label shift
[Figures: training set; test set]
Why would anyone do this?
Label Shift

• Medical diagnosis
• Train on data with few sick patients
• Test on data during flu season where q(flu) > p(flu)
while flu symptoms p(symptoms|flu) are still the same
• Speech recognition
• Train on newscast data before election
• Test on newscast after election (new topics, names,
discussions, but still same language)
Label Shift q(x, y) = q(y)p(x|y)
<latexit sha1_base64="PKJRzfFGJ7uhqerpUKcteq5+5Iw=">AAAP9nictZfdbts2FMfV7qOdt9XptrvdEDOCJUNgRMGG9WZAGydtWizfSZvGMgJKomwmFKlQVGJX0asMuxmG7WrPsTfY2+xQVmyRSXrlChBEnv+P5CFFHpJ+wmiqlpf/u3f/o48/+fTBw88an3/x5aPm3OOvXqcikwE5DAQT8sjHKWGUk0NFFSNHiSQ49hl54591tP7mgsiUCn6gRgnpxbjPaUQDrMB0MvfN+cJwabSIfkHnC/BJFoZXo8WTudZye7l80M2EWyVaTvXsnDx+9K8XiiCLCVcBw2nadZcT1cuxVDRgpGh4WUoSHJzhPulCkuOYpL28dL9A82AJUSQkvFyh0lovkeM4TUexD2SM1SC1NW28TetmKnrSyylPMkV4MG4oyhhSAumxQCGVJFBsBAkcSAq+omCAJQ4UjFijMY+eMTL8HoUkwhlT+ks51eOWIsxDVDWTttE+viApUjQmUGp2D3gA3QN3BUtnWu9tA3jriDfmLasaxEXDQOPwUs/TYtY9VwMiJInLkY4kJTz8cGNQtQVjwMllIOIYGs29VQbyqhgWuSczRnK3/ROBzPhTIDSPwCkkIpRIIaKyLOEXVAqu10FeWouu2+t6GU/PaNLLvQRLjwvKQw14foR2NIRabttDBbQziChjk4Y9r7sSx72xW5WPE1/zg7rT1yoZ4jiBFdetDL18vbIYGCPQxxr0a5k3EFjFWZzo2V7jnk2NBgx9TURKLXqnZjVweLE8q5F7Y4MBBUIKxrAc1bjOxGag05VZY9emRqtefgrLPpPEqHhiNEdhSEVcH4AyP+u5HmEejMroN9s5bqxekgWYzdz1vsTJAIUSX1Le/1ALtIsZ6+VD/dc7Ik4oI5tYSRqQdNbd0aEQ6dAHIbcMPVU0mHHsqYcZcq6DRL7gSRLlLbdYtMJQPIp1Z4dFdwUiCCOR6no+6VMO26vEo0KXQa0V5EE0qkyepP2B6umf7cN2dkZgXhl1Us6JnFboMcz7jEAcWtIVlaU9WdosZ6QvS2/LYgstd8zaLqd1qntN9SwqqFNefo15hcVxg7uacFcWh+vcBLvSo5DeGIG9PQi3+m/7fr5nt7i1NRW3bPH4eCoeF7r28f9BqcqiyGQFJ9dwlLslHIoYU255s6qX5hiEVL5qt7lm6mu2vm/q+7a+a+q7tn5g6ge2/tzUn9v6jqnv2HrH1Du2vm3q27Z+ZOpHtr5p6pu2/tLUX9r6K1N/ZetvTf2trb8w9Rc35oypH9+YcKZ+Y84dmvqhra+b+rqtb5j6hq0rykJajL8EwkN5IryFOZ0yp7cxPpa6GvjcUQkop9fAdQ0QpAQLIxwQq1P+dO2sjxcabMJIJERiJeCE7q0RuG1IvRkMtivrDxAHZD/Gw6qshD18nC/ex1Nu8pC/k0/7NVhn7iSVnIKQvpMLxMUU7EDmThIco1NU52L6jryHhwNMjYdcxc9439Q/Dc60H+KuYu5bCkL5OMrD9gFn8Ta3p1gEl7YSWqkgvUGeuEvIY6FQ6RIq83lrpSisq4yvT3YNuP269l33ZuL1Sttdbru7P7aePqnuwQ+db53vnAXHdX52njobzo5z6ATOO+d35y/n7+aw+Vvzj+afY/T+varM147xNP/5HzvuJyM=</latexit>

• Data generating process p(x|y) is unchanged


• Labels change since the underlying cause changed

• Need to reweight according to $\beta(y) = \frac{q(y)}{p(y)}$ to get

$$\int q(y)\,dy \int p(x|y)\,dx\; l(f(x), y) = \int p(y)\,\frac{q(y)}{p(y)}\,dy \int p(x|y)\,dx\; l(f(x), y)$$

We don’t have samples from q(y)!


Label Shift q(x, y) = q(y) p(x|y)

• Key Idea - measure the estimates on test set


• p(x|y) is the same for training and test
• Distribution of predictions ŷ | x, y has to be the same
• Simple ‘spectral’ algorithm (Lipton, Wang, Smola, 2018)
• Confusion matrix $C[y'|y] = \Pr(\hat{y}(x) = y' \mid y)$ on hold-out set

• Predicted label vector on test set $\mu[y'] = \Pr(\hat{y}(x) = y')$

• Obtain q(y) via matrix inversion since

$$\mu[y'] = \sum_{y} C[y'|y]\, q(y)$$
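
The steps on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name `estimate_label_shift` and its arguments are hypothetical, labels are integers in 0..num_classes-1, and every class is assumed to occur in the hold-out set so that each column of C can be normalized.

```python
import numpy as np

def estimate_label_shift(holdout_true, holdout_pred, test_pred, num_classes):
    """Estimate q(y) and the weights q(y)/p(y) from black-box predictions."""
    holdout_true = np.asarray(holdout_true)
    holdout_pred = np.asarray(holdout_pred)
    test_pred = np.asarray(test_pred)

    # C[y', y] = Pr(yhat = y' | y), estimated column by column on the hold-out set.
    C = np.zeros((num_classes, num_classes))
    np.add.at(C, (holdout_pred, holdout_true), 1.0)
    C /= C.sum(axis=0, keepdims=True)  # assumes every class occurs in the hold-out set

    # mu[y'] = Pr(yhat = y') on the unlabeled test set.
    mu = np.bincount(test_pred, minlength=num_classes) / len(test_pred)

    # Solve mu = C q for q rather than forming the inverse explicitly.
    q = np.linalg.solve(C, mu)

    # p(y) from the hold-out labels gives the reweighting q(y) / p(y).
    p = np.bincount(holdout_true, minlength=num_classes) / len(holdout_true)
    return q, q / p
```

With finite samples the solved q can contain small negative entries; clipping at zero and renormalizing is a common practical fix, though the slides do not discuss it.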
Guarantees

• Robust under misspecification


• Even if the estimates ŷ(x) are wrong, calibration is OK:
(same errors on hold-out and test set)
• Confusion matrix and label vector are concentrated:
(use matrix Bernstein inequality)
• Simple algorithm
• Cubic in number of classes, linear in sample size
Black Box Shift Correction on CIFAR10

[Figure panels: "Tweaking one class probability" and "Dirichlet prior over shifts"]


Adversarial data
Adversarial Image Generation (e.g. Sharif et al. 2017)

Digital manipulation to dodge recognition
In real life - via 3D printed glasses
Adversarial Audio Generation (e.g. Carlini & Wagner, 2018)

• Modify data slightly so as to obtain the wrong class

$$\operatorname*{maximize}_{\delta}\; l(f(x + \delta), y) \quad \text{subject to} \quad \|\delta\| \le \epsilon$$

Different norms
Different datasets
Different papers …
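
For intuition, the constrained maximization above has a closed-form solution in one special case. The sketch below assumes a linear model f(x) = w·x, the logistic loss, and the max-norm (L∞) constraint; none of these choices are fixed by the slides, and the function names are hypothetical. For deep networks the same sign-of-gradient step is only an approximation.

```python
import numpy as np

def logistic_loss(w, x, y):
    """l(f(x), y) = log(1 + exp(-y * w.x)) for a label y in {-1, +1}."""
    return np.log1p(np.exp(-y * np.dot(w, x)))

def linf_adversarial_example(w, x, y, eps):
    """Maximize the logistic loss over ||delta||_inf <= eps for a linear model.

    The loss decreases in y * w.x, so the worst case pushes every coordinate
    by eps in the direction that lowers y * w.x: delta = -eps * y * sign(w).
    """
    delta = -eps * y * np.sign(w)
    return x + delta

w = np.array([1.0, -2.0, 0.5])
x = np.array([1.0, 0.0, 1.0])
x_adv = linf_adversarial_example(w, x, y=1, eps=0.1)
print(logistic_loss(w, x, 1), logistic_loss(w, x_adv, 1))
```

The perturbation lowers the margin y·w·x by exactly eps times the L1 norm of w, which is why high-dimensional inputs can be fooled by changes that are tiny per coordinate.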
Why does this work?
‘Unnatural’ data

• Training and ‘natural’ test data live in a small subset
• Adversarial data is slightly off that support
• Function behavior undefined away from where data occurs
Wow. Breathtaking. Is this new?
Spam defenses

• While TRUE
• Mail host extends dataset and trains new classifier
• Spammer’s e-mails are rejected
• Spammer finds a modification that succeeds
• Examples
• Add highly scoring words (or sentences) to email
• Add highly scoring sentences (and vary them)
• Change or forge header (‘Dear Alex, …’)
Invariances

• Tangent Distance (Simard et al., 1995)


• Invariance transforms don’t change the label
• Explore data and their neighborhood
Invariances

• Virtual Support Vectors (Schoelkopf, 1997)


Only change the data at the boundary (not enough RAM)
• Data augmentation for training
• ImageNet (pretty much every paper)
Cropping, scaling, change mean, per channel, …
• Speech Recognition
Background noise, scenes, …
• Document Analysis
Random substrings, word removal, insertion
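
As a rough illustration of the image case, the sketch below applies a random crop and a per-channel mean shift. The function name `augment`, the crop size, and the shift scale of 0.05 are assumptions made for this example, not values from the lecture.

```python
import numpy as np

def augment(image, crop, rng, shift_scale=0.05):
    """Return a randomly cropped copy of an (H, W, C) image with shifted channel means."""
    h, w, c = image.shape
    top = rng.integers(0, h - crop + 1)    # random crop position
    left = rng.integers(0, w - crop + 1)
    patch = image[top:top + crop, left:left + crop, :].astype(np.float64)
    # One random offset per channel, shared by all pixels ("change mean, per channel").
    patch += rng.normal(0.0, shift_scale, size=(1, 1, c))
    return patch

rng = np.random.default_rng(0)
image = np.zeros((32, 32, 3))
print(augment(image, crop=28, rng=rng).shape)
```

Each call yields a different view of the same image with the same label, which is the invariance the slide asks the model to learn.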
Invariant and robust loss

• Convex loss (Teo et al, 2005)


• Family of transformations $\delta \in \Delta$

• Penalty for extreme transformations $1 \ge \eta(\delta) \ge 0$

• Find the ‘worst’ possible example at each step

$$L(x, y, f) = \sup_{\delta \in \Delta} \eta(\delta)\, l(f(x + \delta), y)$$

Adversarially Robust Networks
• $\sup_{\delta \in \Delta}$ finds the worst possible example (e.g. adversarial example generator)
• $\eta(\delta)$ gives a reduced penalty for extreme distortions
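
A minimal sketch of this loss, assuming the family of transformations is a finite list of additive perturbations so that the supremum becomes a maximum. The names `robust_loss`, `hinge`, and the particular choice of η are hypothetical and only illustrate the formula.

```python
import numpy as np

def robust_loss(f, loss, x, y, deltas, eta):
    """L(x, y, f) = max over delta of eta(delta) * loss(f(x + delta), y)."""
    return max(eta(d) * loss(f(x + d), y) for d in deltas)

w = np.array([1.0, 0.0])
f = lambda z: float(w @ z)                      # linear scoring function
hinge = lambda s, y: max(0.0, 1.0 - y * s)      # convex loss
eta = lambda d: 1.0 / (1.0 + float(d @ d))      # in (0, 1], smaller for large distortions

x = np.array([1.0, 0.0])
deltas = [np.array([0.0, 0.0]), np.array([-1.0, 0.0]), np.array([-3.0, 0.0])]
print(robust_loss(f, hinge, x, 1, deltas, eta))
```

Without the penalty the largest distortion would dominate; with η the moderate distortion is the worst case, which matches the idea of discounting extreme transformations.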
Nonstationary Environments

courses.d2l.ai/berkeley-stat-157
Interaction with Environment

• Batch (download a book)


Observe training data $(x_1, y_1), \ldots, (x_l, y_l)$, then deploy
• Online (follow the class)
Observe x, predict f(x), observe y (stock market, homework)
• Active learning (ask questions in class)
Query y for x, improve model, pick new x
• Bandits (do well at homework)
Pick arm, get reward, pick new arm (also with context)
• Reinforcement Learning (play chess, drive a car)
Take action, environment responds, take new action

Batch

[Diagram: training data, build model, test data]
Online

4 8 3 5

System improves as we see more data
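
The observe, predict, observe, update protocol can be sketched with the classic perceptron. This is one possible online learner chosen for brevity, not necessarily the one used in the course; the function name `online_perceptron` is hypothetical and labels are assumed to be in {-1, +1}.

```python
import numpy as np

def online_perceptron(stream, dim):
    """Process (x, y) pairs one at a time and count prediction mistakes."""
    w = np.zeros(dim)
    mistakes = 0
    for x, y in stream:
        score = w @ x            # predict f(x) before the label is revealed
        if y * score <= 0:       # observe y: prediction was wrong or undecided
            w += y * x           # update the model only on mistakes
            mistakes += 1
    return w, mistakes

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 2))
d = X[:, 0] - X[:, 1]
keep = np.abs(d) > 0.2           # keep points with a margin around the boundary
X, y = X[keep], np.sign(d[keep])
w, mistakes = online_perceptron(zip(X, y), dim=2)
print(mistakes, "mistakes on", len(X), "examples")
```

On separable data with a margin the number of mistakes is bounded regardless of the stream length, so the error rate falls as more data arrives.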

Bandits

• Choose an arm (action)


• See what happens (get reward)
• Update model
• Choose next arm (action)
The bandit doesn’t remember what you did last summer.
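
The choose, observe, update loop above can be sketched with an epsilon-greedy strategy. The slides do not prescribe a particular bandit algorithm, so this is an assumed, simple choice; `epsilon_greedy` and the `pull` callback are hypothetical names.

```python
import numpy as np

def epsilon_greedy(pull, num_arms, steps, eps, rng):
    """Play a bandit for `steps` rounds; return pull counts and mean rewards per arm."""
    counts = np.zeros(num_arms, dtype=int)
    values = np.zeros(num_arms)                   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < eps:
            arm = int(rng.integers(num_arms))     # explore: random arm
        else:
            arm = int(np.argmax(values))          # exploit: best arm so far
        reward = pull(arm)                        # see what happens
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # update model
    return counts, values

rewards = [0.1, 0.5, 0.9]                         # fixed reward per arm, for illustration
counts, values = epsilon_greedy(lambda a: rewards[a], 3, 2000, 0.1,
                                np.random.default_rng(0))
print(counts, values)
```

The reward depends only on the chosen arm, not on earlier choices, which is the sense in which the bandit has no memory.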

Stateful Systems

[Figure panels: "no memory" and "memory"]
Reinforcement Learning & Control

• Take action
• Environment reacts
• Observe stuff
• Update model
Repeat
• environment (cooperative, adversary, doesn’t care)
• memory (goldfish, elephant)
• state space (tic tac toe, chess, car)
• past observations (server log, generated during training)
Reinforcement Learning & Control

• Games
• Chess, Go, Backgammon (fully observed)
• Poker, Starcraft, ATARI (partially observed, random)
• Parallelism
• Computation advertising, recommender systems (multiple agents & independent
parallel games)
• Load balancing & scheduling (multiple agents)
• Actions
• Continuous decisions (driving, flying, robots in general, HVAC)
• Discrete (elevator, work allocation)
• Simulations
• MuJoCo style
• Only reality (server center)
Training ≠ Testing

• Generalization performance $p_{\mathrm{emp}}(x, y) \neq p(x, y)$
(the empirical distribution lies)

• Covariate shift $p(x) \neq q(x)$
(the covariate distribution lies)


• Logistic regression $\log(1 + \exp(-y f(x)))$
(tools to fix shift)

• Covariate shift correction $\frac{1}{2}\left(p(x)\,\delta(1, y) + q(x)\,\delta(-1, y)\right)$

• Label shift $p(y) \neq q(y)$
(the label distribution lies)

• Nonstationary Environments
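
The logistic regression and covariate shift correction items in this summary can be combined into a short sketch: label training inputs +1 and test inputs -1, fit a logistic classifier, and read off the weights q(x)/p(x). The helper names, the plain gradient descent fit, and the equal sample sizes are assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Gradient descent on the mean of log(1 + exp(-y * f(x))), f(x) = X @ w + b."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        margin = y * (X @ w + b)
        grad_score = -y / (1.0 + np.exp(margin))   # d loss / d f(x)
        w -= lr * (X.T @ grad_score) / len(y)
        b -= lr * grad_score.mean()
    return w, b

def covariate_shift_weights(X_train, X_test):
    """Weights q(x)/p(x) for the training points, via a train-vs-test classifier."""
    X = np.vstack([X_train, X_test])
    y = np.concatenate([np.ones(len(X_train)), -np.ones(len(X_test))])
    w, b = fit_logistic(X, y)
    # With equal sample sizes f(x) estimates log p(x)/q(x), so q/p = exp(-f(x)).
    return np.exp(-(X_train @ w + b))

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(2000, 1))   # samples from p(x)
X_test = rng.normal(1.0, 1.0, size=(2000, 1))    # samples from q(x)
beta = covariate_shift_weights(X_train, X_test)
print(beta.mean())
```

The resulting weights multiply the per-example training loss; for these two unit-variance Gaussians the true log-ratio is x - 0.5, so the fitted log-weights should be roughly linear in x with slope near one.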
