Lec 2
CS W182/282A
Instructor: Sergey Levine
UC Berkeley
How do we formulate learning problems?
Different types of learning problems
supervised learning: [object] -> [label]
unsupervised learning: unlabeled data -> representation
reinforcement learning
Supervised learning
Given:
[object label]
Questions to answer:
Unsupervised learning
unlabeled data -> representation (what does that mean?)
generative modeling: GANs, VAEs, pixel RNN, etc.
self-supervised representation learning
Reinforcement learning
Actions: muscle contractions; Observations: sight, smell; Rewards: food
Actions: motor current or torque; Observations: camera images; Rewards: task success measure (e.g., running speed)
Actions: what to purchase; Observations: inventory levels; Rewards: profit
Reinforcement learning
But many other application areas too!
➢ Education (recommend which topic to study next)
➢ YouTube recommendations!
➢ Ad placement
➢ Healthcare (recommending treatments)
Haarnoja et al., 2019
Let’s start with supervised learning…
Supervised learning
Given:
[object label]
The overwhelming majority of machine learning that is used in industry is supervised learning
label:  0    1    2    3    4    5    6    7    8    9
9?      4%   0%   0%   0%   11%  0%   4%   0%   6%   75%
4?      5%   0%   0%   0%   50%  0%   3%   0%   2%   40%
Given:
Conditional probabilities
random variable representing the input
why is it a random variable?
chain rule: p(x, y) = p(x) p(y | x), for input x and label y
definition of conditional probability: p(y | x) = p(x, y) / p(x)
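The chain rule and the definition of conditional probability can be checked numerically on a tiny joint distribution. The sketch below is only an illustration: the input names, label names, and probabilities are made up, and the helper functions `marginal_x` and `conditional` are hypothetical names, not part of the lecture.

```python
# Hypothetical joint distribution p(x, y) over two inputs and two labels.
# The numbers are made up for illustration; they only need to sum to 1.
joint = {
    ("img_a", "cat"): 0.30,
    ("img_a", "dog"): 0.10,
    ("img_b", "cat"): 0.15,
    ("img_b", "dog"): 0.45,
}


def marginal_x(joint, x):
    """p(x) = sum over y of p(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional(joint, y, x):
    """p(y | x) = p(x, y) / p(x), the definition of conditional probability."""
    return joint[(x, y)] / marginal_x(joint, x)


p_x = marginal_x(joint, "img_a")                   # 0.30 + 0.10
p_y_given_x = conditional(joint, "cat", "img_a")   # 0.30 / 0.40

# Chain rule: p(x, y) = p(x) * p(y | x) recovers the joint entry.
reconstructed = p_x * p_y_given_x
```

Multiplying the marginal by the conditional gives back the joint entry for ("img_a", "cat"), which is the chain rule read in the other direction.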
How do we represent it?
computer program: [object] -> [label]
computer program: [object] -> [probability of each label]
label:        0    1    2    3    4    5    6    7    8    9
probability:  0%   0%   0%   0%   0%   90%  8%   0%   2%   0%
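softmax: p(y = i | x) = exp(f_i(x)) / sum_j exp(f_j(x)), where f_i(x) is the program's raw score for label i
exp(f_i(x)): makes it positive
sum_j exp(f_j(x)): the normalizer, makes it sum to 1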
Why is it called a softmax?
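A minimal sketch of the softmax in plain Python may help here. It assumes the program produces one real-valued score per label; the example scores are made up. Exponentiating makes every entry positive, and dividing by the normalizer makes the entries sum to 1. Scaling the scores up pushes the output toward a one-hot vector on the largest score, which is one way to read the name: a soft version of picking the max.

```python
import math


def softmax(scores):
    """Map real-valued scores to probabilities.

    exp makes every entry positive; dividing by the sum (the normalizer)
    makes the entries sum to 1.
    """
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    normalizer = sum(exps)
    return [e / normalizer for e in exps]


scores = [1.0, 2.0, 5.0]                    # hypothetical scores, one per label
probs = softmax(scores)
sharp = softmax([10 * s for s in scores])   # larger scores: closer to a hard max
```

With the scores multiplied by 10, almost all of the probability mass lands on the last label, while the unscaled version still leaves some mass on the others.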
Loss functions
So far…
computer program: [object] -> [probability]
2. Define your loss function: how to measure if one model in your model class is better than another?
x ~ p(x): probability distribution over photos
y ~ p(y | x): conditional probability distribution over labels
How is the dataset “generated”?
Training set:
In general:
Examples:
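As a hedged illustration of what a loss function can look like for a model that outputs label probabilities, the sketch below computes two common choices over a tiny made-up training set: the average negative log-likelihood (also called cross-entropy) and the zero-one loss. Both the choice of losses and all numbers are assumptions made for this example, and the function names are hypothetical.

```python
import math


def negative_log_likelihood(probs, label):
    """Loss for one example: -log of the probability the model gave the true label."""
    return -math.log(probs[label])


def zero_one_loss(probs, label):
    """Loss for one example: 0 if the most probable label is correct, else 1."""
    predicted = max(range(len(probs)), key=lambda i: probs[i])
    return 0 if predicted == label else 1


def average_loss(loss_fn, predictions, labels):
    """Average a per-example loss over a dataset."""
    return sum(loss_fn(p, y) for p, y in zip(predictions, labels)) / len(labels)


# Made-up predicted label distributions (3 classes) from two models, and true labels.
labels = [0, 2, 1]
model_a = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.2, 0.6, 0.2]]
model_b = [[0.4, 0.3, 0.3], [0.3, 0.3, 0.4], [0.3, 0.4, 0.3]]

nll_a = average_loss(negative_log_likelihood, model_a, labels)
nll_b = average_loss(negative_log_likelihood, model_b, labels)
err_a = average_loss(zero_one_loss, model_a, labels)
err_b = average_loss(zero_one_loss, model_b, labels)
```

Both models pick the correct label on every example, so the zero-one loss cannot tell them apart. The negative log-likelihood is lower for `model_a`, which puts more probability on the correct labels.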
Optimization
The machine learning method for solving any problem ever
1. Define your model class
in general:
for each dimension, go in the direction opposite the slope along that dimension
etc.
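The per-dimension rule above can be sketched as a short loop. This is a minimal illustration, not a tuned optimizer: the quadratic loss, its hand-written gradient, the learning rate, and the step count are all assumptions chosen so the example converges quickly.

```python
def gradient_descent(grad_fn, theta, learning_rate=0.1, num_steps=100):
    """Repeatedly step each parameter opposite the slope along its dimension."""
    theta = list(theta)
    for _ in range(num_steps):
        grad = grad_fn(theta)
        # One update per dimension: theta_i <- theta_i - learning_rate * dL/dtheta_i
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical loss for illustration: L(theta) = (theta_0 - 3)^2 + 2 * (theta_1 + 1)^2
def loss(theta):
    return (theta[0] - 3.0) ** 2 + 2.0 * (theta[1] + 1.0) ** 2


def grad_loss(theta):
    # Partial derivatives of the loss above, one per dimension.
    return [2.0 * (theta[0] - 3.0), 4.0 * (theta[1] + 1.0)]


theta_star = gradient_descent(grad_loss, theta=[0.0, 0.0])
```

Starting from the origin, the iterates approach the minimizer (3, -1) of this particular loss. A learning rate that is too large for the steepest dimension would make the same loop diverge instead.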
Gradient descent
matrix
Special case: binary classification
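One way to read this special case, assuming it refers to a softmax model with only two labels, is that the two-class softmax reduces to a logistic (sigmoid) function of the difference between the two scores. The sketch below checks that identity numerically; the scores are made up and the function names are hypothetical.

```python
import math


def softmax2(f0, f1):
    """Two-class softmax: returns (p(y=0 | x), p(y=1 | x))."""
    m = max(f0, f1)
    e0, e1 = math.exp(f0 - m), math.exp(f1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


f0, f1 = 0.5, 2.0                    # hypothetical scores for the two classes
p0, p1 = softmax2(f0, f1)
p1_via_sigmoid = sigmoid(f1 - f0)    # only the score difference matters
```

Because only the difference f1 - f0 enters the result, a binary classifier needs just one score rather than two.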
Overfitting: when the empirical risk is low, but the true risk is high
can happen if the dataset is too small
can happen if the model is too powerful (has too many parameters/capacity)
Underfitting: when the empirical risk is high, and the true risk is high
can happen if the model is too weak (has too few parameters/capacity)
can happen if your optimizer is not configured well (e.g., wrong learning rate)
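The gap between empirical risk and true risk can be seen in a small simulation. The sketch below is an illustration under stated assumptions: the data-generating process (a threshold at 0.5 with 20% label noise) is made up, a nearest-neighbor rule stands in for a model with too much capacity, and a large fresh sample stands in for the true risk, which cannot be computed exactly.

```python
import random

random.seed(0)


def sample(n, noise=0.2):
    """Draw n (x, y) pairs: y = 1 if x > 0.5, with the label flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = random.random()
        y = int(x > 0.5)
        if random.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def nearest_neighbor_predict(train, x):
    """A very high-capacity model: copy the label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def error_rate(predict, data):
    """Fraction of examples where the prediction differs from the label."""
    return sum(predict(x) != y for x, y in data) / len(data)


train = sample(20)         # small training set
held_out = sample(5000)    # large fresh sample, a stand-in for the true risk


def model(x):
    return nearest_neighbor_predict(train, x)


empirical_risk = error_rate(model, train)
true_risk_estimate = error_rate(model, held_out)
```

The model memorizes the 20 training points, noise included, so its empirical risk is zero while its error on fresh data stays well above zero. This matches the overfitting pattern described above: a small dataset combined with a model that has enough capacity to fit every training label.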
This is very important, and we will discuss this in much more detail later!
Summary
1. Define your model class