
3/1/16

Neural Networks

David Kauchak
CS30
Spring 2016
http://xkcd.com/894/

Machine Learning is…

Machine learning is programming computers to optimize a performance criterion using example data or past experience.
-- Ethem Alpaydin

Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data.

The goal of machine learning is to develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest.
-- Kevin P. Murphy

The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions.
-- Christopher M. Bishop


Machine Learning is…

Machine learning is about predicting the future based on the past.
-- Hal Daume III

[Diagram: past vs. future. Training Data is used to learn a model/predictor; that model/predictor is then used to predict on Testing Data.]

Machine Learning, aka…

data mining: machine learning applied to "databases", i.e. collections of data examples

inference and/or estimation in statistics

pattern recognition in engineering

signal processing in electrical engineering

induction

optimization


Data

[Diagram: a data set is a collection of examples.]

Supervised learning

[Diagram: each example comes paired with a label (label1, label3, label4, label5, …): labeled examples.]

Supervised learning: given labeled examples


Supervised learning

[Diagram: labeled examples are used to train a model/predictor.]

Supervised learning: given labeled examples

[Diagram: the trained model/predictor outputs a predicted label for a new example.]

Supervised learning: learn to predict the label of a new example

Supervised learning: classification

Classification: the label comes from a finite set of labels (e.g. apple, apple, banana, banana).

Supervised learning: given labeled examples

Classification Example

Differentiate between low-risk and high-risk customers from their income and savings.


Classification Applications

Face recognition

Character recognition

Spam detection

Medical diagnosis: from symptoms to illnesses

Biometrics: recognition/authentication using physical and/or behavioral characteristics: face, iris, signature, etc.

...

Supervised learning: regression

Regression: the label is real-valued (e.g. -4.5, 10.1, 3.2, 4.3).

Supervised learning: given labeled examples

Regression Example

Price of a used car:
x: car attributes (e.g. mileage)
y: price
y = wx + w0

Regression Applications

Economics/Finance: predict the value of a stock

Epidemiology

Car/plane navigation: angle of the steering wheel, acceleration, …

Temporal trends: weather over time
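The y = wx + w0 model can be fit directly with ordinary least squares. A minimal sketch in Python; the mileage/price numbers are made up purely for illustration:

```python
def fit_line(xs, ys):
    """Fit y = w*x + w0 by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # w = covariance(x, y) / variance(x)
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    w0 = mean_y - w * mean_x
    return w, w0

# Hypothetical used-car data: x = mileage (thousands), y = price ($1000s)
mileage = [10, 30, 50, 70, 90]
price = [18, 15, 12, 9, 6]
w, w0 = fit_line(mileage, price)
print(w, w0)  # price drops as mileage grows, so w is negative
```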


Unsupervised learning

Learn clusters/groups without any labels.

Unsupervised learning: given data, i.e. examples, but no labels

Unsupervised learning applications

customer segmentation (i.e. grouping)

image compression

bioinformatics: learn motifs

Reinforcement learning

left, right, straight, left, left, left, straight -> GOOD
left, straight, straight, left, right, straight, straight -> BAD

left, right, straight, left, left, left, straight -> 18.5
left, straight, straight, left, right, straight, straight -> -3

Given a sequence of examples/states and a reward after completing that sequence, learn to predict the action to take for an individual example/state.

Reinforcement learning example

Backgammon: a sequence of moves ends in … WIN! or … LOSE!

Given sequences of moves and whether or not the player won at the end, learn to make good moves.


Reinforcement learning example

http://www.youtube.com/watch?v=VCdxqn0fcnE

Other learning variations

What data is available:
- supervised, unsupervised, reinforcement learning
- semi-supervised, active learning, …

How are we getting the data:
- online vs. offline learning

Type of model:
- generative vs. discriminative
- parametric vs. non-parametric

Neural Networks

Neural networks try to mimic the structure and function of our nervous system.

People like biologically motivated approaches.

What do you know?

Our Nervous System

[Diagram of a neuron: dendrites, axon, synapses. In the artificial analogy, neurons become nodes and synapses become weights; some connections are excitatory (+) and some inhibitory (-).]


Our nervous system: the computer science view

the human brain is a large collection of interconnected neurons

a NEURON is a brain cell
- they collect, process, and disseminate electrical signals
- they are connected via synapses
- they FIRE depending on the conditions of the neighboring neurons

The human brain

- contains ~10^11 (100 billion) neurons
- each neuron is connected to ~10^4 (10,000) other neurons
- neurons can fire as fast as 10^-3 seconds

How does this compare to a computer?

Man vs. Machine

Brain: 10^11 neurons, 10^14 synapses, 10^-3 s "cycle" time
Computer: 10^10 transistors, 10^11 bits of RAM/memory, 10^13 bits on disk, 10^-9 s cycle time

Brains are still pretty fast

Who is this? [image of a face]


Brains are still pretty fast

If you were me, you'd be able to identify this person in 10^-1 (1/10) s!

Given a neuron firing time of 10^-3 s, how many neurons in sequence could fire in this time? A few hundred.

What are possible explanations?
- either neurons are performing some very complicated computations
- or the brain is taking advantage of massive parallelization (remember, neurons are connected to ~10,000 other neurons)

Artificial Neural Networks

Node (neuron) and Edge (synapse). Our approximation:

Node A (neuron) --weight w--> Node B (neuron)

w is the strength of the signal sent between A and B.

If A fires and w is positive, then A stimulates B.

If A fires and w is negative, then A inhibits B.

A given neuron has many, many connecting input neurons.

If a neuron is stimulated enough, then it also fires. How much stimulation is required is determined by its threshold.


A Single Neuron/Perceptron

Inputs x1, x2, x3, x4 arrive with weights w1, w2, w3, w4; each input contributes xi * wi. The neuron sums the contributions,

in = sum_i wi * xi

and passes the sum through a threshold function g to produce the output y = g(in).

Possible threshold functions

hard threshold:
g(x) = 1 if x >= threshold, 0 otherwise

sigmoid:
g(x) = 1 / (1 + e^(-a*x))
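Both threshold functions are a couple of lines of Python. This is a sketch of the definitions above; defaulting the sigmoid's steepness parameter a to 1 is an assumption here:

```python
import math

def hard_threshold(x, threshold):
    # 1 if the summed input reaches the threshold, otherwise 0
    return 1 if x >= threshold else 0

def sigmoid(x, a=1.0):
    # smooth version of the hard threshold; larger a makes it steeper
    return 1 / (1 + math.exp(-a * x))

print(hard_threshold(0.5, 1))  # 0
print(hard_threshold(1.5, 1))  # 1
print(sigmoid(0.0))            # 0.5
```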

A Single Neuron/Perceptron

Example 1: inputs (1, 1, 0, 1), weights (1, -1, 1, 0.5), threshold of 1. What is the output?

1*1 + 1*(-1) + 0*1 + 1*0.5 = 0.5

The weighted sum is 0.5, which is not larger than the threshold, so the output is 0.

Example 2: inputs (1, 0, 0, 1), same weights, threshold of 1. What is the output?

1*1 + 0*(-1) + 0*1 + 1*0.5 = 1.5

The weighted sum is 1.5, which is larger than the threshold, so the output is 1.

Neural network

A neural network is built from individual perceptrons/neurons, with a set of inputs feeding into the network.
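The two worked perceptron examples above can be checked with a few lines of Python (a sketch of the single neuron, not any particular library's API):

```python
def perceptron(inputs, weights, threshold):
    # weighted sum of the inputs, then a hard threshold
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

weights = [1, -1, 1, 0.5]
print(perceptron([1, 1, 0, 1], weights, threshold=1))  # sum 0.5 -> 0
print(perceptron([1, 0, 0, 1], weights, threshold=1))  # sum 1.5 -> 1
```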


Neural network

Computation proceeds through the network:

1. some inputs are provided/entered
2. each perceptron computes and calculates an answer
3. those answers become inputs for the next level
4. we finally get the answer after all levels compute
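The level-by-level computation above can be sketched as a loop over layers, where each layer's outputs become the next layer's inputs. The weights and thresholds below are arbitrary placeholders, not taken from the slides:

```python
def perceptron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def feed_forward(inputs, layers):
    # each layer is a list of (weights, threshold) units;
    # one layer's outputs are the next layer's inputs
    values = inputs
    for layer in layers:
        values = [perceptron(values, w, t) for (w, t) in layer]
    return values

layers = [
    [([0.5, 0.5], 1), ([1, -1], 0.5)],  # hidden layer (placeholder weights)
    [([1, 1], 2)],                      # output layer
]
print(feed_forward([1, 1], layers))  # [0]
```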


Activation spread

http://www.youtube.com/watch?v=Yq7d4ROvZ6I

Neural networks

Different kinds/characteristics of networks. How are these different?

Feed forward networks: inputs flow in one direction, through one or more hidden units/layers, to the outputs.

Recurrent network: output is fed back to the input. Can support memory! How?


History of Neural Networks

McCulloch and Pitts (1943) – introduced a model of artificial neurons and suggested they could learn

Hebb (1949) – simple updating rule for learning

Rosenblatt (1962) – the perceptron model

Minsky and Papert (1969) – wrote Perceptrons

Bryson and Ho (1969, but largely ignored until the 1980s) – invented back-propagation learning for multilayer networks

Training the perceptron

First wave in neural networks in the 1960s.

A single neuron is trainable: its threshold and input weights can be modified.

If the neuron doesn't give the desired output, then it has made a mistake.

Input weights and threshold can be changed according to a learning algorithm.

Examples - Logical operators

AND – if all inputs are 1, return 1, otherwise return 0

OR – if at least one input is 1, return 1, otherwise return 0

NOT – return the opposite of the input

XOR – if exactly one input is 1, then return 1, otherwise return 0

AND

x1  x2  x1 and x2
0   0   0
0   1   0
1   0   0
1   1   1


AND (two inputs)

x1  x2  x1 and x2
0   0   0
0   1   0
1   0   0
1   1   1

Inputs are either 0 or 1. Output is 1 only if all inputs are 1. What weights W1, W2 and threshold T work?

One solution: W1 = 1, W2 = 1, T = 2.

AND (four inputs)

Output is 1 only if all inputs are 1. What weights and threshold work?

One solution: W1 = W2 = W3 = W4 = 1, T = 4.
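These AND units can be verified directly. A sketch in Python, using all-ones weights and a threshold equal to the number of inputs, as on the slides:

```python
def perceptron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def AND(*xs):
    # weights all 1, threshold = number of inputs: fires only when every input is 1
    return perceptron(xs, [1] * len(xs), threshold=len(xs))

print([AND(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 0, 0, 1]
print(AND(1, 1, 0, 1))                              # 0
print(AND(1, 1, 1, 1))                              # 1
```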


OR

x1  x2  x1 or x2
0   0   0
0   1   1
1   0   1
1   1   1

Inputs are either 0 or 1. Output is 1 if at least 1 input is 1. What weights W1, W2 and threshold T work?

One solution (two inputs): W1 = 1, W2 = 1, T = 1.

OR (four inputs): what weights W1, W2, W3, W4 and threshold T work?


OR (four inputs)

One solution: W1 = W2 = W3 = W4 = 1, T = 1. Output is 1 if at least 1 input is 1. Inputs are either 0 or 1.

NOT

x1  not x1
0   1
1   0

Input is either 0 or 1. What weight W1 and threshold T work?

One solution: W1 = -1, T = 0. If input is 1, output is 0. If input is 0, output is 1.
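OR and NOT follow the same pattern. A sketch, using the weights and thresholds from the slides:

```python
def perceptron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def OR(*xs):
    # weights all 1, threshold 1: fires if at least one input is 1
    return perceptron(xs, [1] * len(xs), threshold=1)

def NOT(x):
    # weight -1, threshold 0: input 0 gives sum 0 >= 0 -> 1; input 1 gives -1 -> 0
    return perceptron([x], [-1], threshold=0)

print([OR(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 1]
print(NOT(0), NOT(1))                              # 1 0
```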


How about…

What weights and threshold compute this function?

x1  x2  x3  output
0   0   0   1
0   1   0   0
1   0   0   1
1   1   0   0
0   0   1   1
0   1   1   1
1   0   1   1
1   1   1   0

Training neural networks

Learn the individual weights between nodes.

Learn individual node parameters (e.g. threshold).

Positive or negative?

[A sequence of image quiz slides; the answers, in order: NEGATIVE, NEGATIVE, POSITIVE, NEGATIVE, POSITIVE, POSITIVE, NEGATIVE, POSITIVE]

A method to the madness

blue = positive

yellow triangles = positive

all others negative

How did you figure this out (or some of it)?

Training neural networks

Given a truth table (as in "How about…") and a perceptron with unknown weights w1, w2, w3 and threshold T:

1. start with some initial weights and thresholds
2. show examples repeatedly to the NN
3. update weights/thresholds by comparing NN output to the actual output


Perceptron learning algorithm

repeat until you get all examples right:
    for each "training" example:
        calculate the current prediction on the example
        if wrong:
            update the weights and threshold towards getting this example correct

Perceptron learning

Example: inputs (1, 1, 0, 1), weights (1, -1, 1, 0.5), threshold of 1. The weighted sum is 0.5, which is not equal to or larger than the threshold, so the predicted output is 0, but the actual output is 1.

What could we adjust to make it right?

Perceptron learning

Same example: predicted 0, actual 1.

The weight on the input that is 0 doesn't matter, so don't change it.

We could increase any of the weights on the inputs that are 1.


Perceptron learning

We could also decrease the threshold.

There are a few missing details, but not much more than this.

The algorithm keeps adjusting the weights as long as it makes mistakes.

If the training data is linearly separable, the perceptron learning algorithm is guaranteed to converge to the "correct" solution (where it gets all examples right).
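The algorithm above can be sketched in Python. The exact update rule (add or subtract a small learning rate on a mistake) is one common concrete choice and an assumption here; the slides only say the weights and threshold move towards getting the example correct:

```python
def predict(inputs, weights, threshold):
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= threshold else 0

def train(examples, n_inputs, rate=0.1):
    # perceptron learning: repeat until every example is classified correctly
    weights, threshold = [0.0] * n_inputs, 0.0
    while True:
        mistakes = 0
        for inputs, label in examples:
            error = label - predict(inputs, weights, threshold)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                for i, x in enumerate(inputs):
                    weights[i] += rate * error * x   # strengthen/weaken active inputs
                threshold -= rate * error            # lower threshold if we under-fired
        if mistakes == 0:
            return weights, threshold

# OR is linearly separable, so training is guaranteed to converge:
or_examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, t = train(or_examples, n_inputs=2)
print(all(predict(x, w, t) == y for x, y in or_examples))  # True
```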

Linearly Separable

A data set is linearly separable if you can separate one example type from the other.

Which of these are linearly separable?

x1  x2  x1 and x2      x1  x2  x1 or x2      x1  x2  x1 xor x2
0   0   0              0   0   0             0   0   0
0   1   0              0   1   1             0   1   1
1   0   0              1   0   1             1   0   1
1   1   1              1   1   1             1   1   0

[Plots of each function's positive and negative points in the (x1, x2) plane.]


Perceptrons

1969 book by Marvin Minsky and Seymour Papert.

The problem is that perceptrons can only work for classification problems that are linearly separable.

Insufficiently expressive.

"Important research problem" to investigate multilayer networks, although they were pessimistic about their value.

XOR

x1  x2  x1 xor x2
0   0   0
0   1   1
1   0   1
1   1   0

Can a two-layer network, with weights and thresholds still to be determined (all marked ?), compute Output = x1 xor x2?

XOR

One solution, with every threshold T = 1:

hidden unit h1: weights (1, -1) on (x1, x2)  (fires for x1 and not x2)
hidden unit h2: weights (-1, 1) on (x1, x2)  (fires for x2 and not x1)
output unit: weights (1, 1) on (h1, h2)  (fires if either hidden unit fires)

Output = x1 xor x2

x1  x2  x1 xor x2
0   0   0
0   1   1
1   0   1
1   1   0
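A two-layer XOR solution can be checked directly. A sketch assuming the wiring above, where the hidden units detect "x1 and not x2" and "x2 and not x1" and the output unit ORs them:

```python
def unit(inputs, weights, threshold):
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= threshold else 0

def xor(x1, x2):
    h1 = unit([x1, x2], [1, -1], threshold=1)   # x1 and not x2
    h2 = unit([x1, x2], [-1, 1], threshold=1)   # x2 and not x1
    return unit([h1, h2], [1, 1], threshold=1)  # h1 or h2

print([xor(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```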
