CSE472, CSE386
Artificial Intelligence
  Neural Networks
Prof. Mahmoud Khalil
    Summer 2024
                          1
 Artificial Neural Networks
Artificial neural networks are inspired by brains and neurons
A neural network is a graph with nodes, or units, connected by links
Each link has an associated weight, a real number
Typically, each node i outputs a real number, which is fed as input to the nodes connected to i
The output of a node is a function of the weighted sum of the node’s inputs
                                                                                             2
Basic Concepts
• A neural network maps a set of inputs to a set of outputs.
• The number of inputs/outputs is variable.
• The network itself is composed of an arbitrary number of nodes, or units,
  connected by links, with an arbitrary topology.
• A link from unit i to unit j serves to propagate the activation ai from i to j,
  and it has a weight Wij.
• What can a neural network do?
     Compute a known function / approximate an unknown function
     Pattern recognition / signal processing
     Learn to do any of the above
[Figure: a "Neural Network" box mapping Input 0 … Input n to Output 0 … Output m]
                                                                                                     3
Fully Connected NN
[Figure: a fully connected network, with different types of nodes]
                                      4
Artificial Neuron, Node, or Unit: A Mathematical Abstraction
An artificial neuron (node or unit) is a processing unit i with:
• Input edges, each with a weight (positive or negative; the weights can change over time through learning).
• An input function: the weighted sum of its inputs,
       in_i = Σ_{j=0}^{n} W_{j,i} a_j
• An activation function g, typically non-linear, applied to the input function:
       a_i = g(in_i) = g( Σ_{j=0}^{n} W_{j,i} a_j )
• Output edges, each with a weight (positive or negative; they too can change over time through learning).
In short: a processing element producing an output based on a function of its inputs
                                                                                                                    5
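To make this computation concrete, here is a minimal Python sketch of a single unit (the helper name `unit_output` and the example numbers are illustrative, not from the slides), assuming a sigmoid is chosen for g:

```python
import math

def unit_output(weights, activations, g):
    """One unit i: weighted-sum input function in_i, then activation g(in_i)."""
    in_i = sum(w * a for w, a in zip(weights, activations))  # in_i = sum_j W_{j,i} * a_j
    return g(in_i)

def sigmoid(x):
    # One common non-linear choice for the activation function g.
    return 1.0 / (1.0 + math.exp(-x))

# Example (made-up numbers): a_0 = -1 is a bias input, a_1 and a_2 are real inputs.
a = [-1.0, 0.5, 0.8]      # a_0, a_1, a_2
W = [0.2, 1.0, -0.4]      # W_{0,i}, W_{1,i}, W_{2,i}
print(unit_output(W, a, sigmoid))
```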
Activation Functions
[Figure: plots of the step function, the sign function, and the sigmoid function]
                                                          6
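For reference, the three activation functions named above can be written out directly; a minimal Python sketch (thresholding at 0 is assumed here, which is the usual convention):

```python
import math

def step(x):     # step function: 1 if x >= 0, else 0
    return 1 if x >= 0 else 0

def sign(x):     # sign function: +1 if x >= 0, else -1
    return 1 if x >= 0 else -1

def sigmoid(x):  # smooth, differentiable squashing function with values in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 2.0):
    print(x, step(x), sign(x), round(sigmoid(x), 3))
```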
Normalizing Unit Thresholds
 If t is the threshold value of the output unit, then
      step_t( Σ_{j=1}^{n} Wj·Ij ) = step_0( Σ_{j=0}^{n} Wj·Ij ),   where W0 = t and I0 = −1
 - We can always assume that the unit's threshold is 0. This allows thresholds to be
   learned like any other weight.
 - We can even allow output values in [0, 1] by replacing step_0 with the sigmoid function.
                                                                                          7
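A small numerical check of this trick, as a sketch: folding the threshold t into an extra weight W0 = t on a constant input I0 = −1 leaves the unit's output unchanged. The numbers below are made up for illustration, and the step function is assumed to fire on "greater than or equal":

```python
def step(x, t=0.0):
    return 1 if x >= t else 0

# Original unit: threshold t applied to the weighted sum of the "real" inputs.
W, I, t = [0.6, 0.9], [1.0, 1.0], 1.2
out_thresholded = step(W[0] * I[0] + W[1] * I[1], t)

# Normalized unit: threshold 0, extra bias input I0 = -1 with weight W0 = t.
W0, I0 = t, -1.0
out_normalized = step(W0 * I0 + W[0] * I[0] + W[1] * I[1], 0.0)

print(out_thresholded, out_normalized)   # identical by construction
```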
Units as Logic Gates
Units with a threshold activation function can act as logic gates;
we can use these units to compute Boolean functions of their inputs.
A threshold unit is activated (outputs 1) when:
      Σ_{j=1}^{n} W_{j,i} a_j  ≥  W_{0,i}
                                                                     8
AND
      x1    x2    output
      0     0       0
      0     1       0
      1     0       0
      1     1       1

Threshold unit: w1 = 1, w2 = 1, bias weight W0 = 1.5 (on bias input −1), inputs x1, x2.
A threshold unit is activated (outputs 1) when:
      Σ_{j=1}^{n} W_{j,i} a_j  ≥  W_{0,i}
                                                                 9
OR
      x1    x2    output
      0     0       0
      0     1       1
      1     0       1
      1     1       1

Threshold unit: w1 = 1, w2 = 1, bias weight w0 = 0.5 (on bias input −1), inputs x1, x2.
A threshold unit is activated (outputs 1) when:
      Σ_{j=1}^{n} W_{j,i} a_j  ≥  W_{0,i}
                                                                 10
NOT
      x1    output
      0       1
      1       0

Threshold unit: w1 = −1, bias weight w0 = −0.5 (on bias input −1), input x1.
A threshold unit is activated (outputs 1) when:
      Σ_{j=1}^{n} W_{j,i} a_j  ≥  W_{0,i}
So, units with a threshold activation function can act as logic gates
given the appropriate input and bias weights (see the sketch below).
                                                                                  11
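Putting the last three slides together, here is a minimal sketch of AND, OR, and NOT as threshold units. The AND and OR weights are taken from the slides; the NOT weights (w1 = −1, w0 = −0.5) are assumed standard values, since the slide's figure is truncated:

```python
def threshold_unit(weights, inputs, w0):
    """Fires (outputs 1) when the weighted sum of the inputs reaches the bias weight w0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= w0 else 0

AND = lambda x1, x2: threshold_unit([1, 1], [x1, x2], 1.5)   # weights from the AND slide
OR  = lambda x1, x2: threshold_unit([1, 1], [x1, x2], 0.5)   # weights from the OR slide
NOT = lambda x1:     threshold_unit([-1],   [x1],    -0.5)   # assumed values for NOT

for x1 in (0, 1):
    for x2 in (0, 1):
        print("AND/OR", x1, x2, "->", AND(x1, x2), OR(x1, x2))
    print("NOT", x1, "->", NOT(x1))
```

Each gate is the same threshold unit; only the weights and the bias weight differ, which is exactly the point of the slide.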
Network Structures
• Feed-forward networks
   • Activation flows from input layer to output layer
   • single-layer perceptrons
   • multi-layer perceptrons
   • Feed-forward networks implement functions,
     have no internal state (only weights).
• Recurrent networks
   • Feed the outputs back into the network's own inputs
       The network is a dynamical system
       (stable states, oscillations, chaotic behavior)
       The response of the network depends on its initial state
   • Can support short-term memory
   • More difficult to understand
                                                           12
    Feed-Forward Network
[Figure: a feed-forward network with two input units, two hidden units, and one output unit; the bias units are omitted for simplicity]
Each unit receives input only from units in the immediately preceding layer.
Given an input vector x = (x1, x2), the activations of the input units are set to the values of the
input vector, i.e., (a1, a2) = (x1, x2), and the network computes the output unit's activation from
the hidden activations, which are themselves functions of the inputs. For example, with input units
1 and 2, hidden units 3 and 4, and output unit 5:
      a5 = g( W3,5·a3 + W4,5·a4 ) = g( W3,5·g(W1,3·a1 + W2,3·a2) + W4,5·g(W1,4·a1 + W2,4·a2) )
By adjusting the weights we get different functions:
that is how learning is done in neural networks! (A small sketch of this computation follows below.)
                                                                                                                  13
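A minimal sketch of this 2-2-1 computation, assuming a sigmoid activation g and arbitrary illustrative weights (the unit numbering 1 to 5 follows the description above):

```python
import math

def g(x):                      # sigmoid activation, one common choice
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(x1, x2, W):
    """2-2-1 feed-forward network (bias units omitted for simplicity).
    W is a dict of link weights, e.g. W[(1, 3)] is the weight on the link 1 -> 3."""
    a1, a2 = x1, x2                                    # input units pass the inputs through
    a3 = g(W[(1, 3)] * a1 + W[(2, 3)] * a2)            # hidden unit 3
    a4 = g(W[(1, 4)] * a1 + W[(2, 4)] * a2)            # hidden unit 4
    a5 = g(W[(3, 5)] * a3 + W[(4, 5)] * a4)            # output unit 5
    return a5

# Arbitrary example weights; changing them changes the function the network computes.
W = {(1, 3): 0.5, (2, 3): -1.0, (1, 4): 1.5, (2, 4): 0.3, (3, 5): 2.0, (4, 5): -1.0}
print(feed_forward(1.0, 0.0, W))
```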
Single Layer Feed-Forward Networks (perceptron)
  • Single-layer neural network (perceptron network):
     A network with all the inputs connected directly to the outputs.
          – Output units all operate separately: no shared weights.
     Since each output unit is independent of the others,
     we can limit our study to single-output perceptrons.
                                                                                   14
Perceptron Learning Intuition
  • Weight update
  • Inputs Ij (j = 1, 2, …, n)
  • Single output O; target output T.
  • Consider some initial weights.
  • Define the example error: Err = T − O
  • Now just move the weights in the right direction!
  • If the error is positive, then we need to increase O:
           Err > 0  ⇒  need to increase O;
  • If the error is negative, then we need to decrease O:
           Err < 0  ⇒  need to decrease O;
  • Each input unit j contributes Wj·Ij to the total input:
           if Ij is positive, increasing Wj tends to increase O;
           if Ij is negative, increasing Wj tends to decrease O;
  • So we use the update rule   Wj ← Wj + α · Ij · Err,   where α is the learning rate (see the sketch below).
  [Figure: a perceptron with inputs Ij, weights Wj, and output O]
                                                                                               15
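A one-step sketch of this update rule (the function name, the example numbers, and the choice α = 0.1 are illustrative, not from the slides):

```python
def perceptron_update(W, I, target, alpha=0.1):
    """One perceptron weight update: W_j <- W_j + alpha * I_j * Err, with Err = T - O."""
    O = 1 if sum(w * i for w, i in zip(W, I)) > 0 else 0   # current perceptron output
    err = target - O                                       # > 0: O too small, < 0: O too large
    return [w + alpha * i * err for w, i in zip(W, I)]

# Example: the output is 0 but the target is 1, so the weights on active inputs increase.
print(perceptron_update([0.0, -0.2, -0.1], [1, 1, 0], target=1))
```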
Perceptron Learning: Example
 •   Let's consider an example.
 •   Framework and notation:
 •   0/1 signals
 •   Input vector:
           X = ⟨x0, x1, x2, …, xn⟩
 •   Weight vector:
           W = ⟨w0, w1, w2, …, wn⟩
 • The fixed input x0 = 1 with its weight w0 simulates the threshold (the threshold is −w0).
 • O is the output (0 or 1) (single output).
 • Learning rate = 1.
 • Threshold function:
           S = Σ_{k=0}^{n} wk·xk ;  if S > 0 then O = 1, else O = 0
                                                                                          16
Perceptron Learning: Example
(Recall:  Err = T − O  and  Wj ← Wj + α · Ij · Err.)
• Set of training examples, each example is a pair (xi, yi),
  i.e., an input vector and a label y (0 or 1).
• Learning procedure, called the "error-correcting method":
• Start with the all-zero weight vector.
• Cycle (repeatedly) through the training examples, and for each example do:
     • If the perceptron outputs 0 while it should be 1,
              add the input vector to the weight vector.
     • If the perceptron outputs 1 while it should be 0,
              subtract the input vector from the weight vector.
     • Otherwise do nothing.
This is intuitively correct (e.g., if the output is 0 but it should be 1, the weights are
increased). A runnable sketch of this procedure follows below.
                                                                                                     17
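Here is a runnable sketch of the error-correcting method applied to the OR examples worked through on the following slides (learning rate 1, threshold S > 0 as above); it reproduces the epoch-by-epoch trace and ends with W = <0, 1, 1>:

```python
# Error-correcting perceptron learning for OR, following the slides:
# x0 = 1 is the bias input, O = 1 if S = sum_k w_k * x_k > 0 else 0, learning rate 1.
examples = [
    ((1, 0, 0), 0),   # example 1
    ((1, 0, 1), 1),   # example 2
    ((1, 1, 0), 1),   # example 3
    ((1, 1, 1), 1),   # example 4
]

W = [0, 0, 0]                                  # start with the all-zero weight vector
for epoch in range(1, 11):                     # a few epochs are enough for this toy problem
    errors = 0
    for x, label in examples:
        S = sum(w * xi for w, xi in zip(W, x))
        O = 1 if S > 0 else 0
        if O != label:                         # W <- W + (label - O) * x, i.e. add or subtract x
            W = [w + (label - O) * xi for w, xi in zip(W, x)]
            errors += 1
    print("epoch", epoch, "weights", W)
    if errors == 0:                            # converged: every example classified correctly
        break
# The final weights <0, 1, 1> match the trace in the slides.
```

Adding or subtracting the whole input vector is just the rule Wj ← Wj + α·Ij·Err with α = 1 and 0/1 inputs.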
Perceptron Learning: Example
 • Consider learning the logical OR function.
 • Our examples are:
 • Training samples:

                  x0   x1   x2   label
          1        1    0    0     0
          2        1    0    1     1
          3        1    1    0     1
          4        1    1    1     1

     Activation function:  S = Σ_{k=0}^{n} wk·xk ;  if S > 0 then O = 1, else O = 0
                                                                                        18
Perceptron Learning: Example
(Recall:  S = Σ_{k=0}^{n} wk·xk ;  if S > 0 then O = 1, else O = 0.
 Error-correcting method: if the perceptron outputs 0 while it should be 1, add the input vector
 to the weight vector; if it outputs 1 while it should be 0, subtract the input vector from the
 weight vector; otherwise do nothing.)
• We'll use a single perceptron with three inputs I0, I1, I2, weights w0, w1, w2, and output O.
• We'll start with all weights 0: W = <0,0,0>

• Example 1:  I = <1,0,0>, label = 0, W = <0,0,0>
• Perceptron: 1·0 + 0·0 + 0·0 = 0, so S = 0 ⇒ output 0
•      It classifies it as 0, so correct; do nothing.

• Example 2:  I = <1,0,1>, label = 1, W = <0,0,0>
• Perceptron: 1·0 + 0·0 + 1·0 = 0 ⇒ output 0
•      It classifies it as 0, while it should be 1, so we add the input to the weights:
•              W = <0,0,0> + <1,0,1>= <1,0,1>                                                        19
Perceptron Learning: Example
 • Example 3:  I = <1,1,0>, label = 1, W = <1,0,1>
 • Perceptron: 1·1 + 1·0 + 0·1 > 0 ⇒ output = 1
 •      It classifies it as 1, correct; do nothing.
 •             W = <1,0,1>

 • Example 4:  I = <1,1,1>, label = 1, W = <1,0,1>
 • Perceptron: 1·1 + 1·0 + 1·1 > 0 ⇒ output = 1
 •      It classifies it as 1, correct; do nothing.
 •             W = <1,0,1>
                                                                      20
Perceptron Learning: Example
 • Epoch 2, through the examples, W = <1,0,1>.

 • Example 1:  I = <1,0,0>, label = 0, W = <1,0,1>
 • Perceptron: 1·1 + 0·0 + 0·1 > 0 ⇒ output 1
 •      It classifies it as 1, while it should be 0, so subtract the input from the weights:
 •             W = <1,0,1> − <1,0,0> = <0, 0, 1>

 • Example 2:  I = <1,0,1>, label = 1, W = <0,0,1>
 • Perceptron: 1·0 + 0·0 + 1·1 > 0 ⇒ output 1
 •      It classifies it as 1, so correct; do nothing.
                                                                                         21
Perceptron Learning: Example
 • Example 3:  I = <1,1,0>, label = 1, W = <0,0,1>
 • Perceptron: 1·0 + 1·0 + 0·1 = 0 ⇒ output = 0
 •      It classifies it as 0, while it should be 1, so add the input to the weights:
 •              W = I + W = <1, 1, 1>

 • Example 4:  I = <1,1,1>, label = 1, W = <1,1,1>
 • Perceptron: 1·1 + 1·1 + 1·1 > 0 ⇒ output = 1
 •      It classifies it as 1, correct; do nothing.
 •              W = <1,1,1>
                                                                                22
Perceptron Learning: Example
• Epoch 3, through the examples, W = <1,1,1>.

• Example 1:  I = <1,0,0>, label = 0, W = <1,1,1>
• Perceptron: 1·1 + 0·1 + 0·1 > 0 ⇒ output 1
•       It classifies it as 1, while it should be 0, so subtract the input from the weights:
•              W = <1,1,1> − <1,0,0> = <0, 1, 1>

• Example 2:  I = <1,0,1>, label = 1, W = <0, 1, 1>
• Perceptron: 1·0 + 0·1 + 1·1 > 0 ⇒ output 1
•      It classifies it as 1, so correct; do nothing.
                                                                                      23
Perceptron Learning: Example
 • Example 3:  I = <1,1,0>, label = 1, W = <0, 1, 1>
 • Perceptron: 1·0 + 1·1 + 0·1 > 0 ⇒ output = 1
 •     It classifies it as 1, correct; do nothing.
 • Example 4:  I = <1,1,1>, label = 1, W = <0, 1, 1>
 • Perceptron: 1·0 + 1·1 + 1·1 > 0 ⇒ output = 1
 •     It classifies it as 1, correct; do nothing.
                                                          24
Perceptron Learning: Example
 • Epoch 4, through the examples, W = <0, 1, 1>.

 • Example 1:  I = <1,0,0>, label = 0, W = <0,1,1>
 • Perceptron: 1·0 + 0·1 + 0·1 = 0 ⇒ output 0
 •      It classifies it as 0, so correct; do nothing.

   So the final weight vector W = <0, 1, 1> classifies all examples correctly, and the
   perceptron has learned the function!
   [Figure: the learned perceptron computing OR, with W0 = 0, W1 = 1, W2 = 1]

 Aside: in more realistic cases the bias (W0) will not be 0.
 (This was just a toy example!)
 Also, in general, there are many more inputs (100 to 1,000).
                                                                                    25
Perceptron Learning: Example

       Epoch        x0   x1   x2   Target   w0   w1   w2   Output   Error   New w0   New w1   New w2
     1  example 1    1    0    0     0       0    0    0      0        0       0        0        0
        example 2    1    0    1     1       0    0    0      0        1       1        0        1
        example 3    1    1    0     1       1    0    1      1        0       1        0        1
        example 4    1    1    1     1       1    0    1      1        0       1        0        1
     2               1    0    0     0       1    0    1      1       -1       0        0        1
                     1    0    1     1       0    0    1      1        0       0        0        1
                     1    1    0     1       0    0    1      0        1       1        1        1
                     1    1    1     1       1    1    1      1        0       1        1        1
     3               1    0    0     0       1    1    1      1       -1       0        1        1
                     1    0    1     1       0    1    1      1        0       0        1        1
                     1    1    0     1       0    1    1      1        0       0        1        1
                     1    1    1     1       0    1    1      1        0       0        1        1
     4               1    0    0     0       0    1    1      0        0       0        1        1
                                                                                         26
Expressiveness of Perceptron
       What hypothesis space can a perceptron represent?
      Even more complex Boolean functions, such as the majority function.
      But can it represent any arbitrary Boolean function?
                                                                        27
Expressiveness of Perceptron
A threshold perceptron returns 1 iff the weighted sum of its inputs
(including the bias) is positive, i.e., iff
      Σ_{j=0}^{n} Wj·Ij > 0   (equivalently, W · I > 0),
i.e., iff the input is on one side of the hyperplane it defines.
         Perceptron ⇒ linear separator:
         a linear discriminant function, or linear decision surface.
         The weights determine the slope and the bias determines the offset.
                                                                      28
Linear Separability
   Consider example with two inputs, x1, x2:
   [Figure: positive (+) and negative examples scattered in the (x1, x2) plane, separated by a line]
   We can view the trained network as defining a "separation line".
   What is its equation?
         w0 + w1·x1 + w2·x2 = 0
         ⇒  x2 = −(w1/w2)·x1 − w0/w2
   Perceptron used for classification.
                                                                                  29
Linear Separability
   [Figure: the OR function plotted in the (x1, x2) plane (linearly separable)]
                                         30
Linear Separability
   [Figure: the AND function plotted in the (x1, x2) plane (linearly separable)]
                                          31
Linear Separability
   [Figure: the XOR function plotted in the (x1, x2) plane]
                                          32
Linear Separability
   [Figure: the XOR function plotted in the (x1, x2) plane: not linearly separable]
   Minsky & Papert (1969)
   Bad news: Perceptrons can only represent linearly separable functions.
                                                                                       33
    Linear Separability XOR
• Consider a threshold perceptron for the logical XOR function (two inputs),
  which outputs 1 when  w1·x1 + w2·x2 > T.
• Our examples are:

          x1    x2    label
   1       0     0      0
   2       1     0      1
   3       0     1      1
   4       1     1      0

• Given our examples, we have the following inequalities for the perceptron:
     • From (1):  0 + 0 ≤ T
     • From (2):  w1 + 0 > T
     • From (3):  0 + w2 > T
     • From (4):  w1 + w2 ≤ T   (contradiction: (2) and (3) give w1 + w2 > 2T, and (1) gives T ≥ 0, so w1 + w2 > T)
                          So, XOR is not linearly separable.
                                                                                                    34
Convergence of Perceptron Learning Algorithm
    Perceptron converges to a consistent function, if…
      • … training data linearly separable
      • … step size α sufficiently small
      • … no “hidden” units
                                                         35
Non Linear Classifiers
       • The XOR problem
                      x1   x2   XOR   Class
                       0   0     0     B
                       0   1     1     A
                       1   0     1     A
                       1   1     0     B
                                              36
Non Linear Classifiers
       • There is no single line (hyperplane) that separates class A from class B.
         In contrast, the AND and OR operations are linearly separable problems.
                                                                       37
The Two‐Layer Perceptron
  • For the XOR problem, draw two lines instead of one.
                            38
The Two‐Layer Perceptron
  • Then class B is located outside the shaded area and class A inside. This is a
    two‐phase design.
      • Phase 1: Draw two lines (hyperplanes), g1(x) = 0 and g2(x) = 0.
        Each of them is realized by a perceptron. The outputs of the perceptrons will be
                   yi = f(gi(x)) ∈ {0, 1},   i = 1, 2,
        depending on the position of x (i.e., on which side of each line it falls).
     • Phase 2: Find the position of x w.r.t. both lines, based on the values of y1,
       y2.
                                                                                       39
The Two‐Layer Perceptron
                              1st phase           2nd phase
                    x1    x2      y1      y2
                     0     0     0(−)    0(−)       B(0)
                     0     1     1(+)    0(−)       A(1)
                     1     0     1(+)    0(−)       A(1)
                     1     1     1(+)    1(+)       B(0)
   • Equivalently: the computations of the first phase perform a mapping
                              x → y = [y1, y2]^T
                                                                          40
The Two‐Layer Perceptron
 The decision is now performed on the transformed y data:
       g(y) = 0
 This can be performed via a second line, which can also be realized by a perceptron.
                                                                                        41
The Two‐Layer Perceptron
        • Computations of the first phase perform a mapping
          that transforms the nonlinearly separable problem to a
          linearly separable one.
         • The architecture: [Figure: the resulting two-layer network]
                                                                   42
The Two‐Layer Perceptron
   • This is known as the two-layer perceptron, with one hidden and one output layer.
     The activation functions are step functions:
                  f(·) ∈ {0, 1}
   • The neurons (nodes) of the figure realize the following lines (hyperplanes),
     as verified by the sketch below:
                  g1(x) = x1 + x2 − 1/2 = 0
                  g2(x) = x1 + x2 − 3/2 = 0
                  g(y)  = y1 − 2·y2 − 1/2 = 0
                                                                         43
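A short sketch verifying that these three lines implement XOR. The step convention f(z) = 1 for z ≥ 0 is an assumption; no example falls exactly on a boundary, so a strict inequality would give the same outputs:

```python
def f(z):
    """Step activation: 1 when the argument is non-negative, 0 otherwise (assumed convention)."""
    return 1 if z >= 0 else 0

def two_layer_xor(x1, x2):
    # First (hidden) phase: the two lines g1 and g2 from the slide.
    y1 = f(x1 + x2 - 0.5)        # g1(x) = x1 + x2 - 1/2
    y2 = f(x1 + x2 - 1.5)        # g2(x) = x1 + x2 - 3/2
    # Second (output) phase: a single line in the transformed (y1, y2) space.
    return f(y1 - 2 * y2 - 0.5)  # g(y)  = y1 - 2*y2 - 1/2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", two_layer_xor(x1, x2))   # reproduces XOR: 0, 1, 1, 0
```

The hidden layer maps the four inputs to the three points (0,0), (1,0), (1,1) in y-space, where class A and class B are linearly separable, which is exactly the two-phase idea of the preceding slides.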