Artificial Neural Network
Unit 5
Why the name Artificial Neural Network?
Programs that solve problems by trying to mimic the
structure and function of our nervous system
Based on simulated neurons that are joined together in a
variety of ways to form networks
A neural network resembles the human brain in the following two
ways:
A neural network acquires knowledge through learning
A neural network's knowledge is stored within the
interconnection strengths (weights)
Applications of Neural Networks
Human brain: Neuron
Neural Networks
We are born with about 100 billion neurons
A neuron may connect to as many as 100,000 other neurons
Signals "move" between neurons via electrochemical signals
The synapses release a chemical transmitter; when the sum of the
transmitter reaches a threshold, the neuron "fires"
Analogy between Artificial NN and Biological NN
Biological NN    Artificial NN    Role
Dendrites        Input            Accept input
Soma             Node             Process input
Synapse          Weight           Electrochemical contact between neurons
Axon             Output           Turns processed input into output
Artificial NN
Attributes of a neuron
m binary inputs and one output (0 or 1)
Synaptic weights wij
Threshold θ
Output is '1' if and only if the weighted sum of inputs is greater
than or equal to the threshold
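The firing rule above can be sketched as a small function. This is an illustration, not code from the slides; note that the later gate examples fire when the weighted sum equals the threshold, so >= is used here:

```python
def neuron(inputs, weights, threshold):
    """Binary threshold unit: output 1 iff the weighted sum
    of the inputs reaches the threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0
```

For example, neuron([1, 1], [1, 1], 2) fires because 1*1 + 1*1 = 2 reaches the threshold of 2.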
Perceptron
• Learning refers to the method of modifying the weights of connections
• The learning ability of a neural network is determined by its architecture and by
the algorithm chosen for training
Introduce Bias (b)
Output depends on a step function, i.e. it is either 0 or 1
depending on the threshold
Sigmoid Function
Example: Neural Networks
[Figure: neuron Y with inputs X1 (weight 2), X2 (weight 2) and X3 (weight -1)]
The activation of a neuron is binary
Uses a threshold
A neuron either fires (activation of one) or does not fire (activation of zero)
If the weight on a path is positive the path is excitatory; otherwise it is inhibitory
Example: AND function
[Figure: neuron Y with inputs X1 (weight 1) and X2 (weight 1)]
X1 X2 Y
1  1  1
1  0  0
0  1  0
0  0  0
Threshold(Y) = 2
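As a quick check of the truth table above, a hypothetical helper (assuming the fires-when-weighted-sum >= threshold rule) reproduces the four rows:

```python
def fires(x1, x2, w1=1, w2=1, threshold=2):
    """AND unit from the slide: weights 1 and 1, threshold 2."""
    return 1 if w1 * x1 + w2 * x2 >= threshold else 0

for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, fires(x1, x2))  # only (1, 1) reaches the threshold of 2
```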
Example: OR function
[Figure: neuron Y with inputs X1 (weight 2) and X2 (weight 2)]
X1 X2 Y
1  1  1
1  0  1
0  1  1
0  0  0
Threshold(Y) = 2
Example: AND-NOT function
[Figure: neuron Y with inputs X1 (weight 2) and X2 (weight -1)]
X1 X2 Y
1  1  0
1  0  1
0  1  0
0  0  0
Threshold(Y) = 2
Example: XOR function
[Figure: X1 and X2 feed hidden units Z1 (weights 2 from X1, -1 from X2) and
Z2 (weights -1 from X1, 2 from X2); Z1 and Z2 feed Y with weight 2 each]
X1 X2 Y
1  1  0
1  0  1
0  1  1
0  0  0
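The XOR construction above can be sketched by composing the two AND-NOT units and the OR unit; the threshold of 2 at every node is an assumption carried over from the earlier slides:

```python
def fires(weighted_sum, threshold=2):
    return 1 if weighted_sum >= threshold else 0

def xor(x1, x2):
    z1 = fires(2 * x1 - 1 * x2)    # Z1 = X1 AND NOT X2
    z2 = fires(2 * x2 - 1 * x1)    # Z2 = X2 AND NOT X1
    return fires(2 * z1 + 2 * z2)  # Y = Z1 OR Z2
```

A single threshold unit cannot compute XOR (it is not linearly separable), which is why the hidden units Z1 and Z2 are needed.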
Artificial Neuron
Activation function
Activation Functions
Step_t(x) = 1 if x >= t, else 0
Sign(x) = +1 if x >= 0, else -1
Sigmoid(x) = 1 / (1 + e^(-x))
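The three activation functions can be written directly (the names step, sign and sigmoid are illustrative):

```python
import math

def step(x, t=0.0):
    return 1 if x >= t else 0          # Step_t(x)

def sign(x):
    return 1 if x >= 0 else -1         # Sign(x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # Sigmoid(x)
```

Unlike the step and sign functions, the sigmoid is smooth and differentiable, which is what makes gradient-based learning (covered later) possible.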
Artificial Neural Networks (ANN)
[Figure: perceptron model: inputs X1, X2, X3 with weights w1, w2, w3 feeding a
"black box" output node Y]
Model is an assembly of inter-connected nodes and weighted links
Output node sums up each of its input values according to the weights of its links
Compare the output against some threshold t
Types of Neural Networks
Connection Type
Static (Feed Forward)
Dynamic (Feedback)
Topology
Single Layer
Multilayer
Recurrent
Learning Method
Supervised
Unsupervised
Reinforcement
Type can be a combination of the above
Types of Neural Networks based on connection
type and topology
Single layer feed-forward networks
Input layer projecting into the output layer
Input Output
layer layer
Multi-layer feed-forward networks (2 layers)
One or more hidden layers
[Figure: 2-layer (1-hidden-layer) fully connected network: input layer, hidden
layer, output layer]
Multi-layer feed-forward networks (3 layers)
[Figure: input layer, two hidden layers, output layer]
Recurrent networks
A network with feedback
[Figure: recurrent network with input layer, output layer and feedback connections]
Type of network based on process of learning
Initialize the weights (w0, w1, ..., wk)
Adjust the weights in such a way that the output (output
of the activation function) is consistent with the class labels
(required output, Yi) of the training examples
If not consistent, determine the error
Error function:
E = Σi [Yi - f(wi, Xi)]^2
Find the weights wi that minimize the above error function
Examples of learning algorithms:
gradient descent, backpropagation and others
Learning Method: Supervised Learning
Each training pattern is (input, desired output)
Adapt weights to train the network
After many epochs (iterations), it converges to a local
minimum of the error
Example: face recognition
Learning Method: Unsupervised Learning
No help from outside
Training data does not include the desired output
Learning by doing
Identify patterns using a clustering method
Ex: given a set of heights and weights of persons,
some combinations of height and weight lead to the conclusion
that a person is overweight
Learning Method: Reinforcement Learning
Teacher provides feedback on the training data
The teacher assigns a score (reward) after evaluating the
network's performance, rather than providing the desired output
Simplified 2-layer Feed Forward Network
Uses sigmoid function as activation function
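A forward pass through such a 2-layer network might be sketched as follows; the weights and biases in the usage example are made-up illustrative values, not from the slides:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input -> sigmoid hidden layer -> sigmoid output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Illustrative (made-up) weights: 2 inputs, 2 hidden nodes, 1 output
y = forward([1.0, 0.0],
            w_hidden=[[0.5, -0.3], [0.8, 0.2]], b_hidden=[0.1, -0.1],
            w_out=[1.0, -1.0], b_out=0.0)
```

Because the output node also uses the sigmoid, y always lies strictly between 0 and 1.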
Feedforward Neural Network
Perceptrons in the input layer take the input
The middle layer is not connected to the external world and is
therefore called the hidden layer
Each perceptron in one layer is connected to every perceptron
in the next layer
Information is fed forward to the next layer
Information moves in one direction:
input nodes -> hidden nodes -> output nodes
Feedforward networks
Are used for supervised learning
Compute a function f(x) ≈ y for training pairs (x, y)
In the forward pass the signal moves from the input layer through the
hidden layer to the output layer
The decision of the output layer is measured against the desired
ground-truth labels
In the backward pass, partial derivatives of the error function
with respect to the weights and biases are back-propagated
Error (Cost)
• Determine the slope (p) of the line which fits the given data
• t = px
• Cost = Σ (t - y)^2
• The cost depends on the slope p of the line
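For the line t = px, the cost above can be computed directly; the data set below is a made-up example lying exactly on y = 2x:

```python
def cost(p, xs, ys):
    """Squared-error cost of the line t = p * x over the data."""
    return sum((p * x - y) ** 2 for x, y in zip(xs, ys))

# Data lying exactly on y = 2x: cost is zero at p = 2, positive elsewhere
xs, ys = [1, 2, 3], [2, 4, 6]
```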
Cost Function
Gradient Descent Algorithm
• The goal is to find the lowest point of the cost function
• Gradient descent changes the slope by changing the weights to
reduce the cost iteratively
• Finally it reaches a minimum
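For the one-parameter cost above, the derivative is dCost/dp = Σ 2x(px - y), so a gradient descent loop might look like this (the learning rate and step count are illustrative):

```python
def gradient_descent(xs, ys, p=0.0, lr=0.01, steps=200):
    """Iteratively move p against the gradient of Cost(p) = sum((p*x - y)^2)."""
    for _ in range(steps):
        grad = sum(2 * x * (p * x - y) for x, y in zip(xs, ys))  # dCost/dp
        p -= lr * grad
    return p

p = gradient_descent([1, 2, 3], [2, 4, 6])  # data on the line y = 2x
```

For this data the cost is a parabola in p with its minimum at p = 2, so the iterates converge there.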
Example: Multi Layer Perceptron (MLP)
Given data
Question
Input layer has two nodes and one bias node (= 1)
Given: hidden layer has three nodes
Output of the hidden layer depends on the input and the weights
Output of the hidden layer is fed to the output layer
The output layer generates either y1 or y2
It can be compared with the known labels
Draw the network
Example
Determine the network output if x1 = 1 and x2 = 0
[Figure: network with inputs x1 = 1, x2 = 0, a bias input of 1, and the given weights]
Example
Consider the network shown, where x1 and x2 are inputs
and Y is the output. Show the output Y for the different
combinations of inputs and predict the logical operation
performed by the network. Assume the sign function as the
activation function
Example
A neural network is shown in the figure. Write the
equations for the outputs at nodes A and B.
Backpropagation Algorithm
Used for layered feedforward ANNs
It is a multilayered, feedforward, supervised learning
network based on the gradient descent learning rule
We provide the algorithm with examples of inputs and
corresponding outputs
The error (difference between actual and expected results) is
calculated
The idea is to reduce this error until the ANN learns the
training data
Backpropagation Algorithm
Initialize each weight with a random number
Calculate the output for every input
Calculate the difference between the desired (target) output and the
calculated output
The error is propagated back to the input layer
Weights are adjusted for minimum error
The gradient is used for adjusting the weights
The gradient is the change in error with respect to the weights
Repeat till the error is below a threshold level
The resulting network is a trained network
The trained network is ready to predict the output for a given
input
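The steps above can be sketched for the smallest possible case, a single sigmoid neuron, so the "back-propagated" gradient reduces to one step per example; the OR training set, learning rate and epoch count are illustrative choices, not from the slides:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(data, lr=0.5, epochs=5000, seed=1):
    """Backpropagation loop for one sigmoid neuron: random init,
    forward pass, error, gradient step, repeat until trained."""
    rng = random.Random(seed)
    w1, w2, b = (rng.uniform(-1.0, 1.0) for _ in range(3))
    for _ in range(epochs):
        for (x1, x2), target in data:
            y = sigmoid(w1 * x1 + w2 * x2 + b)  # forward pass
            # gradient of E = (y - target)^2 / 2 w.r.t. the pre-activation
            delta = (y - target) * y * (1.0 - y)
            w1 -= lr * delta * x1               # adjust weights along the
            w2 -= lr * delta * x2               # negative gradient
            b  -= lr * delta
    return w1, w2, b

# Learn OR from its truth table
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w1, w2, b = train(data)

def predict(x1, x2):
    return 1 if sigmoid(w1 * x1 + w2 * x2 + b) >= 0.5 else 0
```

OR is linearly separable, so one neuron suffices here; a problem like XOR would need a hidden layer, with the delta of each output node propagated back through the hidden weights.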