ASHOKA WOMEN’S ENGINEERING COLLEGE (AUTONOMOUS)
UNIT- 4
MULTILAYER PERCEPTRON (or) ANN (Artificial Neural Network) (or) Feed Forward:
       The Perceptron consists of an input layer and an output layer which are fully connected.
       A fully connected multi-layered neural network is known as a Multi-Layer Perceptron (MLP).
       A multi-layered neural network consists of multiple layers of artificial neurons or nodes.
       MLPs have the same input and output layers but may have multiple hidden layers in between, as seen below.
Sigmoid: takes real-valued input and squashes it to the range between 0 and 1.
     When we plot the output from sigmoid units given various weighted sums as input, it looks remarkably
     like a step function:
tanh: takes real-valued input and squashes it to the range [-1, 1].
ReLU: ReLU stands for Rectified Linear Unit. It takes real-valued input and thresholds it at 0 (replaces
negative values with 0). The three activation functions are sketched in code below.
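A minimal NumPy sketch of these three activation functions (the Python function names are illustrative, not part of the notes):

```python
import numpy as np

def sigmoid(x):
    """Squashes real-valued input to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Squashes real-valued input to the range [-1, 1]."""
    return np.tanh(x)

def relu(x):
    """Thresholds at 0: negative values are replaced with 0."""
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x), tanh(x), relu(x))
```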
    Example Multi-layer ANN with Sigmoid Units:
     We will concern ourselves here with ANNs containing only one hidden layer, as this makes
    describing the back propagation routine easier.
     Note that networks where you can feed in the input on the left and propagate it forward to get an
    output are called feed forward networks.
     Below is such an ANN, with two sigmoid units in the hidden layer. The weights have been set
    arbitrarily between all the units.
     Note that the sigma units have been identified with sigma signs in the node on the graph. As we did
    with perceptrons, we can give this network an input and determine the output. We can also look to see
    which units "fired", i.e., had a value closer to 1 than to 0.
     Suppose we input the values 10, 30, 20 into the three input units, from top to bottom. Then the
    weighted sum coming into H1 will be:
    SH1 = (0.2 * 10) + (-0.1 * 30) + (0.4 * 20) = 2 - 3 + 8 = 7.
     Then the σ function is applied to SH1 to give:
    σ(SH1) = 1/(1 + e^-7) = 1/(1 + 0.000912) = 0.999
     [Don't forget to negate S]. Similarly, the weighted sum coming into H2 will be:
    SH2 = (0.7 * 10) + (-1.2 * 30) + (1.2 * 20) = 7 - 36 + 24 = -5
V. ARUNA KUMARI-Asst. Professor-Dept of MCA
                                                       Page 2
                     ASHOKA WOMEN’S ENGINEERING COLLEGE (AUTONOMOUS)
     and σ applied to SH2 gives:
    σ(SH2) = 1/(1 + e^5) = 1/(1 + 148.4) = 0.0067
     From this, we can see that H1 has fired, but H2 has not. We can now calculate that the weighted
    sum going in to output unit O1 will be:
    SO1 = (1.1 * 0.999) + (0.1 * 0.0067) = 1.0996
     and the weighted sum going in to output unit O2 will be:
    SO2 = (3.1 * 0.999) + (1.17 * 0.0067) = 3.1047
     The output sigmoid unit in O1 will now calculate the output values from the network for O1:
    σ(SO1) = 1/(1 + e^-1.0996) = 1/(1 + 0.333) = 0.750
     and the output from the network for O2:
    σ(SO2) = 1/(1 + e^-3.1047) = 1/(1 + 0.045) = 0.957
     Therefore, if this network represented the learned rules for a categorisation problem, the input triple
    (10, 30, 20) would be categorised into the category associated with O2, because this has the larger
    output.
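The forward pass above can be reproduced with a short NumPy sketch; the weight matrices below are simply the example's arbitrarily chosen weights collected into arrays:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Weights from the worked example (input -> hidden, hidden -> output).
W_hidden = np.array([[0.2, -0.1, 0.4],    # weights into H1
                     [0.7, -1.2, 1.2]])   # weights into H2
W_output = np.array([[1.1, 0.1],          # weights into O1
                     [3.1, 1.17]])        # weights into O2

x = np.array([10, 30, 20])

h = sigmoid(W_hidden @ x)    # hidden activations: approx [0.999, 0.0067]
o = sigmoid(W_output @ h)    # output activations: approx [0.750, 0.957]

print(h, o)                  # the input is categorised as O2 (larger output)
```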
BACK PROPAGATION:
     With backpropagation, the weights of the model are adjusted while training: the error measured at the output layer is propagated backwards through the network, and each weight is changed in proportion to how much it contributed to that error.
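As a rough illustration (not the exact derivation from the notes), the following sketch shows one backpropagation update for a one-hidden-layer sigmoid network, assuming a squared-error loss and a learning rate eta:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def backprop_step(x, target, W_hidden, W_output, eta=0.1):
    """One gradient-descent update for a one-hidden-layer sigmoid network
    trained with squared error (a common textbook choice, assumed here)."""
    # Forward pass
    h = sigmoid(W_hidden @ x)            # hidden activations
    o = sigmoid(W_output @ h)            # output activations

    # Backward pass: the delta terms use the sigmoid derivative o*(1-o)
    delta_o = (o - target) * o * (1 - o)             # error signal at the outputs
    delta_h = (W_output.T @ delta_o) * h * (1 - h)   # error signal at the hidden units

    # Weight updates (gradient descent)
    W_output -= eta * np.outer(delta_o, h)
    W_hidden -= eta * np.outer(delta_h, x)
    return W_hidden, W_output
```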
  LOSS FUNCTIONS:
   Loss functions can be classified into two major categories depending upon the type of learning task
  we are dealing with: regression losses and classification losses.
  Loss functions for Classification:
  1. Binary Cross Entropy Loss:
  It is used when the model outputs a probability between 0 and 1 for a binary classification task. Cross-entropy
  calculates the average difference between the predicted probabilities and the actual labels.
  Mathematical formulation:-
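  For N examples with actual labels yi (0 or 1) and predicted probabilities pi, the standard form is:
  BCE = -(1/N) * Σ [ yi * log(pi) + (1 - yi) * log(1 - pi) ]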
  2. Hinge Loss:
      This type of loss is used when the target variable has 1 or -1 as class labels. It penalizes the model
  when there is a difference in the sign between the actual and predicted class values.
      Hinge loss is used for maximum-margin classification.
  Mathematical formulation:-
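  For an actual class label y (-1 or +1) and a predicted value ŷ, the standard form is:
  Hinge = max(0, 1 - y * ŷ)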
EPOCHS AND BATCH SIZES:
         An epoch means training the neural network with all the training data for one cycle.
         In an epoch, we use all of the data exactly once. A forward pass and a backward pass together are
  counted as one pass.
      An epoch is made up of one or more batches, where we use a part of the data set to train the neural network.
   Passing through a single batch of training examples is called an iteration.
      An epoch is sometimes confused with an iteration. To clarify the concepts, let's consider a simple example where we
   have 1000 data points as presented in the figure below:
      If the batch size is 1000, we can complete an epoch with a single iteration. Similarly, if the batch size is
   500, an epoch takes two iterations. So, if the batch size is 100, an epoch takes 10 iterations to complete. Simply,
  for each epoch, the required number of iterations times the batch size gives the number of data points.
      We can use multiple epochs in training. In this case, the neural network is fed the same data more than
  once.
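A small sketch of the arithmetic above (the helper name iterations_per_epoch is illustrative, not from the notes):

```python
import math

def iterations_per_epoch(num_examples, batch_size):
    """Number of batches (iterations) needed to see every example once."""
    return math.ceil(num_examples / batch_size)

# The example above: 1000 data points
print(iterations_per_epoch(1000, 1000))  # 1 iteration per epoch
print(iterations_per_epoch(1000, 500))   # 2 iterations per epoch
print(iterations_per_epoch(1000, 100))   # 10 iterations per epoch
```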
  RECURRENT NEURAL NETWORK (RNN):
    Types of Recurrent Neural Networks:
    There are four types of Recurrent Neural Networks:
    1. One to One
    2. One to Many
    3. Many to One
    4. Many to Many
  Applications of Recurrent Neural Networks:
      Image Captioning: RNNs are used to caption an image by analyzing the activities present.
   Time Series Prediction: Any time series problem, like predicting the prices of stocks in a
  particular month, can be solved using an RNN.
      Natural Language Processing: Text mining and Sentiment analysis can be carried out using an
  RNN for Natural Language Processing (NLP).
      Machine Translation: Given an input in one language, RNNs can be used to translate the input
  into different languages as output.
  Advantages of Recurrent Neural Network:
   1.   An RNN remembers information through time.
   2.   An RNN can be used with convolutional layers to extend the effective pixel neighborhood.
  Disadvantages of Recurrent Neural Network:
  1.   Gradient vanishing and exploding problems.
  2.   Training an RNN is a very difficult task.
   3.   It cannot process very long sequences when using tanh or ReLU as the activation function (see the recurrence sketched below).
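As a rough sketch (the weight names W_xh, W_hh and the bias b_h are placeholders, not from the notes), a vanilla RNN computes its hidden state recursively, which is where both its memory and its vanishing/exploding-gradient problems come from:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: the new hidden state depends on the
    current input and the previous hidden state (the network's memory)."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Process a whole sequence; the same weights are reused at every time
    step, and repeated multiplication by W_hh is what makes gradients
    vanish or explode on long sequences."""
    h = h0
    for x_t in xs:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
    return h
```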
  LONG SHORT-TERM MEMORY (LSTM):
      Long Short-Term Memory (LSTM) networks are an extension of RNNs that extend the memory, which
   makes it easier to retain past data.
      LSTMs are used as the building blocks for the layers of an RNN.
      LSTMs assign the data "weights", which help the RNN to either let new information in, forget
   information, or give it enough importance to impact the output.
      The units of an LSTM are used as building units for the layers of a RNN, often called an LSTM
  network.
      LSTMs enable RNNs to remember inputs over a long period of time. This is because LSTMs
  contain information in a memory, much like the memory of a computer. The LSTM can read, write and
  delete information from its memory.
      In an LSTM you have three gates: input, forget and output gate. These gates determine whether or
  not to let new input in (input gate), delete the information because it isn’t important (forget gate), or let
  it impact the output at the current time step (output gate).
  Architecture of LSTM network:
      An LSTM network has a chain-like structure, but the repeating module is different from that of a plain
   RNN. Instead of having a single neural network layer, it has small parts connected to each other which
   handle the storing and removal of memory.
   1. Input gate- It discovers which values from the input should be used to modify the memory. A sigmoid
   function decides which values to let through (0 or 1), and a tanh function gives weightage to the values
   which are passed, deciding their level of importance in the range -1 to 1.
   2. Forget gate- It discovers which details should be discarded from the block. A sigmoid function decides this: it
   looks at the previous state (ht-1) and the current input (Xt) and outputs a number between 0 (omit this)
   and 1 (keep this) for each number in the cell state Ct-1.
   3. Output gate- The input and the memory of the block are used to decide the output. A sigmoid function decides
   which values to let through (0 or 1), and a tanh function gives weightage to the values which are passed,
   deciding their level of importance in the range -1 to 1; this is multiplied with the output of the sigmoid.
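A minimal sketch of one LSTM step, assuming the standard gate equations (the parameter names W, U and b are placeholders, not from the notes):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b are dicts of parameters for each gate:
    'i' input, 'f' forget, 'o' output, 'g' candidate memory."""
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # input gate (0 to 1)
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # forget gate (0 to 1)
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # output gate (0 to 1)
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])   # candidate values (-1 to 1)

    c_t = f * c_prev + i * g      # forget old memory, write new memory
    h_t = o * np.tanh(c_t)        # expose part of the memory as the output
    return h_t, c_t
```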
  CONVOLUTIONAL NEURAL NETWORK (CNN):
      A Convolutional Neural Network is a special kind of multi-layer neural network.
      Convolutional Neural Networks are one of the main categories of neural networks for image classification and image
   recognition. Scene labeling, object detection, face recognition, etc., are some
   of the areas where convolutional neural networks are widely used.
      A CNN takes an image as input, which is processed and classified under a certain category such as dog,
   cat, lion, tiger, etc. The computer sees an image as an array of pixels whose size depends on the resolution of
   the image. Based on the image resolution, it sees the image as h * w * d, where h = height, w = width and d =
   depth (number of channels).
      A fully-connected network architecture does not take the spatial structure of the image into account.
      In a CNN, each input image passes through a sequence of convolution layers with filters (also known as
   kernels), along with pooling and fully connected layers. After that, we apply the softmax function
   to classify the object with probabilistic values between 0 and 1.
  Why Convolutions:
     Parameter sharing: a feature detector (such as a vertical edge detector) that’s useful in one part of
  the image is probably useful in another part of the image.
   Sparsity of connections: In each layer, each output value depends only on a small number of inputs.
  Convolution Layer:
   The convolution layer is the first layer used to extract features from an input image. By learning image features
   using small squares of input data, the convolution layer preserves the relationship between pixels. It is
   a mathematical operation which takes two inputs, an image matrix and a kernel (or filter).
   o   The dimension of the image matrix is h × w × d.
   o   The dimension of the filter is fh × fw × d.
   o   The dimension of the output is (h - fh + 1) × (w - fw + 1) × 1.
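A minimal sketch of a "valid" convolution (no padding, stride 1) on a single-channel image, showing that the output dimension is (h - fh + 1) × (w - fw + 1). Strictly speaking this computes cross-correlation, which is what deep-learning libraries call convolution:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (no padding, stride 1) of a single-channel
    image with a single filter; output shape is (h - fh + 1, w - fw + 1)."""
    h, w = image.shape
    fh, fw = kernel.shape
    out = np.zeros((h - fh + 1, w - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + fh, j:j + fw] * kernel)
    return out

image = np.arange(36).reshape(6, 6).astype(float)   # a toy 6x6 "image"
kernel = np.array([[1., 0., -1.],                   # a 3x3 vertical-edge filter
                   [1., 0., -1.],
                   [1., 0., -1.]])
print(conv2d_valid(image, kernel).shape)            # (4, 4) = (6-3+1, 6-3+1)
```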
  Stride:
   Stride means how many cells the filter is moved across the input to calculate the next cell in the result.
   When the stride equals 1, we move the filter 1 pixel at a time; similarly, if the stride equals 2, we move
   the filter 2 pixels at a time. The following figure shows how the
   convolution works with a stride of 2.
  Padding:
  1. It allows us to use a CONV layer without necessarily shrinking the height and width of the
  volumes. This is important for building deeper networks, since otherwise the height/width would shrink
  as we go to deeper layers.
   2. It helps us keep more of the information at the border of an image. Without padding, very few
   values at the next layer would be affected by the pixels at the edges of an image.
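Putting stride and padding together, the output height/width of a convolution can be computed with the standard formula floor((n + 2p - f)/s) + 1; a small sketch:

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Output height/width for input size n, filter size f,
    padding p and stride s: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * padding - f) // stride + 1

print(conv_output_size(6, 3))                       # 4: no padding, stride 1
print(conv_output_size(6, 3, padding=1, stride=1))  # 6: padding keeps the size
print(conv_output_size(7, 3, stride=2))             # 3: stride 2 roughly halves the size
```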
   Pooling Layer:
     Pooling layer is used to reduce the size of the representations and to speed up calculations.
      In conventional CNNs, the feature map from the convolutional layer is subsampled in a pooling
   layer before being passed on to the next convolutional layer.
     The pooling layer works to replace a small patch in the feature map with its summary statistic.
      For example, the popular max-pooling layer reduces the input patch to a single value, the
   maximum of all values within that patch. Other pooling strategies involve taking the average or a
   weighted average of the patch as a subsampling technique.
      Average Pooling: Down-scaling is performed through average pooling by dividing the input into
   rectangular pooling regions and computing the average value of each region.
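A minimal sketch of max pooling over 2 × 2 patches with stride 2 (the helper name max_pool2d is illustrative; frameworks provide this as a built-in layer):

```python
import numpy as np

def max_pool2d(feature_map, size=2, stride=2):
    """Replace each size x size patch of the feature map with its maximum."""
    h, w = feature_map.shape
    out_h, out_w = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = feature_map[i * stride:i * stride + size,
                                j * stride:j * stride + size]
            out[i, j] = patch.max()
    return out

fm = np.array([[1., 3., 2., 1.],
               [4., 6., 5., 0.],
               [7., 2., 9., 8.],
               [1., 0., 3., 4.]])
print(max_pool2d(fm))   # [[6., 5.], [7., 9.]]
```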
  Advantages:
      Good at detecting patterns and features in images, videos, and audio signals, and robust to translation.
     Very High accuracy in image recognition problems.
     Automatically detects the important features without any human supervision.
  Disadvantages:
     Computationally expensive to train and require a lot of memory.
     Requires large amounts of labeled data.