Recurrent Neural Networks
• Recurrent Neural Networks (RNNs) are a type of artificial neural network designed to process sequences of data.
  They work especially well on sequential tasks such as time series, speech, and natural language.
• An RNN works on the principle of saving the output of a particular layer and feeding it back to the input, so that
  the prediction at each step can depend on what the network has already seen.
• Below is how you can convert a Feed-Forward Neural Network into a Recurrent Neural Network:
                                 Fig: Simple Recurrent Neural Network
   Recurrent Neural Networks (Cont…)
The nodes in the different layers of the neural network are compressed to form a single recurrent layer. A, B, and C
are the parameters of the network.
   Recurrent Neural Networks (Cont…)
Here, “x” is the input layer, “h” is the hidden layer, and “y” is the output layer. A, B, and C are the network
parameters used to improve the output of the model. At any given time t, the hidden state combines the current
input x(t) with the state carried over from the previous step, and the output at each step is fed back into the
network to improve the next output.
                            Fig: Fully connected Recurrent Neural Network
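As a concrete sketch of this recurrence (a minimal NumPy illustration; the names Wxh, Whh, Why and the layer sizes are
assumptions made here, standing in for the parameters A, B, and C), one step of a simple RNN can be written as:

    import numpy as np

    # Illustrative sizes; these are assumptions, not values from the slides.
    input_size, hidden_size, output_size = 4, 8, 3

    rng = np.random.default_rng(0)
    Wxh = 0.1 * rng.standard_normal((hidden_size, input_size))   # input-to-hidden weights (role of "A")
    Whh = 0.1 * rng.standard_normal((hidden_size, hidden_size))  # hidden-to-hidden weights (role of "B")
    Why = 0.1 * rng.standard_normal((output_size, hidden_size))  # hidden-to-output weights (role of "C")
    bh = np.zeros(hidden_size)
    by = np.zeros(output_size)

    def rnn_step(x_t, h_prev):
        # The new hidden state mixes the current input with the previous hidden state.
        h_t = np.tanh(Wxh @ x_t + Whh @ h_prev + bh)
        y_t = Why @ h_t + by
        return h_t, y_t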
How Recurrent Neural Networks Work
• In Recurrent Neural Networks, the information cycles through a loop to the middle hidden layer.
• The input layer ‘x’ takes in the input to the neural network, processes it, and passes it on to the middle layer.
• The middle layer ‘h’ can consist of multiple hidden layers, each with its own activation function, weights, and
  biases. In an ordinary feed-forward network these hidden layers act independently of one another, i.e., the
  network has no memory; a recurrent neural network changes this.
• The Recurrent Neural Network standardizes the different activation functions, weights, and biases so that each
  hidden layer has the same parameters. Then, instead of creating multiple hidden layers, it creates one and loops
  over it as many times as required (see the sketch after the figure below).
                               Fig: Working of Recurrent Neural Network
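A minimal sketch of that idea, reusing the rnn_step function and parameters from the earlier sketch: one set of shared
weights is looped over every time step of the sequence.

    def rnn_forward(xs, h0):
        # Run the same cell, with the same weights, over every time step of the sequence.
        h = h0
        outputs = []
        for x_t in xs:
            h, y_t = rnn_step(x_t, h)   # identical parameters reused at each step
            outputs.append(y_t)
        return outputs, h

    xs = rng.standard_normal((5, input_size))                 # a toy sequence of 5 time steps
    outputs, h_final = rnn_forward(xs, np.zeros(hidden_size))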
Types of Recurrent Neural Networks
There are four types of Recurrent Neural Networks:
• One to One
• One to Many
• Many to One
• Many to Many
One to One RNN
This type of neural network is known as the Vanilla Neural Network. It is used for general machine learning
problems that have a single input and a single output.
Types of Recurrent Neural Networks (Cont…)
One to Many RNN
This type of neural network has a single input and multiple outputs. An example of this is image captioning, where a
single image is used to generate a sequence of words.
Many to One RNN
This RNN takes a sequence of inputs and generates a single output.
Sentiment analysis is a good example of this kind of network, where a given
sentence can be classified as expressing positive or negative sentiment.
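As a hedged illustration of the many-to-one pattern (PyTorch is used here only for convenience, and the vocabulary and
layer sizes are made-up values), the whole token sequence is read but only the final hidden state feeds the classifier:

    import torch.nn as nn

    class SentimentRNN(nn.Module):
        def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, num_classes=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
            self.classify = nn.Linear(hidden_dim, num_classes)

        def forward(self, token_ids):                  # token_ids: (batch, seq_len) of word indices
            embedded = self.embed(token_ids)           # (batch, seq_len, embed_dim)
            _, h_last = self.rnn(embedded)             # h_last: (1, batch, hidden_dim), the final hidden state
            return self.classify(h_last.squeeze(0))    # one prediction per input sequence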
Types of Recurrent Neural Networks (Cont…)
Many to Many RNN
This RNN takes a sequence of inputs and generates a sequence of outputs. Machine translation is one example.
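For contrast, a many-to-many sketch (same made-up sizes as above) keeps the output at every time step, producing one
prediction per input position as in sequence tagging; real machine translation usually adds an encoder-decoder
structure on top of this idea.

    class SequenceTagger(nn.Module):
        # Many-to-many: emit one prediction per time step.
        def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, num_tags=20):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
            self.tag = nn.Linear(hidden_dim, num_tags)

        def forward(self, token_ids):                       # (batch, seq_len)
            outputs, _ = self.rnn(self.embed(token_ids))    # (batch, seq_len, hidden_dim): one vector per step
            return self.tag(outputs)                        # (batch, seq_len, num_tags)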
Two Issues of Standard RNNs
1. Vanishing Gradient Problem
2. Exploding Gradient Problem
Vanishing Gradient Problem
Recurrent Neural Networks enable you to model time-dependent and sequential data problems, such as stock market
prediction, machine translation, and text generation. You will find, however, that RNNs are hard to train because of
gradient problems.
RNNs suffer from the problem of vanishing gradients. The gradients carry the information used to update the RNN's
parameters, and when the gradient becomes too small, the parameter updates become insignificant. This makes learning
over long data sequences difficult.
1. As the RNN trains, the gradients (used to adjust the model’s weights) become very small.
2. This makes it hard for the model to learn or remember information from earlier parts of the sequence.
3. The model repeatedly multiplies small numbers during backpropagation, making the gradients shrink to almost
   zero (see the numeric example below).
Impact:
The RNN forgets long-term context and only focuses on recent inputs.
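A tiny numeric illustration of point 3 above; the factor 0.9 is just an assumed per-step gradient factor smaller than 1:

    # Illustrative only: treat 0.9 as the gradient factor contributed by each time step.
    factor, steps = 0.9, 50
    print(factor ** steps)   # ~0.005: the gradient signal from 50 steps back has almost vanished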
Two Issues of Standard RNNs (Cont…)
1. Vanishing Gradient Problem
2. Exploding Gradient Problem
Exploding Gradient Problem
While training a neural network, if the gradient tends to grow exponentially instead of decaying, this is called an
exploding gradient. This problem arises when large error gradients accumulate, resulting in very large updates to the
neural network's weights during training.
Long training times, poor performance, and low accuracy are the major consequences of gradient problems.
• Sometimes, the gradients become extremely large during training.
• This makes the model's weight updates spiral out of control, causing erratic outputs.
• The model repeatedly multiplies large numbers, making the gradients grow larger and larger (see the example below).
Impact:
Training becomes unstable, and the model fails to learn properly.
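The mirror-image calculation, followed by the most common remedy, gradient clipping; the clipping call shown is
PyTorch's standard torch.nn.utils.clip_grad_norm_, and the threshold of 1.0 is simply a typical choice:

    # Illustrative only: a per-step gradient factor slightly above 1 blows up over a long sequence.
    factor, steps = 1.1, 50
    print(factor ** steps)   # ~117: the same signal explodes after 50 steps

    # Common remedy in a PyTorch training loop: rescale gradients before the optimizer step.
    # loss.backward()
    # torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # max_norm=1.0 is a typical choice
    # optimizer.step()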
Issue                 Cause                                  Impact                                Solution
Vanishing Gradient    Weights < 1 (during backpropagation)   Cannot learn long-term dependencies   LSTMs, GRUs, ReLU, Clipping
Exploding Gradient    Weights > 1 (during backpropagation)   Training instability                  Gradient Clipping, Regularization
Long Short-Term Memory (LSTM)
• Long Short-Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN) designed to better handle
  the vanishing gradient problem and learn long-term dependencies in sequential data. LSTMs are particularly useful
  for tasks like language modeling, text generation, machine translation, and time-series forecasting.
Why LSTMs?
Standard RNNs struggle to learn long-term dependencies because their gradients can either vanish (become too small)
or explode (become too large) during backpropagation. This makes them ineffective for tasks where context over long
sequences is important. LSTMs overcome this limitation through their unique architecture that allows them to
remember information for longer periods.
Long Short-Term Memory (LSTM)
Structure of LSTM
Cell State (Ct):
      The cell state acts as the memory of the LSTM. It carries information across time steps and can be modified by
      different gates. This is what allows LSTMs to maintain long-term dependencies.
Hidden State (ht):
      The hidden state is used for the output at each time step and is influenced by the cell state.
Gates:
      Gates are neural network layers that control the flow of information through the cell state.
      They use the sigmoid activation, σ(x) = 1 / (1 + e^(-x)), or the tanh activation, tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)).
      The gates include:
           Forget Gate: Decides what information from the cell state should be discarded.
           Input Gate: Decides what new information should be added to the cell state.
           Output Gate: Decides what part of the cell state should be output as the hidden state.
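A minimal NumPy sketch of one LSTM step under these definitions (the weight and bias names are illustrative; each
matrix acts on the concatenation of the previous hidden state and the current input):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, Wf, Wi, Wc, Wo, bf, bi, bc, bo):
        # Each gate looks at the previous hidden state and the current input.
        z = np.concatenate([h_prev, x_t])
        f_t = sigmoid(Wf @ z + bf)          # forget gate: what to discard from the old cell state
        i_t = sigmoid(Wi @ z + bi)          # input gate: what new information to store
        c_hat = np.tanh(Wc @ z + bc)        # candidate values for the cell state
        c_t = f_t * c_prev + i_t * c_hat    # updated cell state Ct (the long-term memory)
        o_t = sigmoid(Wo @ z + bo)          # output gate: what part of the cell state to expose
        h_t = o_t * np.tanh(c_t)            # new hidden state ht
        return h_t, c_t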
Gated Recurrent Unit (GRU) Networks
• GRU is another type of RNN that is designed to address the vanishing gradient problem.
• It has two gates: the reset gate and the update gate.
• The reset gate determines how much of the previous state should be forgotten, while the update gate determines
  how much of the new state should be remembered.
• This allows the GRU network to selectively update its internal state based on the input sequence.
How GRUs Work in Simple Terms
Think of GRUs as having a mechanism to decide what to remember and what to forget at each step:
     • Update Gate: Controls how much of the past should be kept and how much should be replaced with new
        information.
     • Reset Gate: Helps decide how much of the past should be ignored when generating the new hidden state.
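A matching sketch of one GRU step, reusing the sigmoid helper from the LSTM sketch above (again with illustrative
weight names):

    def gru_step(x_t, h_prev, Wz, Wr, Wh, bz, br, bh):
        # The update gate blends old and new state; the reset gate limits the old state's influence.
        z_in = np.concatenate([h_prev, x_t])
        z_t = sigmoid(Wz @ z_in + bz)                                     # update gate
        r_t = sigmoid(Wr @ z_in + br)                                     # reset gate
        h_hat = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]) + bh)    # candidate hidden state
        h_t = (1.0 - z_t) * h_prev + z_t * h_hat                          # interpolate old state and candidate
        return h_t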
Compare GRU vs LSTM
Here is a comparison of Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) networks
Structure
    GRU: Simpler structure with two gates (update and reset gate).
    LSTM: More complex structure with three gates (input, forget, and output gate).
Parameters
    GRU: Fewer parameters (3 weight matrices: update gate, reset gate, and candidate hidden state).
    LSTM: More parameters (4 weight matrices: input, forget, and output gates, and candidate cell state).
Training
    GRU: Faster to train.
    LSTM: Slower to train.
Space Complexity
    GRU: In most cases, GRUs use less memory due to their simpler structure and fewer parameters, making them better suited to large datasets or long sequences.
    LSTM: The more complex structure and larger number of parameters can require more memory and be less efficient for large datasets or sequences.
Performance
    GRU: Generally performs similarly to LSTM on many tasks; in some cases GRU has been shown to outperform LSTM and vice versa. It is best to try both and see which works better for your dataset and task.
    LSTM: Generally performs well on many tasks but is more computationally expensive and requires more memory. LSTM has advantages over GRU in natural language understanding and machine translation tasks.
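In practice the two are close to drop-in replacements for each other. A hedged PyTorch sketch (made-up sizes) shows
that only the recurrent layer changes, which makes the "try both and compare" advice cheap to follow:

    import torch.nn as nn

    def make_model(cell="gru", vocab_size=10_000, embed_dim=64, hidden_dim=128, num_classes=2):
        # Identical surrounding architecture; only the recurrent layer differs.
        rnn_cls = nn.GRU if cell == "gru" else nn.LSTM
        return nn.ModuleDict({
            "embed": nn.Embedding(vocab_size, embed_dim),
            "rnn": rnn_cls(embed_dim, hidden_dim, batch_first=True),
            "head": nn.Linear(hidden_dim, num_classes),
        })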
Thank You