MACHINE LEARNING TECHNIQUES
AGENDA:
1. Bidirectional RNN
2. Encoder–Decoder Sequence-to-Sequence Architecture
3. Deep Recurrent Networks
4. Recursive Neural Networks
Two Issues of Standard RNNs
• Vanishing Gradient Problem
– Recurrent Neural Networks let you model time-dependent and sequential data problems, such as stock market prediction, machine translation, and text generation. However, RNNs are hard to train because of the gradient problem.
– RNNs suffer from vanishing gradients. Gradients carry the information used to update the RNN's parameters, and when the gradient becomes too small, the parameter updates become insignificant. This makes learning from long data sequences difficult.
• Exploding Gradient Problem
– While training a neural network, if the gradient tends to grow exponentially instead of decaying, it is called an exploding gradient. This problem arises when large error gradients accumulate, resulting in very large updates to the model weights during training. A common remedy is gradient clipping, sketched below.
– Long training time, poor performance, and low accuracy are the major symptoms of gradient problems.
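
A minimal sketch of gradient clipping, the standard remedy for exploding gradients. PyTorch is assumed here (the slides name no framework); the layer sizes, the random data, and the max_norm value are all illustrative.

    import torch
    import torch.nn as nn

    rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
    head = nn.Linear(16, 1)
    params = list(rnn.parameters()) + list(head.parameters())
    opt = torch.optim.SGD(params, lr=0.01)

    x = torch.randn(4, 100, 8)        # 4 sequences, 100 time steps each
    y = torch.randn(4, 1)

    out, _ = rnn(x)                   # out: (4, 100, 16)
    loss = nn.functional.mse_loss(head(out[:, -1]), y)
    loss.backward()

    # Rescale gradients so their global norm is at most 1.0; this bounds
    # the weight update even if gradients blow up over the 100 time steps.
    torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)
    opt.step()

Note that clipping does not address vanishing gradients; those are usually mitigated with gated units such as LSTM or GRU.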
BIDIRECTIONAL RNN
• Connect two hidden layers of opposite
directions to the same output.
• The output layer receives information from both past (via the forward direction) and future (via the backward direction) states simultaneously, as sketched below.
Applications:
Handwriting Recognition
Language Translation
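
A minimal bidirectional RNN sketch, again assuming PyTorch with illustrative sizes: passing bidirectional=True runs one RNN forward and one backward over the same input and concatenates their hidden states.

    import torch
    import torch.nn as nn

    birnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True,
                   bidirectional=True)

    x = torch.randn(4, 10, 8)     # (batch, time, features)
    out, h_n = birnn(x)

    # Forward and backward hidden states are concatenated at every time
    # step, so each output position sees both past and future context.
    print(out.shape)              # torch.Size([4, 10, 32]) = 2 * hidden_size
    print(h_n.shape)              # torch.Size([2, 4, 16]), one final state per direction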
Encoder–Decoder Sequence-to-Sequence Architecture
• A special case of RNNs that maps an input sequence to an output sequence, where the two lengths may differ.
• The model has three parts:
• Encoder
• Intermediate (encoder) vector
• Decoder
[Diagram: a chain of encoder RNN units feeds the encoder vector, which feeds a chain of decoder RNN units]
Encoder: A stack of recurrent units, each accepting one element of the input sequence and passing its hidden state forward, so the input is compressed step by step.
Encoder Vector: The final hidden state of the encoder. It aims to summarize the entire input sequence and serves as the initial hidden state of the decoder.
Decoder: A stack of recurrent units that starts from the encoder vector and emits one element of the output sequence at each time step.
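
A minimal encoder–decoder sketch under stated assumptions: PyTorch, GRU units, greedy feed-back of each prediction as the next decoder input, and a caller-chosen output length. All sizes and names (Seq2Seq, tgt_len) are illustrative, not a full implementation.

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, in_dim=8, hid=16, out_dim=8):
            super().__init__()
            self.encoder = nn.GRU(in_dim, hid, batch_first=True)
            self.decoder = nn.GRU(out_dim, hid, batch_first=True)
            self.proj = nn.Linear(hid, out_dim)

        def forward(self, src, tgt_len=5):
            # Encoder: read the whole input; the final hidden state is
            # the intermediate encoder vector summarizing the sequence.
            _, context = self.encoder(src)           # (1, batch, hid)
            # Decoder: start from the encoder vector and emit one output
            # per step, feeding each prediction back in as the next input.
            step = torch.zeros(src.size(0), 1, self.proj.out_features)
            hidden, outputs = context, []
            for _ in range(tgt_len):
                dec_out, hidden = self.decoder(step, hidden)
                step = self.proj(dec_out)            # (batch, 1, out_dim)
                outputs.append(step)
            return torch.cat(outputs, dim=1)         # (batch, tgt_len, out_dim)

    src = torch.randn(4, 10, 8)                # input length 10
    print(Seq2Seq()(src, tgt_len=5).shape)     # output length 5 differs from input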
Applications
Sentiment Analysis
Machine Translation
Speech Recognition
Video and Image Captioning
Text Summarization
Chatbot creation
DEEP RECURRENT NETWORKS
• A class of neural networks in which several recurrent hidden layers are stacked on top of one another, so the connections between nodes form a deep computation graph.
• The computation process:
1. Input to the Hidden state
2. Between two hidden states
3. Hidden state to the output.
DEEP RECURRENT NETWORKS
• Advantages:
– Process inputs of any length
– Possess internal memory
• Disadvantages:
– Initialization of the model requires care to obtain convergence.
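
A minimal sketch of a deep (stacked) recurrent network, with PyTorch assumed and all sizes illustrative: num_layers=3 stacks three recurrent layers, so each layer's hidden states become the inputs of the layer above.

    import torch
    import torch.nn as nn

    deep_rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=3,
                      batch_first=True)

    x = torch.randn(4, 10, 8)     # sequences of any length are accepted
    out, h_n = deep_rnn(x)

    print(out.shape)              # torch.Size([4, 10, 16]), top layer's states
    print(h_n.shape)              # torch.Size([3, 4, 16]), one final state per layer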
RECURSIVE NEURAL NETWORKS
• Apply the same set of weights recursively over a structured input to produce a structured prediction over variable-size input structures.
• Types:
• Inner Approach
• Outer Approach
• Application: Sentiment Analysis
Inner Approach
• Conduct recursion inside the underlying graph; the objective is usually achieved by moving forward (bottom-up) through the structure.
Outer Approach
• Conduct recursion from outside the underlying graph.
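
A minimal recursive-network sketch following the inner (bottom-up) approach above: one shared weight matrix is applied recursively at every node of a binary tree. PyTorch is assumed, and the nested-tuple tree encoding and all sizes are illustrative.

    import torch
    import torch.nn as nn

    dim = 16
    compose = nn.Linear(2 * dim, dim)   # the same weights reused at every node

    def encode(node):
        # A leaf is a vector; an internal node is a pair (left, right).
        if isinstance(node, torch.Tensor):
            return node
        left, right = node
        # Merge the two child representations into one parent vector.
        return torch.tanh(compose(torch.cat([encode(left), encode(right)])))

    # Tree for a three-word phrase: ((w1 w2) w3)
    w1, w2, w3 = (torch.randn(dim) for _ in range(3))
    root = encode(((w1, w2), w3))
    print(root.shape)   # torch.Size([16]), a fixed-size vector for the whole tree

For sentiment analysis, a classifier head would read the root vector; the same encode function handles trees of any shape because the weights are shared.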
DIFFERENCE BETWEEN RECURSIVE AND RECURRENT NEURAL NETWORKS
• Structure: Recursive networks have a tree-like structure; recurrent networks have a chain-like structure.
• Weights: Recursive networks apply the same weights repeatedly over the nodes of a tree; recurrent networks reuse the same weights at each time step of a chain.
• Cost: Recursive networks are complex and expensive at the learning phase; recurrent networks are computationally less complex.