Al3502 - DLV Unit 2

The document provides an overview of deep learning, focusing on deep feed-forward neural networks, gradient descent, back-propagation, and challenges such as the vanishing gradient problem. It discusses the structure and functioning of neural networks, optimization techniques, and mitigation strategies to enhance model performance and fairness. Additionally, it highlights heuristics for avoiding local minima and accelerating training processes in deep learning applications.

Uploaded by

swethakarthi1619


Al3502 - DEEP LEARNING FOR VISION

UNIT II INTRODUCTION TO DEEP LEARNING

Deep Feed-Forward Neural Networks – Gradient Descent – Back-Propagation and
Other Differentiation Algorithms – Vanishing Gradient Problem – Mitigation –
Rectified Linear Unit (ReLU) – Heuristics for Avoiding Bad Local Minima – Heuristics
for Faster Training – Nesterov's Accelerated Gradient Descent – Regularization for Deep
Learning – Dropout – Adversarial Training – Optimization for Training Deep Models.

DEEP LEARNING

• Deep learning is a subfield of machine learning that uses artificial neural networks
with multiple layers to model and learn complex representations of data.
• It is particularly effective for processing unstructured data such as images, audio,
and natural language.

NEURAL NETWORKS

A Neural Network is a computational model inspired by the structure and functioning of the
human brain, designed to recognize patterns and solve problems by learning from data
through interconnected layers of artificial neurons.
• Neurons: The basic units that receive inputs. Each neuron is governed by a threshold
and an activation function.
• Connections: Links between neurons that carry information, regulated by weights
and biases.
• Weights and Biases: These parameters determine the strength and influence of
connections.
• Propagation Functions: Mechanisms that help process and transfer data across layers
of neurons.
• Learning Rule: The method that adjusts weights and biases over time to improve
accuracy.

1. Input Layer: This is where the network receives its input data. Each input neuron in
the layer corresponds to a feature in the input data.
2. Hidden Layers: These layers perform most of the computational heavy lifting. A
neural network can have one or multiple hidden layers. Each layer consists of units
(neurons) that transform the inputs into something that the output layer can use.
3. Output Layer: The final layer produces the output of the model. The format of these
outputs depends on the specific task, such as classification or regression.
DEEP FEED-FORWARD NEURAL NETWORKS

A Feedforward Neural Network (FNN) is a type of artificial neural network in which
information flows in a single direction, from the input layer through the hidden layers to
the output layer, without loops or feedback. It is mainly used for pattern recognition tasks
like image and speech classification.
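
As a rough illustration of this one-way flow, the sketch below pushes an input through one hidden layer and one output layer in plain Python. The layer sizes, the weight values, and the helper names `dense` and `forward` are made up for the example and are not part of these notes.

```python
import math

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one row of W per neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x):
    # Hypothetical weights: 2 inputs -> 3 hidden neurons -> 1 output neuron.
    hidden = dense(x, [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], relu)
    # The output layer only sees the hidden activations; nothing flows backward.
    return dense(hidden, [[0.7, -0.5, 0.2]], [0.05], sigmoid)

prediction = forward([1.0, 2.0])
print(prediction)  # a list holding one value between 0 and 1
```

Each layer only consumes the output of the layer before it, which is what "feed-forward" means here.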

Advantages of Deep Feed-Forward Neural Networks (DFNN)

• Can model complex non-linear relationships
• Suitable for image, text, sound, and tabular data
• Easily scalable with hardware (e.g., GPUs)

Limitations

• Needs large data for effective training
• May overfit without regularization
• Computationally expensive

Real-Life Applications

 Image classification
 Disease prediction
 Game-playing AI
 Stock price forecasting
GRADIENT DESCENT
Gradient Descent is an optimization algorithm used to minimize the loss function in machine
learning and deep learning models by adjusting model parameters (like weights and biases) in
the direction of the steepest descent (i.e., negative gradient).

Simply put, gradient descent helps us find the best possible parameters (weights) for our
models so that they can make accurate predictions.

Why is it Important?

In the context of deep learning, especially in computer vision tasks (like identifying objects in
images), we need our models to learn from data effectively. Gradient Descent helps us adjust
the model's parameters in such a way that the difference between the predicted values and
actual values (the error) is minimized.

Step-by-Step Explanation

1. Initialization:

Start with random values for the model parameters (weights). Think of this as starting at a
random point on a hilly landscape.

2. Calculate the Loss:

Use a loss function to measure how far off your model's predictions are from the
actual results. This is like calculating the height of the hill at your starting point.

3. Compute the Gradient:

The gradient is a vector that points in the direction of the steepest ascent, and its
magnitude indicates how quickly the loss increases in that direction. Because we want to go
downhill, we move the opposite way: the negative gradient tells us which way to go.

4. Update the Parameters:

Adjust the parameters in the opposite direction of the gradient. This is like taking a step down
the hill. The size of the step is determined by a value called the learning rate.

5. Repeat:

Continue calculating the loss, computing the gradient, and updating the parameters
until the model’s performance stops improving significantly or until a set number of
iterations is reached.
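
The five steps above can be sketched on a one-parameter problem. The toy loss `(w - 3)^2`, the starting point, the learning rate, and the stopping tolerance are all assumptions chosen for illustration; a real model would have many parameters and a data-dependent loss.

```python
def loss(w):
    # Toy loss with its minimum at w = 3 (an assumption for illustration).
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of the toy loss with respect to w.
    return 2.0 * (w - 3.0)

def gradient_descent(w_start, learning_rate=0.1, max_steps=1000, tolerance=1e-8):
    w = w_start                                     # 1. initialization
    for step in range(max_steps):
        g = gradient(w)                             # 3. compute the gradient
        new_w = w - learning_rate * g               # 4. step against the gradient
        if abs(loss(w) - loss(new_w)) < tolerance:  # 2 and 5. stop when the loss stalls
            return new_w, step + 1
        w = new_w
    return w, max_steps

w_final, steps_taken = gradient_descent(w_start=-4.0)
print(w_final, steps_taken)
```

The update line `new_w = w - learning_rate * g` is the whole algorithm; everything else is bookkeeping around when to stop.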

VISUALIZING GRADIENT DESCENT

Imagine a ball at the top of a hill (representing high loss). The ball rolls down to find the
lowest point (the minimum loss). Each time the ball rolls, it takes the steepest path
downward, which is analogous to following the gradient in Gradient Descent.

Real-Life Application

Example: Image Classification

Task: Suppose we want to teach a computer to recognize cats in photos.

Process:

• We initialize our model with random weights.
• We feed it a series of images (some with cats, some without) and calculate how well it
identifies the cats using a loss function.
• The model calculates the gradients and updates its weights using Gradient Descent.
• After several iterations, the model learns to recognize cats more accurately.

BACK-PROPAGATION
• Backpropagation is a crucial concept in deep learning, particularly when training
neural networks to recognize patterns in data, such as images.
• Backpropagation is the algorithm that computes how much each weight in a neural
network contributed to the prediction error. It applies the chain rule backward from the
output layer, and an optimizer then uses the resulting gradients to adjust the weights
during training.

Why is Back-Propagation Important?

 Learning from Mistakes: Just like how we learn from our mistakes, back-
propagation helps neural networks learn by correcting their errors.
 Improving Accuracy: By repeatedly adjusting weights, the model becomes better at
making accurate predictions over time.
Step-by-Step Explanation of Back-Propagation

1. Forward Pass:
 The input data (like an image) is passed through the neural network.
 Each neuron processes the input and passes its output to the next layer.
 At the end, the network produces an output (like a prediction of what is in the image).
2. Calculate Error:
 The output of the network is compared to the actual answer (the ground truth).
 The difference between the predicted output and the actual answer is calculated using
a loss function (a measure of how wrong the prediction is).

3. Backward Pass:
 The error is then propagated backward through the network.
 This involves calculating the gradient (slope) of the error with respect to each weight
in the network. The gradient tells us how much to change the weights to reduce the
error.

4. Update Weights:
 Using the gradients calculated, the weights are updated to minimize the error. This is
typically done using an optimization algorithm like Stochastic Gradient Descent
(SGD).
 The weights are adjusted slightly in the direction that reduces the error, based on the
calculated gradients.

5. Repeat:
• Steps 1 to 4 are repeated for many iterations, each with a different batch of training
data (one full pass over the training set is called an epoch), until the model's
performance stabilizes or reaches an acceptable level.
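
The five steps above can be sketched for the smallest possible network: one input, one tanh hidden unit, and one linear output, trained on a single example with squared error. The architecture, the starting weights, and the function names are assumptions made for this sketch only.

```python
import math

def forward(params, x):
    w1, b1, w2, b2 = params
    z = w1 * x + b1
    h = math.tanh(z)      # hidden activation
    y = w2 * h + b2       # network output
    return z, h, y

def loss_fn(params, x, target):
    _, _, y = forward(params, x)
    return 0.5 * (y - target) ** 2

def backprop(params, x, target):
    w1, b1, w2, b2 = params
    z, h, y = forward(params, x)        # 1. forward pass
    dy = y - target                     # 2. error: dLoss/dy for squared error
    dw2, db2 = dy * h, dy               # 3. backward pass through the output layer
    dh = dy * w2                        #    error sent back to the hidden unit
    dz = dh * (1.0 - h ** 2)            #    tanh'(z) = 1 - tanh(z)^2
    dw1, db1 = dz * x, dz
    return [dw1, db1, dw2, db2]

def sgd_step(params, grads, learning_rate=0.1):
    # 4. update: move every weight a small step against its gradient.
    return [p - learning_rate * g for p, g in zip(params, grads)]

params = [0.5, 0.0, -0.3, 0.1]          # hypothetical starting weights
x, target = 1.5, 1.0
before = loss_fn(params, x, target)
for _ in range(200):                    # 5. repeat
    params = sgd_step(params, backprop(params, x, target))
after = loss_fn(params, x, target)
print(before, after)
```

A common sanity check, not shown here, is to compare the gradients from `backprop` against finite-difference estimates of `loss_fn`.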

VANISHING GRADIENT PROBLEM

In deep learning, particularly when we talk about training neural networks, one of the
challenges we face is the Vanishing Gradient Problem. This issue can significantly hinder our
ability to train deep networks effectively. Let's break it down step by step.

Step 1: Understanding Neural Networks

 Neural Networks are computational models inspired by the human brain. They
consist of layers of nodes (neurons) that process inputs to produce outputs.
 Each connection between nodes has a weight, which is adjusted during training to
improve the network's performance.
Step 2: The Role of Gradients

• Gradients are numerical values that indicate how much the loss changes when a weight
changes by a small amount. They are essential for training neural networks because they
guide the adjustments made to the weights.
• During training, we use a method called backpropagation to compute these gradients.
This process helps us update the weights effectively.

Step 3: What Happens During Backpropagation?

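• When backpropagation is executed, gradients are calculated from the output layer
back to the input layer.
• In deep networks (networks with many layers), the chain rule multiplies together the
local derivatives of every layer the error passes through. If these factors are smaller
than one (the derivative of the sigmoid, for example, never exceeds 0.25), their product
shrinks roughly exponentially with depth, so the gradient can become extremely small by
the time it reaches the early layers.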

Step 4: The Vanishing Gradient Phenomenon

 When the gradients become very small, the updates to the weights of the earlier
layers (closer to the input) are negligible. This means that those layers learn very
slowly, if at all.
 As a result, the network might struggle to learn important features in the data, leading
to poor performance.
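
A rough numerical picture of this shrinking: the sketch below multiplies the same local derivative once per layer and compares sigmoid with ReLU. It is a deliberate simplification that assumes every layer has the same pre-activation value and a weight of 1, which real networks do not.

```python
import math

def sigmoid_derivative(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)          # never larger than 0.25

def relu_derivative(z):
    return 1.0 if z > 0 else 0.0  # exactly 1 for active units

def gradient_scale(derivative, depth, z=0.0):
    """Product of `depth` identical local derivatives (weights assumed to be 1)."""
    scale = 1.0
    for _ in range(depth):
        scale *= derivative(z)
    return scale

for depth in (1, 5, 10, 20):
    print(depth,
          gradient_scale(sigmoid_derivative, depth),
          gradient_scale(relu_derivative, depth, z=1.0))
```

Even in the best case for sigmoid (z = 0, where its derivative peaks at 0.25), ten layers scale the gradient by 0.25^10, which is below one millionth, while an active ReLU path leaves it unchanged.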

Solutions to the Vanishing Gradient Problem

1. Use of Activation Functions:

Certain functions, like ReLU (Rectified Linear Unit), help maintain gradients better
than traditional activation functions like sigmoid or tanh.

2. Batch Normalization:

This technique normalizes the inputs to each layer, helping to stabilize and accelerate
training.

3. Skip Connections:

These connections allow gradients to bypass certain layers, making it easier for the
network to learn.

4. Proper Initialization:

Initializing weights correctly can prevent gradients from becoming too small in the
first place.
MITIGATION
Mitigation, in the context of deep learning for vision, refers to strategies and techniques used
to reduce negative outcomes or risks associated with using deep learning models. These
negative outcomes could include biases in predictions, inaccuracies in image recognition, or
failures in model performance. The goal of mitigation is to ensure that the models are fair,
reliable, and effective.

Why is Mitigation Important?

1. Accuracy: Models need to make correct predictions based on visual data. Mitigation
helps improve accuracy by addressing potential errors.

2. Fairness: Deep learning models can unintentionally learn biases from the data they
are trained on. Mitigation techniques aim to identify and reduce these biases, leading to fairer
outcomes for all users.

3. Robustness: Models should perform well even when presented with unfamiliar data.
Mitigation strategies help enhance the robustness of models against unexpected inputs.

Steps to Mitigation in Deep Learning for Vision

1. Identify Potential Risks

 Analyze the data and the model to identify what could go wrong.
 Look for biases in the dataset, such as underrepresentation of certain groups or
classes.

2. Data Augmentation

 What It Is: This technique involves modifying the training images to create new
variations.
 Example: If you have a picture of a cat, you can rotate, zoom, or change its
brightness to make new versions.
 Benefit: This helps the model learn better by providing diverse examples.

3. Bias Detection

 Use statistical methods to check for biases in the model's predictions.

 Example: If a facial recognition model performs poorly on certain ethnic groups, it
indicates bias.
 Mitigation: Adjust the training data to include more diverse examples.

4. Model Evaluation

 Continuously assess the model with new data to ensure it performs well across
different scenarios.
 Use metrics like precision, recall, and F1-score to evaluate performance.
5. Feedback Loop

 Incorporate feedback from users and stakeholders to improve the model iteratively.
 Example: If users report inaccuracies, take that feedback to adjust the model.
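
The metrics named in the Model Evaluation step can be computed directly from predictions and labels. The sketch below is a minimal version for binary labels; the function name and the sample labels are invented for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical labels: 1 = "cat present", 0 = "no cat".
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Running the same function separately on each subgroup of the test set is one simple way to carry out the bias check described in step 3: a large gap in recall between groups is a warning sign.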

RECTIFIED LINEAR UNIT (RELU)

 Definition: An activation function determines whether a neuron in a neural network
should be activated or not. It helps the model learn complex patterns in the data.
 Purpose: It introduces non-linearity into the model, allowing it to learn from errors
and make better predictions.

ReLU(x) = max(0, x)

• If the input x is greater than 0, ReLU outputs x.

• If x is less than or equal to 0, it outputs 0.

Steps

1. Input to the Neuron: When data is fed into the neuron, it produces a numerical
output (let’s call this output x).

2. Applying ReLU:

• If x > 0, the output remains x.

• If x ≤ 0, the output becomes 0.

3. Output: The result is then passed on to the next layer in the neural network.

Why Use ReLU?

• Simplicity: It is computationally efficient since it only requires a simple thresholding
at zero.
• Performance: It helps mitigate the vanishing gradient problem, allowing models to
learn faster and perform better.
• Sparsity: ReLU creates sparse representations, meaning that it activates only a
portion of the neurons, making the model more efficient.
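
A minimal sketch of ReLU and its derivative follows. The sample pre-activation values are arbitrary, and taking the derivative at exactly zero to be 0 is a common convention rather than something stated in these notes.

```python
def relu(x):
    return max(0.0, x)

def relu_derivative(x):
    # 1 for positive inputs, 0 otherwise (0 at x == 0 by convention).
    return 1.0 if x > 0 else 0.0

pre_activations = [-2.0, -0.5, 0.0, 0.5, 2.0]
activations = [relu(x) for x in pre_activations]
print(activations)  # [0.0, 0.0, 0.0, 0.5, 2.0]

# Fraction of neurons that output zero: a rough measure of sparsity.
sparsity = activations.count(0.0) / len(activations)
print(sparsity)  # 0.6
```

The derivative is exactly 1 wherever the unit is active, which is why gradients passing through active ReLU units are not scaled down.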
HEURISTICS FOR AVOIDING BAD LOCAL MINIMA

Introduction to Local Minima

 Local Minima: These are small dips or valleys in the landscape where we might get
stuck. They are not the lowest point (global minimum) but are lower than their
immediate surroundings.
 Global Minimum: This is the deepest point in the landscape, representing the best
possible outcome for our model.

The Challenge
If our optimization process gets stuck in a local minimum, it won't find the best solution. This
can lead to a model that doesn’t perform well, which is why we need strategies (heuristics) to
avoid this problem.

Heuristics to avoid bad local minima

1. Initialization Strategies
 Random Initialization: Start with random values for weights instead of setting them
all to zero. This increases the chances of exploring different paths in the optimization
landscape.
 Heuristic Initialization: Use techniques like Xavier or He initialization to set
weights based on the number of input and output neurons, helping in better
convergence.

2. Learning Rate Adjustment

 Adaptive Learning Rates: Use optimizers like Adam or RMSprop that adjust the
learning rate during training. A learning rate that changes can help navigate out of
local minima.
 Learning Rate Schedules: Gradually decrease the learning rate as training
progresses, which can help in fine-tuning the model once it’s near a minimum.

3. Momentum
 Incorporate Momentum: This technique helps the optimization algorithm maintain
its direction and speed, allowing it to "roll over" small local minima instead of
getting stuck.
 How it Works: Like pushing a heavy ball down a hill, momentum helps carry the
optimization past small dips in the landscape.
4. Adding Noise
 Stochastic Gradient Descent (SGD): Instead of using the entire dataset to calculate
gradients, use a small random subset. This introduces noise and variability, which
can help escape local minima.
 Dropout: In neural networks, randomly dropping out nodes during training can
prevent the model from becoming overly reliant on specific features, leading to better
generalization.

5. Batch Normalization
Normalizing Activations: This technique helps maintain healthy distributions of layer
inputs, allowing for faster training and reducing the chance of getting stuck in poor local
minima.

6. Ensemble Methods
Combine Multiple Models: Train different models and combine their predictions. This
can help mitigate the effects of any one model getting stuck in a local minimum.
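
Two of the heuristics above, weight initialization and dropout, are short enough to sketch in plain Python. The formulas used are the usual normal-distribution variants of Xavier and He initialization and the "inverted" form of dropout; the layer sizes, seed, and function names are assumptions for this example.

```python
import math
import random

def xavier_std(fan_in, fan_out):
    # Xavier (Glorot) normal initialization: std = sqrt(2 / (fan_in + fan_out)).
    return math.sqrt(2.0 / (fan_in + fan_out))

def he_std(fan_in):
    # He normal initialization, commonly paired with ReLU: std = sqrt(2 / fan_in).
    return math.sqrt(2.0 / fan_in)

def init_weights(fan_in, fan_out, std, rng):
    # One row of weights per output neuron, drawn from a normal with the given std.
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

def dropout(activations, drop_prob, rng):
    # Inverted dropout: zero each unit with probability drop_prob and rescale
    # the survivors so the expected activation stays the same.
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = init_weights(fan_in=256, fan_out=128, std=he_std(256), rng=rng)
dropped = dropout([1.0] * 10, drop_prob=0.5, rng=rng)
print(dropped)  # a mix of 0.0 and 2.0 values
```

Dropout is applied only during training; at test time the activations are used as they are, which is why the survivors are rescaled here rather than at inference.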

HEURISTICS FOR FASTER TRAINING

Training deep learning models can be time-consuming and resource-intensive. Heuristics
help us optimize this process, allowing models to learn faster and more effectively without
sacrificing performance.

Steps to Implement Heuristics for Faster Training

1. Data Augmentation
• What is it?: This technique involves creating variations of the training data. For
example, if you have a picture of a dog, you can flip it, rotate it, or change its
brightness to get new images.
• How it helps: By providing more diverse data, it helps the model learn better and
generalize well without needing more raw data.

2. Learning Rate Scheduling

 What is it?: The learning rate is a parameter that controls how much to change the
model in response to the estimated error each time the model weights are updated.
Scheduling means changing this rate during training.
 How it helps: Starting with a higher learning rate allows the model to learn quickly,
and then reducing it helps fine-tune the model for better accuracy.
3. Transfer Learning
 What is it?: Instead of training a model from scratch, transfer learning takes a pre-
trained model (one that has already learned from a large dataset) and fine-tunes it for
a specific task.
 How it helps: This approach saves time and resources, as the model already has a
good understanding of features relevant to many tasks.

4. Batch Normalization
• What is it?: This technique normalizes the inputs of each layer over the current
mini-batch so that they have approximately zero mean and unit standard deviation, and
then applies a learned scale and shift.
• How it helps: It stabilizes the learning process and allows for faster convergence,
meaning the model can learn more quickly.

5. Early Stopping
 What is it?: This heuristic involves monitoring the model's performance on a
validation set and stopping training when performance starts to degrade.
 How it helps: It prevents overfitting (where the model learns the training data too
well and performs poorly on new data) and saves time by not training longer than
necessary.
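
Early stopping is easy to sketch as a loop over per-epoch validation losses. The version below uses a "patience" counter, a common refinement that waits a few epochs before stopping; the patience value, the function name, and the simulated losses are assumptions for illustration.

```python
def early_stopping(validation_losses, patience=2):
    """Return (best_epoch, best_loss, epochs_run) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, val_loss in enumerate(validation_losses):
        epochs_run = epoch + 1
        if val_loss < best_loss:
            # New best: remember it and reset the counter.
            best_loss, best_epoch = val_loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stopped improving
    return best_epoch, best_loss, epochs_run

# Simulated losses: improve for four epochs, then start to degrade.
simulated = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
best_epoch, best_loss, epochs_run = early_stopping(simulated)
print(best_epoch, best_loss, epochs_run)  # 3 0.5 6
```

In practice the weights from `best_epoch` would be saved and restored, so the two extra epochs cost a little time but not accuracy.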

NESTEROV'S ACCELERATED GRADIENT DESCENT

Nesterov's Accelerated Gradient Descent (NAG) is an optimization technique used in
machine learning, particularly in deep learning, to speed up training and improve
convergence. Let's break this concept down step by step.

Understanding Gradient Descent

Before diving into Nesterov's method, it’s essential to understand the basic concept of
Gradient Descent:

1. Objective: The main goal of gradient descent is to minimize a function, usually the
loss function in machine learning, which measures how well the model is performing.

2. How It Works:

 Step 1: Start with an initial point (the model's parameters).

 Step 2: Calculate the gradient (the slope of the function) at that point.
 Step 3: Move in the opposite direction of the gradient (downhill) to reach a lower
point. The step size is determined by a parameter called the learning rate.
 Step 4: Repeat this process until you reach a point where the function is minimized
(or until you have sufficiently reduced the loss).
What Makes Nesterov’s Method Special?
Nesterov's method builds on gradient descent with momentum and adds a "lookahead" step
to it. Here’s how it works:

1. Momentum: Instead of just using the current gradient to update the parameters,
momentum helps the optimization process to keep moving in the right direction. Think of it
like a ball rolling down a hill: it gains speed as it goes downhill, which helps it overcome
small bumps along the way.

2. Nesterov’s Approach:

• In Nesterov's method, we first make a “lookahead” step by predicting where we will
be after applying momentum.
• Step 1: Calculate the gradient at this predicted position (not just the current position).
• Step 2: Use this gradient to update the parameters. This way, the update step
considers where we are headed, resulting in more informed updates.

Step-by-Step Explanation of Nesterov's Accelerated Gradient Descent

1. Initialize Parameters: Start with initial values for the model parameters, set the
velocity (momentum term) to zero, and choose the learning rate and momentum coefficient.
2. Lookahead Position: Predict the next position by moving the current parameters along
the previous velocity, scaled by the momentum coefficient.
3. Compute Gradient: Calculate the gradient at the lookahead position.
4. Calculate Momentum: Compute the new velocity as a combination of the previous
velocity and the gradient from the lookahead position.
5. Update Parameters: Update the model parameters by adding the new velocity.
6. Repeat: Continue this process until the loss function converges or reaches a
satisfactory level.
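
One common way to write these steps is sketched below on a one-parameter toy problem. The loss `(w - 3)^2`, the learning rate of 0.1, and the momentum coefficient of 0.9 are assumptions for illustration, and other equivalent formulations of the update exist.

```python
def gradient(w):
    # Gradient of the toy loss f(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)

def nesterov(w, learning_rate=0.1, momentum=0.9, steps=200):
    velocity = 0.0                                  # initialize the momentum term
    for _ in range(steps):
        lookahead = w + momentum * velocity         # where momentum alone would carry us
        g = gradient(lookahead)                     # gradient at the lookahead position
        velocity = momentum * velocity - learning_rate * g  # new velocity
        w = w + velocity                            # parameter update
    return w

w_final = nesterov(w=-4.0)
print(w_final)  # very close to 3.0
```

The only difference from plain momentum is the argument of `gradient`: plain momentum would evaluate it at `w`, while this version evaluates it at `lookahead`.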
REGULARIZATION FOR DEEP LEARNING
Regularization is a technique used to prevent a machine learning model from becoming too
complex. When a model is too complex, it can perform very well on the training data but
poorly on new, unseen data. This phenomenon is known as overfitting.

Key Terms:

 Overfitting: When a model learns the noise and details of the training data too well,
leading to poor performance on new data.
 Underfitting: When a model is too simple and fails to capture the underlying patterns
in the data.

Why Do We Need Regularization?

Think of regularization like a coach for athletes. Just as a coach helps athletes avoid
overtraining and maintain a balanced approach to their sport, regularization helps models
maintain a balance between learning from the data and not memorizing it.

Types of Regularization Techniques

There are several methods of regularization, but we will focus on two common techniques:

1. L1 Regularization
2. L2 Regularization.

1. L1 Regularization (Lasso)

Concept: L1 regularization adds a penalty equivalent to the absolute value of the magnitude
of coefficients (weights) to the loss function. This encourages the model to use fewer
features, effectively performing feature selection.

Mathematical Representation:

Loss = Original Loss + λ * ∑ |weights|

Example: Imagine you’re teaching a student to identify different types of fruit. If they focus
too much on minor details (like the exact shade of yellow in a banana), they may confuse it
with a lemon. L1 regularization helps the model focus on the most important features (like
shape and size) instead of the minor details.

2. L2 Regularization (Ridge)

Concept: L2 regularization adds a penalty equal to the square of the magnitude of
coefficients to the loss function. This helps to keep the weights small and spread out across
features, reducing the chances of overfitting.

Mathematical Representation:

Loss = Original Loss + λ * ∑ (weights^2)

Example: Think of a student learning to play piano. If they try to memorize every single note
of a song instead of understanding the chords and structure, they may struggle to play
anything new. L2 regularization encourages the model to learn the general structure rather
than memorizing specific examples.
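
Both penalty formulas can be evaluated directly. In the sketch below the weight values, the data loss of 0.40, and λ = 0.01 are invented numbers used only to show the arithmetic.

```python
def l1_penalty(weights, lam):
    # λ * ∑ |weights|
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    # λ * ∑ (weights^2)
    return lam * sum(w * w for w in weights)

weights = [0.5, -1.5, 0.0, 2.0]
original_loss = 0.40   # hypothetical loss on the training data
lam = 0.01

loss_l1 = original_loss + l1_penalty(weights, lam)  # 0.40 + 0.01 * 4.0
loss_l2 = original_loss + l2_penalty(weights, lam)  # 0.40 + 0.01 * 6.5
print(loss_l1, loss_l2)
```

The largest weight (2.0) contributes half of the L1 penalty but more than half of the L2 penalty, which is why L2 pushes hardest on large weights, while L1 applies the same pressure to every non-zero weight and can drive small ones to exactly zero.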

Real-Life Applications of Regularization

1. Image Classification: When training models to classify images (e.g., dogs vs. cats),
regularization helps ensure that the model doesn't get distracted by irrelevant details (like
background elements), allowing it to focus on essential features like fur patterns and shapes.

2. Natural Language Processing: In tasks like sentiment analysis, regularization can
prevent models from overfitting to specific phrases or words, helping them to generalize
better to different expressions of sentiment.

ADVERSARIAL TRAINING
Adversarial training is a technique used in deep learning to improve the robustness of
machine learning models, particularly those used for image recognition and processing. It
helps models learn not just from regular images but also from challenging examples designed
to confuse them.

Why is it Important?
In the real world, models can encounter unexpected or tricky inputs that might make them
perform poorly. Adversarial training prepares models to handle these difficult cases by
exposing them to examples that are intentionally designed to mislead them.

Step-by-Step Explanation of Adversarial Training

1. Understanding Adversarial Examples:

Adversarial Examples are inputs to a model that have been slightly altered in a
way that is usually imperceptible to humans but causes the model to make an
incorrect prediction.

For example, if a model is trained to recognize cats, an image of a cat can be
slightly modified (like changing a few pixels) to trick the model into thinking
it’s a dog.
2. Creating Adversarial Examples:

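Techniques like the Fast Gradient Sign Method (FGSM) can be used to create
these adversarial examples. FGSM computes the gradient of the loss with respect to
the input image and shifts every pixel by a small amount ε in the direction of the
sign of that gradient: x_adv = x + ε * sign(gradient of Loss with respect to x). The
change to each pixel is small, but it is chosen to increase the loss as much as
possible.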

3. Training with Adversarial Examples:

During training, the model is shown both normal images and adversarial
examples, each with its correct label. By learning from both, it becomes better at
identifying real objects and at keeping its prediction correct when an image has
been altered to trick it.

For instance, if the model learns that a specific modification of a cat image
leads to it thinking it’s a dog, it can adjust its parameters to improve its
performance on similar tricks in the future.

4. Evaluating Model Robustness:

After training, the model is tested with a mix of normal and adversarial
examples to see how well it performs. A robust model will maintain its accuracy
even when faced with adversarial inputs.
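
FGSM can be sketched on a model small enough to differentiate by hand. The example below uses logistic regression on three input features instead of an image classifier; the weights, the input, and ε = 0.3 are invented for illustration.

```python
import math

def predict(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))     # probability of class 1

def bce_loss(p, y):
    # Binary cross-entropy for one example.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(weights, bias, x, y, epsilon):
    p = predict(weights, bias, x)
    # For logistic regression with cross-entropy loss, dLoss/dx_i = (p - y) * w_i.
    grad_x = [(p - y) * w for w in weights]
    # Move each feature by epsilon in the direction that increases the loss.
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]

weights, bias = [2.0, -1.0, 0.5], 0.1     # hypothetical trained model
x, y = [0.6, 0.2, 0.9], 1                 # an input the model classifies correctly
x_adv = fgsm(weights, bias, x, y, epsilon=0.3)

clean_loss = bce_loss(predict(weights, bias, x), y)
adv_loss = bce_loss(predict(weights, bias, x_adv), y)
print(clean_loss, adv_loss)  # the adversarial loss is the larger of the two
```

Adversarial training then adds `(x_adv, y)` to the training batch alongside `(x, y)`, so the model is penalized for changing its answer on the perturbed input.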

Real-Life Applications of Adversarial Training

• Self-Driving Cars: They must recognize road signs and pedestrians
accurately. Adversarial training helps ensure they can handle unusual
situations, like a sign that has been altered to confuse the model.
• Medical Image Diagnosis: In healthcare, models analyze X-rays or
MRIs. Adversarial training can help improve their accuracy, even when
images are slightly altered due to noise or other factors.
• Security Systems: Facial recognition systems can be tricked by small
changes in facial images. Adversarial training helps these systems
become more secure against attempts to bypass them.
OPTIMIZATION FOR TRAINING DEEP MODELS
Optimization in this context refers to the process of fine-tuning the parameters of a neural
network to improve its performance. Think of it as finding the best possible solution to a
problem — in our case, making the model as accurate as possible when predicting or
classifying data.

Why is Optimization Important?

1. Accuracy Improvement: A well-optimized model can significantly increase
accuracy in tasks like image recognition or natural language processing.
2. Efficiency: Optimization helps in reducing the time it takes for the model to learn
from the data, making the training process faster.
3. Resource Utilization: It ensures that computational resources (like GPU usage)
are used efficiently, which is crucial given the large datasets often involved in
deep learning.

Key Concepts in Optimization

1. Loss Function

The loss function measures how well the model's predictions match the actual outcomes. It
quantifies the difference between predicted values and actual values. Our goal in optimization
is to minimize this loss function.

Example: In a model that recognizes cats and dogs, if the model predicts a dog but the actual
image is of a cat, the loss function will give a higher value. We want to minimize these
errors.
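
The cat and dog example can be made concrete with binary cross-entropy, one common loss for two-class problems. The notes do not say which loss is meant, so the choice of loss and the probabilities below are assumptions for illustration.

```python
import math

def binary_cross_entropy(p_dog, is_dog):
    # p_dog: predicted probability that the image shows a dog.
    # is_dog: 1 if the image really is a dog, 0 if it is a cat.
    return -(is_dog * math.log(p_dog) + (1 - is_dog) * math.log(1 - p_dog))

loss_wrong = binary_cross_entropy(p_dog=0.9, is_dog=0)   # says "dog", image is a cat
loss_right = binary_cross_entropy(p_dog=0.1, is_dog=0)   # says "cat", image is a cat
print(loss_wrong, loss_right)
```

The confident wrong prediction costs about 2.30, while the confident correct one costs about 0.11, so minimizing the loss pushes the model toward the correct answer.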

2. Gradient Descent

Gradient Descent is one of the most common optimization algorithms used for training deep
models. It works by calculating the gradient (or slope) of the loss function and updating the
model's parameters in the opposite direction of that gradient.

Steps:

 Calculate the Gradient: Determine how much the loss function changes with respect
to each parameter.
 Update Parameters: Adjust the parameters slightly in the direction that reduces the
loss.
 Repeat: Continue this process until the loss is minimized and the model performs
satisfactorily.

Visual Cue: Imagine a ball rolling down a hill. The ball will naturally roll to the lowest point
(minimum loss). Gradient descent helps the model find its way to this "lowest point."
3. Learning Rate

The learning rate is a crucial hyperparameter in gradient descent. It determines how big of a
step we take in the parameter space when updating the model.

 Too High: The model might overshoot the minimum and fail to converge.
 Too Low: The model may take too long to converge, making the training inefficient.

Example: If you're climbing down a mountain, taking huge steps could lead you to fall off a
cliff (overshooting), while tiny steps will take forever to reach the bottom (too slow).
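
The effect of the learning rate is easy to see on a toy problem. The sketch below runs 20 steps of gradient descent on `(w - 3)^2` with three learning rates; the loss, the starting point, and the specific rates are assumptions chosen to make the three behaviours visible.

```python
def run(learning_rate, steps=20, w=-4.0):
    # Gradient descent on f(w) = (w - 3)^2; returns the final distance from the minimum.
    for _ in range(steps):
        w = w - learning_rate * 2.0 * (w - 3.0)
    return abs(w - 3.0)

too_high = run(learning_rate=1.1)    # every step overshoots and the error grows
too_low = run(learning_rate=0.001)   # barely moves in 20 steps
reasonable = run(learning_rate=0.3)  # ends very close to the minimum
print(too_high, too_low, reasonable)
```

The starting distance from the minimum is 7: the high rate ends much farther away than it started, the low rate has covered only a small fraction of the distance, and the middle rate has essentially arrived.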

4. Regularization

Regularization techniques are used during optimization to prevent the model from becoming
too complex and overfitting the training data. Overfitting happens when the model learns the
training data too well, including its noise and outliers.

• L1 and L2 Regularization: These add a penalty to the loss function based on the size
of the parameters, encouraging the model to keep the parameters small and simple.

Example: Think of it like a student studying for a test. If the student memorizes every detail
from the textbook without understanding the concepts, they might do well on that test but
struggle with real-world applications.

243 pages
Unit 5
No ratings yet
Unit 5
36 pages
Greedy-Layerwise in Deep Learning
No ratings yet
Greedy-Layerwise in Deep Learning
15 pages
Deep Learning - AD3501 - Notes - Unit 2 - Convolutional Neural Networks
No ratings yet
Deep Learning - AD3501 - Notes - Unit 2 - Convolutional Neural Networks
36 pages
DL Unit - 4
100% (1)
DL Unit - 4
14 pages
Unit 2
No ratings yet
Unit 2
10 pages
Excel & Access Data Integration
No ratings yet
Excel & Access Data Integration
6 pages
LM138, LM338 Adjustable Regulators
No ratings yet
LM138, LM338 Adjustable Regulators
14 pages
Prowatch5 0
No ratings yet
Prowatch5 0
10 pages
Siebelink, Voordijk, Adriaanse - 2018 - Developing and Testing A Tool To Evaluate BIM Maturity Sectoral Analysis in The Dutch Constructi
No ratings yet
Siebelink, Voordijk, Adriaanse - 2018 - Developing and Testing A Tool To Evaluate BIM Maturity Sectoral Analysis in The Dutch Constructi
14 pages
Model: RC522 Arduino Module RFID: WWW - Ekt WWW - Ekt
No ratings yet
Model: RC522 Arduino Module RFID: WWW - Ekt WWW - Ekt
2 pages
Literature Review of Speed Control of DC Motor Using Chopper
100% (1)
Literature Review of Speed Control of DC Motor Using Chopper
17 pages
Certification Guide Data Engineer 2024
No ratings yet
Certification Guide Data Engineer 2024
7 pages
Iphone Price List
No ratings yet
Iphone Price List
1 page
Comparison and Analysis of Delay Elements
No ratings yet
Comparison and Analysis of Delay Elements
4 pages
Outlier Detection - Weka - IQR
No ratings yet
Outlier Detection - Weka - IQR
7 pages
Phantom Forces AIMBOT - LOWER CAM SENSITIVITY
No ratings yet
Phantom Forces AIMBOT - LOWER CAM SENSITIVITY
2 pages
F1 Self-Checking MC Quiz Chapter 10 Manipulation of Simple Polynomials - PDF - Google Drive
No ratings yet
F1 Self-Checking MC Quiz Chapter 10 Manipulation of Simple Polynomials - PDF - Google Drive
1 page
R00 - 16-20MVA - 33-11kV - Datasheet
No ratings yet
R00 - 16-20MVA - 33-11kV - Datasheet
3 pages
Crysis Manual
0% (1)
Crysis Manual
14 pages
Cover Letter Template On Microsoft Word
100% (1)
Cover Letter Template On Microsoft Word
7 pages
CX600 Troubleshooting Guide
No ratings yet
CX600 Troubleshooting Guide
13 pages
Rohini 62912743812
No ratings yet
Rohini 62912743812
6 pages
470.223.02 474.64 Grid Vgpu Release Notes Ubuntu
No ratings yet
470.223.02 474.64 Grid Vgpu Release Notes Ubuntu
63 pages
Meaningful Metaverse Design Insights
No ratings yet
Meaningful Metaverse Design Insights
10 pages
Noun Modifier
No ratings yet
Noun Modifier
129 pages
A Philosophy of Computer Art
100% (1)
A Philosophy of Computer Art
16 pages
Eng User Manual DRG A226g
No ratings yet
Eng User Manual DRG A226g
164 pages
Reconyx HyperFire - MMM
No ratings yet
Reconyx HyperFire - MMM
32 pages
Process Control Guide for Engineers
No ratings yet
Process Control Guide for Engineers
23 pages
Safwan Updated Resume
No ratings yet
Safwan Updated Resume
1 page
Budget of Work (Bow) in Mathematics
No ratings yet
Budget of Work (Bow) in Mathematics
6 pages
Testing in Python - Unit Test & Script
No ratings yet
Testing in Python - Unit Test & Script
5 pages
Traffic Flow - Wikipedia
No ratings yet
Traffic Flow - Wikipedia
146 pages
Introduction toGIS
No ratings yet
Introduction toGIS
38 pages
Complete Bundle Essentials of Business Communication 8th Edition Guffey
No ratings yet
Complete Bundle Essentials of Business Communication 8th Edition Guffey
405 pages

Al3502 - DLV Unit 2

Uploaded by

Al3502 - DLV Unit 2

Uploaded by

Feedforward Neural Network (FNN) is a type of artificial neural network in which

• Can model complex non-linear relationships

• Needs large data for effective training

2. Calculate the Loss:

3. Compute the Gradient:

4. Update the Parameters:
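
The loop of calculating the loss, computing the gradient, and updating the parameters can be sketched in a few lines. This is only a minimal illustration, assuming a one-parameter model y = w * x trained with mean squared error; the data, the learning rate, and the function name `gradient_descent` are made up for the example and do not come from the notes.

```python
def gradient_descent(xs, ys, lr=0.05, steps=200):
    """Fit y = w * x by minimising mean squared error (illustrative only)."""
    w = 0.0  # assumed starting value for the single parameter
    n = len(xs)
    loss = 0.0
    for _ in range(steps):
        # Calculate the loss: mean squared error of the current predictions.
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        # Compute the gradient of the loss with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Update the parameter by stepping against the gradient.
        w -= lr * grad
    return w, loss


# Toy data generated from y = 2 * x, so w should approach 2.
w, loss = gradient_descent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

With a learning rate that is too large the same loop would overshoot and diverge, which is why the step size is usually tuned or scheduled.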

VISUALIZING GRADIENT DESCENT

Example: Image Classification

Task: Suppose we want to teach a computer to recognize cats in photos.

• We initialize our model with random weights.

Why is Back-Propagation Important?

VANISHING GRADIENT PROBLEM

Step 1: Understanding Neural Networks

Step 3: What Happens During Backpropagation?

Step 4: The Vanishing Gradient Phenomenon

Solutions to the Vanishing Gradient Problem

1. Use of Activation Functions:

Why is Mitigation Important?

Steps to Mitigation in Deep Learning for Vision

1. Identify Potential Risks

• Use statistical methods to check for biases in the model's predictions.

RECTIFIED LINEAR UNIT (RELU)

• If the input x is greater than 0, ReLU outputs x.

• If x is less than or equal to 0, it outputs 0.

• If x > 0, the output remains x.

• If x ≤ 0, the output becomes 0.
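
The rule above can be written as a one-line function. This is a minimal sketch in plain Python; the name `relu` and the sample inputs are chosen only for illustration.

```python
def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return x if x > 0 else 0.0


# Negative inputs and zero map to 0; positive inputs pass through unchanged.
outputs = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5, 3.0]]
```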

Why Use ReLU?

• Simplicity: It is computationally efficient since it only requires a simple thresholding

Introduction to Local Minima

Heuristics to avoid bad local minima

2. Learning Rate Adjustment

HEURISTICS FOR FASTER TRAINING

Steps to Implement Heuristics for Faster Training

2. Learning Rate Scheduling

NESTEROV'S ACCELERATED GRADIENT DESCENT

Understanding Gradient Descent

• Step 1: Start with an initial point (the model's parameters).

• In Nesterov's method, we first make a “lookahead” step by predicting where we will

Step-by-Step Explanation of Nesterov's Accelerated Gradient Descent
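
As a rough illustration of the lookahead idea, the sketch below applies a common textbook formulation of Nesterov momentum to a single parameter. The function name `nesterov_minimize`, the hyperparameter values, and the quadratic test function are assumptions made for the example, not part of the notes.

```python
def nesterov_minimize(grad, w0, lr=0.1, momentum=0.9, steps=200):
    """Minimise a 1-D function with Nesterov momentum, given its gradient."""
    w = w0
    v = 0.0  # velocity: running memory of previous updates
    for _ in range(steps):
        # Lookahead: predict where the current velocity would carry us.
        lookahead = w + momentum * v
        # Evaluate the gradient at the predicted point, not at w itself.
        g = grad(lookahead)
        # Update the velocity, then move the parameter by it.
        v = momentum * v - lr * g
        w = w + v
    return w


# Example: f(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = nesterov_minimize(lambda w: 2 * (w - 3.0), w0=0.0)
```

Evaluating the gradient at the lookahead point is the only difference from classical momentum, which evaluates it at the current parameters.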

Why Do We Need Regularization?

Types of Regularization Techniques

Loss = Original Loss + λ * ∑ |weights|

Concept: L2 regularization adds a penalty equal to the square of the magnitude of

Loss = Original Loss + λ * ∑ (weights^2)
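
The two penalty formulas can be compared on a small example. This is a minimal sketch; the weight values, the original loss of 1.0, and λ = 0.1 are arbitrary numbers picked for illustration, and the function names are hypothetical.

```python
def l1_penalty(weights, lam):
    """L1 term: lambda times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 term: lambda times the sum of squared weight values."""
    return lam * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]  # assumed example weights
original_loss = 1.0         # assumed unregularized loss
lam = 0.1                   # assumed regularization strength

l1_loss = original_loss + l1_penalty(weights, lam)
l2_loss = original_loss + l2_penalty(weights, lam)
```

Because the L2 term squares each weight, the largest weight (2.0 here) dominates its penalty, whereas the L1 term charges every weight in proportion to its magnitude.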

Real-Life Applications of Regularization

2. Natural Language Processing: In tasks like sentiment analysis, regularization can

Step-by-Step Explanation of Adversarial Training

1. Understanding Adversarial Examples:

For example, if a model is trained to recognize cats, an image of a cat can be

3. Training with Adversarial Examples:

4. Evaluating Model Robustness:

Real-Life Applications of Adversarial Training

• Self-Driving Cars: They must recognize road signs and pedestrians

Why is Optimization Important?

Key Concepts in Optimization
