Gradient Descent
Gradient descent is an optimization algorithm that’s used when training a machine learning model. It tweaks the model’s parameters iteratively to minimize a given cost function, driving it toward a local minimum (for a convex function, that local minimum is also the global minimum).
You start by defining the initial parameter values, and from there the gradient descent algorithm uses calculus to iteratively adjust those values so they minimize the given cost function. To understand this concept fully, it’s important to know about gradients.
A gradient simply measures the change in all weights with regard to the change in error. You can also think of a gradient as the slope of a function. The higher the gradient, the steeper the slope and the faster a model can learn. But if the slope is zero, the model stops learning. In mathematical terms, a gradient is the vector of partial derivatives of a function with respect to its inputs.
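To make that concrete, here’s a small Python sketch. The function F(w1, w2) = w1² + w2² and the point (3, 4) are illustrative choices, not something from the article: the gradient is the vector of partial derivatives, and a quick finite-difference check confirms one of them numerically.

```python
# Illustrative example: for F(w1, w2) = w1**2 + w2**2,
# the gradient is the vector of partial derivatives (2*w1, 2*w2).

def F(w1, w2):
    return w1**2 + w2**2

def gradient(w1, w2):
    # (dF/dw1, dF/dw2)
    return (2 * w1, 2 * w2)

# Finite-difference check of dF/dw1 at the point (3, 4)
eps = 1e-6
numeric = (F(3 + eps, 4) - F(3, 4)) / eps

print(gradient(3, 4))  # (6, 8)
print(numeric)         # approximately 6
```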
Imagine a blindfolded man who wants to climb to the top of a hill with the fewest steps possible. He might start climbing the hill by taking really big steps in the steepest direction. But as he comes closer to the top, his steps will get smaller and smaller to avoid overshooting it. Picture our hill from a top-down view, with red arrows marking the steps of our climber. A gradient in this context is a vector that contains the direction of the steepest step the blindfolded man can take and how long that step should be.
How Does Gradient Descent Work?
Instead of climbing up a hill, think of gradient descent as hiking down to the bottom of a valley. The equation below describes what the gradient descent algorithm does:

b = a − γ∇F(a)

Here, b is the next position of our climber, while a represents his current position. The minus sign refers to the minimization part of the gradient descent algorithm. The gamma (γ) in the middle is a weighting factor, the learning rate, and the gradient term ∇F(a) points in the direction of steepest ascent, so subtracting it moves us in the direction of steepest descent.
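As a rough illustration of that update rule, here’s a minimal Python sketch. The one-dimensional cost function F(a) = a², the starting point and the learning rate are all illustrative assumptions; the point is just the repeated step b = a − γ∇F(a).

```python
# A minimal sketch of the update rule b = a - gamma * grad_F(a),
# assuming the illustrative cost function F(a) = a**2.

def grad_F(a):
    # Derivative of F(a) = a**2
    return 2 * a

a = 10.0      # current position of the climber
gamma = 0.1   # learning rate (the weighting factor gamma)

for step in range(50):
    b = a - gamma * grad_F(a)  # next position: move against the gradient
    a = b

print(a)  # approaches the minimum at a = 0
```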
Types of Gradient Descent
Batch Gradient Descent
Batch gradient descent, also called vanilla gradient descent, calculates the error for each example within the training dataset, but the model only gets updated after all training examples have been evaluated. This process is like a cycle, and one full pass over the data is called a training epoch.
An advantage of batch gradient descent is its computational efficiency: it produces a stable error
gradient and a stable convergence. But the stable error gradient can sometimes result in a state of
convergence that isn’t the best the model can achieve. It also requires the entire training dataset to
be in memory and available to the algorithm.
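As a sketch of what a batch update looks like in practice, here’s a minimal Python example for linear regression with squared error. The synthetic dataset, learning rate and epoch count are illustrative assumptions; the key detail is that each update uses the gradient computed over the entire training set.

```python
# Batch gradient descent sketch: one parameter update per epoch,
# using the gradient of the mean squared error over ALL examples.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # 100 examples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(3)
lr = 0.1

for epoch in range(200):
    # Gradient computed over the entire training set
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad                       # one stable update per epoch

print(w)  # close to true_w
```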
Stochastic Gradient Descent
By contrast, stochastic gradient descent (SGD) calculates the error and updates the parameters for each training example in the dataset, one example at a time. Depending on the problem, this can make SGD faster than batch gradient descent. One advantage is that the frequent updates give us a pretty detailed rate of improvement.
The frequent updates, however, are more computationally expensive than the batch gradient
descent approach. Additionally, the frequency of those updates can result in noisy gradients, which
may cause the error rate to jump around instead of slowly decreasing.
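For comparison, here’s the same linear-regression setup updated stochastically, one example at a time. The data and hyperparameters are again illustrative assumptions; note how the gradient in the inner loop comes from a single example, which is what makes the updates frequent but noisy.

```python
# Stochastic gradient descent sketch: one (noisy) update per training example.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(3)
lr = 0.01

for epoch in range(20):
    indices = rng.permutation(len(y))    # shuffle examples each epoch
    for i in indices:
        error = X[i] @ w - y[i]
        grad = 2 * error * X[i]          # gradient from a single example
        w -= lr * grad                   # update immediately

print(w)  # close to true_w, but the path there is noisier than batch descent
```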
Mini-Batch Gradient Descent
Mini-batch gradient descent is the go-to method since it’s a combination of the concepts of SGD and
batch gradient descent. It simply splits the training dataset into small batches and performs an
update for each of those batches. This creates a balance between the robustness of stochastic
gradient descent and the efficiency of batch gradient descent.
Common mini-batch sizes range between 50 and 256, but as with other machine learning techniques, there is no hard rule because the best size varies across applications. Mini-batch gradient descent is the algorithm of choice when training a neural network, and it’s the most common type of gradient descent within deep learning.
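As a sketch, here’s the same linear-regression setup with mini-batches: the data is shuffled each epoch, split into batches of 64 (an illustrative choice within the range mentioned above), and one update is performed per batch.

```python
# Mini-batch gradient descent sketch: shuffle, split into small batches,
# and perform one update per batch.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(3)
lr = 0.05
batch_size = 64

for epoch in range(30):
    indices = rng.permutation(len(y))
    for start in range(0, len(y), batch_size):
        batch = indices[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        # Gradient over one mini-batch: less noisy than SGD,
        # cheaper per update than full-batch descent
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(yb)
        w -= lr * grad

print(w)  # close to true_w
```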