1. Why is the choice of activation function crucial in shaping a neural network's performance?

The activation function introduces non-linearity into the model, enabling the neural network to learn complex patterns and relationships in data. Without it, the network behaves like a linear model regardless of depth.

2. Why is Mean Squared Error (MSE) commonly used as a loss function in deep learning?

MSE is widely used because it penalizes larger errors more heavily, encouraging precise predictions. It is differentiable and computationally efficient for regression problems.

3. How does adjusting the learning rate impact the efficiency of the backpropagation algorithm?

A higher learning rate speeds up learning but may overshoot minima, while a lower rate ensures stability but slows convergence. Proper tuning helps achieve efficient and accurate learning.

4. Mention the significance of BAM in neural networks.

Bidirectional Associative Memory (BAM) stores pattern pairs and recalls outputs from given inputs using a bidirectional associative mechanism, which is useful in memory-based neural models.

5. What are the different types of pooling in CNNs, and how do they influence feature extraction?

Common pooling types include max pooling, average pooling, and global pooling. Pooling reduces spatial dimensions, extracts dominant features, and provides translation invariance.

6. Why might increasing the filter size in a CNN lead to a loss of fine image details?

Larger filters cover broader areas, smoothing out high-frequency details such as edges and textures, and thus lose fine-grained information crucial for precise feature detection.

7. In what key ways do RNNs and CNNs differ in processing data?

CNNs process spatial data (e.g., images) by capturing local features using filters, while RNNs handle sequential data (e.g., text or time series) by maintaining temporal dependencies through hidden states.

8. How does the forget gate in an LSTM improve learning efficiency?

The forget gate selectively removes irrelevant information from the cell state, preventing long-term memory clutter and enhancing the network's ability to learn relevant patterns over time.

9. What is the primary role of Generative Adversarial Networks (GANs) in deep learning?

GANs generate realistic synthetic data by training two models (a generator and a discriminator) in a competitive setup; they are widely used for data augmentation, image generation, and more.

10. Why is BERT widely recognized as a breakthrough architecture in NLP-based Neural Network Applications?

BERT uses bidirectional context from transformers, enabling it to understand word meanings based on full sentence context. It significantly improved performance on many NLP tasks such as question answering and sentiment analysis.
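The brief answers to questions 2 and 3 can be made concrete with a minimal NumPy sketch (all values are illustrative, not from the notes): it computes MSE on a toy one-parameter regression and takes a single gradient-descent step with a small and a large learning rate, showing the overshoot described above.

python
import numpy as np

# Toy regression data (illustrative values): predict y from x with a single weight w.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # true relationship: y = 2x

def mse(w):
    # Mean Squared Error: larger errors are penalized quadratically.
    return np.mean((y - w * x) ** 2)

def grad(w):
    # d(MSE)/dw, derived analytically for this one-parameter model.
    return np.mean(-2 * x * (y - w * x))

w0 = 0.0
for lr in (0.01, 0.2):
    w1 = w0 - lr * grad(w0)          # one gradient-descent step
    print(f"lr={lr}: w {w0} -> {w1:.3f}, MSE {mse(w0):.2f} -> {mse(w1):.2f}")

With the small rate the loss decreases slowly but safely; with the large rate the step jumps past the minimum and the loss grows.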
1. Multi-Layer Perceptron (MLP) Model for Laptop Price Classification

Problem Statement:
Classify laptops into three price categories: low, medium, and high, based on features such as processor speed, RAM, brand, screen size, SSD presence, and GPU availability.

Proposed MLP Architecture:

Layer            Details
Input Layer      6 neurons (one for each feature, e.g., RAM, CPU speed)
Hidden Layer 1   64 neurons, ReLU activation
Hidden Layer 2   32 neurons, ReLU activation
Output Layer     3 neurons, Softmax activation (for 3 classes)

python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(64, input_dim=6, activation='relu'),   # Hidden Layer 1
    Dense(32, activation='relu'),                # Hidden Layer 2
    Dense(3, activation='softmax')               # Output Layer (3 price classes)
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Justification:
•   ReLU Activation: Prevents vanishing gradients and allows deep networks to learn complex patterns.
•   Softmax: Ideal for multi-class classification; normalizes outputs to probabilities.
•   Adam Optimizer: Combines momentum and RMSProp for adaptive learning rates and faster convergence.
•   Loss Function: Categorical Crossentropy is standard for multi-class classification.

Benefits of MLP:
•   Learns non-linear boundaries between classes.
•   Works well for structured/tabular data.
•   Can handle categorical features with embedding or one-hot encoding.

2. Activation Functions and Their Properties

Sigmoid Function:
•   Formula: \sigma(x) = \frac{1}{1 + e^{-x}}
•   Input Range: (-∞, ∞)
•   Output Range: (0, 1)
•   Use Case: Binary classification, last-layer activation.
•   Limitation: Causes vanishing gradients when inputs are very large or very small.

ReLU (Rectified Linear Unit):
•   Formula: f(x) = \max(0, x)
•   Input Range: (-∞, ∞)
•   Output Range: [0, ∞)
•   Use Case: Hidden layers; introduces sparsity.
•   Limitation: Dying ReLU (neurons stuck at zero when inputs < 0).

Softmax Function:
•   Formula: \text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}
•   Input Range: Vector of real numbers
•   Output Range: Probabilities summing to 1
•   Use Case: Multi-class classification output layer.
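A minimal NumPy sketch of the three activations above (my own illustrative implementation, not from the notes), showing their output ranges on a small input vector:

python
import numpy as np

def sigmoid(x):
    # Squashes any real input into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Zeroes out negative inputs, passes positives through: range [0, inf).
    return np.maximum(0.0, x)

def softmax(x):
    # Subtracting the max improves numerical stability; outputs sum to 1.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(sigmoid(x))        # values strictly between 0 and 1
print(relu(x))           # [0. 0. 0. 1. 3.]
print(softmax(x))        # non-negative, sums to 1.0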
3. Forward vs Backward Propagation

Forward Propagation:
•   Pass input through the layers.
•   Compute outputs via weighted sums and activation functions.
•   The loss function compares the predicted output with the actual label.

Backward Propagation:
•   Uses the chain rule of calculus.
•   Calculates gradients of the loss with respect to the weights.
•   Updates weights using gradient descent.

Why Both Are Needed:
•   Forward propagation gives the prediction and computes the error.
•   Backward propagation corrects the model by minimizing this error.
•   One without the other results in either no learning or no objective.
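The interaction of the two passes can be seen in a minimal NumPy sketch (a single sigmoid neuron with MSE loss; all values are illustrative assumptions):

python
import numpy as np

x = np.array([0.5, -1.0, 2.0])       # one input sample with 3 features
t = 1.0                              # target label
w = np.array([0.1, 0.2, -0.1])       # weights
b = 0.0                              # bias
lr = 0.5                             # learning rate

# Forward propagation: weighted sum -> activation -> loss.
z = w @ x + b
y = 1.0 / (1.0 + np.exp(-z))         # sigmoid activation
loss = 0.5 * (y - t) ** 2
print("prediction:", y, "loss:", loss)

# Backward propagation: chain rule dL/dw = dL/dy * dy/dz * dz/dw.
dL_dy = (y - t)
dy_dz = y * (1.0 - y)                # derivative of the sigmoid
grad_w = dL_dy * dy_dz * x
grad_b = dL_dy * dy_dz

# Gradient-descent update.
w -= lr * grad_w
b -= lr * grad_b
print("updated weights:", w)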
4. Optimizers in Neural Networks

Role of Optimizers:
•   Minimize the loss function.
•   Control the learning dynamics of the network.

A. Stochastic Gradient Descent (SGD)
•   Basic optimizer.
•   May get stuck in local minima.

B. Momentum
•   Adds inertia to SGD.
•   Helps escape shallow minima.

C. RMSProp
•   Scales the learning rate adaptively using a moving average of squared gradients.
•   Suitable for RNNs.

D. Adam
•   Combines Momentum + RMSProp.
•   Popular for all neural networks.

Comparison Table:

Optimizer   Adaptive LR   Momentum   Use Case
SGD         No            No         Simple problems
RMSProp     Yes           No         RNNs
Adam        Yes           Yes        Most deep models
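A side-by-side NumPy sketch of one update step for each optimizer above (single parameter, commonly used default hyperparameters assumed for illustration; not from the notes):

python
import numpy as np

# One update step of each optimizer on a single parameter w with gradient g.
w, g = 1.0, 0.4
lr = 0.01

# SGD: plain gradient step.
w_sgd = w - lr * g

# Momentum: a velocity term accumulates past gradients (v starts at 0 here).
beta, v = 0.9, 0.0
v = beta * v + g
w_mom = w - lr * v

# RMSProp: divide by a moving average of squared gradients (s starts at 0).
rho, eps, s = 0.9, 1e-8, 0.0
s = rho * s + (1 - rho) * g ** 2
w_rms = w - lr * g / (np.sqrt(s) + eps)

# Adam: momentum + RMSProp, with bias correction on the first step (t = 1).
beta1, beta2, m, u, t = 0.9, 0.999, 0.0, 0.0, 1
m = beta1 * m + (1 - beta1) * g
u = beta2 * u + (1 - beta2) * g ** 2
m_hat = m / (1 - beta1 ** t)
u_hat = u / (1 - beta2 ** t)
w_adam = w - lr * m_hat / (np.sqrt(u_hat) + eps)

print(w_sgd, w_mom, w_rms, w_adam)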
Detailed answers follow for these deep learning topics:
•   5. Types of Pooling in CNNs
•   6. Why increasing filter size can cause a loss of fine details
•   7. Difference between RNNs and CNNs in data processing
•   8. Forget gate in LSTM and its impact on learning
•   9. Role of GANs in deep learning
•   10. BERT as a breakthrough in NLP

5. What are the different types of pooling in CNNs, and how do they influence feature extraction?

Pooling is a downsampling technique used in CNNs to reduce the spatial dimensions (width and height) of feature maps, controlling overfitting and computational load.

Purpose of Pooling:
•   Reduce dimensionality and computation.
•   Retain essential features while discarding noise.
•   Introduce translation invariance to the model.

Types of Pooling:

A. Max Pooling
•   Mechanism: Selects the maximum value from each patch of the feature map.
•   Effect: Retains the strongest activation; great for edge and texture detection.
•   Formula: y_{i,j} = \max(x_{m,n}) \text{ for } m, n \in \text{patch around } (i, j)

B. Average Pooling
•   Mechanism: Averages the values within the patch.
•   Effect: Smooths the feature map and preserves background information better.
•   Use Case: Used in early CNNs like LeNet; modern networks use it for global pooling.

C. Global Pooling
•   Global max or average pooling over the entire feature map.
•   Used before the output layer to reduce each feature map to a single number.
•   Prevents overfitting; fewer parameters than fully connected layers.

D. Lp Pooling
•   Generalized pooling where y = \left( \sum x^p \right)^{1/p}
•   Special cases: p = 1 → average pooling; p = ∞ → max pooling.

Influence on Feature Extraction:

Pooling Type     Characteristics         Best Use
Max Pooling      Keeps strong features   Edges/textures
Avg Pooling      Smooth representation   Backgrounds
Global Pooling   Removes spatial info    Classification, efficient CNNs
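A short NumPy sketch of max, average, and global pooling on a toy 4x4 feature map (the values are illustrative, not from the notes):

python
import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [5, 6, 1, 2],
                 [0, 1, 9, 4],
                 [2, 1, 3, 8]], dtype=float)

def pool(x, size=2, op=np.max):
    # Non-overlapping pooling with a size x size window and stride equal to size.
    h, w = x.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h, size):
        for j in range(0, w, size):
            out[i // size, j // size] = op(x[i:i + size, j:j + size])
    return out

print(pool(fmap, op=np.max))    # max pooling keeps the strongest activation per patch
print(pool(fmap, op=np.mean))   # average pooling smooths each patch
print(fmap.max(), fmap.mean())  # global max / average pooling: one number per feature map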
6. Why might increasing the filter size in a CNN lead to a loss of fine image details?

Role of Filter Size in Feature Detection:
•   Filters (kernels) are responsible for extracting features such as edges, corners, and textures.
•   Common filter sizes: 3x3, 5x5, 7x7.

Effect of Larger Filters:
•   Cover larger spatial areas in one pass.
•   Capture broader features but lose local granularity.

Fine Details Missed:
•   Small patterns such as dots, thin edges, and noise.
•   These details often lie in small pixel differences; larger filters average them out.

Comparative View:

Filter Size   Captures                     Misses
3x3           Fine textures, sharp edges   High-level patterns
7x7           Abstract shapes, context     Fine image textures

Best Practice:
•   Stack smaller filters (e.g., two 3x3s ≈ one 5x5) to keep the receptive field while preserving detail.
•   Stacking also adds more non-linearity and better feature richness.
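The averaging-out effect can be seen with a tiny NumPy sketch (an illustrative 9x9 image with one bright pixel, not from the notes): the larger the averaging filter, the weaker its response to the fine detail.

python
import numpy as np

# A single bright pixel in an otherwise flat image plays the role of a "fine detail".
img = np.zeros((9, 9))
img[4, 4] = 1.0

def box_filter_response(image, k):
    # Response of a k x k averaging (box) filter centred on the bright pixel.
    half = k // 2
    patch = image[4 - half:4 + half + 1, 4 - half:4 + half + 1]
    return patch.mean()

for k in (3, 5, 7):
    print(f"{k}x{k} average filter response: {box_filter_response(img, k):.3f}")
# 3x3 -> 0.111, 5x5 -> 0.040, 7x7 -> 0.020: the larger the filter, the more the detail is washed out.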
7. How do RNNs and CNNs differ in processing data?

Core Idea:
•   CNNs process data spatially.
•   RNNs process data sequentially (temporally).

CNNs: Convolutional Neural Networks
•   Input: Fixed-size 2D grids (e.g., images).
•   Learn spatial hierarchies (edges → textures → shapes).
•   Weight sharing across spatial dimensions.

Example:
•   Image classification, object detection, medical scans.

RNNs: Recurrent Neural Networks
•   Input: Sequences (e.g., time series, text).
•   Maintain a hidden state/memory across time steps.
•   Handle variable-length input.

Example:
•   Text generation, speech recognition, stock prediction.

Comparative Table:

Feature             CNN                  RNN
Input               2D (images)          1D sequences
Order sensitivity   Low                  High
Memory              No internal memory   Maintains hidden state
Use Case            Vision               NLP, time series
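The contrast also shows up in how the two model types are declared. A minimal Keras sketch in the same style as the MLP example earlier (layer sizes and input shapes are illustrative assumptions):

python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, SimpleRNN

# CNN: fixed-size 2D input (e.g., a 28x28 grayscale image); filters slide over space.
cnn = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(10, activation='softmax'),
])

# RNN: sequence input (e.g., 50 time steps of 8 features); the hidden state carries memory.
rnn = Sequential([
    SimpleRNN(32, input_shape=(50, 8)),
    Dense(1),
])

cnn.summary()
rnn.summary()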
8. How does the forget gate in an LSTM improve learning efficiency?

Problem in Standard RNNs:
•   Struggle to retain long-term dependencies.
•   Gradients either vanish or explode.

LSTM Architecture Overview:
An LSTM has three gates:
1.  Forget Gate f_t
2.  Input Gate i_t
3.  Output Gate o_t

Forget Gate Mechanics:
•   Formula: f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)
•   f_t ∈ [0, 1]: determines how much of the previous memory to keep.
•   If f_t → 1: keep the memory.
•   If f_t → 0: forget it completely.

Impact on Learning:
•   Helps avoid memorizing irrelevant patterns.
•   Enables selective memory retention.
•   Better gradient flow → improved long-term learning.

Example:
In the sentence "The weather in Paris is cold but not snowy.", to predict "snowy" the LSTM might forget irrelevant words like "Paris" but retain "cold".
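A NumPy sketch of the forget-gate computation above (dimensions and random weights are illustrative assumptions; the input and output gates are omitted for brevity):

python
import numpy as np

hidden_size, input_size = 4, 3
rng = np.random.default_rng(0)

W_f = rng.normal(size=(hidden_size, hidden_size + input_size))  # forget-gate weights
b_f = np.zeros(hidden_size)                                     # forget-gate bias

h_prev = rng.normal(size=hidden_size)   # previous hidden state h_{t-1}
x_t = rng.normal(size=input_size)       # current input x_t
c_prev = rng.normal(size=hidden_size)   # previous cell state (long-term memory)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# f_t = sigmoid(W_f . [h_{t-1}, x_t] + b_f), with every entry in (0, 1).
f_t = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)
c_partial = f_t * c_prev   # entries of c_prev with f_t near 0 are effectively erased

print("forget gate:", np.round(f_t, 2))
print("cell state after forgetting:", np.round(c_partial, 2))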
9. What is the primary role of Generative Adversarial Networks (GANs) in deep learning?

Goal:
GANs aim to generate realistic data by learning from an existing dataset.

GAN Architecture:
•   Generator (G): Creates fake data from noise.
•   Discriminator (D): Classifies real vs. fake data.

They play a minimax game:
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

How It Works:
•   The Generator improves to fool the Discriminator.
•   The Discriminator improves to detect fakes.
•   Both get better over time, resulting in realistic data generation.

Applications of GANs:
•   Face generation (ThisPersonDoesNotExist).
•   Super-resolution images.
•   Art and music synthesis.
•   Data augmentation in low-data scenarios.
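The minimax value V(D, G) above can be evaluated directly for toy discriminator outputs; a small NumPy sketch (the probabilities are illustrative, not from the notes):

python
import numpy as np

d_real = np.array([0.9, 0.8, 0.95])   # D(x): discriminator's score on real samples
d_fake = np.array([0.1, 0.2, 0.05])   # D(G(z)): discriminator's score on generated samples

# V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
value = np.mean(np.log(d_real)) + np.mean(np.log(1 - d_fake))
print("V(D, G):", round(value, 3))

# The discriminator is trained to maximize V (score real samples near 1, fakes near 0).
# The generator is trained to minimize V, i.e., to push D(G(z)) towards 1 so that its
# fakes are mistaken for real data.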
10. Why is BERT widely recognized as a breakthrough in NLP-based Neural Network Applications?

BERT = Bidirectional Encoder Representations from Transformers

Why It's Revolutionary:

A. Bidirectional Context
•   Unlike older models (e.g., GPT), BERT reads left and right context together.
•   Deeply understands contextual meaning.

B. Pre-training + Fine-tuning Paradigm
•   Pre-trained on a large corpus (Wikipedia + books).
•   Fine-tuned on tasks such as sentiment analysis, question answering, and NER.

Key Components:
•   Based on the Transformer encoder (not the decoder).
•   Input: WordPiece embeddings + positional encodings.
•   Uses Masked Language Modeling (MLM) to learn context.
•   Uses Next Sentence Prediction (NSP) for
    understanding relationships.
Applications:
•   Sentiment classification
•   Named Entity Recognition (NER)
•   Question answering (like SQuAD)
•   Document search and ranking (Google
    Search uses it!)
Real-world Impact:
•   Reduced task-specific architecture
    engineering.
•   State-of-the-art results on 11 NLP tasks.
•   Foundation of models like RoBERTa,
    DistilBERT, ALBERT.
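Masked Language Modeling can be tried directly with the Hugging Face transformers library, assuming it is installed; this snippet is an illustration added here, not part of the original notes. It reuses the sentence from the LSTM example above with one word masked.

python
from transformers import pipeline

# Load a pre-trained BERT model for the fill-mask task (downloads weights on first run).
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# BERT uses both the left and right context to predict the masked word.
for prediction in unmasker("The weather in Paris is [MASK] but not snowy."):
    print(prediction["token_str"], round(prediction["score"], 3))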