EE782 Advanced Topics in Machine Learning
End-Semester Examination Question Paper and Answer Sheet
November 24, 2023; 08:30 am to 11:30 am
ROLL NO. ________________ NAME: _________________________
Instructions:
Exam is open notes as long as the notes are on paper and not on an electronic device
Collaboration between a student and any other person or the Internet is prohibited
SUBMIT THIS SHEET ONLY. Answer all questions in the space given on this sheet only
Use separate sheet for rough work
Total marks = 27; weight in course 27%
1. Let f(x,y) be 2x + 3y^2 − 2x^2. Find the critical points of this function and characterize them into local
maxima, local minima, furrow (flat in one direction, minima in the other), saddle point, or point of
inflection in a given direction. Show your work. [1.5]
Setting the gradient (2 − 4x, 6y) to zero gives the critical point (0.5, 0) [0.5]
Hessian = [−4, 0; 0, 6] [0.5]
Saddle point because Hessian is diagonal, one eigenvalue is positive, the other is negative. [0.5]
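A minimal sketch (not part of the marking scheme) that checks the critical point and Hessian with sympy; the variable names are illustrative.

```python
# Hedged check of Q1 with sympy: find critical points and classify via the Hessian.
import sympy as sp

x, y = sp.symbols('x y')
f = 2*x + 3*y**2 - 2*x**2

grad = [sp.diff(f, v) for v in (x, y)]        # (2 - 4x, 6y)
crit = sp.solve(grad, (x, y), dict=True)      # [{x: 1/2, y: 0}]
H = sp.hessian(f, (x, y))                     # Matrix([[-4, 0], [0, 6]])
print(crit, H.eigenvals())                    # eigenvalues -4 and 6 -> saddle point
```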
2. Not carefully initializing weights in a deep neural network where every layer has a sigmoid
nonlinearity will lead to (a) no problem, (b) vanishing gradients, (c) exploding gradients, or (d) both
vanishing and exploding gradients? Explain the case for both vanishing and exploding gradients. [2]
For large positive or negative pre-activations, the sigmoid saturates and has near-zero gradient, which leads to
vanishing gradients if the weights are not carefully initialized. [1]
Exploding gradients would require repeated multiplication by large factors, but the sigmoid derivative is at most
0.25, so the backpropagated factors shrink rather than grow and gradients do not explode. [1]
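A small illustrative sketch (assuming PyTorch and an arbitrary 20-layer fully connected stack) showing how gradients shrink through repeated sigmoid layers:

```python
# Hedged sketch: gradients through a stack of sigmoid layers shrink because the
# sigmoid derivative is at most 0.25. Depth, width, and init are illustrative.
import torch

torch.manual_seed(0)
depth, width = 20, 64
x = torch.randn(1, width)
layers = [torch.nn.Linear(width, width) for _ in range(depth)]

h = x
for layer in layers:
    h = torch.sigmoid(layer(h))
h.sum().backward()

# Gradient norm at the first layer is many orders of magnitude smaller than at
# the last layer -> vanishing gradients.
print(layers[0].weight.grad.norm(), layers[-1].weight.grad.norm())
```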
3. Suppose one layer has an output of size C (one-dimensional array) before a nonlinearity is applied.
A second layer is a convolutional layer with ReLU nonlinearity, whose output has dimensions
H×W×C (three-dimensional tensor). The output of the first layer needs to be used as channel-wise
attention weight for the output of the second layer. Suggest a nonlinearity and any additional
operations that need to be applied to the output of the first layer for this purpose. [2]
First, the output of the first layer must be squashed into the 0-1 range, for which we use a softmax nonlinearity. [1]
Second, the dimensions are incompatible, so each of the C values is repeated (broadcast) H×W times to match the
H×W×C tensor. [1]
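A minimal sketch of the two steps, assuming PyTorch tensors with illustrative shapes; broadcasting plays the role of repeating each weight H×W times:

```python
# Hedged sketch of channel-wise attention: squash the C attention values to 0-1,
# then broadcast them over the spatial dimensions of the H x W x C feature map.
import torch

H, W, C = 8, 8, 16
attn_logits = torch.randn(C)                 # output of the first layer, size C
feat = torch.relu(torch.randn(H, W, C))      # output of the conv layer, H x W x C

weights = torch.softmax(attn_logits, dim=0)  # squash to the 0-1 range
attended = feat * weights.view(1, 1, C)      # broadcast over H and W
print(attended.shape)                        # torch.Size([8, 8, 16])
```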
4. For a convolutional layer with output of size H×W×C×B, where B is the batch size and C is the
number of channels, what will be the number of elements that will be averaged for computing one
mean during batch normalization? [1]
H·W·B elements will be averaged. We will get C such averages, one for each neuron/kernel/filter. [1]
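A quick sketch of the same computation, assuming a channels-last tensor in PyTorch with illustrative sizes:

```python
# Hedged sketch: the per-channel mean in batch norm averages over the batch and
# spatial dimensions, i.e. H*W*B elements per channel, giving C means.
import torch

B, H, W, C = 4, 8, 8, 16
x = torch.randn(B, H, W, C)                  # channels-last layout for clarity
per_channel_mean = x.mean(dim=(0, 1, 2))     # average over B, H, W
print(per_channel_mean.shape)                # torch.Size([16]) -> one mean per channel
```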
5. What will happen if we train a GAN that has a discriminator with low capacity (e.g. not enough
learnable parameters)? Will it lead to (a) mode collapse, (b) generation of unrealistic samples, or (c)
a discriminator that easily classifies between real and fake? Justify your answer. [1]
(b) generation of unrealistic samples, because the generator can generate low-quality (obvious) fakes, and
the discriminator will not be able to tell the difference between those and the real images.
6. Which of the following is a good principle for designing a loss function for regression that is
robust to outliers? Justify with an example of a robust loss function and how it treats inliers (non-
outliers) versus outliers. [1.5]
a) The loss function should be convex
b) The loss function should have a constant upper bound
c) The gradient of the loss function should have a constant upper bound
d) The absolute value of the gradient of the loss function should have a constant upper bound
(d) The absolute value of the gradient should be upper-bounded, because then an outlier cannot have more
than a certain max contribution to the overall gradient sum across samples. An example of this is Huber
loss, where samples with errors more than ±δ have a constant absolute gradient.
(b) is also correct if the reasoning is that, with a bounded loss, outliers fall on the flat part of the loss and
therefore get a vanishing gradient.
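A minimal sketch of (d) for the Huber loss gradient, with an illustrative delta of 1:

```python
# Hedged sketch: the Huber loss has a gradient whose absolute value is capped at
# delta, so a single outlier's contribution to the total gradient is bounded.
import numpy as np

def huber_grad(error, delta=1.0):
    # quadratic region: gradient = error; linear region: gradient = +/- delta
    return np.where(np.abs(error) <= delta, error, delta * np.sign(error))

errors = np.array([0.1, 0.5, 2.0, 50.0])     # the last value mimics an outlier
print(huber_grad(errors))                    # -> [0.1 0.5 1.  1. ]
```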
7. Write the formula for generalized cross entropy and explain how it might help deal with
mislabeled samples by drawing an approximate graph and explaining the role of its hyperparameter.
[1.5]
L_q = (1 − y_ij^q)/q, where y_ij is the predicted probability of the correct class. In the limit q→0, this becomes
CE [0.5], and for q=1, this becomes MAE [0.5]. For any q in between, each sample's gradient is the CE gradient
scaled by y_ij^q, which caps the influence of samples whose given label gets a low predicted probability, i.e.,
likely mislabeled samples [1].
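A small numeric sketch of the GCE formula (the function name and the value q=0.7 are illustrative):

```python
# Hedged sketch of generalized cross entropy (1 - p^q) / q, where p is the
# predicted probability of the correct class and q is the hyperparameter.
import numpy as np

def gce(p_correct, q=0.7):
    return (1.0 - p_correct ** q) / q

p = np.array([0.05, 0.5, 0.95])
print(gce(p, q=0.7))   # grows much more slowly than CE as p -> 0
print(-np.log(p))      # cross entropy for comparison (diverges as p -> 0)
```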
8. According to the paper titled “Normalized Loss Functions for Deep Learning with Noisy Labels”
by Ma et al., (a) draw the approximate graphs of cross entropy and (b) normalized cross entropy for
the predicted probability of the correct class [Hint: assume binary classification], and (c) explain the
advantage of normalized cross entropy over cross entropy when dealing with mislabeled samples, as
well as (d) the disadvantage of using only an active loss (e.g. NCE) based on the graphs. [2]
[Graphs: CE and NCE as functions of the predicted probability of the correct class.] [0.5 + 0.5]
NCE has a limited gradient, which limits the impact of mislabeled samples. [0.5]
However, the NCE gradient is nearly 0 when the predicted probability of the correct class is close to 0 (error close
to 1), which discourages learning on correctly labeled but initially misclassified samples. For this, we need a
passive loss as well. [0.5]
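A quick numeric sketch comparing CE with normalized CE; the NCE expression below is the two-class special case of the normalization in Ma et al.:

```python
# Hedged sketch: CE vs. normalized CE for binary classification as a function of
# the predicted probability p of the correct class.
import numpy as np

p = np.linspace(0.01, 0.99, 99)
ce = -np.log(p)
nce = np.log(p) / (np.log(p) + np.log(1.0 - p))   # two-class NCE

# CE grows without bound as p -> 0 (outsized influence of mislabeled samples),
# while NCE stays within [0, 1] but is nearly flat near p = 0, so it learns
# slowly on its own and benefits from an added passive loss.
print(ce[:3], nce[:3])
```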
9. List four different methods of augmenting images for self-supervised learning. [2]
Any four, e.g. rotation, flip, blur, noise addition, distortion, color jitter, grayscale. [0.5x4]
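A typical augmentation pipeline along these lines, assuming torchvision; the particular parameters are illustrative:

```python
# Hedged sketch of an image-augmentation pipeline for self-supervised learning.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224),                # crop / distortion
    transforms.RandomHorizontalFlip(),                # flip
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),       # color jitter
    transforms.RandomGrayscale(p=0.2),                # grayscale
    transforms.GaussianBlur(kernel_size=23),          # blur
    transforms.ToTensor(),
])
```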
10. Give the general architecture of a neural network that is being trained in a self-supervised
manner to restore images of old degraded photographs. Your answer must give an example each of
(a) the dimensions of the input layer, (b) the dimensions and nonlinearity (if any) of the output layer,
(c) the loss function, (d) a plausible architecture, and (e) a method to create the training dataset. [2.5]
(a) HxWx3
(b) HxWx3, with sigmoid (because pixel values lie in the range 0 to 1)
(c) Pixel-wise MSE, MAE, or MS-SSIM
(d) UNet (with skip connections)
(e) Take clean images and simulate degradation to create old-looking photos (see the sketch below)
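A minimal sketch of step (e), assuming synthetic noise, fading, and scratches as the degradations (purely illustrative choices):

```python
# Hedged sketch: create (degraded, clean) training pairs so the network learns
# to restore images in a self-supervised way.
import numpy as np

def degrade(img):                             # img: H x W x 3 float array in [0, 1]
    out = img.copy()
    out += np.random.normal(0, 0.05, img.shape)        # film grain / noise
    out = np.clip(out * 0.8 + 0.1, 0.0, 1.0)           # faded contrast
    out[np.random.randint(out.shape[0]), :, :] = 1.0   # a random scratch line
    return out

clean = np.random.rand(256, 256, 3)           # stand-in for a real clean photo
pair = (degrade(clean), clean)                # (input, target) for pixel-wise MSE/MAE
```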
11. What could be an advantage in few-shot learning of decreasing the relative distance of a query
sample from the prototype of its class as opposed to the support samples of its class? [1]
If one sample in the support set is an outlier, averaging it into the prototype dilutes its impact, so training is
more stable.
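A small sketch of the idea, assuming Euclidean distances and an artificially injected outlier in the support set:

```python
# Hedged sketch: the query is compared to the class prototype (mean of support
# embeddings), so one outlying support sample is averaged down rather than
# dominating the comparison.
import torch

support = torch.randn(5, 64)                 # 5 support embeddings for one class
support[0] += 10.0                           # make one support sample an outlier
query = torch.randn(64)

proto = support.mean(dim=0)                  # prototype = class mean
d_proto = torch.norm(query - proto)          # distance to prototype (outlier diluted)
d_each = torch.norm(query - support, dim=1)  # per-support distances (one term dominated)
print(d_proto, d_each)
```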
12. Suppose that we want to use graph neural networks on Facebook communities to classify them
into those who might respond to a particular ad versus those who will not. Give at least two
examples of vertex attributes and two examples of edge attributes that one can use. [2]
Vertex: age, gender, location etc.
Edge: friend connection, # likes, # comments on each other's posts
13. Write the Laplacian matrix for the following graph where the vertex serial numbers are written
inside the vertex. [1.5]
[Graph figure: 4 vertices numbered 1-4, with edges 1-3, 1-4, and 2-3.]
D = diag(2, 1, 2, 1)
A (adjacency):
0 0 1 1
0 0 1 0
1 1 0 0
1 0 0 0
L = D - A:
 2  0 -1 -1
 0  1 -1  0
-1 -1  2  0
-1  0  0  1
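The same result as a quick numpy sketch (purely a check, not part of the marking scheme):

```python
# Hedged sketch: L = D - A for the 4-vertex graph above (edges 1-3, 1-4, 2-3).
import numpy as np

A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
D = np.diag(A.sum(axis=1))      # degrees: 2, 1, 2, 1
L = D - A
print(L)
```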
14. What is a pseudo-label for a classification problem? [1]
A pseudo-label is a label assigned to an unlabeled sample by the model's own prediction, which is then treated as if
it were a valid ground-truth label for further training.
15. Write the formula for entropy and describe one way to use it for semi-supervised classification.
[1]
−Σ_j y_ij log y_ij, where y_ij is the predicted probability of class j for sample i. [0.5]
For SSL, we can minimize the entropy of the predictions on unlabeled samples together with the CE on labeled samples
(see the sketch below). [0.5]
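A minimal sketch of such a combined loss, assuming PyTorch; the weight lam on the entropy term is an illustrative hyperparameter:

```python
# Hedged sketch: CE on labeled samples plus entropy minimization on unlabeled ones.
import torch
import torch.nn.functional as F

def ssl_loss(logits_labeled, targets, logits_unlabeled, lam=0.1):
    ce = F.cross_entropy(logits_labeled, targets)
    p = F.softmax(logits_unlabeled, dim=1)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1).mean()
    return ce + lam * entropy      # lam weights the unlabeled entropy term
```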
16. Explain how Grad-CAM (gradient-weighted class activation map) works to localize objects when
a CNN is only trained for image classification. [1.5]
It computes the gradient of the class score with respect to the activations of the last convolutional layer, which
indicates which activations matter for that class. These gradients are averaged spatially within each channel to give
one weight per channel. The activation maps are then summed with these channel weights and passed through a ReLU,
giving a heatmap that localizes the object.
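A hedged Grad-CAM sketch, assuming a torchvision ResNet-18 and its last convolutional block as the target layer (illustrative choices, not necessarily the lecture's setup):

```python
# Hedged Grad-CAM sketch: channel weights from spatially averaged gradients,
# weighted sum of activation maps, then ReLU.
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()
acts, grads = {}, {}
layer = model.layer4                                   # last conv block

layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)                        # stand-in image
score = model(x)[0, 281]                               # score of some target class
score.backward()

w = grads['g'].mean(dim=(2, 3), keepdim=True)          # channel weights: spatially averaged grads
cam = torch.relu((w * acts['a']).sum(dim=1))           # weighted sum of activations, then ReLU
print(cam.shape)                                       # 1 x 7 x 7 heatmap to upsample onto the image
```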
17. List and briefly describe the key differences between how dropout is used for regular training
and inference versus how it is used for uncertainty estimation. [1.5]
In regular DO, we turn it off during inference, and scale the weights down. [0.5]
In uncertainty estimation, we keep DO on during inference and do not scale the weights. [0.5]
Then we take the variability of the estimation for various instances of DO during inference. [0.5]
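A minimal MC-dropout sketch in PyTorch; the architecture, dropout rate, and 50 forward passes are illustrative:

```python
# Hedged sketch of MC dropout: keep dropout stochastic at inference and use the
# spread of predictions across forward passes as the uncertainty estimate.
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(
    torch.nn.Linear(10, 64), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(64, 3),
)
model.train()                      # keeps Dropout active (unlike model.eval())

x = torch.randn(1, 10)
with torch.no_grad():
    probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(50)])
mean_pred, uncertainty = probs.mean(dim=0), probs.std(dim=0)
```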
18. Describe two different ways to compare the performance of uncertainty estimation methods for
classification. [1.5]
AUC of outlier versus inlier identification [0.5]
Accuracy computed over the x% least-uncertain samples, plotted against x; better uncertainty estimates give higher
accuracy as the more-uncertain samples are excluded (see the sketch below). [1]
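A sketch of the second evaluation with stand-in arrays (real correctness flags and uncertainty scores would come from a trained model):

```python
# Hedged sketch: accuracy on the fraction of least-uncertain samples, for a few
# retention fractions. The random arrays below are illustrative stand-ins.
import numpy as np

correct = np.random.rand(1000) > 0.2        # 1 = prediction was correct
uncertainty = np.random.rand(1000)          # per-sample uncertainty scores

order = np.argsort(uncertainty)             # least uncertain first
for frac in (0.2, 0.5, 1.0):
    keep = order[: int(frac * len(order))]
    print(frac, correct[keep].mean())       # accuracy on retained samples
```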