0% found this document useful (0 votes)

213 views10 pages

Practice Final sp22

The document is instructions for a machine learning final exam consisting of 3 sections: Section A contains short 1-2 point questions intended to take 1-2 minutes each, totaling 18 points over 11 questions in about 20 minutes. Section B contains longer 3-4 point questions intended to take 2-4 minutes each, totaling 29 points over 9 questions in about 35 minutes. Section C contains even longer 5-7 point questions intended to take 4-8 minutes each, totaling 33 points over 6 questions in about 45 minutes. Students have 120 minutes total to complete the exam and must scan and upload their answers within 15 minutes after finishing.

Uploaded by

Ajue Ramli

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

213 views10 pages

Practice Final sp22

Uploaded by

Ajue Ramli

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 10

1

Applied Machine Learning Final Exam (CIS 419/519)

Instructions
Write your answers on paper with a pen. Write your name, and your section number (CIS 419 or CIS 519) prominently
on the first page of your answers at the top left.

No devices except for a calculator. You may use one handwritten sheet of paper as a cheatsheet, with both sides.’’'

The exam contains 3 sections, roughly arranged by question type:

● Section A has 1 or 2-point questions that are each intended to take 1-2 mins to complete (total: 18 pts, 11 questions,
~20 minutes).
● Section B contains 3,4-point questions that are each intended to take 2-4 mins to complete. (total: 29 pts, 9
questions, ~35 minutes)
● Section C contains 5, 6, and 7-point questions that are intended to take between 4 and 8 mins to complete. As a
thumb rule, an x-pt question should take not much longer than x mins. (total: 33 pts, 6 questions, ~45 minutes)

The exam is out of 80 points, and you have 120 minutes. Manage your time well! These expected time suggestions
are intended to help: if you find yourself sinking a lot more time into a problem, it may be a better strategy to stop and return
later to take a fresh look.

At the end of 2 hours, you will put down your pens, and proceed to submission.

Submission Process:

Step 1: Scan & Upload Within 15 mins: After the exam, you will have 15 minutes of access to your smartphone for
scanning and uploading your solutions to gradescope. Make sure to use a pdf scanning app (for example, the Google
Drive app has a feature) to scan your answers and upload to Gradescope during these 15 minutes. Make sure your
smartphone is charged and ready!

Step 2: Select Pages On Your Own Time: Then, after the submission window closes, you will be able to annotate
pages for each question on your own time. Please do use a pdf scanner app, and do not upload image files directly:
we have noticed that for some reason, Gradescope doesn't allow you to annotate your responses after the submission
window if you upload image files. Complete this before 9 p.m. today, May 9.

For in-person exam takers, we will collect your answer sheets as a backup in case scanning went wrong.

All the best!

Section A
1 or 2-point questions, that are each intended to take 1-2 mins to complete (total: 18 pts, 11 questions, ~20 mins)
No need to show your work.

1. [1 pt] Suppose we have a regularized linear regression model: . What is the effect of
increasing λ on bias and variance? Select the best option.
A. Increases bias, increases variance
B. Increases bias, decreases variance
C. Decreases bias, increases variance
D. Decreases bias, decreases variance

2. [1 pt] The TAs of the course want to develop a neural network which takes as input student information, such as
autograder scores for each homework, and frequency of class attendance and predict student performance on this final.
Which of the following functions is an appropriate loss function for this problem? (Select all that are appropriate)
● Binary Cross Entropy
● Negative Log Likelihood
● Mean Squared Error
● KL divergence.

3. [1 pt] For an MDP (S, A, T, , R) if we only change the discount factor , the optimal policy is guaranteed to remain the
same. True/ False?

4. [1 pt] Suppose you are designing a recommendation system for the Penn library book collection. The system will be used
to recommend similar books when people checkout book rentals. The library has more than 1 million books, but has only
collected <10,000 user reviews + checkout history in total so far. Which one of the following methods would best suit the
use case?
A. Nearest Neighbor Collaborative Filtering
B. Content-based Recommendation
C. Matrix Factorization
D. Popularity-based Recommendation

5. [2 pt] When training RNNs, making sure gradients do not get too small (vanishing gradients) or get too big (exploding
gradients) is crucial to successful learning. Select all true statements about controlling gradient magnitudes:

(1) L2 Regularization helps reduce the vanishing gradient problem.

(2) LSTM cells can reduce vanishing gradients.
(3) Adding more hidden layers helps solve the vanishing gradient problem.
(4) Clipping the gradients helps solve the exploding gradients problem.

6. [2 pts] Consider the following encoder-decoder style model over

characters. The goal of the model is read in a list of characters (i.e.
A,B,C), and then output the same sequence in exactly that order (i.e.
Copy the input sequence as output). In the examples on the right, the
input sequence is consumed by an RNN, and the hidden states of the
RNN are shown above the states. Assume every decoder hidden state,
in light gray, is used to compute dot-product attention with the encoder
hidden states, in dark gray.In the first example on the right, at the first
hidden state of the decoder (state 4), order the encoder states, 1,2,3 by
which receives the most attention from state 4, largest first.
3

7. [2 pts] Susan is not satisfied with the performance of her deep neural network model. She plans to build an ensemble with
K members and average their predictions to improve performance. Which of these approaches could she use to construct
this ensemble? Select all that apply.
1. Training K deep neural networks with the same architecture and the same initialization.
2. Training K deep neural networks with different architectures.
3. Training K deep neural networks on different subsets of the data.
4. Training K deep neural networks with the same architecture and different random initializations.
5. None of the above

8. [2 pt] Consider the following two PCA-based dimensionality reduction methods.

Method 1: PCA is applied to project points from d-dimensional space to i principle coordinates. PCA is applied once again
to these coordinates from i-dimensional space to j principle coordinates. (d > i > j).
Method 2: PCA is applied to project points from d-dimensional space directly to j principle coordinates.

Do method 1 and method 2 produce the same results? State Yes / No.

9. [2 pt] You are training an ML model. In which of the following scenarios would increasing the bias (as in “bias-variance
tradeoff”) of the ML algorithm help? Select all that apply.
A. The training dataset is small.
B. You observe high error on both the training set and the validation set.
C. Your data cannot be linearly separated.
D. You observe low error on the training set and high error on the validation set.

10. [2 pt] Which of the classifiers A and B shown in the figure (on the right) would you use in an
automatic braking system of a car to detect collisions (assume that collisions are the positive
class)? Note: sensitivity is also called the "true positive rate", and specificity is also called the
"true negative rate".

11. [2 pt] You are given that cloud cover is more predictive of the chances of rain than the time of
day. Consider the following two hypothesis classes for predicting rain:
A. Decision trees with cloud cover attribute at the root
B. Decision trees with time of day attribute at the root
Class A has greater expressive power than class B because it is using a more predictive
attribute at the root node. True or False?

Section B
3,4 -point questions, that are each intended to take 3-5 mins to complete (total: 29 pts, 9 questions, ~35 mins)

12. [3 pt] Suppose we are trying to build a linear regression classifier to solve the XOR problem, whose
inputs and , and corresponding outputs y are shown in the figure. You are aware that you cannot
solve the problem when only using and as features, so you decide to use the following three
features instead.
●
●
●
Would a linear regression classifier using only the above three features solve the problem? Justify your
answer.
4
13. [3 pt] Jeff is training a deep neural network for classification using a three way split of data that he calls, D1, D2, and D3.
He is considering using the datasets to train the weights of his network, tune the architecture, and then finally report
performance as measured by accuracy on the D3 data split. He trains the network parameters on dataset D1. He is
unsure which dataset to use to tune hyperparameters, so he tries tuning them on D1 (scenario A) , D2 (scenario B), and
D3 (scenario C). Can you predict the ascending order of reported accuracies for scenarios A, B, and C?

14. [3 pt] Consider one possible extension of k-nearest neighbors adapted to

regression. In this setting, when given a test point, we compute its
neighborhood using euclidean distance, and then assign it the mean value in
the neighborhood.

Consider the 1-dimensional example shown in the figure, where the x axis is the
input, and y axis is the output. The black solid points are training points.
14.1. [1 pt] If we use the 2-nearest neighbor version of this method, given a
test point at the red value (x = 4), which of the hollow red points
A,B,C,D would be the most likely prediction?
14.2. [1 pt] What about if we used the 4-nearest neighbor version?
14.3. [1 pt] In a higher dimensional setting with , would you prefer
to use this method or linear regression? Justify your choice with at
least one concrete reason

15. [3 pt] Suppose we use a logistic regression model for a three-class classification problem having two features. Let the
features be represented as (includes the bias term). The trained parameters for the classes to be ,
, . Write a set of linear equations ax+by=c that define the lines that make up the decision
boundary for the trained model.

16. [3 pts] Consider the architecture for deep averaging networks

(DAN) that you implemented for homework for classification
(right)
(1) [1 pt] (True/False) Are DANs invariant to the order of
tokens in the input sentence? In other words, if we swap
the position of two words in an input sentence to the
model, is it always true that the model will output the
same results?
(2) [2 pt] Consider a slight variation to the DAN architecture.
Instead of taking average right after the word embedding
layer, we concatenate together the word embeddings
and put them through a linear feed-forward layer that
produces an embedding of equal size as a single word
embedding. Is this model invariant to the order of the input sentence? Briefly justify your answer.

17. [3 pt] A convolutional layer in a neural network is composed of three 2x2 filters (with biases) and is applied to an input
volume of shape (10, 10, 3) with no image padding and a stride of 2.
17.1. [1.5 pt] What is the shape of the output volume from the layer?
17.2. [1.5 pt] How many learnable parameters are in the layer?
5
18. [3 pts] Consider the k-means clustering problem shown in the
figure below, with k=3, and six data points A through F. We use
the distance metric. A clustering is called stable if one more
iteration of the k-means algorithm leaves the 3 clusters
unchanged. Which of the following clusterings are stable?
(1) {A, B, E}, {C, D}, {F}
(2) {A, B}, {D, E}, {C, F}
(3) {A, D}, {B, E}, {C, F}
(4) {A}, {D}, {B, C, E, F}
(5) {A, B}, {D}, {C, E, F}

19. [4 pt] Consider a linear regression model in one dimension with no bias terms, y=w*x, where y and x are scalars. The
weight w is initialized to 0. You are given a dataset with a single point, (x = 1, y =1) and your loss function is mean
squared error. Compute the value of the parameters for two steps of gradient descent with momentum. Show your work.
Recall the form of gradient descent with momentum:

We will set and .

Section C
5,7 -point questions, that are each intended to take 5-10 mins to complete (total: 33 pts, 6 questions, ~45 mins)

20. [5 pt] Given a dataset of two dimensional points (right), each belonging to one of two classes, Judy wants to build a
neural network to classify a new point. She will directly feed point coordinates into the network. She found an architecture
online, written in pytorch, but some parts are missing. Here is the code she found:

20.1. [1 pt] Fill in the missing blanks in the __init__ method so that the forward executes correctly.
6
20.2. [4 pt] If she trains this neural network correctly and with sufficient data (from the distribution shown above, which
of the following classification boundaries might she possibly observe? Choose all that apply, if any, and explain
your reasoning.

21. [5 pt] Consider a system for radiation therapy planning. Given a patient with a malignant tumor, the problem is to select
the optimal radiation exposure time for that patient. A key element in this problem is estimating the probability that a given
tumor will be eradicated given certain features. A data scientist collects the following information relating to this radiation
therapy system. X1 denotes time in milliseconds that a patient is irradiated with, X2 = holds the size of the tumor in
centimeters, and Y notates a binary response variable indicating if the tumor was eradicated. The data scientist fits a
logistic regression model to the dataset and obtains the following parameter
estimates. , , .
21.1. [3 pt] Estimate the probability that, given a patient who undergoes the treatment for 40 milliseconds and who is
presented with a tumor sized 3.5 centimeters, the system eradicates the tumor.
21.2. [2 pt] For how many milliseconds would the patient in part (a) need to be radiated to have exactly a 50% chance
of eradicating the tumor?

22. [6 pt] Consider the network shown below, being used for regression. At initialization, , and . Suppose our
training dataset is a single sample,
22.1. [4 pts] Compute the ratio of weights and after one step of gradient descent.
22.2. [2 pt] What are those same ratios at convergence? Show your work in detail.

23. [5 pt] Consider a deterministic grid world shown in the figure with an “absorbing”
state G: any action performed at this state leads back to the same state. The
immediate rewards are 10 for the labeled transitions and 0 for the unlabelled
transitions. The discount factor is
23.1. [1pt] Show the optimal policy by drawing arrows corresponding to
optimal actions for each cell in the grid.
23.2. [1 pt] Compute the optimal V-value function for the top left state
(column 1, row 2) in this grid world.
23.3. [3 pt] Now, consider applying the Q-learning algorithm to this grid world. Assuming the table of Q-values is
initialized to zero. Assume the agent begins in the bottom left grid square and then travels clockwise along the
7
perimeter of the grid until it reaches the absorbing goal state, completing the first training episode. Describe
which Q-values are modified as a result of this episode, and give their revised values.

24. [5 pt] Consider this gridworld MDP with 2 rows and 3 columns, which
operates like the one we saw in class. The states are grid squares,
identified by their row and column number (row first, indexed bottom to
top and column second, indexed left to right). The agent always starts
in state (1,1), marked with the letter S. As marked in the figure, there
are two terminal goal states: (2,3) with reward +5, and (1,3) with
reward -5. Rewards are 0 in non-terminal states. The reward for a state
is received as the agent moves into the state. The transition function is
such that the intended agent movement (North, South, West, or East)
happens with probability .8. With probability .1 each, the agent moves in one of the directions perpendicular to the
intended direction. If a collision with a wall happens, the agent stays in the same state
24.1. [1 pt] Draw the optimal policy for this grid?
24.2. [4 pt] Suppose the agent does not know the transition probabilities in this MDP. It starts with a policy that always
chooses to go right, and executes three trials running through the environment, recorded below as the sequence
of states traversed in each trial:
1) States (1,1) (1,2) (1,3),
2) States (1,1) (1,2) (2,2) (2,3), and
3) States (1,1) (2,1) (2,2) (2,3).
Given only these three trials, estimate the policy value function for this “always right” policy, at states (1,1) and
(2,2).

25. [7 pt] Consider a neural network that maps from an input vector of length 2 to an output vector of length 2. Consider the
following architecture, containing standard linear fully connected layers mapping from x to , to , and a nonlinear
activation function mapping from to , where .
For simplicity in the following, consider a training dataset with a single data point, where the input vector is x=[0.5, 0.7]
and the target or label vector is y=[0.7, 0.5].

25.1. [1 pt] Draw a plot of the activation function

25.2. [2 pt] Suppose that the output of the above network is set to , and the loss function is the squared l2 norm
distance . Report the gradients of the loss with respect to the weights in the first layer (from x to
a1) . Hint: Carefully observing the shape of your plot in part 1 might simplify your calculations.
25.3. [4 pt] Suppose that the output of the above network is instead set to . Recall that this
type of “skip connection” is a key building block of residual network architectures. Using the same squared l2
norm objective function, recompute the gradients of the loss with respect to the weights in the first layer (from x
to ).
8
—-------------------------- END OF EXAM —--------------------------
9
<left empty for scratch space>
10

Answers For End-Sem Exam Part - 2 (Deep Learning)
No ratings yet
Answers For End-Sem Exam Part - 2 (Deep Learning)
20 pages
Neural Networks & SVMs in AI
No ratings yet
Neural Networks & SVMs in AI
19 pages
ML Unit-2
No ratings yet
ML Unit-2
26 pages
Deep Learning Exp
100% (1)
Deep Learning Exp
25 pages
Machine Learning Strategies
No ratings yet
Machine Learning Strategies
59 pages
Deep Learning Exam With Answers
No ratings yet
Deep Learning Exam With Answers
4 pages
Machine Learning Full Question Bank
No ratings yet
Machine Learning Full Question Bank
14 pages
R22 ML Question Bank For It and CSM
No ratings yet
R22 ML Question Bank For It and CSM
4 pages
Rajesh (DL Unit1) 04dec2024
No ratings yet
Rajesh (DL Unit1) 04dec2024
125 pages
Machine Learning Theory Essentials
No ratings yet
Machine Learning Theory Essentials
9 pages
Deep Learning Exam Solutions
No ratings yet
Deep Learning Exam Solutions
14 pages
02 ML Supervised Learning
No ratings yet
02 ML Supervised Learning
32 pages
Advanced RNN Techniques Explained
No ratings yet
Advanced RNN Techniques Explained
15 pages
Machine Learning Lab Manual Final
No ratings yet
Machine Learning Lab Manual Final
65 pages
Deep Learning r18 Jntuh Lab Manual
No ratings yet
Deep Learning r18 Jntuh Lab Manual
20 pages
CS230 Midterm Solutions Fall 2022
No ratings yet
CS230 Midterm Solutions Fall 2022
20 pages
Machine Learning Question Bank-Unit 3
No ratings yet
Machine Learning Question Bank-Unit 3
6 pages
ML UNIT-2 Notes
No ratings yet
ML UNIT-2 Notes
15 pages
Deep Learning - Unit-III Two Marks
100% (2)
Deep Learning - Unit-III Two Marks
3 pages
LP I ML Viva Questions
100% (1)
LP I ML Viva Questions
9 pages
Machine Learning Lab Manual
No ratings yet
Machine Learning Lab Manual
21 pages
Machine Learning Midterm
No ratings yet
Machine Learning Midterm
18 pages
Question Bank Module-1: Department of Computer Applications 18mca53 - Machine Learning
No ratings yet
Question Bank Module-1: Department of Computer Applications 18mca53 - Machine Learning
7 pages
Intro to k-Nearest Neighbor Algorithm
No ratings yet
Intro to k-Nearest Neighbor Algorithm
3 pages
Lecture 2.1.2activation Function
No ratings yet
Lecture 2.1.2activation Function
15 pages
Deep Learning - Question Bank
No ratings yet
Deep Learning - Question Bank
6 pages
ML Viva Questions
100% (1)
ML Viva Questions
4 pages
Workshop ML & DL
No ratings yet
Workshop ML & DL
1 page
Autoencoders: Types and Applications
No ratings yet
Autoencoders: Types and Applications
91 pages
Dimensionality Reduction Lecture Slide
No ratings yet
Dimensionality Reduction Lecture Slide
27 pages
Machine Learning Basics for Students
100% (1)
Machine Learning Basics for Students
21 pages
Machine Learning Lab Manual 06
100% (1)
Machine Learning Lab Manual 06
8 pages
Unit - 3-NNDL - Notes
No ratings yet
Unit - 3-NNDL - Notes
17 pages
Gaussian Mixture Models Unit-III
No ratings yet
Gaussian Mixture Models Unit-III
13 pages
Tensor Flow
No ratings yet
Tensor Flow
12 pages
ML Unit-1
No ratings yet
ML Unit-1
43 pages
Soft Computing Assignment
100% (1)
Soft Computing Assignment
13 pages
DEEP LEARNING Import Questions For External Exam
No ratings yet
DEEP LEARNING Import Questions For External Exam
1 page
Lab I TENSOR FLOW AND KERAS
No ratings yet
Lab I TENSOR FLOW AND KERAS
3 pages
Evaluation Metrics For Regression: Dr. Jasmeet Singh Assistant Professor, Csed Tiet, Patiala
No ratings yet
Evaluation Metrics For Regression: Dr. Jasmeet Singh Assistant Professor, Csed Tiet, Patiala
13 pages
DL Question Bank
No ratings yet
DL Question Bank
23 pages
Deep Learning With Tensorflow
No ratings yet
Deep Learning With Tensorflow
15 pages
Dimensionality Reduction Guide
No ratings yet
Dimensionality Reduction Guide
79 pages
Deep Learning Optimization Guide
No ratings yet
Deep Learning Optimization Guide
30 pages
Single-Layer Perceptron Guide
No ratings yet
Single-Layer Perceptron Guide
39 pages
ML Unit-Iv
No ratings yet
ML Unit-Iv
18 pages
Machine Learning Bits
100% (2)
Machine Learning Bits
28 pages
Nueral Network Mcqs
No ratings yet
Nueral Network Mcqs
6 pages
LDA in Python: Machine Learning Lab
No ratings yet
LDA in Python: Machine Learning Lab
12 pages
Machine Learning Question Bank
No ratings yet
Machine Learning Question Bank
1 page
ML Lab
No ratings yet
ML Lab
21 pages
Machine Learning Lab
No ratings yet
Machine Learning Lab
33 pages
cp4252 Machine Learning Lab Manual
No ratings yet
cp4252 Machine Learning Lab Manual
38 pages
Deep Learning CNN Training Guide
No ratings yet
Deep Learning CNN Training Guide
20 pages
Seminar Report Machine Learning
No ratings yet
Seminar Report Machine Learning
20 pages
2-Capacity, Underfitting, overfitting-15-Jul-2020Material - I - 15-Jul-2020 - ML - Fundamentals
No ratings yet
2-Capacity, Underfitting, overfitting-15-Jul-2020Material - I - 15-Jul-2020 - ML - Fundamentals
35 pages
Machine Learning
No ratings yet
Machine Learning
25 pages
2.building Blocks of Neural Networks
100% (1)
2.building Blocks of Neural Networks
2 pages
Deep Learning
No ratings yet
Deep Learning
9 pages
INF8953CE Final Exam Questions 2020
No ratings yet
INF8953CE Final Exam Questions 2020
5 pages
Osstell ISQ Brochure
No ratings yet
Osstell ISQ Brochure
5 pages
Blockchain in Agriculture IEEE
No ratings yet
Blockchain in Agriculture IEEE
5 pages
Rhodia Polyamide Poster Auto
No ratings yet
Rhodia Polyamide Poster Auto
10 pages
Fishing Operations
100% (1)
Fishing Operations
81 pages
EDJ28503 - Lab 1 - Gear Train Open Ended Lab - 2023 - 2024
No ratings yet
EDJ28503 - Lab 1 - Gear Train Open Ended Lab - 2023 - 2024
5 pages
System Relief and Power Boost Valve Test and Adjustment
No ratings yet
System Relief and Power Boost Valve Test and Adjustment
3 pages
SRS/MKS: Department of Aeronautical Engineering Study Materials
No ratings yet
SRS/MKS: Department of Aeronautical Engineering Study Materials
74 pages
Synthesis and Characterization of Silica Nanoparticles From Clay
No ratings yet
Synthesis and Characterization of Silica Nanoparticles From Clay
7 pages
Real Options - Case 1 Gulf Coast - Capital Budgeting With Staged Entry
No ratings yet
Real Options - Case 1 Gulf Coast - Capital Budgeting With Staged Entry
11 pages
7.3. Electric Fields QP
No ratings yet
7.3. Electric Fields QP
16 pages
Evaluating Limits Algebraically
No ratings yet
Evaluating Limits Algebraically
13 pages
Warp Drive in General Relativity
No ratings yet
Warp Drive in General Relativity
11 pages
Gauteng Province Mathematics - Annual Teaching Plan - Grade 11 2017
No ratings yet
Gauteng Province Mathematics - Annual Teaching Plan - Grade 11 2017
3 pages
Power Point Questions: (10-15 Marks Each)
100% (1)
Power Point Questions: (10-15 Marks Each)
4 pages
Assignment 2
No ratings yet
Assignment 2
3 pages
2-5 Postulates and Paragraph Proofs
No ratings yet
2-5 Postulates and Paragraph Proofs
24 pages
Chasing The Dragon's Tail
92% (39)
Chasing The Dragon's Tail
494 pages
Numerical Methods for Scientists
No ratings yet
Numerical Methods for Scientists
26 pages
Dynamic Vibration Absorbers Optimiz
No ratings yet
Dynamic Vibration Absorbers Optimiz
7 pages
Financial Times Handbook - Lawrence Galitz
No ratings yet
Financial Times Handbook - Lawrence Galitz
7 pages
11 Computer Science
No ratings yet
11 Computer Science
4 pages
FDT21W
No ratings yet
FDT21W
2 pages
Cermets: High-Temp Metal-Ceramic Composites
No ratings yet
Cermets: High-Temp Metal-Ceramic Composites
5 pages
CHE 42 - Problem Set 1 - Phase Equilibrium
No ratings yet
CHE 42 - Problem Set 1 - Phase Equilibrium
3 pages
Error MSG
No ratings yet
Error MSG
26 pages
153/154, 4Th CRS, Dollars Clny, Opp. Clarence Public School, J P Nagar 4Th Phase, Bangalore, Karnataka-560078 Phone - 080-40402020
No ratings yet
153/154, 4Th CRS, Dollars Clny, Opp. Clarence Public School, J P Nagar 4Th Phase, Bangalore, Karnataka-560078 Phone - 080-40402020
1 page
Exp. 2
No ratings yet
Exp. 2
11 pages
Account Statement From 15 May 2019 To 17 May 2019
No ratings yet
Account Statement From 15 May 2019 To 17 May 2019
2 pages
Chloride Diffusion Coefficient Calculation
No ratings yet
Chloride Diffusion Coefficient Calculation
1 page
Elements of Surface Texture
No ratings yet
Elements of Surface Texture
3 pages

Practice Final sp22

Uploaded by

Practice Final sp22

Uploaded by

1

Applied Machine Learning Final Exam (CIS 419/519)

The exam contains 3 sections, roughly arranged by question type:

All the best!

(1) L2 Regularization helps reduce the vanishing gradient problem.

6. [2 pts] Consider the following encoder-decoder style model over

8. [2 pt] Consider the following two PCA-based dimensionality reduction methods.

14. [3 pt] Consider one possible extension of k-nearest neighbors adapted to

16. [3 pts] Consider the architecture for deep averaging networks

We will set and .

25.1. [1 pt] Draw a plot of the activation function

You might also like