
CSCI 566 Midterm – Spring 2024

Student Name (Please print):

Student ID:

Student Email:

This exam contains 50 questions plus 6 bonus questions, all with equal weight (each worth ½ point of
your total grade). Please provide your answers only in the answer sheet below. We will collect and
grade only the first page and will not consider information on other pages. Please make sure your
answers are readable.

This exam is open book, open notes, but no computers or other electronic devices.

The total time for this exam is 112 minutes (2 min/question; 1:00 pm to 2:52 pm). Students with
OSAS permission can take 1.75 times the duration and submit by 4:20 pm.

1 2 3 4

5 6 7 8

9 10 11 12

13 14 15 16

17 18 19 20

21 22 23 24

25 26 27 28

29 30 31 32

33 34 35 36

37 38 39 40

41 42 43 44

45 46 47 48

49 50 51 52

53 54 55 56

1. After training a neural network, you observe a large gap between the training accuracy (100%)
and the test accuracy (46%). Which of the following methods is least commonly used to reduce this gap?
a. L2 regularization
b. Replace Tanh by Sigmoid activation
c. Reduce the size of the neural networks
d. Adding batch normalization

2. You are benchmarking runtimes for layers commonly encountered in CNNs. Which of the
following would you expect to be the fastest (in terms of floating point operations)?
a. Conv layer (convolution operation + bias addition)
b. Max pooling
c. Sigmoid activation
d. Batch Normalization

3. Consider a neural network model with parameters initialized with zeros. w[1] denotes the weight
matrix of the first layer. You forward propagate a batch of examples, then backpropagate the
gradients and update the parameters. Which of the following statements is true?
a. Entries of w[1] may be positive or negative
b. Entries of w[1] are all negative
c. Entries of w[1] are all positive due to the gradient direction in the initial batches
d. Entries of w[1] are all zeros

4. If your input image is 64x64x3 (i.e., 3 channels), how many parameters are there in a single 1x1
convolution filter without bias?
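For reference, counts like this can be checked with a short PyTorch snippet (assuming the torch package is available; the layer below mirrors the question's setting of a single 1x1 filter over a 3-channel input, and all names are illustrative, not part of the exam):

import torch.nn as nn
# A single 1x1 filter over a 3-channel input, with no bias term.
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=1, bias=False)
# Each output channel holds one 1x1 weight per input channel.
print(conv.weight.numel())  # total number of learnable parameters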

5. Which of the following is a typical approach to solving the exploding gradient problem?
a. Use SGD optimization with small batch sizes
b. Oversample minority classes
c. Increase the batch size to reduce overfitting
d. Rescale gradients by their norm to a predefined threshold
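The gradient rescaling mentioned in option d is commonly called gradient clipping by norm. A minimal NumPy sketch (the threshold and example values are arbitrary assumptions, for illustration only):

import numpy as np

def clip_by_norm(grad, max_norm=1.0):
    # Rescale grad so its L2 norm does not exceed max_norm.
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])               # gradient with L2 norm 5
print(clip_by_norm(g, max_norm=1.0))   # rescaled to norm 1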

6. What is the primary purpose of pruning in decision trees?


a. To reduce the depth of the tree and prevent overfitting
b. To optimize the tree's parameters by L2 regularization and prevent overfitting
c. To handle missing data
d. To improve the tree's interpretability by using a smaller tree

7. You are training an RNN and find that all the weights and parameters are taking on the value
NaN (not a number). Which of the following is the most likely cause?
a. Vanishing gradient problem
b. Exploding gradient problem
c. ReLU activation function g(.) used to compute g(z), where z is too large
d. Sigmoid activation function g(.) used to compute g(z), where z is too large

8. Will the general neural network regularization techniques like L2 work in Graph Neural Networks
(GNNs)?
a. No. Due to the unique spatial relationship, L2 will push GNN weights to near-zero values,
which causes a phenomenon called over-smoothing
b. Yes. GNNs still learn network weights, as they are a type of neural network
c. No. L2 is not feasible for GNN since the squared loss will cause gradient error in aggregation
d. Yes. However, in practice, people most likely rely on early stopping to control GNN
complexity but not L2 regularization

9. What is the primary purpose of the aggregation function in Graph Neural Networks (GNNs)?
a. Combine information from a node’s neighbors to update the node’s representation
b. To reduce the dimensionality of node features before processing
c. To classify the nodes into different categories based on their features
d. To predict the presence or absence of edges between nodes in a graph.
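As a simplified illustration of aggregation, the sketch below (plain NumPy; mean aggregation and a shared weight matrix are assumptions about one common design, not the only one) combines each node's neighbor embeddings and then updates the node's own representation:

import numpy as np

h = np.random.randn(4, 3)                 # embeddings of 4 nodes
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
W = np.random.randn(3, 3)                 # weight matrix shared by all nodes

def update_node(v):
    agg = np.mean(h[neighbors[v]], axis=0)    # AGGREGATE neighbor embeddings
    return np.tanh(W @ (h[v] + agg))          # UPDATE the node representation

new_h = np.stack([update_node(v) for v in range(4)])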

10. Are pooling layers necessary in a convolutional neural network (CNN)?


a. Yes. Without pooling layers, the spatial correlation will not be kept
b. Yes. Without pooling layers, CNN will likely face the issue of vanishing gradient problems
c. No. It is primarily for reducing dimensionality and preventing overfitting
d. No. It is primarily for amplifying the non-linearity in CNN

11. Why is the Rectified Linear Unit (ReLU) activation function considered non-linear?
a. Because it can only output positive values
b. Because it outputs the same value as its input
c. Because it introduces a point of non-linearity at zero, where the function changes slope
d. Because it is a piecewise function with two linear segments

12. In the context of neural networks, why is nonlinearity an essential component within the layers of
the network?
a. Nonlinearity allows neural networks to compute only linearly separable functions, simplifying
calculations
b. Nonlinearity enables neural networks to make decisions and binary classifications more
efficiently
c. Nonlinearity allows neural networks to learn and model complex patterns and relationships
within the data that cannot be represented with linear models
d. Nonlinearity is primarily used to speed up the training process by reducing the number of
required iterations

13. Consider a simple feedforward neural network model, also known as a multi-layer perceptron
(MLP), consisting of a single layer with three neurons. Each neuron receives an input and applies
a weight to it. For this scenario, the inputs to the three neurons are 1, 2, and 3, respectively.
Correspondingly, the weights applied by the neurons are 4, 5, and 6. After the weighted inputs
are summed, a linear activation function is applied to the result. This activation function multiplies
the sum by a constant value of 3. What is the final output of the network based on these
parameters?
a. 32
b. 96
c. 643
d. 9
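The arithmetic in this question can be reproduced directly in Python (a plain transcription of the stated inputs, weights, and constant-3 linear activation):

inputs = [1, 2, 3]
weights = [4, 5, 6]
s = sum(x * w for x, w in zip(inputs, weights))  # weighted sum of the inputs
output = 3 * s                                   # linear activation: multiply by 3
print(output)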

14. Choose a disadvantage of decision trees among the following.


a. Decision trees lack interpretability and are not suited for high-stakes applications
b. Decision trees are costly to run in comparison to kNN
c. Decision trees are prone to overfit
d. All of the above

15. How does a GNN typically handle varying sizes and structures of input graphs?
a. By padding smaller graphs to match the largest one in the dataset
b. By utilizing a fixed number of parameters that are shared across different parts of the graph
c. By employing a variable number of layers depending on the size of the graph
d. By transforming all graphs to a fixed size before processing

16. What is NOT a common method for handling directed graphs in GNNs?
a. Treat them as undirected graphs
b. Use different weight matrices for incoming and outgoing edges
c. Use the information of the out-link nodes only
d. Allow more flexible weight matrices

17. Which of the following best describes a Graph Autoencoder?


a. A network designed for unsupervised learning on graph data
b. A GNN variant used for regression tasks
c. A model that predicts the next node in a sequence
d. A GNN that only uses convolutional layers

18. Which of the following is not a regularization technique?


a. Model pruning
b. L2 norm regularization
c. Random weight initialization
d. Data augmentation

[The figure referenced by questions 19 and 20, showing several activation function curves, is not reproduced in this copy.]

19. Based on the figure, which curve is a sigmoid function?

20. Based on the figure, which curve is an Exponential Linear Unit (ELU) function?

21. For all these non-linear activation functions, which are the primary properties that affect their
usages in neural networks?
a. The value range and the rate of response, e.g., the slope
b. The value range only
c. The ability to handle probability outputs
d. The ability to handle probability outputs and the rate of response, e.g., the slope

22. When computing statistics for data preprocessing (for example, a mean of features), which set of
data should be used to compute them?
a. Training dataset only to avoid data leakage
b. Validation dataset only to avoid overfitting on the training dataset
c. Training and validation datasets together since both are used for training
d. Training, validation, and test datasets altogether since we need as much data as possible to
approximate statistics

23. Which of the following statements about batch normalization is true?
a. During the training time, batch normalization will use the mean and variance of the current
mini-batch.
b. During the test time, batch normalization will use the mean and variance of the current mini-
batch.
c. When batch size is set to the size of the entire dataset, batch normalization on input is
equivalent to performing L1 normalization on input.
d. When batch size is set to the size of the entire dataset, batch normalization on input is
equivalent to performing L2 normalization on input.
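A schematic NumPy sketch of the train-time versus test-time behavior of batch normalization may help here (simplified: no learnable scale/shift, and the running-average update is an assumption about a typical implementation):

import numpy as np

eps = 1e-5

def bn_train(x, running):
    # Training: normalize with the statistics of the current mini-batch.
    mu, var = x.mean(axis=0), x.var(axis=0)
    running["mu"] = 0.9 * running["mu"] + 0.1 * mu    # running averages are
    running["var"] = 0.9 * running["var"] + 0.1 * var # stored for test time
    return (x - mu) / np.sqrt(var + eps)

def bn_test(x, running):
    # Test: normalize with the stored running statistics, not the batch.
    return (x - running["mu"]) / np.sqrt(running["var"] + eps)

running = {"mu": np.zeros(2), "var": np.ones(2)}
out = bn_train(np.random.randn(8, 2), running)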

24. You have a fully connected neural network with input layer (i), 2 hidden layers (h1 and h2) and 1
output layer (𝜎). Dimensions of the layers are i = 10; h1 = 20; h2 = 20; 𝜎 = 10. Assume fully
connected layers do not have biases. You decide to double the number of hidden units in every
hidden layer. By what factor does this increase the number of parameters of the network?
a. 1 (the same number of parameters)
b. 2
c. 3
d. 6
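Parameter counts for bias-free fully connected networks can be checked with a couple of lines of Python (layer sizes copied from the question; the helper name is illustrative):

def n_params(dims):
    # Number of weights in a bias-free MLP with the given layer sizes.
    return sum(a * b for a, b in zip(dims[:-1], dims[1:]))

before = n_params([10, 20, 20, 10])
after = n_params([10, 40, 40, 10])   # hidden sizes doubled
print(before, after, after / before)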

25. Which of the following statements about Recurrent Neural Networks (RNNs) is not true?
a. RNNs suffer from the vanishing gradient problem but not the exploding gradient problem
b. RNNs can produce a sequence of outputs
c. RNNs can deal with variable-length inputs
d. RNNs are usually difficult to train due to the vanishing gradient problem

Consider a variant of LeNet-5 (C1, P1, C2, P2, F1, F2, F3) shown here (the architecture figure is not
reproduced in this copy) and answer the corresponding questions. Convolution layers and fully connected
layers have weights and biases. Here, this LeNet takes a color image of size (32, 32, 3) as input and
outputs a prediction vector of probabilities for 4 classes.

26. For the first convolutional layer, we observe that the shape of input data is (32, 32, 3) and the
shape of output data is (28, 28, 4). Assuming there is no spatial padding for this layer and stride
is 1, what is the size of C1’s convolutional kernel? The size of a kernel is represented in the form
of (input channels, kernel height, kernel width, output channels). The answer should be four
numbers in parentheses.
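The output spatial size of a convolution follows the standard formula out = (in - kernel + 2*padding) / stride + 1, so one can scan kernel sizes to see which maps 32 to 28 with no padding and stride 1 (plain Python, illustrative only):

def conv_out(in_size, kernel, stride=1, padding=0):
    # Output size along one spatial dimension.
    return (in_size - kernel + 2 * padding) // stride + 1

for k in range(1, 8):
    print(k, conv_out(32, k))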

27. Assume we are trying to learn a decision tree. Our input data consists of N samples, each with k
attributes (k<<N). We define the depth of a tree as the maximum number of nodes between the
root and any of the leaf nodes (including the leaf, not the root).

If all attributes are binary, what is the maximal number of leaf (decision) nodes that we
can have in a decision tree for this? (choose the closest one if your answer is slightly off).
a. 2^k
b. NK
c. 2NK - 1
d. N^k - 1

28. For the same binary-valued tree, what is the maximum possible depth of a decision tree for this
data?
a. 𝑂(𝑘).
b. 𝑂(𝑘 − 1)
c. 𝑂(𝑙𝑜𝑔𝑘)
d. 𝑂(log(𝑘 + 𝑁))

29. If all attributes are continuous, what is the maximum number of leaf nodes that we can
have in a decision tree for this data?
a. 2^(k-1)
b. NK^2
c. K
d. N

30. If all attributes are continuous, what is the maximal possible depth for a decision tree for this?
(choose the closest one if your answer is slightly off).
a. NK^2
b. K
c. N - 1
d. 2^(k-1)

31. Which of the following layers is not a linear activation function?


a. Convolutional layer
b. Pooling layer
c. ReLU layer
d. Batch normalization Layer

32. There are 100 data instances, each of which is a 64-dimensional feature vector. To build a 3-
nearest-neighbor classifier on this data, the number of parameters to learn is __.

33. In a convolutional layer with 3 filters, each with the size (3, 2, 2) representing C, H, W, the number
of parameters in this layer including biases is ____.

34. For a 10-way classification task, there are 100 billion images in the training set, each with the
size (C, H, W) = (3, 224, 224). After training each of the following models on the task, when a new
image comes, which one would be the least efficient to infer the label of the new image?
a. 10 nearest neighbor classifier
b. AlexNet
c. ResNet-50
d. An MLP with 100 layers, where each layer is half the size of the previous layer, and the
minimum size is 16

35. Comparing Ridge regression and Lasso regression, which one(s) can help feature selection?
a. Ridge regression (L2)
b. Lasso regression (L1)
c. Both
d. Neither

36. Suppose you have a single neuron with a linear activation function g() as above, input x =
(x_1, ..., x_n), and weights W = (W_1, ..., W_n). The squared error for this input is (y - W^T x)^2,
where the true output y is a scalar. What is the gradient-based weight update rule for the neuron,
given the learning rate λ?
a. W_i ← W_i + λ 2 x_i (y - W^T x)
b. W_i ← W_i + λ 2 x_i (y - W^T x) + ||W||^2
c. W_i ← W_i + λ x_i (y - W^T x) + ||W||^2
d. W_i ← W_i + λ x_i
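For context on where such an update comes from: the gradient of the squared error (y - W^T x)^2 with respect to W_i is -2 x_i (y - W^T x), so a gradient-descent step adds λ·2 x_i (y - W^T x) to W_i. A minimal NumPy sketch of one step (the data values are made up purely for illustration):

import numpy as np

x = np.array([1.0, 2.0, 0.5])    # input
w = np.array([0.1, -0.3, 0.2])   # weights
y = 1.0                          # true scalar output
lam = 0.01                       # learning rate
err = y - w @ x                  # residual (y - W^T x)
w = w + lam * 2 * x * err        # one gradient-descent step on the squared error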

37. For linear regression models, assume we only observe a single input for each output (that is, a
set of {x, y} pairs). We would like to compare the following two models on our input dataset (for
each one we split into training and testing set to evaluate the learned model). Assume we have
an unlimited amount of data:
A: y = w^2 x
B: y = wx

Which of the following is correct (choose the answer that best describes the outcome):
a. There are datasets for which A would perform better than B
b. There are datasets for which B would perform better than A
c. Both (a) and (b) are correct.
d. They would perform equally well on all datasets.

38. Note that model A now has a new form. Again we assume unlimited data. Which of the following
is correct (choose the answer that best describes the outcome):
A: y = tan(w) x + wx
B: y = wx

a. There are datasets for which A would perform better than B


b. There are datasets for which B would perform better than A
c. Both (a) and (b) are correct.
d. They would perform equally well on all datasets.

39. What is an outlier within the context of machine learning and data mining?
a. An error in data collection
b. A data point that differs significantly from other observations in the dataset
c. A data point that lies within the normal range of the distribution
d. A data point that is the most frequent in the dataset

40. When using the k-nearest neighbors (k-NN) algorithm for outlier detection, how are outliers
determined?
a. Outliers are the points with the highest density of neighbors
b. Outliers are the points with the lowest density of neighbors
c. Outliers are the points that are closest to the center of the dataset
d. Outliers are the points that are most similar to the k-nearest neighbors

41. Consider a binary classification problem (2 possible classes) on a 2D plane (2 inputs) with a
circular decision boundary ((x_1 - a)^2 + (x_2 - b)^2 < r^2 → y = 0; otherwise, y = 1; a, b, r are
learnable parameters). What is the minimum number of distinct points required to make perfect
prediction impossible for this classifier?
a. 2
b. 3
c. 4
d. 8

42. Consider a supervised learning problem with MSE loss. If the training error is 0,
which of the following is true?
a. Test error will always be 0
b. True error on any given data will always be 0
c. Both (A) and (B)
d. Neither (A) nor (B)

43. Consider an MLP with one hidden layer of size 8, 4 inputs, and 2 outputs. How
many learnable parameters does this network have? (ignore the bias term; assume
activation functions don’t have parameters)?
a. 48
b. 40
c. 24
d. 8

44. Consider an MLP with one hidden layer, 2 inputs, and 1 output, but no activation
after the hidden or the output layer. Which of the following functions can it predict
accurately (assuming there is enough data and hidden neurons available for convergence)?
a. 𝑦 = 𝑥
b. 𝑦 = |𝑥|
c. y = x^2
d. Both (A) and (B)

45. Consider a binary classification problem (2 possible classes) with 0/1 loss (loss = 0 if
the prediction is correct, loss = 1 otherwise). Can backpropagation + SGD (let us assume
magically it is differentiable) be used to train a neural network for this problem?
a. Yes, regardless of the output activations or the NN architecture
b. No, regardless of the output activations or the NN architecture
c. Yes, but only for some output activations (regardless of the NN architecture)
d. Yes, but only for certain NN architectures and output activation functions

46. Consider an 8 × 8 input image passed through a CNN with a 1 × 1 filter and stride of
1. Assume that the convolution operation is followed by an 8 ×8 max pool operation.
Will output change if pixels in the input image are shuffled without changing the CNN
weights?
a. Yes
b. No
c. Depends on the shuffling
d. Depends on the weights

47. Which of the following functions are permutation invariant (x1, x2, x3 are inputs)?
a. f(x) = x1 + 2x2 + x3
b. f(x) = average(x1, 2x2, x3)
c. f(x) = max(relu(x1), relu(x2), relu(x3))
d. None of the above
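A function f(x1, x2, x3) is permutation invariant if its value is unchanged under every reordering of its inputs. Non-invariance can be demonstrated mechanically with a short Python check (illustrative only; a single counterexample disproves invariance, whereas invariance in general needs an argument rather than a test):

from itertools import permutations

def same_on_all_orderings(f, xs):
    # True if f gives one value across every permutation of xs.
    return len({f(*p) for p in permutations(xs)}) == 1

relu = lambda v: max(0.0, v)
f_a = lambda x1, x2, x3: x1 + 2 * x2 + x3
f_c = lambda x1, x2, x3: max(relu(x1), relu(x2), relu(x3))
print(same_on_all_orderings(f_a, (1.0, 2.0, 3.0)))
print(same_on_all_orderings(f_c, (1.0, 2.0, 3.0)))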

48. Consider a GNN with the following functions for calculating node embeddings z_v of node v (the
defining equations are given in a figure that is not reproduced in this copy).

Assume you are solving a problem for which the neighborhood aggregation (the AGGREGATE
function above) should weigh the neighbors' embeddings h_u^(l) based on their distance to the
embedding of the current node h_v^(l). Is it possible to implement such a function while preserving
permutation invariance?
a. Yes
b. No
c. Yes, but only for some distance functions
d. Yes, but only for some graphs

49. In the context of clustering, what does the term "centroid" refer to?
a. The center point of a cluster in k-Means clustering.
b. The largest point in a dataset.
c. The average distance between clusters.
d. The initial point selected at random in hierarchical clustering.

50. When using the k-means algorithm, if the initial centroids are chosen poorly, which of the
following may occur?
a. The algorithm will not run
b. The algorithm may converge to a local minimum
c. The number of clusters will automatically increase
d. The algorithm will switch to a hierarchical approach

51. How does the K-nearest neighbors (KNN) algorithm typically handle regression problems?
a. By calculating the mean value of the nearest neighbors' labels/values
b. By taking a majority vote among the labels/values of the nearest neighbors
c. By treating it as a classification problem and using the value from the most similar neighbor
d. By using gradient descent to minimize the distance between neighbors

52. The introduction of residual connections in neural networks, as seen in ResNet architectures,
helps mitigate the vanishing gradient problem. How do these connections work?
a. By allowing gradients to flow through additional paths, bypassing non-linear transformations.
b. By reducing the depth of the network, thereby shortening the gradient propagation path.
c. By adding a constant to the gradients at each layer.
d. By forcing all layers to learn an identity function, making the network effectively shallower.

53. How can gradient clipping help in addressing the vanishing gradient problem?
a. It makes the gradients larger when they are too small.
b. It prevents gradients from becoming too large, which may be used for vanishing gradients
with caution.
c. It helps by normalizing the gradients to a specific range to ensure they neither vanish nor
explode.
d. Gradient clipping is not a method used for addressing vanishing gradients.

54. Why does the vanishing gradient problem primarily affect deep networks?
a. Because shallow networks do not use gradient-based learning algorithms
b. Because deep networks have more parameters and thus higher computational complexity
c. Because in deep networks, the multiple layers can cause the gradients to become very small,
exponentially fast as they are propagated backward through each layer
d. Because deep networks are more prone to overfitting which inherently causes vanishing
gradients

55. When a CNN is described as being "deep," this refers to which of the following?
a. The number of filters in the convolutional layers.
b. The size of the filters in the convolutional layers.
c. The number of convolutional layers in the network.
d. The amount of pooling layers in the network.

56. Did the instructor mention there could be a curve for the grade?
a. No. It is clearly stated that there is no curve
b. No. But there might be a random grade bump
c. Yes. It depends on the final score distribution
d. Yes. There will be a curve regardless of the score distribution

