NEURAL NETWORK QUESTION ANSWERS
1. Answer: c) ReLU
2. Answer: b) Multi-class classification
3. Answer: c) Once per epoch
4. Answer: b) Binary Cross-Entropy
5. Answer: a) 3x3
6. Answer: b, d
7. Answer: a, b
8. Answer: a, b, c, d
9. Answer: 1.0
10. Answer: 0
11. Answer: a) Sigmoid → ReLU
12. Answer: Mini-batch Gradient Descent
13. Answer: (1, 4, 14, 14)
14. Fix: Replace Sigmoid() with LeakyReLU(negative_slope=0.1).
15. Answer: Tanh’s zero-centered output aids gradient flow during backpropagation.
16.
17. Correct Order: Convolution → MaxPooling → Flattening → Fully Connected
18. Answer: MSE and MAE
19. Answer: Converts multi-dimensional feature maps into 1D vectors for dense layers.
20. Answer: b) Chain rule
21. Answer: d) Momentum Gradient Descent
22. Answer: a) ReLU is computationally efficient
23. Answer: b) To reduce the spatial dimensions of the feature map
24. Answer: b) Sigmoid
25. Answer: b) 0.5
26. Answer: a, c
27. Answer: a, b, c
28. Answer: a, b
29. Answer: 200
30. Answer: 0.4621
31. Answer: b) Add dropout layers
32. Answer: c) Dropout
33. Answer: tensor(0)
34. Issue: input_shape=(64, 64, 3) is incorrect in PyTorch because PyTorch follows (channels, height, width) format, while TensorFlow uses (height, width, channels). Fix: Change input_shape=(64, 64, 3) to input_shape=(3, 64, 64).
35. Answer: Batch Gradient Descent updates weights after processing the entire dataset, while Stochastic Gradient Descent updates weights after each sample.
36.
37. Answer: It converts the 2D feature maps into a 1D vector for input into fully connected layers.
38. Answer: It reduces the spatial dimensions of the output feature map.
39. Answer: c) Huber Loss
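The channels-last vs. channels-first mismatch in answer 34 can be sketched with NumPy (a minimal illustration; the array values and variable names are placeholders, not from the original quiz code):

```python
import numpy as np

# TensorFlow-style channels-last image: (height, width, channels)
x_tf = np.zeros((64, 64, 3))

# PyTorch expects channels-first: (channels, height, width).
# Permuting the axes converts one layout to the other.
x_pt = np.transpose(x_tf, (2, 0, 1))

print(x_pt.shape)  # (3, 64, 64)
```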
40.
41. Answer: It allows the model to shift
the activation function for better
fitting.
42. Answer: To compute gradients of
the loss function with respect to the
weights.
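Answer 42 can be illustrated with a toy loss in plain Python (hypothetical values, no autograd library): compute dL/dw for L(w) = (w·x − y)² via the chain rule, then verify it numerically.

```python
# Chain rule by hand for L(w) = (w*x - y)^2:
#   dL/dw = 2 * (w*x - y) * x
x, y, w = 2.0, 5.0, 1.5

def loss(w):
    return (w * x - y) ** 2

analytic = 2 * (w * x - y) * x

# Numerical check with a central finite difference
eps = 1e-6
numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)

print(analytic, numeric)  # the two gradients agree closely
```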
43. Answer: It balances the computational efficiency of Stochastic Gradient Descent with the stable, low-noise updates of Batch Gradient Descent.
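A minimal sketch of the trade-off in answer 43, assuming a made-up 1-D least-squares problem (the data, learning rate, and batch size are illustrative):

```python
# Mini-batch gradient descent on f(w) = mean((w*x_i - y_i)^2)
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
w, lr, batch_size = 0.0, 0.05, 2

for epoch in range(200):
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Average gradient over the batch: d/dw (w*x - y)^2 = 2*(w*x - y)*x
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad

print(w)  # converges toward 2.0
```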
44. Answer: Tanh
45. Answer: It controls the step size
during weight updates.
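The claim in answer 45 can be seen on a toy quadratic (a sketch; the function and both rates are made up): a small learning rate converges, while an oversized one overshoots the minimum and diverges.

```python
# Gradient descent on f(w) = w**2, whose gradient is 2*w
def run(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w  # step size scales directly with lr
    return w

print(run(0.1))  # small steps: |w| shrinks toward the minimum at 0
print(run(1.1))  # too large: each step overshoots, so |w| grows
```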
46. Answer: -0.1
47. Answer: It suffers from the
vanishing gradient problem.
48. Answer: 5x5
49. Answer: To convert raw scores into
probabilities for multi-class
classification.
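Answer 49 in code, as a minimal pure-Python sketch (the input scores are arbitrary):

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability, then
    # normalize the exponentials so the outputs sum to 1
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)  # non-negative values that sum to 1
```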
50. Answer: It avoids the vanishing
gradient problem and is
computationally efficient.