Introduction to Machine Learning
Week 11
Prof. B. Ravindran, IIT Madras
1. (1 Mark) What constraint must be satisfied by the mixing coefficients (πk ) in a GMM?
(a) πk > 0 ∀ k
(b) Σk πk = 1
(c) πk < 1 ∀ k
(d) Σk πk = 0
Soln. B - The mixing coefficients must sum to 1 so that the mixture defines a valid probability distribution.
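A common way to enforce the simplex constraint in practice is to parameterize the mixing coefficients with a softmax over unconstrained reals. A minimal sketch (the `unconstrained` values are illustrative, not from the lectures):

```python
import numpy as np

def softmax(z):
    # Shift by the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Arbitrary real values map to valid mixing coefficients:
# each pi_k > 0 and they sum to 1.
unconstrained = np.array([0.5, -1.2, 2.0, 0.0])
pi = softmax(unconstrained)

assert np.all(pi > 0) and np.isclose(pi.sum(), 1.0)
```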
2. (1 Mark) The EM algorithm is guaranteed to decrease the value of its objective function on
any iteration.
(a) True
(b) False
Soln. B - EM never decreases the likelihood; each iteration is non-decreasing.
3. (1 Mark) Why might the EM algorithm for GMMs converge to a local maximum rather than
the global maximum of the likelihood function?
(a) The algorithm is not guaranteed to increase the likelihood at each iteration
(b) The likelihood function is non-convex
(c) The responsibilities are incorrectly calculated
(d) The number of components K is too small
Soln. B - The GMM log-likelihood is non-convex, so EM can get stuck at a local maximum.
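Both points can be seen in a toy run of EM on a 1-D two-component GMM: the log-likelihood never decreases across iterations, even though the fixed point reached would depend on where the parameters start. A minimal sketch (the data, initialization, and iteration count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data drawn from two well-separated Gaussians.
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(3, 1, 100)])

def log_likelihood(x, pi, mu, var):
    comp = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return np.log(comp.sum(axis=1)).sum()

pi, mu, var = np.array([0.5, 0.5]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
lls = []
for _ in range(30):
    # E-step: responsibilities gamma_{nk} via Bayes' rule.
    comp = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    gamma = comp / comp.sum(axis=1, keepdims=True)
    # M-step: closed-form updates for means, variances, mixing weights.
    nk = gamma.sum(axis=0)
    mu = (gamma * x[:, None]).sum(axis=0) / nk
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    pi = nk / len(x)
    lls.append(log_likelihood(x, pi, mu, var))

# Monotone non-decrease (question 2); the solution found still
# depends on the initialization (question 6).
assert all(b >= a - 1e-9 for a, b in zip(lls, lls[1:]))
```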
4. (1 Mark) What does soft clustering mean in GMMs?
(a) There may be samples that are outside of any cluster boundary.
(b) The updates during maximum likelihood are taken in small steps, to guarantee convergence.
(c) It restricts the underlying distribution to be gaussian.
(d) Samples are assigned probabilities of belonging to a cluster.
Soln. D - Each sample is assigned a probability (responsibility) of belonging to each cluster, rather than a hard assignment.
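The responsibilities behind soft clustering follow from Bayes' rule over the component densities. A minimal sketch for a scalar sample and two 1-D components (the parameter values are illustrative):

```python
import numpy as np

def responsibilities(x, pi, mu, var):
    # Unnormalized posteriors pi_k * N(x | mu_k, var_k), then normalize.
    dens = pi * np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return dens / dens.sum()

pi = np.array([0.5, 0.5])
mu = np.array([-2.0, 3.0])
var = np.array([1.0, 1.0])

# A point between the two means belongs partly to each cluster,
# with more mass on the nearer component; the probabilities sum to 1.
gamma = responsibilities(0.0, pi, mu, var)
assert np.isclose(gamma.sum(), 1.0) and gamma[0] > gamma[1]
```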
5. (2 Marks) K-means is a special case of GMM with the following properties: (Multiple Correct)
(a) γi = (1/(2πϵ)^(1/2)) e^(−i/(2ϵ))
(b) Covariance = ϵI
(c) µi = µj ∀ i, j
(d) πk = 1/K
Soln. B, D - In the limit ϵ → 0, a GMM with shared isotropic covariance ϵI and equal mixing coefficients πk = 1/K reduces to K-means.
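The K-means limit can be illustrated numerically: with covariance ϵI and equal mixing weights, shrinking ϵ drives the responsibilities toward hard 0/1 assignments. A minimal 1-D sketch (the sample point, means, and ϵ values are illustrative):

```python
import numpy as np

def resp(x, mu, eps):
    # Squared distances to each component mean; equal mixing
    # weights pi_k = 1/K cancel in the normalized ratio.
    d2 = (x - mu) ** 2
    w = np.exp(-(d2 - d2.min()) / (2 * eps))  # stabilized exponentials
    return w / w.sum()

mu = np.array([-1.0, 2.0])
soft = resp(0.2, mu, eps=5.0)   # large eps: genuinely soft split
hard = resp(0.2, mu, eps=0.01)  # tiny eps: nearly one-hot

assert hard.max() > 0.999  # essentially a hard K-means assignment
assert soft.max() < 0.9    # still a soft assignment
```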
6. (1 Mark) We apply the Expectation Maximization algorithm to f (D, Z, θ) where D denotes
the data, Z denotes the hidden variables and θ the variables we seek to optimize. Which of
the following are correct?
(a) EM will always return the same solution which may not be optimal
(b) EM will always return the same solution which must be optimal
(c) The solution depends on the initialization
Soln. C - EM is sensitive to initialization; different starting points can converge to different local optima.
7. (1 Mark) True or False: Iterating between the E-step and M-step of EM algorithms always
converges to a local optimum of the likelihood.
(a) True
(b) False
Soln. A - Each EM iteration does not decrease the likelihood, so the iterates converge to a local optimum (more precisely, a stationary point) of the likelihood.
8. (2 Marks) The number of parameters needed to specify a Gaussian Mixture Model with 4
clusters, data of dimension 5, and diagonal covariances is:
(a) Less than 21
(b) Between 21 and 30
(c) Between 31 and 40
(d) Between 41 and 50
Soln. D - For a GMM with 4 clusters in 5D with diagonal covariances, we need: means
(4×5=20), diagonal covariances (4×5=20), and mixing coefficients (4-1=3). Total parameters
= 20 + 20 + 3 = 43 parameters.
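The count in the solution generalizes to any diagonal-covariance GMM, as a small sketch makes explicit:

```python
def gmm_param_count(K, d):
    # K means of dimension d, K diagonal covariances of dimension d,
    # and K-1 free mixing coefficients (the last is fixed by the
    # sum-to-one constraint).
    return K * d + K * d + (K - 1)

assert gmm_param_count(4, 5) == 43  # matches the solution above
```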