
PA Wk11

The document contains practice questions for a Machine Learning course focused on probability and statistics, specifically dealing with continuous random variables and their properties. It includes problems on calculating probabilities, expectations, and variances using various probability density functions (PDFs) and distributions. Answers and explanations are provided for each question, demonstrating the application of statistical concepts.

Uploaded by

amit

Course: Machine Learning - Foundations

Week 11 Questions

PRACTICE QUESTIONS

1. (1 point) The continuous random variable X represents the amount of sunshine in hours
between noon and 8 pm at a skiing resort in the high season. The probability density
function, f (x), of X is modelled by
$$f(x) = \begin{cases} kx^2, & 0 \le x \le 8 \\ 0, & \text{otherwise} \end{cases}$$

Find the probability that on a particular day in the high season there is more than two
hours of sunshine between noon and 8 pm.

Answer: 0.98

Explanation: To solve this question, we will use the pdf to find the required probability.
But first we must find k using the fact that total probability is 1.

$$\int_x f_X(x)\,dx = \int_0^8 kx^2\,dx = \frac{k}{3}x^3\Big|_0^8 = \frac{512}{3}k$$

Since the total probability is 1, $k = \frac{3}{512}$. So, now the probability that there is more than two hours of sunshine is $P(X > 2)$, given by

$$P(X > 2) = \int_2^8 f_X(x)\,dx = \int_2^8 \frac{3}{512}x^2\,dx = \frac{3}{512}\cdot\frac{1}{3}x^3\Big|_2^8 = \frac{8^3 - 2^3}{512} = \frac{504}{512} \approx 0.98$$

∴ The answer is 0.98
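
As a quick numerical cross-check of the two integrals above, the following sketch approximates them with a midpoint rule. The helper `integrate` and the step count are our own choices, not part of the original solution.

```python
def integrate(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Normalisation: k times the integral of x^2 over [0, 8] must equal 1.
k = 1 / integrate(lambda x: x**2, 0, 8)

def pdf(x):
    """The density k * x^2 on [0, 8], zero elsewhere."""
    return k * x**2 if 0 <= x <= 8 else 0.0

# P(X > 2) is the integral of the density from 2 to 8.
p_more_than_two = integrate(pdf, 2, 8)
print(k * 512, p_more_than_two)
```

The first printed value should be close to 3 (so k is close to 3/512) and the second close to 0.984.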

2. (1 point) Let X be a continuous random variable with PDF


$$f_X(x) = \begin{cases} 6x + bx^2, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$

Calculate the value of $b$.

Answer: -6

Explanation: To solve this question, we will use the fact that total probability is 1.

$$\int_x f_X(x)\,dx = \int_0^1 (6x + bx^2)\,dx = \left[3x^2 + \frac{b}{3}x^3\right]_0^1 = 3 + \frac{b}{3}$$

Since the total probability is 1, $3 + \frac{b}{3} = 1 \implies b = -6$.
∴ The answer is −6

3. (1 point) Let X be a continuous random variable with PDF


$$f_X(x) = \begin{cases} 4x^k, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$

for some k, find E(X).

Answer: 0.8

Explanation: To solve this question, we will use the pdf to find the expectation. But
first we must find k using the fact that total probability is 1.

$$\int_x f_X(x)\,dx = \int_0^1 4x^k\,dx = \frac{4}{k+1}x^{k+1}\Big|_0^1 = \frac{4}{k+1}$$

Since the total probability is 1, $\frac{4}{k+1} = 1 \implies k = 3$. So, now the expectation is given by

$$E(X) = \int_x x f_X(x)\,dx = \int_0^1 (x)\,4x^3\,dx = \frac{4}{5}x^5\Big|_0^1 = 0.8$$

∴ The answer is 0.8

4. (1 point) The time at which a passenger train reaches the station is uniformly distributed
between 2:00 PM and 4:00 PM. What is the probability that the train reaches the station
exactly at 4:00 PM?

Answer: 0

Explanation: Let X be the random variable denoting the time the train will arrive
such that 2 ≤ X ≤ 4. Now, X is a continuous random variable so the probability of X
being equal to a single point is 0. That is, P (X = 4) = 0.

∴ The answer is 0.

5. (1 point) If X is an exponential random variable with rate parameter λ, then which of
the following statement(s) is (are) correct?
A. P (X > x + k|X < k) = P (X > k) for k, x ≥ 0.
B. P (X > x + k|X > k) = P (X > k) for k, x ≤ 0.
C. P (X > x + k|X > k) = P (X > x) for k, x ≥ 0.
D. P (X > x + k|X > k) = P (X > k) for k, x ≥ 0.

Answer: C

Explanation: An exponential random variable has the memoryless property.
Say X denotes waiting time in hours. Given that you have already waited for k hours, the
probability of waiting more than x further hours is the same as the probability of waiting
more than x hours from the start. That is,

P (X > x + k | X > k) = P (X > x)

We can see why this is true. We know that the cdf of an exponential distribution is
$F_X(x) = 1 - e^{-\lambda x}$, so $P(X > x) = 1 - F_X(x) = e^{-\lambda x}$. Since $x \ge 0$, the event
$X > x + k$ implies $X > k$, so

$$\begin{aligned}
P(X > x + k \mid X > k) &= \frac{P(X > x + k \,\wedge\, X > k)}{P(X > k)} \\
&= \frac{P(X > x + k)}{P(X > k)} \\
&= \frac{e^{-\lambda(x+k)}}{e^{-\lambda k}} \\
&= e^{-\lambda x} \\
&= P(X > x)
\end{aligned}$$

∴ Option C is correct.
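
The memoryless property can also be checked numerically. The sketch below evaluates both sides of Option C on a small grid; the rate `lam = 0.5` and the grid values are arbitrary illustrative choices.

```python
import math

def survival(x, lam):
    """P(X > x) for an exponential random variable with rate lam."""
    return math.exp(-lam * x)

def conditional_survival(x, k, lam):
    """P(X > x + k | X > k), using P(X > x + k and X > k) = P(X > x + k) for x >= 0."""
    return survival(x + k, lam) / survival(k, lam)

lam = 0.5  # hypothetical rate, chosen only for illustration
grid = [(x, k) for x in (0.0, 0.7, 2.0) for k in (0.0, 1.5, 4.0)]

# Option C: the conditional probability equals P(X > x) for every pair.
memoryless_holds = all(
    math.isclose(conditional_survival(x, k, lam), survival(x, lam))
    for x, k in grid
)
print(memoryless_holds)
```

By contrast, `conditional_survival(1.0, 2.0, lam)` differs from `survival(2.0, lam)`, which is what Option D would require.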

6. (1 point) The lifetime of an electric bulb is exponentially distributed with a mean life of
18 months. If there is a 60% chance that an electric bulb will last for at most t months,
then what is the value of t?

Answer: 16.5

Explanation: Let X be the random variable denoting the lifetime of an electric bulb
in months. Since the mean life is 18 months, the rate parameter is $\lambda = \frac{1}{18}$, which makes
the pdf $f_X(x) = \frac{1}{18}e^{-x/18}$ for $x \ge 0$.
Now, the probability that an electric bulb will last for at most t months is $P(X \le t)$,
given by

$$P(X \le t) = 1 - e^{-t/18} = 0.6 \implies e^{-t/18} = 0.4 \implies t = (-18)\ln(0.4) \approx 16.5$$

∴ The answer is 16.5 months.
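
The same calculation can be sketched in a few lines by inverting the exponential CDF. The function names `cdf` and `quantile` are ours.

```python
import math

mean_life = 18.0        # months, from the question
lam = 1.0 / mean_life   # rate parameter of the exponential distribution

def cdf(t):
    """P(X <= t) for the bulb lifetime."""
    return 1.0 - math.exp(-lam * t)

def quantile(p):
    """Solve 1 - exp(-lam * t) = p for t."""
    return -math.log(1.0 - p) / lam

t = quantile(0.6)
print(t, cdf(t))
```

Plugging `t` back into `cdf` recovers 0.6, which confirms the inversion.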

7. (1 point) (Multiple Select) Let X be uniformly distributed with parameters a and b,
then which of the following is/are true:
A. $E(X^2) = \dfrac{b^2 + a^2 + ab}{3}$
B. $f(x) = \dfrac{-1}{a - b}, \quad a \le x \le b$
C. $E(X) = \dfrac{b + a}{2}$
D. $V(X) = \dfrac{(b - a)^2}{12}$
Answer: A, B, C, D

Explanation: Here, we have a uniformly distributed random variable $X \sim U(a, b)$.
Such a distribution has a constant pdf $f_X(x) = \frac{1}{b-a}$, $x \in [a, b]$.
Since $f_X(x) = \frac{1}{b-a} = \frac{-1}{a-b}$, Option B is correct.

Now, let us find the expectation of X

$$E(X) = \int_x x f_X(x)\,dx = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{2(b-a)}x^2\Big|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}$$

So, Option C is correct.

Now, let us find the expectation of $X^2$

$$E\left(X^2\right) = \int_x x^2 f_X(x)\,dx = \int_a^b \frac{x^2}{b-a}\,dx = \frac{1}{3(b-a)}x^3\Big|_a^b = \frac{b^3 - a^3}{3(b-a)} = \frac{b^2 + a^2 + ab}{3}$$

So, Option A is correct.

Now, let us find the variance of X


$$\begin{aligned}
V(X) &= E\left(X^2\right) - E(X)^2 = \frac{b^2 + a^2 + ab}{3} - \left(\frac{a+b}{2}\right)^2 \\
&= \frac{4b^2 + 4a^2 + 4ab - 3a^2 - 3b^2 - 6ab}{12} = \frac{(b-a)^2}{12}
\end{aligned}$$
So, Option D is correct.

∴ Options A, B, C and D are correct.
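
To see the formulas in Options A, C and D at work, the sketch below integrates numerically for one concrete pair of endpoints. The values a = 2, b = 5 and the midpoint-rule helper are assumptions made only for illustration.

```python
def integrate(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, b = 2.0, 5.0  # hypothetical endpoints

def pdf(x):
    """Constant density of U(a, b) on [a, b]."""
    return 1.0 / (b - a)

mean = integrate(lambda x: x * pdf(x), a, b)
second_moment = integrate(lambda x: x**2 * pdf(x), a, b)
variance = second_moment - mean**2

# Closed forms from the options, for comparison.
print(mean, (a + b) / 2)
print(second_moment, (b**2 + a**2 + a * b) / 3)
print(variance, (b - a) ** 2 / 12)
```

Each printed pair should agree to several decimal places (3.5, 13 and 0.75 for these endpoints).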

8. (1 point) (Multiple Select) Which of the following options is/are correct?


A. The shape of the normal density curve is bell shaped.
B. Normal density curve is symmetric about its mean.
C. The area under a standard normal density curve is 1.
D. The standard normal density curve is symmetric about the value 0.

Answer: A, B, C, D

Explanation: The pdf of a normal distribution is

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)$$

Firstly, we know the graph of this function is a bell curve, so Option A is correct. Also, we can
see that this curve is symmetric about the mean $\mu$. This is because
$f_X(\mu - x) = f_X(\mu + x)$ for every $x$. So, Option B is correct.

When considering the standard normal, the mean is 0 and the variance is 1. Since it is
still a pdf, the area under it is 1. So, Option C is correct.
Finally, Option D is correct for the same reason as B, with $\mu = 0$.

∴ Options A, B, C and D are correct.

9. (1 point) Let X and Y be continuous random variables with joint density

$$f_{XY}(x, y) = \begin{cases} cxy, & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$$

Calculate $P\left(0 < X < \frac{1}{2},\ 0 < Y < \frac{1}{2}\right)$.

Answer: 0.0625

Explanation: To solve this question, we will use the pdf to find the required probability.
But first we must find c using the fact that total probability is 1.
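$$\iint_{x,y} f_{X,Y}(x, y)\,d(x, y) = \int_0^1\!\!\int_0^1 cxy\,dx\,dy = c \cdot \frac{x^2}{2}\Big|_0^1 \cdot \frac{y^2}{2}\Big|_0^1 = \frac{c}{4}$$

Since the total probability is 1, $c = 4$. So, now the required probability is given by

$$P\left(0 < X < \tfrac{1}{2},\ 0 < Y < \tfrac{1}{2}\right) = \int_0^{0.5}\!\!\int_0^{0.5} 4xy\,dx\,dy = 4 \cdot \frac{x^2}{2}\Big|_0^{0.5} \cdot \frac{y^2}{2}\Big|_0^{0.5} = \frac{1}{16} = 0.0625$$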

∴ The answer is 0.0625

10. ( points) Let X be a uniformly distributed random variable with $\mu_X = 15$ and $\sigma_X^2 = \frac{25}{3}$.
Calculate $P(X > 17)$.

Answer: 0.3

Explanation: Here, we have a uniformly distributed random variable $X \sim U(a, b)$. To
solve this question, we will first find a and b, and then the required probability. Now, given the
mean, we have

$$\mu_X = \frac{a+b}{2} = 15 \implies a = 30 - b$$

Substituting the above into the equation for the variance and taking the root with $b > a$,

$$\sigma_X^2 = \frac{(b-a)^2}{12} = \frac{25}{3} \implies (2b - 30)^2 = 100 \implies b = 20 \implies a = 10$$

So, $X \sim U(10, 20)$. Finally, the required probability is given by

$$P(X > 17) = \int_{17}^{20} \frac{1}{20 - 10}\,dx = \frac{x}{10}\Big|_{17}^{20} = \frac{3}{10} = 0.3$$

∴ The answer is 0.3.
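
The two steps above (recover the endpoints, then take the tail probability) can be sketched as follows. The helper names are ours, and the width formula is just the variance equation $(b-a)^2/12 = \sigma^2$ solved for $b - a$.

```python
import math

def uniform_endpoints(mean, variance):
    """Recover (a, b) of U(a, b) from mean (a+b)/2 and variance (b-a)^2/12."""
    width = math.sqrt(12.0 * variance)  # b - a, taking the positive root
    return mean - width / 2.0, mean + width / 2.0

def uniform_tail(x, a, b):
    """P(X > x) for X ~ U(a, b)."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)

a, b = uniform_endpoints(15.0, 25.0 / 3.0)
p = uniform_tail(17.0, a, b)
print(a, b, p)
```

Up to floating-point rounding this gives the endpoints 10 and 20 and the probability 0.3.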

11. (1 point) Let X be a continuous random variable with PDF



$$f_X(x) = \begin{cases} ax, & 0 < x < 3 \\ a(6 - x), & 3 \le x \le 6 \\ 0, & \text{otherwise} \end{cases}$$

Calculate $P(0 \le X \le 4)$.

Answer: 0.77

Explanation: To solve this question, we will use the pdf to find the required probability.
But first we must find a using the fact that total probability is 1.

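$$\begin{aligned}
\int_x f_X(x)\,dx &= \int_0^3 ax\,dx + \int_3^6 a(6 - x)\,dx = \frac{a}{2}x^2\Big|_0^3 + a\left[6x - \frac{x^2}{2}\right]_3^6 \\
&= \frac{9a}{2} + a\left[\left(36 - \frac{36}{2}\right) - \left(18 - \frac{9}{2}\right)\right] = 9a
\end{aligned}$$

Since the total probability is 1, $a = \frac{1}{9}$. So, now the required probability is given by

$$\begin{aligned}
P(0 < X < 4) &= \int_0^3 \frac{x}{9}\,dx + \int_3^4 \frac{6 - x}{9}\,dx = \frac{1}{18}x^2\Big|_0^3 + \frac{1}{9}\left[6x - \frac{x^2}{2}\right]_3^4 \\
&= \frac{1}{2} + \frac{1}{9}\left[\left(24 - \frac{16}{2}\right) - \left(18 - \frac{9}{2}\right)\right] = 0.5 + \frac{2.5}{9} = \frac{7}{9} \approx 0.777
\end{aligned}$$

Truncated to two decimal places, this is 0.77.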
∴ The answer is 0.77
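
Because the density is piecewise linear, each integral is the area of a trapezoid, so the result can be checked exactly with rational arithmetic. This is a sketch; `shape` and `trapezoid_area` are helper names of our own.

```python
from fractions import Fraction

def shape(x):
    """The density with the constant a factored out: x on (0, 3), 6 - x on [3, 6]."""
    return x if x < 3 else 6 - x

def trapezoid_area(f, lo, hi):
    """Exact integral over [lo, hi] of a function that is linear on that interval."""
    return (f(lo) + f(hi)) * (hi - lo) / 2

zero, three, four, six = (Fraction(v) for v in (0, 3, 4, 6))

# Total probability must be 1, which fixes a.
total = trapezoid_area(shape, zero, three) + trapezoid_area(shape, three, six)
a = 1 / total

# P(0 <= X <= 4): the full rising piece plus the falling piece up to 4.
prob = a * (trapezoid_area(shape, zero, three) + trapezoid_area(shape, three, four))
print(a, prob, float(prob))
```

The exact value is 7/9, whose decimal expansion 0.777... is the 0.77 reported above.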

12. Let X be exponentially distributed with parameter λ, then which of the following is/are
true about the variance of X:

a. $V(X) = E[X^2] - (E[X])^2$
b. $V(X) = E\left[(X - E[X])^2\right]$
c. $V(X) = (E[X])^2$
d. $V(X) = E[X^2]$

Answer: A, B, C

Explanation: According to the definition of the variance,

$$V(X) = E\left[(X - E(X))^2\right] = E\left(X^2\right) - E(X)^2$$

So, Options A and B are correct.

For an exponential distribution with rate parameter λ, $E(X) = \frac{1}{\lambda}$ and $V(X) = \frac{1}{\lambda^2}$.
Also, $E(X^2) = V(X) + E(X)^2 = \frac{2}{\lambda^2}$.
Given this information, $V(X) = \frac{1}{\lambda^2} = (E(X))^2$, so Option C is correct, while
$E(X^2) = \frac{2}{\lambda^2} \ne V(X)$, so Option D is incorrect.

∴ Options A, B and C are correct.

13. (1 point) Which of the following options is/are always true for three events A, B and C
of a random experiment?

a. If A ⊂ B then P (B|A) = 1
b. If B ⊂ A then P (B|A) = 1
c. If B ⊂ A then P (A|B) = 1
d. If P (A|B) > P (A) then P (B|A) > P (B) (Assuming the events have non-zero
probabilities)

Answer: A, C, D

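Explanation: To solve this question, we will use the definition of conditional probability,
$P(B|A) = \frac{P(B \cap A)}{P(A)}$.
Now, if $A \subset B$, then $B \cap A = A$, so $P(B|A) = \frac{P(B \cap A)}{P(A)} = \frac{P(A)}{P(A)} = 1$.
Swapping the roles of A and B gives the statement in Option C: if $B \subset A$, then $P(A|B) = 1$.
So, Options A and C are correct.

If, however, we have $B \subset A$, then $B \cap A = B$, so $P(B|A) = \frac{P(B \cap A)}{P(A)} = \frac{P(B)}{P(A)}$,
which is not equal to 1 in general.
So, Option B is incorrect.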

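Finally, since $P(A)$ and $P(B)$ are non-zero,

$$\begin{aligned}
& P(A|B) > P(A) \\
\implies\ & \frac{P(A \cap B)}{P(B)} > P(A) \\
\implies\ & \frac{P(A \cap B)}{P(A)} > P(B) \\
\implies\ & P(B|A) > P(B)
\end{aligned}$$

So, Option D is correct.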

∴ Options A, C and D are correct.

14. (1 point) Consider the random experiment of selecting a number from the set of integers from
1 to 20, both inclusive, where all numbers are equally likely to occur. Let A be
the event that the selected number is odd, and B be the event that the selected number is
divisible by 3. Choose the correct option from the following:
A. A and B are dependent on each other.
B. A and B are independent of each other.
C. Can’t say

Answer: B

Explanation: To solve this question, we will use the definition of independent events.
Two events A and B are said to be independent if the occurrence of one event does not
affect the probability of the other. That is, $P(A|B) = P(A)$ or equivalently, $P(A \cap B) = P(A)P(B)$.

Let X be the random variable uniformly distributed over the integers 1 to 20 inclusive.
We can compute

$$P(A) = P(X \text{ is odd}) = \frac{10}{20} = 0.5$$

$$P(B) = P(X \text{ is divisible by 3}) = \frac{6}{20} = 0.3$$

$$P(A \cap B) = P(X \text{ is odd and divisible by 3}) = \frac{3}{20} = 0.15$$

Since $P(A)P(B) = (0.5)(0.3) = 0.15 = P(A \cap B)$, A and B are independent.

∴ Option B is correct.
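
The counting argument above can be reproduced by enumerating the 20 outcomes. The sketch below uses exact fractions so that the product test is not affected by rounding; the helper `prob` is our own.

```python
from fractions import Fraction

outcomes = range(1, 21)  # the integers 1 to 20, all equally likely

def prob(event):
    """Probability of an event (a predicate on outcomes) under equal likelihood."""
    favourable = sum(1 for n in outcomes if event(n))
    return Fraction(favourable, len(outcomes))

def is_odd(n):
    return n % 2 == 1

def divisible_by_3(n):
    return n % 3 == 0

p_a = prob(is_odd)
p_b = prob(divisible_by_3)
p_ab = prob(lambda n: is_odd(n) and divisible_by_3(n))

# Independence test: P(A and B) == P(A) * P(B).
independent = p_ab == p_a * p_b
print(p_a, p_b, p_ab, independent)
```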

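15. (1 point) Mayur rolls a fair die repeatedly until a number that is a multiple of 3 is
observed. Let the random variable N represent the total number of times the die is rolled.
Find the probability distribution of N.

A. $f_N(k) = \begin{cases} \frac{2}{3} \times \left(\frac{1}{3}\right)^{k-1}, & k = 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}$

B. $f_N(k) = \begin{cases} \frac{1}{3} \times \left(\frac{2}{3}\right)^{k-1}, & k = 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}$

C. $f_N(k) = \begin{cases} \frac{1}{2} \times \left(\frac{1}{2}\right)^{k-1}, & k = 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}$

D. $f_N(k) = \begin{cases} \frac{1}{6} \times \left(\frac{5}{6}\right)^{k-1}, & k = 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}$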

Answer: B

Explanation: Here, the random variable N, representing the number of dice rolls, follows
a geometric distribution. The probability of success p is the probability of
getting a multiple of 3 (a 3 or a 6). So, we get $p = \frac{2}{6} = \frac{1}{3}$.
The pmf of a geometric distribution is given by $f_N(k) = (1 - p)^{k-1} p$.
For $p = \frac{1}{3}$, we get $f_N(k) = \frac{1}{3} \times \left(\frac{2}{3}\right)^{k-1}$.
∴ Option B is correct.

16. (1 point) Shelly wrote an exam that contains 20 multiple choice questions. Each question
has 4 options out of which only one option is correct and each question carries 1
mark. She knows the correct answer of 10 questions, and for the remaining 10 questions,
she chooses the options at random. Assume that all the questions are independent. Find
the probability that she will score 18 marks in the exam.

a. $\binom{10}{8} \times \left(\frac{1}{4}\right)^8 \times \left(\frac{3}{4}\right)^2$
b. $\binom{10}{2} \times \left(\frac{1}{4}\right)^8 \times \left(\frac{3}{4}\right)^2$
c. $\binom{10}{2} \times \left(\frac{1}{4}\right)^2 \times \left(\frac{3}{4}\right)^8$
d. $\binom{10}{8} \times \left(\frac{3}{4}\right)^8 \times \left(\frac{1}{4}\right)^2$

Answer: A, B

Explanation: Here, we need to find the probability that Shelly scores 18 marks. We
know that she knows the answer of 10 questions, which guarantees her 10 marks. Let X
be the random variable denoting the number of questions she will guess correctly. This
will bring her total marks to be 10 + X.

Now, since she guesses on a total of 10 questions, each being independent with a proba-
bility of 0.25 of being correct, we can say that X follows a binomial distribution. That
is, X ∼ Bin(n = 10, p = 0.25).

Putting it all together, we get

$$P(\text{marks} = 18) = P(X + 10 = 18) = P(X = 8) = \binom{10}{8}(0.25)^8 (0.75)^2$$

So, Option A is correct.
Also, using the property of binomial coefficients, $\binom{n}{k} = \binom{n}{n-k} \implies \binom{10}{8} = \binom{10}{2}$.
So, Option B is also correct.

∴ Options A and B are correct.
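
The equality of Options A and B can be confirmed with exact arithmetic. The sketch below uses `math.comb` (available from Python 3.8) and a small `binom_pmf` helper of our own.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_guess = Fraction(1, 4)  # chance of guessing one question correctly

# Option A: 8 correct guesses out of the 10 guessed questions.
option_a = binom_pmf(8, 10, p_guess)

# Option B: same powers, but with C(10, 2) in place of C(10, 8).
option_b = comb(10, 2) * p_guess**8 * (1 - p_guess) ** 2

print(option_a, float(option_a), option_a == option_b)
```

The probability is small (roughly 0.0004), as expected for eight lucky guesses out of ten.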

17. (1 point) Suppose the number of runs scored off a delivery is uniform in {1, 2, 3, 4, 5, 6},
independent of what happens in other deliveries. A batsman needs to bat till he hits a
four. What is the probability that he needs fewer than 6 deliveries to do so? (Answer
the question correct to two decimal places.)

Answer: 0.6

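Explanation: Let X be the random variable denoting the number of deliveries such
that the X-th delivery is the first to yield 4 runs. We can see that X here is a geometric random variable,
where p, the probability of success, is $\frac{1}{6}$. That is, $X \sim \text{Geom}\left(p = \frac{1}{6}\right)$.
So the pmf of X becomes $f_X(x) = \frac{1}{6} \times \left(\frac{5}{6}\right)^{x-1}$, $x = 1, 2, 3, \ldots$

Now, we can calculate the required probability:

$$\begin{aligned}
P(X < 6) &= \sum_{x=1}^{5} P(X = x) \\
&= \sum_{x=1}^{5} \frac{1}{6} \cdot \left(\frac{5}{6}\right)^{x-1} \\
&= \frac{1}{6} \cdot \frac{1 - \left(\frac{5}{6}\right)^5}{1 - \frac{5}{6}} \\
&= 1 - \left(\frac{5}{6}\right)^5 \\
&\approx 0.6
\end{aligned}$$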
∴ The answer is 0.6
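
Both the term-by-term sum and the closed form from the geometric series can be evaluated exactly, as in the sketch below (the helper `geom_pmf` is our own name).

```python
from fractions import Fraction

p = Fraction(1, 6)  # probability that a given delivery yields a four

def geom_pmf(x):
    """P(X = x): the first four comes on delivery x."""
    return p * (1 - p) ** (x - 1)

# Fewer than 6 deliveries means X is 1, 2, 3, 4 or 5.
direct_sum = sum(geom_pmf(x) for x in range(1, 6))
closed_form = 1 - (1 - p) ** 5

print(direct_sum, float(direct_sum), direct_sum == closed_form)
```

The exact value is 4651/7776, which rounds to 0.60.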

18. (1 point) Let X and Y be two random variables with joint PMF $f_{X,Y}(x, y)$ given below.
Calculate the covariance between X and Y.

          Y = 1    Y = 2    Y = 3
X = 1     0.25     0.25     0
X = 2     0        0.25     0.25

Answer: 0.25

Explanation: We can find the covariance between X and Y using the formula
$\text{Cov}(X, Y) = E(XY) - E(X)E(Y)$. Let us find the relevant expectations.

$$E(X) = \sum_{x=1}^{2} x P(X = x) = 0.5 + (2)(0.5) = 1.5$$

$$E(Y) = \sum_{y=1}^{3} y P(Y = y) = 0.25 + (2)(0.5) + (3)(0.25) = 2$$

$$\begin{aligned}
E(XY) &= \sum_{x,y} xy\,P(X = x \cap Y = y) \\
&= \sum_{y=1}^{3} \sum_{x=1}^{2} xy\,P(X = x \cap Y = y) \\
&= (1)(0.25) + (2)(0.25) + (4)(0.25) + (6)(0.25) \\
&= 3.25
\end{aligned}$$

Putting it all together, we get

$$\text{Cov}(X, Y) = E(XY) - E(X)E(Y) = 3.25 - (2)(1.5) = 0.25$$

∴ The answer is 0.25.
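
The expectations above can be computed directly from the joint table. In this sketch the table is stored as a dictionary keyed by (x, y), with X taking the values 1, 2 and Y the values 1, 2, 3, which is the reading used in the explanation; the helper `expectation` is our own.

```python
# Joint PMF keyed by (x, y); this orientation follows the explanation above.
joint = {
    (1, 1): 0.25, (1, 2): 0.25, (1, 3): 0.0,
    (2, 1): 0.0,  (2, 2): 0.25, (2, 3): 0.25,
}

def expectation(g):
    """E[g(X, Y)] as a probability-weighted sum over the table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expectation(lambda x, y: x)
e_y = expectation(lambda x, y: y)
e_xy = expectation(lambda x, y: x * y)
cov = e_xy - e_x * e_y

print(e_x, e_y, e_xy, cov)
```

Since covariance is symmetric in X and Y, swapping the roles of rows and columns would give the same result.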

19. (1 point) Let X and Y be two random variables with joint PMF $f_{X,Y}(x, y)$ given below.
Calculate $f_{Y|X=2}(2)$.

          Y = 1    Y = 2    Y = 3
X = 1     0.25     0.25     0
X = 2     0.125    a1       0.125

Answer: 0.5

Explanation: To find the required probability, we must first find $a_1$. We can use the
fact that the total probability is 1.

$$\begin{aligned}
\sum_{x,y} P(X = x \cap Y = y) &= 1 \\
0.25 + 0.25 + 0.125 + a_1 + 0.125 &= 1 \\
0.75 + a_1 &= 1 \\
a_1 &= 0.25
\end{aligned}$$

So, calculating the required probability:

$$\begin{aligned}
f_{Y|X=2}(2) &= P(Y = 2 \mid X = 2) \\
&= \frac{P(Y = 2 \cap X = 2)}{P(X = 2)} \\
&= \frac{a_1}{\sum_{y=1}^{3} P(X = 2 \cap Y = y)} \\
&= \frac{a_1}{0.25 + a_1} \\
&= \frac{0.25}{0.5} \\
&= 0.5
\end{aligned}$$

∴ The answer is 0.5.
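
The two steps (fill in the missing entry, then condition on X = 2) can be sketched as follows. As in the explanation, X is taken to have the values 1, 2 and Y the values 1, 2, 3; the variable names are ours.

```python
# Known entries of the joint PMF, keyed by (x, y); the (2, 2) entry is a1.
known = {
    (1, 1): 0.25,  (1, 2): 0.25, (1, 3): 0.0,
    (2, 1): 0.125,               (2, 3): 0.125,
}

# All entries must sum to 1, which determines a1.
a1 = 1.0 - sum(known.values())

joint = dict(known)
joint[(2, 2)] = a1

# Marginal P(X = 2), then the conditional P(Y = 2 | X = 2).
p_x2 = sum(p for (x, y), p in joint.items() if x == 2)
cond = joint[(2, 2)] / p_x2

print(a1, p_x2, cond)
```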

20. (1 point) Two random variables X and Y are jointly distributed with PMF

$$f_{X,Y}(x, y) = \begin{cases} ax + \dfrac{y}{4}, & x, y \in \{0, 1\} \\ 0, & \text{otherwise} \end{cases}$$

Calculate the value of a.

Answer: 0.25

Explanation: For this question we can find the value of a by using the fact that the total
probability is 1.

$$\begin{aligned}
\sum_{x,y} f_{X,Y}(x, y) &= 1 \\
f_{X,Y}(0, 0) + f_{X,Y}(1, 0) + f_{X,Y}(0, 1) + f_{X,Y}(1, 1) &= 1 \\
(0) + (a) + \left(\frac{1}{4}\right) + \left(a + \frac{1}{4}\right) &= 1 \\
2a + 0.5 = 1 &\implies a = 0.25
\end{aligned}$$

∴ The answer is 0.25

21. (1 point) A discrete random variable X has PMF as follows


$$P(X = x) = \begin{cases} k \times (1 - x)^2, & x = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases}$$

Calculate the value of k.

Answer: 0.2

Explanation: For this question we can find the value of k by using the fact that the total
probability is 1.

$$\begin{aligned}
1 &= \sum_x P(X = x) \\
\implies 1 &= P(X = 1) + P(X = 2) + P(X = 3) \\
\implies 1 &= (0) + (k) + (4k) \\
\implies 1 &= 5k \implies k = 0.2
\end{aligned}$$

∴ The answer is 0.2.


22. (1 point) A discrete random variable X has PMF as given below, where a, b, c are constants.

x            1      2      3      4
P(X = x)     a      b      c      0.3

The CDF $F_X(x)$ is given below.

x            1      2      3      4
F_X(x)       0.2    0.6    0.7    d

Find the value of a + b + c + d.

Answer: 1.7

Explanation: Let us look at the pmf first. Since the total probability is 1, we know
that $\sum_{x=1}^{4} P(X = x) = 1 \implies a + b + c + 0.3 = 1 \implies a + b + c = 0.7$.

Now, since X takes only the values 1, 2, 3, 4, we have $P(X \le 4) = 1 \implies F_X(4) = 1 \implies d = 1$.

So, a + b + c + d = 1.7

∴ The answer is 1.7.

23. (1 point) A series of four matches is played between India and England. Let the random
variable X represent the absolute difference in the number of matches won by India and
England. Find the set of possible values that X can take. (Assume that the match does
not result in a tie.)

A. {0,2,4}
B. {0,1,2,4}
C. {0,1,2,3,4}
D. {0,1,3,4}

Answer: A

Explanation: Let Y be the random variable denoting the number of wins for India.
Since 4 matches are played and none is tied, the number of wins for England is $4 - Y$.
X is the absolute difference in the number of wins, so $X = |Y - (4 - Y)| = |2Y - 4|$.
Since Y represents the number of wins, Y can take on the values 0, 1, 2, 3, 4, for which
$|2Y - 4|$ equals 4, 2, 0, 2, 4 respectively. This implies X can take on the values 0, 2, 4.

∴ Option A is correct.
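
The set of possible values can also be found by brute force over all 16 possible series results. In this sketch each match is labelled "I" (India wins) or "E" (England wins); the labels are our own.

```python
from itertools import product

# Every possible result of a four-match series with no ties.
series_results = list(product("IE", repeat=4))

# Absolute difference in wins for each result, collected into a set.
possible_values = sorted(
    {abs(result.count("I") - result.count("E")) for result in series_results}
)
print(possible_values)
```

Only even differences appear, because the two win counts always add up to 4.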

24. (1 point) There are five multiple choice questions asked in an exam. There is a 70% chance
that Shelly will solve a question correctly, independently of the other questions. Let
X be the random variable that represents the number of questions she solves correctly.
Which of the following is the probability mass function of X?

A. $P(X = x) = \begin{cases} 0.00243, & x = 0 \\ 0.02835, & x = 1 \\ 0.1323, & x = 2 \\ 0.3087, & x = 3 \\ 0.36015, & x = 4 \\ 0.16807, & x = 5 \end{cases}$

B. $P(X = x) = \begin{cases} 0.00243, & x = 0 \\ 0.02835, & x = 1 \\ 0.16807, & x = 2 \\ 0.3087, & x = 3 \\ 0.36015, & x = 4 \\ 0.1323, & x = 5 \end{cases}$

C. $P(X = x) = \begin{cases} 0.00243, & x = 0 \\ 0.02835, & x = 1 \\ 0.3087, & x = 2 \\ 0.1323, & x = 3 \\ 0.36015, & x = 4 \\ 0.16807, & x = 5 \end{cases}$

D. $P(X = x) = \begin{cases} 0.00243, & x = 0 \\ 0.01835, & x = 1 \\ 0.1223, & x = 2 \\ 0.2987, & x = 3 \\ 0.37015, & x = 4 \\ 0.19807, & x = 5 \end{cases}$

Answer: A

Explanation: Let X be the random variable denoting the number of questions solved
correctly. There are 5 questions in total, and each is solved correctly with probability
0.7, independently of the others. This means that X follows a binomial distribution.
That is, $X \sim \text{Binom}(n = 5, p = 0.7) \implies P(X = x) = \binom{5}{x}(0.7)^x (0.3)^{5-x}$.
Simplifying for each value of x,

$$P(X = x) = \begin{cases} 0.00243, & x = 0 \\ 0.02835, & x = 1 \\ 0.1323, & x = 2 \\ 0.3087, & x = 3 \\ 0.36015, & x = 4 \\ 0.16807, & x = 5 \end{cases}$$

∴ Option A is correct.
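
The six values can be generated from the binomial formula and compared against Option A. This is a minimal sketch; the tolerance of 1e-9 is an arbitrary choice that allows for floating-point rounding.

```python
from math import comb

n, p = 5, 0.7  # five questions, each solved correctly with probability 0.7

# P(X = x) for x = 0, 1, ..., 5 from the binomial formula.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

option_a = [0.00243, 0.02835, 0.1323, 0.3087, 0.36015, 0.16807]
matches_option_a = all(abs(u - v) < 1e-9 for u, v in zip(pmf, option_a))

print([round(v, 5) for v in pmf], matches_option_a)
```

A useful sanity check on any candidate PMF is that its values sum to 1, which holds for the list computed here.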
