Applied Statistics - Revision Exercise Solution
Question 1
Let X be the volume of a box of “Healthy Milk”, X ~ N(1.05, 0.02²)
(a) P(X < 1) = P(Z < (1 − 1.05)/0.02) = P(Z < -2.5) = 0.5 - 0.4938 = 0.0062
(b) P(X > K) = 0.05,
P(1.05 < X < K) = 0.5 – 0.05 = 0.45
as P(0 < Z < 1.645) = 0.45 from table
(K − 1.05)/0.02 = 1.645, K = 1.05 + 0.02(1.645) = 1.0829
(c) Let Y be the number of boxes of “Healthy Milk”, out of 6, that contain less than 1.00 liter, Y ~ Bin(6, 0.0062)
P(Y = 1) = 6C1 (0.0062)¹(0.9938)⁵ = 0.0361
(d) Let W be the volume of a box of “Healthy Milk” after the change, W ~ N(1.03, 0.02²)
P(W < 1) = P(Z < (1 − 1.03)/0.02) = P(Z < -1.5) = 0.5 - 0.4332 = 0.0668
The probability that a box of “Healthy Milk” contains less than 1 liter increases after the change, from 0.0062 to 0.0668.
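The table look-ups in Question 1 can be cross-checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and part (c) deliberately reuses the rounded table value 0.0062, as the solution does.

```python
from math import comb
from statistics import NormalDist

# Volume before the change: mean 1.05 L, standard deviation 0.02 L.
before = NormalDist(mu=1.05, sigma=0.02)

p_under_before = before.cdf(1.0)   # (a) P(X < 1)
k = before.inv_cdf(0.95)           # (b) K such that P(X > K) = 0.05

# (c) Exactly one of six boxes is under 1 L, with p taken from part (a).
p = 0.0062
p_exactly_one = comb(6, 1) * p * (1 - p) ** 5

# (d) Volume after the change: mean 1.03 L, same standard deviation.
after = NormalDist(mu=1.03, sigma=0.02)
p_under_after = after.cdf(1.0)

print(round(p_under_before, 4), round(k, 4),
      round(p_exactly_one, 4), round(p_under_after, 4))
```

Software values can differ from table values in the last decimal place because the table rounds z to two or three decimals.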
Question 2
(a) k = (110 2/3) × 15 – 98 – 100 – 104 - … - 120 = 94
(a) Q1 = 104 grams (as i = (25/100) × 15 = 3.75 ↑ 4)
median = 115 grams (as i = (50/100) × 15 = 7.5 ↑ 8)
Q3 = 117 grams (as i = (75/100) × 15 = 11.25 ↑ 12)
(b) It is a left-skewed distribution, as Q2 – Q1 = 115 – 104 = 11 > Q3 – Q2 = 117 – 115 = 2
(c) Sample mean price of a Japanese peach = (110 2/3)(0.25) = $27.67
Median price of a Japanese peach = 115(0.25) = $28.75
Sample standard deviation of the price of a Japanese peach = 8.3124(0.25) = $2.0781
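The position rule and the price conversion in Question 2 can be sketched as follows. This is only an illustration: the helper `position` is a hypothetical name, it covers only the case where i is not a whole number (the only case that occurs here), and the factor 0.25 is assumed to be a price in dollars per gram, which is how the solution uses it.

```python
from math import ceil

N = 15  # number of observations in the sample


def position(percent, n=N):
    """Position i = (percent/100) * n, rounded up to the next whole number."""
    return ceil(percent * n / 100)


positions = {p: position(p) for p in (25, 50, 75)}

# Multiplying every weight by a constant multiplies the mean, the median
# and the standard deviation by that same constant.
PRICE_PER_GRAM = 0.25            # assumed unit: dollars per gram
mean_price = (1660 / 15) * PRICE_PER_GRAM   # 1660 = (110 2/3) * 15
median_price = 115 * PRICE_PER_GRAM
sd_price = 8.3124 * PRICE_PER_GRAM

print(positions, round(mean_price, 2), median_price, round(sd_price, 4))
```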
Question 3
(a) Define D = number of labor hours lost before the program − number of labor hours lost after the program
Plant     1    2    3    4    5    6    7    8    9   10
Before   45   73   46  124   33   57   83   34   26   17
After    36   60   44  119   35   51   77   29   24   11
d         9   13    2    5   -2    6    6    5    2    6
Step 1: H0: µd = 0 v.s. H1: µd > 0
Step 2: As d̄ = 5.2 and s = 4.08 (mean and standard deviation of the differences d),
t = (5.2 − 0)/(4.08/√10) = 4.03
Step 3: Reject the null hypothesis if t > 1.833, df = 10 − 1 = 9
Step 4: Since t = 4.03 > 1.833, H0 is rejected at 5% significance level
There is sufficient evidence to conclude that weekly loss of labor hours due to accidents after the
implementation of the program is significantly less than before.
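The paired t statistic can be recomputed directly from the table. This is a minimal sketch using the standard library; the critical value 1.833 is copied from the solution (t table, 9 d.f., 5% one-tailed) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

before = [45, 73, 46, 124, 33, 57, 83, 34, 26, 17]
after = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]

# Differences D = before - after for each plant.
d = [b - a for b, a in zip(before, after)]

d_bar = mean(d)      # sample mean of the differences
s_d = stdev(d)       # sample standard deviation (n - 1 in the denominator)
t = (d_bar - 0) / (s_d / sqrt(len(d)))

T_CRIT = 1.833       # from the t table, as quoted in Step 3
reject = t > T_CRIT

print(d, round(d_bar, 1), round(s_d, 2), round(t, 2), reject)
```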
(b) Using the data in the table for the period after the program was put into operation:
x̄ = 48.6, s = 31.0312, n = 10, d.f. = 9, t(0.05, 9) = 1.833
90% C.I. of µ = (48.6 − 1.833 × 31.0312/√10, 48.6 + 1.833 × 31.0312/√10) = (30.6129, 66.5871)
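As a check on the interval in part (b), the sketch below recomputes the sample statistics from the "After" row and applies x̄ ± t × s/√n. The t value 1.833 is again taken from the table as quoted in the solution.

```python
from math import sqrt
from statistics import mean, stdev

after = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]

n = len(after)
x_bar = mean(after)
s = stdev(after)

T_CRIT = 1.833                      # t(0.05, 9) from the table
margin = T_CRIT * s / sqrt(n)       # half-width of the 90% interval
ci = (x_bar - margin, x_bar + margin)

print(round(x_bar, 1), round(s, 4), tuple(round(v, 4) for v in ci))
```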
Question 4
(a) P(earns below $40000) = (230 + 500 + 98 + 172)/(450 + 150 + 230 + 500 + 98 + 172) = 0.625
(b) P(Master-Degree holder) = (450 + 230 + 98)/(450 + 150 + 230 + 500 + 98 + 172) = 0.4863
(c) P(earns below $20000 | Master) = 98/(450 + 230 + 98) = 0.1260
(d) P(non-Master holder | earns $20000 or above) = (150 + 500)/(450 + 150 + 230 + 500) = 0.4887
(e) P(Master-Degree holder | earns $35000) = 230/(230 + 500) = 0.3151
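The five probabilities can be reproduced from a two-way table of counts. The layout below is an assumption inferred from the fractions in the solution (the question's own table is not reproduced here), and the income-band labels are illustrative names only.

```python
# Counts inferred from the solution's fractions; band labels are assumed.
counts = {
    "master":     {"below_20k": 98,  "20k_to_40k": 230, "40k_or_above": 450},
    "non_master": {"below_20k": 172, "20k_to_40k": 500, "40k_or_above": 150},
}

total = sum(sum(row.values()) for row in counts.values())


def band_total(band):
    """Number of people in one income band, across both degree groups."""
    return sum(row[band] for row in counts.values())


p_a = (band_total("below_20k") + band_total("20k_to_40k")) / total
p_b = sum(counts["master"].values()) / total
p_c = counts["master"]["below_20k"] / sum(counts["master"].values())

at_least_20k = band_total("20k_to_40k") + band_total("40k_or_above")
p_d = (counts["non_master"]["20k_to_40k"]
       + counts["non_master"]["40k_or_above"]) / at_least_20k

# $35000 falls in the middle band, so condition on that band only.
p_e = counts["master"]["20k_to_40k"] / band_total("20k_to_40k")

print([round(p, 4) for p in (p_a, p_b, p_c, p_d, p_e)])
```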
Question 5
Step 1: H0: Central : Wan Chai : Causeway Bay = 1 : 1 : 1
H1: Central : Wan Chai : Causeway Bay ≠ 1 : 1 : 1
Step 2: Expected frequency for each center: 350 × (1/3) = 116.67
χ² = (100 − 116.67)²/116.67 + (110 − 116.67)²/116.67 + (140 − 116.67)²/116.67 = 7.43
Step 3: d.f. = 3 – 1 = 2, reject H0 when χ² > 5.991
Step 4: As χ² = 7.43 > 5.991, H0 is rejected at 5% significance level.
It is concluded that the ratio of patients who took the training program at Central : Wan Chai : Causeway Bay is different from “1 : 1 : 1”.
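The goodness-of-fit statistic can be verified with a few lines of Python. As an extra check that is not part of the original solution, the sketch also computes a p-value: for 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(−x/2), so no statistics package is needed.

```python
from math import exp

observed = [100, 110, 140]            # Central, Wan Chai, Causeway Bay
n = sum(observed)
expected = [n / 3] * 3                # equal ratio 1 : 1 : 1 under H0

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL = 5.991                      # chi-square table, 2 d.f., 5% level
reject = chi2 > CRITICAL

# For 2 d.f. only: P(chi-square > x) = exp(-x / 2).
p_value = exp(-chi2 / 2)

print(round(chi2, 2), reject, round(p_value, 4))
```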
Question 6
Let X be the weight of a bag of cookies. Then X ~ N(510, 4²).
(a) P(X < 500) = P(Z < (500 − 510)/4) = P(Z < −2.5) = 0.5 − 0.4938 = 0.0062.
(b) P(X < K) = 0.1
P(K < X < 510) = 0.5 − 0.1 = 0.4
As P(−1.28 < Z < 0) = 0.4 from table
(K − 510)/4 = −1.28 ∴ K = 510 + 4(−1.28) = 504.88
(c) Define T = X1 + X2, where X1 and X2 are the weights of two independent bags. Then T ~ N(1020, 32), since Var(T) = 4² + 4² = 32.
P(T < 1030) = P(Z < (1030 − 1020)/√32) = P(Z < 1.77) = 0.5 + 0.4616 = 0.9616
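Parts (b) and (c) can be checked with `statistics.NormalDist`, which supports adding two independent normal variables. This is a sketch under the same independence assumption the solution relies on; small differences from the table answers come from rounding z to two decimals.

```python
from statistics import NormalDist

x = NormalDist(mu=510, sigma=4)      # weight of one bag

# (b) 10th percentile of X.
k = x.inv_cdf(0.10)

# (c) Sum of two independent bags: means add and variances add.
t = x + x
p_c = t.cdf(1030)

print(round(k, 2), t.mean, round(t.variance, 6), round(p_c, 4))
```

Note that `x + x` models two independent bags, not twice the weight of one bag; the latter would be `x * 2`, with variance 64 instead of 32.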
Question 7
(a) r = 0.4513, which indicates a moderate positive correlation
(b) y = 7.7162 + 2.4324 x
(c) When the GPA is zero, the monthly income is $7716.2
For every extra unit of the GPA, the monthly income increases by $2432.4
(d) y = 7.7162 + 2.4324 (4) = 17.4459
i.e. estimated monthly income =$17445.9
The estimate is unreliable because it is obtained by extrapolation.
Question 8
(a) Test whether the population mean test score is lower than 60 marks, given that the population standard deviation is 13 marks:
Step 1: H0: μ = 60 v.s. H1: μ < 60
Step 2: x̄ = 58.3333 (from calculator)
z = (58.3333 − 60)/(13/√15) = -0.4965
Step 3: Critical value for one-tail test at 5% = -1.645
Reject H0 when z < -1.645
Step 4: Since -0.4965 > -1.645, H0 is not rejected at 5% significance level.
There is insufficient evidence to conclude that the population mean test score is lower than 60 marks.
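The one-tailed z-test can be sketched as below. The sample mean 58.3333 and n = 15 are taken from the solution; the p-value is an additional check that the solution itself does not report.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 58.3333      # sample mean, as given in Step 2
mu_0 = 60            # hypothesised population mean
sigma = 13           # known population standard deviation
n = 15

z = (x_bar - mu_0) / (sigma / sqrt(n))

std_normal = NormalDist()
critical = std_normal.inv_cdf(0.05)   # lower-tail 5% critical value
p_value = std_normal.cdf(z)           # lower-tail p-value for H1: mu < 60

reject = z < critical

print(round(z, 4), round(critical, 3), round(p_value, 4), reject)
```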
(b) For the estimation of the population proportion of employees who evaluate the training course positively, given n = 31, p̂ = 25/31 = 0.8065
98% C.I. of p = (0.8065 − 2.33√(0.8065(1 − 0.8065)/31), 0.8065 + 2.33√(0.8065(1 − 0.8065)/31))
= (0.6412, 0.9718)
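The interval for the proportion can be recomputed as follows. The sketch uses the table value z = 2.33 quoted in the solution and the unrounded p̂ = 25/31, so the lower limit may differ from the printed answer by about 0.0001.

```python
from math import sqrt

n = 31
p_hat = 25 / n

Z_98 = 2.33                                   # table value for 98% confidence
se = sqrt(p_hat * (1 - p_hat) / n)            # estimated standard error
ci = (p_hat - Z_98 * se, p_hat + Z_98 * se)

print(round(p_hat, 4), tuple(round(v, 4) for v in ci))
```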
Question 9
(a) 1/12 + 2/12 + 3/12 + k + 2/12 + 1/12 = 1, k = 1/4
(b) E(X) = 1(1/12) + 2(2/12) + 3(3/12) + 4(1/4) + 5(2/12) + 6(1/12) = 3.5
(c) Var(X) = 1²(1/12) + 2²(2/12) + 3²(3/12) + 4²(1/4) + 5²(2/12) + 6²(1/12) − 3.5² = 1.917
(d) Use Y to denote profit made from each customer in a single purchase of optical mouse,
Y = (40 - 10)X = 30X
𝐸(𝑌) = 30𝐸(𝑋) = 30 × 3.5 = 105
Var(Y) = 30² Var(X) = 900 × (23/12) = 1725 (using the unrounded Var(X) = 23/12)
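Exact fractions avoid the rounding that creeps in when Var(X) is truncated to 1.917 before being multiplied by 900. The following sketch recomputes Question 9 with `fractions.Fraction`; the dictionary name `pmf` is illustrative.

```python
from fractions import Fraction as F

# Known probabilities; P(X = 4) = k is found from the total of 1.
pmf = {1: F(1, 12), 2: F(2, 12), 3: F(3, 12), 5: F(2, 12), 6: F(1, 12)}
pmf[4] = 1 - sum(pmf.values())

mean_x = sum(x * p for x, p in pmf.items())
var_x = sum(x ** 2 * p for x, p in pmf.items()) - mean_x ** 2

# Y = 30X, so E(Y) = 30 E(X) and Var(Y) = 30^2 Var(X).
mean_y = 30 * mean_x
var_y = 30 ** 2 * var_x

print(pmf[4], mean_x, var_x, mean_y, var_y)
```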
Question 10
Step 1: H0: by railway : by other transportation = 3 : 7
H1: by railway : by other transportation ≠ 3 : 7
Step 2: Expected number of passengers arriving at the airport by railway = 200(0.3) = 60
Expected number of passengers arriving at the airport by other transportation = 200(0.7) = 140
χ² = Σ (O − E)²/E = (40 − 60)²/60 + (160 − 140)²/140 = 9.5238
Step 3: d.f. = 1, critical value = 3.841, reject H0 when χ2 > 3.841
Step 4: As χ2 = 9.5238 > 3.841, H0 is rejected at 5% significance level.
There is sufficient evidence that the proportion of passengers arriving at the airport by railway is different from 30%.
Hence, the statement made by the director is likely to be wrong.
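With only two categories (1 degree of freedom), the chi-square statistic equals the square of the z statistic from a one-sample proportion test, which gives an independent way to check the value 9.5238. The sketch below computes both; the two-tailed p-value from the normal distribution is an extra that the solution does not report.

```python
from math import sqrt
from statistics import NormalDist

n = 200
observed = {"railway": 40, "other": 160}
p0 = {"railway": 0.3, "other": 0.7}          # proportions under H0

expected = {key: n * p for key, p in p0.items()}
chi2 = sum((observed[key] - expected[key]) ** 2 / expected[key]
           for key in observed)

# Equivalent z statistic for the railway proportion.
p_hat = observed["railway"] / n
z = (p_hat - p0["railway"]) / sqrt(p0["railway"] * p0["other"] / n)

# Two-tailed p-value; matches the chi-square test with 1 d.f.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(chi2, 4), round(z, 4), round(z ** 2, 4), round(p_value, 4))
```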
Question 11
(a) As total probability = 1,
0.7 + 4k + 3k + 2k + k = 1,
0.7 + 10k = 1,
k = 0.03
(b) E(X) = 1(0.7) + 2(0.12) + 3(0.09) + 4(0.06) + 5(0.03) = 1.6
Var(X) = 1²(0.7) + 2²(0.12) + 3²(0.09) + 4²(0.06) + 5²(0.03) - 1.6² = 1.14
σ(X) = √1.14 = 1.0677
(c) Y = 3X - 5
E(Y) = 3E(X) - 5 = -0.2
σ(Y) = 3σ(X) = 3.2031
(d) Use W to denote the number of clients, out of 9, who have exactly 5 mortgages, W ~ Bin(9, 0.03),
P(W ≤ 2) = p(0) + p(1) + p(2) = (0.97)⁹ + 9C1 (0.03)¹(0.97)⁸ + 9C2 (0.03)²(0.97)⁷ = 0.9980
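The steps of Question 11 can be chained in one short script. This is a sketch with illustrative names; it solves for k from the total probability, then reuses P(X = 5) = k as the success probability of the binomial in part (d).

```python
from math import comb, sqrt

# (a) 0.7 + 4k + 3k + 2k + k = 1  =>  k = 0.3 / 10
k = (1 - 0.7) / 10
pmf = {1: 0.7, 2: 4 * k, 3: 3 * k, 4: 2 * k, 5: k}

# (b) Mean, variance and standard deviation of X.
mean_x = sum(x * p for x, p in pmf.items())
var_x = sum(x ** 2 * p for x, p in pmf.items()) - mean_x ** 2
sd_x = sqrt(var_x)

# (c) Y = 3X - 5: the shift changes the mean only, the scale changes both.
mean_y = 3 * mean_x - 5
sd_y = 3 * sd_x


# (d) W ~ Bin(9, k): cumulative probability P(W <= 2).
def binom_pmf(w, n, p):
    return comb(n, w) * p ** w * (1 - p) ** (n - w)


p_at_most_2 = sum(binom_pmf(w, 9, pmf[5]) for w in range(3))

print(round(k, 2), round(mean_x, 2), round(var_x, 2), round(sd_x, 4),
      round(mean_y, 2), round(sd_y, 4), round(p_at_most_2, 4))
```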
Question 12
(a) combined sample mean = (40 × 2500 + 45 × 2700)/(40 + 45) = $2605.8824
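The combined mean is a weighted average of the group means, with the sample sizes as weights. A minimal sketch, where `combined_mean` is a hypothetical helper name:

```python
def combined_mean(sizes, means):
    """Weighted average of group means, weighted by group sample size."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


result = combined_mean([40, 45], [2500, 2700])
print(round(result, 4))
```

Because the second group is larger, the combined mean lies slightly above the midpoint 2600 of the two group means.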
(b) Step 1: H0: μ_primary – μ_secondary = 0 vs H1: μ_primary – μ_secondary < 0
Step 2: calculated t = -3.67
Step 3: p-value = 0.0002
Step 4: Since p-value = 0.0002 < 0.05, H0 is rejected.
There is sufficient evidence to conclude that the population mean monthly spending on extracurricular activities by secondary school students is higher than that by primary school students.
Question 13
(ai) All 2000 convenience stores of this convenience store company
(aii) Sample size of large convenience store = 750/2000×40 = 15
Sample size of medium convenience store = 600/2000×40 = 12
(b) I: by bus
(c) After 5% off discount, population mean spending = 120(0.95) = $114
For normal distribution, median = mean, so median spending = $114
(d) P(university graduate | less than 5 years working experience) = (200 − 150)/(900 − 400) = 1/10 = 0.1
(e) VI: Two-tailed z-test for a proportion
(f) VII: ANOVA test
(g) I: One-tailed z-test for a mean