Sampling
Basic Definitions
• Average (Mean) : For ungrouped data $\mu = \frac{\sum x_i}{n}$, or for grouped data $\mu = \frac{\sum f_i x_i}{\sum f_i}$
• Standard Deviation : $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$
• Population or Universe : The group of individuals from which we draw data for a study
• Sample : Finite subset of the population
• Sampling : The process of selecting a sample from a population
• Sample size : The number of individuals in a sample
• Parameter : A statistical constant of the population, such as its mean or S.D.
• Statistic : A statistical constant of the sample, such as its mean or S.D.
• Symbols used for the population and the sample:
  Population                   Sample
  Parameter                    Statistic
  Population size ($N$)        Sample size ($n$)
  Population mean ($\mu$)      Sample mean ($\mu_{\bar{x}}$)
  Population S.D. ($\sigma$)   Sample S.D. ($\sigma_{\bar{x}}$ or $s$)
• Sampling distribution: Let us consider a population of size 𝑁 and let us draw all possible samples
of a given size $n$. For each of these samples, we compute a statistic (e.g., sample mean, sample
variance, sample proportion). The value of the statistic may vary from sample to sample.
• “The distribution of values of the statistic for different samples of the same size is called sampling
distribution of the statistic”.
• When we obtain a distribution of mean, it is called Sampling distribution of mean and when we
obtain a distribution of proportion, it is called Sampling distribution of proportion
• Standard error : The standard deviation of sampling distribution is called the standard error (S.E)
• The relation between the mean of the sampling distribution and the population mean: $\mu_{\bar{x}} = \mu$
• The relation between the variance of the sampling distribution and the population variance:
• With replacement → $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$; without replacement → $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$
Steps to find Mean $\bar{x}$ and Standard deviation ($s$) on a CASIO fx-991ES (or later) calculator
For ungrouped data
Step 1: MODE >> 3: STAT >> 1: 1-VAR >> {Enter the x values} >> AC
• All possible samples of size 2 with replacement from the population {3, 7, 11, 15}: {(3,3), (3,7), (3,11), (3,15), (7,3), (7,7), (7,11), (7,15), (11,3),
(11,7), (11,11), (11,15), (15,3), (15,7), (15,11), (15,15)}
• Sample means={3,5,7,9,5,7,9,11,7,9,11,13,9,11,13,15}
• Distribution of the sample means
𝑥 3 5 7 9 11 13 15
𝑓 1 2 3 4 3 2 1
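• Mean of the sample means : $\mu_{\bar{x}} = \frac{\sum f x}{\sum f} = \frac{144}{16} = 9$
• Variance of the sample means : $\sigma_{\bar{x}}^2 = \frac{\sum f x^2}{\sum f} - \mu_{\bar{x}}^2 = \frac{1456}{16} - 9^2 = 10$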
• All possible samples of size 2 without replacement :{(3,7), (3,11), (3,15), (7,3),(7,11), (7,15), (11,3),(11,7),
(11,15),(15,3),(15,7),(15,11)}
• Sample means={5,7,9,5,9,11,7,9,13,9,11,13} (Example: Mean of (3,7) is (3+7)/2=5)
• Distribution of the sample means
𝑥 5 7 9 11 13
𝑓 2 2 4 2 2
• Mean of the sample means : $\mu_{\bar{x}} = \frac{\sum f x}{\sum f} = \frac{108}{12} = 9$
• Variance of the sample means : $\sigma_{\bar{x}}^2 = \frac{\sum f x^2}{\sum f} - \mu_{\bar{x}}^2 = \frac{1052}{12} - 9^2 = \frac{80}{12} = \frac{20}{3}$
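The two worked distributions above can be checked by brute force. The following is a minimal sketch, assuming the population is {3, 7, 11, 15} (inferred from the listed samples); the variable names are illustrative only.

```python
from itertools import permutations, product
from statistics import mean, pvariance

# Assumption: the population is inferred from the samples listed in the notes.
population = [3, 7, 11, 15]
N, n = len(population), 2

# Ordered samples of size 2, with and without replacement.
with_repl = [mean(s) for s in product(population, repeat=n)]
without_repl = [mean(s) for s in permutations(population, n)]

sigma2 = pvariance(population)  # population variance (divisor N)

# Mean of the sample means equals the population mean in both cases.
print(mean(with_repl), mean(without_repl), mean(population))

# Variance of the sample means versus the two formulas.
print(pvariance(with_repl), sigma2 / n)
print(pvariance(without_repl), sigma2 / n * (N - n) / (N - 1))
```

Both printed pairs of variances should agree, which illustrates the with-replacement and without-replacement relations on this small population.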
Conclusion
• Test statistic : A quantity derived from the sample for statistical hypothesis testing.
• Critical region : The test procedure divides the possible values of the test statistic into two regions namely an
acceptance region for 𝐻0 and a rejection region for 𝐻0 . The region where 𝐻0 is rejected is known as the critical
region
• Level of Significance (LoS) : The probability of rejecting 𝐻0 when it is true. Usually we take 5% and 1% LoS.
• One/Two tailed test : The nature of the critical region depends on the alternative hypothesis $H_1$
• For example, if $H_0: \mu = \mu_0$ and
if $H_1: \mu < \mu_0$ then we choose the critical region of a one-tailed test (left)
if $H_1: \mu > \mu_0$ then we choose the critical region of a one-tailed test (right)
if $H_1: \mu \neq \mu_0$ then we choose the critical region of a two-tailed test
• Test procedure
The steps in the application of a statistical test procedure for testing a null hypothesis are as follows:
• Setting up the null hypothesis.
• Setting up the alternative hypothesis.
• Identifying the test statistic.
• Setting a suitable level of significance such as 1% or 5%.(Default 5%)
• Identifying the critical region.
• Making the decision based on the calculated value $T$ and the critical value $T_\alpha$.
• Accept $H_0$ if $|T| < T_\alpha$, or reject $H_0$ if $|T| > T_\alpha$
Central Limit Theorem
The distribution of sample means approximates a normal distribution as the sample size gets larger, regardless of
the population's distribution, i.e., the standardized sample mean is given by
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Z-test
• For large samples (n ≥ 30), most of the sampling distributions tend to normality, and so, the test may be based
on normal distribution.
• Critical values $z_\alpha$ for the $z$-test (also used for confidence limits):
  Level of significance    1%      5%
  Two-tailed test          2.58    1.96
  One-tailed test          2.33    1.645
Z-test for single mean
• Null Hypothesis $H_0: \mu = \mu_0$ (There is no significant difference between the means of the population and the
sample)
• $z$-statistic is given by $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
• $\bar{x}$ → Mean of the sample
• $\mu_0$ → Population mean assumed under $H_0$
• $\sigma$ → Standard deviation
• $n$ → Size of the sample
• Accept $H_0$ if $|z| < z_\alpha$, or reject $H_0$ if $|z| > z_\alpha$
1) A sample of 900 members is found to have a mean of 3.4 cm. Can it be reasonably regarded as a truly random
sample from a large population with mean 3.25 cm and S.D. 1.61 cm?
Solution: $n = 900$, $\bar{x} = 3.4$, $\sigma = 1.61$,
$H_0: \mu = 3.25$ (the sample is from the large population with mean 3.25),
$H_1: \mu \neq 3.25$ ⇒ Two-tailed test ⇒ at 5% LoS $z_\alpha = 1.96$, at 1% LoS $z_\alpha = 2.58$
$|z| = \left|\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right| = \frac{0.15}{1.61/30} = 2.795 > z_\alpha$ at both LoS
⇒ Reject $H_0$ at both 5% and 1% LoS.
2) An ambulance service claims that it takes, on average, less than 10 minutes to reach its destination in
emergency calls. A sample of 36 calls has a mean of 11 minutes and a variance of 16 min². Test the
significance at the 0.05 level.
Solution: $n = 36$, $\bar{x} = 11$, $\sigma^2 = 16$,
$H_0: \mu = 10$ (the average response time is 10 min),
$H_1: \mu < 10$ (the average response time is less than 10 min) ⇒ One-tailed (left) test ⇒ at 5% LoS $z_\alpha = 1.645$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{11 - 10}{4/6} = 1.5 > -z_\alpha$, so $z$ does not fall in the left-tail critical region
⇒ Accept $H_0$ at 5% LoS. Hence reject the ambulance service's claim.
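The two single-mean examples above can be reproduced with a few lines of code. This is a minimal sketch; the function name `z_single_mean` is illustrative and not part of the notes.

```python
from math import sqrt


def z_single_mean(x_bar, mu0, sigma, n):
    """z-statistic for a single mean: (x_bar - mu0) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# Example 1: n = 900, sample mean 3.4, population mean 3.25, S.D. 1.61
z1 = z_single_mean(3.4, 3.25, 1.61, 900)

# Example 2: n = 36, sample mean 11, H0 mean 10, variance 16 (so sigma = 4)
z2 = z_single_mean(11, 10, sqrt(16), 36)

print(round(z1, 3), round(z2, 3))  # 2.795 1.5
```

The sign of `z` is kept on purpose: for a left-tailed alternative such as $\mu < 10$, only a sufficiently negative `z` falls in the critical region.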
Z-test for difference of means
• For two large samples ($n_1, n_2 \geq 30$), to test the difference of means:
• Null Hypothesis $H_0: \mu_1 = \mu_2$ (There is no significant difference between the means)
• $z$-statistic is given by $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
• If $\sigma_1$ and $\sigma_2$ are unknown, $s_1$ and $s_2$ are known, and $\sigma_1 \neq \sigma_2$, then take $\sigma_1 = s_1$ and $\sigma_2 = s_2$
• If $\sigma_1$ and $\sigma_2$ are unknown, $s_1$ and $s_2$ are known, and $\sigma_1 = \sigma_2$, then take $\sigma_1^2 = \sigma_2^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$
1) A company claims that its bulbs are superior to those of its main competitor. If a study showed that a sample of
40 of its bulbs has a mean life time of 647 hrs of continuous use with a S.D of 27 hrs. While a sample of 40
bulbs made by its main competitor had a mean life time of 638 hrs of continuous use with a S.D of 31 hrs. Test
the significance between the difference of two means at 5% LoS and 1% LoS.
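Solution: $n_1 = 40$, $\bar{x}_1 = 647$, $\sigma_1 = 27$, $n_2 = 40$, $\bar{x}_2 = 638$, $\sigma_2 = 31$
$H_0: \mu_1 = \mu_2$ (the bulbs of both companies are of the same quality),
$H_1: \mu_1 > \mu_2$ ⇒ One-tailed test ⇒ at 5% LoS $z_\alpha = 1.645$, at 1% LoS $z_\alpha = 2.33$
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{647 - 638}{\sqrt{\frac{27^2}{40} + \frac{31^2}{40}}} = \frac{9}{6.5} = 1.385 < z_\alpha$ at both LoS
⇒ Accept $H_0$ at both 5% and 1% LoS, i.e., the company's claim is not supported by the data.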
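As a cross-check of the bulb example, here is a minimal sketch of the two-sample z-statistic, assuming the standard form in which the variances $\sigma_1^2/n_1$ and $\sigma_2^2/n_2$ (not the standard deviations) are added under the square root. The function name `z_two_means` is illustrative.

```python
from math import sqrt


def z_two_means(x1_bar, x2_bar, sigma1, sigma2, n1, n2):
    """z-statistic for the difference of two means of large samples."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (x1_bar - x2_bar) / standard_error


# Bulb data: means 647 and 638 hrs, S.D. 27 and 31 hrs, 40 bulbs each
z = z_two_means(647, 638, 27, 31, 40, 40)
print(round(z, 3))  # 1.385

# One-tailed critical values used in the notes
for level, z_alpha in (("5%", 1.645), ("1%", 2.33)):
    print(level, "reject H0" if z > z_alpha else "accept H0")
```

With these inputs the standard error is exactly 6.5, so $z = 9/6.5 \approx 1.385$, which is below both one-tailed critical values.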
Z-test for significance of proportions
• For large samples ($n \geq 30$), when the proportion of successes is given, the $z$-statistic is given by $z = \frac{X - nP}{\sqrt{nPQ}}$, where $Q = 1 - P$
• $X$ → number of successes in independent trials
• $P$ → probability of success in independent trials
• $n$ → size of the sample
• If $P$ is unknown then let $p = \frac{X}{n}$ and $q = 1 - p$; the probable limits for the proportion in the population are then given by
$p \pm 2.58\sqrt{pq/n}$ (1% LoS) or $p \pm 1.96\sqrt{pq/n}$ (5% LoS)
• Accept $H_0$ if $|z| < z_\alpha$, or reject $H_0$ if $|z| > z_\alpha$
1) A coin was tossed 400 times and the head turned up 216 times. Test the hypothesis that the coin is unbiased
at 5% level of significance.
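Solution: $n = 400$, $X = 216$, $P$ = prob. of head $= 1/2$ ⇒ $Q = 1/2$
$H_0$: Coin is unbiased ($P = 1/2$); $H_1$: Coin is biased ($P \neq 1/2$) ⇒ Two-tailed test ⇒ at 5% LoS $z_\alpha = 1.96$
$|z| = \left|\frac{X - nP}{\sqrt{nPQ}}\right| = \frac{216 - 200}{10} = 1.6 < z_\alpha$
⇒ $H_0$ is accepted, i.e., the coin is unbiased.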
2) A die was thrown 9000 times and a throw of 5 or 6 was obtained 3240 times. On the assumption of random
throwing, do the data indicate an unbiased die at 0.01 LoS?
(Hint: n=9000, X=3240, P= prob. of getting 5 or 6 = 2/6)
3) A survey was conducted in a slum locality of 2000 families by selecting a sample of size 800. It was revealed
that 180 families were illiterate. Find the probable limits of the number of illiterate families in the population of 2000 at
5% LoS.
Solution: $P$ is unknown, so $p = \frac{X}{n} = \frac{180}{800} = 0.225$ and $q = 0.775$
⇒ the probable limits at 5% LoS are $p \pm 1.96\sqrt{pq/n} = 0.225 \pm 0.0289$, i.e., 0.196 and 0.254
Probable number of illiterate families: between $0.196 \cdot 2000$ and $0.254 \cdot 2000$, i.e., about 392 to 508
4) A sample of 900 days is taken from the meteorological records of a certain district and 100 of them are found to be
foggy. What are the probable limits of the percentage of foggy days in the district?
(Hint: $n = 900$, $X = 100$, $p = \frac{X}{n} = \frac{1}{9}$; $p \pm 3\sqrt{\frac{pq}{n}}$ ⇒ (7.97%, 14.25%); at 5% LoS, $p \pm 1.96\sqrt{\frac{pq}{n}}$ ⇒ (9.06%, 13.16%))
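The proportion problems above all use the same two formulas, so a short sketch may help when checking the arithmetic. The function names are illustrative, and the sample size $n$ (not the population size) is assumed in the standard error of $p$.

```python
from math import sqrt


def z_proportion(X, n, P):
    """z-statistic for X successes in n trials when P is hypothesised."""
    return (X - n * P) / sqrt(n * P * (1 - P))


def probable_limits(X, n, z_crit):
    """Limits p +/- z_crit * sqrt(p*q/n) when P is unknown (p = X/n)."""
    p = X / n
    half_width = z_crit * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


z_coin = z_proportion(216, 400, 0.5)      # coin tossed 400 times
z_die = z_proportion(3240, 9000, 2 / 6)   # 5 or 6 in 9000 throws

low, high = probable_limits(180, 800, 1.96)   # survey of 800 families
fog_low, fog_high = probable_limits(100, 900, 3)  # foggy days, 3-sigma limits

print(round(z_coin, 2), round(z_die, 2))
print(round(low, 3), round(high, 3))
print(round(100 * fog_low, 2), round(100 * fog_high, 2))
```

Multiplying `low` and `high` by 2000 gives the probable number of illiterate families in the whole locality.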
Student’s t- Test
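• Student's t-test is used when the sample size is less than 30 ($n < 30$) and the
population S.D. is unknown.
• The probability density function of the t-distribution with $\nu$ degrees of freedom is given by
$f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad -\infty < t < \infty$
• Degrees of freedom : The number of values that can be chosen freely.
Example: If $x + y + z = 1$, then to find $x$ (or $y$ or $z$) we are free to choose the other 2 values, $y$ and $z$ (or $x$ and $z$, or
$x$ and $y$), arbitrarily; hence the number of degrees of freedom is 2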
Student’s t- Test for single mean
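The following assumptions are made for the t-test
• The sample size is small ($n < 30$)
• Null Hypothesis $H_0: \mu = \mu_0$ (Sample mean = population mean)
• t-statistic is given by $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n-1}}$ with $n - 1$ degrees of freedom
• 99% and 95% confidence limits for $\mu$: $\bar{x} \pm t_{0.01}\frac{s}{\sqrt{n-1}}$ and $\bar{x} \pm t_{0.05}\frac{s}{\sqrt{n-1}}$
• $\bar{x}$, $s$ are the mean and S.D. of the sample, with $s^2 = \frac{\sum (x - \bar{x})^2}{n}$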
Problems
1) The nine items of a sample have the following values:45, 47, 50, 52, 48, 47, 49, 53, 51. Does the mean of
these differ significantly from the assumed mean of 47.5 at 5% LOS?
Hint: $H_0: \mu = 47.5$, $H_1: \mu \neq 47.5$, $\bar{x} = \sum x_i/n$, $s = \sqrt{\sum (x_i - \bar{x})^2/n} = 2.469$, $n = 9$, $df = 8$
2) The heights of 10 males of a given locality are found to be 175,168,155,170, 152, 170, 175, 160, 160 and
165 cms. Based on this sample, find the 95% confidence limits for the height of males in that locality.
3) A fertilizer mixing machine is set to give 12 kg of nitrate for quintal bag of fertilizer. Ten 100 kg bags are
examined, the percentage of nitrate per bag are as follows: 11, 14, 13, 12, 13, 12, 13, 14, 11, 12. Are there
any reasons to believe that the machine is defective? Value of t for 9 degree of freedom is 2.262.
Hint: $H_0: \mu = 12$, $\bar{x} = \sum x_i/n$, $s = \sqrt{\sum (x_i - \bar{x})^2/n}$, $n = 10$, $df = 9$
4) A machinist is making engine parts with axle diameter of 0.7 inch. A random sample of 10 parts shows
mean diameter 0.742 inch with a SD of 0.04inch .On the basis of the sample, test whether the work is
meeting the specification? .
Hint: $H_0: \mu = 0.7$, $H_1: \mu \neq 0.7$ ⇒ two-tailed test, $\bar{x} = 0.742$, $s = 0.04$, $n = 10$, $df = 9$
5) The mean lifetime of a sample of 25 bulbs is found as 1550 hours with a SD of 120h. The company
manufacturing the bulbs claims that the average life of their bulbs is 1600h. Is the claim acceptable at 5%
LOS?
6) The average breaking strength of steel rods is specified to be 18.5 thousand pounds. To test this a sample of
14 rods was tested. The mean and standard deviation obtained were 17.85 and 1.955 respectively. Is the
result of the experiment significant with 95% confidence?
7) Show that the 95% confidence limits for the mean $\mu$ of the population are $\bar{x} \pm t_{0.05}\frac{s}{\sqrt{n-1}}$. Deduce that, for a
random sample of 16 values with mean 41.5 inches and sum of the squares of the deviations from
the mean 135 inches², drawn from a normal population, the 95% confidence limits for the mean of the
population are 39.9 and 43.1 inches.
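Problems 1 and 7 above can be checked numerically. The following is a minimal sketch using the notes' convention $s^2 = \sum (x - \bar{x})^2 / n$ with $\sqrt{n-1}$ in the standard error; the function names are illustrative, and the table value 2.131 ($t_{0.05}$ for 15 d.f.) is taken from a standard t-table rather than from the notes.

```python
from math import sqrt


def t_single_mean(data, mu0):
    """Return (mean, s, t) with s computed using divisor n, as in the notes."""
    n = len(data)
    x_bar = sum(data) / n
    s = sqrt(sum((x - x_bar) ** 2 for x in data) / n)
    t = (x_bar - mu0) / (s / sqrt(n - 1))
    return x_bar, s, t


def confidence_limits(x_bar, s, n, t_crit):
    """Limits x_bar +/- t_crit * s / sqrt(n - 1)."""
    half_width = t_crit * s / sqrt(n - 1)
    return x_bar - half_width, x_bar + half_width


# Problem 1: nine items, assumed mean 47.5
x_bar, s, t = t_single_mean([45, 47, 50, 52, 48, 47, 49, 53, 51], 47.5)
print(round(x_bar, 3), round(s, 3), round(t, 3))

# Problem 7: n = 16, mean 41.5, sum of squared deviations 135
s7 = sqrt(135 / 16)
limits = confidence_limits(41.5, s7, 16, 2.131)  # 2.131: assumed table value
print(round(limits[0], 1), round(limits[1], 1))  # 39.9 43.1
```

The computed `t` is then compared with the two-tailed table value for 8 degrees of freedom to decide problem 1.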
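Student's t-Test for difference between means of two independent samples of
sizes $n_1$ and $n_2$
t-statistic is given by $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
which follows the t-distribution with $n_1 + n_2 - 2$ degrees of freedom. $s_1$ and $s_2$ are the S.D.s of the samples. If $n_1 = n_2 = n$ then
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s_1^2 + s_2^2)/(n - 1)}}$ with $2n - 2$ degrees of freedom.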
Problems
1) A group of 10 rats fed on a diet A and another group of 8 rats fed on a different diet B
recorded the following increase in weights in gms. Test whether the diet A is superior to diet B.
Diet A 5 6 8 1 12 4 3 9 6 10
Diet B 2 3 6 8 1 10 2 8 - -
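$\bar{x}_1 = \frac{\sum x_{1i}}{n_1} = 6.4$, $s_1 = \sqrt{\frac{\sum (x_{1i} - \bar{x}_1)^2}{n_1}} = 3.2$, $\bar{x}_2 = \frac{\sum x_{2i}}{n_2} = 5$, $s_2 = \sqrt{\frac{\sum (x_{2i} - \bar{x}_2)^2}{n_2}} = 3.2016$
At 5% LoS, $t_\alpha = 1.746$
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{1.4}{1.6103} = 0.869$, d.f. $= 10 + 8 - 2 = 16$, $t < t_\alpha$
⇒ Accept $H_0$, i.e., reject the claim that Diet A is superior to Diet B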
2) Sample of two types of electric bulbs were tested for length of life and the following data were obtained
Size Mean SD
Sample1 8 1234 h 36 h
Sample2 7 1036 h 40 h
Is the difference in the means sufficient to warrant that type 1 bulbs are superior to type 2 bulbs?
3) The table gives the biological values of protein from 6 cows’ milk and 6 buffalo’s milk. Examine whether the
differences are significant.
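For problem 1 above (diet A versus diet B), the pooled two-sample t-statistic can be computed directly from the raw data. This is a minimal sketch; `t_two_means` is an illustrative name, and the sums of squared deviations are used in place of $n_1 s_1^2$ and $n_2 s_2^2$, which is the same quantity under the notes' definition of $s$.

```python
from math import sqrt


def t_two_means(a, b):
    """Pooled t-statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)  # equals n1 * s1^2
    ss2 = sum((x - m2) ** 2 for x in b)  # equals n2 * s2^2
    pooled = (ss1 + ss2) / (n1 + n2 - 2)
    t = (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


diet_a = [5, 6, 8, 1, 12, 4, 3, 9, 6, 10]
diet_b = [2, 3, 6, 8, 1, 10, 2, 8]

t, df = t_two_means(diet_a, diet_b)
print(round(t, 3), df)  # 0.869 16
```

Since 0.869 is below the one-tailed 5% table value 1.746 for 16 d.f., the data do not show that diet A is superior.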
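For paired (dependent) samples with differences $d_i$, mean difference $\bar{d}$ and S.D. $s_d$ of the differences, the t-statistic with $n - 1$ degrees of freedom is
$t = \frac{\bar{d}}{s_d/\sqrt{n-1}}$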
1) Eleven school boys were given a test in drawing. They were given a month's further tuition and a
second test of equal difficulty was held at the end of it. Do the marks give evidence that the
students have benefited from the extra coaching?
Boys 1 2 3 4 5 6 7 8 9 10 11
Marks I test 23 20 19 21 18 20 18 17 23 16 19
Marks II test 24 19 22 18 20 22 20 20 23 20 18
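The paired statistic $t = \bar{d}/(s_d/\sqrt{n-1})$ for the marks above can be sketched as follows, taking $d_i$ as second-test mark minus first-test mark and $s_d$ with divisor $n$ (the convention used elsewhere in these notes). The variable names are illustrative.

```python
from math import sqrt

first_test = [23, 20, 19, 21, 18, 20, 18, 17, 23, 16, 19]
second_test = [24, 19, 22, 18, 20, 22, 20, 20, 23, 20, 18]

# Differences: improvement of each boy after the extra coaching.
d = [after - before for before, after in zip(first_test, second_test)]
n = len(d)

d_bar = sum(d) / n
s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / n)  # divisor n, as in the notes

t = d_bar / (s_d / sqrt(n - 1))
print(round(d_bar, 3), round(s_d, 3), round(t, 3), "d.f. =", n - 1)
```

The resulting `t` is compared with the one-tailed table value for 10 degrees of freedom, since the question is whether the marks improved.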
Test whether the Poisson distribution is a good fit to this observed distribution.
3) To an observed frequency distribution, binomial distribution is fitted after estimating p from the
observed data. The observed and theoretical frequencies are given below
4) 10000 digits are randomly chosen from a telephone directory and the following
data is obtained. Test whether there is equi-distribution in the telephone directory
at 1% level of significance.
5) According to a theory in Genetics, the proportion of beans of four types A, B, C and D in a
generation should be 9:3:3:1. In an experiment, among 1600 beans, the frequency of beans of
each of the above four types were 882, 313, 287 and 118 respectively. Does the result support the
theory?
⇒ No parameter of the distribution is estimated from the data : $p = 0$
⇒ No pooling : $k = 0$
⇒ Degrees of freedom $= n - (1 + p + k) = 4 - (1 + 0 + 0) = 3$
6) In order to test whether a die is biased, it is thrown 72 times and the results are tabulated as follows:
Degrees of freedom $= n - (1 + p + k) = 6 - (1 + 0 + 0) = 5$
7) A survey of 64 families with 3 children each is conducted and the number of male children in each family
is noted. The results are tabulated as follows. Apply the chi-square test of goodness of fit to test whether male
and female children are equiprobable.
Degrees of freedom $= n - (1 + p + k) = 4 - (1 + 0 + 0) = 3$
⇒ $\chi_\alpha^2$ with 3 d.f. $= 7.81$, hence $\chi^2 < \chi_\alpha^2$ ⇒ $H_0$ is accepted
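The goodness-of-fit computation for problem 5 above (beans in the ratio 9:3:3:1) can be sketched as follows. The function name `chi_square_gof` is illustrative, and the statistic is the usual $\chi^2 = \sum (o_i - e_i)^2 / e_i$.

```python
def chi_square_gof(observed, ratio):
    """Expected frequencies from a theoretical ratio, and the chi-square value."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2


observed = [882, 313, 287, 118]          # types A, B, C, D among 1600 beans
expected, chi2 = chi_square_gof(observed, [9, 3, 3, 1])

df = len(observed) - (1 + 0 + 0)         # n - (1 + p + k), no estimation, no pooling
print(expected, round(chi2, 3), df)
print("accept H0" if chi2 < 7.81 else "reject H0")  # 7.81: 5% value for 3 d.f.
```

Here the expected frequencies are 900, 300, 300 and 100, and the computed value is compared with the 5% critical value 7.81 for 3 d.f. quoted above.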
𝝌𝟐 test of independent attributes
• Let $A$ and $B$ be the two attributes which characterize the sample, with $m$ and $n$ categories respectively.
• $H_0$ : The two attributes $A$ and $B$ are independent
• We compute the expected frequency $e_{ij}$ using $e_{ij} = \frac{o_{i*} \times o_{*j}}{N} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$
• Compute $\chi^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$ with $(m - 1) \times (n - 1)$ degrees of freedom.
1) The following table shows the result of an experiment to investigate the effect of vaccination
of animals against a particular disease. Use the $\chi^2$-test to test the hypothesis that
there is no difference between the vaccinated and unvaccinated groups, i.e., that vaccination and this
disease are independent.
                                   Total
  Vaccinated        9      42      51
  Not vaccinated    17     28      45
  Total             26     70      96
$o_1 = 9$, $o_2 = 42$, $o_3 = 17$, $o_4 = 28$
$N = 96$, d.f. $= (2 - 1) \times (2 - 1) = 1$
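The expected frequencies and the $\chi^2$ value for the vaccination table can be computed as below. This is a minimal sketch for a general $m \times n$ table of observed counts; the function name `chi_square_independence` is illustrative.

```python
def chi_square_independence(table):
    """Chi-square statistic and d.f. for an m x n table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # e_ij = row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: vaccinated, not vaccinated (totals are recomputed, not entered)
chi2, df = chi_square_independence([[9, 42], [17, 28]])
print(round(chi2, 3), df)
```

The result is compared with the table value of $\chi^2$ for 1 degree of freedom at the chosen level of significance.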
Exercise Problems:
1. A sample of 12 measurements of the diameter of a metal ball gave the mean $\bar{x} = 7.38$ mm with S.D. $s = 1.24$ mm.
Find i) 95% ii) 99% confidence limits for the actual diameter.
Hint: $n = 12 < 30$ ⇒ t-test, $\bar{x} = 7.38$, $s = 1.24$, find the confidence limits using $\bar{x} \pm t_\alpha \frac{s}{\sqrt{n-1}}$
2. A random sample of 16 values from a normal population showed a mean of 53 and a sum of squares of deviations
from the mean equal to 150. Can this sample be regarded as taken from the population having 56 as mean? Obtain
95% and 99% confidence limits of the mean of the population.
Hint: $n = 16 < 30$ ⇒ t-test, $\bar{x} = 53$, $\sum (x_i - \bar{x})^2 = 150$, find $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$, $H_0: \mu = 56$ (sample mean =
population mean)
3. A certain stimulus administered to each of 12 patients resulted in the following changes in blood pressure: 5, 2, 8,
-1, 3, 0, 6, -2, 1, 5, 0, 4. Can it be concluded that the stimulus will increase the blood pressure?
Hint: $d_i$ are given, $n = 12 < 30$ ⇒ t-test, dependent samples, $H_0$: the stimulus does not change the blood pressure ($\mu_d = 0$), $H_1$: the stimulus increases the blood pressure ($\mu_d > 0$)
4. Test the equality of standard deviations for the data given below at 5% level of significance: n1=10 ; n2=14; s1=1.5 ;
s2=1.2
Hint: Equality of s.d. ⇒ F-test
5. The mean height and the SD height of 8 randomly chosen soldiers are 166.9 and 8.29 cm respectively. The
corresponding values of 6 randomly chosen sailors are 170.3 and 8.50 cm respectively. Based on this data, can we
conclude that soldiers are, in general, shorter than sailors?
Hint: 𝑛1 , 𝑛2 < 30 ⇒ 𝑡 − 𝑡𝑒𝑠𝑡 , 2 independent samples
6. According to the norms established for a mechanical aptitude test, persons who are 18 years old have an average height of
73.2 inches with a standard deviation of 8.6 inches. If 4 randomly selected persons of that age averaged 76.7, test the
hypothesis against the alternative hypothesis at the 0.01 level of significance.
Hint: 𝐻0 : 𝜇 = 73.2, 𝑛 = 4 < 30 ⇒ 𝑡 − 𝑡𝑒𝑠𝑡 ,single mean 𝑥ҧ = 76.7, 𝑠 = 8.6
7. The heights of 10 males of a given locality are found to be 175, 168, 155, 170, 152, 170, 175, 160, 160 and 165 cm. Based
on this sample, find the 95% confidence limits for the height of males in that locality.
Hint: $n = 10 < 30$ ⇒ t-test, calculate $\bar{x}$, $s$, find the confidence limits using $\bar{x} \pm t_\alpha \frac{s}{\sqrt{n-1}}$
8. Sample of students were drawn from two universities and from their weights in kilograms, mean and standard deviations
are calculated and shown below. Make a large sample test to test the significance of the difference between the means.
Mean S.D Sample size
University A 55 10 400
University B 57 15 100
Hint: $\sigma_1 \neq \sigma_2$ are unknown and $s_1$, $s_2$ are known ⇒ take $\sigma_1 = s_1$, $\sigma_2 = s_2$; $H_0: \mu_1 = \mu_2$ (There is no significant difference between the means)
9. A sample of 6 persons in an office revealed an average daily smoking of 10, 12, 8, 9, 16, 5 cigarettes. The average level of
smoking in the whole office has to be estimated at 90% level of confidence. $t = 2.015$ for 5 degrees of freedom.
Hint: $n = 6 < 30$ ⇒ t-test, calculate $\bar{x}$, $s$, find the confidence limits using $\bar{x} \pm t_{0.1}^{5} \frac{s}{\sqrt{n-1}}$, given $t_{0.1}^{5} = 2.015$
10. Nine individuals are chosen at random from a population and their heights in inches are found to be 63, 63, 64, 66, 69, 69,
70, 70, 71. Discuss the suggestion that the mean height of the universe is 65 inches.
Hint: $H_0: \mu = 65$, $n = 9 < 30$ ⇒ t-test, single mean, calculate $\bar{x}$, $s$
11. The mean life time of sample of 100 fluorescent light bulbs produced by a company is computed to be 1570 hours with a
standard deviation of 120 hours. The company claims that the average life of the bulbs produced by it is 1600 hours. Using
the level of significance of 0.05, is the claim acceptable?
Hint: $n = 100 > 30$ ⇒ z-test, $\bar{x} = 1570$, $\sigma = 120$, $H_0: \mu = 1600$ (the average life of the bulbs produced by the company is 1600 hours)
12. Fit a Poisson distribution for the following data and test the goodness of fit.
x: 0 1 2 3 4 5 6 Total
f: 273 70 30 7 7 2 1 390.
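Hint: $n = 7$ classes, goodness of fit ⇒ $\chi^2$-test
$H_0$: the Poisson distribution is a good fit; the constraint $\sum o_i = \sum e_i$ costs 1 d.f.
$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{195}{390} = 0.5 = m$ (parameter of the Poisson distribution, estimated from the data ⇒ $p = 1$),
$e_i = 390 \cdot \frac{e^{-m} m^{x_i}}{x_i!}$ : 236.55, 118.27, 29.57, 4.93, 0.62, 0.06, 0.01. Pooling the last four classes into one removes 3 classes ⇒ $k = 3$ ⇒ d.f. $= n - (1 + p + k) = 7 - 5 = 2$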
13. Among 64 offspring of a certain cross between guinea pigs, 34 were red, 10 were black and 20 were white. According to
the genetic model these numbers should be in the ratio 9:3:4. Are the data consistent with the model at 5% level?
Hint: $o_i$: 34, 10, 20; ratio 9:3:4 ⇒ total 16, $e_i$ → $9 \cdot \frac{64}{16}$, $3 \cdot \frac{64}{16}$, $4 \cdot \frac{64}{16}$ = 36, 12, 16
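For exercise 12 (fitting a Poisson distribution), the expected frequencies and the pooling step can be sketched as below. The rule "pool trailing classes until the expected frequency is at least 5" is an assumption (a common rule of thumb, not stated in the notes), and the helper name `pool_tail` is illustrative.

```python
from math import exp, factorial

x = [0, 1, 2, 3, 4, 5, 6]
f = [273, 70, 30, 7, 7, 2, 1]

total = sum(f)                                         # 390
m = sum(xi * fi for xi, fi in zip(x, f)) / total       # estimated Poisson mean
expected = [total * exp(-m) * m**xi / factorial(xi) for xi in x]


def pool_tail(obs, exp_freq, minimum=5):
    """Merge trailing classes until the last expected frequency >= minimum."""
    obs, exp_freq = list(obs), list(exp_freq)
    while len(exp_freq) > 1 and exp_freq[-1] < minimum:
        last_e, last_o = exp_freq.pop(), obs.pop()
        exp_freq[-1] += last_e
        obs[-1] += last_o
    return obs, exp_freq


pooled_obs, pooled_exp = pool_tail(f, expected)
k = len(f) - len(pooled_obs)                           # classes lost by pooling
df = len(f) - (1 + 1 + k)                              # n - (1 + p + k), p = 1

chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
print(m, [round(e, 2) for e in expected])
print(pooled_obs, [round(e, 2) for e in pooled_exp], df, round(chi2, 2))
```

The same $\chi^2$ sum applies to exercise 13 with observed values 34, 10, 20 and expected values 36, 12, 16, with no pooling and no estimated parameter.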