Hypothesis Testing Guide
Introduction – Hypothesis – Null and Alternative Hypothesis – Type I and Type II errors – Level of significance – One-tailed
and two-tailed tests – Tests concerning one mean and two means (Large and Small samples) – Tests on proportions.
Introduction: There are many problems, in which rather than estimating the value of a parameter we need to
decide whether to accept or reject a statement about the parameters. This statement is called a hypothesis and
the decision making procedure about the hypothesis is called hypothesis testing.
Test of Hypothesis: The procedure which enables us to decide on the basis of sample results whether a
hypothesis is true or not, is called test of hypothesis or test of significance.
Procedure for testing a Hypothesis: Test of Hypothesis involves the following steps:
Step 1: Statement (or assumption) of Hypothesis: There are two types of hypothesis. They are (i) Null
Hypothesis (ii) Alternative Hypothesis.
(i) Null Hypothesis: For applying the tests of significance, we first set up a hypothesis, that is, a definite
statement about the population parameter. Such a hypothesis, usually a hypothesis of no
difference, is called the Null Hypothesis. It is of the form $H_0: \mu = \mu_0$, where $\mu_0$ is the value which
is assumed or claimed for the population characteristic. It is the reference point against which the
Alternative Hypothesis is set up.
(ii) Alternative Hypothesis: Any hypothesis which contradicts the Null Hypothesis is called an
Alternative Hypothesis. It is usually denoted by $H_1$. The two hypotheses $H_0$ and $H_1$ are such that if
one is true, the other is false and vice versa.
For example: If we want to test the null hypothesis that the population has a specified mean $\mu_0$,
i.e., $H_0: \mu = \mu_0$, then the Alternative Hypothesis would be one of
(i) $H_1: \mu \neq \mu_0$ (i.e., either $\mu > \mu_0$ or $\mu < \mu_0$)
(ii) $H_1: \mu > \mu_0$
(iii) $H_1: \mu < \mu_0$
The alternative hypothesis (i) is known as a two-tailed alternative, (ii) as a right-tailed alternative,
and (iii) as a left-tailed alternative. One has to choose from the above three
forms depending on the situation posed.
Step 2: Specification of the Level of Significance: The level of significance, denoted by α, is the maximum
probability with which we are willing to risk the error of rejecting $H_0$ when it is true. In practice, we take
either 5% (i.e., 0.05) or 1% (i.e., 0.01) as the level of significance. The level of significance is also known as
the size of the test. (If the level of significance is not given in the problem, we usually choose the 5% level
of significance.)
Step 3: Identification of the Test Statistic: There are several tests of significance, viz. Z, t, F, etc. First we
select the right test depending on the nature of the information given in the problem, then we construct the
test criterion and select the appropriate probability distribution.
Step 4: Critical Region: The critical region is formed based on following factors:
(a) Distribution of the Statistic, i.e., whether the statistic follows the normal, t, χ² or F distribution.
(b) Form of the Alternative Hypothesis: If the form has the ≠ sign, the critical region is divided equally
between the left and right tails of the distribution.
If the form of the alternative hypothesis has the < sign, the entire critical region is taken in the left tail of
the distribution.
If the form of the alternative hypothesis has the > sign, the entire critical region is taken in the right tail of
the distribution.
Compute the test statistic under the Null Hypothesis:
$$z = \frac{t - E(t)}{\text{S.E. of } t}$$
Here t is a sample statistic and S.E. is the standard error of t.
Step 5: Conclusion: We compare the computed value of the test statistic Z with the critical value $Z_\alpha$ at the given
level of significance (α).
If $|Z| < Z_\alpha$ (that is, if the absolute value of the calculated value of Z is less than the critical value $Z_\alpha$),
we conclude that the difference is not significant. We accept the null hypothesis.
If $|Z| > Z_\alpha$, then the difference is significant and hence the null hypothesis is rejected at the level of
significance α.
For a two-tailed test:
If $|Z| < 1.96$, then accept $H_0$ at 5% LOS.
If $|Z| > 1.96$, then reject $H_0$ at 5% LOS.
If $|Z| < 2.58$, then accept $H_0$ at 1% LOS.
If $|Z| > 2.58$, then reject $H_0$ at 1% LOS.
For a single-tailed (right or left) test:
If $|Z| < 1.645$, then accept $H_0$ at 5% LOS.
If $|Z| > 1.645$, then reject $H_0$ at 5% LOS.
If $|Z| < 2.33$, then accept $H_0$ at 1% LOS.
If $|Z| > 2.33$, then reject $H_0$ at 1% LOS.
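The critical values quoted in these rules can be reproduced from the standard normal distribution. The following is a minimal sketch using only the Python standard library; the function names `critical_z` and `decide` are illustrative and not part of the notes.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical value of Z for level of significance alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def decide(z, alpha=0.05, two_tailed=True):
    """Compare |z| with the critical value and return the decision on H0."""
    # For a single-tailed test, the sign of z should also be checked
    # against the direction of H1 before applying this rule.
    if abs(z) > critical_z(alpha, two_tailed):
        return "reject H0"
    return "accept H0"

print(round(critical_z(0.05), 2))         # two-tailed, 5% LOS
print(round(critical_z(0.01), 2))         # two-tailed, 1% LOS
print(round(critical_z(0.05, False), 3))  # single-tailed, 5% LOS
print(round(critical_z(0.01, False), 2))  # single-tailed, 1% LOS
print(decide(2.1, alpha=0.05))
print(decide(2.1, alpha=0.01))
```

The same computed value (here a hypothetical Z = 2.1) can be significant at the 5% level and not significant at the 1% level, which is why the level of significance must be fixed before the test is carried out.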
ERRORS OF SAMPLING:-
TYPE - I ERROR: Reject $H_0$ when it is true.
If the Null Hypothesis $H_0$ is true but is rejected by the test procedure, then the error made is called a
Type I error or α error.
Under Large sample tests, we will see four important tests to test the significance:
1. Testing of significance for single mean.
2. Testing of significance for difference of two means.
3. Testing of significance for single proportion.
4. Testing of significance for difference of two proportions.
TEST – 1: Test of Significance of a Single Mean – Large Samples
Let a random sample of size n (n ≥ 30) have the sample mean $\bar{x}$, and let μ be the population mean. Suppose the
population mean μ has a specified value $\mu_0$.
Working Rule:-
1. The Null Hypothesis is $H_0: \bar{x} = \mu$ (i.e., $\mu = \mu_0$): there is no significant difference between the sample
mean and the population mean, or the sample has been drawn from the parent population.
2. The Alternative Hypothesis is (i) $H_1: \bar{x} \neq \mu$ (i.e., $\mu \neq \mu_0$) or
(ii) $H_1: \bar{x} > \mu$ (i.e., $\mu > \mu_0$) or
(iii) $H_1: \bar{x} < \mu$ (i.e., $\mu < \mu_0$)
Since n is large, the sampling distribution of $\bar{x}$ is approximately normal.
3. Level of Significance : Set the Level of Significance α.
4. The Test Statistic : We have the following two cases.
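Case I : When the S.D. σ of the population is known. In this case, S.E. of mean $= \dfrac{\sigma}{\sqrt{n}}$.
The test statistic is $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.
Case II : When the S.D. σ of the population is not known. In this case, S.E. of mean $= \dfrac{s}{\sqrt{n}}$, where s is the sample S.D.
The test statistic is $z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$.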
5. Critical Value $Z_\alpha$ : Find the critical value $Z_\alpha$ of Z at the LOS α from the normal table.
Table: Critical Values of Z.
LOS (α)                                 1%             5%             10%
Critical region for $\mu \neq \mu_0$    $|Z| > 2.58$   $|Z| > 1.96$   $|Z| > 1.645$
Critical region for $\mu > \mu_0$       $Z > 2.33$     $Z > 1.645$    $Z > 1.28$
Critical region for $\mu < \mu_0$       $Z < -2.33$    $Z < -1.645$   $Z < -1.28$
Note: 1. We reject the Null Hypothesis $H_0$ when $|Z| > 3$ without mentioning any level of significance.
2. The values $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$ are called the 95% confidence limits for the mean of the population
corresponding to the given sample. Similarly, $\bar{x} \pm 2.58\,\dfrac{\sigma}{\sqrt{n}}$ are the 99% confidence limits and
$\bar{x} \pm 2.33\,\dfrac{\sigma}{\sqrt{n}}$ are the 98% confidence limits.
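As an illustration of the working rule and the confidence limits, here is a minimal Python sketch. It uses the figures of Q6 below (n = 400, σ = 10, sample mean 40, hypothesised mean 38); the function names are illustrative and not part of the notes.

```python
from math import sqrt

def z_single_mean(sample_mean, mu0, sd, n):
    """z = (x_bar - mu0) / (sd / sqrt(n)) for a large sample."""
    return (sample_mean - mu0) / (sd / sqrt(n))

def confidence_limits(sample_mean, sd, n, z_crit=1.96):
    """Confidence limits x_bar +/- z_crit * sd / sqrt(n); 1.96 gives 95%."""
    margin = z_crit * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Figures taken from Q6: n = 400, S.D. = 10, sample mean = 40, mu0 = 38.
z = z_single_mean(40, 38, 10, 400)
low, high = confidence_limits(40, 10, 400)

print("z =", z)
print("reject H0" if abs(z) > 1.96 else "accept H0", "at 5% LOS (two-tailed)")
print("95% confidence limits:", round(low, 2), "to", round(high, 2))
```

Note that the hypothesised mean 38 lies outside the 95% confidence limits, which agrees with rejecting the null hypothesis in the two-tailed test at the 5% level.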
Q1] A sample of 64 students has a mean weight of 70 kg. Can this be regarded as a sample from a population
with mean weight 56 kg and S.D. 25 kg?
Q2] An oceanographer wants to check whether the depth of the ocean in a certain region is 57.4 fathoms, as
had previously been recorded. What can he conclude at the 0.05 level of significance, if readings taken at 40
random locations in the given region yielded a mean of 59.1 fathoms with a S.D. of 5.2 fathoms?
Q3] The mean life time of a sample of 100 light tubes produced by a company is found to be 1560 hrs. with a
population S.D. of 90 hrs. Test the hypothesis for α=0.05 that the mean life time of the tubes produced by the
company is 1580 hrs.
Q4] A trucking firm suspects the claim that average life of certain tires is at least 28000 miles. To check the claim
the firm puts 40 of these tires on its trucks and gets a mean life time of 27463 miles with a S.D. of 1348 miles.
Can the claim be true?
Q5] An ambulance service claims that it takes on the average less than 10 minutes to reach its destination in
emergency calls. A sample of 36 calls has a mean of 11 minutes and a variance of 16. Test the claim
at the 0.05 level of significance.
Q6] A sample of 400 items is taken from a population whose S.D. is 10. The mean of the sample is 40. Test
whether the sample has come from a population with mean 38. Also calculate 95% confidence interval for the
population mean.
Let $\bar{x}_1$ and $\bar{x}_2$ be the sample means of two independent large random samples of sizes $n_1$ and $n_2$ drawn
from two populations having means μ1 and μ2 and S.D.s σ1 and σ2.
If $|Z| > 3$, then the samples have not been drawn from the same population; otherwise, accept $H_0$.
Note: If the two samples are drawn from two populations with unknown S.D.s σ1 and σ2, then $\sigma_1^2$ and $\sigma_2^2$ can be
replaced by the sample variances $s_1^2$ and $s_2^2$, provided both the sample sizes $n_1$ and $n_2$ are large. In this case, the test
statistic is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
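The statistic above can be evaluated directly. The sketch below is illustrative only (the function name `z_two_means` is not from the notes) and uses the figures of Q7 below, where the variances are given as 2.0 and 1.5 and α = 0.01.

```python
from math import sqrt

def z_two_means(mean1, mean2, var1, var2, n1, n2):
    """z statistic for the difference of two means (large samples).

    var1 and var2 are the population variances when known,
    otherwise the sample variances s1^2 and s2^2.
    """
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Figures taken from Q7: 42 cars averaging 15 kmpl, 80 cars averaging 11.5 kmpl.
z = z_two_means(15, 11.5, 2.0, 1.5, 42, 80)

print("z =", round(z, 2))
# Two-tailed critical value at 1% LOS is 2.58.
print("reject H0" if abs(z) > 2.58 else "accept H0", "at 1% LOS")
```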
Q7] Two types of new cars produced in the USA are tested for petrol mileage. One sample consisting of 42 cars
averaged 15 kmpl, while the other sample consisting of 80 cars averaged 11.5 kmpl, with population variances
of 2.0 and 1.5 respectively. Test whether there is any significant difference in the petrol consumption of these
two types of cars. [Use α = 0.01]
Q8] The means of two large samples of sizes 1000 and 2000 members are 67.5 inches and 68.0 inches
respectively. Can the samples be regarded as drawn from the same population of S.D. 2.5 inches?
Q9] Samples of students were drawn from two universities and from their weights in kilograms, mean and S.Ds.
are calculated and shown below. Make a large sample test to test the significance of the difference between
the means.
Q10] The average marks scored by 32 boys is 72 with a S.D. of 8 while that for 36 girls is 70 with a S.D. of 6. Does
this indicate that the boys perform better than girls at level of significance 0.05?
Q11] The mean height of 50 male students who participated in sports is 68.2 inches with a S.D. of 2.5. The mean
height of 50 male students who have not participated in sports is 67.2 inches with a S.D. of 2.8. Test the
hypothesis that the height of students who participated in sports is more than that of the students who have not
participated in sports.
Suppose a large random sample of size n has a sample proportion p of members possessing a certain
attribute [i.e., proportion of successes]. We wish to test the hypothesis that the proportion P in the population
has a specified value P0.
2. Alternative Hypothesis is (i) H1 : P ≠ P0 (i.e., P > P0 or P < P0) or (ii) H1 : P > P0 or (iii) H1 : P < P0
Since n is large, the sampling distribution of p is approximately normal.
3. Level of Significance : α
4. If H0 is true, then the test statistic
$$Z = \frac{p - P_0}{\text{S.E. of } p} \quad \text{or} \quad z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}},$$
where p is the sample proportion, P is the value specified under H0 and Q = 1 − P, is
approximately normally distributed.
5. Conclusion: The critical region for z depends on the nature of H1 and the level of significance α, and is
read from the table of critical values of Z.
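The test statistic for a single proportion can be sketched as follows. This is an illustrative example only (the function name is not from the notes), using the figures of Q13 below: 540 rice eaters in a sample of 1000, tested against P0 = 0.5 at 1% LOS.

```python
from math import sqrt

def z_single_proportion(successes, n, p0):
    """z = (p - P0) / sqrt(P0 * Q0 / n), with Q0 = 1 - P0."""
    p = successes / n          # sample proportion
    q0 = 1 - p0
    return (p - p0) / sqrt(p0 * q0 / n)

# Figures taken from Q13: H0: P = 0.5 against H1: P != 0.5.
z = z_single_proportion(540, 1000, 0.5)

print("z =", round(z, 2))
# Two-tailed critical value at 1% LOS is 2.58.
print("reject H0" if abs(z) > 2.58 else "accept H0", "at 1% LOS")
```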
Q12] A manufacturer claimed that at least 95% of the equipment which he supplied to a factory conformed to
specifications. An examination of a sample of 200 pieces of equipment revealed that 18 were faulty. Test this claim
at 5% LOS.
Q13] In a sample of 1000 people in Karnataka, 540 are rice eaters and the rest are wheat eaters. Can we assume
that both rice and wheat are equally popular in this state at 1% LOS?
Q14] In a big city, 325 men out of 600 men were found to be smokers. Does this information support the
conclusion that the majority of men in this city are smokers?
Q15] A die was thrown 9000 times and of these 3220 yielded a 3 or 4. Is this consistent with the hypothesis that
the die was unbiased?
Q16] Experience had shown that 20% of a manufactured product is of the top quality. In one day's production
of 400 articles, only 50 are of top quality. Test the hypothesis at 0.05 LOS.
Let p1 and p2 be the sample proportions in two large random samples of sizes n1 and n2 drawn from two
populations having proportions P1 and P2.
We wish to test whether the two samples have been drawn from the same population.
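The test statistic for this case is not reproduced in these notes. As a hedged sketch, the commonly used form pools the two samples to estimate the common proportion under H0: P1 = P2, giving $z = (p_1 - p_2)/\sqrt{PQ(1/n_1 + 1/n_2)}$ with $P = (n_1 p_1 + n_2 p_2)/(n_1 + n_2)$ and $Q = 1 - P$. The pooled form and the function name below are assumptions, not taken from the notes; the figures are those of Q18 below.

```python
from math import sqrt

def z_two_proportions(x1, n1, x2, n2):
    """Pooled z statistic for the difference of two sample proportions.

    Assumes H0: P1 = P2, so the common proportion is estimated by
    pooling both samples (an assumption, see the lead-in).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# Figures taken from Q18: 400 of 1000 in town A, 400 of 800 in town B.
z = z_two_proportions(400, 1000, 400, 800)

print("z =", round(z, 2))
print("reject H0" if abs(z) > 1.96 else "accept H0", "at 5% LOS (two-tailed)")
```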
Q17] In two large populations, there are 30% and 25% respectively of fair haired people. Is this difference likely
to be hidden in samples of 1200 and 900 respectively from the two populations.
Q18] In a random sample of 1000 persons from town A, 400 are found to be consumers of wheat. In a sample
of 800 from town B, 400 are found to be consumers of wheat. Do these data reveal a significant difference
between town A and town B, so far as the proportion of wheat consumers is concerned?
Q19] In a sample of 600 students of a certain college 400 are found to use ball pens. In another college, from a
sample of 900 students 450 were found to use ball pens. Test whether 2 colleges are significantly different with
respect to the habit of using ball pens.
Q20] A machine puts out 9 imperfect articles in a sample of 200 articles. After the machine is overhauled, it
puts out 5 imperfect articles in a sample of 700 articles. Test at 5% LOS whether the machine has improved.
Q21] Among the items produced by a factory, 65 out of 800 were defective. In another sample, 40 out of 300
were defective. Test the significance of the difference between the two proportions at 1% LOS.
TEST OF SIGNIFICANCE [SMALL SAMPLES]
Introduction:- In the earlier chapter, we considered certain tests of significance based on the theory of the
normal distribution. The assumptions made in deriving those tests will be valid only for large samples. When
the sample is small (n < 30), we can use normal distribution to test for a specified population mean or difference
of two population means as in large sample tests only when the sample is drawn from a normal population
whose S.D., σ is known. If σ is not known, we cannot proceed as above.
Degree of Freedom (d.f.) :- The number of independent variates which make up the statistic is known as the
degree of freedom (d.f.) and it is denoted by ν [the letter 'nu' of the Greek alphabet].
In general, the number of degrees of freedom is equal to the total number of observations less the
number of independent constraints imposed on the observations. For example, in a set of data of n
observations, if k is the number of independent constraints, then ν = n – k.
t-Distribution (or) Student’s t-Distribution :- It is used for testing of hypothesis when the sample size is small
and population S.D. σ is not known.
Definition : If {x1, x2, …, xn} is any random sample of size n drawn from a normal (or approximately normal)
population with mean μ and variance σ², then the test statistic t is defined by
$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}},$$
where $\bar{x}$ = sample mean and
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
is an unbiased estimate of σ². The test statistic t is a random
variable having the t-distribution with ν = n – 1 degrees of freedom and with probability density function
$$f(t) = y_0 \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}},$$
where ν = n – 1 and $y_0$ is a constant obtained from $\int_{-\infty}^{\infty} f(t)\,dt = 1$.
This is known as "Student's t-distribution".
Properties of t-distribution :-
1. The shape of t-distribution is bell shaped, which is similar to that
of a normal distribution and is symmetrical about the mean.
2. The t-distribution curve is also asymptotic to the t-axis, i.e., the
two tails of the curve on both sides of t = 0 extend to infinity.
3. It is symmetrical about the line t = 0.
4. The form of the probability curve varies with degrees of freedom
i.e., with sample size.
Applications of t-distribution :-
1. To test the significance of the sample mean, when population variance is not given.
2. To test the significance of the mean of the sample i.e., to test if the sample mean differs significantly
from the population mean.
3. To test the significance of the difference between two sample means or to compare two samples.
4. To test the significance of an observed sample correlation coefficient and sample regression coefficient.
Q22] Find (a) $t_{0.05}$ when ν = 16 (b) $t_{0.01}$ when ν = 10 (c) $t_{0.995}$ when ν = 7.
Q24] A random sample of size 25 from a normal population has the mean 47.5 and S.D. 8.4. Does this
information tend to support or refute the claim that the mean of the population is 42.5?
Q25] Ten bearings made by a certain process have a mean diameter of 0.5060 cm with a S.D. of 0.0040 cm.
Assuming that the data may be taken as a random sample from a normal distribution, construct a 95%
confidence interval for the actual average diameter of the bearings?
Q26] A sample of size 10 was taken from a population. The S.D. of the sample is 0.03. Find the maximum error with 99%
confidence.
Let a random sample of size n (n < 30) have a sample mean $\bar{x}$. We wish to test the hypothesis that the population
mean μ has a specified value μ0 when the population S.D. σ is not known.
1. Null Hypothesis : H0 : μ = μ0
2. Alternative Hypothesis : H1 : μ ≠ μ0
3. Level of Significance : α
Degrees of freedom ν = (n – 1)
For n – 1 d.f., the table value of t is $t_{\alpha/2}$.
4. The test statistic is $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n-1}}$, (OR)
when s is directly given in the problem, we use the test statistic $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$.
5. Conclusion: The calculated value of t is |t|. The table value of t is $t_{\alpha/2}$.
(i) If |t| > $t_{\alpha/2}$, then reject the Null Hypothesis at α LOS.
(ii) If |t| < $t_{\alpha/2}$, then accept the Null Hypothesis at α LOS.
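The two forms of the statistic in step 4 can be sketched in Python as below. The Python standard library has no t-table, so the table value must be supplied by the reader; here it is the value 2.262 for 9 d.f. at 5% LOS quoted in Q28 below. The function name and the `use_n_minus_1` switch are illustrative, and applying the first form (divisor √(n – 1)) to Q28 is an assumption about which form that problem intends.

```python
from math import sqrt

def t_single_mean(sample_mean, mu0, s, n, use_n_minus_1=True):
    """t statistic for a single mean from a small sample.

    use_n_minus_1=True  -> t = (x_bar - mu0) / (s / sqrt(n - 1))
    use_n_minus_1=False -> t = (x_bar - mu0) / (s / sqrt(n))
    """
    k = n - 1 if use_n_minus_1 else n
    return (sample_mean - mu0) / (s / sqrt(k))

# Figures taken from Q28: n = 10, mean 0.024 cm, mu0 = 0.025 cm, S.D. 0.002 cm.
t = t_single_mean(0.024, 0.025, 0.002, 10)
table_value = 2.262   # t for 9 d.f. at 5% LOS, as quoted in Q28

print("t =", round(t, 3), "with", 10 - 1, "d.f.")
print("reject H0" if abs(t) > table_value else "accept H0", "at 5% LOS")
```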
Q27] A machinist is making engine parts with axle diameters of 0.700 inch. A random sample of 10 parts shows
a mean diameter of 0.742 inch with a S.D. of 0.040 inch. Compute the statistic you would use to test whether
the work is meeting the specification at the 0.05 level of significance.
Q28] A machine is designed to produce insulating washers for electrical devices of average thickness 0.025
cm. A random sample of 10 washers was found to have a mean thickness of 0.024 cm with a S.D. of 0.002 cm. Test
the significance of the deviation. The value of t for 9 degrees of freedom at 5% LOS is 2.262.
Q29] A random sample of size 16 values from a normal population showed a mean of 53 and a sum of squares
of deviations from the mean equals to 150. Can this sample be regarded as taken from the population having
56 as mean? Obtain 95% confidence limits of the mean of the population.
Q30] A random sample of 10 boys had the following I.Q’s. 70, 120, 110, 101, 88, 83, 95, 98, 107 and 100.
(a) Do these data support the assumption of a population mean I.Q. of 100?
(b) Find a reasonable range in which most of the mean I.Q. values of samples of 10 boys lie.
Q31] The heights of 10 males of a given locality are found to be 70, 67, 62, 68, 61, 68, 70, 64, 64, 66 inches. Is it
reasonable to believe that the average height is greater than 64 inches? Test at 5% significance level assuming
that for 9 degrees of freedom (t = 1.833 at α = 0.05).
Q32] The life time of electric bulbs for a random sample of 10 from a large consignment gave the following data.
Item               1    2    3    4    5    6    7    8    9    10
Life in 1000 hrs.  1.2  4.6  3.9  4.1  5.2  3.8  3.9  4.3  4.4  5.6
Can we accept the hypothesis that the average life time of bulbs is 4000 hrs?
TEST – 6: Student’s ‘t’ Test for difference of Two Mean – Small Samples
Let $\bar{x}_1$ and $\bar{x}_2$ be the sample means of two independent small random samples of sizes n1 and n2 drawn
from two normal populations having means μ1 and μ2.
We wish to test whether the two population means are equal, i.e., whether the difference between μ1 and μ2 is
significant.
1. Null Hypothesis : H 0 : μ1 = μ2
2. Alternative Hypothesis: H1 : μ1 ≠ μ2
3. Level of Significance : α
Degrees of freedom ν = n1 + n2 – 2
At ν d.f., the table value of t is $t_{\alpha/2}$.
4. The test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},$$
where
$$S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} \quad \text{(OR)} \quad S^2 = \frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}.$$
5. Conclusion: The calculated value of t is |t|. The table value of t is $t_{\alpha/2}$.
(i) If |t| > $t_{\alpha/2}$, then reject the Null Hypothesis at α LOS.
(ii) If |t| < $t_{\alpha/2}$, then accept the Null Hypothesis at α LOS.
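A minimal Python sketch of the pooled statistic follows, using the first form of S² (sample sizes times sample variances) and the figures of Q33 below. The function name `pooled_t` is illustrative; the table value of t for the resulting degrees of freedom still has to be read from a t-table.

```python
from math import sqrt

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t, d.f.) for the difference of two means, small samples.

    S^2 = (n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)
    t   = (mean1 - mean2) / (S * sqrt(1/n1 + 1/n2))
    """
    dof = n1 + n2 - 2
    pooled_variance = (n1 * s1**2 + n2 * s2**2) / dof
    standard_error = sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error, dof

# Figures taken from Q33: Type I (n = 8, mean 1234, S.D. 36),
# Type II (n = 7, mean 1036, S.D. 40).
t, dof = pooled_t(1234, 1036, 36, 40, 8, 7)

print("t =", round(t, 2), "with", dof, "d.f.")
```

The computed |t| is then compared with the table value for ν = n1 + n2 – 2 degrees of freedom, exactly as in the conclusion step.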
Q33] Samples of two types of electric light bulbs were tested for length of life and the following data were obtained.
                 Type - I     Type - II
Sample sizes     8            7
Sample means     1234 hrs     1036 hrs
Sample S.D.s     36 hrs       40 hrs
Is the difference in the means sufficient to warrant that Type-I is superior to Type-II regarding length of life?
Q34] The means of two random samples of sizes 9 and 7 are 196.42 and 198.82 respectively. The sums of the
squares of the deviations from the means are 26.94 and 18.73 respectively. Can the samples be considered to
have been drawn from the same population?
Q35] Two independent samples of 8 and 7 items respectively had the following values:
Sample - I 11 11 13 11 15 9 12 14
Sample - II 9 11 10 13 9 8 10 -------
Is the difference between the means of samples significant?
Q36] To compare two kinds of bumper guards, 6 of each kind were mounted on a car and then the car was run
into a concrete wall. The following are the costs of repairs.
Guard-I 107 148 123 165 102 119
Guard-II 134 115 112 151 133 129
Use the 0.01 level of significance to test whether the difference between two sample means is significant.
Q37] The IQ’s of 16 students from one area of a city showed a mean of 107 with a S.D. of 10, while the IQ’s of
14 students from another area of the city showed a mean of 112 with a S.D. of 8. Is there a significant difference
between the IQ’s of the two groups at a 0.05 level of significance?
Q38] The mean life of a sample of 10 electric bulbs (or motors) was found to be 1456 hrs with S.D. of 423 hrs.
A second sample of 17 bulbs (or motors) chosen from a different batch showed a mean life of 1280 hrs with S.D.
of 398 hrs. Is there a significant difference between the means of two batches.
F-DISTRIBUTION
[SAMPLING DISTRIBUTION OF THE RATIO OF TWO SAMPLE VARIANCES]
Let $S_1^2$ be the sample variance of an independent sample of size n1 drawn from a normal population
$N(\mu_1, \sigma_1^2)$. Similarly, let $S_2^2$ be the sample variance of an independent sample of size n2 drawn from another
normal population $N(\mu_2, \sigma_2^2)$.
Consider the sampling distribution of the ratio of the variances of the two independent random samples
defined by
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2},$$
which follows the F-distribution with ν1 = n1 – 1 and ν2 = n2 – 1 d.f.
The F-distribution can be used to test the equality of several population means, to compare sample variances,
and in the analysis of variance.
Under the assumption (hypothesis) that the two normal populations have the same variance, i.e., $\sigma_1^2 = \sigma_2^2$,
we have $F = \dfrac{S_1^2}{S_2^2}$.
F determines whether the ratio of the two sample variances $S_1^2$ and $S_2^2$ is too small or too large. F is always
a positive number.
Properties of F-distribution :
Q39] For an F-distribution, find (a) $F_{0.05}$ with ν1 = 7 and ν2 = 15 (b) $F_{0.01}$ with ν1 = 24 and ν2 = 19
(c) $F_{0.95}$ with ν1 = 19 and ν2 = 24 (d) $F_{0.99}$ with ν1 = 28 and ν2 = 12 (e) $F_{0.95}$ with ν1 = 10 and ν2 = 20
(f) $F_{0.95}$ with ν1 = 15 and ν2 = 12 (g) $F_{0.99}$ with ν1 = 5 and ν2 = 20.
Let two independent random samples of sizes n1 and n2 be drawn from two normal populations. We wish to test
the hypothesis that the two population variances $\sigma_1^2$ and $\sigma_2^2$ are equal.
$$S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{n_2 s_2^2}{n_2 - 1}$$
5. Conclusion: If the calculated value of F > the tabulated value of F at α LOS, we reject the null hypothesis
H0 and conclude that the variances $\sigma_1^2$ and $\sigma_2^2$ are not equal. Otherwise, we accept the Null Hypothesis
H0 and conclude that $\sigma_1^2$ and $\sigma_2^2$ are equal.
Note: $F_\alpha(\nu_1, \nu_2)$ is the value of F with ν1, ν2 d.f. such that the area under the F-distribution to the right of $F_\alpha$
is α. In the tables, $F_\alpha$ is tabulated for α = 0.05 and α = 0.01 for various combinations of the d.f. ν1, ν2. Clearly, the
value of F at the 5% level of significance is lower than that at the 1% level.
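The F statistic can be sketched in Python from the sums of squared deviations. The example uses the figures of Q40 below (sums of squares 84.4 and 102.6 for samples of sizes 8 and 10). Placing the larger variance estimate in the numerator is a common convention that is assumed here because the corresponding steps are not reproduced in these notes; the function name is illustrative.

```python
def f_ratio(sum_sq1, n1, sum_sq2, n2):
    """Return (F, nu1, nu2) from sums of squared deviations about the means.

    Each variance estimate is sum_sq / (n - 1). The larger estimate is
    placed in the numerator (assumed convention), so F >= 1.
    """
    var1 = sum_sq1 / (n1 - 1)
    var2 = sum_sq2 / (n2 - 1)
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1

# Figures taken from Q40.
F, nu1, nu2 = f_ratio(84.4, 8, 102.6, 10)

print("F =", round(F, 3), "with d.f.", (nu1, nu2))
```

The computed F is then compared with the tabulated value of F for (ν1, ν2) d.f. at the chosen level of significance.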
Q40] In one sample of 8 observations from a normal population, the sum of the squares of deviations of the
sample values from the sample mean is 84.4 and in another sample of 10 observations it was 102.6. Test at 5%
level whether the population have the same variance.
Q41] It is known that the mean diameters of rivets produced by two firms A and B are practically the same, but
the standard deviations may differ. For 22 rivets produced by firm A, the S.D. is 2.9 mm, while for 16 rivets
manufactured by firm B, the S.D. is 3.8 mm. Compute the statistic you would use to test whether the products
of firm A have the same variability as those of firm B and test its significance.
Q42] Pumpkins were grown under two experimental conditions. Two random samples of 11 and 9 pumpkins,
show the sample S.Ds. of their weights as 0.8 and 0.5 respectively. Assuming that the weight distributions are
normal, test hypothesis that the true variances are equal.
Q43] The nicotine contents in milligrams in two samples of tobacco were found to be as follows:
Sample A 24 27 26 21 25 --------
Sample B 27 30 28 31 22 36
Can it be said that the two samples have come from the same normal population?
Q44] The time taken by workers in performing a job by method I and method II is given below:
Method I 20 16 26 27 23 22 -----
Method II 27 33 42 35 32 34 38
Do the data show that the variances of time distribution from population from which these samples are drawn
do not differ significantly?
Q45] The measurements of the output of two units have given the following results. Assuming that both
samples have been obtained from the normal populations at 5% significant level, test whether the two
populations have the same variance.
Unit A 14.1 10.1 14.7 13.7 14.0
Unit B 14.0 14.5 13.7 12.7 14.1
Chi-Square (χ2) Test
the right hand tail of the χ²-distribution curve, the entries are χ² values. It is necessary to calculate values of $\chi^2_\alpha$
for α > 0.50 since the χ² curve or distribution is not symmetrical.
5. Mean = ν and variance = 2ν.
The theoretical sampling distribution of the sample variance for random samples from a normal
population is related to the chi-squared distribution as follows:
Let S² be the variance of a random sample of size n, taken from a normal population having the variance
σ².
Then
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma^2}, \quad \text{where } S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1},$$
is a value of a random variable having the χ²-distribution with ν = n – 1 d.o.f.
Exactly 95% of the χ²-distribution lies between $\chi^2_{0.975}$ and $\chi^2_{0.025}$. When the assumed value of σ² is too small,
χ² falls to the right of $\chi^2_{0.025}$, and when the assumed value of σ² is too large, χ² falls to the left of $\chi^2_{0.975}$. Thus, when
the assumed σ² is correct, a χ² value to the left of $\chi^2_{0.975}$ or to the right of $\chi^2_{0.025}$ is possible but unlikely.
APPLICATIONS OF χ2 DISTRIBUTION:
1. To test the goodness of fit.
2. To test the independence of attributes.
3. To test the homogeneity of independent estimation of the population variance.
4. To test the homogeneity of independent estimation of the population correlation coefficient.
We use this test to decide whether the discrepancy between theory and experiment is significant or
not. i.e., to test whether the difference between the theoretical and observed values can be attributed to
chance or not.
1. Null Hypothesis : H0 : There is no significant difference between the observed values and the
corresponding expected values.
2. Alternative Hypothesis : H1 : The above difference is significant.
3. Level of Significance : α, with d.o.f. ν = n – 1.
4. The test statistic : Let O1, O2, ……, On be a set of observed frequencies and E1, E2, ……, En the
corresponding set of expected frequencies. Then the test statistic χ² is given by
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
5. Conclusion : If the calculated value of χ2 > Table value of χ2 at α LOS, the Null Hypothesis H0 is rejected.
Otherwise H0 is accepted.
Conditions of validity :- Following are the conditions which should be satisfied before χ2 test can be applied.
1. The sample observations should be independent.
2. N, the total frequency is large i.e., N>50.
3. The constraints on the cell frequencies, if any, are linear.
4. No theoretical (or expected) frequency should be less than 10. If small theoretical frequencies occur,
the difficulty is overcome by grouping 2 or more classes together before calculating (O – E).
Note that the degrees of freedom are determined from the number of classes after regrouping.
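The goodness-of-fit statistic can be sketched in Python as follows. The example uses the figures of Q47 below (500 students in the ratio 4:3:2:1). The table value 7.815 for 3 d.f. at 5% LOS is a standard chi-square table entry supplied here for illustration, not a value given in these notes; the function name is also illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Figures taken from Q47: failed, third class, second class, first class.
observed = [220, 170, 90, 20]
ratio = [4, 3, 2, 1]
total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]   # 200, 150, 100, 50

chi2 = chi_square_statistic(observed, expected)
dof = len(observed) - 1
table_value = 7.815   # chi-square table value for 3 d.f. at 5% LOS (assumed input)

print("chi-square =", round(chi2, 2), "with", dof, "d.f.")
print("reject H0" if chi2 > table_value else "accept H0", "at 5% LOS")
```

All expected frequencies in this example are well above the minimum required by the conditions of validity, so no regrouping of classes is needed.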
Q46] The numbers of automobile accidents per week in a certain community are as follows: 12, 8, 20, 2, 14, 10,
15, 6, 9, 4. Are these frequencies in agreement with the belief that accident conditions were the same during
this 10 week period?
Q47] A sample analysis of examination results of 500 students was made. It was found that 220 students had
failed, 170 had secured a third class, 90 were placed in second class and 20 got a first class. Are these figures
commensurate with the general examination result, which is in the ratio of 4:3:2:1 for the various categories
respectively?
Q48] A die is thrown 264 times with the following results. Show that the die is biased. [$\chi^2_{0.05} = 11.07$ for 5 d.f.]
Q49] A pair of dice are thrown 360 times and the frequency of each sum is indicated below:
Sum 2 3 4 5 6 7 8 9 10 11 12
O.F. 8 24 35 37 44 65 51 42 26 14 14
Would you say that the dice are fair on the basis of the chi-squared test at 0.05 LOS?
Q50] 4 coins were tossed 160 times and the following results were obtained.
Number of Heads 0 1 2 3 4
Observed Frequency 17 52 54 31 6
Under the assumption that coins are balanced, find the expected frequencies of 0, 1, 2, 3 and 4 heads, and test
the goodness of fit (α = 0.05)
Q51] Fit a Poisson distribution to the following data and test its goodness of fit at LOS 0.05.
X 0 1 2 3 4
f 419 352 154 56 19
Def :- An attribute means a quality or characteristic. Examples of attributes are drinking, smoking, blindness,
honesty, beauty etc.,
Let the observations be classified according to two attributes and the frequencies Oi in the different
categories be shown in a two-way table, called contingency table.
We have to test on the basis of cell frequencies whether the two attributes are independent or not.
1. Null Hypothesis : H0 : There is no association between the attributes i.e., we assume that the two
attributes are independent.
2. Alternative Hypothesis : H1 : The two attributes are dependent.
3. Level of Significance : α
d.f. ν = [No. of rows – 1] × [No. of columns – 1]
For ν d.f., $\chi^2_\alpha$ will be taken from the chi-square table.