BCS301.Module 5

Uploaded by

patilchinmay510
Copyright
© © All Rights Reserved

5.1 The ANOVA Technique


Introduction:

The analysis of variance (ANOVA) is a statistical technique for testing whether the means of three or
more populations are equal. It was developed by R. A. Fisher and is widely used in business and in
the physical sciences.

In this technique, the total variance is split into two parts:

(i) Variance between samples (columns)
(ii) Variance within samples (rows)

A table showing the source of variation, the sum of squares, the degrees of freedom, the mean squares
and the formula for the F ratio is called an ANOVA table.

If the given data are classified according to one factor, the classification is called one-way
classification, and the ANOVA table for one-way classification is constructed.

If the given data are classified according to two factors, the classification is called two-way
classification, and the ANOVA table for two-way classification is constructed.

Analysis of variance is based on the following assumptions:

(i) The samples are drawn independently from the populations.
(ii) The populations from which the samples are selected are normally distributed.
(iii) Each of the populations has the same variance.

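ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares         F ratio
Between samples       SSC              k - 1                MSC = SSC/(k - 1)    F = MSC/MSE
Within samples        SSE              N - k                MSE = SSE/(N - k)    -
Total                 SST              N - 1                -                    -

Expansion of abbreviations:
SSC – Sum of squares between samples (columns)
SSE – Sum of squares within samples (rows)
SST – Total sum of squares of variations
MSC – Mean squares of variations between samples (columns)
MSE – Mean squares of variations within samples (rows)

Notations:
$T$ – Total sum of all the observations
$N$ – Number of observations
$k$ – Number of columns (samples)

How to find SSC and SSE?

Correction factor: $CF = T^2 / N$
$SST = \sum x^2 - CF$, the sum running over all $N$ observations
$SSC = \sum_j T_j^2 / n_j - CF$, where $T_j$ is the total and $n_j$ the number of observations of column $j$
$SSE = SST - SSC$

Working rule:

(i) Assume $H_0$: all the population means are equal.
(ii) Construct the ANOVA table for one-way classification.
(iii) Under $H_0$, find the calculated value $F = MSC/MSE$ and the tabulated value of $F$ at the chosen level of significance for $(k - 1, N - k)$ degrees of freedom.
(iv) If calculated value < tabulated value, accept $H_0$. Reject $H_0$ otherwise.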
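
The working rule can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `one_way_anova` and the dictionary keys are assumptions made here, the computation uses the correction factor T^2/N, and the data are the machine outputs of the first worked problem below.

```python
def one_way_anova(samples):
    """Return the one-way ANOVA quantities for a list of samples (columns)."""
    k = len(samples)                          # number of columns
    N = sum(len(s) for s in samples)          # number of observations
    T = sum(sum(s) for s in samples)          # total of all the observations
    cf = T ** 2 / N                           # correction factor T^2 / N
    sst = sum(x * x for s in samples for x in s) - cf
    ssc = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = sst - ssc
    msc = ssc / (k - 1)
    mse = sse / (N - k)
    return {"SSC": ssc, "SSE": sse, "SST": sst,
            "df": (k - 1, N - k), "MSC": msc, "MSE": mse, "F": msc / mse}


# Three samples of four observations each (outputs of three machines).
table = one_way_anova([[10, 5, 11, 10], [9, 7, 5, 6], [20, 16, 10, 4]])
for name, value in table.items():
    print(name, value)
```

The calculated F is then compared with the tabulated value for the degrees of freedom stored under `df`.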

1. Three different machines are used for a production. On the basis of the outputs, test
whether the machines are equally effective.

Output
Machine 1 Machine 2 Machine 3
10 9 20
5 7 16
11 5 10
10 6 4

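Solution:

Null hypothesis $H_0$: All three machines are equally effective.

To find: SSC and SSE

Output

        Machine 1      Machine 2      Machine 3
        x     x^2      x     x^2      x     x^2
        10    100      9     81       20    400
        5     25       7     49       16    256
        11    121      5     25       10    100
        10    100      6     36       4     16
Total   36    346      27    191      50    772

$T = 36 + 27 + 50 = 113$, $N = 12$, $CF = T^2/N = 113^2/12 = 1064.08$
$SST = (346 + 191 + 772) - CF = 1309 - 1064.08 = 244.92$
$SSC = (36^2 + 27^2 + 50^2)/4 - CF = 1131.25 - 1064.08 = 67.17$
$SSE = SST - SSC = 244.92 - 67.17 = 177.75$

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples       67.17            2                    33.58          1.70
Within samples        177.75           9                    19.75          -

Calculated value: $F = 33.58/19.75 = 1.70$

Critical value:
Degrees of freedom between samples: 2
Degrees of freedom within samples: 9
Tabulated value of F at the 5% level of significance for (2, 9) degrees of freedom: 4.26

Comparison:

Calculated value (1.70) < Tabulated value (4.26).

Conclusion:

Accept $H_0$. All three machines are equally effective.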
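
When three samples are compared, the numerator of F has 2 degrees of freedom, and in that special case the upper tail of the F distribution has a closed form, $P(F > f) = (1 + 2f/\nu_2)^{-\nu_2/2}$, so the tabulated value can be checked without a table. The sketch below is valid only for a numerator with 2 degrees of freedom; the function names are hypothetical.

```python
def f_tail_probability(f, df_within):
    """P(F > f) for an F distribution with (2, df_within) degrees of freedom."""
    return (1 + 2 * f / df_within) ** (-df_within / 2)


def f_critical(alpha, df_within):
    """Value exceeded with probability alpha under F(2, df_within)."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


# Three samples and 12 observations give (2, 9) degrees of freedom.
print(round(f_critical(0.05, 9), 2))
print(round(f_tail_probability(f_critical(0.05, 9), 9), 4))
```

For any other numerator degrees of freedom the printed F table (or a statistics library) is still needed.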

2. Three samples each of size 5 were drawn from three uncorrelated normal populations
with equal variances. Test the hypothesis that the population means are equal at the
5% level.

Sample 1 10 12 9 16 13
Sample 2 9 7 12 11 11
Sample 3 14 11 15 14 16
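Solution:

Null hypothesis $H_0$: All three samples have equal population means.

To find: SSC and SSE

Output

        Sample 1       Sample 2       Sample 3
        x     x^2      x     x^2      x     x^2
        10    100      9     81       14    196
        12    144      7     49       11    121
        9     81       12    144      15    225
        16    256      11    121      14    196
        13    169      11    121      16    256
Total   60    750      50    516      70    994

$T = 60 + 50 + 70 = 180$, $N = 15$, $CF = T^2/N = 180^2/15 = 2160$
$SST = (750 + 516 + 994) - CF = 2260 - 2160 = 100$
$SSC = (60^2 + 50^2 + 70^2)/5 - CF = 2200 - 2160 = 40$
$SSE = SST - SSC = 100 - 40 = 60$

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples       40               2                    20             4.00
Within samples        60               12                   5              -

Calculated value: $F = 20/5 = 4.00$

Critical value:
Degrees of freedom between samples: 2
Degrees of freedom within samples: 12
Tabulated value of F at the 5% level of significance for (2, 12) degrees of freedom: 3.89

Comparison:

Calculated value (4.00) > Tabulated value (3.89).

Conclusion:

Reject $H_0$. The three population means are not all equal.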
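
The sums of squares can also be cross-checked directly from their definitions, as deviations from the means, instead of through the correction factor. The following is a small sketch of that check on the three samples of this problem; the variable names are chosen here for illustration.

```python
from statistics import mean

samples = [[10, 12, 9, 16, 13], [9, 7, 12, 11, 11], [14, 11, 15, 14, 16]]

grand_mean = mean(x for s in samples for x in s)

# Between samples: squared distance of each sample mean from the grand mean,
# weighted by the sample size.
ssc = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)

# Within samples: squared distance of each observation from its own sample mean.
sse = sum((x - mean(s)) ** 2 for s in samples for x in s)

k = len(samples)
N = sum(len(s) for s in samples)
f_ratio = (ssc / (k - 1)) / (sse / (N - k))
print(ssc, sse, f_ratio)
```

Both routes give the same SSC and SSE; the correction-factor route is usually quicker by hand.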

3. A Manager of a merchandizing firm wishes to test whether its three salesmen A, B, C
tend to make sales of the same size or whether they differ in their selling abilities.
During a week there have been 14 sales calls, A made 5 calls, B made 4 calls and C
made 5 calls. The following are the weekly sales records (in rupees) of the three salesmen:
A 500 400 700 300 600
B 300 700 400 600
C 500 300 500 400 300
Perform the analysis of variance and draw your conclusions.

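Solution:
The sales data have a common factor of 100. Divide all the above values by 100.

A   5 4 7 3 6
B   3 7 4 6
C   5 3 5 4 3

Null hypothesis $H_0$: All three salesmen tend to make sales of the same size.

To find: SSC and SSE

Output

        A              B              C
        x     x^2      x     x^2      x     x^2
        5     25       3     9        5     25
        4     16       7     49       3     9
        7     49       4     16       5     25
        3     9        6     36       4     16
        6     36       -     -        3     9
Total   25    135      20    110      20    84

$T = 25 + 20 + 20 = 65$, $N = 14$, $CF = T^2/N = 65^2/14 = 301.79$
$SST = (135 + 110 + 84) - CF = 329 - 301.79 = 27.21$
$SSC = 25^2/5 + 20^2/4 + 20^2/5 - CF = 305 - 301.79 = 3.21$
$SSE = SST - SSC = 27.21 - 3.21 = 24.00$

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples       3.21             2                    1.61           0.74
Within samples        24.00            11                   2.18           -

Calculated value: $F = 1.61/2.18 = 0.74$

Critical value:
Degrees of freedom between samples: 2
Degrees of freedom within samples: 11
Tabulated value of F at the 5% level of significance for (2, 11) degrees of freedom: 3.98

Comparison:

Calculated value (0.74) < Tabulated value (3.98).

Conclusion:

Accept $H_0$. All three salesmen tend to make sales of the same size.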
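
Dividing every observation by the common factor 100 is harmless because F is a ratio of two mean squares, so the factor cancels. A short sketch, assuming the sales figures as listed in the problem statement and a helper name `f_ratio` chosen for this illustration, confirms it for samples of unequal size:

```python
def f_ratio(samples):
    """F = MSC / MSE for a one-way classification; samples may differ in size."""
    k = len(samples)
    N = sum(len(s) for s in samples)
    cf = sum(sum(s) for s in samples) ** 2 / N
    sst = sum(x * x for s in samples for x in s) - cf
    ssc = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    return (ssc / (k - 1)) / ((sst - ssc) / (N - k))


rupees = [[500, 400, 700, 300, 600], [300, 700, 400, 600], [500, 300, 500, 400, 300]]
scaled = [[x / 100 for x in s] for s in rupees]

print(f_ratio(rupees), f_ratio(scaled))
```

The individual sums of squares do change (by a factor of 100^2), but the decision about the null hypothesis does not.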

4. Three samples of five, five and four car tyres are drawn respectively from three
brands A, B, and C manufactured by three machines. The lifetimes of these tyres (in
thousands of miles) are given below. Test whether the average lifetimes of the three
brands of tyres are equal or not.

A 35 40 33 36 31
B 30 25 34 28 33
C 28 24 30 26 -

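Solution:

Subtract 30 from each of the given values.

Null hypothesis $H_0$: The average lifetimes of the three brands of tyres are equal.

To find: SSC and SSE

Output

        A              B              C
        x     x^2      x     x^2      x     x^2
        5     25       0     0        -2    4
        10    100      -5    25       -6    36
        3     9        4     16       0     0
        6     36       -2    4        -4    16
        1     1        3     9        -     -
Total   25    171      0     54       -12   56

$T = 25 + 0 - 12 = 13$, $N = 14$, $CF = T^2/N = 13^2/14 = 12.07$
$SST = (171 + 54 + 56) - CF = 281 - 12.07 = 268.93$
$SSC = 25^2/5 + 0^2/5 + (-12)^2/4 - CF = 161 - 12.07 = 148.93$
$SSE = SST - SSC = 268.93 - 148.93 = 120.00$

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples       148.93           2                    74.46          6.83
Within samples        120.00           11                   10.91          -

Calculated value: $F = 74.46/10.91 = 6.83$

Critical value:
Degrees of freedom between samples: 2
Degrees of freedom within samples: 11
Tabulated value of F at the 5% level of significance for (2, 11) degrees of freedom: 3.98

Comparison:

Calculated value (6.83) > Tabulated value (3.98).

Conclusion:

Reject $H_0$. The average lifetimes of the three brands of tyres are not equal.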
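
Subtracting a constant from every observation (coding) shortens the arithmetic but leaves every sum of squares unchanged, because only deviations matter. The sketch below illustrates this on the tyre data; the helper name `sums_of_squares` is an assumption of this example.

```python
def sums_of_squares(samples):
    """Return (SSC, SSE) by the correction-factor method."""
    N = sum(len(s) for s in samples)
    cf = sum(sum(s) for s in samples) ** 2 / N
    sst = sum(x * x for s in samples for x in s) - cf
    ssc = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    return ssc, sst - ssc


tyres = [[35, 40, 33, 36, 31], [30, 25, 34, 28, 33], [28, 24, 30, 26]]
coded = [[x - 30 for x in s] for s in tyres]

print(sums_of_squares(tyres))
print(sums_of_squares(coded))
```

Both calls print the same pair of values up to rounding, so the ANOVA table may be built from whichever version is easier to handle.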

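5. To assess the significance of possible variation in performance in a certain test
between the grammar schools of a city, a common test was given to a number of
students taken at random from the senior fifth class of each of the four schools
concerned. The results are given below. Make an analysis of variance of the data.

Schools (one column per school)
8     12    18    13
10    11    12    9
12    9     16    12
8     14    6     16
7     4     8     15

Solution:

Subtract 10 from each of the given values.

Null hypothesis $H_0$: The samples have come from the same universe.

To find: SSC and SSE

Output

        x1    x1^2     x2    x2^2     x3    x3^2     x4    x4^2
        -2    4        2     4        8     64       3     9
        0     0        1     1        2     4        -1    1
        2     4        -1    1        6     36       2     4
        -2    4        4     16       -4    16       6     36
        -3    9        -6    36       -2    4        5     25
Total   -5    21       0     58       10    124      15    75

$T = -5 + 0 + 10 + 15 = 20$, $N = 20$, $CF = T^2/N = 20^2/20 = 20$
$SST = (21 + 58 + 124 + 75) - CF = 278 - 20 = 258$
$SSC = ((-5)^2 + 0^2 + 10^2 + 15^2)/5 - CF = 70 - 20 = 50$
$SSE = SST - SSC = 258 - 50 = 208$

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples       50               3                    16.67          1.28
Within samples        208              16                   13.00          -

Calculated value: $F = 16.67/13.00 = 1.28$

Critical value:
Degrees of freedom between samples: 3
Degrees of freedom within samples: 16
Tabulated value of F at the 5% level of significance for (3, 16) degrees of freedom: 3.24

Comparison:

Calculated value (1.28) < Tabulated value (3.24).

Conclusion:

Accept $H_0$. The samples have come from the same universe.
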
6. The three samples below have been obtained from normal populations with equal
variances. Test the hypothesis that the population means are equal.

Solution:

Subtract 10 from each of the given values.

Null hypothesis $H_0$: All three samples have been taken from the same population.

To find: SSC and SSE

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples
Within samples

Calculated value:

Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples

Comparison:

Calculated value > Tabulated value.

Conclusion:

Reject $H_0$. The three samples have not all been taken from the same population.

7. The following table gives the yields on 15 sample plots under three varieties of seeds:

Find out if the average yields of land under different varieties of seeds show
significant differences.

Solution: Subtract 20 from each of the given values.

Null hypothesis $H_0$: The average yields of land under different varieties of seeds do not show
significant differences.

To find: SSC and SSE

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples
Within samples

Calculated value:

Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples

Comparison:

Calculated value > Tabulated value.

Conclusion:

Reject $H_0$. The average yields of land under different varieties of seeds show significant differences.

8. Test the significance of the variation of the retail prices of a commodity in three
cities Mumbai, Chennai and Bengaluru. Four shops were chosen at random in each
city and prices observed in rupees were as follows:

Do the data indicate that the prices in the three cities are significantly different?

Solution:

Subtract 10 from each of the given values.

Null hypothesis $H_0$: There is no significant difference in the prices in the three cities.

To find: SSC and SSE

Construction of ANOVA table for one-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples
Within samples

Calculated value:

Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples

Comparison:

Calculated value < Tabulated value.

Conclusion:

Accept $H_0$. The prices in the three cities are not significantly different.

ANOVA for two-way classification

In a two-way classification, the data are classified according to two different criteria or factors.

Expansion of abbreviations:
SSC – Sum of squares between columns
SSR – Sum of squares between rows
SST – Total sum of squares of variations
SSE – Sum of squares due to errors
CF – Correction factor
MSC – Mean squares of variations between columns
MSR – Mean squares of variations between rows
MSE – Mean squares of variations due to errors

ANOVA table for two-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares                   F ratio
Between columns       SSC              c - 1                MSC = SSC/(c - 1)              F_C = MSC/MSE
Between rows          SSR              r - 1                MSR = SSR/(r - 1)              F_R = MSR/MSE
Error                 SSE              (c - 1)(r - 1)       MSE = SSE/((c - 1)(r - 1))     -

How to find SSC, SSR, SSE and SST from a table with $r$ rows and $c$ columns?

Notation:
$C_j$ – Column totals
$R_i$ – Row totals
$T$ – Grand total
$N$ – Total number of elements ($N = rc$)

Correction factor: $CF = T^2 / N$
$SST = \sum x^2 - CF$
$SSC = \sum_j C_j^2 / r - CF$
$SSR = \sum_i R_i^2 / c - CF$
$SSE = SST - SSC - SSR$

Working rule:

(i) Assume $H_0$: There is no significant difference between rows and between columns.
(ii) Construct the ANOVA table for two-way classification.
(iii) Under $H_0$, find the calculated values $F_C = MSC/MSE$ and $F_R = MSR/MSE$, and the tabulated value of $F$ at the $\alpha$ level of significance for $(\nu_1, \nu_2)$ degrees of freedom,
where $\nu_1$ – Degrees of freedom of the numerator
      $\nu_2$ – Degrees of freedom of the denominator
(iv) If calculated value < tabulated value, accept $H_0$. Reject $H_0$ otherwise.
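
The two-way working rule can be sketched as a short Python function. It is a minimal illustration for a table without repeated values; the name `two_way_anova`, the dictionary keys and the choice of rows as the outer list are assumptions made here. The data are the refrigerator sales of the second worked problem below (rows are months, columns are salesmen).

```python
def two_way_anova(table):
    """Two-way ANOVA without replication; `table` is a list of rows."""
    r, c = len(table), len(table[0])
    N = r * c
    T = sum(map(sum, table))                   # grand total
    cf = T ** 2 / N                            # correction factor
    sst = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf
    sse = sst - ssc - ssr
    mse = sse / ((r - 1) * (c - 1))
    return {"SSC": ssc, "SSR": ssr, "SSE": sse, "SST": sst,
            "F_columns": ssc / (c - 1) / mse,
            "F_rows": ssr / (r - 1) / mse}


# 3 x 4 table: rows are months, columns are salesmen.
sales = [[50, 40, 48, 39], [46, 48, 50, 45], [39, 44, 40, 39]]
result = two_way_anova(sales)
print(result)
```

Each F value is compared with the tabulated value for its own numerator degrees of freedom and the error degrees of freedom.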

1. In a certain factory, production can be accomplished by four different workers on
five different machines. A simple study, in context of a two-way design without
repeated values, is being made with two-fold objectives of examining whether the four
workers differ with respect to mean productivity and whether the mean productivity
is the same for the five different machines. The researcher involved in this study
reports while analyzing the gathered data as under:
(i) Sum of squares of variance between machines is 35.2
(ii) Sum of squares of variance between workmen is 53.8
(iii) Sum of squares for total variance is 174.2
Set up the ANOVA table for the given information and draw the inference about variance
at the 5% level of significance.

Solution:
Null hypothesis $H_0$:
(i) No significant difference between the mean productivity of the four workers (between
columns).
(ii) No significant difference between the mean productivity of the five different machines
(between rows).

By data, $SSR = 35.2$ (machines), $SSC = 53.8$ (workers), $SST = 174.2$

Therefore, $SSE = SST - SSC - SSR = 174.2 - 53.8 - 35.2 = 85.2$

ANOVA table for two-way classification:

Source of variation          Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns (workers)    53.8             3                    17.93          2.53
Between rows (machines)      35.2             4                    8.80           1.24
Error                        85.2             12                   7.10           -

Source of variation   Calculated value   Critical value at 5% level   Comparison
Between workers       2.53               F(3, 12) = 3.49              2.53 < 3.49
Between machines      1.24               F(4, 12) = 3.26              1.24 < 3.26

(i) Accept $H_0$.
No significant difference between the mean productivity of the four workers.
(ii) Accept $H_0$.
No significant difference between the mean productivity of the five machines.
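
When only the sums of squares are reported, as in this problem, the rest of the table follows by arithmetic. A brief sketch with the figures given in the problem (the variable names are illustrative):

```python
# Figures reported in the problem: 4 workers (columns), 5 machines (rows).
workers, machines = 4, 5
ss_workers, ss_machines, ss_total = 53.8, 35.2, 174.2

ss_error = ss_total - ss_workers - ss_machines
df_workers, df_machines = workers - 1, machines - 1
df_error = df_workers * df_machines

ms_error = ss_error / df_error
f_workers = (ss_workers / df_workers) / ms_error
f_machines = (ss_machines / df_machines) / ms_error

print(round(ss_error, 1), df_error, round(f_workers, 2), round(f_machines, 2))
```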

2. The following table gives the number of refrigerators sold by 4 salesmen in three
months May, June and July:
Month Salesmen
A B C D
May 50 40 48 39
June 46 48 50 45
July 39 44 40 39

(i) Is there any significant difference in the sales made by the four salesmen?
(ii) Is there any significant difference in the sales made during different months?

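Solution:

The given data are coded by subtracting 40 from each observation.

Sum of observations:

        A     B     C     D     Total
May     10    0     8     -1    17
June    6     8     10    5     29
July    -1    4     0     -1    2
Total   15    12    18    3     48

Sum of squares of observations:

        A     B     C     D     Total
May     100   0     64    1     165
June    36    64    100   25    225
July    1     16    0     1     18
Total   137   80    164   27    408

Null hypothesis $H_0$:
(i) There is no significant difference in the sales made by the four salesmen (between
columns).
(ii) There is no significant difference in the sales made during different months
(between rows).

To find: SSC, SSR, SSE

$T = 48$, $N = 12$, $CF = T^2/N = 48^2/12 = 192$
$SST = 408 - 192 = 216$
$SSC = (15^2 + 12^2 + 18^2 + 3^2)/3 - CF = 234 - 192 = 42$
$SSR = (17^2 + 29^2 + 2^2)/4 - CF = 283.5 - 192 = 91.5$
$SSE = SST - SSC - SSR = 216 - 42 - 91.5 = 82.5$

ANOVA table for two-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns       42               3                    14.00          1.02
Between rows          91.5             2                    45.75          3.33
Error                 82.5             6                    13.75          -

Source of variation   Calculated value   Critical value at 5% level   Comparison    Conclusion
Between salesmen      1.02               F(3, 6) = 4.76               1.02 < 4.76   Accept $H_0$
Between months        3.33               F(2, 6) = 5.14               3.33 < 5.14   Accept $H_0$

Conclusion:

(i) Accept $H_0$. There is no significant difference in the sales made by the four salesmen.
(ii) Accept $H_0$. There is no significant difference in the sales made during different months.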

3. Perform a two-way ANOVA on the data given below:


Plot of land Treatment
A B C D
I 38 40 41 39
II 45 42 49 36
III 40 38 42 42

(i) Is there any significant difference between the treatments?


(ii) Is there any significant difference between the Plots?

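The given data are coded by subtracting 40 from each observation.

Sum of observations:

        A     B     C     D     Total
I       -2    0     1     -1    -2
II      5     2     9     -4    12
III     0     -2    2     2     2
Total   3     0     12    -3    12

Sum of squares of observations:

        A     B     C     D     Total
I       4     0     1     1     6
II      25    4     81    16    126
III     0     4     4     4     12
Total   29    8     86    21    144

Null hypothesis $H_0$:
(i) There is no significant difference between treatments (between columns).
(ii) There is no significant difference between plots (between rows).

To find: SSC, SSR, SSE

$T = 12$, $N = 12$, $CF = T^2/N = 12^2/12 = 12$
$SST = 144 - 12 = 132$
$SSC = (3^2 + 0^2 + 12^2 + (-3)^2)/3 - CF = 54 - 12 = 42$
$SSR = ((-2)^2 + 12^2 + 2^2)/4 - CF = 38 - 12 = 26$
$SSE = SST - SSC - SSR = 132 - 42 - 26 = 64$

ANOVA table for two-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns       42               3                    14.00          1.31
Between rows          26               2                    13.00          1.22
Error                 64               6                    10.67          -

Source of variation   Calculated value   Critical value at 5% level   Comparison    Conclusion
Between treatments    1.31               F(3, 6) = 4.76               1.31 < 4.76   Accept $H_0$
Between plots         1.22               F(2, 6) = 5.14               1.22 < 5.14   Accept $H_0$

Conclusion:

(i) Accept $H_0$. There is no significant difference between treatments.
(ii) Accept $H_0$. There is no significant difference between plots.
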
4. A tea company appoints 4 salesmen A, B, C and D and observes their sales in 3
seasons- Summer, Winter and Monsoon. The figures (in lakhs) are given in the
following table:

Seasons Salesman A Salesman B Salesman C Salesman D


Summer 36 36 21 35
Winter 28 29 31 32
Monsoon 26 28 29 29

(i) Do the salesmen significantly differ in performance?


(ii) Is there significant difference between the seasons?

Solution: The given data are coded by subtracting 30 from each observation.

Null hypothesis $H_0$:
(i) There is no significant difference between the performance of the salesmen (between
columns).
(ii) There is no significant difference between the seasons (between rows).

Sum of observations:

          A     B     C     D     Total
Summer    6     6     -9    5     8
Winter    -2    -1    1     2     0
Monsoon   -4    -2    -1    -1    -8
Total     0     3     -9    6     0

Sum of squares of observations:

          A     B     C     D     Total
Summer    36    36    81    25    178
Winter    4     1     1     4     10
Monsoon   16    4     1     1     22
Total     56    41    83    30    210

To find: SSC, SSR, SSE

$T$ = Total sum of observations = 0
$N$ = Total number of observations = 12
$CF = T^2/N = 0$
$SST = 210 - 0 = 210$
$SSC = (0^2 + 3^2 + (-9)^2 + 6^2)/3 - CF = 42$
$SSR = (8^2 + 0^2 + (-8)^2)/4 - CF = 32$
$SSE = SST - SSC - SSR = 210 - 42 - 32 = 136$

ANOVA table for two-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns       42               3                    14.00          0.62
Between rows          32               2                    16.00          0.71
Error                 136              6                    22.67          -

Source of variation   Calculated value   Critical value at 5% level   Comparison    Conclusion
Between salesmen      0.62               F(3, 6) = 4.76               0.62 < 4.76   Accept $H_0$
Between seasons       0.71               F(2, 6) = 5.14               0.71 < 5.14   Accept $H_0$

Conclusion:

(i) Accept $H_0$. There is no significant difference between the performance of the salesmen.
(ii) Accept $H_0$. There is no significant difference between the seasons.

5. To study the performance of three detergents and three different water temperatures
the following whiteness readings were obtained with specially designed equipment.

Water temperature Detergent A Detergent B Detergent C


Cold water 57 55 67
Warm water 49 52 68
Hot water 54 46 58
Perform a two-way analysis using 5% level of significance. (Given )

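Solution:

The given data are coded by subtracting 50 from each observation.

Sum of observations:

             Det A   Det B   Det C   Total
Cold water   7       5       17      29
Warm water   -1      2       18      19
Hot water    4       -4      8       8
Total        10      3       43      56

Sum of squares of observations:

             Det A   Det B   Det C   Total
Cold water   49      25      289     363
Warm water   1       4       324     329
Hot water    16      16      64      96
Total        66      45      677     788

Null hypothesis $H_0$:
(i) The performance of the three detergents is equal (between columns).
(ii) The performance at the three water temperatures is equal (between rows).

To find: SSC, SSR, SSE

$T = 56$, $N = 9$, $CF = T^2/N = 56^2/9 = 348.44$
$SST = 788 - 348.44 = 439.56$
$SSC = (10^2 + 3^2 + 43^2)/3 - CF = 652.67 - 348.44 = 304.22$
$SSR = (29^2 + 19^2 + 8^2)/3 - CF = 422 - 348.44 = 73.56$
$SSE = SST - SSC - SSR = 439.56 - 304.22 - 73.56 = 61.78$

ANOVA table for two-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns       304.22           2                    152.11         9.85
Between rows          73.56            2                    36.78          2.38
Error                 61.78            4                    15.44          -
Total                 439.56           8                    -              -

Source of variation          Calculated value   Critical value at 5% level   Comparison    Conclusion
Between detergents           9.85               F(2, 4) = 6.94               9.85 > 6.94   Reject $H_0$
Between water temperatures   2.38               F(2, 4) = 6.94               2.38 < 6.94   Accept $H_0$

Conclusion:
(i) Reject $H_0$. The performance of the three detergents is not equal.
(ii) Accept $H_0$. The performance at the three water temperatures is equal.
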
6. A farmer applies three types of fertilizers on 4 separate plots. The figures on yield per
acre are tabulated below:
Plots Yield
Fertilizers A B C D Total
Nitrogen 6 4 8 6 24
Potash 7 6 6 9 28
Phosphates 8 5 10 9 32
Total 21 15 24 24 84
Find out if the plots are materially different in fertility as also, if three fertilizers
make any material difference in yields.

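Solution:

Null hypothesis $H_0$:
(i) The plots are equally fertile (between columns).
(ii) The fertilizers are equally effective (between rows).

To find: SSC, SSR, SSE

$T = 84$, $N = 12$, $CF = T^2/N = 84^2/12 = 588$
$SST = (6^2 + 4^2 + 8^2 + 6^2 + 7^2 + 6^2 + 6^2 + 9^2 + 8^2 + 5^2 + 10^2 + 9^2) - CF = 624 - 588 = 36$
$SSC = (21^2 + 15^2 + 24^2 + 24^2)/3 - CF = 606 - 588 = 18$
$SSR = (24^2 + 28^2 + 32^2)/4 - CF = 596 - 588 = 8$
$SSE = SST - SSC - SSR = 36 - 18 - 8 = 10$

ANOVA table for two-way classification:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns       18               3                    6.00           3.60
Between rows          8                2                    4.00           2.40
Error                 10               6                    1.67           -

Source of variation   Calculated value   Critical value at 5% level   Comparison    Conclusion
Between plots         3.60               F(3, 6) = 4.76               3.60 < 4.76   Accept $H_0$
Between fertilizers   2.40               F(2, 6) = 5.14               2.40 < 5.14   Accept $H_0$

(i) Accept $H_0$. The plots are equally fertile.
(ii) Accept $H_0$. The fertilizers are equally effective.
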
7. Set up the ANOVA table for the following information relating to three drugs tested to
judge their effectiveness in reducing blood pressure for three different groups of
people.
Group of people

Do the drugs act differently? Are the different groups of people affected differently?
Is the interaction term significant? Answer the above questions taking a significance
level of 5%.
Solution:
Null hypothesis $H_0$:
(i) The three drugs do not act differently.
(ii) The three groups of people are not affected differently.
(iii) The interaction term is not significant.

Code by subtracting 10 from each observation.

Sum of observations:

        X     Y     Z     Total
A                         10
B
C
Total                     7

To find: SSC, SSR, SSE

ANOVA table:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns
Between rows
Interaction
Errors
Total

Source of variation   Calculated value   Critical value at 5% level   Comparison   Conclusion
Between columns
Between rows
Interaction

Conclusion:
(i) Reject $H_0$. There is a significant difference between columns.
The drugs act differently.
(ii) Reject $H_0$. There is a significant difference between rows.
The groups of people are affected differently.
(iii) Reject $H_0$. The interaction between the drugs and the groups of people is significant.

5.2 Latin square design
Introduction:

Latin square: A Latin square of order $n$ is an arrangement of $n$ symbols in $n^2$ cells arranged in
$n$ rows and $n$ columns such that each symbol occurs once and only once in each row and in each
column.

Example: Latin square of order 4, choose four symbols – A, B, C and D. These letters are Latin
letters which are used as symbols. Write them in a way such that each of the letters out of A, B,
C and D occurs once and only once in each row and each column.
A B C D
B C D A
C D A B
D A B C
This is a Latin square.
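
The arrangement above is obtained by shifting the letters one place to the left in each successive row. As a small sketch (the function names are chosen here for illustration), the construction and the defining property can be written in Python as:

```python
def cyclic_latin_square(symbols):
    """Build a Latin square by shifting the symbol list one place per row."""
    n = len(symbols)
    return [[symbols[(i + j) % n] for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """True if every row and every column holds each symbol exactly once."""
    symbols = set(square[0])
    n = len(square)
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```

The cyclic shift is only one of many Latin squares of a given order; in an experiment the rows, columns and letters are normally randomised.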

Latin square design: The Latin square design (LSD) is an incomplete three-way layout in which the
three factors are rows, columns and treatments.

Example: Suppose four different brands of petrol are to be compared with respect to the mileage
per litre achieved in four motor cars. The important factors responsible for the variation in mileage
are the 4 cars, the 4 drivers and the 4 petrol brands.

Expansion of abbreviations:
SSR – Sum of squares between rows
SSC – Sum of squares between columns
SSL – Sum of squares between letters
SSE – Sum of squares of errors
MSR – Mean squares of variations between rows
MSC – Mean squares of variations between columns
MSL – Mean squares of variations between letters
MSE – Mean squares of errors

Notations:
$T$ – Total sum of all the observations
$N$ – Number of observations ($N = n^2$)
$n$ – Order of the Latin square (the number of rows, of columns and of letters)

ANOVA table for Latin Square Design:

Source of variation   Sum of squares   Degrees of freedom   Mean squares                   F ratio         Critical value
Rows                  SSR              n - 1                MSR = SSR/(n - 1)              F_R = MSR/MSE   F(n - 1, (n - 1)(n - 2))
Columns               SSC              n - 1                MSC = SSC/(n - 1)              F_C = MSC/MSE   F(n - 1, (n - 1)(n - 2))
Letters               SSL              n - 1                MSL = SSL/(n - 1)              F_L = MSL/MSE   F(n - 1, (n - 1)(n - 2))
Error                 SSE              (n - 1)(n - 2)       MSE = SSE/((n - 1)(n - 2))     -               -

Working rule:

(i) Assume $H_0$: There is no significant difference between rows, between columns and
between letters.
(ii) Construct the ANOVA table for the Latin square design.
(iii) Under $H_0$, find the calculated values $F_R$, $F_C$ and $F_L$, and the tabulated value of $F$ at the $\alpha$ level of significance for $(\nu_1, \nu_2)$ degrees of freedom,
where $\nu_1$ – Degrees of freedom of the numerator
      $\nu_2$ – Degrees of freedom of the denominator
(iv) If calculated value < tabulated value, accept $H_0$. Reject $H_0$ otherwise.

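1. Analyze and interpret the following statistics concerning the output of wheat per field
obtained as a result of an experiment conducted to test four varieties of wheat, viz. A,
B, C and D, under a Latin square design.

C 25    B 23    A 20    D 20
A 19    D 19    C 21    B 18
B 19    A 14    D 17    C 20
D 17    C 20    B 21    A 15

Solution: Assume the null hypothesis $H_0$:

(i) There is no significant difference between rows.
(ii) There is no significant difference between columns.
(iii) There is no significant difference between the four varieties of wheat.

Code the data by subtracting 20 from each value.

Sum of observations:

                                        Total
        C 5     B 3     A 0     D 0     8
        A -1    D -1    C 1     B -2    -3
        B -1    A -6    D -3    C 0     -10
        D -3    C 0     B 1     A -5    -7
Total   0       -4      -1      -7      -12

Letter totals: A = -12, B = 1, C = 6, D = -7

Sum of squares of observations:

                                        Total
        25      9       0       0       34
        1       1       1       4       7
        1       36      9       0       46
        9       0       1       25      35
Total   36      46      11      29      122

Therefore, $T = -12$, $N = 16$, $CF = T^2/N = (-12)^2/16 = 9$

To find:
$SST = 122 - 9 = 113$
$SSR = (8^2 + (-3)^2 + (-10)^2 + (-7)^2)/4 - CF = 55.5 - 9 = 46.5$
$SSC = (0^2 + (-4)^2 + (-1)^2 + (-7)^2)/4 - CF = 16.5 - 9 = 7.5$
$SSL = ((-12)^2 + 1^2 + 6^2 + (-7)^2)/4 - CF = 57.5 - 9 = 48.5$
$SSE = SST - SSR - SSC - SSL = 113 - 46.5 - 7.5 - 48.5 = 10.5$

ANOVA table for Latin Square Design:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio   Critical value at 5% level
Rows                  46.5             3                    15.50          8.86      F(3, 6) = 4.76
Columns               7.5              3                    2.50           1.43      F(3, 6) = 4.76
Letters               48.5             3                    16.17          9.24      F(3, 6) = 4.76
Error                 10.5             6                    1.75           -         -

Conclusion:
(i) Calculated value (8.86) > Critical value (4.76). Reject $H_0$.
There is a significant difference between rows.
(ii) Calculated value (1.43) < Critical value (4.76). Accept $H_0$.
There is no significant difference between columns.
(iii) Calculated value (9.24) > Critical value (4.76). Reject $H_0$.
There is a significant difference between the four varieties of wheat.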
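
The Latin square computation can be sketched in Python as follows. This is an illustrative helper, not a library routine; the name `latin_square_anova` and its return format are assumptions. The letters of the first row are taken as C B A D, the only arrangement that completes a Latin square with the other three rows of the problem.

```python
def latin_square_anova(letters, values):
    """Sums of squares and F ratios for an n x n Latin square design."""
    n = len(values)
    N = n * n
    T = sum(map(sum, values))
    cf = T ** 2 / N                                        # correction factor
    sst = sum(x * x for row in values for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in values) / n - cf
    ssc = sum(sum(col) ** 2 for col in zip(*values)) / n - cf
    letter_totals = {}
    for letter_row, value_row in zip(letters, values):
        for letter, x in zip(letter_row, value_row):
            letter_totals[letter] = letter_totals.get(letter, 0) + x
    ssl = sum(t ** 2 for t in letter_totals.values()) / n - cf
    sse = sst - ssr - ssc - ssl
    mse = sse / ((n - 1) * (n - 2))
    f = {name: ss / (n - 1) / mse
         for name, ss in (("rows", ssr), ("columns", ssc), ("letters", ssl))}
    return {"SSR": ssr, "SSC": ssc, "SSL": ssl, "SSE": sse, "F": f}


# First-row letters C B A D are deduced from the Latin square property.
letters = ["CBAD", "ADCB", "BADC", "DCBA"]
yields = [[25, 23, 20, 20], [19, 19, 21, 18], [19, 14, 17, 20], [17, 20, 21, 15]]
result = latin_square_anova(letters, yields)
print(result)
```

The function works on the uncoded yields; coding by subtracting 20 would give the same sums of squares.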

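2. Present your conclusions after doing an analysis of variance of the following results of
a Latin square design experiment conducted in respect of five fertilizers which
were used on plots of different fertility.

A 16    B 10    C 11    D 9     E 9
E 10    C 9     A 14    B 12    D 11
B 15    D 8     E 8     C 10    A 18
D 12    E 6     B 13    A 13    C 12
C 13    A 11    D 10    E 7     B 14

Solution: Assume the null hypothesis $H_0$:

(i) There is no significant difference between rows.
(ii) There is no significant difference between columns.
(iii) There is no significant difference between fertilizers.

Code the data by subtracting 10 from each value.

Sum of observations:

                                                Total
        A 6     B 0     C 1     D -1    E -1    5
        E 0     C -1    A 4     B 2     D 1     6
        B 5     D -2    E -2    C 0     A 8     9
        D 2     E -4    B 3     A 3     C 2     6
        C 3     A 1     D 0     E -3    B 4     5
Total   16      -6      6       1       14      31

Letter totals: A = 22, B = 14, C = 5, D = 0, E = -10

Sum of squares of observations:

                                                Total
        36      0       1       1       1       39
        0       1       16      4       1       22
        25      4       4       0       64      97
        4       16      9       9       4       42
        9       1       0       9       16      35
Total   74      22      30      23      86      235

Therefore, $T = 31$, $N = 25$, $CF = T^2/N = 31^2/25 = 38.44$

To find:
$SST = 235 - 38.44 = 196.56$
$SSR = (5^2 + 6^2 + 9^2 + 6^2 + 5^2)/5 - CF = 40.60 - 38.44 = 2.16$
$SSC = (16^2 + (-6)^2 + 6^2 + 1^2 + 14^2)/5 - CF = 105 - 38.44 = 66.56$
$SSL = (22^2 + 14^2 + 5^2 + 0^2 + (-10)^2)/5 - CF = 161 - 38.44 = 122.56$
$SSE = SST - SSR - SSC - SSL = 196.56 - 2.16 - 66.56 - 122.56 = 5.28$

ANOVA table for Latin Square Design:

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio   Critical value at 5% level
Rows                  2.16             4                    0.54           1.23      F(4, 12) = 3.26
Columns               66.56            4                    16.64          37.82     F(4, 12) = 3.26
Letters               122.56           4                    30.64          69.64     F(4, 12) = 3.26
Error                 5.28             12                   0.44           -         -

Conclusion:
(i) Calculated value (1.23) < Critical value (3.26). Accept $H_0$.
There is no significant difference between rows.
(ii) Calculated value (37.82) > Critical value (3.26). Reject $H_0$.
There is a significant difference between columns.
(iii) Calculated value (69.64) > Critical value (3.26). Reject $H_0$.
There is a significant difference between fertilizers.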
