BCS301.Module 5
The analysis of variance (ANOVA) is a statistical technique for testing whether the means of three
or more populations are equal. It was developed by R. A. Fisher and is widely used in business and
in the physical sciences.
A table showing the source of variation, the sum of squares, the degrees of freedom, the mean squares
and the formula for the F ratio is called an ANOVA table.
If the given data are classified according to one factor, the classification is called a one-way
classification, and an ANOVA table for one-way classification is constructed.
If the given data are classified according to two factors, the classification is called a two-way
classification, and an ANOVA table for two-way classification is constructed.
ANOVA table for one-way classification:
Source of variation   Sum of squares   Degrees of freedom   Mean squares         F ratio
Between samples       SSC              K - 1                MSC = SSC/(K - 1)    F = MSC/MSE
Within samples        SSE              N - K                MSE = SSE/(N - K)
Total                 SST              N - 1                -                    -
Expansion of abbreviations:
SSC – Sum of squares between samples (Columns)
SSE – Sum of squares within samples (Rows)
SST – Total sum of squares of variations
MSC – Mean squares of variations between samples (Columns)
MSE – Mean squares of variations within samples (Rows)
Notations:
    T – Total sum of all the observations
    N – Number of observations
    K – Number of columns
Working rule:
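The working rule can be sketched in Python. This is a minimal sketch that assumes the usual shortcut formulas: correction factor = (grand total)²/N, SST = sum of squared observations minus the correction factor, SSC = sum over columns of (column total)²/(column size) minus the correction factor, and SSE = SST - SSC. The function name `one_way_anova` and the keys of the returned dictionary are illustrative and are not part of these notes.

```python
def one_way_anova(samples):
    """One-way ANOVA by the shortcut (correction factor) method.

    `samples` is a list of columns; the columns may have different lengths.
    """
    n_total = sum(len(col) for col in samples)       # number of observations
    k = len(samples)                                 # number of columns
    grand_total = sum(sum(col) for col in samples)   # total of all observations
    cf = grand_total ** 2 / n_total                  # correction factor

    sst = sum(x * x for col in samples for x in col) - cf
    ssc = sum(sum(col) ** 2 / len(col) for col in samples) - cf
    sse = sst - ssc

    msc = ssc / (k - 1)
    mse = sse / (n_total - k)
    return {"SST": sst, "SSC": ssc, "SSE": sse,
            "MSC": msc, "MSE": mse, "F": msc / mse,
            "df": (k - 1, n_total - k)}


if __name__ == "__main__":
    # Small made-up data set, used only to exercise the function.
    print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

The calculated F is then compared with the tabulated F for the returned pair of degrees of freedom at the chosen level of significance.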
1.   Three different machines are used for a production. On the basis of the outputs, test
     whether the machines are equally effective.
                                   Output
                        Machine 1 Machine 2 Machine 3
                           10        9         20
                            5        7         16
                           11        5         10
                           10        6          4
Solution:
                        Machine 1      Machine 2      Machine 3
                         x1   x1²      x2    x2²      x3    x3²
                         10   100       9     81      20    400
                          5    25       7     49      16    256
                         11   121       5     25      10    100
                         10   100       6     36       4     16
              Total      36   346      27    191      50    772
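The sums of squares for this problem follow from the totals in the last row of the table. The sketch below assumes the usual shortcut formulas (correction factor CF = T²/N, SST = sum of squares - CF, SSC = sum of squared column totals divided by the column size - CF); the variable names are illustrative.

```python
# Column totals and sums of squares read from the last row of the table.
totals = [36, 27, 50]
sums_of_squares = [346, 191, 772]
n_per_column = 4

N = n_per_column * len(totals)      # 12 observations
T = sum(totals)                     # grand total
CF = T ** 2 / N                     # correction factor

SST = sum(sums_of_squares) - CF
SSC = sum(t ** 2 for t in totals) / n_per_column - CF
SSE = SST - SSC

MSC = SSC / (len(totals) - 1)       # 2 degrees of freedom between samples
MSE = SSE / (N - len(totals))       # 9 degrees of freedom within samples
F = MSC / MSE
print(f"SST={SST:.2f} SSC={SSC:.2f} SSE={SSE:.2f} F={F:.2f}")
# SST=244.92 SSC=67.17 SSE=177.75 F=1.70
```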
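 Construction of ANOVA table for one-way classification:
 Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
 Between samples        67.17            2                   33.58          1.70
 Within samples        177.75            9                   19.75
 Total                 244.92           11                   -              -
Calculated value: F = 1.70
Critical value:
Level of significance
Degrees of freedom between samples = 2
Degrees of freedom within samples = 9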
Comparison:
Calculated value
Critical value
Conclusion:
2.   Three samples each of size 5 were drawn from three uncorrelated normal populations
     with equal variances. Test the hypothesis that the population means are equal at
     level.
                             Sample 1    10 12 9 16 13
                             Sample 2     9  7 12 11 11
                             Sample 3    14 11 15 14 16
      Solution:
Null hypothesis H0: All the three samples have equal population means.
                        Sample 1       Sample 2       Sample 3
                         x1   x1²      x2    x2²      x3    x3²
                         10   100       9     81      14    196
                         12   144       7     49      11    121
                          9    81      12    144      15    225
                         16   256      11    121      14    196
                         13   169      11    121      16    256
              Total      60   750      50    516      70    994
 Construction of ANOVA table for one-way classification:
 Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
 Between samples         40              2                   20.00          4.00
 Within samples          60             12                    5.00
 Total                  100             14                   -              -
Calculated value: F = 4.00
Critical value:
Level of significance
Degrees of freedom between samples = 2
Degrees of freedom within samples = 12
Comparison:
Calculated value > Critical value
Conclusion:
Reject H0. The three samples do not have equal population means.
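As a cross-check on this problem, the between-samples and within-samples sums of squares can also be computed directly from their definitions as squared deviations from the means, without a correction factor. This is only a sketch; the variable names are illustrative.

```python
samples = [
    [10, 12, 9, 16, 13],    # Sample 1
    [9, 7, 12, 11, 11],     # Sample 2
    [14, 11, 15, 14, 16],   # Sample 3
]

all_values = [x for s in samples for x in s]
grand_mean = sum(all_values) / len(all_values)
means = [sum(s) / len(s) for s in samples]

# Between samples: squared distance of each sample mean from the grand mean,
# weighted by the sample size.
ssc = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
# Within samples: squared distance of each value from its own sample mean.
sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

df_between = len(samples) - 1
df_within = len(all_values) - len(samples)
f_ratio = (ssc / df_between) / (sse / df_within)
print(ssc, sse, f_ratio)   # 40.0 60.0 4.0
```

Both routes give the same sums of squares; the shortcut method is simply easier to carry out by hand.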
3.   A Manager of a merchandizing firm wishes to test whether its three salesmen A, B, C
     tend to make sales of the same size or whether they differ in their selling abilities.
     During a week there have been 14 sales calls, A made 5 calls, B made 4 calls and C
     made 5 calls. Following are the weekly sales record ( in rupees) of the three salesmen:
                  A 500 400 700 300 600
                  B 300 700 400 600
                  C 500 300 500 400 300
      Perform the analysis of variance and draw your conclusions.
     Solution:
     The sales data have a common factor 100. Divide all the above values by 100.
                             5 4 7 3 6
                             3 7 4 6
                             5 3 5 4 3
Null hypothesis H0: All the three salesmen tend to make sales of the same size.
                             A               B               C
                         x1    x1²      x2      x2²      x3   x3²
                          5     25       3        9       5    25
                          4     16       7       49       3     9
                          7     49       4       16       5    25
                          8     64       6       36       4    16
                          6     36       -        -       3     9
              Total      30    190      20      110      20    84
Construction of ANOVA table for one-way classification:
Within
samples
Calculated value:
Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples
Comparison:
Calculated value < Critical value
Conclusion:
Accept H0. All the three salesmen tend to make sales of the same size.
4.   Three samples of five, five and four car tyres are drawn respectively from three
     brands A, B, and C manufactured by three machines. The lifetime of these tyres (per
     1000 miles) is given below. Test whether the average life time of the three brands of
     tyres are equal or not.
                                   A 35 40 33 36 31
                                   B 30 25 34 28 33
                                   C 28 24 30 26 -
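Solution:
Null hypothesis H0: The average lifetimes of the three brands of tyres are equal.
                             A                B                C
                         x1    x1²       x2    x2²       x3    x3²
                         35   1225       30    900       28    784
                         40   1600       25    625       24    576
                         33   1089       34   1156       30    900
                         36   1296       28    784       26    676
                         31    961       33   1089        -      -
              Total     175   6171      150   4554      108   2936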
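Construction of ANOVA table for one-way classification:
Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples       148.93            2                   74.46          6.83
Within samples        120.00           11                   10.91
Total                 268.93           13                   -              -
Calculated value: F = 6.83
Critical value:
Level of significance
Degrees of freedom between samples = 2
Degrees of freedom within samples = 11
Comparison:
Calculated value > Critical value
Conclusion:
Reject H0. The average lifetimes of the three brands of tyres are not equal.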
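The samples in this problem are of unequal size, which changes only the between-samples term: each squared column total is divided by the size of its own column. A minimal sketch under the usual shortcut formulas follows; the dictionary `brands` and the other names are illustrative.

```python
brands = {
    "A": [35, 40, 33, 36, 31],
    "B": [30, 25, 34, 28, 33],
    "C": [28, 24, 30, 26],      # only four tyres were tested for brand C
}

N = sum(len(v) for v in brands.values())
T = sum(sum(v) for v in brands.values())
CF = T ** 2 / N                 # correction factor

SST = sum(x * x for v in brands.values() for x in v) - CF
# Each squared column total is divided by that column's own size.
SSC = sum(sum(v) ** 2 / len(v) for v in brands.values()) - CF
SSE = SST - SSC

df_between, df_within = len(brands) - 1, N - len(brands)
F = (SSC / df_between) / (SSE / df_within)
print(f"SSC={SSC:.2f} SSE={SSE:.2f} df=({df_between}, {df_within}) F={F:.2f}")
# SSC=148.93 SSE=120.00 df=(2, 11) F=6.83
```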
5.   To assess the significance of possible variation in performance in a certain test
     between the grammar schools of a city, a common test was given to a number of
     students taken at random from the senior fifth class of each of the four schools
     concerned. The results are given below. Make an analysis of variance of the data.
                                  Schools
                              8    12    18    13
                             10    11    12     9
                             12     9    16    12
                              8    14     6    16
                              7     4     8    15
     Solution:
     Subtract 10 from each of the given values.
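Subtracting the same constant from every observation leaves every sum of squares unchanged, so the F ratio of the coded data equals that of the original data. The sketch below checks this for the four schools; the helper name `f_ratio` is illustrative and the formulas are the usual shortcut ones.

```python
def f_ratio(columns):
    """F ratio of a one-way classification (columns may differ in length)."""
    n = sum(len(c) for c in columns)
    k = len(columns)
    cf = sum(sum(c) for c in columns) ** 2 / n
    sst = sum(x * x for c in columns for x in c) - cf
    ssc = sum(sum(c) ** 2 / len(c) for c in columns) - cf
    return (ssc / (k - 1)) / ((sst - ssc) / (n - k))


# One list per school (one column of the table each).
schools = [
    [8, 10, 12, 8, 7],
    [12, 11, 9, 14, 4],
    [18, 12, 16, 6, 8],
    [13, 9, 12, 16, 15],
]
coded = [[x - 10 for x in col] for col in schools]

print(round(f_ratio(schools), 4), round(f_ratio(coded), 4))   # 1.2821 1.2821
```

Coding only makes the hand arithmetic smaller; it does not affect the test.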
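Null hypothesis H0: There is no significant difference in performance between the four schools.
      x1   x1²     x2   x2²     x3   x3²     x4   x4²
      -2     4      2     4      8    64      3     9
       0     0      1     1      2     4     -1     1
       2     4     -1     1      6    36      2     4
      -2     4      4    16     -4    16      6    36
      -3     9     -6    36     -2     4      5    25
Total -5    21      0    58     10   124     15    75
Construction of ANOVA table for one-way classification:
Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between samples         50              3                   16.67          1.28
Within samples         208             16                   13.00
Total                  258             19                   -              -
Calculated value: F = 1.28
Critical value:
Level of significance
Degrees of freedom between samples = 3
Degrees of freedom within samples = 16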
Comparison:
Calculated value
Critical value
Conclusion:
6.   The three samples below have been obtained from normal populations with equal
     variances. Test the hypothesis that the sample means are equal.
Solution:
Null hypothesis H0:
All the three samples have been taken from the same population.
Construction of ANOVA table for one-way classification:
Within
samples
Calculated value:
Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples
Comparison:
Conclusion:
Reject H0. The three samples have not been taken from the same population.
7.    The following table gives the yields on 15 sample plots under three varieties of seeds:
     Find out if the average yields of land under different varieties of seeds show
     significant differences.
Null hypothesis H0:
      The average yields of land under different varieties of seeds do not show significant
      differences.
  Construction of ANOVA table for one-way classification:
Within
samples
Calculated value:
Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples
Comparison:
Conclusion: Reject H0.
     The average yields of land under different varieties of seeds show significant differences.
8.    Test the significance of the variation of the retail prices of a commodity in three
      cities Mumbai, Chennai and Bengaluru. Four shops were chosen at random in each
      city and prices observed in rupees were as follows:
Do the data indicate that the prices in the three cities are significantly different?
Solution:
Null hypothesis
 Construction of ANOVA table for one-way classification:
   Within
   samples
Calculated value:
Critical value:
Level of significance
Degrees of freedom between samples
Degrees of freedom within samples
Comparison:
Conclusion: Accept H0. The prices in the three cities are not significantly different.
In a two-way classification, the data are classified according to two different criteria or factors.
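Expansion of abbreviations:
SSC – Sum of squares between columns
SSR – Sum of squares between rows
SST – Total sum of squares of variations
SSE – Sum of squares due to errors
CF – Correction Factor
MSC – Mean squares of variations between columns
MSR – Mean squares of variations between rows
MSE – Mean squares of variations due to errors
ANOVA table for two-way classification (c columns, r rows, N = rc observations):
Source of variation   Sum of squares   Degrees of freedom   Mean squares                  F ratio
Between columns       SSC              c - 1                MSC = SSC/(c - 1)             MSC/MSE
Between rows          SSR              r - 1                MSR = SSR/(r - 1)             MSR/MSE
Error                 SSE              (c - 1)(r - 1)       MSE = SSE/((c - 1)(r - 1))
Total                 SST              N - 1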
How to find SSC, SSE and SST from the following table?
Notation:
    Column totals, row totals and the grand total of the table
    N – Total number of elements
Working rule:
(v)    Assume H0: There is no significant difference between rows and between columns.
(vi)   Construct the ANOVA table for two-way classification.
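The two-way working rule can be sketched in Python as follows. This is a minimal sketch that assumes the usual formulas for a two-way classification without repeated values: correction factor CF = (grand total)²/N, SSR = sum of squared row totals divided by the number of columns minus CF, SSC = sum of squared column totals divided by the number of rows minus CF, and SSE = SST - SSR - SSC. The function name `two_way_anova` and the dictionary keys are illustrative.

```python
def two_way_anova(table):
    """Two-way ANOVA without repeated values.

    `table` is a list of rows, all of the same length (one value per cell).
    """
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (r * c)                 # correction factor

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    sst = sum(x * x for row in table for x in row) - cf
    ssr = sum(t * t for t in row_totals) / c - cf   # between rows
    ssc = sum(t * t for t in col_totals) / r - cf   # between columns
    sse = sst - ssr - ssc                           # error

    msr, msc = ssr / (r - 1), ssc / (c - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {"SST": sst, "SSR": ssr, "SSC": ssc, "SSE": sse,
            "F_rows": msr / mse, "F_columns": msc / mse,
            "df_rows": (r - 1, (r - 1) * (c - 1)),
            "df_columns": (c - 1, (r - 1) * (c - 1))}


if __name__ == "__main__":
    # Small made-up table, used only to exercise the function.
    print(two_way_anova([[1, 2], [3, 5]]))
```

Each F ratio is compared with the tabulated F for its own pair of degrees of freedom.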
1.   In a certain factory, production can be accomplished by four different workers on
     five different machines. A simple study, in context of a two-way design without
     repeated values, is being made with two-fold objectives of examining whether the four
     workers differ with respect to mean productivity and whether the mean productivity
     is the same for the five different machines. The researcher involved in this study
     reports while analyzing the gathered data as under:
     (i)      Sum of squares of variance between machines is 35.2
     (ii)     Sum of squares of variance between workmen is 53.8
     (iii)    Sum of squares for total variance is 174.2
     Set up the ANOVA table for the given information and draw the inference about variance
     at 5% level of significance.
     Solution:
     Null hypothesis
     (i)    No significant difference between mean productivity of the four workers (Between
            columns).
     (ii)   No significant difference between mean productivity of five different machines
            (Between rows).
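By data, SSR = 35.2 (between machines), SSC = 53.8 (between workers), SST = 174.2
Therefore, SSE = SST - SSC - SSR = 174.2 - 53.8 - 35.2 = 85.2
ANOVA table for two-way classification:
Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio   Critical value (5%)
Between columns        53.8             3                   17.93          2.53      F(3, 12) = 3.49
Between rows           35.2             4                    8.80          1.24      F(4, 12) = 3.26
Error                  85.2            12                    7.10
Total                 174.2            19
 (i)   Accept H0.
       No significant difference between mean productivity of the four workers.
 (ii)  Accept H0.
       No significant difference between mean productivity of the five machines.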
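Because this problem states the sums of squares directly, only the error term and the two F ratios have to be computed. The sketch below does that arithmetic; the two 5% critical values are an assumption copied from a standard F table, and the variable names are illustrative.

```python
workers, machines = 4, 5            # columns, rows
ssc, ssr, sst = 53.8, 35.2, 174.2   # between workers, between machines, total

sse = sst - ssc - ssr               # error sum of squares
df_c, df_r = workers - 1, machines - 1
df_e = df_c * df_r

mse = sse / df_e
f_columns = (ssc / df_c) / mse
f_rows = (ssr / df_r) / mse

# Assumption: 5% critical values copied from a standard F table.
F_CRIT_5PCT = {(3, 12): 3.49, (4, 12): 3.26}

for label, f, df in (("workers", f_columns, df_c), ("machines", f_rows, df_r)):
    verdict = "accept H0" if f < F_CRIT_5PCT[(df, df_e)] else "reject H0"
    print(f"{label}: F = {f:.2f}, critical = {F_CRIT_5PCT[(df, df_e)]}, {verdict}")
# workers: F = 2.53, critical = 3.49, accept H0
# machines: F = 1.24, critical = 3.26, accept H0
```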
2.   The following table gives the number of refrigerators sold by 4 salesmen in three
     months May, June and July:
       Month                                 Salesmen
                         A            B               C                 D
         May             50           40              48                39
        June             46           48              50                45
         July            39           44              40                39
     (i)     Is there any significant difference in the sales made by the four salesmen?
     (ii)    Is there any significant difference in the sales made during different months?
Solution:
     Null hypothesis
     (i)    There is no significant difference in the sales made by the four salesmen (between
            columns).
     (ii)   There is no significant difference in the sales made during different months
            (between rows).
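ANOVA table for two-way classification:
Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
Between columns        42.0             3                   14.00          1.02
Between rows           91.5             2                   45.75          3.33
Error                  82.5             6                   13.75
Total                 216.0            11
Conclusion:
(i)     Accept H0. There is no significant difference in the sales made by the four salesmen.
(ii)    Accept H0. There is no significant difference in the sales made during different months.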
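The sums of squares for the refrigerator sales can be checked with a few lines of Python. This sketch assumes the usual two-way formulas with a correction factor CF = T²/N; months are the rows and salesmen the columns, as in the null hypotheses. The names are illustrative.

```python
sales = {
    "May":  [50, 40, 48, 39],
    "June": [46, 48, 50, 45],
    "July": [39, 44, 40, 39],
}
rows = list(sales.values())
r, c = len(rows), len(rows[0])
T = sum(map(sum, rows))             # grand total
CF = T ** 2 / (r * c)               # correction factor

SST = sum(x * x for row in rows for x in row) - CF
SSR = sum(sum(row) ** 2 for row in rows) / c - CF          # between months
SSC = sum(sum(col) ** 2 for col in zip(*rows)) / r - CF    # between salesmen
SSE = SST - SSR - SSC

MSE = SSE / ((r - 1) * (c - 1))
F_salesmen = (SSC / (c - 1)) / MSE
F_months = (SSR / (r - 1)) / MSE
print(SST, SSC, SSR, SSE)                            # 216.0 42.0 91.5 82.5
print(round(F_salesmen, 2), round(F_months, 2))      # 1.02 3.33
```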
               A    B     C    D    Total
          I
         II
        III
       Total
       Null hypothesis
       (i)    There is no significant difference between treatments (Between columns).
       (ii)   There is no significant difference between plots (Between rows).
  ANOVA table for two-way classification:
      Between
       rows
Error
      Between
      months
Conclusion:
4.   A tea company appoints 4 salesmen A, B, C and D and observes their sales in 3
     seasons- Summer, Winter and Monsoon. The figures (in lakhs) are given in the
     following table:
     Solution: The given data are coded by subtracting 30 from each observation.
       Seasons        Salesman A Salesman B Salesman C Salesman D
       Summer
        Winter
       Monsoon
     Null hypothesis
     (i)    There is no significant difference between the performance of salesmen (Between
            columns).
     (ii)   There is no significant difference between the seasons (Between rows).
                       Sum of observations:               Sum of square of observations:
                        A    B    C     D Total
                                                          36 36 81
               Summer                       8
                                                          4 1
                Winter
               Monsoon
                Total
  ANOVA table for two-way classification:
      Between
       rows
Error
Comparison:
5.   To study the performance of three detergents and three different water temperatures
     the following whiteness readings were obtained with specially designed equipment.
Solution:
Null hypothesis
          ANOVA table for two-way classification:
   Between
    rows
Total
  Conclusion:
(i)       Reject H0. The performance of the three detergents is not equal.
(ii)      Accept H0. The performance at the three water temperatures is equal.
6.   A farmer applies three types of fertilizers on 4 separate plots. The figures on yield per
     acre are tabulated below:
     Plots                                     Yield
     Fertilizers          A              B               C                D             Total
     Nitrogen              6              4               8               6               24
     Potash                7              6               6               9               28
     Phosphates            8              5              10               9               32
     Total                21             15              24              24               84
     Find out if the plots are materially different in fertility as also, if three fertilizers
     make any material difference in yields.
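Solution:
Null hypothesis H0:
(i)    There is no significant difference between the plots (between columns).
(ii)   There is no significant difference between the fertilizers (between rows).
  ANOVA table for two-way classification:
  Source of variation   Sum of squares   Degrees of freedom   Mean squares   F ratio
  Between columns        18               3                   6.00           3.60
  Between rows            8               2                   4.00           2.40
  Error                  10               6                   1.67
  Total                  36              11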
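The fertilizer table already prints its row and column totals, so a short script can first confirm those totals and then compute the sums of squares from them. This is a sketch under the usual two-way formulas; fertilizers are the rows and plots the columns, and the names are illustrative.

```python
yields = {
    "Nitrogen":   [6, 4, 8, 6],
    "Potash":     [7, 6, 6, 9],
    "Phosphates": [8, 5, 10, 9],
}
rows = list(yields.values())
r, c = len(rows), len(rows[0])

row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
# The totals printed in the problem's table should be reproduced exactly.
assert row_totals == [24, 28, 32] and col_totals == [21, 15, 24, 24]

CF = sum(row_totals) ** 2 / (r * c)               # correction factor
SST = sum(x * x for row in rows for x in row) - CF
SSR = sum(t * t for t in row_totals) / c - CF     # between fertilizers
SSC = sum(t * t for t in col_totals) / r - CF     # between plots
SSE = SST - SSR - SSC

MSE = SSE / ((r - 1) * (c - 1))
F_plots = (SSC / (c - 1)) / MSE
F_fertilizers = (SSR / (r - 1)) / MSE
print(f"SSC={SSC:.0f} SSR={SSR:.0f} SSE={SSE:.0f} "
      f"F_plots={F_plots:.2f} F_fertilizers={F_fertilizers:.2f}")
# SSC=18 SSR=8 SSE=10 F_plots=3.60 F_fertilizers=2.40
```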
7.    Set up ANOVA table for the following information related to three drugs testing to
      judge the effectiveness in reducing blood pressure for three different groups of
      people.
             Group of people
     Do the drugs act differently? Are the different groups of people affected differently?
     Is the interaction term significant? Answer the above questions taking a significance
     level of 5%.
     Solution:
     Null Hypothesis
     (i)     Three drugs do not act differently.
     (ii)    Three groups of people are not affected differently.
     (iii) The interaction terms are not significantly different.
               X   Y     Z     Total
      A                         10
               5         1
      B
               1
      C
Total 7
To find: SSC, SSR, SSE
ANOVA table:
 Between
  rows
Interaction
Errors
Total
Conclusion:
(i)   Reject H0. There is a significant difference between columns.
      Drugs act differently.
(ii)  Reject H0. There is a significant difference between rows.
      Groups of people are affected differently.
(iii) Reject H0. The interaction between drugs and groups of people is significant.
                                  5.1 Latin square design
Introduction:
Example: Latin square of order 4, choose four symbols – A, B, C and D. These letters are Latin
letters which are used as symbols. Write them in a way such that each of the letters out of A, B,
C and D occurs once and only once in each row and each column.
                                           A B C D
                                           B C D A
                                           C D A B
                                           D A B C
This is a Latin square.
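The defining property (each symbol once and only once in every row and every column) is easy to check mechanically. The sketch below builds the cyclic square shown above and tests the property; the function names `cyclic_latin_square` and `is_latin_square` are illustrative.

```python
def cyclic_latin_square(symbols):
    """Build a Latin square by shifting the symbol list one place per row."""
    return [symbols[i:] + symbols[:i] for i in range(len(symbols))]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n or any(len(row) != n for row in square):
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square(list("ABCD"))
for row in square:
    print(" ".join(row))
print(is_latin_square(square))   # True
```

Cyclic shifting is only one way of constructing a Latin square; any arrangement that passes the check is equally valid.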
Latin square design: The Latin square design (LSD) is an incomplete three-way layout in which the
three factors are rows, columns and treatments.
Example: Suppose four different brands of petrol are to be compared with respect to the mileage
per litre achieved in four motor cars. The important factors responsible for the variation in
mileage are the 4 cars, the 4 drivers and the 4 petrol brands.
Expansion of abbreviations:
SSR – Sum of squares between rows
SSC – Sum of squares between columns
SSL – Sum of squares between letters
SSE – Sum of squares of errors
MSR – Mean squares of variations between rows
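MSC – Mean squares of variations between columns
MSL – Mean squares of variations between letters
MSE – Mean squares of errors
Notations:
    T – Total sum of all the observations
    N – Number of observations
    K – Number of columns
    n – Order of the Latin square
ANOVA table for Latin square design:
Source of variation   Sum of squares   Degrees of freedom   Mean squares                  F ratio
Rows                  SSR              n - 1                MSR = SSR/(n - 1)             MSR/MSE
Columns               SSC              n - 1                MSC = SSC/(n - 1)             MSC/MSE
Letters               SSL              n - 1                MSL = SSL/(n - 1)             MSL/MSE
Error                 SSE              (n - 1)(n - 2)       MSE = SSE/((n - 1)(n - 2))
Total                 SST              n² - 1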
Working rule:
  Find the tabulated value of F at the chosen level of significance for (ν1, ν2) degrees of freedom,
  where ν1 is the degrees of freedom of the numerator
  and ν2 is the degrees of freedom of the denominator.
  If calculated value < tabulated value, accept H0; reject otherwise.
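The Latin square computation can be sketched in Python. This is a minimal sketch that assumes the usual formulas: correction factor CF = (grand total)²/N, each of SSR, SSC and SSL equal to the sum of the corresponding squared totals divided by the order of the square minus CF, and SSE obtained by subtraction with (n - 1)(n - 2) degrees of freedom. The function name `latin_square_anova`, its arguments and the made-up 3 × 3 data are illustrative.

```python
def latin_square_anova(values, letters):
    """ANOVA for an n x n Latin square.

    `values[i][j]` is the observation in row i, column j and
    `letters[i][j]` is the letter (treatment) applied to that cell.
    """
    n = len(values)
    grand_total = sum(map(sum, values))
    cf = grand_total ** 2 / (n * n)                 # correction factor

    letter_totals = {}
    for value_row, letter_row in zip(values, letters):
        for x, letter in zip(value_row, letter_row):
            letter_totals[letter] = letter_totals.get(letter, 0) + x

    sst = sum(x * x for row in values for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in values) / n - cf
    ssc = sum(sum(col) ** 2 for col in zip(*values)) / n - cf
    ssl = sum(t * t for t in letter_totals.values()) / n - cf
    sse = sst - ssr - ssc - ssl

    mse = sse / ((n - 1) * (n - 2))
    return {"SST": sst, "SSR": ssr, "SSC": ssc, "SSL": ssl, "SSE": sse,
            "F_rows": ssr / (n - 1) / mse,
            "F_columns": ssc / (n - 1) / mse,
            "F_letters": ssl / (n - 1) / mse,
            "df": (n - 1, (n - 1) * (n - 2))}


if __name__ == "__main__":
    # Made-up 3 x 3 example, used only to exercise the function.
    demo_letters = [list("ABC"), list("BCA"), list("CAB")]
    demo_values = [[3, 5, 4], [6, 7, 5], [8, 6, 9]]
    print(latin_square_anova(demo_values, demo_letters))
```

All three F ratios share the same pair of degrees of freedom, so a single tabulated value serves for rows, columns and letters.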
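1. Analyze and interpret the following statistics concerning the output of wheat per field,
    obtained as a result of an experiment conducted to test four varieties of wheat, viz. A,
    B, C and D, under a Latin square design.
          C 25       B 23       A 20       D 20
          A 19       D 19       C 21       B 18
          B 19       A 14       D 17       C 20
          D 17       C 20       B 21       A 15
 Row totals: 88, 77, 70, 73. Column totals: 80, 76, 79, 73.
 Letter totals: A = 68, B = 81, C = 86, D = 73. Sum of squares of the observations = 6042.
 Therefore, grand total T = 308, N = 16 and CF = T²/N = 5929.
To find:
SST = 6042 - 5929 = 113
SSR = (88² + 77² + 70² + 73²)/4 - 5929 = 46.5
SSC = (80² + 76² + 79² + 73²)/4 - 5929 = 7.5
SSL = (68² + 81² + 86² + 73²)/4 - 5929 = 48.5
SSE = SST - SSR - SSC - SSL = 10.5
Source of variation   Sum of squares   Degrees of freedom   Mean squares   Ratio
Rows                   46.5             3                   15.50          8.86
Columns                 7.5             3                    2.50          1.43
Letters                48.5             3                   16.17          9.24
Error                  10.5             6                    1.75          -
Total                 113.0            15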
     Conclusion:
     (i)    Rows: Calculated value > Critical value. Reject H0.
            There is a significant difference between rows.
     (ii)   Columns: Calculated value < Critical value. Accept H0.
            There is no significant difference between columns.
     (iii)  Letters: Calculated value > Critical value. Reject H0.
            There is a significant difference between the four varieties of wheat.
   2. Present your conclusions after doing analysis of variance to the following results of
      the Latin Square Design experiment conducted in respect of five fertilizers which
      were used on plots of different fertility.
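              A 16     B 10     C 11     D  9     E  9
              E 10     C  9     A 14     B 12     D 11
              B 15     D  8     E  8     C 10     A 18
              D 12     E  6     B 13     A 13     C 12
              C 13     A 11     D 10     E  7     B 14
Sum of observations (after subtracting 10 from each value):
                                                          Total
              A  6     B  0     C  1     D -1     E -1      5
              E  0     C -1     A  4     B  2     D  1      6
              B  5     D -2     E -2     C  0     A  8      9
              D  2     E -4     B  3     A  3     C  2      6
              C  3     A  1     D  0     E -3     B  4      5
      Total     16       -6        6        1       14     31
Letter totals: A = 22, B = 14, C = 5, D = 0, E = -10.
Sum of squares of observations:
                 36        0        1        1        1
                  0        1       16        4        1
                 25        4        4        0       64
                  4       16        9        9        4
                  9        1        0        9       16
                                                 Total    235
Therefore, grand total T = 31, N = 25 and CF = T²/N = 38.44.
To find:
SST = 235 - 38.44 = 196.56
SSR = (5² + 6² + 9² + 6² + 5²)/5 - 38.44 = 2.16
SSC = (16² + (-6)² + 6² + 1² + 14²)/5 - 38.44 = 66.56
SSL = (22² + 14² + 5² + 0² + (-10)²)/5 - 38.44 = 122.56
SSE = SST - SSR - SSC - SSL = 5.28
Source of     Sum of    Degrees of    Mean squares    Ratio     Critical
variation     squares   freedom                                 value
Rows            2.16     4             0.54            1.23     F(4, 12)
Columns        66.56     4            16.64           37.82     F(4, 12)
Letters       122.56     4            30.64           69.64     F(4, 12)
Error           5.28    12             0.44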
       Conclusion:
       (i)    Rows: Calculated value < Critical value. Accept H0.
              There is no significant difference between rows.
       (ii)   Columns: Calculated value > Critical value. Reject H0.
              There is a significant difference between columns.
       (iii)  Letters: Calculated value > Critical value. Reject H0.
              There is a significant difference between fertilizers.