Analysis of Variance
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc.   Chap 12-1
                                                    Chapter Goals
    After completing this chapter, you should be able
      to:
        Recognize situations in which to use analysis of variance
        Understand different analysis of variance designs
        Perform a single-factor hypothesis test and interpret results
        Conduct and interpret post-analysis of variance pairwise
         comparison procedures
                                General ANOVA Setting
                  Investigator controls one or more independent
                   variables
                        Called factors (or treatment variables)
                        Each factor contains two or more levels (or
                         categories/classifications)
                  Observe effects on dependent variable
                        Response to levels of independent variable
                  Experimental design: the plan used to test
                   the hypothesis
                      One-Way Analysis of Variance
             Evaluate the difference among the means of
              three or more populations
                Examples:
                     ● Accident rates for 1st, 2nd, and 3rd shift
                     ● Expected mileage for five brands of tires
             Assumptions
               Populations are normally distributed
               Populations have equal variances
               Samples are randomly and independently
                drawn
                    Completely Randomized Design
               Experimental units (subjects) are assigned
                randomly to treatments
               Only one factor or independent variable
                     With two or more treatment levels
               Analyzed by
                     One-factor analysis of variance (one-way ANOVA)
               Called a Balanced Design if all factor levels
                have equal sample size
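
A minimal sketch of the random assignment step in a completely randomized, balanced design. The function name `assign_treatments`, the unit IDs 1 to 15, and the treatment labels A, B, C are hypothetical and chosen only for illustration.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly assign units to treatments in equal-sized groups (balanced design)."""
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("a balanced design needs the unit count divisible by the number of treatments")
    rng = random.Random(seed)      # seeded only so the illustration is repeatable
    rng.shuffle(units)             # random order = random assignment
    per_group = len(units) // len(treatments)
    return {
        t: units[i * per_group:(i + 1) * per_group]
        for i, t in enumerate(treatments)
    }

# Hypothetical example: 15 experimental units, one factor with 3 levels.
design = assign_treatments(range(1, 16), ["A", "B", "C"], seed=42)
for level, members in design.items():
    print(level, sorted(members))
```

Every unit lands in exactly one treatment level, and all levels receive the same number of units, which is what makes the design balanced.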
                                  Hypotheses of One-Way
                                         ANOVA
                   H0 : µ1 = µ2 = µ3 = L = µk
                         All population means are equal
                         i.e., no treatment effect (no variation in means among
                          groups)
              
                   HA : Not all of the population means are the same
                         At least one population mean is different
                         i.e., there is a treatment effect
                         Does not mean that all population means are different
                          (some pairs may be the same)
                                     One-Factor ANOVA
                                         H0 : µ1 = µ2 = µ3 = L = µk
                                        HA : Not all µi are the same
                                                                                  All Means are the same:
                                                                                 The Null Hypothesis is True
                                                                                   (No Treatment Effect)
                             µ1 = µ2 = µ3
                                     One-Factor ANOVA
                                                                                                (continued)
                                        H0 : µ1 = µ2 = µ3 = L = µk
                                       HA : Not all µi are the same
                                      At least one mean is different:
                                     The Null Hypothesis is NOT true
                                      (Treatment Effect is present)
                                                                     or
                  µ1 = µ2 ≠ µ3                                                   µ1 ≠ µ2 ≠ µ3
                               Partitioning the Variation
               Total variation can be split into two parts:
                                     SST = SSB + SSW
                                SST = Total Sum of Squares
                                SSB = Sum of Squares Between
                                SSW = Sum of Squares Within
                               Partitioning the Variation
                                                                                 (continued)
                                            SST = SSB + SSW
     Total Variation (SST) = the aggregate dispersion of the
     individual data values across the various factor levels
             Between-Sample Variation (SSB) = dispersion among the
             factor sample means
             Within-Sample Variation (SSW) = dispersion that exists
             among the data values within a particular factor level
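
A short derivation sketch of why the split is exact. The symbols are assumptions that follow the usual convention: x_{ij} is the jth value at factor level i, \bar{x}_i is the sample mean of level i, \bar{\bar{x}} is the grand mean, n_i is the sample size at level i, and k is the number of levels.

```latex
\begin{aligned}
\mathrm{SST}
  &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left( x_{ij} - \bar{\bar{x}} \right)^2
   = \sum_{i=1}^{k}\sum_{j=1}^{n_i}
     \left[ \left( x_{ij} - \bar{x}_i \right) + \left( \bar{x}_i - \bar{\bar{x}} \right) \right]^2 \\
  &= \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left( x_{ij} - \bar{x}_i \right)^2}_{\mathrm{SSW}}
   + \underbrace{\sum_{i=1}^{k} n_i \left( \bar{x}_i - \bar{\bar{x}} \right)^2}_{\mathrm{SSB}}
   + 2 \sum_{i=1}^{k} \left( \bar{x}_i - \bar{\bar{x}} \right)
       \underbrace{\sum_{j=1}^{n_i} \left( x_{ij} - \bar{x}_i \right)}_{=\,0}
\end{aligned}
```

The cross term vanishes because deviations from a group's own mean sum to zero within each group, leaving SST = SSB + SSW.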
                             Partition of Total Variation
      Total Variation (SST) = Variation Due to Factor (SSB) + Variation Due to Random Sampling (SSW)

      SSB is commonly referred to as:               SSW is commonly referred to as:
        Sum of Squares Between                        Sum of Squares Within
        Sum of Squares Among                          Sum of Squares Error
        Sum of Squares Explained                      Sum of Squares Unexplained
        Among Groups Variation                        Within Groups Variation
                                     Total Sum of Squares
                                             SST = SSB + SSW
                                    SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} ( x_{ij} - \bar{\bar{x}} )^2
                  Where:
                                  SST = Total sum of squares
                                  k = number of populations (levels or treatments)
                                  n_i = sample size from population i
                                  x_{ij} = jth measurement from population i
                                  \bar{\bar{x}} = grand mean (mean of all data values)
                                                    Total Variation
                                                                                               (continued)
        SST = ( x_{11} - \bar{\bar{x}} )^2 + ( x_{12} - \bar{\bar{x}} )^2 + ... + ( x_{k n_k} - \bar{\bar{x}} )^2
        [Figure: Response, X plotted for Group 1, Group 2, and Group 3]
                             Sum of Squares Between
                                             SST = SSB + SSW
                                    SSB = \sum_{i=1}^{k} n_i ( \bar{x}_i - \bar{\bar{x}} )^2
                  Where:
                                  SSB = Sum of squares between
                                  k = number of populations
                                  n_i = sample size from population i
                                  \bar{x}_i = sample mean from population i
                                  \bar{\bar{x}} = grand mean (mean of all data values)
                              Between-Group Variation
                  SSB = \sum_{i=1}^{k} n_i ( \bar{x}_i - \bar{\bar{x}} )^2
                  Variation Due to Differences Among Groups
                  MSB = SSB / (k − 1)
                  Mean Square Between = SSB / degrees of freedom
                  [Figure: populations with means µi and µj]
                              Between-Group Variation
                                                                                                 (continued)
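  SSB = n_1 ( \bar{x}_1 - \bar{\bar{x}} )^2 + n_2 ( \bar{x}_2 - \bar{\bar{x}} )^2 + ... + n_k ( \bar{x}_k - \bar{\bar{x}} )^2
  [Figure: Response, X plotted for Group 1, Group 2, and Group 3, marking the sample means \bar{x}_1, \bar{x}_2, \bar{x}_3 and the grand mean \bar{\bar{x}}]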
                                  Sum of Squares Within
                                             SST = SSB + SSW
                                    SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} ( x_{ij} - \bar{x}_i )^2
                  Where:
                                  SSW = Sum of squares within
                                  k = number of populations
                                  n_i = sample size from population i
                                  \bar{x}_i = sample mean from population i
                                  x_{ij} = jth measurement from population i
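
The three sums of squares can be sketched in a few lines of Python. The function name `sums_of_squares` and the small data set are hypothetical; the data deliberately use unequal sample sizes to show that the weights n_i matter in SSB.

```python
def sums_of_squares(groups):
    """Return (SST, SSB, SSW) for a list of samples, one list per factor level."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total: every value against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: each group mean against the grand mean, weighted by group size.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within: every value against its own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return sst, ssb, ssw

# Hypothetical data: three factor levels with sample sizes 3, 2, and 4.
samples = [[4, 6, 8], [10, 12], [1, 2, 3, 6]]
sst, ssb, ssw = sums_of_squares(samples)
print(f"SST = {sst:.3f}, SSB = {ssb:.3f}, SSW = {ssw:.3f}")
print("SST equals SSB + SSW:", abs(sst - (ssb + ssw)) < 1e-9)
```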
                                   Within-Group Variation
                  SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} ( x_{ij} - \bar{x}_i )^2
                  Summing the variation within each group and then
                  adding over all groups
                  MSW = SSW / (nT − k)
                  Mean Square Within = SSW / degrees of freedom
                  [Figure: a population with mean µi]
                                   Within-Group Variation
                                                                                                 (continued)
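   SSW = ( x_{11} - \bar{x}_1 )^2 + ( x_{12} - \bar{x}_1 )^2 + ... + ( x_{k n_k} - \bar{x}_k )^2
   [Figure: Response, X plotted for Group 1, Group 2, and Group 3, marking the sample means \bar{x}_1, \bar{x}_2, \bar{x}_3]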
                                 One-Way ANOVA Table
        Source of Variation    SS                 df        MS                      F ratio
        Between Samples        SSB                k - 1     MSB = SSB / (k - 1)     F = MSB / MSW
        Within Samples         SSW                nT - k    MSW = SSW / (nT - k)
        Total                  SST = SSB + SSW    nT - 1
                                                             k = number of populations
                                                             nT = sum of the sample sizes from all populations
                                                             df = degrees of freedom
                                         One-Factor ANOVA
                                           F Test Statistic
                            H0: µ1= µ2 = … = µ k
                            HA: At least two population means are different
             Test statistic
                                       F = MSB / MSW
                                       MSB is the mean square between samples
                                       MSW is the mean square within samples
             Degrees of freedom
                    df1 = k – 1                     (k = number of populations)
                    df2 = nT – k                    (nT = sum of sample sizes from all populations)
                   Interpreting One-Factor ANOVA
                               F Statistic
                   The F statistic is the ratio of the between
                    estimate of variance and the within estimate
                    of variance
                         The ratio can never be negative
                         df1 = k -1 will typically be small
                         df2 = nT - k will typically be large
                             The ratio should be close to 1 if
                                 H0: µ1= µ2 = … = µk is true
                             The ratio tends to be larger than 1 if
                                 H0: µ1= µ2 = … = µk is false
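
A small simulation sketch of this behavior. Everything numeric here is an assumption made for illustration: three normal populations, five observations each, a common standard deviation of 10, 2000 repetitions, and the particular means used. The helper names `f_ratio` and `mean_f` are hypothetical.

```python
import random

def f_ratio(groups):
    """F = MSB / MSW for a list of samples, one list per factor level."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k))

def mean_f(true_means, n_per_group=5, sigma=10.0, reps=2000, seed=1):
    """Average F ratio over repeated samples from normal populations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sigma) for _ in range(n_per_group)]
                  for mu in true_means]
        total += f_ratio(groups)
    return total / reps

f_null = mean_f([100, 100, 100])   # H0 true: all population means equal
f_alt = mean_f([100, 110, 120])    # H0 false: means differ
print(f"average F when H0 is true:  {f_null:.2f}")
print(f"average F when H0 is false: {f_alt:.2f}")
```

When H0 is true the long-run average of F is df2 / (df2 − 2), slightly above 1; when the population means differ, MSB is inflated and the average F is much larger.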
                                         One-Factor ANOVA
                                          F Test Example
     You want to see if three different golf clubs yield different
     distances. You randomly select five measurements from trials on
     an automated driving machine for each club. At the .05
     significance level, is there a difference in mean distance?

          Club 1   Club 2   Club 3
           254      234      200
           263      218      222
           241      235      197
           237      227      206
           251      216      204
                       One-Factor ANOVA Example:
                            Scatter Diagram
     Club 1                Club 2                Club 3
      254                    234                  200
      263                    218                  222
      241                    235                  197
      237                    227                  206
      251                    216                  204

     \bar{x}_1 = 249.2     \bar{x}_2 = 226.0     \bar{x}_3 = 205.8
     \bar{\bar{x}} = 227.0

     [Figure: scatter diagram of Distance (190 to 270) by Club (1, 2, 3), marking the sample means \bar{x}_1, \bar{x}_2, \bar{x}_3 and the grand mean \bar{\bar{x}}]
                        One-Factor ANOVA Example
                              Computations
    Club 1                Club 2                Club 3
     254                    234                  200
     263                    218                  222
     241                    235                  197
     237                    227                  206
     251                    216                  204

     \bar{x}_1 = 249.2    n1 = 5
     \bar{x}_2 = 226.0    n2 = 5
     \bar{x}_3 = 205.8    n3 = 5
     \bar{\bar{x}} = 227.0    nT = 15    k = 3

        SSB = 5 [ (249.2 – 227)² + (226 – 227)² + (205.8 – 227)² ] = 4716.4
        SSW = (254 – 249.2)² + (263 – 249.2)² + … + (204 – 205.8)² = 1119.6
      MSB = 4716.4 / (3 - 1) = 2358.2
      MSW = 1119.6 / (15 - 3) = 93.3
      F = MSB / MSW = 2358.2 / 93.3 = 25.275
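
The computations on this slide can be checked with a short script. The data are the golf club distances from the example; the variable names are illustrative only.

```python
# Golf club distances from the example, one list per club.
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

k = len(clubs)                                   # number of populations
n_total = sum(len(v) for v in clubs.values())    # nT
grand_mean = sum(sum(v) for v in clubs.values()) / n_total
means = {name: sum(v) / len(v) for name, v in clubs.items()}

ssb = sum(len(v) * (means[name] - grand_mean) ** 2 for name, v in clubs.items())
ssw = sum((x - means[name]) ** 2 for name, v in clubs.items() for x in v)

msb = ssb / (k - 1)          # mean square between
msw = ssw / (n_total - k)    # mean square within
f_stat = msb / msw

print(f"means = {means}, grand mean = {grand_mean:.1f}")
print(f"SSB = {ssb:.1f}, SSW = {ssw:.1f}")
print(f"MSB = {msb:.1f}, MSW = {msw:.1f}, F = {f_stat:.3f}")
```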
                       One-Factor ANOVA Example
                                Solution
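    H0: µ1 = µ2 = µ3
    HA: µi not all equal
    α = .05
    df1 = 2     df2 = 12

    Critical Value:
        F.05 = 3.885

    Test Statistic:
        F = MSB / MSW = 2358.2 / 93.3 = 25.275

    Decision:
        Reject H0 at α = 0.05, since F = 25.275 > F.05 = 3.885

    Conclusion:
        There is evidence that at least one µi differs from the rest

    [Figure: F distribution starting at 0, with the "Do not reject H0" region below F.05 = 3.885, the "Reject H0" region (α = .05) above it, and F = 25.275 in the rejection region]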
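
A sketch of the decision step without a table lookup. It relies on a closed form that holds only when the numerator degrees of freedom equal 2 (three groups, as in this example): P(F > f) = (1 + 2f/df2)^(−df2/2). The function names are hypothetical, and for other values of df1 a statistical library or an F table is needed instead.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Closed form valid only for df1 = 2.
    """
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)

def f_critical_df1_2(alpha, df2):
    """Critical value with upper-tail area alpha, again only for df1 = 2."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)

alpha = 0.05
df2 = 12            # nT - k = 15 - 3
f_stat = 25.275     # test statistic from the example

f_crit = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)
reject_h0 = f_stat > f_crit

print(f"critical value = {f_crit:.3f}")
print(f"p-value = {p_value:.6f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

The computed critical value agrees with the tabled F.05 = 3.885, and comparing the p-value with α leads to the same decision as comparing F with the critical value.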
                      The Tukey-Kramer Procedure
               Tells which population means are significantly
                different
                     e.g.: µ1 = µ2 ≠ µ3
                     Done after rejection of equal means in ANOVA
               Allows pair-wise comparisons
                     Compare absolute mean differences with critical
                      range
               [Figure: x axis marking µ1 = µ2 together and µ3 apart]
                         Tukey-Kramer Critical Range
                    Critical Range = q_\alpha \sqrt{ \frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right) }
             where:
                        qα = Value from Studentized range table
                                with k and nT - k degrees of freedom for
                                the desired level of α
                      MSW = Mean Square Within
                  ni and nj = Sample sizes from populations (levels) i and j
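
A minimal sketch of the critical range as a function. The name `critical_range` and all numbers in the usage lines (q = 3.5, MSW = 50, the sample sizes) are hypothetical and serve only to show how unequal sample sizes change the result; in practice qα comes from the table for the given k and nT − k.

```python
import math

def critical_range(q_alpha, msw, n_i, n_j):
    """Tukey-Kramer critical range for comparing the means of levels i and j."""
    return q_alpha * math.sqrt((msw / 2.0) * (1.0 / n_i + 1.0 / n_j))

# Hypothetical values, for illustration only.
balanced = critical_range(3.5, 50.0, 10, 10)
unbalanced = critical_range(3.5, 50.0, 10, 4)
print(f"n_i = 10, n_j = 10: critical range = {balanced:.3f}")
print(f"n_i = 10, n_j = 4:  critical range = {unbalanced:.3f}")
```

With unequal sample sizes each pair gets its own critical range, and a pair involving a smaller sample needs a larger mean difference to be declared significant.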
                   The Tukey-Kramer Procedure:
                            Example
     Club 1                Club 2                Club 3
      254                    234                  200
      263                    218                  222
      241                    235                  197
      237                    227                  206
      251                    216                  204

     1. Compute absolute mean differences:
            | \bar{x}_1 − \bar{x}_2 | = | 249.2 − 226.0 | = 23.2
            | \bar{x}_1 − \bar{x}_3 | = | 249.2 − 205.8 | = 43.4
            | \bar{x}_2 − \bar{x}_3 | = | 226.0 − 205.8 | = 20.2

     2. Find the q value from the table in appendix J with
        k and nT - k degrees of freedom for
        the desired level of α:
            qα = 3.77
                   The Tukey-Kramer Procedure:
                            Example
                 3. Compute Critical Range:
                      Critical Range = q_\alpha \sqrt{ \frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right) }
                                     = 3.77 \sqrt{ \frac{93.3}{2} \left( \frac{1}{5} + \frac{1}{5} \right) } = 16.285

                 4. Compare:
                      | \bar{x}_1 − \bar{x}_2 | = 23.2
                      | \bar{x}_1 − \bar{x}_3 | = 43.4
                      | \bar{x}_2 − \bar{x}_3 | = 20.2

                 5. All of the absolute mean differences
                 are greater than the critical range.
                 Therefore there is a significant
                 difference between each pair of
                 means at the 5% level of significance.
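
The five steps can be sketched as one loop over all pairs of clubs. The sample means, sample sizes, MSW = 93.3, and qα = 3.77 are taken from the example; the variable names are illustrative.

```python
import math
from itertools import combinations

# Values from the golf club example.
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
msw = 93.3        # mean square within from the ANOVA
q_alpha = 3.77    # table value for k = 3 and nT - k = 12 at alpha = .05

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    crit = q_alpha * math.sqrt((msw / 2.0) * (1.0 / sizes[a] + 1.0 / sizes[b]))
    results[(a, b)] = (diff, crit, diff > crit)
    verdict = "significant" if diff > crit else "not significant"
    print(f"{a} vs {b}: |difference| = {diff:.1f}, critical range = {crit:.3f}, {verdict}")
```

Because the design is balanced, every pair shares the same critical range; with unequal sample sizes the loop would compute a different range for each pair.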