Math 106         Lecture 7
Measures of Central Tendency
       (summarizing data with a single number)
              Mean, Median, Mode,
                   Grouped Data
                   Intro to Dispersion: Quartiles
© m j winter, ss2003
                         Mean, Median, Mode
                       Data points: x1, x2, …, xn
     Mean:
     \[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n} = \mu \]
     Median: list data in order: x1 ≤ x2 ≤ … ≤ xn.
      If n is odd, the median is the middle point.
      If n is even, the median is the average of the two
      middle points.
     Mode: the value of x which occurs most
      often. May be more than one; may be
      none.
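
The three definitions above can be sketched in Python. This is a minimal illustration, not part of the original lecture: the function names and the `sample` data are hypothetical, and treating a list with no repeated value as having no mode is an assumption that matches "may be none".

```python
from collections import Counter

def mean(xs):
    """Sum of the data points divided by how many there are."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle point of the sorted data (average of the two middle points if n is even)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

def modes(xs):
    """Every value that occurs most often; empty list if no value repeats (assumed convention)."""
    counts = Counter(xs)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

sample = [2, 7, 3, 7, 5, 3]   # hypothetical data, chosen to have two modes
print(mean(sample), median(sample), modes(sample))
```

Returning a list from `modes` covers both cases mentioned above: more than one mode, or none.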
               Example: Starting
                  Salaries
Data Set - Starting Salaries of Basket-weaving Majors:
$27,000, $27,000, $49,500, $37,300, $487,000,
  $15,000, $32,000, $37,500, $41,300
What was the mean (average) starting salary?
What was the median starting salary?
What was the mode?
                         Example - 2
Data Set - Starting Salaries of Basket-weaving Majors:
$27,000, $27,000, $49,500, $37,300, $487,000,
  $15,000, $32,000, $37,500, $41,300
Mean salary? (Add the salaries and divide by 9)
  $83,733.33
Median salary? (List in order, take middle)
In thousands of dollars: 15.0 27.0 27.0 32.0 37.3 37.5 41.3 49.5 487.0
  The middle (fifth) value is 37.3, so the median is $37,300.
Mode? $27,000
How useful are these numbers?
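
One way to probe that question is to recompute the statistics with and without the single $487,000 salary. The sketch below uses Python's standard `statistics` module; the variable names are hypothetical, and dropping the largest salary is only an illustration of outlier sensitivity, not a recommended way to clean data.

```python
from statistics import mean, median, mode

salaries = [27_000, 27_000, 49_500, 37_300, 487_000,
            15_000, 32_000, 37_500, 41_300]

# Statistics for all nine salaries.
print(round(mean(salaries), 2), median(salaries), mode(salaries))

# Same statistics with the one very large salary left out.
typical = [s for s in salaries if s != 487_000]
print(mean(typical), median(typical))
```

Leaving out one data point moves the mean from about $83,733 to $33,325, while the median only moves from $37,300 to $34,650.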
                  Measure of Central Tendency
                  Advantages, Disadvantages
  Mean: can be influenced by outliers.
       Useful mathematically.
       Most useful when data is ‘continuous’.
  Median: also a central number. Often more meaningful.
  However, it is possible that no data points (or very few) lie
    anywhere near the mean or median.
  Mode: useful when data is discrete, such as the number of cars in
    a family.
                         Questions
        x1 < x2 < x3 < x4 < x5 < x6 < x7 < x8 < x9 < x10
Calculate the mean
\[ \mu = \frac{x_1 + x_2 + \cdots + x_{10}}{10} \]
and the median
\[ m = \frac{x_5 + x_6}{2} \]
Now increase the largest number by 20. What is the new
mean? The new median?
 New mean:
 \[ \frac{x_1 + x_2 + \cdots + (x_{10} + 20)}{10} = \mu + \frac{20}{10} = \mu + 2 \]
 The median does not change:
 \[ m = \frac{x_5 + x_6}{2} \]
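
The result can be checked numerically. The ten values below are hypothetical (any strictly increasing list of ten numbers would do); the sketch only confirms that adding 20 to the largest value raises the mean by 2 and leaves the median alone.

```python
from statistics import mean, median

data = [1, 3, 4, 6, 8, 9, 11, 12, 15, 21]   # hypothetical x1 < x2 < ... < x10
shifted = data[:-1] + [data[-1] + 20]       # increase only the largest value by 20

print(mean(data), mean(shifted))       # means differ by 20/10 = 2
print(median(data), median(shifted))   # both use x5 and x6, which did not change
```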
                    Detour – weighted averages
Calculate the average of: 3.2, 3.2, 3.2, 4.0, 2.5, 2.5
\[ \frac{3.2 + 3.2 + 3.2 + 4.0 + 2.5 + 2.5}{6}
   = \frac{3(3.2) + 1(4.0) + 2(2.5)}{6}
   = \frac{3}{6}(3.2) + \frac{1}{6}(4.0) + \frac{2}{6}(2.5) = 3.10 \]
  This is a weighted average.
  The numbers \( \frac{3}{6}, \frac{1}{6}, \frac{2}{6} \) are called the weights.
  Note that the sum of the weights is 1.
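
As a rough sketch of the same calculation in Python, the weights can be derived from how often each value occurs. The names `counts`, `weights`, `weighted` and `plain` are hypothetical; `Fraction` is used only so the weights stay exact.

```python
from collections import Counter
from fractions import Fraction

values = [3.2, 3.2, 3.2, 4.0, 2.5, 2.5]
n = len(values)

# Weight of each distinct value = (number of occurrences) / n.
counts = Counter(values)
weights = {v: Fraction(c, n) for v, c in counts.items()}   # 3/6 is stored reduced, as 1/2

weighted = sum(float(w) * v for v, w in weights.items())
plain = sum(values) / n

print(weights)
print(sum(weights.values()))   # the weights add up to 1
print(round(weighted, 2), round(plain, 2))
```

The weighted form and the plain average agree because grouping equal values is just a rearrangement of the same sum.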
           Another example of Weighted Averages
 A student’s test average is 3.1 and the grade on the final
 exam is 2.8. If the exam is to count as 1/4 of the final
 average, how is this average computed?
 The weights are 3/4 and 1/4.
 \[ \frac{3}{4}(3.1) + \frac{1}{4}(2.8) = \frac{3(3.1) + 1(2.8)}{4} = 3.025 \]
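
The same computation as a small Python function, assuming only two components (test average and final exam) whose weights sum to 1. The function name and parameter names are hypothetical.

```python
def final_average(test_average, exam_grade, exam_weight):
    """Weighted average of two grades; the test weight is whatever the exam does not use."""
    test_weight = 1 - exam_weight
    return test_weight * test_average + exam_weight * exam_grade

result = final_average(3.1, 2.8, exam_weight=1 / 4)
print(round(result, 3))
```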
                  Mean or Average of Grouped Data
Set of 17 integers between 2 and 9 (inclusive)

  Interval   Freq
  [2, 3]     7
  [4, 5]     3
  [6, 7]     3
  [8, 9]     4

[Histogram: Freq vs. value from 2 to 9; bar heights 7, 3, 3, 4]
With data in a group, use the midpoint value.
                 Mean or Average of Grouped Data - 2
  Set of 17 numbers between 2 and 9 (inclusive).
  Use mid-interval value.

[Histogram: Freq vs. value from 2 to 9; bar heights 7, 3, 3, 4]

\[ \bar{x} = \frac{7 \cdot 2.5 + 3 \cdot 4.5 + 3 \cdot 6.5 + 4 \cdot 8.5}{17} = 4.9705\ldots \]
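
A minimal Python sketch of the grouped-data mean, assuming each group is stored as an (interval, frequency) pair. The `groups` layout and the function name are hypothetical; the intervals and frequencies are the ones on the slide.

```python
# Each entry: ((low, high), frequency)
groups = [((2, 3), 7), ((4, 5), 3), ((6, 7), 3), ((8, 9), 4)]

def grouped_mean(groups):
    """Treat every data point in a group as if it sat at the group's midpoint."""
    total = sum(freq for _, freq in groups)
    weighted_sum = sum(freq * (lo + hi) / 2 for (lo, hi), freq in groups)
    return weighted_sum / total

print(round(grouped_mean(groups), 4))
```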
                     Calculating the mean from a relative
                       frequency (density) histogram
[Relative frequency histogram over 2 to 9: bars at midpoints 2.5, 4.5, 6.5, 8.5
 with frequencies 7, 3, 3, 4 and relative frequencies .412, .176, .176, .235]

   .412 * 2.5 + .176 * 4.5 + .176 * 6.5 + .235 * 8.5 ≈ 4.964
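
The relative frequencies are just the frequencies divided by 17, so they act as weights that sum to 1. The sketch below (variable names are hypothetical) shows that exact relative frequencies reproduce the grouped mean 4.9705.., and that the small difference in 4.964 comes only from rounding the weights to three decimals.

```python
freqs = [7, 3, 3, 4]
midpoints = [2.5, 4.5, 6.5, 8.5]
n = sum(freqs)

rel = [f / n for f in freqs]                  # exact relative frequencies
rel_rounded = [round(r, 3) for r in rel]      # as printed on the histogram

exact = sum(r * m for r, m in zip(rel, midpoints))
rounded = sum(r * m for r, m in zip(rel_rounded, midpoints))

print(rel_rounded)
print(round(exact, 4), round(rounded, 4))
```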
                           Here’s the original data
 frequencies for noname.fma (column 1)

   Value   Count   Percent
   2       4       23.53%
   3       3       17.65%
   4       2       11.76%
   5       1        5.88%
   6       3       17.65%
   7       0
   8       3       17.65%
   9       1        5.88%

[Histogram: Freq vs. value from 2 to 9; bar heights 4, 3, 2, 1, 3, 0, 3, 1]

   mean value:                 4.76
The wider the bins, the more information you lose.
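
To see the loss of information on this data set, the sketch below recomputes the mean after replacing each value by the midpoint of its bin, for bins of width 1, 2 and 4. The function `binned_mean` and its `start` parameter are hypothetical, and the bins are assumed to cover consecutive integers starting at 2.

```python
counts = {2: 4, 3: 3, 4: 2, 5: 1, 6: 3, 7: 0, 8: 3, 9: 1}
n = sum(counts.values())
exact_mean = sum(v * c for v, c in counts.items()) / n   # 81/17

def binned_mean(counts, width, start=2):
    """Mean after replacing each value by the midpoint of its bin of `width` integers."""
    total = 0.0
    for v, c in counts.items():
        lo = start + ((v - start) // width) * width   # smallest integer in v's bin
        mid = lo + (width - 1) / 2                    # midpoint of lo .. lo + width - 1
        total += c * mid
    return total / sum(counts.values())

for width in (1, 2, 4):
    print(width, round(binned_mean(counts, width), 3))
```

For this data the estimate drifts further from 4.76 as the bins widen (about 4.97 for width 2 and 5.15 for width 4). The drift need not grow steadily for every data set.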
         The wider the bins, the more information
                         you lose.
The next slide shows four histograms formed from the
same data. The means are listed in the center.
They come from:
        Grouping Will Change the Mean!
http://www.shodor.org/interactivate/activities/histogram/index.html
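[Figure: four histograms of the same data with different groupings.
 Means listed in the center: 139.84, 147.2, 112.24, 206.00]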
                     Elevator-Simulation Examples
                 Number of times passengers got off at
                different floors (3 passengers, 6 floors)
     • List 1: (10 trials)
       6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5
     • List 2: (100 trials)
       56, 52, 49, 56, 50, 57, 61, 63, 56, 52, 55, 58, 49, 64,
       51, 51
     • List 3: (400 trials)
       213, 231, 221, 215
                                         Sorted Lists
List 1
3     5    5    5    6    6    6    6      6    6    7    8    8    9    9
Mean =                        Median =                   Mode =
List 2
49    49   50   51   51   52   52   55    56   56   56   57   58   61   63   64
 Mean =                       Median =                   Mode =
List 3
213, 215, 221, 231
 Mean =            Median =                              Mode =
                                        Sorted Lists
List 1
3    5    5    5    6    6    6    6      6    6    7    8     8    9    9
Mean = 6.33                   Median = 6                     Mode = 6
List 2
49   49   50   51   51   52   52   55    56   56   56   57    58   61   63   64
 Mean = 55                    Median = 55.5                   Mode = 56
List 3
213, 215, 221, 231
 Mean = 220                   Median = 218                     No Mode
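
These answers can be reproduced with Python's standard `statistics` module. The helper `summarize` is hypothetical, and so is its convention of reporting no mode when every value occurs exactly once (as in List 3).

```python
from statistics import mean, median, multimode

lists = {
    "List 1": [6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5],
    "List 2": [56, 52, 49, 56, 50, 57, 61, 63, 56, 52, 55, 58, 49, 64, 51, 51],
    "List 3": [213, 231, 221, 215],
}

def summarize(xs):
    """Return (mean rounded to 2 places, median, list of modes or None)."""
    m = multimode(xs)
    # multimode returns every value when none repeats; report that as "no mode".
    no_mode = len(m) == len(xs)
    return round(mean(xs), 2), median(xs), None if no_mode else m

for name, xs in lists.items():
    print(name, summarize(xs))
```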
                         Quartiles - use with median
3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
Median is midpoint - the number of elements below the
median equals the number above it.
First quartile: Take the median of the lower half.
Third quartile: Take the median of the upper half.
3    5    5    5    6    6    6    6      6    6    7    8     8    9    9
          first quartile = 5, third quartile = 8
          interquartile range: 8 – 5 = 3
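
A sketch of the median-of-halves rule in Python. It assumes that when n is odd the middle point is left out of both halves, which is the choice that gives 5 and 8 for this list; other conventions for quartiles exist and can give slightly different values. The function name is hypothetical, and the list must have at least two points.

```python
from statistics import median

def quartiles(xs):
    """Return (first quartile, median, third quartile) using the median-of-halves rule."""
    s = sorted(xs)
    half = len(s) // 2
    lower = s[:half]      # points below the middle
    upper = s[-half:]     # points above the middle (middle point excluded when n is odd)
    return median(lower), median(s), median(upper)

list1 = [6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5]
q1, med, q3 = quartiles(list1)
print(q1, med, q3, q3 - q1)
```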
Interquartile Range, Box Plot, 5-number summary
                     Roadhog: someone who takes his half of
                     the road out of the middle
                     The interquartile range is the width
                     (range) of the middle half of your data.
                 5 number summary: {3,5,6,8,9}

       min   first quartile   median   third quartile   max
       3     5                6        8                9

[Box plot: box from 5 to 8, with two sections each labeled 25.0%]
    Estimating the mean from the 5-number
                   summary
    5 number summary: {3,5,6,8,9}
    interval   weight     midpoint
    [3,5]      1/4        4
    [5,6]      1/4        5.5
    [6,8]      1/4        7
    [8,9]      1/4        8.5
\[ \frac{1}{4}(4) + \frac{1}{4}(5.5) + \frac{1}{4}(7) + \frac{1}{4}(8.5)
   = \frac{4 + 5.5 + 7 + 8.5}{4} = \frac{25}{4} = 6.25 \]
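
The estimate is a weighted average of the four interval midpoints, each with weight 1/4. A minimal Python sketch follows; the function name is hypothetical, and the estimate assumes the data are spread evenly within each quarter.

```python
def mean_from_five_numbers(summary):
    """Estimate the mean as the average of the midpoints of the four quarter intervals."""
    points = list(summary)                                   # min, Q1, median, Q3, max
    mids = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return sum(mids) / 4

estimate = mean_from_five_numbers([3, 5, 6, 8, 9])
print(estimate)
```

For List 1 the estimate 6.25 is close to the actual mean of 6.33.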
       Commonly reported
       statistical results
List 1: (10 trials)
6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5
[Histogram of List 1: Freq (0 to 6) vs. value (3 to 10)]