Lecture 7
Comparison of Mean, Median and Mode
Measures of central tendency and measures of dispersion describe where a
distribution is centred and how spread out it is, but they are not
sufficient to describe its shape. For this purpose we use two further
statistical measures, Skewness and Kurtosis, which compare the shape of
the distribution to the normal curve. Skewness and Kurtosis are two
important characteristics of a distribution that are studied in
descriptive statistics.
1-Skewness
Skewness is a statistical number that tells us whether a distribution is
symmetric or not. A distribution is symmetric if its right side mirrors
its left side. If a distribution is symmetric, then the Skewness value is
0. For a symmetric distribution with a single peak (such as the normal
distribution): mean = median = mode. If Skewness is greater than 0, the
distribution is called right-skewed (positively skewed): the right tail is
longer than the left tail, and typically mode < median < mean. If Skewness
is less than 0, the distribution is called left-skewed (negatively
skewed): the left tail is longer than the right tail, and typically
mean < median < mode.
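The sign rule above can be checked numerically. The sketch below assumes the moment-based definition of skewness (third central moment divided by the 1.5 power of the second central moment); the lecture may use a different sample formula. The three small data sets are made up for illustration.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, chosen only to show the sign of the result.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 10]   # one large value stretches the right tail
left_skewed = [1, 8, 9, 9, 10]    # one small value stretches the left tail

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, round(skewness(data), 3))
```

The symmetric set gives 0, the right-skewed set a positive value, and the left-skewed set a negative value.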
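For example, symmetrical and skewed distributions can be shown by curves:
[Figure: curves of symmetrical and skewed distributions]
Another example:
[Figure: a second example of skewed distributions]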
2-Kurtosis
Kurtosis is a statistical number that tells us whether a distribution is
more sharply peaked (with heavier tails) or flatter than a normal
distribution. Here Kurtosis is measured relative to the normal
distribution (excess kurtosis), so if a distribution is similar to the
normal distribution, the Kurtosis value is 0. If Kurtosis is greater than
0, the distribution has a higher peak than the normal distribution. If
Kurtosis is less than 0, it is flatter than a normal distribution.
There are three types of distributions:
Leptokurtic (Kurtosis > 0): Sharply peaked with fat tails
Mesokurtic (Kurtosis = 0): Medium peaked, like the normal distribution
Platykurtic (Kurtosis < 0): Flattest peak and highly dispersed
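As a rough illustration of these three types, the sketch below assumes the moment-based excess kurtosis (fourth central moment divided by the squared second central moment, minus 3), so that a normal distribution scores about 0. The two data sets are made up for illustration.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Hypothetical data sets.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # evenly spread: platykurtic
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]    # tall peak, two far values: leptokurtic

print("flat:", round(excess_kurtosis(flat), 3))
print("peaked:", round(excess_kurtosis(peaked), 3))
```

The evenly spread set gives a negative value (platykurtic) and the peaked set a positive value (leptokurtic).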
For example, the different types of Kurtosis:
[Figure: leptokurtic, mesokurtic and platykurtic curves]
Examples: Calculate Sample Skewness and Sample Kurtosis from the
following grouped data
Measures of Variability
Range
Sample range: the simplest measure of variability, given by the
difference between the largest and the smallest observation:
Range = maximum value - minimum value
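As a small sketch, the range is the largest observation minus the smallest. The function name `sample_range` is only illustrative; the data are the student heights used in Problem 1 of this lecture.

```python
def sample_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

heights = [163, 158, 167, 174, 148]  # heights in cm, from Problem 1
print(sample_range(heights))         # 174 - 148 = 26
```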
Variance:
Variance in statistics is a measurement of the spread between numbers
in a data set. That is, it measures how far each number in the set is from
the mean, so Variance is defined as the average of the squared
differences from the mean:
Variance = sum of (x - mean)^2 / N
For a sample, the sum is usually divided by n - 1 instead of n.
Variance cannot be negative, because it is an average of squared values.
A zero value means that all of the values within a data set are identical.
If the variance is low, the data cluster near the average, while if the
variance is high, the data are spread far from the average.
Problem 1:
The heights (in cm) of the students in a class are 163, 158, 167,
174, 148. Find the variance.
Solution:
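A minimal sketch of the calculation, following the definition of variance as the average of the squared differences from the mean (dividing by N). The sample version, dividing by n - 1, is shown alongside because it is not clear from the text which one the lecture intends.

```python
heights = [163, 158, 167, 174, 148]  # cm

n = len(heights)
mean = sum(heights) / n                          # 810 / 5 = 162
squared_diffs = [(x - mean) ** 2 for x in heights]
sum_sq = sum(squared_diffs)                      # 1 + 16 + 25 + 144 + 196 = 382

population_variance = sum_sq / n                 # 382 / 5 = 76.4
sample_variance = sum_sq / (n - 1)               # 382 / 4 = 95.5

print("mean:", mean)
print("population variance:", population_variance)
print("sample variance:", sample_variance)
```

The variance is in squared units (cm^2): 76.4 when dividing by N, or 95.5 when dividing by n - 1.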
Variance for grouped data
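For grouped data, each class is commonly represented by its midpoint x and its frequency f, and the variance becomes a frequency-weighted average of squared differences from the mean. The sketch below assumes this midpoint approach and divides by the total frequency N. The midpoints and frequencies are made up for illustration and are not the lecture's data.

```python
def grouped_mean_variance(midpoints, frequencies):
    """Frequency-weighted mean and variance (dividing by N = sum of f)."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total
    variance = sum(f * (x - mean) ** 2
                   for x, f in zip(midpoints, frequencies)) / total
    return mean, variance

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
midpoints = [5, 15, 25, 35, 45]
frequencies = [2, 3, 5, 3, 2]

g_mean, g_var = grouped_mean_variance(midpoints, frequencies)
print("mean:", g_mean)
print("variance:", round(g_var, 2))
```

To get the sample version, divide by N - 1 instead of N in the variance line.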
Find the variance of the following data:
Find the variance of the following data:
Standard deviation
Standard deviation is a measure of dispersion in statistics.
"Dispersion" tells you how much your data is spread out. Specifically,
the standard deviation shows how much your data is spread out around the
mean or average. It is the square root of the variance. It is the most
widely used measure of dispersion since, unlike the range and
inter-quartile range, it takes into account every value in the dataset
(which also makes it sensitive to extreme values). For example, are all
your scores close to the average? Or are lots of scores way above (or way
below) the average score? When the values in a dataset are tightly
bunched together, the standard deviation is small. When the values are
spread apart, the standard deviation is relatively large. The standard
deviation is usually presented in conjunction with the mean and is
measured in the same units as the data.
For example, suppose we have five climatic stations and have recorded
rainfall in mm as follows (60,47,17,43,30). Calculate the standard
deviation for them.
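A minimal sketch of the calculation, taking the standard deviation as the square root of the variance. Both the population form (dividing by N) and the sample form (dividing by n - 1) are shown, since the text does not say which one is intended.

```python
import math

rainfall = [60, 47, 17, 43, 30]  # mm, one value per climatic station

n = len(rainfall)
mean = sum(rainfall) / n                              # 197 / 5 = 39.4
sum_sq = sum((x - mean) ** 2 for x in rainfall)       # sum of squared differences

population_sd = math.sqrt(sum_sq / n)                 # divide by N
sample_sd = math.sqrt(sum_sq / (n - 1))               # divide by n - 1

print("mean:", round(mean, 1))
print("population SD:", round(population_sd, 2))
print("sample SD:", round(sample_sd, 2))
```

The sum of squared differences is about 1085.2, giving a standard deviation of about 14.73 mm (population) or 16.47 mm (sample), in the same units as the rainfall.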
Standard deviation for grouped data