PROBABILITY AND STATISTICS:
STAT 166
J.K. AFRIYIE
Department of Statistics and Actuarial Science
KNUST
archimedes09.jak@gmail.com
jonathan.afriyie@knust.edu.gh
June 3, 2025
Summarizing Data
Summarizing Data Graphically
Frequency Distribution Table
Definition
A frequency distribution table is the organisation of raw
data into mutually exclusive categories showing the number of
observations in each class.
Constructing Frequency Distribution Tables
Rules for constructing a frequency distribution
To construct a frequency distribution, follow these rules:
The table should have between 5 and 20 classes.
The classes must be mutually exclusive. This means that
the class limits must not overlap, so that no
observation can be placed into two classes.
The classes must be continuous. This means that there
must be no gaps in the frequency distribution.
The classes must be exhaustive. That is, there must be
enough classes to accommodate all the data.
The classes must be equal in width (size).
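The rules above can be checked mechanically. The following is a minimal sketch, assuming integer-valued data so that consecutive classes such as 10 - 14 and 15 - 19 leave no gap; the function name `check_classes` and the sample values are illustrative and not part of the course material.

```python
def check_classes(classes, data):
    """Return a list of rule violations for (lower, upper) class limits.

    Assumption: the data are whole numbers, so a difference of exactly 1
    between an upper limit and the next lower limit means "no gap".
    """
    problems = []
    # Rule: between 5 and 20 classes.
    if not 5 <= len(classes) <= 20:
        problems.append("number of classes is not between 5 and 20")
    # Rule: equal width.
    widths = {upper - lower for lower, upper in classes}
    if len(widths) != 1:
        problems.append("classes are not equal in width")
    # Rules: mutually exclusive (no overlap) and continuous (no gaps).
    for (_, upper), (next_lower, _) in zip(classes, classes[1:]):
        if next_lower <= upper:
            problems.append(f"classes overlap at {next_lower}")
        elif next_lower - upper > 1:
            problems.append(f"gap between {upper} and {next_lower}")
    # Rule: exhaustive (every observation falls in some class).
    if any(not any(lo <= x <= hi for lo, hi in classes) for x in data):
        problems.append("classes are not exhaustive")
    return problems


classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
print(check_classes(classes, [10, 17, 22, 34]))  # prints []
```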
Definitions
Class Limit
Class limit is the starting and ending point of a particular
class. The starting value of each class is called the lower limit
of the class and the ending value of each class is called the
upper limit.
Class      Class Boundary
10 - 14    9.5 - 14.5
15 - 19    14.5 - 19.5
20 - 24    19.5 - 24.5
25 - 29    24.5 - 29.5
30 - 34    29.5 - 34.5
Class Boundary
The class boundary is the midpoint between the upper class
limit of a class and the lower class limit of the next class in
sequence. To find it, take the difference between the upper class
limit of a class and the lower class limit of the next class
and divide by 2. In the table above, this difference is
15 − 14 = 1, and half of it is 0.5. Subtract 0.5 from the lower
class limit of each class and add 0.5 to the upper class limit of
each class.
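The procedure above can be sketched in a few lines. This is an illustrative example that assumes the gap between consecutive classes is the same throughout the table; the function name `class_boundaries` is hypothetical.

```python
def class_boundaries(limits):
    """Convert (lower, upper) class limits into class boundaries."""
    # Gap between the upper limit of the first class and the lower
    # limit of the second class (assumed constant for all classes).
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    # Subtract half the gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in limits]


limits = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
for (lo, hi), (lb, ub) in zip(limits, class_boundaries(limits)):
    print(f"{lo} - {hi}: {lb} - {ub}")
```

Because the half-gap is computed from the limits rather than fixed at 0.5, the same function also handles tables whose classes are separated by a larger gap.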
Class boundaries are also used to separate classes.
Class Boundary
Note that the amount subtracted from the lower class limits and
added to the upper class limits is not always 0.5.
For the frequency table below, the gap between the upper limit of
one class and the lower limit of the next is 1080 − 1000 = 80, so
following the steps above we subtract 40 (i.e. half of 80) from the
lower class limits and add 40 to the upper class limits.
Marks Frequency Class Boundaries
980 - 1000 10 940 - 1040
1080 - 1100 15 1040 - 1140
1180 - 1200 24 1140 - 1240
1280 - 1300 21 1240 - 1340
1380 - 1400 17 1340 - 1440
1480 - 1500 13 1440 - 1540
Class Width/Size
It is the difference between the lower and the upper class
boundaries of a class.
Class Midpoint
Class midpoint is a point that divides a class into two
equal parts. That is, the average of the lower and
upper class limits.
Class Frequency(f)
Class frequency is the number of observations in
each class.
Example
Class      Midpoint   freq
10 - 14    12         13
15 - 19    17         15
20 - 24    22         12
25 - 29    27         14
30 - 34    32         18
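As a quick check of these definitions, the sketch below recomputes the midpoints in the example table and the class width from the boundaries 9.5 - 14.5. The helper names are illustrative only.

```python
def class_midpoint(lower_limit, upper_limit):
    """Average of the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2


def class_width(lower_boundary, upper_boundary):
    """Difference between the upper and lower class boundaries."""
    return upper_boundary - lower_boundary


limits = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
midpoints = [class_midpoint(lo, hi) for lo, hi in limits]
print(midpoints)               # prints [12.0, 17.0, 22.0, 27.0, 32.0]
print(class_width(9.5, 14.5))  # prints 5.0
```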
Relative Frequency (RF)
Relative frequency is the
proportion of values falling
into a class. It is obtained by
dividing the frequency of the
class by the total frequency.
Cumulative Frequency(CF)
Cumulative frequency is
obtained by summing the
frequency of a class and the
frequencies of all the classes
preceding it.
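Both quantities follow directly from the class frequencies. The sketch below uses the frequencies 13, 15, 12, 14, 18 from the earlier example table (classes 10 - 14 to 30 - 34); the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [13, 15, 12, 14, 18]
total = sum(frequencies)  # total frequency, 72 here

# Relative frequency: class frequency divided by the total frequency.
relative = [f / total for f in frequencies]

# Cumulative frequency: running sum of the class frequencies.
cumulative = list(accumulate(frequencies))

for f, rf, cf in zip(frequencies, relative, cumulative):
    print(f"f = {f:2d}   RF = {rf:.4f}   CF = {cf}")
```

The relative frequencies always sum to 1, and the last cumulative frequency always equals the total frequency, which gives two easy checks on a hand-built table.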
Constructing Grouped Frequency Distribution
Steps For Constructing Grouped Frequency Distribution
1. Decide the number of classes using the formula
$$2^k \geq n$$
where k is the number of classes and n is the number of
observations.
2. Determine the class width using the formula
$$w \geq \frac{\text{Range}}{k} = \frac{H - L}{k}$$
where H is the highest value and L is the lowest value.
NB: In the case of a decimal value, round up the value to get the
class width.
3. Set the individual class limits. That is, set the lower limit of
the first class by starting from the lowest value in the data, and
then add the width (w) to get the lower limit of the next class.
Keep adding until there are k classes. Subtract 1 from the lower
limit of the second class to get the upper limit of the first class
and so on.
4. Tally and record the number of items in each class.
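The four steps can be sketched as a single function. This is a minimal illustration, assuming whole-number data; the data set is hypothetical and the name `grouped_frequency` is not from the course. One assumption goes slightly beyond the steps: when (H − L)/k is already a whole number, the width is increased by 1 so that the highest value still falls inside the last class.

```python
def grouped_frequency(data):
    """Build a grouped frequency table as (lower, upper, frequency) rows."""
    n = len(data)
    # Step 1: smallest k with 2**k >= n.
    k = 1
    while 2 ** k < n:
        k += 1
    # Step 2: class width. For a decimal (H - L) / k this rounds up;
    # for an exact whole number it adds 1 (assumption, see lead-in).
    low, high = min(data), max(data)
    w = (high - low) // k + 1
    # Step 3: class limits, starting from the lowest value.
    classes = [(low + i * w, low + (i + 1) * w - 1) for i in range(k)]
    # Step 4: tally the observations in each class.
    return [(lo, hi, sum(lo <= x <= hi for x in data)) for lo, hi in classes]


# Hypothetical data, for illustration only.
data = [2, 5, 7, 8, 11, 12, 12, 15, 16, 18,
        19, 21, 22, 24, 25, 27, 28, 30, 31, 33]
table = grouped_frequency(data)
for lo, hi, f in table:
    print(f"{lo:2d} - {hi:2d}   {f}")
```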
Example 1
Solutions
1. First, we decide the number of classes using the formula
$$2^k \geq n$$
where k is the number of classes and n = 50 is the number of
people interviewed in Kotsah Island. Since $2^5 = 32 < 50$ and
$2^6 = 64 \geq 50$, we have k = 6.
2. Obtain the class width:
$$w = \frac{H - L}{k} = \frac{78 - 2}{6} = 12.6667 \approx 13 \qquad (1)$$
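The arithmetic in this solution can be verified with a short script. It uses only the values stated above (n = 50, H = 78, L = 2); the list of lower class limits is derived by applying step 3 and is shown as an illustration, not as the author's table.

```python
import math

n, highest, lowest = 50, 78, 2

# Smallest k with 2**k >= n, i.e. log2(n) rounded up.
k = math.ceil(math.log2(n))

# Class width: (H - L) / k rounded up to the next whole number.
w = math.ceil((highest - lowest) / k)

# Lower class limits obtained by repeatedly adding the width.
lower_limits = [lowest + i * w for i in range(k)]

print(k, w)          # prints 6 13
print(lower_limits)  # prints [2, 15, 28, 41, 54, 67]
```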
Categorical Frequency Distribution
Example 2
Solution
Graphs For Quantitative Data
1. Histogram
A histogram is a graph that displays the data by using
contiguous vertical bars of various heights to represent the
frequencies of the classes. For example, Figure 1 shows the
histogram plot for the number of travel times in the table below:
Graphical Presentation of Data
Figure: A histogram showing the number of travel times
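What makes the bars of a histogram contiguous is that they are drawn between class boundaries, so each bar ends exactly where the next begins. The sketch below illustrates this with the boundaries and frequencies from the earlier example table (not the travel-time data); the dictionary layout is an assumption made for illustration.

```python
boundaries = [(9.5, 14.5), (14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5)]
frequencies = [13, 15, 12, 14, 18]

# Each bar starts at the lower class boundary, spans the class width,
# and has a height equal to the class frequency.
bars = [
    {"left": lo, "width": hi - lo, "height": f}
    for (lo, hi), f in zip(boundaries, frequencies)
]

# Contiguous: the right edge of each bar is the left edge of the next.
contiguous = all(
    a["left"] + a["width"] == b["left"] for a, b in zip(bars, bars[1:])
)
print(contiguous)  # prints True
```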
2. Cumulative Frequency Curve (Ogive)
The cumulative frequency curve is a graph that represents
the cumulative frequencies for the classes in a frequency
distribution. For example, Figure 3 shows the Ogive plot for the
number of travel times in the table below:
Figure: A cumulative frequency curve (Ogive)
Figure: A cumulative frequency curve showing the number of travel
times
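The points of an ogive are conventionally the upper class boundaries paired with the cumulative frequencies, with a starting point at the first lower boundary and a cumulative frequency of zero. The sketch below computes these points for the earlier example table (not the travel-time data), under that convention.

```python
from itertools import accumulate

lower_boundary_of_first_class = 9.5
upper_boundaries = [14.5, 19.5, 24.5, 29.5, 34.5]
frequencies = [13, 15, 12, 14, 18]

# (x, y) points to join with line segments: x is an upper class
# boundary, y is the cumulative frequency up to that boundary.
points = [(lower_boundary_of_first_class, 0)] + list(
    zip(upper_boundaries, accumulate(frequencies))
)
print(points)
```

Because cumulative frequencies never decrease, the resulting curve rises from 0 to the total frequency.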
4. Scatter plot
A scatter plot shows how two variables are related by plotting
each pair of observations as a point. It is often used when we
have a bivariate dataset and we wish to examine the relationship
between the two variables.
Graphs For Categorical Data
1. Bar Graph
A bar graph is a graph of vertical or horizontal bars whose
heights (or lengths) represent the frequencies of the respective
categories. For instance, the figure shows a bar graph for the
different types of floor tiles produced by a construction firm in a
given day.
Figure: A bar graph showing the number of floor tiles produced in a
given day
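Before a bar graph can be drawn, the categorical observations have to be counted. The sketch below tallies a small, entirely hypothetical list of tile types (the names and counts are made up for illustration) and prints one text bar per category.

```python
from collections import Counter

# Hypothetical observations, one entry per tile produced.
tiles = ["ceramic", "marble", "ceramic", "granite", "ceramic", "marble"]

# Categorical frequency distribution: category -> frequency.
counts = Counter(tiles)

# One horizontal bar per category, its length equal to the frequency.
for category, f in counts.most_common():
    print(f"{category:<8} {'#' * f} {f}")
```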
2. Pie Chart
A pie chart is a circle divided into sectors. Each sector
represents a category of data. The area of each sector is
proportional to the frequency of the category.
Example
Problem: The data presented in Table 6 represent the
educational attainment of residents of the United States 25
years or older in 2006, based on data obtained from the U.S.
Census Bureau. The data are in thousands. Construct a pie
chart of the data.
Approach
The pie chart will have seven parts, or sectors,
corresponding to the seven categories of data. The area of
each sector is proportional to the frequency of each
category.
For example, 11,742/191,885 ≈ 0.0612 of all U.S.
residents 25 years or older have less than a 9th-grade
education. The category "less than 9th grade" will make up
6.12% of the area of the pie chart.
Since a circle has 360 degrees, the degree measure of the
sector for the category "less than 9th grade" will be
(0.0612)(360°) ≈ 22°.
Use a protractor to measure each angle.
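The angle calculation in this approach can be sketched as follows. Only the two figures quoted above (11,742 and 191,885) are used; the remaining six categories are lumped into a single illustrative entry rather than guessed, and the function name `sector_angles` is hypothetical.

```python
def sector_angles(counts):
    """Map each category to its sector angle in degrees."""
    total = sum(counts.values())
    # Each sector gets the same share of 360 degrees as its share of the total.
    return {category: 360 * f / total for category, f in counts.items()}


counts = {
    "Less than 9th grade": 11_742,
    # Placeholder for the other six categories, derived from the total.
    "All other categories combined": 191_885 - 11_742,
}
angles = sector_angles(counts)
print(round(angles["Less than 9th grade"]))  # prints 22
```

The angles of all sectors always add up to 360 degrees, which is a useful check before drawing the chart.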
Solution:
We follow the approach presented for the remaining categories
of data to obtain Table 7.
To construct a pie chart by hand, we use a protractor to approximate
the angles for each sector. See Figure 6.