Stats Part 1

Uploaded by

tshedyleemisa27

DATA COLLECTION AND SAMPLING

Learning Objective

Here, the learning objectives are as follows:

 Plan how to collect statistical data to test a set of predictions.
 Use data to make inferences and generalisations.
 Look at alternative ways to choose a sample and decide which is the best method to use.

This means that at the end of this unit, you should be able to:

 Carry out statistical investigations, collect data and sample, and make inferences and generalisations.

Introduction

To answer statistical questions, you need to collect data.

Statistical Data: the outcomes, counts or observations obtained from an investigation or experiment.

Types of Data: There are different types of data:

1. Discrete data: data that can only take certain values. They are fixed data values determined by counting.

The data can be counted and have a limited number of possible values, usually whole numbers. The values can therefore be 0, 1, 2, and so on.

Examples of discrete data are:

 The number of people in a class

 The number of test questions answered correctly

2. Continuous data: data that can take any value within a range, including decimal values. They are obtained by measuring rather than counting. Height, weight, length, temperature, mass and time are all examples of continuous data.

Examples are:

 The weight of a baby in its first year

 The temperature in a room throughout the day

Note:
 Both discrete and continuous data are regarded as Quantitative data.
 Quantitative data are about numeric values (number-based, countable or measurable: how many, how much, how often etc.).

3. Categorical data: are data that can be grouped into categories instead of
being measured numerically. The data are not given in numbers but rather
in natural language descriptions (words).

Note:
 Categorical data is otherwise known as Qualitative data.
 Qualitative data are interpretation-based, descriptive and relate to language or words: why, how or what …

Some examples are:

 Colour of hair
 Name of department in your school
 Political affiliation in your country
 The gender represented in a school basketball team.

So, data can in essence be categorised into two groups:

1. Qualitative data: for example, categorical data.

2. Quantitative data: for example, discrete and continuous data.

Data
   Quantitative
      Discrete Data
      Continuous Data
   Qualitative
      Categorical Data

Methods of Collecting Data

Earlier, in Grade 8, you learnt several ways in which data can be collected. Some of these include:

 Interviewing people
 Using a questionnaire
 Carrying out an experiment or observation
 Taking measurements
 Carrying out a survey

Suggested Teaching Approach

Design and use practical activities to test learners’ previous knowledge of:
 Data and
 Data collection
Data Collection

To answer questions in statistics, you need to collect data.

Data collection is a way of gathering and measuring information that

enables one to answer relevant questions and evaluate outcomes.

The following are the steps for collecting data:

 First, decide which type of data you need to collect: discrete,

continuous or categorical.
 Decide how to collect the data (method): interview,
questionnaire, observation, survey etc.
 Evaluate the outcomes.

In your previous classes on data, the focus was on the collection of data (the type of data and the method of collecting it). In this section, while planning statistical investigations, you will not only collect data; you will also make predictions, inferences and generalisations. It is therefore important to become familiar with these.

Statistical Prediction

Statistical Prediction is the expected result or outcome of a test, based on an assumption, hypothesis or theory (a proposed explanation or statement).

In the scientific method, the prediction is made before any research is carried out.

For instance:

From the study of a student’s performance, such as scores on tests and exams, we may predict the student’s final grade with reasonable accuracy.
Statistical Inference, Sample and Generalisation

Statistical Inference

Statistical Inference is the practice of using sampled data to make judgments, draw conclusions or make predictions about a larger group or population (the people you are interested in).

This is needed when you cannot question the whole population. In such a case, you need to choose a sample.

In simple terms, inference is where data is collected from a group of subjects and then used to make predictions about a larger group.

For example,

 Suppose you ask a sample of 100 students from a school population whether they prefer physical school or virtual learning,
 And 75 of the 100 students prefer virtual learning while 25 prefer physical school.
 Using the data, you may infer that about 75 percent of the students in the school prefer virtual learning while about 25 percent prefer physical school.
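
The arithmetic behind this inference can be sketched in a few lines of Python. This is only an illustration: the sample figures (100 students, 75 preferring virtual learning) come from the example, while the school size of 1200 is a hypothetical number chosen to show how a sample proportion is scaled up to a population.

```python
# Sample result from the example: 75 of 100 sampled students prefer virtual learning.
sample_size = 100
prefer_virtual = 75


def sample_proportion(count, n):
    """Return the fraction of the sample that falls in one category."""
    return count / n


p_virtual = sample_proportion(prefer_virtual, sample_size)
p_physical = sample_proportion(sample_size - prefer_virtual, sample_size)

# Hypothetical school size (not given in the text), used only to show scaling up.
school_population = 1200
estimated_virtual = round(p_virtual * school_population)

print(f"Virtual learning: {p_virtual:.0%}, physical school: {p_physical:.0%}")
print(f"Estimated students preferring virtual learning: {estimated_virtual}")
```

The estimate is only as good as the sample: it assumes the 100 students were chosen in a way that represents the whole school.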

Statistical inference is therefore most reliable when it is based on random sampling, i.e. on data sampled at random from the larger population.

Statistical inference therefore helps to make generalisations, i.e.

 It helps to estimate a general parameter of a larger group or population based on data acquired from samples.
 It helps to determine how a larger group of subjects will perform based on the performance of the existing subjects.
 It helps to make educated predictions about how a set of results will scale when applied to a larger population.
Generalisation

Generalisation is therefore a general statement or conclusion obtained by inference from specific cases.

Let’s get to work by looking at this example:

Social media is considered a challenge to students and their studies.

You are asked to investigate which gender spends more time on social media.

Here you need to think about things like:

 Activities or apps available on social media.
 Gender in relation to the activities students spend time on.
 Age and gender in relation to time spent.

Sampling

1. Statistical Questions: first write some questions you could ask about which gender spends more time on social media.

 What types of activities do students engage in on social media? WhatsApp, Facebook or Instagram engagement, playing games, watching sport, movies or cartoons, trading, etc.?
 Which gender spends more time on social media, based on the outcome for each individual activity?
 How do age and gender determine the activity and who spends more time on social media?

2. Make Predictions: from your questions, write some predictions you could test.

 Girls spend more time on Facebook and Instagram than boys, or
 Boys spend more time playing games or watching sport than girls.
 While girls generally spend more time watching movies/soapies, boys spend time trading online.
 Girls aged 11-13 spend more time watching cartoons.

3. Collect Data and Sample: decide on some different ways of choosing a sample to test one or more of your predictions.

 To test some or all of your predictions, you need to think about the type of data you will collect and how to collect it.
 Here, you may think of collecting discrete and categorical data using a questionnaire and/or a data sheet with headings to fill in or tick.
 Then choose samples to test one or more of your predictions as shown below:

i. Data Sheet

 Fill the activities using M for Male and F for Female.

Grade Facebook Instagram Games Sport

ii. Questionnaire

Questions Yes or No
Do girls spend more time watching movies or soapies?
Are boys more involved in online trading than girls?
Questionnaire

Which group of students spends more time on cartoons?

 Tick the appropriate age range and

 For the gender, write either M for Male or F for Female.

Age Range Gender

11-13 14-16 17-18

4. Trial: Carry out a small trial investigation to test your data collection method. Can you think of ways to improve your investigation?

 Here, use the data sheet/questionnaire to get sample opinions from, say, 120 students from different levels of your school. For instance, 20 students from each level (Grade 8 to A Level).

5. Generalisation: Use the results of your trial to make a generalisation about which gender of students spends more time on social media.

You may use your samples to generalise among:

 Students as a whole.
 Students at the lower secondary (Grade 8 and 9).
 Students at the upper secondary (Grade 10 and 11).
 Students at advanced level (AS and A levels).
Do it Yourself Questions

1. You are going to investigate the impact of a school’s science fair on students
in the last five years.
a. Write some questions you could ask about the impact of science fair on
students in the last five years.
b. Write some predictions you can test.
c. Describe some different ways of choosing a sample to test one or more
predictions.
d. Which sample method is best? Give a reason for your answer.
e. Carry out a small trial of your investigation. Can you think of ways to improve your investigation?
f. Use the results of your trial to make a generalisation about the impact of a school’s science fair on students in the last five years.

Follow the guidelines in question 1 and answer the following questions:

2. A school has introduced the use of tablets at school. You are going to investigate whether the introduction of tablets at school helps students.

3. The traffic controller of a police department wants to carry out an investigation into controlling traffic during closing hours and the accidents it may cause. Investigate how this section of the police may carry out the investigation.
BIAS: POTENTIAL ISSUES AND SOURCES

Learning Objective

Here, the learning objectives are to learn:

 about sources of bias
 about ways to choose an unbiased sample
 how to identify wrong or misleading information

This means that at the end of this unit, you should be able to:

 collect data that are representative of the whole population, i.e. unbiased.

Statistical bias occurs when statistics do not provide an accurate representation of the population.

Sources of Bias

There are different possible sources of bias. Statistical bias can occur due to factors such as the method of sampling, i.e. some data are biased because the sample of people surveyed doesn’t accurately represent the population, or because certain groups are underrepresented in the sample.

So, the accuracy or reliability of a statistical investigation depends on the sample data collected. Sample data that do not represent the whole population are biased.

Example 1

A presidential candidate wants to be the voice of the majority, and so he carries out an investigation to find out what people in a community think about the meat industry.

 He goes to a vegetarian community and asks 7 people, and all of them say the meat industry should be banned.
 And so he concludes that everyone in the community wants the meat industry to be banned.
 In the election, he receives less than 2% of the votes available.

Explain why.

 His sample represents only one group (vegetarians) who dislike meat; the entire population is not represented. His sample data are therefore biased.
 Sample data should be representative of the entire population to be accurate. This is why bias is regarded as prejudice in favour of or against one thing, person or group compared with another.

Bias is bad, and our objective in this section is to minimise bias as much as we can.

Other sources of bias are selection bias, omitted variable bias, observer bias,
recall bias, funding bias etc.

Example 2

A garage coordinator wants to find out the average number of taxis arriving at the park in a day. On a cold day, he took a sample of taxis arriving from 7am to 12pm and based his conclusion on this sample.

This is a biased sample in the sense that:

 The time interval for the sample is short.
 The sample is taken on only one day.
 Being a cold day, people may have delayed leaving their homes.

Activity
 Suggest what he can do to achieve a reliable investigation.
 Explain your suggestion.

Do it Yourself Questions 1

1. Below are questions from an investigation:

a. Do you agree that ChatGPT will make students think less?
b. Do you think the cost of groceries is too high for minimum wage earners?
c. Do you prefer your new teacher of Philosophy?
i. Explain why these questions will give a biased result.
ii. How could the questions be rewritten to avoid bias?

2. A student carries out an investigation to test whether a family of 12 members, comprising 4 children, 2 parents and 6 others, prefers grape juice or orange juice.
In her investigation, she only took a sample of the 4 children, who say they prefer orange juice, and generalised that the whole family likes orange juice.
a. Explain the problem with her investigation.
b. Suggest how she can solve this problem.

3. A football fan base comprises 100 males and 50 females.

The football club wants to choose a representative sample of 60 fans to support the team abroad.
i. Suggest how the selection should be done.
ii. Use your answer to determine how many males and females the football club should choose.

4. A novelist wants to investigate whether the readers of her new book find it fascinating or not. She managed to interview 64 of the 256 readers.
a. Work out the percentage of readers she interviewed.
b. The novelist thinks that the percentage could cause bias.
i. Explain why the novelist may be right.
ii. For a better result, suggest another method she may use to collect her data.
Do it Yourself Questions 2

1. A teacher wants to investigate students’ low performance in Mathematics.

She developed a questionnaire which she gave out to 320 students. 121 students returned the questionnaire. Work out the proportion of questionnaires returned and explain how this may cause bias.

2. LHDA is inviting all Basotho to suggest names that should be given to the
Tunnel Boring Machine (TBM). The winning name suggested will result in
unforgettable prizes from the LHDA for the winner.
When choosing the names, participants are advised to consider “A legend
from the Leribe district, the area of the tunnel excavation” as one of the
criteria.
Explain why this particular criterion might give a bias.

3. A sample of students representing Grade 11 were given the chance to write either Physics or Physical Science.
When they were asked, ‘Do you prefer Physical Science?’, 80% of the sampled students said ‘yes’.
a. Why do you think this result might be biased?
b. How would you ask the question to avoid bias?

4. The table below shows the numbers of students in a school’s house sports.

         Soccer   Basketball   Total
Red      48       34           82
Green    50       44           94
Total    98       78           176

The school wants to choose a representative sample of 35 students for a district competition.
a. How many students in the sample should be in Green house?
b. How many students in the sample should play soccer?
c. Use a bar chart to show the data in the table.
REPRESENTING AND INTERPRETING DATA USING DIAGRAMS

Representing Data using Diagrams

To help interpret the data collected during an investigation, the data (categorical, discrete or continuous) may be represented using diagrams, charts and graphs.

The following diagrams, charts and graphs may be used to represent data:

 Venn and Carroll diagrams.

 Tally charts, frequency tables and two-way tables.
 Dual and Compound bar charts.
 Pie Charts
 Line graphs, time series graphs and frequency polygons.
 Scatter graphs
 Stem-and-leaf and back-to-back stem-and-leaf diagrams.
 Infographics

So, in order to interpret data, you need to choose which representation to use for the given data. It is therefore very important to decide which type of representation (diagram, chart or graph) is best for the data.
The table below will help you to decide on the right representation:

Type of Diagram, Chart or Graph    When it is to be used
Venn diagram                       When data are sorted into groups that have some things in common.
Bar chart                          When the data being compared are discrete.
Dual bar chart                     When comparing two sets of discrete data.
Compound bar chart                 When two or more sets of data are to be combined into one bar, in order to show both the individual quantities and the total quantity.
Frequency diagram                  For comparing continuous data.
Line graph                         To see how data change over time.
Scatter graph                      For comparing two sets of data points.
Pie chart                          For comparing the portion of each sector with the whole amount.
Infographic                        For showing information in a quick and easy-to-understand way.
Interpreting Data from Diagram

You have already learnt, and therefore know, how to use most of these in representing and interpreting data (categorical, discrete or continuous).

However, our objective in this section is to interpret data represented using:

 Frequency polygons
 Scatter graphs
 Stem-and-leaf diagrams and
 Back-to-back stem-and-leaf diagrams.

Learning Objective

Here, the learning objectives shall be to:

 Draw and interpret frequency polygons for discrete and continuous

data.
 Draw and interpret scatter graphs.
 Draw and interpret back-to-back and stem-and-leaf diagrams.
This means at the end of this unit, you should be able to:

 Interpret data from the above diagrams.

FREQUENCY POLYGON

A frequency polygon is a type of line graph obtained when the class frequency is plotted against the class midpoint and the points are joined by straight line segments.

So, in order to draw a frequency polygon, the following are the steps:

 Calculate the midpoint of each class interval (by finding the mean or average of the two ends of the class interval).
 Prepare a table of class frequency and midpoint for each class.
 Plot a graph of class frequency (on the 𝑦-axis) against the midpoint of the class (on the 𝑥-axis).
 Then join the plotted points with straight line segments (polygon).

Note:
The following are expected of you:
 Label the axes with the quantities you have plotted on the axes.
 Give the graph a title.
Below is an example:

Class Interval      Midpoint of              Frequency
Length, 𝑙 (mm)      Class Interval
0 < 𝑙 ≤ 20          (0 + 20)/2 = 10          6
20 < 𝑙 ≤ 40         (20 + 40)/2 = 30         9
40 < 𝑙 ≤ 60         (40 + 60)/2 = 50         19
60 < 𝑙 ≤ 80         (60 + 80)/2 = 70         16
80 < 𝑙 ≤ 100        (80 + 100)/2 = 90        7
100 < 𝑙 ≤ 120       (100 + 120)/2 = 110      3
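
The midpoint calculation in the table can be sketched in Python as follows. This is a minimal illustration using the class intervals and frequencies from the example; the variable names are chosen here for illustration only. The resulting (midpoint, frequency) pairs are the points that would be plotted and joined to form the frequency polygon.

```python
# Class intervals (lower, upper) in mm and their frequencies, from the example table.
intervals = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100), (100, 120)]
frequencies = [6, 9, 19, 16, 7, 3]

# The midpoint of each class is the average of its lower and upper ends.
midpoints = [(lower + upper) / 2 for lower, upper in intervals]

# Each (midpoint, frequency) pair is one point of the frequency polygon.
points = list(zip(midpoints, frequencies))

for midpoint, frequency in points:
    print(f"midpoint {midpoint:g} mm -> frequency {frequency}")
```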

The line graph is as shown below:

Frequency-Height Line Graph

Activity

Use the table above and plot the frequency polygon if the frequencies are 2, 8,
9, 7 and 1 respectively.

SCATTER GRAPHS

A scatter graph is a statistical diagram that compares two sets of data. It gives a visual representation of the relationship or correlation between these two sets of data.

In scatter graphs, dots or crosses are used to represent values for these two sets of data, and the position of a dot or cross relative to the horizontal and vertical axes indicates the values for an individual data point.

The two sets of data could have:

Positive Correlation: This is the case in which the data show an uphill pattern as you move from left to right. As one value increases, the other also increases. This indicates a positive relationship between the two sets of data.

Negative Correlation: In this case, the data show a downhill pattern as you move from left to right. This means that as one value increases, the other decreases. This indicates a negative relationship between the two sets of data.

No Correlation: This is the case when the data are random and do not have any kind of pattern. This means there is no relationship between the two sets of data.

Line of Best Fit on Scatter Graphs

Where two sets of data have a positive or negative correlation, a line of best fit may be drawn on the scatter graph.

 The line of best fit shows the relationship that exists between the two sets of data.
 It helps to get an estimated value of a variable. It may therefore be referred to as an estimated line of best fit.

It can also be used to judge the strength of the correlation:

 Strong correlation: most of the data points are close to the line of best fit.
 Weak correlation: most of the data points are not close to the line of best fit.

Interpolation

The line of best fit may also be used to:

 Estimate the value of one variable when the value of the other variable is given. When the given value lies within the range of the plotted data, this is called interpolation.

This is done as follows:

 Identify the given value on its axis (vertical or horizontal). Let’s say it is on the horizontal axis.
 Draw a broken line from this value (on the horizontal axis) up to the line of best fit.
 Then, from the line of best fit, continue the line across to the vertical axis and read off the value of the variable you are looking for.

We shall see an example of interpolation as we continue.

Note:
When drawing a line of best fit:
 Make sure it is a straight line in the direction of the correlation,
 With points distributed on each side of the line as equally as possible along the line.
 The line may pass directly through a number of points.

To plot a scatter graph, the following are the steps:

 Identify the independent and dependent variables.
 Place the independent variable on the horizontal axis (𝑥-axis) and the dependent variable on the vertical axis (𝑦-axis).
 Label the axes with the variable names and units (where needed).
 Then plot the points as on any other graph.
 Give the graph a title.

Example:

The table below shows the height and weight of 10 students.

Student 1 2 3 4 5 6 7 8 9 10
Height (cm) 120 145 130 155 160 135 150 145 130 140
Weight (Kg) 40 50 47 62 60 55 58 52 50 49

a. By placing height on the vertical axis and weight on the horizontal axis, plot a scatter graph to show the relationship between height and weight.

 Plot each pair of values as a point, using your previous knowledge of graphs, and mark it with a cross or dot.
 Below is the graph.

b. Use the graph to describe the relationship between these two variables.

 The graph shows a positive correlation. This means that as one variable increases, the other increases as well.
 To be more specific, the graph shows that the taller a student is, the heavier the student tends to be.

c. Draw a line of best fit on your graph and describe the strength of the correlation.

 Most of the points are away from the line.
 This means that the correlation or relationship is a weak one.

d. Use the graph to estimate the height of a student with a weight of 56 kg.

 First go to where the weight is 56 kg on the horizontal axis.
 From there, draw a line up to the line of best fit.
 From the line of best fit, draw another line across to the vertical axis representing the height.
 The point at which this line meets the vertical axis is the estimated height when the weight is 56 kg.
 See the graph below.
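
For comparison with a line drawn by eye, the estimate in part d can also be computed. The sketch below uses the least-squares method, which is a different technique from drawing the line by hand and is not the method used in these notes, so its answer will usually differ a little from a reading taken off the graph. The data are the ten students from the table; the function name is chosen here for illustration.

```python
# Weights (kg) on the horizontal axis and heights (cm) on the vertical axis,
# taken from the table of 10 students.
weights = [40, 50, 47, 62, 60, 55, 58, 52, 50, 49]
heights = [120, 145, 130, 155, 160, 135, 150, 145, 130, 140]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(weights, heights)

# Interpolation: 56 kg lies inside the range of the recorded weights (40 to 62 kg).
estimated_height = slope * 56 + intercept
print(f"Estimated height at 56 kg: {estimated_height:.0f} cm")
```

A positive slope agrees with the positive correlation described in part b.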
Do it Yourself Questions

1. Describe and explain the correlation you would expect between each of
the data below.
a. The age of a vehicle and its speedometer.
b. The amount of time fishing and the amount of bait in the bucket.
c. The number of passengers in a bus and the number of traffic lights on the route.

2. The table below shows the height of the waves at Durban beach and the number of surfers at the beach.

Wave height (feet)    5    8    7    3    6
Number of surfers     26   63   58   17   37

a. Plot a scatter graph to show the data.
b. Describe the type and strength of the correlation between the two sets of data. Explain your answer.
c. Draw a line of best fit for the graph.
d. Use the line of best fit to estimate the height of the waves if 15 surfers were at the beach.

3. The scatter graph below shows the numbers of lawns mowed by a gardener during one week.
a. How many days does it take to mow 20 lawns?
b. About how many lawns can be mowed in 1 day?
c. Describe the relationship shown by the data.

4. The scatter graph shows the weights of a baby taken from birth through
some months.

a. What is the weight of the baby at birth?

b. What is the age of the baby when the weight is 15 pounds?
c. Does the data show a positive, a negative or no correlation?
5. Below is a scatter graph showing relationship between numbers of boys
and girls in different classrooms.

a. How many classrooms are there altogether?

b. Zainab and James described the relationship as follows:

Zainab says the scatter graph shows a negative correlation. This means that the more the boys, the fewer the girls and vice versa.

James says this can’t be true, that there is no relationship between the number of boys and of girls.

Discuss with other learners and decide who is correct between Zainab and James.
6. Here is a table showing the numbers of losses a gamer has in playing a video
game for 7 weeks.

Week 1 2 3 4 5 6 7
Losses 15 12 10 7 6 3 1

a. Plot a scatter graph for the data. Place Week on the horizontal axis and
Losses on the vertical axis.
b. Draw a line of best fit. Use your line of best fit to estimate how many losses
the gamer had in 3.5 weeks.
c. Describe the relationship and strength of the correlation.
STEM-AND-LEAF DIAGRAM

Data need to be presented in a way that makes them easy to visualise and quickly understand. Using a stem-and-leaf diagram is one of the many ways this may be done.

A stem-and-leaf diagram is another way of representing data where each number is split into two parts, namely the stem and the leaf, hence the name.

 The stem is the first digit or digits, i.e. every digit before the last digit.
 The leaf is the last digit (it must be one digit only).
 The symbol ‘|’ is used to separate the stem and leaf values.

For instance,

 In the number 173, 17 will be the stem and 3 the leaf.
 In the number 46, 4 will be the stem and 6 the leaf.
 In the number 3.9, 3 will be the stem and 9 the leaf.
 A one-digit number like 7 may be considered as 07; it therefore has a stem of 0 and a leaf of 7.

A stem-and-leaf diagram has the following features:

 The numbers are arranged in line vertically and horizontally.
 The numbers are arranged in order of size, from the smallest to the largest.
 A key shows how to read the diagram.

How to make a stem-and-leaf diagram

 First identify the smallest and largest numbers in the data.
 Identify the stems and the leaves.
 Draw a vertical line and list the stem numbers to the left of the line and each leaf number on the right, next to its corresponding stem.
Below is the table of fruits found in a bag:

Fruit Number
Apple 22
Orange 32
Pear 14
Banana 21
Cherry 4
Avocado 29
Watermelon 4
Pineapple 13
Lemon 29
Plum 20
Guava 24
Coconut 2
Grape fruit 12
Fig 1

Draw a stem-and-leaf diagram for the number of fruit.

Stem | Leaf
0    | 1 2 4 4
1    | 2 3 4
2    | 0 1 2 4 9 9
3    | 2

Key: 0 | 1 means 1 fruit

 The numbers are between 1 and 32.
 Set the numbers in order of size, with the stem on the left and the leaves on the right.

 You may then be asked to use your stem-and-leaf diagram to find other information such as the mean, median, mode and range.
 This will be done extensively in the next topic. However, you already have some knowledge of stem-and-leaf diagrams from your previous grade.
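
The steps for building the diagram can be sketched in Python. This is a minimal illustration for whole numbers only, using the fruit counts from the table; the function name `stem_and_leaf` is made up for this sketch.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole numbers by stem (all digits but the last), leaves in order of size."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 29 -> stem 2, leaf 9; 4 -> stem 0, leaf 4
        rows[stem].append(leaf)
    return dict(rows)


# Number of each fruit in the bag, from the table.
bag_one = [22, 32, 14, 21, 4, 29, 4, 13, 29, 20, 24, 2, 12, 1]

diagram = stem_and_leaf(bag_one)
for stem in sorted(diagram):
    print(stem, "|", " ".join(str(leaf) for leaf in diagram[stem]))
```

Sorting the values first means the leaves on each row are already in order of size, as the diagram requires.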
Suggested Teaching Approach

Design and use practical activities to test learners’ previous knowledge of:
 Stem-and-leaf

BACK-TO-BACK STEM-AND-LEAF DIAGRAM

In this grade, the objective is to draw and interpret back-to-back stem-and-leaf diagrams. This goes a step beyond your previous knowledge of stem-and-leaf diagrams.

A back-to-back stem-and-leaf diagram is a method of comparing two sets of data by attaching two sets of leaves to the same stem.

How to make a back-to-back stem-and-leaf diagram

This is similar to making a stem-and-leaf diagram for a single set of data, as follows:

 First identify the smallest and largest numbers in the two sets of data.
 Identify the stems and the leaves.
 Draw two vertical lines and list the stem numbers between them.
 Then set out, in order of size, the leaves for one set of data to the left of the stems and the leaves for the other set of data to the right.

Let us see the example below:

Earlier on, we drew a stem-and-leaf diagram for the number of fruits in a bag.

Here is another table that shows the number of fruits found in another bag.

Fruit Number
Apple 19
Orange 27
Pear 29
Banana 33
Cherry 24
Avocado 21
Watermelon 5
Pineapple 5
Lemon 12
Plum 10
Guava 9
Coconut 13
Grape fruit 8
Fig 5

We are going to draw a back-to-back stem-and-leaf diagram to show the number of fruits in the two bags, and so compare the two bags.

      bag 1 | Stem | bag 2
    4 4 2 1 |  0   | 5 5 5 8 9
      4 3 2 |  1   | 0 2 3 9
9 9 4 2 1 0 |  2   | 1 4 7 9
          2 |  3   | 3

Key: For bag 1, 1 | 0 means 1 fruit
     For bag 2, 0 | 5 means 5 fruits

 The numbers range from 1 to 33.
 The leaves for bag 1 come out from the stem in order of size to the left.
 The leaves for bag 2 come out from the stem in order of size to the right.
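
A back-to-back diagram can be sketched in Python in much the same way as a single one. The sketch below is an illustration for whole numbers only, using the two fruit tables; the helper name `leaves_by_stem` and the column width are choices made for this sketch.

```python
def leaves_by_stem(values):
    """Return {stem: [leaves in increasing order]} for whole numbers."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows.setdefault(stem, []).append(leaf)
    return rows


bag_one = [22, 32, 14, 21, 4, 29, 4, 13, 29, 20, 24, 2, 12, 1]
bag_two = [19, 27, 29, 33, 24, 21, 5, 5, 12, 10, 9, 13, 8, 5]

left = leaves_by_stem(bag_one)
right = leaves_by_stem(bag_two)

lines = []
for stem in sorted(set(left) | set(right)):
    # Left-hand leaves are reversed so the smallest leaf sits next to the stem.
    left_leaves = " ".join(str(leaf) for leaf in reversed(left.get(stem, [])))
    right_leaves = " ".join(str(leaf) for leaf in right.get(stem, []))
    lines.append(f"{left_leaves:>11} | {stem} | {right_leaves}")

print("\n".join(lines))
```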

You may use the back-to-back stem-and-leaf diagram to answer questions like:

a. What fraction of the fruit in bag one comes from the numbers that are more than 10 but less than 20?

 The data in this category are: 12, 13 and 14.
 These numbers add up to 12 + 13 + 14 = 39.
 The total number of fruits in bag one, adding the totals for each stem, is:
11 + 39 + 145 + 32 = 227
 Then divide the sum of the numbers more than 10 but less than 20 by the total number of fruits in bag one:

Therefore, the fraction of fruits in bag one that is more than 10 but less than 20
= 39/227
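
The fraction can also be checked directly from the table for bag one. The short Python sketch below is only a check of the arithmetic: it picks out the numbers strictly between 10 and 20, and uses the standard `fractions` module so that the answer is given in its simplest form.

```python
from fractions import Fraction

# Number of each fruit in bag one, from the first fruit table.
bag_one = [22, 32, 14, 21, 4, 29, 4, 13, 29, 20, 24, 2, 12, 1]

# Numbers that are more than 10 but less than 20.
selected = [n for n in bag_one if 10 < n < 20]

# Fraction automatically reduces to simplest form.
share = Fraction(sum(selected), sum(bag_one))

print(sorted(selected), sum(selected), sum(bag_one), share)
```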

Activity
b. Which fruits in bag two have equal number?
c. Which bag has the highest number of fruits?
CALCULATING MEAN, MEDIAN, MODE AND RANGE FOR GROUPED DATA

Learning Objective

Here, the learning objective is to use:

 Mean, Median, Mode and Range to compare two Grouped Data.

This means at the end of this unit, you should be able to:

 Identify statistical trends and relationships between two sets of data.

You already have some basic knowledge of the statistical mean, median, mode and range, especially how to work them out for individual data and for data represented in a frequency table.

In this objective, we shall look beyond individual data to calculating the mean, median, mode and range for grouped data.

Note:
 The groups in grouped data are also referred to as “class intervals”.

Calculating the mean, median, mode and range for grouped data requires a different approach.

However, for a better understanding of this objective, we need to remind ourselves of the following:

 MEAN: the sum of all the values divided by the number of values. It is otherwise referred to as the average.
 MEDIAN: the middle value when the values are arranged in order of increasing size.
 MODE: the most frequent value, i.e. the value that appears most often in a set of data.
 RANGE: the largest value minus the smallest value.
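
As a reminder of how these four measures work for individual data, here is a small Python sketch. It reuses the fruit counts for bag one from the stem-and-leaf section purely as sample data, and relies on Python's standard `statistics` module (`multimode` needs Python 3.8 or later).

```python
import statistics

# Individual (ungrouped) data: the fruit counts for bag one.
counts = [22, 32, 14, 21, 4, 29, 4, 13, 29, 20, 24, 2, 12, 1]

mean_value = statistics.mean(counts)      # sum of the values / number of values
median_value = statistics.median(counts)  # middle of the ordered values
modes = statistics.multimode(counts)      # every value that appears most often
data_range = max(counts) - min(counts)    # largest value minus smallest value

print(f"mean = {mean_value:.2f}, median = {median_value}, "
      f"mode(s) = {modes}, range = {data_range}")
```

With an even number of values, the median is the average of the two middle values. A data set can also have more than one mode, which is why `multimode` is used here.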
Suggested Teaching Approach
Use activities to test learners’ previous knowledge of mean, median, mode and range from:
 Individual data and
 Frequency table

Grouped Data

Grouped data are data that have been grouped or classified into specific categories or ranges.

Grouping is used to make it easier to analyse and interpret large amounts of data.

The frequency table below shows the weights of certain people. It is an example of grouped data.

Weight (Kg) Frequency

20 < 𝑤 ≤ 30 2
30 < 𝑤 ≤ 40 13
40 < 𝑤 ≤ 50 7
50 < 𝑤 ≤ 60 6

 The grouped data’s frequency table is different from the individual

data’s frequency table in the sense that the values representing the
quantity being measured are grouped, hence the name, group data
(see the coloured column). Otherwise known as the “Class interval”.

The table shows that:

 2 people out of the total number of people have a weight greater than
20 kg and up to (and including) 30 kg. They are therefore in the group or
class interval 20 < 𝑤 ≤ 30.
 13 people have a weight greater than 30 kg and up to (and including)
40 kg, so they are in the group or class interval 30 < 𝑤 ≤ 40, and so on.
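As a rough illustration of how a single weight is placed in a class interval of the form lower < 𝑤 ≤ upper, the sketch below checks each interval in turn. The function name `find_class` and the sample weights are assumptions made for this example only.

```python
# Class intervals from the weight table, stored as (lower, upper) pairs.
classes = [(20, 30), (30, 40), (40, 50), (50, 60)]

def find_class(weight, class_intervals):
    """Return the interval with lower < weight <= upper, or None if none fits."""
    for lower, upper in class_intervals:
        if lower < weight <= upper:
            return (lower, upper)
    return None

print(find_class(30, classes))    # a weight of exactly 30 kg belongs to 20 < w <= 30
print(find_class(30.5, classes))  # a weight just above 30 kg belongs to 30 < w <= 40
```

Note that a weight of exactly 20 kg does not belong to any class in this table, because the first interval excludes its lower end.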
Calculating Mean, Median, Mode and Range of Grouped Data

Calculating Mean of Grouped Data

Steps for finding the Mean of a Grouped Data:

1. First work out the midpoint of each group or class interval: The
midpoint is the average of the two ends of the class interval and is
worked out as shown:

 20 < 𝑤 ≤ 30 is the first group or class interval in the
frequency table.
 Its midpoint will be the average of 20 and 30 (the ends of the
class interval): i.e.

$\frac{20 + 30}{2} = \frac{50}{2} = 25$ (midpoint)

 So, 25 is the midpoint for class interval 20 < 𝑤 ≤ 30

Activity
 Repeat these steps to find the midpoint of each class interval.

2. Then multiply each midpoint by the corresponding frequency i.e.

25 × 2 = 50

 Repeat this step for each class interval

Activity
 Repeat these steps for each class interval.
3. Add all the results obtained from multiplying midpoint by frequency
i.e.

50 + …

Activity
 Add the results obtained from multiplying midpoint and frequency.
 Also, add all the frequencies together.

4. Lastly, find the mean by dividing the total obtained by the sum of the
frequencies: i.e.

Mean equals the total of all the products of midpoint and frequency
divided by the total of all the frequencies i.e.

$\text{Mean} = \dfrac{\text{total of midpoint} \times \text{frequency}}{\text{sum of frequency}}$

Activity
 Using the formula, find the mean.
 Approximate your answer to the nearest whole number.

Note:
 These steps can simply be shown on a table by adding two columns
representing “midpoint” and “midpoint × frequency” to our frequency
table:
See the table below. The columns added are coloured.

Weight (Kg)      Midpoint    Frequency    Midpoint × Frequency

20 < 𝑤 ≤ 30      25          2            50
30 < 𝑤 ≤ 40      35          13           455
40 < 𝑤 ≤ 50      45          7            315
50 < 𝑤 ≤ 60      55          6            330
Total                        28           1150

$\text{Mean} = \dfrac{\text{total of midpoint} \times \text{frequency}}{\text{sum of frequency}} = \dfrac{1150}{28} = 41.07\ \text{kg}$

Therefore, the estimate of the mean is 41 kg (to the nearest whole number).
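The four steps for the mean can also be sketched in Python. This is only an illustration using the weight table from this unit; the variable names are assumptions made for the example.

```python
# Each row of the weight table: (lower end, upper end, frequency).
table = [(20, 30, 2), (30, 40, 13), (40, 50, 7), (50, 60, 6)]

# Step 1: midpoint of each class interval.
midpoints = [(lower + upper) / 2 for lower, upper, _ in table]

# Step 2: midpoint x frequency for each class interval.
products = [mid * freq for mid, (_, _, freq) in zip(midpoints, table)]

# Step 3: add up the products, and add up the frequencies.
total_products = sum(products)
total_frequency = sum(freq for _, _, freq in table)

# Step 4: estimated mean = total of products / total of frequencies.
estimated_mean = total_products / total_frequency

print(midpoints)
print(total_products, total_frequency)
print(round(estimated_mean, 2), "kg")
```

The result is an estimate because every weight in a class is treated as if it were equal to the midpoint of that class.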

Calculating Median of a Grouped Data

The steps for finding the median of class interval or grouped data are as
follows:

 First find the sum of all the frequencies.

 Divide it by 2 (since the median is the middle value).
 The median will be in the class interval or group that contains
this position when the frequencies are added up row by row.
Let us find the median for our grouped data:

 The sum of all the frequencies = 28

 Then $\frac{28}{2} = 14$
 From the frequency column, the 14th value falls in row 2 (because the
2 in row 1 and the 13 in row 2 add up to 15, which is the first running
total to reach 14).
 And the class interval or group corresponding to row 2 is
30 < 𝑤 ≤ 40.
 Therefore, the median is in the class interval 30 < 𝑤 ≤ 40.
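One possible way to locate the median class is to keep a running total of the frequencies and stop at the first class whose running total reaches half of the total frequency. The sketch below follows that idea with the weight table; the variable names are assumptions made for the example.

```python
from itertools import accumulate

# Each row of the weight table: ((lower end, upper end), frequency).
table = [((20, 30), 2), ((30, 40), 13), ((40, 50), 7), ((50, 60), 6)]

frequencies = [freq for _, freq in table]

# Running totals of the frequencies, row by row.
running_totals = list(accumulate(frequencies))

# Position of the middle value: sum of the frequencies divided by 2.
position = sum(frequencies) / 2

# The median class is the first class whose running total reaches the position.
median_class = next(
    interval
    for (interval, _), total in zip(table, running_totals)
    if total >= position
)

print(running_totals)
print(position, median_class)
```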

Calculating Mode of a Grouped Data

The mode is regarded as the simplest of the statistical measures (mean,
median, mode and range).

For grouped data, it is the class interval or group with the highest
frequency.

Steps for finding the mode:

 From the table, identify the highest frequency.

Let us use our table as an example:

In our frequency table,

 The greatest frequency is 13.

 And the class interval or group corresponding to this frequency
is 30 < 𝑤 ≤ 40.
 Therefore, the mode or modal class is 30 < 𝑤 ≤ 40.
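Finding the modal class amounts to picking the row with the greatest frequency. A minimal sketch with the weight table is shown here; the variable names are assumptions made for the example.

```python
# Each row of the weight table: ((lower end, upper end), frequency).
table = [((20, 30), 2), ((30, 40), 13), ((40, 50), 7), ((50, 60), 6)]

# Pick the row whose frequency (second item) is the greatest.
modal_class, highest_frequency = max(table, key=lambda row: row[1])

print(modal_class, highest_frequency)
```

If two classes shared the highest frequency, `max` would return only the first of them, so such a table would need to be checked by eye.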
Calculating Range of a Grouped Data

Range, as defined, is the difference between the largest and the smallest
data i.e. largest data minus the smallest data.

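So, in our frequency table,

 The smallest possible value is 20 (the lower end of the first class
interval)

 And the largest possible value is 60 (the upper end of the last class
interval)
 So, the range = largest value – smallest value

i.e. $\text{range} = 60 - 20 = 40\ \text{kg}$

 The range therefore is estimated to be 40 kg.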
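The same estimate of the range can be sketched in Python by taking the lower end of the first class interval and the upper end of the last one. The variable names are assumptions made for the example.

```python
# Class intervals from the weight table, in increasing order.
classes = [(20, 30), (30, 40), (40, 50), (50, 60)]

smallest_possible = classes[0][0]   # lower end of the first class interval
largest_possible = classes[-1][1]   # upper end of the last class interval

estimated_range = largest_possible - smallest_possible

print(estimated_range, "kg")
```

This is only an estimate because the table does not record the actual lightest and heaviest weights, only the classes they fall in.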

Activity
1. From our example,
a. What do you observe about the mean and the range? Explain why both
answers are estimates.
b. What can you observe about the median and the mode? Explain your
observation.

2. The table shows the record of work submitted by two departments of a

school.

Monday Tuesday Wednesday Thursday Friday

Department 1 20 21 22 20 21
Department 2 30 15 12 36 28

a. Draw a back-to-back stem-and-leaf diagram for these departments.

b. Find the mean and range for each department.
c. Which department is more consistent?
d. Compare and comment on the records of work submitted by these two
departments.
e. The school thinks the record of work by Department 2 is better. Do you agree
or disagree? Justify your answer with an explanation.
Do it Yourself Questions

1. A school records the heights, in cm, of 51 of its students.

Class interval 100 ≤ ℎ < 110 110 ≤ ℎ < 120 120 ≤ ℎ < 130 130 ≤ ℎ < 140
Frequency 6 16 21 8

a. Estimate the mean and state the class interval in which the median height falls.
b. Find the modal class.
c. Work out an estimate for the range.

2. The number shows the number of hours a sample of people spent viewing television
one during summer.

a. Complete the frequency table for this sample.

Viewing time/hours Number of people

0 ≤ ℎ < 10
10 ≤ ℎ < 20 27
20 ≤ ℎ < 30 33
30 ≤ ℎ < 40
40 ≤ ℎ < 50
50 ≤ ℎ < 60
b. Calculate an estimate of the mean viewing time for this sample.
c. Work out an estimate for the range.
d. State one difference you would expect to see in the data if the survey were
carried out during the winter.

3. A farmer buys 2 packets of seeds from two different companies. Each packet contains
20 seeds. The farmer records the number of plants that grow from each packet.

Company A 20 5 20 20 20 6 20 20 20 8
Company B 17 18 15 16 18 18 17 15 17 18

a. Draw a scatter diagram of the two companies.
b. Draw a back-to-back stem-and-leaf diagram of the two companies.
c. Find the mean, median and mode for each company’s seeds.
d. Which company does the mode suggest is better?
e. Which company does the mean suggest is better?
f. Find the range of each company’s seeds.

4. The list below shows the maximum daily temperature, in °F, in a certain month of the
year.

55.3 49.4 63.9 55.7 56.3 54.0 52.2 58.7 58.9 52.0
45.8 55.3 42.6 62.5 63.4 61.0 58.5 48.9 62.3 68.4
56.4 67.0 43.3 58.1 53.6 52.1 46.9 51.3 56.7 63.4

a. Complete the grouped frequency table below.

Temperature, T Tallies Frequency

40 < 𝑇 ≤ 44
44 < 𝑇 ≤ 48
48 < 𝑇 ≤ 52
52 < 𝑇 ≤ 56
56 < 𝑇 ≤ 60
60 < 𝑇 ≤ 64
64 < 𝑇 ≤ 68
68 < 𝑇 ≤ 72
b. Represent the data using:
i. Bar chart ii. Pie chart iii. Frequency polygon iv. Scatter diagram
v. Stem-and-leaf diagram
c. Calculate an estimate of the mean temperature.
