Republic of the Philippines
Department of Education
MIMAROPA Region
SCHOOLS DIVISION OF ORIENTAL MINDORO
SAN MARIANO NATIONAL HIGH SCHOOL
San Mariano, Roxas, Oriental Mindoro
Statistics and Probability
Quarter 4 – Week 7
Self-Learning Module 7
PEARSON’S SAMPLE
CORRELATION COEFFICIENT
What I Need to Know
A scatter plot alone is not precise enough to describe the strength and
direction of the relationship between two variables. A more analytical way to
describe that relationship is to compute the correlation coefficient. In this
module, we shall learn how to calculate Pearson’s sample correlation
coefficient.
After going through this module, you are expected to:
1. Calculate the Pearson’s sample correlation coefficient.
(M11/12SP-IVh-2)
2. Solve problems involving correlation analysis. (M11/12SP-IVh-3)
What I Know
Directions: Fill in the blanks to complete the statements.
1. ___________ is a statistical method used to determine whether a
relationship between two variables exists.
2. There is a _____________ correlation between two variables if the points
are very close to a straight line with a negative slope.
3. The ____________ is the line closest to the points. The direction of the
line tells the direction of the correlation that exists between the variables.
4. An r equal to 1 or -1 implies a ____________ relationship between
two variables.
5. An r = 0 implies a ____________ between two variables.
Lesson
9
Pearson’s Sample Correlation Coefficient
This lesson introduces correlation analysis, the direction and strength of
correlation, and Pearson r, and applies the Pearson Product-Moment
Correlation to real-life situations.
What’s In
Bivariate data deal with two variables that are compared in order to find
or establish their relationship.
A scatter plot is the most common display of quantitative bivariate data. It
shows patterns, trends, relationships, and possible extraordinary values
between the variables.
Steps in Constructing a Scatter Plot
1. Draw a graph and label the x- and y- axes.
2. Assign each quantitative variable to an axis.
3. Choose a range for each axis that includes the maximum and the
minimum values in the data set.
4. Plot each point on the graph.
What is It
Correlation analysis is a statistical method used to determine whether a
relationship between two variables exists.
Direction of Correlation
• Positive Correlation exists when high values of one variable correspond
to high values in the other variable or low values in one variable
correspond to low values in the other variable.
• Negative Correlation exists when high values of one variable correspond
to low values in the other variable or low values in one variable
correspond to high values in the other variable.
• Zero Correlation exists when high values in one variable correspond to
either high or low values in the other variable.
Strength of Correlation
• Perfect
• Very high
• Moderately high
• Moderately low
• Very low
• Zero
The trend line is the line closest to the points. The direction of the line
tells the direction of the correlation that exists between the variables. If the
trend line rises from left to right, its slope is positive, so there is a positive
correlation between the two variables. If it falls from left to right, its slope is
negative, so there is a negative correlation between the two variables.
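As a rough illustration of reading the direction of correlation from a trend line, the sketch below computes the slope of a least-squares line. Treating the "line closest to the points" as the least-squares line is an assumption made here, since the module does not say how the trend line is fitted. The function names `trend_slope` and `direction` are illustrative, and the data are the age and hours-of-sleep values from Example 2 of this module.

```python
def trend_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Least-squares slope, written with the same sums used for Pearson r.
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)


def direction(slope):
    """Describe the direction of correlation from the sign of the slope."""
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "zero"


ages = [10, 16, 22, 30, 34, 40]   # X values from Example 2
hours = [8, 7, 8, 7, 6, 5]        # Y values from Example 2
slope = trend_slope(ages, hours)
print(round(slope, 4), direction(slope))
```

The slope only indicates direction. How strong the relationship is must still be judged from r.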
Pearson Product-Moment Correlation Coefficient
The Pearson Product-Moment Correlation Coefficient, also called the sample
correlation coefficient r, is a widely used statistical measure of the strength
and direction of a linear relationship between two variables. It is given by
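\[
r = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
\]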
where
r = sample correlation coefficient
n = sample size
X = values of the first variable
Y = values of the second variable
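The formula can be sketched as a small Python function. This is a minimal illustration, not an official implementation: the name `pearson_r` and the error checks are assumptions added here, and the test data are hypothetical values chosen to lie exactly on a line.

```python
import math


def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient from the raw-sum formula."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when all X values (or all Y values) are identical.
        raise ValueError("r is undefined when X or Y has no variation")
    return numerator / denominator


# Hypothetical data lying exactly on the line Y = 2X.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Because the points lie exactly on a rising line, the result is a perfect positive correlation.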
We will use the given table to determine the strength of the computed r.
Pearson r              Qualitative Description
±1                     Perfect
±0.75 to < ±1          Very high
±0.50 to < ±0.75       Moderately high
±0.25 to < ±0.50       Moderately low
> 0 to < ±0.25         Very low
0                      No correlation
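The table above can be turned into a lookup, sketched below. The function name `describe_strength` is illustrative. It uses the absolute value of r, since the table applies the same description to positive and negative values.

```python
def describe_strength(r):
    """Map a Pearson r value to the qualitative description in the table."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength depends only on the magnitude of r
    if size == 1:
        return "Perfect"
    if size >= 0.75:
        return "Very high"
    if size >= 0.50:
        return "Moderately high"
    if size >= 0.25:
        return "Moderately low"
    if size > 0:
        return "Very low"
    return "No correlation"


for value in (1, -0.85, 0.5, 0.32, -0.1, 0):
    print(value, describe_strength(value))
```

One caveat: an r computed with floating-point arithmetic may come out slightly below 1 for perfectly linear data, so it may be worth rounding r before looking it up.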
Steps in Solving the Pearson’s r Correlation Coefficient
1. Arrange the given bivariate data in tabular form with the values of the
first variable (X) in the first column and the second variable (Y) in the
second column.
2. Calculate the sum of all the values of X and Y.
3. Square each value of the first variable X and then find the summation of
the squares.
4. Do the same with the second variable Y.
5. Multiply the corresponding values of X and Y and solve the summation
of the products.
6. Substitute the summation values in the formula, solve and interpret the
result.
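The six steps can be followed one at a time in a short script. This is only a sketch: the variable names are illustrative, and the values are the data of Example 1 in this module.

```python
import math

# Step 1: arrange the bivariate data in two columns (data of Example 1).
xs = [3, 5, 6, 8, 10]
ys = [16, 14, 10, 12, 20]
n = len(xs)

# Step 2: sums of X and Y.
sum_x = sum(xs)
sum_y = sum(ys)

# Steps 3 and 4: square each value, then add the squares.
x_squared = [x ** 2 for x in xs]
y_squared = [y ** 2 for y in ys]
sum_x2 = sum(x_squared)
sum_y2 = sum(y_squared)

# Step 5: multiply corresponding values, then add the products.
products = [x * y for x, y in zip(xs, ys)]
sum_xy = sum(products)

# Print the working table, one row per data pair, then the sums.
print(f"{'X':>4}{'Y':>4}{'X^2':>6}{'Y^2':>6}{'XY':>6}")
for row in zip(xs, ys, x_squared, y_squared, products):
    print("{:>4}{:>4}{:>6}{:>6}{:>6}".format(*row))
print("sums:", sum_x, sum_y, sum_x2, sum_y2, sum_xy)

# Step 6: substitute the sums into the formula.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print("r =", round(r, 2))
```

Keeping the squared values and products in their own lists mirrors the columns of the hand-computed table, which makes it easy to compare each row against manual work.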
Example 1:
Determine the value of Pearson r for the following data and interpret the
results.
X 3 5 6 8 10
Y 16 14 10 12 20
Solution:
a) Construct the table shown below
X      Y      X²     Y²     XY
3      16
5      14
6      10
8      12
10     20
b) Complete the table above by:
• Square all entries in the X column and put them under the X² column.
• Square all entries in the Y column and put them under the Y² column.
• Multiply the entries in the X and Y columns and put them in the XY column.
• Get the summation of all entries in the X, Y, X², Y² and XY columns.
X      Y      X²     Y²     XY
3      16     9      256    48
5      14     25     196    70
6      10     36     100    60
8      12     64     144    96
10     20     100    400    200
∑X = 32   ∑Y = 72   ∑X² = 234   ∑Y² = 1096   ∑XY = 474
c) Use the Pearson Product Moment Correlation Formula to solve for r
and interpret.
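Solving for r
n = 5, ∑X = 32, ∑Y = 72, ∑X² = 234, ∑Y² = 1096, ∑XY = 474
\[
\begin{aligned}
r &= \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}} \\
  &= \frac{5(474) - (32)(72)}{\sqrt{\left[5(234) - (32)^{2}\right]\left[5(1096) - (72)^{2}\right]}} \\
  &= \frac{2370 - 2304}{\sqrt{\left[1170 - 1024\right]\left[5480 - 5184\right]}} \\
  &= \frac{66}{\sqrt{(146)(296)}} \\
  &= \frac{66}{\sqrt{43216}} \\
  &\approx \frac{66}{207.88} \\
  &\approx 0.32
\end{aligned}
\]
r ≈ 0.32; moderately low but positive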
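As a cross-check on Example 1, the sketch below recomputes r with the mean-deviation form of the Pearson formula, r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²], which is algebraically equivalent to the raw-sum formula used in this module. This form is not part of the module, and the function name `pearson_r_deviation` is illustrative.

```python
import math


def pearson_r_deviation(xs, ys):
    """Pearson r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of X from its mean
    dy = [y - mean_y for y in ys]  # deviations of Y from its mean
    s_xy = sum(a * b for a, b in zip(dx, dy))
    s_xx = sum(a * a for a in dx)
    s_yy = sum(b * b for b in dy)
    return s_xy / math.sqrt(s_xx * s_yy)


x = [3, 5, 6, 8, 10]
y = [16, 14, 10, 12, 20]
r = pearson_r_deviation(x, y)
print(round(r, 2))  # agrees with the 0.32 obtained in Example 1
```

Getting the same value from two forms of the formula is a useful way to catch arithmetic slips in the hand-computed table.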
Example 2:
Andrew wants to know whether age correlates with the average number of hours
of sleep, so he selects a random sample of size 6 and gathers the needed data.
Can Andrew conclude that there is a strong relationship between a person’s age
and the number of hours he or she sleeps? The gathered data are given below:
Age (X) 10 16 22 30 34 40
Hours of Sleep (Y) 8 7 8 7 6 5
X      Y      X²      Y²     XY
10     8      100     64     80
16     7      256     49     112
22     8      484     64     176
30     7      900     49     210
34     6      1156    36     204
40     5      1600    25     200
∑X = 152   ∑Y = 41   ∑X² = 4496   ∑Y² = 287   ∑XY = 982
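Solving for r
n = 6, ∑X = 152, ∑Y = 41, ∑X² = 4496, ∑Y² = 287, ∑XY = 982
\[
\begin{aligned}
r &= \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}} \\
  &= \frac{6(982) - (152)(41)}{\sqrt{\left[6(4496) - (152)^{2}\right]\left[6(287) - (41)^{2}\right]}} \\
  &= \frac{5892 - 6232}{\sqrt{\left[26976 - 23104\right]\left[1722 - 1681\right]}} \\
  &= \frac{-340}{\sqrt{(3872)(41)}} \\
  &= \frac{-340}{\sqrt{158752}} \\
  &\approx \frac{-340}{398.44} \\
  &\approx -0.85
\end{aligned}
\]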
The computed r value is -0.85. Hence, the relationship between a
person’s age and the number of hours he or she sleeps is very high but
negative: in this sample, older people tend to sleep fewer hours.
What’s More
Directions: Solve the problem below and interpret the result. Show your
complete solution.
The following are the heights in centimeters and weights in kilograms of 5
teachers in a certain school. Determine the relationship between the height
(cm) and weight (kg) of the 5 teachers.
TEACHER A B C D E
Height(cm) X 163 160 168 159 170
Weight(kg) Y 52 50 64 51 69
What I Have Learned
Pearson Product-Moment Correlation Coefficient
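\[
r = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
\]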
Steps in Solving the Pearson’s r Correlation Coefficient
1. Arrange the given bivariate data in tabular form with the values of the
first variable (X) in the first column and the second variable (Y) in the
second column.
2. Calculate the sum of all the values of X and Y.
3. Square each value of the first variable X and then find the summation of
the squares.
4. Do the same with the second variable Y.
5. Multiply the corresponding values of X and Y and solve the summation
of the products.
6. Substitute the summation values in the formula, solve and interpret the
result.