Measures of Correlation Module

This document provides an overview of measures of correlation, including Pearson's r and Spearman's rho correlation coefficients. It aims to give information on computing the strength and direction of relationships between two variables. Key topics covered include defining simple, multiple, and partial correlation, discussing linear and non-linear correlation, and interpreting the values of Pearson's r and Spearman's rho correlations.

Uploaded by

Meynard Magsino

Module 6

Measures of Correlation
LEARNING OUTCOMES

After successfully completing this module, the student should be able to:
1. Identify different measures of correlation.
2. Differentiate Pearson’s r from Spearman’s rho.
3. Understand and apply linear regression.
4. Define commonly used terms in regression.

PRE-TEST

Directions: Read the following statements and choose the correct answer from the box below.
____________1. The degree of relationship between the variables under consideration is
measured through correlation analysis.
____________2. It recognizes more than two variables but considers only two
variables at a time, keeping the others constant.
____________3. It is a graph of plotted points in which each point represents the
values of X and Y as coordinates.
____________4. The ratio of change between the two variables is constant, in either the same
or the opposite direction, and the graph of one variable against the other
is a straight line.
____________5. It is also called Pearson’s r.
____________6. It is a statistical method used in finance, investing, and other disciplines that
attempts to determine the strength and character of the relationship between one dependent
variable (usually denoted by Y) and a series of other variables (known as independent
variables).
____________7. The graphical representation of the two variables is a curved line. Such
a relationship between the two variables is termed _______________.
____________8. It means no relationship between the two variables X and Y.
____________9. It determines the strength and direction of the monotonic
relationship between two variables rather than the strength and
direction of the linear relationship between them.
____________10. It quantifies the relationship between one or more predictor variable(s) and
one outcome variable.

Partial Correlation    Linear Correlation                   Spearman Correlation Coefficient
Scatter Diagram        Pearson’s Correlation Coefficient    Linear Regression
Correlation            Regression                           Curvilinear Correlation
Zero Correlation

1 | Page Module in Statistics and Evaluation in Education


INTRODUCTION

Measures of correlation is a broad topic that includes Pearson’s r, Spearman’s rho, and
regression. This module presents the concepts and formulas needed to compute the strength
and direction of the relationship between two variables. It will help learners arrange raw
data systematically so that it is easier to analyze and study. The lessons also give practice
in identifying the terms commonly used in regression and in differentiating Pearson’s r from
Spearman’s rho. By the end, learners should know the measures of correlation, how they
differ, and how regression is used.

Activity. Arrange the jumbled letters in a flow chart below.

Search for the hidden words:

Correlation    Linear    Spearman    Coefficient    Value
Pearson’s      Model     Simple      Regression     Product

T R E G R E S S I O N
N J D A F Q J D S O A
E M L S D R N F I J M
I O K ‘S H H V T M U R
C D W N J I A O P T A
I E H O O L L R L C E
F L F S E I U O E U P
F T S R P N E Y K D S
E Q R A A E M E I O K
O O B E S A N D L R J
C Q V P F R Y Z O P Q



Lesson 1

Measures of Correlation

1.1 Meaning of Correlation

The degree of relationship between the variables under consideration is measured through
correlation analysis. The measure of correlation is called the correlation coefficient.

Correlation is a statistical tool that helps to measure and analyze the degree of relationship
between two variables. Correlation analysis deals with the association between two or more
variables.

1.2 Discussion

Types of Correlation

Correlation may be classified as simple, multiple, partial, or total.

• Simple correlation: only two variables are studied.

• Multiple correlation: three or more variables are studied.
Ex. Qd = f(P, PC, PS, t, y)

• Partial correlation: the analysis recognizes more than two variables but considers only
two variables at a time, keeping the others constant.

• Total correlation: based on all the relevant variables, which is normally not
feasible.



Types of Correlation

• Linear correlation: Correlation is said to be linear when the amount of change in
one variable tends to bear a constant ratio to the amount of change in the other. The
graph of variables having a linear relationship forms a straight line.

Example:
X = 1, 2, 3, 4, 5, 6, 7, 8
Y = 5, 7, 9, 11, 13, 15, 17, 19
Y = 3 + 2X

• Non-linear correlation: The correlation is non-linear if the amount of change
in one variable does not bear a constant ratio to the amount of change in the other
variable.
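The "constant ratio" condition can be checked directly on the example data. The following is a minimal Python sketch, not part of the original module; the helper name `change_ratios` is introduced here only for illustration.

```python
# Data from the linear-correlation example above.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 7, 9, 11, 13, 15, 17, 19]


def change_ratios(xs, ys):
    """Return (change in y) / (change in x) for each consecutive pair of points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]


ratios = change_ratios(x, y)

# The relationship is linear when every ratio of change is the same.
is_linear = all(r == ratios[0] for r in ratios)

# Every point should also satisfy the stated equation Y = 3 + 2X.
fits_line = all(yi == 3 + 2 * xi for xi, yi in zip(x, y))

print(ratios)
print(is_linear, fits_line)
```

Each step of 1 in X changes Y by 2, so every ratio equals the slope in Y = 3 + 2X.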

Types of Correlation

A. Positive, Negative or Zero Correlation:

• When an increase in one variable (X) is accompanied by a corresponding increase
in the other variable (Y), the correlation is said to be positive. Positive
correlations range from 0 to +1; the upper limit, +1, is the perfect
positive coefficient of correlation.
• If, on the other hand, an increase in one variable (X) is accompanied by a corresponding
decrease in the other variable (Y), the correlation is said to be negative.
Negative correlations range from 0 to –1; the lower limit, –1, is the perfect
negative coefficient of correlation.
• Zero correlation means no relationship between the two variables X and Y; i.e.
a change in one variable (X) is not associated with a change in the other
variable (Y). Examples are body weight and intelligence, or shoe size and monthly
salary. Zero correlation is the mid-point of the range –1 to +1.

B. Linear or Curvilinear Correlation:

• Linear correlation: the ratio of change between the two variables is constant, in either
the same or the opposite direction, and the graph of one variable against the
other is a straight line.
• Curvilinear correlation: the ratio of change is not constant, so the graph of the two
variables is a curved line.

Methods of Computing the Coefficient of Correlation

In the case of ungrouped data of a bivariate distribution, the following three methods
are used to compute the value of the coefficient of correlation:
1. Scatter diagram method.
2. Pearson’s product-moment coefficient of correlation.
3. Spearman’s rank-order coefficient of correlation.



A. Scatter Diagram Method:
A scatter diagram, or dot diagram, is a graphic device for drawing conclusions
about the correlation between two variables.
To prepare a scatter diagram, the observed pairs of values are plotted as dots
on graph paper in two-dimensional space, with the measurements on variable X
along the horizontal axis and those on variable Y along the vertical axis.

B. Pearson’s Product-Moment Coefficient of Correlation:
The coefficient of correlation, r, is often called the “Pearson r” after Professor Karl
Pearson, who developed the product-moment method following the earlier work of Galton and
Bravais.

C. Spearman’s Rank Correlation Coefficient:
When objects or individuals can be ranked in order of merit or proficiency on two
variables, and these two sets of ranks covary or agree with each other, we measure
the degree of relationship by rank correlation.

Interpretation of Value

Pearson’s R



Spearman Rho

Lesson 2

Pearson’s Product-Moment Correlation or Pearson’s r

What is Pearson’s Correlation Coefficient?
It is also called Pearson’s r and was introduced by Karl Pearson (1857-1936). In statistics,
it is defined as the measurement of the strength of the linear relationship between two
variables. The correlation between two sets of data is a measure of how well they are
related. The full name is the Pearson Product-Moment Correlation (PPMC).
Correlation coefficient formulas are used to find how strong a relationship is between
data. The formulas return a value between -1 and 1, where:
• 1 indicates the strongest possible (perfect) positive relationship.
• -1 indicates the strongest possible (perfect) negative relationship.
• A result of zero indicates no linear relationship at all.

Meaning
• A correlation coefficient of 1 means that for every increase in one variable,
there is an increase of a fixed proportion in the other. For example, shoe sizes
go up in (almost) perfect correlation with foot length.
• A correlation coefficient of -1 means that for every increase in one variable,
there is a decrease of a fixed proportion in the other. For example, the
amount of gas in a tank decreases in (almost) perfect correlation with speed.
• Zero means that an increase in one variable is not associated with any consistent
increase or decrease in the other. The two just aren’t linearly related.

Pearson’s r formula:

$$ r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}} $$

Where:
r = Pearson correlation coefficient
N = number of observations
∑xy = the sum of the products of the paired x and y scores
∑x = the sum of the x scores
∑y = the sum of the y scores
∑x² = the sum of squared x scores
∑y² = the sum of squared y scores

Example:
Find the value of the Pearson correlation coefficient from the following table:

Subject   Age (x)   Glucose Level (y)
1         43        99
2         21        65
3         25        79
4         42        75
5         57        87
6         59        81

Step 1: Make a chart. Use the given data, and add three more columns: xy, x², and y².
Step 2: Multiply x and y together to fill the xy column. For example, row 1 would be 43 × 99
= 4,257.

Subject   Age (x)   Glucose Level (y)   xy      x²      y²
1         43        99                  4257
2         21        65                  1365
3         25        79                  1975
4         42        75                  3150
5         57        87                  4959
6         59        81                  4779

Step 3: Take the square of the numbers in the x column, and put the result in the x² column.
Step 4: Take the square of the numbers in the y column, and put the result in the y² column.

Subject   Age (x)   Glucose Level (y)   xy      x²      y²
1         43        99                  4257    1849    9801
2         21        65                  1365    441     4225
3         25        79                  1975    625     6241
4         42        75                  3150    1764    5625
5         57        87                  4959    3249    7569
6         59        81                  4779    3481    6561

Step 5: Add up all of the numbers in the columns and put the result at the bottom of
the column. The Greek letter sigma (Σ) is a short way of saying “sum of” or summation.

Subject   Age (x)   Glucose Level (y)   xy      x²      y²
1         43        99                  4257    1849    9801
2         21        65                  1365    441     4225
3         25        79                  1975    625     6241
4         42        75                  3150    1764    5625
5         57        87                  4959    3249    7569
6         59        81                  4779    3481    6561
∑         247       486                 20485   11409   40022

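Step 6: Use the following correlation coefficient formula.

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

From our table:

• Σx = 247
• Σy = 486
• Σxy = 20,485
• Σx² = 11,409
• Σy² = 40,022
• n is the sample size, in our case n = 6

$$
\begin{aligned}
r &= \frac{6(20{,}485) - (247)(486)}{\sqrt{\left[6(11{,}409) - 247^2\right]\left[6(40{,}022) - 486^2\right]}} \\
  &= \frac{122{,}910 - 120{,}042}{\sqrt{\left[68{,}454 - 61{,}009\right]\left[240{,}132 - 236{,}196\right]}} \\
  &= \frac{2{,}868}{\sqrt{[7{,}445][3{,}936]}} \\
  &= \frac{2{,}868}{\sqrt{29{,}303{,}520}} \\
  &= \frac{2{,}868}{5{,}413.27} \\
  &= 0.5298
\end{aligned}
$$

Our result is r = 0.5298, which means the variables have a moderate positive
correlation.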
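The six steps can also be followed in code. This is a minimal Python sketch that mirrors the column sums of the worked example; the function name `pearson_r` and the variable names are illustrative choices, not part of the module.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the raw-score sums (Σx, Σy, Σxy, Σx², Σy²)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # the xy column total
    sum_x2 = sum(x * x for x in xs)               # the x² column total
    sum_y2 = sum(y * y for y in ys)               # the y² column total
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Age (x) and glucose level (y) for the six subjects in the example.
age = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

r = pearson_r(age, glucose)
print(round(r, 4))  # 0.5298
```

The intermediate totals computed inside the function match the Σ row of the table (247, 486, 20485, 11409, 40022).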

Example No. 2
Let n denote the number of pairs of values. Suppose x consists of 3 values
(6, 8, 10) and y consists of the 3 corresponding values (12, 10, 20), so n = 3.
Step 1: Make a chart. Use the given data, and add three more columns: xy, x², and y².

x     y     xy    x²    y²
6     12
8     10
10    20


Step 2: Multiply x and y together to fill the xy column.

x     y     xy    x²    y²
6     12    72
8     10    80
10    20    200

Step 3: Take the square of the numbers in the x column, and put the result in the x² column.
Step 4: Take the square of the numbers in the y column, and put the result in the y² column.

x     y     xy    x²    y²
6     12    72    36    144
8     10    80    64    100
10    20    200   100   400

Step 5: Add up all of the numbers in the columns and put the result at the bottom of the
column. The Greek letter sigma (Σ) is a short way of saying “sum of” or summation.

      x     y     xy    x²    y²
      6     12    72    36    144
      8     10    80    64    100
      10    20    200   100   400
∑     24    42    352   200   644
Step 6: Use the following correlation coefficient formula.

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

From our table:

• Σx = 24
• Σy = 42
• Σxy = 352
• Σx² = 200
• Σy² = 644
• n is the sample size, in our case n = 3

$$
\begin{aligned}
r &= \frac{3(352) - (24)(42)}{\sqrt{\left[3(200) - 24^2\right]\left[3(644) - 42^2\right]}} \\
  &= \frac{1056 - 1008}{\sqrt{\left[600 - 576\right]\left[1932 - 1764\right]}} \\
  &= \frac{48}{\sqrt{[24][168]}} \\
  &= \frac{48}{\sqrt{4032}} \\
  &= \frac{48}{63.50} \\
  &= 0.7559
\end{aligned}
$$

Our result is r = 0.7559, which means the variables have a STRONG
positive correlation.
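As a cross-check, the same value can be obtained from the deviation-score form of Pearson's r, which is algebraically equivalent to the raw-score formula used above. This form is not shown in the module; the sketch below, with the illustrative name `pearson_r_deviation`, is offered only as a way to verify the hand computation.

```python
from math import sqrt


def pearson_r_deviation(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]   # deviations of x from its mean
    dy = [y - mean_y for y in ys]   # deviations of y from its mean
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)


# Data from Example No. 2.
r = pearson_r_deviation([6, 8, 10], [12, 10, 20])
print(round(r, 4))  # 0.7559
```

Here the deviations are (-2, 0, 2) and (-2, -4, 6), giving 16 / √448, which agrees with 48 / √4032 from the raw-score computation.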

Spearman’s Rank-Order Correlation or Spearman’s Rho

What is Spearman’s Rho?

It was named after Charles Spearman. It is often denoted by the Greek letter ρ (rho) or
as rₛ. It is a nonparametric measure of rank correlation (statistical dependence between
the rankings of two variables). It measures the strength and direction of association
between two ranked variables.
Spearman's correlation determines the strength and direction of the monotonic
relationship between your two variables rather than the strength and direction of the linear
relationship between your two variables, which is what Pearson's correlation determines.
Spearman’s rho formula:

$$ \rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} $$

Here,
n = number of data points of the two variables
dᵢ = difference in ranks of the “ith” element
The Spearman coefficient, ρ, can take a value between +1 and -1, where:

• A ρ value of +1 means a perfect positive association of ranks.
• A ρ value of 0 means no association of ranks.
• A ρ value of -1 means a perfect negative association of ranks.

Example:
The scores for nine students in physics and math are as follows:

• Physics: 35, 23, 47, 17, 10, 43, 9, 6, 28
• Mathematics: 30, 33, 45, 23, 8, 49, 12, 4, 31

Compute the students’ ranks in the two subjects and compute the Spearman rank correlation.

Step 1: Find the ranks for each individual subject. To rank by hand, order the scores from
highest to lowest; assign the rank 1 to the highest score, 2 to the next highest and so on:

Physics (x)   Math (y)   Rank (x)   Rank (y)
35            30         3          5
23            33         5          3
47            45         1          2
17            23         6          6
10            8          7          8
43            49         2          1
9             12         8          7
6             4          9          9
28            31         4          4

Step 2: Add a third column, dᵢ, to your data. The dᵢ is the difference between ranks. For
example, the first student’s physics rank is 3 and math rank is 5, so the difference is -2.
In a fourth column, square your dᵢ values.

Physics (x)   Math (y)   Rank (x)   Rank (y)   dᵢ (rx - ry)   dᵢ²
35            30         3          5          -2             4
23            33         5          3          2              4
47            45         1          2          -1             1
17            23         6          6          0              0
10            8          7          8          -1             1
43            49         2          1          1              1
9             12         8          7          1              1
6             4          9          9          0              0
28            31         4          4          0              0

Step 3: Sum (add up) all of your d-squared values.

Physics (x)   Math (y)   Rank (x)   Rank (y)   dᵢ (rx - ry)   dᵢ²
35            30         3          5          -2             4
23            33         5          3          2              4
47            45         1          2          -1             1
17            23         6          6          0              0
10            8          7          8          -1             1
43            49         2          1          1              1
9             12         8          7          1              1
6             4          9          9          0              0
28            31         4          4          0              0
∑                                                             12

Step 4: Insert the values into the formula.

$$
\begin{aligned}
\rho &= 1 - \frac{6(12)}{9(9^2 - 1)} \\
     &= 1 - \frac{6(12)}{9(81 - 1)} \\
     &= 1 - \frac{72}{9(80)} \\
     &= 1 - \frac{72}{720} \\
     &= 1 - 0.10 \\
     &= 0.9
\end{aligned}
$$

The Spearman’s rank correlation for this data is 0.9. As mentioned above, a ρ value
near +1 indicates a strong positive association of ranks.
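The ranking and the four steps can be reproduced with a short Python sketch. It assumes there are no tied scores, as in this example; the helper names `rank_descending` and `spearman_rho` are illustrative and not part of the module.

```python
def rank_descending(scores):
    """Rank 1 goes to the highest score. Assumes no tied scores."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(xs, ys):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid without ties."""
    n = len(xs)
    rank_x = rank_descending(xs)
    rank_y = rank_descending(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


physics = [35, 23, 47, 17, 10, 43, 9, 6, 28]
mathematics = [30, 33, 45, 23, 8, 49, 12, 4, 31]

print(rank_descending(physics))      # [3, 5, 1, 6, 7, 2, 8, 9, 4]
print(rank_descending(mathematics))  # [5, 3, 2, 6, 8, 1, 7, 9, 4]

rho = spearman_rho(physics, mathematics)
print(round(rho, 2))  # 0.9
```

With tied scores the simple ranking helper would no longer be appropriate, since tied values are normally given the average of the ranks they occupy.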

Example No. 2

Calculate the Spearman’s rank correlation coefficient for the following data on five
candidates:

Geography   75   40   52   65   60
English     25   42   35   20   33

Step 1: Find the ranks for each individual subject. To rank by hand, order the scores from
highest to lowest; assign the rank 1 to the highest score, 2 to the next highest and so on:

Geography   Rank (x)   English   Rank (y)
75          1          25        4
40          5          42        1
52          4          35        2
65          2          20        5
60          3          33        3

Step 2: Add a third column, dᵢ, to your data. The dᵢ is the difference between ranks. For
example, the first student’s geography rank is 1 and English rank is 4, so the difference is -3.
In a fourth column, square your dᵢ values.

Step 3: Sum (add up) all of your d-squared values.

Geography   Rank (x)   English   Rank (y)   dᵢ    dᵢ²
75          1          25        4          -3    9
40          5          42        1          4     16
52          4          35        2          2     4
65          2          20        5          -3    9
60          3          33        3          0     0
Σ                                                 38

Step 4: Insert the values into the formula.

$$
\begin{aligned}
\rho &= 1 - \frac{6(38)}{5(5^2 - 1)} \\
     &= 1 - \frac{228}{5(24)} \\
     &= 1 - \frac{228}{120} \\
     &= 1 - 1.9 \\
     &= -0.9
\end{aligned}
$$

The Spearman’s rank correlation for this data is -0.9. As mentioned above, a ρ value
near -1 indicates a strong negative association of ranks.
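One way to see why this result is a correlation at all: when there are no ties, Spearman's rho equals Pearson's r computed on the ranks themselves. The module does not state this equivalence, so the Python sketch below is only an illustrative check using the ranks from the table; all names in it are hypothetical.

```python
from math import sqrt

# Ranks taken from the Step 1 table of Example No. 2.
geography_rank = [1, 5, 4, 2, 3]
english_rank = [4, 1, 2, 5, 3]


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


n = len(geography_rank)
sum_d2 = sum((gx - ey) ** 2 for gx, ey in zip(geography_rank, english_rank))

rho_from_formula = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))
rho_from_ranks = pearson_r(geography_rank, english_rank)

print(sum_d2)                      # 38
print(round(rho_from_formula, 4))  # -0.9
print(round(rho_from_ranks, 4))    # -0.9
```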



Lesson 3

Regression
WHAT IS REGRESSION AND LINEAR REGRESSION?

Regression is a statistical method used in finance, investing, and other disciplines
that attempts to determine the strength and character of the relationship between one
dependent variable (usually denoted by Y) and a series of other variables (known as
independent variables).

Linear regression quantifies the relationship between one or more predictor
variable(s) and one outcome variable. Linear regression is commonly used for predictive
analysis and modeling. For example, it can be used to quantify the relative impacts of
age, gender, and diet (the predictor variables) on height (the outcome variable).

TYPES OF REGRESSION

LINEAR REGRESSION

A. Simple Linear Regression

Y = a + bX + u

B. Multiple Linear Regression

Y = a + b₁X₁ + b₂X₂ + b₃X₃ + ... + bₜXₜ + u

Where:

Y = the variable that you are trying to predict (dependent variable).

X = the variable that you are using to predict Y (independent variable).

a = the intercept.

b = the slope.

u = the regression residual.

SIMPLE LINEAR REGRESSION EQUATION

Often referred to as THE PREDICTION LINE. The predicted value of Y equals the Y
intercept plus the slope multiplied by the value of X.

Ŷᵢ = b₀ + b₁Xᵢ

Where:

Ŷᵢ = predicted value of Y for observation i

Xᵢ = value of X for observation i

b₀ = sample Y intercept

b₁ = sample slope

A regression line is the “best fit” line for your data: the line that best
represents the data points. It is like an average of where all the points line up. In linear
regression, the regression line is a perfectly straight line.

Example

Given the following data:

X (units of fertilizer used in corn field)   0.3   0.6   0.9   1.2   1.5   1.8   2.1   2.4
Y (corn yield)                                10    15    30    35    25    30    50    45

A. Plot the dependent variable Y against the independent variable X.
B. Find the equation of the best-fitting straight line.
C. If 3 units of fertilizer were used, what would be a good guess as to the corn yield?
D. Compute the coefficient of correlation.

Solution:

a. First, plot Y against X.

[Figure: scatter plot “Corn Yield per Unit Fertilizer” with the best-fitting line; horizontal axis from 0 to 3, vertical axis from 0 to 60.]

b. The equation of the best-fitting line is y = a + bx, where

$$ b = \frac{\sum x \sum y - n\sum xy}{(\sum x)^2 - n\sum x^2} \qquad \text{and} \qquad a = \frac{1}{n}\left(\sum y - b\sum x\right) $$


Let us do some computations:

      x      y      xy      x²      y²
      0.3    10     3.0     0.09    100
      0.6    15     9.0     0.36    225
      0.9    30     27.0    0.81    900
      1.2    35     42.0    1.44    1225
      1.5    25     37.5    2.25    625
      1.8    30     54.0    3.24    900
      2.1    50     105.0   4.41    2500
      2.4    45     108.0   5.76    2025
Σ     10.8   240    385.5   18.36   8500

Substituting the values:

$$
\begin{aligned}
b &= \frac{\sum x \sum y - n\sum xy}{(\sum x)^2 - n\sum x^2} \\
  &= \frac{10.8(240) - 8(385.5)}{10.8^2 - 8(18.36)} \\
  &= 16.27 \\
a &= \frac{1}{n}\left(\sum y - b\sum x\right) \\
  &= \frac{240 - 16.27(10.8)}{8} \\
  &= 8.036
\end{aligned}
$$

The equation of the best-fitting line (regression line) is:
y = a + bx
y = 8.036 + 16.27x

c. If x = 3, then
y = 8.036 + 16.27(3)
y = 56.85

d. To compute the coefficient of correlation, substitute the sums into the formula for r:

$$ r = \frac{8(385.5) - 10.8(240)}{\sqrt{8(18.36) - 116.64}\,\sqrt{8(8500) - 57{,}600}} $$

r = 0.877
r² = 0.77
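The whole solution can be checked with a few lines of Python. This is a minimal sketch that applies the slope, intercept, and correlation formulas from this example to the fertilizer data; the variable names are illustrative.

```python
from math import sqrt

# Fertilizer units (x) and corn yield (y) from the example.
x = [0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4]
y = [10, 15, 30, 35, 25, 30, 50, 45]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

# Slope and intercept of the best-fitting line y = a + bx.
b = (sum_x * sum_y - n * sum_xy) / (sum_x ** 2 - n * sum_x2)
a = (sum_y - b * sum_x) / n

# Predicted corn yield for 3 units of fertilizer.
predicted = a + b * 3

# Coefficient of correlation and coefficient of determination.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)

print(round(b, 2), round(a, 3))   # slope and intercept
print(round(predicted, 2))        # predicted yield at x = 3
print(round(r, 3), round(r * r, 2))
```

Note that x = 3 lies outside the observed range of fertilizer use (0.3 to 2.4), so the prediction is an extrapolation and should be treated with some caution.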



Example No. 2
The table below shows some data from the early days of the Italian clothing company
Benetton. Each row in the table shows Benetton’s sales for a year and the amount spent on
advertising that year.

YEAR   SALES IN MILLION EUROS   ADVERTISEMENT IN MILLION EUROS
1      651.00                   23.00
2      762.00                   26.00
3      856.00                   30.00
4      1,063.00                 34.00
5      1,190.00                 43.00
6      1,298.00                 48.00
7      1,421.00                 52.00
8      1,440.00                 57.00
9      1,518.00                 58.00
Σ      10,199.00                371.00

First, let’s plot the points and calculate the mean of the advertisement values:

ȳ = 371.00 / 9 = 41.22

[Figure: scatter plot “Benetton Sales versus Ads Expenses”; horizontal axis: sales in million euros, vertical axis: advertisement in million euros.]

[Figure: the same scatter plot with the mean line ȳ = 41.22 drawn as the best-fit line; the vertical distance of each point from this line is marked as its error/residual, for example 18.22 for the first year.]


ADVERTISEMENT IN MILLION EUROS   RESIDUAL/ERROR
23.00                            18.22
26.00                            15.22
30.00                            11.22
34.00                            7.22
43.00                            -1.78
48.00                            -6.78
52.00                            -10.78
57.00                            -15.78
58.00                            -16.78
Σ 371.00                         0

ADVERTISEMENT IN MILLION EUROS   RESIDUAL/ERROR   RESIDUAL/ERROR SQUARED
23.00                            18.22            332.05
26.00                            15.22            231.72
30.00                            11.22            125.94
34.00                            7.22             52.16
43.00                            -1.78            3.16
48.00                            -6.78            45.94
52.00                            -10.78           116.16
57.00                            -15.78           248.94
58.00                            -16.78           281.49
Σ 371.00                         0                1437.56

The goal of simple linear regression is to create a linear model that minimizes the sum
of squares of the residuals/errors (SSE).
When conducting simple linear regression with TWO variables, we determine
how well the line “fits” the data by comparing it to this baseline, the mean line
ȳ = 41.22, in which we pretend the second variable does not even exist.
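The residual table for the mean line can be reproduced as follows. This minimal Python sketch follows the sign convention used in the table above (mean minus observed value); many textbooks define a residual the other way around, as observed minus predicted, which flips the signs but leaves the squares and the SSE unchanged. The variable names are illustrative.

```python
# Advertisement spending in million euros, from the Benetton table.
ads = [23.0, 26.0, 30.0, 34.0, 43.0, 48.0, 52.0, 57.0, 58.0]

# Baseline model: predict the mean for every year, ignoring sales.
mean_ads = sum(ads) / len(ads)

# Residuals with the table's sign convention (mean minus observed).
residuals = [mean_ads - value for value in ads]
squared = [res ** 2 for res in residuals]

# Sum of squared errors (SSE) of the mean-only baseline.
sse = sum(squared)

print(round(mean_ads, 2))                 # 41.22
print([round(res, 2) for res in residuals])
print(round(sse, 2))                      # 1437.56
```

A fitted regression line that uses sales as the predictor is judged by how much it reduces the SSE relative to this baseline value.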



USES OF REGRESSION

◉ Regression is used to develop a more formal understanding of relationships between
variables.
◉ Regression is often used to determine how specific factors, such as the price of
a commodity, interest rates, or particular industries or sectors, influence the price
movement of an asset.
◉ It is used to determine which variables have an effect on the response or help explain the
response.

DIFFERENCE BETWEEN CORRELATION AND REGRESSION

BASIS FOR COMPARISON: Meaning
  CORRELATION: Correlation is a statistical measure that determines the association or
  co-relationship between two variables.
  REGRESSION: Regression describes how to numerically relate an independent variable to the
  dependent variable.

BASIS FOR COMPARISON: Usage
  CORRELATION: To represent a linear relationship between two variables.
  REGRESSION: To fit the best line and to estimate one variable based on another.

BASIS FOR COMPARISON: Dependent and independent variables
  CORRELATION: No difference.
  REGRESSION: Both variables are different.

BASIS FOR COMPARISON: Indicates
  CORRELATION: The correlation coefficient indicates the extent to which two variables move
  together.
  REGRESSION: Regression indicates the impact of a unit change in the known variable (x) on
  the estimated variable (y).

BASIS FOR COMPARISON: Objective
  CORRELATION: To find a numerical value expressing the relationship between variables.
  REGRESSION: To estimate values of a random variable on the basis of the values of a fixed
  variable.


LEARNING ACTIVITIES

I. Directions: Write TRUE if the statement is correct and FALSE if the statement is wrong.
_________1. Pearson’s r is defined in statistics as the measurement of the strength of the
relationship between two variables and their association with each other, while
Spearman’s rho determines the strength and direction of the monotonic
relationship between two variables rather than the strength and
direction of the linear relationship between them.
_________2. Correlation is a statistical measure that determines the association or
co-relationship between two variables, while regression describes how to
numerically relate an independent variable to the dependent variable.
_________3. Partial correlation recognizes more than two variables but
considers only two variables at a time, keeping the others constant.
_________4. Positive correlation means no relationship between the two variables X and
Y.
_________5. Regression quantifies the relationship between one or more predictor
variable(s) and one outcome variable.

Find the Pearson correlation coefficient from the data below.

The table gives the ages and weights of 6 people. Use it to calculate the value of
Pearson’s r.

Sr No   Age (x)   Weight (y)
1       40        78
2       21        70
3       25        60
4       31        55
5       38        80
6       47        66


Spearman correlation coefficient

To calculate a Spearman rank-order correlation on data without any ties, we will use the
following data:

Marks
English   56   75   45   71   62   64   58   80   76   61
Math      66   70   40   60   65   56   59   77   67   63

Then complete the following table and find the Spearman correlation coefficient.

English (mark)   Math (mark)   Rank (English)   Rank (Math)   dᵢ   dᵢ²
56               66
75               70
45               40
71               60
62               65
64               56
58               59
80               77
76               67
61               63


Linear Regression

A factory is producing and stockpiling metal sheets to be shipped to an automobile
manufacturing plant. The factory ships only when there is a minimum of 2,250 sheets in stock
at the beginning of that day. The table shows the day, x, and the number of sheets in
stock, y, at the beginning of that day.

Day (x)   Sheets in Stock (y)
1         860
2         930
3         1000
4         1150
5         1200
6         1360

a. Write a linear regression equation y = a + bx predicting the number of sheets in
stock (y) from the day (x).
b. Construct a scatter diagram.
c. Use this equation to determine the day the sheets will be shipped.

