S.M.A.
Measures
- Pearson Product-Moment Correlation Coefficient
- Spearman Rank-Order Correlation Coefficient
Correlation is used to analyze a collection of paired sample data
and determine whether there appears to be a
relationship between the two quantitative variables.
One variable (y) is treated as the response (dependent
or outcome) variable.
The other variable (x) is the explanatory (independent
or predictor) variable.
Linear Correlation Coefficient
- Measures the strength of association
- The sign describes the direction, while the absolute value of the
  correlation coefficient describes the magnitude of the relationship
- Denoted by rho (ρ) or r
- −1 ≤ ρ (or r) ≤ 1
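CORRELATION COEFFICIENT    INTERPRETATION
±0.91 to ±1.00             VERY HIGH
±0.71 to ±0.90             HIGH
±0.41 to ±0.70             MODERATE
±0.21 to ±0.40             LOW
 0.00 to ±0.20             VERY LOW

A positive sign (+) means X and Y move in the same direction; a negative sign (-) means they move in opposite directions.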
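The interpretation bands above can be sketched as a small lookup. This is only an illustration: the function name `interpret_r` is hypothetical, and because the table leaves small gaps between bands (for example between 0.90 and 0.91), the sketch assumes each band's upper bound acts as the cutoff.

```python
def interpret_r(r: float) -> tuple[str, str]:
    """Return (direction, strength) for a correlation coefficient r.

    Assumption: upper bounds of the table bands (0.20, 0.40, 0.70, 0.90)
    are used as cutoffs, so a value such as 0.905 counts as "very high".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    # The sign gives the direction of the relationship.
    if r > 0:
        direction = "direct"
    elif r < 0:
        direction = "inverse"
    else:
        direction = "none"

    # The absolute value gives the magnitude.
    size = abs(r)
    if size > 0.90:
        strength = "very high"
    elif size > 0.70:
        strength = "high"
    elif size > 0.40:
        strength = "moderate"
    elif size > 0.20:
        strength = "low"
    else:
        strength = "very low"
    return direction, strength


print(interpret_r(-0.76))  # ('inverse', 'high')
```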
Pearson Product-Moment Coefficient of Correlation
- Named after Karl Pearson
- Correlation of continuous variables
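$$
r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}
$$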
Example: A researcher abstracted hemoglobin A1C and weight from consenting participants’ medical records.

Weight    HbA1C level (%)
269       7.3
120       12.1
224       7.5
223       9.4
211       5.9
x       y       x²       y²       xy
269     7.3     72361    53.29    1963.7
120     12.1    14400    146.41   1452
224     7.5     50176    56.25    1680
223     9.4     49729    88.36    2096.2
211     5.9     44521    34.81    1244.9
∑x = 1047    ∑y = 42.2    ∑x² = 231187    ∑y² = 379.12    ∑xy = 8436.8

$$
\begin{aligned}
r_{xy} &= \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \\
&= \frac{5(8436.8) - (1047)(42.2)}{\sqrt{\left[5(231187) - (1047)^2\right]\left[5(379.12) - (42.2)^2\right]}} \\
&= \frac{42184 - 44183.4}{\sqrt{(1155935 - 1096209)(1895.6 - 1780.84)}} \\
&= \frac{-1999.4}{\sqrt{(59726)(114.76)}} \\
&= \frac{-1999.4}{\sqrt{6854155.76}} \\
&= \frac{-1999.4}{2618.04} \\
&= -0.76
\end{aligned}
$$

***Therefore, there is an inverse high relationship between the weight and hemoglobin A1C level of the participants.
***As their weight increases, their hemoglobin A1C level decreases, and vice versa.
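The hand computation above can be checked with a short script. This is a minimal sketch of the same computational formula, not a library routine; the function name `pearson_r` and the variable names are chosen here for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from the sums n, sum(x), sum(y), sum(x^2), sum(y^2), sum(xy)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")

    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = n * sum_xy - sum_x * sum_y
    spread_x = n * sum_x2 - sum_x ** 2
    spread_y = n * sum_y2 - sum_y ** 2
    # r is undefined when either variable does not vary.
    if spread_x <= 0 or spread_y <= 0:
        raise ValueError("r is undefined when a variable is constant")
    return numerator / sqrt(spread_x * spread_y)


weight = [269, 120, 224, 223, 211]
hba1c = [7.3, 12.1, 7.5, 9.4, 5.9]

r = pearson_r(weight, hba1c)
print(round(r, 2))  # -0.76
```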
Regression is the next step after correlation.
Correlation analysis is concerned with the strength of
the relationship, while regression analysis predicts
the value of a dependent variable (y) based on the
value of at least one independent variable (x).
Regression is a statistical model: a mathematical
formula or assumption that describes a real-world
situation.
$$
\hat{y} = a + bx
$$

where: a = y-intercept
       b = slope
       ŷ = predicted value of the dependent variable

$$
b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
$$

$$
a = \bar{y} - b\bar{x}
$$
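As a side note that is not on the original slides, the slope b and the correlation coefficient r share the same numerator, so they always have the same sign. Dividing one formula by the other gives the relationship sketched below:

```latex
b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
  = r_{xy}\,\sqrt{\frac{n\sum y^2 - \left(\sum y\right)^2}{n\sum x^2 - \left(\sum x\right)^2}}
```

With the weight and HbA1C sums from the correlation example, this gives b ≈ −0.7637 × √(114.76 / 59726) ≈ −0.0335, which rounds to the slope of −0.03 used in the regression example.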
Example: A researcher abstracted hemoglobin A1C and weight from consenting participants’ medical records.

Weight    HbA1C level (%)
269       7.3
120       12.1
224       7.5
223       9.4
211       5.9
x       y       x²       y²       xy
269     7.3     72361    53.29    1963.7
120     12.1    14400    146.41   1452
224     7.5     50176    56.25    1680
223     9.4     49729    88.36    2096.2
211     5.9     44521    34.81    1244.9
∑x = 1047    ∑y = 42.2    ∑x² = 231187    ∑y² = 379.12    ∑xy = 8436.8

$$
b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
  = \frac{5(8436.8) - (1047)(42.2)}{5(231187) - (1047)^2}
  = \frac{-1999.4}{59726}
  \approx -0.03
$$

$$
\bar{y} = \frac{\sum y}{n} = \frac{42.2}{5} = 8.44
\qquad
\bar{x} = \frac{\sum x}{n} = \frac{1047}{5} = 209.4
$$

$$
a = \bar{y} - b\bar{x} = 8.44 - (-0.03)(209.4) = 14.72
$$

$$
\hat{y} = a + bx = 14.72 + (-0.03)x = 14.72 - 0.03x
$$

a. Weight = 130: ŷ = 14.72 − 0.03(130) = 10.82
b. x = 255: ŷ = 14.72 − 0.03(255) = 7.07
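The regression computation above can be reproduced with a few lines of code. This is a sketch under the same formulas; `fit_line` and the other names are illustrative only. It also shows one thing worth knowing: the worked example rounds b to −0.03 before computing a, so carrying more decimal places gives a slightly different intercept and slightly different predictions.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    spread_x = n * sum_x2 - sum_x ** 2
    if spread_x == 0:
        raise ValueError("slope is undefined when x is constant")
    b = (n * sum_xy - sum_x * sum_y) / spread_x
    a = sum_y / n - b * (sum_x / n)  # a = y-bar - b * x-bar
    return a, b


weight = [269, 120, 224, 223, 211]
hba1c = [7.3, 12.1, 7.5, 9.4, 5.9]

# Full-precision coefficients.
a, b = fit_line(weight, hba1c)
print(round(b, 4), round(a, 2))  # -0.0335 15.45

# Coefficients as in the worked example: b rounded to two decimals first.
x_bar = sum(weight) / len(weight)
y_bar = sum(hba1c) / len(hba1c)
b_rounded = round(b, 2)                            # -0.03
a_rounded = round(y_bar - b_rounded * x_bar, 2)    # 14.72

for w in (130, 255):
    by_hand = round(a_rounded + b_rounded * w, 2)  # matches the worked example
    full = round(a + b * w, 2)                     # full-precision prediction
    print(w, by_hand, full)
```

For a weight of 130 the rounded coefficients give 10.82, while the full-precision line gives about 11.10; for 255 the values are 7.07 and about 6.91.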
Alcausin, G., Garcia, E., & Manikis, M. (1989). Makati: Salesiana Publishers, Inc.
Goodman, M. (2018). Biostatistics for Clinical and Public Health Research. New York: Routledge.