Correlation
Topic Outcomes
❑ Explain the term “correlation” in relation to data where both
variables must be random.
❑ Calculate the correlation coefficient.
❑ Interpret correlation coefficient results in relation to the appearance of
scatter diagrams, with reference to values close to -1, 0 and
+1.
Correlation
• Correlation indicates the direction and strength of a relationship between two variables.
• The relationship between two variables can be seen easily by plotting a scatter
diagram.
• Bivariate data are data that come in pairs; there may or may not be a relationship
between the two variables.
Variability of bivariate data
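Variance = $\frac{\sum (x - \bar{x})^2}{n}$
In correlation: $S_{xx} = \sum (x - \bar{x})^2$ and $S_{yy} = \sum (y - \bar{y})^2$
Covariance = $\frac{\sum (x - \bar{x})(y - \bar{y})}{n}$
In correlation: $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$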
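As an illustration of the deviation-form definitions, the following minimal Python sketch computes Sxx, Syy and Sxy, then divides by n to obtain the variance and covariance. The function name `sums_of_squares` and the data values are made up for illustration and do not come from the slides.

```python
def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy) using deviations from the means."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy


# Hypothetical bivariate data (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sxx, syy, sxy = sums_of_squares(xs, ys)
variance_x = sxx / len(xs)   # variance of x = Sxx / n
covariance = sxy / len(xs)   # covariance = Sxy / n

print(sxx, syy, sxy)
print(variance_x, covariance)
```

For this data the means are 3 and 4, so Sxx = 10, Syy = 6 and Sxy = 6.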
Variability of bivariate data
Variance = $\frac{\sum x^2}{n} - \bar{x}^2$
In correlation: $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$ and $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$
In correlation: $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$
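The computational forms need only the running totals Σx, Σy, Σx², Σy² and Σxy, which is convenient when working from a table. The sketch below is one possible way to code them; the function name `sums_shortcut` and the data are hypothetical.

```python
def sums_shortcut(xs, ys):
    """Return (Sxx, Syy, Sxy) from the totals, without computing the means."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    return sxx, syy, sxy


# Hypothetical bivariate data (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

short_sxx, short_syy, short_sxy = sums_shortcut(xs, ys)
print(short_sxx, short_syy, short_sxy)
```

Here Σx² = 55, Σy² = 86, Σxy = 66, Σx = 15 and Σy = 20, giving Sxx = 55 - 45 = 10, Syy = 86 - 80 = 6 and Sxy = 66 - 60 = 6, the same values the deviation forms produce for this data.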
Correlation coefficient r
• You can get a measure of the amount of correlation between two variables by
using the product moment correlation coefficient r.
• This is defined as:
$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
• The value of r varies between -1 and +1.
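A minimal Python sketch of the product moment correlation coefficient, using the standard form r = Sxy / √(Sxx · Syy). The function name `pearson_r` and the data are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two pairs of equal-length data")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # A constant variable has no spread, so r cannot be computed.
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical bivariate data (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 4))
```

For this data Sxx = 10, Syy = 6 and Sxy = 6, so r = 6 / √60 ≈ 0.7746, a fairly strong positive linear correlation.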
Correlation coefficient r
If r = 1 there is a perfect positive linear correlation.
If r = -1 there is a perfect negative linear correlation.
If r is zero, there is no linear correlation.
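The three cases can be checked numerically. The sketch below uses made-up data sets: one lying exactly on a rising line, one on a falling line, and one following a symmetric curve. The last case is a reminder that r = 0 rules out a linear relationship only; the variables may still be related in a non-linear way.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data sets (illustrative only).
xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]      # points on the line y = 2x + 1
falling = [10 - 3 * x for x in xs]    # points on the line y = 10 - 3x

sym_xs = [-2, -1, 0, 1, 2]
curve = [x ** 2 for x in sym_xs]      # points on the curve y = x^2

r_rising = pearson_r(xs, rising)      # perfect positive linear correlation
r_falling = pearson_r(xs, falling)    # perfect negative linear correlation
r_curve = pearson_r(sym_xs, curve)    # no linear correlation

print(r_rising, r_falling, r_curve)
```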
Example
Spearman’s rank coefficient of correlation
• It is called rank correlation because it is based on rankings of the values of
x and y
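• Spearman’s coefficient is the most commonly used measure of rank correlation
$r' = 1 - \frac{6 \sum (r_x - r_y)^2}{n (n^2 - 1)}$
where $r_x$ and $r_y$ are the ranks of x and y, and n is the number of pairs.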
• The rankings are usually in ascending order, but can be descending.
• It does not matter which is used, as long as the same ranking method is applied to both x and y.
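The rank formula can be sketched in Python as follows. This is a simplified illustration that assumes no tied values (ties need averaged ranks, which this sketch does not handle). The function names `ranks` and `spearman` and the data are hypothetical; they are not the rent and rate figures from the example.

```python
def ranks(values, descending=False):
    """Rank values from 1 to n; this sketch assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    ordered = sorted(values, reverse=descending)
    position = {value: index + 1 for index, value in enumerate(ordered)}
    return [position[value] for value in values]


def spearman(xs, ys, descending=False):
    """Spearman's r' = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d = rx - ry."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two pairs of equal-length data")
    rx = ranks(xs, descending)
    ry = ranks(ys, descending)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical paired data (illustrative only).
xs = [10, 20, 30, 40, 50]
ys = [12, 25, 22, 48, 41]

r_asc = spearman(xs, ys)                     # both ranked ascending
r_desc = spearman(xs, ys, descending=True)   # both ranked descending

print(r_asc, r_desc)
```

With ascending ranks, x is ranked 1, 2, 3, 4, 5 and y is ranked 1, 3, 2, 5, 4, so the squared differences sum to 4 and r' = 1 - 24/120 = 0.8. Ranking both variables in descending order gives the same result, because every difference only changes sign.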
Computation Steps
Example
• The data in the table below shows annual rents and rate bills.
Calculate Spearman’s Rank Correlation Coefficient to assess whether
there is any correlation between rent and rate bills.