8.3 Correlation

The document provides an overview of correlation in statistics, detailing its definition, types (positive, negative, and no correlation), and the Pearson correlation coefficient. It includes mathematical formulas for calculating correlation coefficients and interpretations of their values. Additionally, it discusses Spearman's rank correlation coefficient and Cramer's coefficient of contingency for categorical variables.


Correlation

Introduction to Correlation
In statistics, correlation measures the strength and direction of a linear relationship
between two quantitative variables.
It quantifies how the change in one variable is associated with the change in another.
Measuring the degree of linear relationship also indicates how useful a regression analysis will be in a specific application.

Types of Correlation:

Positive Correlation:
As one variable increases, the other also increases.

Negative Correlation:
As one variable increases, the other decreases.

No Correlation:
No apparent relationship between the variables.

Linear Correlation Coefficient (Pearson’s Correlation Coefficient)
The Pearson correlation coefficient (r) measures the linear relationship between two continuous variables.

Mathematical Formulas:

First Formula:

$$ r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\left[\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2\right]^{1/2}} $$

Where:

x_i, y_i: individual data points of variables X and Y.
x̄, ȳ: means of variables X and Y.
r: the correlation coefficient; it ranges from -1 to +1.

Interpretation:

r = +1: Perfect positive linear correlation
r = −1: Perfect negative linear correlation
r = 0: No linear correlation

Second Formula:

$$ r = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\left[\left(\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right)\left(\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right)\right]^{1/2}} $$

Equivalently, multiplying numerator and denominator by n:

$$ r = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\left[\left(n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right)\left(n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right)\right]^{1/2}} $$

Third Formula:

$$ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} $$

Where:

SS_xy is the sum of cross-products between the two variables x and y (it is proportional to their covariance):

$$ SS_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) $$

SS_xx is the sum of squared deviations of the x-values (proportional to their variance):

$$ SS_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = \sum_{i=1}^{n}(x_i - \bar{x})^2 $$

SS_yy is the sum of squared deviations of the y-values (proportional to their variance):

$$ SS_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} = \sum_{i=1}^{n}(y_i - \bar{y})^2 $$
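The formulas above are easy to check numerically. Below is a minimal Python sketch (the data values are illustrative, not from the document) that computes r with both the deviation-score first formula and the raw-sum second formula; the two must agree.

```python
# Pearson's r computed two ways (illustrative data, not from the document).
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# First formula: sums of deviations from the means (these are SS_xy, SS_xx, SS_yy).
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
r1 = ss_xy / math.sqrt(ss_xx * ss_yy)

# Second formula: raw sums only, no means required.
r2 = (n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)) / math.sqrt(
    (n * sum(xi ** 2 for xi in x) - sum(x) ** 2)
    * (n * sum(yi ** 2 for yi in y) - sum(y) ** 2)
)

print(round(r1, 4), round(r2, 4))  # 0.7746 0.7746
```

For this data, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, Σy² = 86, giving r = 30/√1500 ≈ 0.7746 by either route.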

Examples :

Ex 1 :
Ex 2 :
Rank Correlation Coefficient (Spearman’s Rank Correlation Coefficient)
Spearman's rank correlation measures the strength and direction of a monotonic relationship between two ranked variables.
It is useful when the data do not meet the assumptions of Pearson's correlation (for example, non-linear monotonic relationships or ordinal qualitative data).

Mathematical Formula:

$$ \rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} $$

Where:

d_i: difference between the ranks of corresponding values of X and Y.
n: number of data pairs of values for the variables (X, Y).
ρ: Spearman's rank correlation coefficient; it ranges from -1 to +1.

Interpretation:

ρ = +1: Perfect positive correlation.
ρ = −1: Perfect negative correlation.
ρ = 0: No correlation.
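The rank-difference formula can be sketched in Python as follows (the data and the simple ranking helper are illustrative, not from the document, and assume no tied values):

```python
# Spearman's rho via the rank-difference formula (illustrative data, no ties).
x = [35, 23, 47, 17, 10, 43, 9, 6, 28]
y = [30, 33, 45, 23, 8, 49, 12, 4, 31]
n = len(x)

def ranks(values):
    # Rank 1 = smallest value; valid only when there are no ties.
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

rx, ry = ranks(x), ranks(y)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(round(rho, 4))  # 0.9
```

Here Σd² = 12 and n = 9, so ρ = 1 − 72/720 = 0.9, a strong positive monotonic association. With tied values the ranks would need to be averaged, which this simple helper does not do.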

Examples :

Ex 1 :
Ex 2 :
Cramer's Coefficient of Contingency
The coefficient of contingency r_c is used to measure the association between two categorical variables in a 2×2 contingency table.

Mathematical Formula:
Suppose we have two phenomena: the first phenomenon has n_a elements belonging to characteristic A and n_b elements belonging to characteristic B, while the second phenomenon has m_a elements belonging to characteristic A and m_b elements belonging to characteristic B.

The coefficient of contingency for the two phenomena is:

$$ r_c = \frac{n_a\, m_b - n_b\, m_a}{n_a\, m_b + n_b\, m_a} $$
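A minimal sketch of this calculation in Python (the cell counts are illustrative, not taken from the document's examples):

```python
# 2x2 contingency coefficient r_c = (n_a*m_b - n_b*m_a) / (n_a*m_b + n_b*m_a).
def contingency_coefficient(n_a, n_b, m_a, m_b):
    return (n_a * m_b - n_b * m_a) / (n_a * m_b + n_b * m_a)

# Phenomenon 1: 40 elements with characteristic A, 10 with B.
# Phenomenon 2: 20 elements with characteristic A, 30 with B.
rc = contingency_coefficient(40, 10, 20, 30)
print(round(rc, 4))  # 0.7143
```

Here r_c = (1200 − 200)/(1200 + 200) = 1000/1400 ≈ 0.71, indicating a fairly strong association; swapping the rows of one phenomenon would flip its sign.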

Examples :
Ex 1 :

Ex 2 :
The Coefficient of Association
If we have descriptive or quantitative data representing two phenomena under study, summarized in a table that contains more than four cells.
In other words, if the first characteristic is divided into n categories and the second characteristic into m categories, with n·m > 4, and f_ij is the number of items in row number i and column number j, as in the following table:

Formula:

$$ C = \frac{f_{11}^2}{f_{1\cdot}\, f_{\cdot 1}} + \frac{f_{12}^2}{f_{1\cdot}\, f_{\cdot 2}} + \dots + \frac{f_{nm}^2}{f_{n\cdot}\, f_{\cdot m}} = \sum_{i=1}^{n}\sum_{j=1}^{m} \frac{f_{ij}^2}{f_{i\cdot}\, f_{\cdot j}} $$

Then we calculate the coefficient of association r_a from the relation:

$$ r_a = \sqrt{\frac{C - 1}{C}} $$

Where:

∙ denotes summation over that index: f_i∙ is the total of row i and f_∙j is the total of column j. Under independence C = 1, so r_a = 0.
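The calculation of C and r_a can be sketched in Python as follows (the 2×3 table of counts is illustrative, not from the document):

```python
# Coefficient of association r_a = sqrt((C - 1) / C), where
# C = sum over all cells of f_ij^2 / (f_i. * f_.j).
import math

f = [
    [10, 20, 30],  # row 1 cell counts
    [20, 10, 10],  # row 2 cell counts
]
row_totals = [sum(row) for row in f]        # f_i.
col_totals = [sum(col) for col in zip(*f)]  # f_.j

C = sum(
    f[i][j] ** 2 / (row_totals[i] * col_totals[j])
    for i in range(len(f))
    for j in range(len(f[0]))
)
ra = math.sqrt((C - 1) / C)
print(round(C, 4), round(ra, 4))  # 1.1319 0.3414
```

For this table C = 163/144 ≈ 1.132, so r_a = √(19/163) ≈ 0.34, a weak association; note that C ≥ 1 always, so the square root is well defined.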

Examples :

Ex 1 :
