ANALYSIS OF
RELATIONSHIPS
CORRELATION
CHI SQUARE TEST
REGRESSION ETC.
TABLE OF CONTENTS
4.2 Introduction
4.3 Understanding Correlation
4.3.1 Scatter Diagram
4.3.2 Components of Correlation: Direction and Magnitude
4.3.3 Meaning of Correlation
4.4 Calculating Pearson’s Correlation
4.5 Correlation and Causation
4.6 Effects of Linear Score Transformations
4.7 Factors Influencing Correlation
4.8 Spearman Rank Correlation Method
4.9 Linear Regression Analysis/Simple Regression
4.2 Introduction
SOME TERMS:
Bivariate: statistical analysis involving two variables
Dichotomous: a variable having ONLY two levels or two values
Correlation is a statistical technique used to understand the level of
association between variables.
A correlation refers to a relationship between two variables.
In statistics, correlation studies and measures the direction and
extent of the relationship among variables. It measures co-variation,
not causation.
MULTIPLE CORRELATION
A GLIMPSE OF THE PAST:
Darwin: evolution
Francis Galton (Darwin’s cousin), late 1800s
He wanted to understand the role of inheritance in the stature of
children. For this he collected the heights of parents and of their
offspring, and then tabulated the data.
He is often credited with pioneering the study of correlation (its
"father").
In 1896, Karl Pearson formulated a technique to calculate this
relationship mathematically, which is referred to as correlation.
WHICH TERM TO USE THEN for correlation?
Bivariate or Dichotomous? Bivariate.
BIVARIATE DISTRIBUTION
A bivariate distribution is a distribution that shows the relationship
between two variables.
Correlation matrix
4.3. UNDERSTANDING CORRELATION
4.3.1 Scatter Diagram
4.3.2 Components of Correlation: Direction and Magnitude
4.3.3 Meaning of Correlation
UNDERSTANDING CORRELATION
SCATTER PLOT
A scatter diagram, also known as a scatter plot (or scatter chart,
scatter graph), is a way of representing information about the
relationship between variables. It usually uses dots to represent
values for two different numeric variables.
Step 1: Label the two variables X and Y.
Step 2: Mark the range of X values on the x-axis and the range of Y
values on the y-axis.
Step 3: For each value of X, find the corresponding value of Y.
Step 4: Mark the point where the two values intersect with a dot.
Step 5: Repeat steps 3 and 4 for all the pairs of values.
Step 6: Name each axis and add a title to the graph.
SCATTER PLOT
Ten employees were rated on their work performance by their manager and
by customers.
Employee id: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110
Customer ratings: 7, 8, 4, 5, 6, 3, 9, 7, 5, 2
Manager ratings: 5, 6, 2, 8, 7, 5, 8, 6, 7, 6
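The plotting steps can be sketched in code using the ratings above. This is only an illustrative sketch: it draws a rough text grid so that it runs without any plotting library, it assumes customer ratings go on the x-axis and manager ratings on the y-axis, and the function name text_scatter is made up for this example. In practice a charting tool would normally be used.

```python
# Ratings from the employee example (assumed: customer = X, manager = Y).
customer = [7, 8, 4, 5, 6, 3, 9, 7, 5, 2]
manager = [5, 6, 2, 8, 7, 5, 8, 6, 7, 6]


def text_scatter(xs, ys, lo=1, hi=10):
    """Return text rows of a scatter grid; '*' marks each (x, y) pair."""
    points = set(zip(xs, ys))  # one dot per pair of scores
    rows = []
    for y in range(hi, lo - 1, -1):  # top row holds the highest Y value
        cells = "".join(
            " *" if (x, y) in points else " ." for x in range(lo, hi + 1)
        )
        rows.append(f"{y:>2} |{cells}")
    rows.append("   +" + "--" * (hi - lo + 1))  # x-axis line
    rows.append("    " + "".join(f"{x:>2}" for x in range(lo, hi + 1)))
    return rows


plot = text_scatter(customer, manager)
print("Manager rating (Y) against customer rating (X)")
print("\n".join(plot))
```

Each dot sits where an employee's customer rating (read along the bottom) meets the manager rating (read up the side).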
COMPONENTS OF CORRELATION
strength (magnitude) and direction
Magnitude implies the strength of the relationship between the
variables.
The value of r ranges from -1 to +1. The sign (+ or -) indicates the
direction of the relationship, and the absolute value represents its
magnitude.
The closer r is to ±1, the stronger the correlation, irrespective of
the direction. For example, a correlation of +0.9 is the same in
strength as -0.9; the difference is in the direction: one is positive
and the other is negative.
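As a small illustration of separating the two components, the sketch below splits a coefficient into its direction (the sign) and its magnitude (the absolute value). The helper name describe is hypothetical and used only for this example.

```python
def describe(r):
    """Split a correlation coefficient into (direction, magnitude)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(r)  # magnitude ignores the sign


for r in (0.9, -0.9):
    direction, magnitude = describe(r)
    print(f"r = {r:+.1f}: direction {direction}, magnitude {magnitude}")
```

Both coefficients report the same magnitude; only the direction differs.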
MEANING OF CORRELATION
r = -0.60 ?
1. Correlation does not mean causation.
2. When comparing coefficients, can we say 0.60 is twice as strong as
0.30?
Then how do we interpret the magnitude of the difference between
correlation coefficients?
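One widely used way to compare magnitudes, which these slides do not spell out, is to square r. The result (often called the coefficient of determination) is the proportion of variance the two variables share. The sketch below applies that idea to the coefficients in the question; the function name shared_variance is chosen only for this example.

```python
def shared_variance(r):
    """Proportion of variance shared by two variables with correlation r."""
    return r ** 2


for r in (0.30, 0.60, -0.60):
    print(f"r = {r:+.2f} -> r squared = {shared_variance(r):.2f}")

# Ratio of shared variance, not of the raw coefficients.
ratio = shared_variance(0.60) / shared_variance(0.30)
print(f"ratio of shared variance: {ratio:.1f}")
```

On this reading, r = 0.60 is not "twice as strong" as r = 0.30: it corresponds to about four times the shared variance (0.36 against 0.09), and the sign of r plays no part in the comparison.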
4.4. PEARSON PRODUCT MOMENT CORRELATION

$$ r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}} $$

Where:
N = the number of pairs of scores
Σxy = the sum of the products of paired scores
Σx = the sum of x scores
Σy = the sum of y scores
Σx² = the sum of squared x scores
Σy² = the sum of squared y scores

Sno   Variable X   Variable Y   X²     Y²    XY
1     10           5            100    25    50
2     11           8            121    64    88
3     12           10           144    100   120
4     13           11           169    121   143
5     15           9            225    81    135
6     19           15           361    225   285
7     11           6            121    36    66
8     12           8            144    64    96
9     8            4            64     16    32
10    13           12           169    144   156
Σ     124          88           1618   876   1171

r = 0.88
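The calculation in the table can be checked with a short script. This is a minimal sketch of the raw-score formula built from the sums defined under "Where"; the function name pearson_r is chosen for this example.

```python
from math import sqrt

# Variable X and Variable Y from the table.
x = [10, 11, 12, 13, 15, 19, 11, 12, 8, 13]
y = [5, 8, 10, 11, 9, 15, 6, 8, 4, 12]


def pearson_r(x, y):
    """Pearson's r from the raw-score sums (N, Σx, Σy, Σx², Σy², Σxy)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
print(f"r = {r:.2f}")  # r = 0.88
```

With these scores the numerator is 10(1171) - (124)(88) = 798 and the denominator is the square root of (804)(1016), which gives r of about 0.88.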
CORRELATION AND CAUSATION
CORRELATION DOES NOT IMPLY CAUSATION.
A CORRELATION SHOWS ONLY THAT THE VARIABLES ARE ASSOCIATED, NOT THAT
THEY HAVE A CAUSE-AND-EFFECT RELATIONSHIP.
EFFECTS OF LINEAR SCORE TRANSFORMATION
A linear transformation changes each raw score by adding a constant,
subtracting a constant, multiplying by a constant, or dividing by a
constant.
None of these changes to the raw scores influences the value of the
correlation coefficient, provided the multiplier or divisor is
positive; a negative one reverses the sign of r.
Another form of linear score transformation is converting the raw
scores into standard scores and then calculating the correlation
coefficient. In this instance, too, the correlation coefficient comes
out the same.
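The invariance described above can be checked numerically. The sketch below reuses the X and Y scores from the Pearson example and compares r before and after several transformations. The constants 5, 3 and 2 are arbitrary choices for illustration, and the helper names are made up for this example.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def z_scores(values):
    """Convert raw scores to standard scores."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


x = [10, 11, 12, 13, 15, 19, 11, 12, 8, 13]
y = [5, 8, 10, 11, 9, 15, 6, 8, 4, 12]

r_raw = pearson_r(x, y)
r_shifted = pearson_r([v + 5 for v in x], y)                  # add a constant
r_scaled = pearson_r([v * 3 for v in x], [v / 2 for v in y])  # multiply, divide
r_standard = pearson_r(z_scores(x), z_scores(y))              # standard scores
r_flipped = pearson_r([-v for v in x], y)                     # negative multiplier

for label, value in [("raw", r_raw), ("shifted", r_shifted),
                     ("scaled", r_scaled), ("standard", r_standard),
                     ("flipped", r_flipped)]:
    print(f"{label:>8}: {value:+.4f}")
```

The first four values agree up to rounding error, while multiplying one variable by -1 keeps the magnitude and reverses the sign.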
FACTORS INFLUENCING CORRELATION
Sample Size: with a smaller sample, r is less stable; with a larger
sample, r becomes more reliable.
Nature of sample: r depends on the sample; it is not fixed for any
two variables.
Linear relationship: Pearson’s r assesses only linear relationships.
Variability of Scores: a restricted range (low variability) in either
variable tends to lower the correlation coefficient.
Discontinuity: when values are missing from part of the range of one
or both variables, the correlation coefficient tends to overestimate
the strength of the relationship.
Outliers: extreme values can raise or lower r substantially.
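To illustrate how sensitive r can be to a single outlier, the sketch below takes the ten pairs from the Pearson example and appends one extreme pair. The added point (30, 1) is hypothetical and chosen only to make the effect visible.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the raw-score sums."""
    n = len(x)
    num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    den_x = n * sum(a * a for a in x) - sum(x) ** 2
    den_y = n * sum(b * b for b in y) - sum(y) ** 2
    return num / sqrt(den_x * den_y)


x = [10, 11, 12, 13, 15, 19, 11, 12, 8, 13]
y = [5, 8, 10, 11, 9, 15, 6, 8, 4, 12]

r_without = pearson_r(x, y)
# Hypothetical outlier: a very high X paired with a very low Y.
r_with = pearson_r(x + [30], y + [1])

print(f"without outlier: r = {r_without:+.2f}")
print(f"with outlier:    r = {r_with:+.2f}")
```

One extra pair is enough to turn a strong positive coefficient into a negative one here, which is why a scatter plot is worth inspecting before interpreting r.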
SPEARMAN RANK CORRELATION
Unlike Pearson correlation, which assesses linear relationships, Spearman
correlation is based on the ranks of the data points rather than their actual
values. It’s particularly useful when dealing with ordinal or non-normally
distributed data.
$$ r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)} $$

Where,
r_s = Spearman’s rank-order correlation
D = difference between the pair of ranks of X and Y
n = the number of pairs of ranks
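The rank-difference calculation can be sketched as follows, using the standard formula r_s = 1 - 6ΣD² / (n(n² - 1)). The five pairs of scores are made up for illustration, and the simple ranking helper assumes there are no tied scores (ties would need averaged ranks, which this sketch does not handle).

```python
def ranks(values):
    """Rank scores from 1 (lowest) upward; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman's rank-order correlation from rank differences D."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical scores, chosen so that no value repeats.
x = [10, 20, 30, 40, 50]
y = [12, 25, 22, 48, 60]

rs = spearman_rs(x, y)
print(f"ranks of X: {ranks(x)}")  # [1, 2, 3, 4, 5]
print(f"ranks of Y: {ranks(y)}")  # [1, 3, 2, 4, 5]
print(f"r_s = {rs:.2f}")          # r_s = 0.90
```

Only the second and third pairs differ in rank, so ΣD² = 2 and r_s = 1 - 12/120 = 0.90.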
APPLICATIONS OF CORRELATION
Assessing validity and reliability
Grouping related variables
Serving as the basis for further statistical analyses such as
regression
LINEAR REGRESSION ANALYSIS/SIMPLE REGRESSION
Linear regression is a statistical method used to model the
relationship between a dependent variable (often denoted as Y) and
one or more independent variables (often denoted as X).
It assumes a linear relationship between the independent variables
and the dependent variable.
The goal of linear regression is to find the best-fitting line or plane
that describes the relationship between the variables.
Start by placing your data into a table. For this example, let us assume that
we have the following data: (4.1, 2.2) (6.5, 4.5) (12.6, 10.4)
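A best-fitting line for these three points can be sketched with ordinary least squares, which is the usual method for simple regression, although the slides do not name it. The line is written as Y = a + bX, where the slope b is the sum of the products of deviations divided by the sum of squared X deviations, and the intercept a is the mean of Y minus b times the mean of X. The function and variable names are chosen for this example.

```python
# Data pairs (X, Y) from the example.
points = [(4.1, 2.2), (6.5, 4.5), (12.6, 10.4)]


def fit_line(points):
    """Least-squares slope and intercept for Y = intercept + slope * X."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(points)
# Residual = observed Y minus the Y predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in points]

print(f"Y = {intercept:.3f} + {slope:.3f} X")
for (x, y), e in zip(points, residuals):
    print(f"X = {x:>4}: observed {y:>4}, residual {e:+.3f}")
```

The residuals of a least-squares line sum to zero, which gives a quick check on the arithmetic.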