Correlation

Uploaded by

Akshay Batra

Unit 1

Correlation
MBA (Sem-01)
Session 2022-2023

Examples of Correlation

Is there an association between:

• Children’s IQ and parents’ IQ?
• Degree of social trust and number of memberships in voluntary associations?
• Urban growth and air quality violations?
• GRA funding and number of publications by Ph.D. students?
• Number of police patrols and number of crimes?
• Grade on an exam and time spent on the exam?
Correlation

• Correlation coefficient: a statistical index of the degree to which two variables are associated, or related.
• We can determine whether one variable is related to another by seeing whether scores on the two variables covary, that is, whether they vary together.
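
One way to make "covary" concrete is the sample covariance, which is positive when paired scores tend to sit on the same side of their means. The sketch below is illustrative only: the function name and the data are assumptions made up for this example, not part of the original slides.

```python
def covariance(xs, ys):
    """Sample covariance of paired scores (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Multiply each pair of deviations from the means, then average.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical scores, invented for illustration.
hours_studied = [1, 2, 3, 4, 5]
exam_grade = [52, 60, 61, 70, 77]

cov_xy = covariance(hours_studied, exam_grade)
print(cov_xy)  # a positive value: the two sets of scores vary together
```

Covariance depends on the units of measurement, which is why the correlation coefficient is usually reported instead.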
Scatterplot

• The relationship between any two variables can be portrayed graphically on an x- and y-axis.
• Each subject i has a pair of scores (x_i, y_i). When the scores for an entire sample are plotted, the result is called a scatterplot.
[Figure: example scatterplot]
Direction of the relationship

Variables can be positively or negatively correlated.

Positive correlation: as the value of one variable increases, the value of the other variable also increases.

Negative correlation: as the value of one variable increases, the value of the other variable decreases.
Strength of the relationship

The magnitude of a correlation:

• Is indicated by its numerical value, ignoring the sign.
• Expresses the strength of the linear relationship between the variables.
[Figure: scatterplots illustrating r = 1.00, r = .85, r = .42, and r = .17]
Pearson’s correlation coefficient

There are many kinds of correlation coefficients, but the most commonly used measure of correlation is Pearson’s correlation coefficient (r).

• The Pearson r ranges from -1 to +1.
• The sign indicates the direction.
• The numerical value indicates the strength.
• Perfect correlation: -1 or +1
• No correlation: 0
• A correlation of zero indicates that the variables are not linearly related.
• However, it is possible that they are related in a curvilinear fashion.
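
As a rough illustration of how r behaves at its extremes, the sketch below computes Pearson's r from the usual deviation-score formula, r = Σ(x - x̄)(y - ȳ) / sqrt(Σ(x - x̄)² · Σ(y - ȳ)²). The helper name `pearson_r` and the data are assumptions introduced for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross-product deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration.
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x along a line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises along a line

print(r_positive, r_negative)
```

The sign carries the direction and the absolute value carries the strength, so the two results have the same strength and opposite directions.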
Standardized relationship

• The Pearson r can be thought of as a standardized measure of the association between two variables.
• That is, a correlation of .64 between two variables represents the same strength of relationship as a correlation of .64 between two entirely different variables.
• The metric by which we gauge associations is a standard metric.
• Also, it turns out that correlation can be thought of as a relationship between two variables that have first been standardized, or converted to z scores.
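
The z-score view can be sketched as follows: convert each variable to z scores, multiply the paired z scores, and divide the sum by n - 1. This is a minimal sketch assuming the sample standard deviation (n - 1 in the denominator); the function names and data are hypothetical.

```python
import math

def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def r_from_z(xs, ys):
    """Pearson's r as the sum of z-score cross-products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical measurements, invented for illustration.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 75, 86]
heights_in = [h / 2.54 for h in heights_cm]  # same heights in other units

r_cm = r_from_z(heights_cm, weights_kg)
r_in = r_from_z(heights_in, weights_kg)
print(r_cm, r_in)  # the two values agree up to floating-point rounding
```

Because standardizing removes the original units, changing centimetres to inches leaves r unchanged, which is what makes it a standard metric.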
Correlation Represents a Linear Relationship
• Correlation involves a linear relationship.
• "Linear" refers to the fact that, when we graph two correlated variables, the points tend to fall along a line.
• Correlation tells you how much two variables are linearly related, not necessarily how much they are related in general.
• In some cases two variables have a strong, or even perfect, relationship, yet the relationship is not at all linear. In these cases, the correlation coefficient might be zero.
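
A small constructed case shows a perfect but non-linear relationship with r equal to zero: y is exactly x squared, with x values placed symmetrically around zero. The helper `pearson_r` and the data are assumptions made for this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross-product deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# y is completely determined by x, but the relationship is a curve (a parabola).
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curve = pearson_r(x, y)
print(r_curve)  # zero: there is no linear trend for r to pick up
```

Knowing x tells you y exactly here, so a zero r cannot be read as "no relationship", only as "no linear relationship".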
Coefficient of Determination r²

• The proportion of shared variance, often expressed as a percentage, is given by the square of the correlation coefficient, r².
• Variance indicates the amount of variability in a set of data.
• If the two variables are correlated, we can account for some of the variance in one variable by the other variable.
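
As a quick arithmetic sketch, squaring a few correlation values shows how fast shared variance shrinks as r drops. The r values below are chosen only for illustration, and the variable names are assumptions.

```python
# Illustrative correlation values.
example_rs = [1.00, 0.85, 0.42, 0.17]

# r squared, expressed as a percentage of shared variance.
shared_variance_pct = {r: (r ** 2) * 100 for r in example_rs}

for r, pct in shared_variance_pct.items():
    print(f"r = {r:.2f} -> r squared = {r ** 2:.4f} ({pct:.1f}% shared variance)")
```

A correlation of .42, for example, corresponds to less than 18% shared variance, so a moderate r leaves most of the variance unaccounted for.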
Statistical significance of r

• A correlation coefficient calculated on a sample is statistically significant if a value that far from zero would have a very low probability of occurring when the correlation in the population is zero.
• In other words, to test r for significance, we test the null hypothesis that the correlation in the population (ρ) is zero by computing a t statistic.
• H0: ρ = 0
• HA: ρ ≠ 0
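
The slides do not give the formula, but the standard t statistic for this test is t = r · sqrt(n - 2) / sqrt(1 - r²), with n - 2 degrees of freedom. The sketch below applies it to a hypothetical sample; the values of r and n are assumptions chosen for illustration.

```python
import math

def t_for_r(r, n):
    """t statistic for testing that the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical sample result, invented for illustration.
r = 0.50
n = 27

t_stat = t_for_r(r, n)
degrees_of_freedom = n - 2
print(t_stat, degrees_of_freedom)
```

The resulting t is then compared with the critical value from a t table for n - 2 degrees of freedom. With 25 degrees of freedom the two-tailed .05 critical value is about 2.06, so this hypothetical r would be judged significant.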
Some considerations in interpreting correlation

1. Correlation represents a linear relationship.

• Correlation tells you how much two variables are linearly related, not necessarily how much they are related in general.
• In some cases two variables have a strong, even perfect, relationship that is not linear. For example, there can be a curvilinear relationship.
Some considerations in interpreting correlation

2. Restricted range (Slide: Truncated)

• Correlation can be deceiving if the full information about each of the variables is not available. A correlation between two variables is smaller if the range of one or both variables is truncated.
• Because the full variation of one variable is not available, there is not enough information to see how the two variables covary.
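
The effect of a truncated range can be sketched with constructed data: y follows x with a fixed amount of scatter, and r is computed once over the full range of x and once over a narrow slice. The helper `pearson_r`, the data, and the cut-off values are all assumptions made for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross-product deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Constructed data: y tracks x, with scatter of +10 or -10 around the line.
x_full = list(range(100))
y_full = [x + (10 if x % 2 == 0 else -10) for x in x_full]
r_full = pearson_r(x_full, y_full)

# Keep only the cases whose x falls in a narrow band (a truncated range).
kept = [(x, y) for x, y in zip(x_full, y_full) if 40 <= x <= 59]
x_cut = [x for x, _ in kept]
y_cut = [y for _, y in kept]
r_truncated = pearson_r(x_cut, y_cut)

print(r_full, r_truncated)  # the truncated sample gives a clearly smaller r
```

The scatter around the line is the same in both samples. Only the spread of x has shrunk, which is enough to pull r down.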
Some considerations in interpreting correlation

3. Outliers

• Outliers are scores that deviate markedly from the remainder of the data.
• On-line outliers artificially inflate the correlation coefficient.
• Off-line outliers artificially deflate the correlation coefficient.
On-line outlier

• An outlier that falls near where the regression line would normally fall increases the size of the correlation coefficient, as seen below.
• r = .457
Off-line outliers

• An outlier that falls some distance away from the original regression line would decrease the size of the correlation coefficient, as seen below:
• r = .336
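
Both effects can be reproduced with a small constructed data set: one extra point far out along the trend raises r, and one extra point far from the trend lowers it. The data and the helper `pearson_r` are assumptions for this sketch and are unrelated to the r values quoted on the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross-product deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical base data with a fairly strong positive trend.
base_x = [1, 2, 3, 4, 5, 6]
base_y = [2, 1, 4, 3, 6, 5]
r_base = pearson_r(base_x, base_y)

# On-line outlier: an extreme point that continues the existing trend.
r_on_line = pearson_r(base_x + [20], base_y + [20])

# Off-line outlier: a point far above the trend at a typical x value.
r_off_line = pearson_r(base_x + [3.5], base_y + [12])

print(r_base, r_on_line, r_off_line)
```

In this sketch a single added case moves r substantially in either direction, which is why it helps to inspect the scatterplot before interpreting the coefficient.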
Correlation and Causation

• The fact that two things go together does not necessarily mean that one causes the other.
• One variable can be strongly related to another, yet not cause it. Correlation does not imply causality.

• When there is a correlation between X and Y:
• Does X cause Y, does Y cause X, or both?
• Or is there a third variable Z causing both X and Y, so that X and Y are correlated?
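
The third-variable case can be sketched with constructed data in which Z drives both X and Y and nothing else links them. Subtracting Z works as a way of removing its influence only because the data are built as Z plus unrelated noise; that is a simplifying assumption of this example, as are the variable names and the helper `pearson_r`.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squared and cross-product deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Constructed data: Z is the common cause; the two noise patterns are unrelated.
z = list(range(20))
noise_x = [1 if i % 2 == 0 else -1 for i in range(20)]   # +1, -1, +1, -1, ...
noise_y = [1 if i % 4 < 2 else -1 for i in range(20)]    # +1, +1, -1, -1, ...
x = [zi + a for zi, a in zip(z, noise_x)]
y = [zi + b for zi, b in zip(z, noise_y)]

r_xy = pearson_r(x, y)  # strong, although X never influences Y

# Remove Z's contribution from each variable and correlate what is left.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_after_removing_z = pearson_r(x_without_z, y_without_z)

print(r_xy, r_after_removing_z)
```

The strong correlation between X and Y disappears once Z is taken out, which is the pattern to expect when a third variable is responsible for the association.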
