Correlation
Concept of Correlation
• “Correlation analysis contributes to the understanding of economic behaviors, aids in
  locating the critically important variables on which others depend, may reveal to the
  economist the connections by which disturbances spread and suggest to him the paths
  through which stabilizing forces may become effective.”—W.A. Neiswanger
• According to L.R. Connor, “If two or more quantities vary in sympathy so that movements
  in one tend to be accompanied by corresponding movements in others, then they are said to
  be correlated.”
• In the words of Croxton and Cowden, “When the relationship is of a quantitative nature,
  the appropriate statistical tool for discovering and measuring the relationship and expressing
  it in a brief formula is known as correlation.”
• Correlation refers to the statistical relationship between two variables. It measures the
  extent to which they are linearly related. For example, the height and weight of a
  person are related: taller people tend to be heavier than shorter people.
• A linear relationship means that the two variables change together at a constant rate.
• In many real-world situations the data involve two variables, such as income and
  expenditure, prices and demand, or height and weight. A distribution with two
  variables is referred to as a bivariate distribution.
• Correlation is summarized by a unit-free measure called the correlation coefficient, which
  ranges from -1 to +1 and is denoted by r. Statistical significance is indicated with a p-value.
  Therefore, correlations are typically reported with two key numbers: r and p.
Types of Correlation
• A positive correlation is a relationship between two variables in which both
  variables move in the same direction: one variable increases as the other
  increases, or one decreases as the other decreases. Examples include the
  relationship between the price and supply of a commodity, income and
  expenditure, and the price of crude oil and the price of petrochemicals.
• A negative correlation is a relationship between two variables in which an increase in
  one variable is associated with a decrease in the other. For example, the relationship
  between the price and demand, temperature and sale of woolen garments, etc.
• Zero correlation: If there is no linear relationship between two series
  or variables, they are said to have zero or no correlation. It means
  that a change in one variable is not accompanied by any systematic
  change in the other. In such cases, the coefficient of correlation
  will be 0.
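The three types above can be illustrated with a small sketch. The data are hypothetical and chosen only so that each case comes out cleanly; `pearson_r` and `describe` are illustrative helper names, not part of the original material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sum_dxdy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - mean_x) ** 2 for x in xs)
    sum_dy2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)

def describe(r, tol=1e-9):
    """Label the sign of r; values within tol of 0 count as zero correlation."""
    if r > tol:
        return "positive"
    if r < -tol:
        return "negative"
    return "zero"

# Hypothetical series (assumed for illustration only)
price = [1, 2, 3, 4, 5]
supply = [2, 4, 6, 8, 10]       # rises with price
demand = [10, 8, 6, 4, 2]       # falls as price rises
unrelated = [10, 5, 15, 5, 10]  # no linear pattern against price

for name, series in [("supply", supply), ("demand", demand), ("unrelated", unrelated)]:
    r = pearson_r(price, series)
    print(name, round(r, 4), describe(r))
```

Real data rarely give exactly +1, -1, or 0; values in between indicate weaker or stronger linear association.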
Methods of Correlation
• Karl Pearson’s coefficient of correlation (covariance method).
• Two-way frequency table (bivariate correlation method).
• Rank method.
Karl Pearson’s Coefficient of Correlation (Covariance Method)
• A mathematical method for measuring the intensity or magnitude of the linear
  relationship between two variable series was suggested by Karl Pearson (1857-1936),
  a great British biometrician and statistician. It is by far the most widely used
  method in practice.
• Karl Pearson’s measure, known as the Pearsonian correlation coefficient between two
  variables (series) X and Y, is usually denoted by r(X, Y), r_xy, or simply r. It is a
  numerical measure of the linear relationship between them and is defined as the ratio of
  the covariance between X and Y, written Cov(X, Y), to the product of the standard
  deviations of X and Y.
• Karl Pearson’s correlation coefficient is also known as the product moment
  correlation coefficient.
Formula
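Following the definition above, r = Cov(X, Y) / (σ_X · σ_Y). A minimal sketch of that ratio is given below. It assumes population-style covariance and standard deviation (dividing by n); dividing by n - 1 throughout would give the same r, because the factor cancels. The height and weight figures are hypothetical, not taken from the original problems.

```python
from math import sqrt

def covariance(xs, ys):
    """Cov(X, Y): mean product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def std_dev(xs):
    """Standard deviation, computed as the square root of Cov(X, X)."""
    return sqrt(covariance(xs, xs))

def pearson_r(xs, ys):
    """Ratio of the covariance to the product of the standard deviations."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Hypothetical heights (cm) and weights (kg), assumed for illustration
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 75, 86]

r = pearson_r(heights, weights)
print(round(r, 4))
```

Because the n in the covariance and in the two standard deviations cancels, the same value can be obtained directly from the sums of deviations: r = Σdxdy / sqrt(Σdx² · Σdy²), where dx and dy are deviations from the respective means.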
Sample Problem
Problems
Ans) X = 528, Y = 340, Σdx² = 480, Σdy² = 1828, Σdxdy = -519, r = -0.559
Ans) X = 311, Y = 257, Σdx² = 203, Σdy² = 163, Σdxdy = 179, r = -0.9956
BIVARIATE METHOD FOR GROUPED DATA
SR: am = 200; AE: am = 12.5
Fi: am = 450; fe: am = 17.5
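For grouped data, the correlation coefficient can be computed from a two-way frequency table by weighting each pair of class midpoints by its cell frequency. The sketch below is one way to do that, under stated assumptions: deviations are taken from the actual means rather than from an assumed mean (r is unchanged by a shift of origin or a change of scale, so a step-deviation computation should give the same value), and the function name, midpoints, and frequencies are hypothetical.

```python
from math import sqrt

def grouped_pearson_r(x_mid, y_mid, freq):
    """
    Correlation from a two-way frequency table.
    x_mid[i], y_mid[j]: class midpoints of X and Y.
    freq[i][j]: number of observations in X class i and Y class j.
    """
    rows, cols = len(x_mid), len(y_mid)
    n = sum(sum(row) for row in freq)
    # Marginal frequencies of X (row totals) and Y (column totals)
    fx = [sum(freq[i]) for i in range(rows)]
    fy = [sum(freq[i][j] for i in range(rows)) for j in range(cols)]
    mean_x = sum(f * x for f, x in zip(fx, x_mid)) / n
    mean_y = sum(f * y for f, y in zip(fy, y_mid)) / n
    # Frequency-weighted sums of products and squares of deviations
    sum_fdxdy = sum(
        freq[i][j] * (x_mid[i] - mean_x) * (y_mid[j] - mean_y)
        for i in range(rows) for j in range(cols)
    )
    sum_fdx2 = sum(f * (x - mean_x) ** 2 for f, x in zip(fx, x_mid))
    sum_fdy2 = sum(f * (y - mean_y) ** 2 for f, y in zip(fy, y_mid))
    return sum_fdxdy / sqrt(sum_fdx2 * sum_fdy2)

# Hypothetical table: 3 classes of X (rows) by 2 classes of Y (columns)
x_mid = [10, 20, 30]
y_mid = [5, 15]
freq = [
    [4, 1],
    [2, 3],
    [1, 5],
]
print(round(grouped_pearson_r(x_mid, y_mid, freq), 4))
```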
RANK CORRELATION
CONCEPT OF RANK CORRELATION
• A rank correlation is any of several statistics that measure an ordinal association:
  the relationship between rankings of different ordinal variables or different rankings
  of the same variable.
• Here, a "ranking" is the assignment of the ordering labels "first", "second", "third",
  etc. to different observations of a particular variable.
• A rank correlation coefficient measures the degree of similarity between two
  rankings, and can be used to assess the significance of the relation between them.
• Two common nonparametric methods of significance that use rank correlation are
  the Mann–Whitney U test and the Wilcoxon signed-rank test.
• Rank correlation statistics include:
  • Spearman's ρ
  • Kendall's τ
  • Goodman and Kruskal's γ
  • Somers' D
An increasing rank correlation coefficient implies increasing agreement between rankings. The
coefficient is inside the interval [−1, 1] and assumes the value:
1 : if the agreement between the two rankings is perfect; the two rankings are the same.
0 : if the rankings are completely independent.
−1: if the disagreement between the two rankings is perfect; one ranking is the reverse of the
other.
Spearman's rank correlation coefficient is given by ρ = 1 − 6Σd² / (n(n² − 1)),
where d is the difference between the pair of ranks of the same individual in the
two characteristics and n is the number of pairs.
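As a sketch of how d and n are used, the function below applies Spearman's formula for untied ranks, ρ = 1 − 6Σd² / (n(n² − 1)). The two rankings are hypothetical (for instance, two judges ranking the same five contestants), and the function name is illustrative.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation for two rankings without ties."""
    n = len(rank_x)
    # d is the difference between the two ranks given to the same individual
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks given by two judges to the same five contestants
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

print(round(spearman_rho(judge_a, judge_b), 4))
print(spearman_rho(judge_a, judge_a))        # identical rankings
print(spearman_rho(judge_a, judge_a[::-1]))  # one ranking reversed
```

Identical rankings give Σd² = 0 and therefore ρ = 1, while a fully reversed ranking gives ρ = −1, matching the interpretation of the interval [−1, 1] above.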
Repeated Rank Correlation
• When two or more items have equal values (i.e., a tie) it is difficult to give ranks to
  them. In such cases the items are given the average of the ranks they would have
  received.
• For example, take the series 50, 70, 80, 80, 85, 90. Rank 1 is assigned to 90
  because it is the largest value, and rank 2 to 85. The value 80 occurs twice, so
  both items receive the same rank: the average of the ranks they would have been
  given had there been no repetition, i.e. (3 + 4) / 2 = 3.5. Rank 5 is then given
  to 70 and rank 6 to 50. Thus, the series and ranks of the items are:
  Value: 50   70   80   80   85   90
  Rank:  6    5    3.5  3.5  2    1
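The averaging rule for tied values can be sketched as follows. The function name is illustrative, and the ranking direction (largest value gets rank 1) follows the example above.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the average rank."""
    # Indices of the values, ordered from largest to smallest
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of items equal to the one at 'pos'
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end hold ranks pos+1..end+1; tied items share their average
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

series = [50, 70, 80, 80, 85, 90]
print(average_ranks(series))  # [6.0, 5.0, 3.5, 3.5, 2.0, 1.0]
```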
Formula for Repeated Ranks
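The formula itself is not preserved here. The tie-corrected form commonly given in textbooks is shown below as a reference sketch, not necessarily the exact expression the original used. It assumes m_i denotes the number of items sharing the i-th repeated value, with one correction term added for each group of tied values in either series.

```latex
\rho = 1 - \frac{6\left[\sum d^{2} + \frac{1}{12}\sum_{i}\left(m_i^{3} - m_i\right)\right]}{n\left(n^{2} - 1\right)}
```

For a value repeated twice (m = 2), as with 80 in the series 50, 70, 80, 80, 85, 90, the correction term is (2³ − 2) / 12 = 0.5.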