Curve Fitting & Correlation Techniques

The document discusses curve fitting using the method of least squares. It provides 3 examples of using the method to fit straight lines and parabolas to data points. The key steps are: 1) Assume a model curve (e.g. straight line y=a+bx or parabola y=a+bx+cx^2) 2) Derive "normal equations" by minimizing the sum of squared errors 3) Solve the normal equations to determine the curve parameters (a, b, c etc.) 4) Check how well the fitted curve matches the original data points.

Uploaded by

C1DA20ME030 Hemanth S


UNIT-3

Statistical Techniques: Curve fitting by the method of least squares: $y = a + bx$, $y = a + bx + cx^2$ and $y = ab^x$. Correlation: Karl Pearson's coefficient of correlation. Regression analysis: lines of regression (without proof); problems.

3 Curve fitting: Least Squares Methods

Curve fitting is a problem that arises very frequently in science and engineering.
The process of constructing an approximating curve $y = f(x)$ that best fits a given discrete set of points $(x_i, y_i)$, $i = 1, 2, 3, \ldots, n$, is called curve fitting.

Principle of Least Squares:

The principle of least squares (PLS) is one of the most popular methods for finding the curve of best fit to a given data set $(x_i, y_i)$, $i = 1, 2, 3, \ldots, n$.
Let $y = f(x)$ be the equation of the curve to be fitted to the given set of points $P_1(x_1, y_1), P_2(x_2, y_2), P_3(x_3, y_3), \ldots, P_n(x_n, y_n)$. The error (or residual) at the $i$-th point is
$$e_i = y_i - f(x_i), \qquad i = 1, 2, \ldots, n.$$
Squaring each error $e_i$ and adding, we get
$$E = e_1^2 + e_2^2 + e_3^2 + \cdots + e_n^2 = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left[ y_i - f(x_i) \right]^2 \qquad \text{(i)}$$
The curve of best fit is that for which $E$ is minimum. This is called the principle of least squares (PLS).
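As a small illustration of the quantity $E$ in (i), the sketch below evaluates the sum of squared residuals for two candidate lines. The function name `sum_squared_errors` and the choice of candidate lines are assumptions made for this illustration; the data points are those of Example 1 later in this unit.

```python
def sum_squared_errors(f, xs, ys):
    """Return E = sum over i of (y_i - f(x_i))^2 for a candidate curve f."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Data of Example 1 in this unit.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 2.9, 4.8, 6.7, 8.6]

# Two candidate straight lines: the second has a slightly different slope.
E_exact = sum_squared_errors(lambda x: 1 + 1.9 * x, xs, ys)
E_other = sum_squared_errors(lambda x: 1 + 2.0 * x, xs, ys)

# The line with the smaller E is the better fit in the least squares sense.
print(E_exact < E_other)
```

Here the first line passes through every data point, so its $E$ is zero up to rounding, while the second line leaves residuals of 0, 0.1, 0.2, 0.3 and 0.4 in magnitude.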

Some standard approximating curves :

1. $y = a + bx$ (straight line)
2. $y = a + bx + cx^2$ (parabola or quadratic curve)
3. $y = ab^x$ (exponential curve)

3.1 Fitting a straight line by least squares

Let the straight line
$$y = f(x) = a + bx \qquad \text{(ii)}$$
be fitted to the given set of data points $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$.
To determine the two unknowns $a$ (intercept) and $b$ (slope) in (ii), use the PLS criterion that
$$E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left[ y_i - f(x_i) \right]^2 = \sum_{i=1}^{n} \left( y_i - a - bx_i \right)^2 \qquad \text{(iii)}$$
is minimum. Differentiating (iii) partially with respect to $a$ and $b$ and equating to zero, we get
$$\frac{\partial E}{\partial a} = 0 \;\Rightarrow\; \sum_{i=1}^{n} 2\,(y_i - a - bx_i)(-1) = 0 \;\Rightarrow\; \sum_{i=1}^{n} (y_i - a - bx_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n} y_i = na + b\sum_{i=1}^{n} x_i,$$
$$\frac{\partial E}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} 2\,(y_i - a - bx_i)(-x_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n} (x_i y_i - ax_i - bx_i^2) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2.$$
Thus the two unknown parameters $a$ and $b$ of Eq. (ii) are determined from the two equations
$$\sum y = na + b\sum x \qquad \text{(iv)}$$
$$\sum xy = a\sum x + b\sum x^2 \qquad \text{(v)}$$
Equations (iv) and (v) are known as the "normal equations" for fitting a straight line $y = a + bx$.
3.2 Fitting a quadratic curve (parabola) by the method of least squares
Assume that the parabola $y = a + bx + cx^2$ is to be fitted to the data according to the PLS. The three unknown parameters $a$, $b$, $c$ are then determined from the following three normal equations, obtained in the same way as above:
$$\sum y = na + b\sum x + c\sum x^2 \qquad \text{(iii)}$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3 \qquad \text{(iv)}$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \qquad \text{(v)}$$
3.3 Fitting a nonlinear curve by least squares
Assume that $y = ab^x$.
Taking logarithms on both sides, we get
$$\log y = \log a + x \log b,$$
that is,
$$Y = A + BX \qquad \text{(i)}$$
where $Y = \log y$, $A = \log a$, $B = \log b$ and $X = x$.
Equation (i) is a linear equation in $Y$ and $X$. For estimating $A$ and $B$, the normal equations are
$$\sum Y = nA + B\sum X \quad \text{and} \quad \sum XY = A\sum X + B\sum X^2,$$
where $n$ is the number of pairs of values of $x$ and $y$.
Finally, $a = \operatorname{antilog}(A)$ and $b = \operatorname{antilog}(B)$.
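The log-linearisation above can be sketched in code as follows. The function name `fit_exponential` and the test data (generated exactly from $y = 2 \cdot 3^x$) are assumptions made for this illustration; note that minimising the squared error in $\log y$ is not the same as minimising it in $y$ itself, so this is the linearised fit described in the text, not a direct nonlinear fit.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * b**x by a straight-line fit of Y = log10(y) against X = x."""
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive to take logarithms")
    n = len(xs)
    Ys = [math.log10(y) for y in ys]
    sX = sum(xs)
    sY = sum(Ys)
    sXY = sum(x * Y for x, Y in zip(xs, Ys))
    sXX = sum(x * x for x in xs)
    # Solve the two normal equations for A and B.
    B = (n * sXY - sX * sY) / (n * sXX - sX * sX)
    A = (sY - B * sX) / n
    # a = antilog(A), b = antilog(B) for base-10 logarithms.
    return 10 ** A, 10 ** B


# Hypothetical data lying exactly on y = 2 * 3**x.
a, b = fit_exponential([0, 1, 2, 3], [2, 6, 18, 54])
print(round(a, 6), round(b, 6))
```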

Example 1. By the method of least squares, find the straight line that best fits the following data points:

x    0     1     2     3     4
y    1.0   2.9   4.8   6.7   8.6

Solution: Let the line of best fit be $y = a + bx$ ……………..(i)

where $a$ and $b$ are constants to be determined from the normal equations
$\sum y = na + b\sum x$ …………………(ii)
$\sum xy = a\sum x + b\sum x^2$ …………………(iii)

The values of $\sum x$, $\sum y$, $\sum xy$, $\sum x^2$ are calculated from the following table:

x      y      xy      x^2
0      1.0    0       0
1      2.9    2.9     1
2      4.8    9.6     4
3      6.7    20.1    9
4      8.6    34.4    16
Sum:   $\sum x = 10$,  $\sum y = 24$,  $\sum xy = 67$,  $\sum x^2 = 30$

Here $n = 5$ (number of pairs).
The normal equations become
$24 = 5a + 10b$ …………………(iv)
$67 = 10a + 30b$ …………………(v)
Solving (iv) and (v), we get $a = 1$ and $b = 1.9$.
Substituting in Eq. (i), the line of best fit is $y = 1 + 1.9x$.
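The hand calculation in Example 1 can be checked with a short routine that forms the four sums and solves the two normal equations. This is a minimal sketch; the function name `fit_line` is an assumption for this illustration and not part of the original notes.

```python
def fit_line(xs, ys):
    """Solve the normal equations for y = a + b*x and return (a, b)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det   # slope
    a = (sy - b * sx) / n           # intercept, from sum(y) = n*a + b*sum(x)
    return a, b


# Data of Example 1.
a, b = fit_line([0, 1, 2, 3, 4], [1.0, 2.9, 4.8, 6.7, 8.6])
print(round(a, 4), round(b, 4))
```

The closed-form expressions used for `a` and `b` are simply the solution of the two normal equations by elimination.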

Example 2. By the method of least squares, find the straight line that best fits the following data points:

x    1     2     3     4     5
y    14    27    40    55    68

Solution: Let the line of best fit be $y = a + bx$ ……………..(i)

The normal equations are
$\sum y = na + b\sum x$ …………………(ii)
$\sum xy = a\sum x + b\sum x^2$ …………………(iii)

The values of $\sum x$, $\sum y$, $\sum xy$, $\sum x^2$ are calculated from the following table:

x      y      xy      x^2
1      14     14      1
2      27     54      4
3      40     120     9
4      55     220     16
5      68     340     25
Sum:   $\sum x = 15$,  $\sum y = 204$,  $\sum xy = 748$,  $\sum x^2 = 55$

Here $n = 5$ (number of pairs).
The normal equations become
$204 = 5a + 15b$ …………………(iv)
$748 = 15a + 55b$ …………………(v)
Solving (iv) and (v), we get $a = 0$ and $b = 13.6$.
Substituting in Eq. (i), the line of best fit is $y = 13.6x$.

Example 3. If $P$ is the pull required to lift a load $W$ by means of a pulley block, find a linear law of the form $P = c + mW$ connecting $P$ and $W$, using the following data:

P    12    15    21     25
W    50    70    100    120

where $P$ and $W$ are taken in kg wt. Compute $P$ when $W = 150$ kg wt.

Solution: The line of best fit is given as $P = c + mW$ ………..(i)

The corresponding normal equations are
$\sum P = nc + m\sum W$ …………………(ii)
$\sum PW = c\sum W + m\sum W^2$ …………………(iii)

W      P      PW      W^2
50     12     600     2500
70     15     1050    4900
100    21     2100    10000
120    25     3000    14400
Sum:   $\sum W = 340$,  $\sum P = 73$,  $\sum PW = 6750$,  $\sum W^2 = 31800$

With $n = 4$, equations (ii) and (iii) become
$73 = 4c + 340m$
$6750 = 340c + 31800m$
i.e., $2c + 170m = 36.5$
$34c + 3180m = 675$.
On solving these equations, we get $m = 0.1879$ and $c = 2.2759$.
Substituting in Eq. (i), the line of best fit is $P = 2.2759 + 0.1879W$.
When $W = 150$ kg wt, $P = 2.2759 + 0.1879(150) \approx 30.46$ kg wt.

Example 4. Fit a second-degree parabola to the given data:

x    1    3    4    6    8    9    11    14
y    1    2    4    4    5    7    8     9

Solution: Let the parabola of best fit be $y = a + bx + cx^2$ ………..(i)

where $a$, $b$, $c$ are constants to be determined.
The normal equations are
$\sum y = na + b\sum x + c\sum x^2$ …………………(ii)
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ …………………(iii)
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$ …………………(iv)

x     y    xy     x^2    x^2 y    x^3     x^4
1     1    1      1      1        1       1
3     2    6      9      18       27      81
4     4    16     16     64       64      256
6     4    24     36     144      216     1296
8     5    40     64     320      512     4096
9     7    63     81     567      729     6561
11    8    88     121    968      1331    14641
14    9    126    196    1764     2744    38416
Sum:  $\sum x = 56$,  $\sum y = 40$,  $\sum xy = 364$,  $\sum x^2 = 524$,  $\sum x^2 y = 3846$,  $\sum x^3 = 5624$,  $\sum x^4 = 65348$

Substituting these values in Eqs. (ii)-(iv) with $n = 8$, we get

$40 = 8a + 56b + 524c$
$364 = 56a + 524b + 5624c$
$3846 = 524a + 5624b + 65348c$
On solving these equations, we get
$a = 0.195$, $b = 0.77$, $c = -0.009$.
Substituting in (i), the parabola of best fit is $y = 0.195 + 0.77x - 0.009x^2$.
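The three normal equations of a parabola fit form a 3×3 linear system, which is tedious to solve by hand. The sketch below builds the power sums and solves the system by Gaussian elimination with partial pivoting; the helper names `solve_linear` and `fit_parabola` are assumptions for this illustration, and the data are those of Example 4.

```python
def solve_linear(A, rhs):
    """Solve a small square system A*z = rhs by Gaussian elimination."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_parabola(xs, ys):
    """Return (a, b, c) of the least squares parabola y = a + b*x + c*x**2."""
    s = [sum(x ** k for x in xs) for k in range(5)]                   # s[k] = sum of x^k, s[0] = n
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # sum of x^k * y
    A = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]
    a, b, c = solve_linear(A, t)
    return a, b, c


# Data of Example 4.
xs = [1, 3, 4, 6, 8, 9, 11, 14]
ys = [1, 2, 4, 4, 5, 7, 8, 9]
a, b, c = fit_parabola(xs, ys)
print(round(a, 3), round(b, 3), round(c, 4))
```

For data with large $x$ values the power sums grow quickly, which is one practical reason for the change of origin discussed in these notes.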

3.4 Change of Scale

If the data values are equispaced (with spacing $h$) and too large for convenient computation, the work may be simplified by shifting the origin as follows:
• When the number of observations ($n$) is odd, take the origin at the middle value of the table, say $x_0$, and substitute $u = \dfrac{x - x_0}{h}$.
• The $y$ values, if small, may be left unchanged; otherwise shift them to a value $y_0$ near the average of the $y$ data and substitute $v = \dfrac{y - y_0}{h}$.
• When the number of observations ($n$) is even, take the origin $x_0$ as the mean of the two middle values, with new spacing $\dfrac{h}{2}$, and substitute $u = \dfrac{x - x_0}{h/2}$.
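The effect of the substitution can be sketched for the simpler straight-line case: the fit is done in the shifted variables $u$ and $v$, then converted back to $x$ and $y$. The function name `fit_line_centred` and the separate scale factor `k` for $y$ are assumptions for this illustration (the notes use $h$ for both). When $n$ is odd and $x_0$ is the middle value, $\sum u = 0$, so the normal equations decouple into $A = \sum v / n$ and $B = \sum uv / \sum u^2$.

```python
def fit_line_centred(xs, ys, x0, h, y0=0.0, k=1.0):
    """Fit v = A + B*u with u = (x - x0)/h, v = (y - y0)/k, then return
    the intercept a and slope b of the equivalent line y = a + b*x."""
    us = [(x - x0) / h for x in xs]
    vs = [(y - y0) / k for y in ys]
    n = len(us)
    su, sv = sum(us), sum(vs)
    suv = sum(u * v for u, v in zip(us, vs))
    suu = sum(u * u for u in us)
    B = (n * suv - su * sv) / (n * suu - su * su)
    A = (sv - B * su) / n
    # Undo the substitution: y = y0 + k*(A + B*(x - x0)/h).
    b = k * B / h
    a = y0 + k * A - b * x0
    return a, b


# Year-indexed data (the same values appear in Example 7 of this unit).
xs = list(range(1989, 1998))
ys = [352, 356, 357, 358, 360, 361, 361, 360, 359]
a, b = fit_line_centred(xs, ys, x0=1993, h=1, y0=357, k=1)
print(round(b, 4))
```

Working with $u$ in the range $-4, \ldots, 4$ keeps the sums small, whereas sums of powers of the raw years run into the millions.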

Example 5. Fit a second-degree parabola to the following data:

x    0    1      2 (= x_0)    3      4
y    1    1.8    1.3          2.5    6.3

Solution: Here the number of observations is $n = 5$ (odd) and $h = 1$, so we take
$u = \dfrac{x - x_0}{h} = \dfrac{x - 2}{1} = x - 2$ and $v = y$, so that the parabola of fit $y = a + bx + cx^2$ ………….(i)
becomes $v = A + Bu + Cu^2$ …………………(ii)
The normal equations of (ii) are
$\sum v = nA + B\sum u + C\sum u^2$ …………………(iii)
$\sum uv = A\sum u + B\sum u^2 + C\sum u^3$ ……………(iv)
$\sum u^2 v = A\sum u^2 + B\sum u^3 + C\sum u^4$ ………….(v)

u = x-2    v = y    u^2    u^2 v    u^3    u^4    uv
-2         1        4      4        -8     16     -2
-1         1.8      1      1.8      -1     1      -1.8
0          1.3      0      0        0      0      0
1          2.5      1      2.5      1      1      2.5
2          6.3      4      25.2     8      16     12.6
Sum:  $\sum u = 0$,  $\sum v = 12.9$,  $\sum u^2 = 10$,  $\sum u^2 v = 33.5$,  $\sum u^3 = 0$,  $\sum u^4 = 34$,  $\sum uv = 11.3$

Equations (iii)-(v) become

$12.9 = 5A + 10C$
$11.3 = 10B$
$33.5 = 10A + 34C$
Solving these simultaneous equations, we get
$A = 1.48$, $B = 1.13$ and $C = 0.55$.
Equation (ii) yields
$v = A + Bu + Cu^2 = 1.48 + 1.13u + 0.55u^2$.
Hence $y = 1.48 + 1.13(x - 2) + 0.55(x - 2)^2$,
i.e., $y = 1.42 - 1.07x + 0.55x^2$,
which is the required parabola of best fit.
[Fig. 1. Plot of y versus x: given data]   [Fig. 2. Plot of y versus x: $y = 1.42 - 1.07x + 0.55x^2$]

Example 6. Fit a second-degree parabola to the following data:

x    1.0    1.5    2.0    2.5    3.0    3.5    4.0
y    1.1    1.3    1.6    2.0    2.7    3.4    4.1

Solution: The number of observations is odd and $h = 0.5$.

Taking $u = \dfrac{x - x_0}{h} = \dfrac{x - 2.5}{0.5} = 2x - 5$ and $v = y$, the parabola of fit
$y = a + bx + cx^2$ …………(i)
becomes $v = A + Bu + Cu^2$ …………………(ii)
The normal equations are
$\sum v = nA + B\sum u + C\sum u^2$ ……………….(iii)
$\sum uv = A\sum u + B\sum u^2 + C\sum u^3$ …………..(iv)
$\sum u^2 v = A\sum u^2 + B\sum u^3 + C\sum u^4$ ………(v)

x      y = v    u = 2x-5    u^2    u^3    u^4    uv      u^2 v
1.0    1.1      -3          9      -27    81     -3.3    9.9
1.5    1.3      -2          4      -8     16     -2.6    5.2
2.0    1.6      -1          1      -1     1      -1.6    1.6
2.5    2.0      0           0      0      0      0       0.0
3.0    2.7      1           1      1      1      2.7     2.7
3.5    3.4      2           4      8      16     6.8     13.6
4.0    4.1      3           9      27     81     12.3    36.9
Sum:  $\sum v = 16.2$,  $\sum u = 0$,  $\sum u^2 = 28$,  $\sum u^3 = 0$,  $\sum u^4 = 196$,  $\sum uv = 14.3$,  $\sum u^2 v = 69.9$

Using the table values, Eqs. (iii)-(v) reduce to
$16.2 = 7A + 0 + 28C \;\Rightarrow\; 7A + 28C = 16.2$
$14.3 = 0 + 28B + 0 \;\Rightarrow\; 28B = 14.3$
$69.9 = 28A + 0 + 196C \;\Rightarrow\; 28A + 196C = 69.9$
On solving the simultaneous equations, we get
$A = 2.07$, $B = 0.511$, $C = 0.061$.
Equation (ii) becomes $v = 2.07 + 0.511u + 0.061u^2$.
Putting $u = 2x - 5$, we get $y = 2.07 + 0.511(2x - 5) + 0.061(2x - 5)^2$,
i.e., $y = 1.04 - 0.198x + 0.244x^2$,
which is the parabola of best fit.

Example 7. Fit a second-degree parabola to the following data:

x    1989    1990    1991    1992    1993    1994    1995    1996    1997
y    352     356     357     358     360     361     361     360     359

Solution: The number of observations is odd and $h = 1$.

Taking $u = \dfrac{x - x_0}{h} = \dfrac{x - 1993}{1} = x - 1993$ and $v = \dfrac{y - y_0}{h} = \dfrac{y - 357}{1} = y - 357$, the parabola
of fit $y = a + bx + cx^2$ ………………….(i)
becomes $v = A + Bu + Cu^2$ …………………(ii)
The normal equations are
$\sum v = nA + B\sum u + C\sum u^2$ ……………….(iii)
$\sum uv = A\sum u + B\sum u^2 + C\sum u^3$ …………..(iv)
$\sum u^2 v = A\sum u^2 + B\sum u^3 + C\sum u^4$ ………(v)

x       y      u = x-1993    v = y-357    u^2    u^3    u^4    uv    u^2 v
1989    352    -4            -5           16     -64    256    20    -80
1990    356    -3            -1           9      -27    81     3     -9
1991    357    -2            0            4      -8     16     0     0
1992    358    -1            1            1      -1     1      -1    1
1993    360    0             3            0      0      0      0     0
1994    361    1             4            1      1      1      4     4
1995    361    2             4            4      8      16     8     16
1996    360    3             3            9      27     81     9     27
1997    359    4             2            16     64     256    8     32
Sum:  $\sum u = 0$,  $\sum v = 11$,  $\sum u^2 = 60$,  $\sum u^3 = 0$,  $\sum u^4 = 708$,  $\sum uv = 51$,  $\sum u^2 v = -9$

Using the table values, Eqs. (iii)-(v) reduce to

$11 = 9A + 60C \;\Rightarrow\; 9A + 60C = 11$
$51 = 60B \;\Rightarrow\; 60B = 51$
$-9 = 60A + 708C \;\Rightarrow\; 60A + 708C = -9$
On solving the simultaneous equations, we get
$A = \dfrac{694}{231}$, $B = \dfrac{17}{20}$, $C = -\dfrac{247}{924}$.
Equation (ii) becomes $v = \dfrac{694}{231} + \dfrac{17}{20}u - \dfrac{247}{924}u^2$.
Substituting $u = x - 1993$ and $v = y - 357$ into Eq. (ii), we get
$y = 357 + \dfrac{694}{231} + \dfrac{17}{20}(x - 1993) - \dfrac{247}{924}(x - 1993)^2$,
i.e., $y \approx -1063126.36 + 1066.3716x - 0.267316x^2$.
Because the expanded coefficients are large and nearly cancel, the form in powers of $(x - 1993)$ is the safer one to use for numerical evaluation.

Example 8. The pressure and volume of a gas are related by the equation $p v^{\gamma} = k$, where $\gamma$ and $k$ are constants. Fit this equation to the following data:

x = p (kg/cm^2)    0.5     1.0     1.5     2.0     2.5     3.0
y = v (litres)     1.62    1.00    0.75    0.62    0.52    0.46

Solution: Given $p v^{\gamma} = k$ …………………(i)

where $\gamma$ and $k$ are constants to be determined.
Taking logarithms, $\log_{10} p + \gamma \log_{10} v = \log_{10} k$
$\Rightarrow \gamma \log_{10} v = \log_{10} k - \log_{10} p$
$\Rightarrow \log_{10} v = \dfrac{1}{\gamma}\log_{10} k - \dfrac{1}{\gamma}\log_{10} p$,
i.e., $Y = A + BX$ ……………………..(ii)
where $Y = \log_{10} v$, $A = \dfrac{1}{\gamma}\log_{10} k$, $B = -\dfrac{1}{\gamma}$, $X = \log_{10} p$.
The normal equations of (ii) are
$\sum Y = nA + B\sum X$
$\sum XY = A\sum X + B\sum X^2$

p      v       X = log10 p    Y = log10 v    XY         X^2
0.5    1.62    -0.3010        0.2095         -0.0630    0.0906
1.0    1.00    0.0000         0.0000         0.0000     0.0000
1.5    0.75    0.1761         -0.1249        -0.0220    0.0310
2.0    0.62    0.3010         -0.2076        -0.0625    0.0906
2.5    0.52    0.3979         -0.2840        -0.1130    0.1583
3.0    0.46    0.4771         -0.3372        -0.1609    0.2276
Sum:  $\sum X = 1.0511$,  $\sum Y = -0.7442$,  $\sum XY = -0.4214$,  $\sum X^2 = 0.5981$

Here $n = 6$.
Substituting the values of $\sum X$, $\sum Y$, $\sum XY$, $\sum X^2$ into the normal equations, we get
$-0.7442 = 6A + 1.0511B$
$-0.4214 = 1.0511A + 0.5981B$
Solving these, we get $A = -0.0009$ and $B = -0.7030$.
Now $\gamma = -\dfrac{1}{B} = \dfrac{1}{0.7030} = 1.4224$.
Again, $A = \dfrac{1}{\gamma}\log_{10} k \;\Rightarrow\; \log_{10} k = \gamma A$
$\Rightarrow k = \operatorname{antilog}(\gamma A) = \operatorname{antilog}(-0.0012) = 0.997$.
Substituting the values of $\gamma$ and $k$ in Eq. (i), we get
$p\, v^{1.4224} = 0.997$,
which is the required curve.

Example 9. An experiment gave the following data:

v (ft/min)    350    400    500    600
t (min)       61     26     7      2.6

It is known that $v$ and $t$ are connected by $v = a t^{b}$. Find the best possible values of $a$ and $b$.
Solution: Given the nonlinear equation $v = a t^{b}$ ………..(i)
where $a$ and $b$ are constants to be determined.
Taking logarithms on both sides, we get
$\log_{10} v = \log_{10} a + b \log_{10} t$
$\Rightarrow Y = A + BX$, which is a linear equation,
where $Y = \log_{10} v$, $X = \log_{10} t$, $A = \log_{10} a$, $B = b$.
The normal equations are $\sum Y = nA + B\sum X$ and $\sum XY = A\sum X + B\sum X^2$.

v      t      X = log10 t    Y = log10 v    XY       X^2
350    61     1.7853         2.5441         4.542    3.187
400    26     1.4150         2.6021         3.682    2.002
500    7      0.8451         2.6990         2.281    0.714
600    2.6    0.4150         2.7782         1.153    0.172
Sum:  $\sum X = 4.4604$,  $\sum Y = 10.6234$,  $\sum XY = 11.658$,  $\sum X^2 = 6.075$

Here $n = 4$. The normal equations become
$10.6234 = 4A + 4.4604B$ and $11.658 = 4.4604A + 6.075B$.
Solving these, $A = 2.845$ and $B = b = -0.1697$.
Now $A = \log_{10} a \;\Rightarrow\; a = \operatorname{antilog}(A) = \operatorname{antilog}(2.845) = 699.8$.
Substituting the values of $a$ and $b$ into Eq. (i), we get
$v = 699.8\, t^{-0.1697}$.
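Examples 8 and 9 both reduce a power law to a straight line by taking logarithms of both variables. A minimal sketch of that reduction follows; the function name `fit_power_law` and the test data (generated exactly from $v = 3t^{-0.5}$) are assumptions for this illustration. Because it keeps full floating-point precision instead of four-figure logarithms, its output for the data of Example 9 may differ from the hand-computed values in the last digits.

```python
import math


def fit_power_law(ts, vs):
    """Fit v = a * t**b via a straight-line fit of log10(v) on log10(t)."""
    if any(t <= 0 for t in ts) or any(v <= 0 for v in vs):
        raise ValueError("all values must be positive to take logarithms")
    X = [math.log10(t) for t in ts]
    Y = [math.log10(v) for v in vs]
    n = len(X)
    sX, sY = sum(X), sum(Y)
    sXY = sum(x * y for x, y in zip(X, Y))
    sXX = sum(x * x for x in X)
    B = (n * sXY - sX * sY) / (n * sXX - sX * sX)  # B = b
    A = (sY - B * sX) / n                          # A = log10(a)
    return 10 ** A, B


# Hypothetical data lying exactly on v = 3 * t**(-0.5).
a, b = fit_power_law([1, 4, 9, 16], [3.0, 1.5, 1.0, 0.75])
print(round(a, 6), round(b, 6))
```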

Example10. By the method of least squares, find the straight line that best fits the following
data:
x 1 2 3 4 5
y 14 27 40 55 68

Home work problem.

3.4 Correlation
In a bivariate distribution, if a change in one variable is accompanied by a change in the other variable, the variables are said to be correlated.
If the two variables deviate in the same direction, i.e., if an increase (or decrease) in one results in a corresponding increase (or decrease) in the other, the correlation is said to be direct or positive.

[Fig. 1. Positive Correlation]   [Fig. 2. Negative Correlation]

For example, the correlation between income and expenditure is positive.

If the two variables deviate in opposite directions, i.e., if an increase (or decrease) in one results in a corresponding decrease (or increase) in the other, the correlation is said to be inverse or negative.
For example, the correlation between the volume and the pressure of a perfect gas, or between price and demand, is negative.
Correlation is said to be perfect if the deviation in one variable is followed by a corresponding proportional deviation in the other.

3.4.1 Scatter or dot diagrams

It is the simplest method of diagrammatic representation of bivariate data. Let $(x_i, y_i)$, $i = 1, 2, 3, \ldots, n$ be a bivariate distribution. Let the values of the variables $x$ and $y$ be plotted along the x-axis and y-axis on a suitable scale. Then to every ordered pair there corresponds a point or dot in the xy-plane. The diagram of dots so obtained is called a dot or scatter diagram.
If the dots are very close to each other and the number of observations is not very large, a
fairly good correlation is expected. If the dots are widely scattered, a poor correlation is
expected.

3.4.2 Coefficient of Correlation

The coefficient of correlation ($r$) lies between $-1$ and $+1$, i.e., $-1 \le r \le 1$.
• If $r = 0$, there is no linear correlation between the two variables.
• Positive correlation ($0 < r \le 1$): both variables increase or decrease simultaneously.
• Negative correlation ($-1 \le r < 0$): an increase in one variable is associated with a decrease in the other, and vice versa.

3.4.3 Karl Pearson Coefficient of Correlation

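The coefficient of correlation ($r$) between two variables $x$ and $y$ is defined as

$$r = \frac{\operatorname{Covariance}(x, y)}{\sqrt{\operatorname{Variance}(x)}\,\sqrt{\operatorname{Variance}(y)}} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} \qquad \text{(remember)}$$

where $X = x - \bar{x}$, $Y = y - \bar{y}$, and $\bar{x}$, $\bar{y}$ are the means of the $x$ and $y$ data values.

$\operatorname{Cov}(x, y) = \dfrac{1}{n}\sum XY$ is the covariance between the variables $x$ and $y$,

$\bar{x} = \dfrac{\sum x}{n}$, $\bar{y} = \dfrac{\sum y}{n}$ are the means of the $x$ and $y$ series respectively, and

$\sigma_x = \sqrt{\dfrac{\sum X^2}{n}}$ and $\sigma_y = \sqrt{\dfrac{\sum Y^2}{n}}$ are called the standard deviations (SD) of $x$ and $y$ respectively.

Alternate form:
$$r(x, y) = \frac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}}$$
That is,
$$r(x, y) = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$

Here $n$ is the number of pairs of values of $x$ and $y$.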
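The alternate form translates directly into code, since it needs only the five sums $\sum x$, $\sum y$, $\sum xy$, $\sum x^2$, $\sum y^2$. The function name `pearson_r` below is an assumption for this illustration; the data are taken from a worked example later in this section.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    if den == 0:
        raise ValueError("r is undefined when either variable is constant")
    return num / den


r = pearson_r([1, 3, 5, 7, 8, 10], [8, 12, 15, 17, 18, 20])
print(round(r, 4))
```

For these six pairs the numerator is $6(582) - (34)(90) = 432$ and the denominator is $\sqrt{332 \times 576}$.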

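Example 1. If $\operatorname{Cov}(x, y) = 10$, $\operatorname{Var}(x) = 25$, $\operatorname{Var}(y) = 9$, find the coefficient of correlation.

Solution: $r = \dfrac{\operatorname{Covariance}(x, y)}{\sqrt{\operatorname{Variance}(x)}\,\sqrt{\operatorname{Variance}(y)}} = \dfrac{10}{\sqrt{25}\,\sqrt{9}} = \dfrac{10}{5 \times 3} = 0.67$.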

Example 2. Calculate the coefficient of correlation from the following data:

x    9     8     7     6     5     4     3     2    1
y    15    16    14    13    11    12    10    8    9

Solution: Karl Pearson's coefficient of correlation ($r$) is given by $r = \dfrac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}}$,
where $X = x - \bar{x}$, $Y = y - \bar{y}$, and $\bar{x}$, $\bar{y}$ are the means of the $x$ and $y$ data values.

Here $\bar{x} = \dfrac{\sum x}{n} = \dfrac{45}{9} = 5$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{108}{9} = 12$.

x    y     X = x - x̄    Y = y - ȳ    X^2    Y^2    XY
9    15    4             3             16     9      12
8    16    3             4             9      16     12
7    14    2             2             4      4      4
6    13    1             1             1      1      1
5    11    0             -1            0      1      0
4    12    -1            0             1      0      0
3    10    -2            -2            4      4      4
2    8     -3            -4            9      16     12
1    9     -4            -3            16     9      12
Sum:  $\sum x = 45$,  $\sum y = 108$,  $\sum X^2 = 60$,  $\sum Y^2 = 60$,  $\sum XY = 57$

Karl Pearson's coefficient of correlation is $r = \dfrac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} = \dfrac{57}{\sqrt{60}\,\sqrt{60}} = \dfrac{57}{60} = 0.95$.

Example 3. Psychological tests of intelligence and of engineering ability were applied to 10 students. Here is a record of ungrouped data showing intelligence ratio (I.R.) and engineering ratio (E.R.). Calculate the coefficient of correlation.

Student    A      B      C      D      E      F     G      H     I     J
I.R.       105    104    102    101    100    99    98     96    93    92
E.R.       101    103    100    98     95     96    104    92    97    94

Solution: Karl Pearson's coefficient of correlation ($r$) is given by

$r = \dfrac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}}$ ………(i)
where $X = x - \bar{x}$, $Y = y - \bar{y}$, and $\bar{x}$, $\bar{y}$ are the means of the $x$ and $y$ data values.

Here $\bar{x} = \dfrac{\sum x}{n} = \dfrac{990}{10} = 99$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{980}{10} = 98$.

Student    I.R. (x)    E.R. (y)    X = x - x̄    Y = y - ȳ    X^2    Y^2    XY
A          105         101         6             3             36     9      18
B          104         103         5             5             25     25     25
C          102         100         3             2             9      4      6
D          101         98          2             0             4      0      0
E          100         95          1             -3            1      9      -3
F          99          96          0             -2            0      4      0
G          98          104         -1            6             1      36     -6
H          96          92          -3            -6            9      36     18
I          93          97          -6            -1            36     1      6
J          92          94          -7            -4            49     16     28
Sum:  $\sum x = 990$,  $\sum y = 980$,  $\sum X = 0$,  $\sum Y = 0$,  $\sum X^2 = 170$,  $\sum Y^2 = 140$,  $\sum XY = 92$

Substituting these values in Eq. (i), we get

$r = \dfrac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} = \dfrac{92}{\sqrt{170}\,\sqrt{140}} = \dfrac{92}{154.3} \approx 0.596$.

Example 4. Find the coefficient of correlation between the values of $x$ and $y$ (using the alternate form):

x    1    3     5     7     8     10
y    8    12    15    17    18    20

Solution: Here, $n = 6$.

x     y     x^2    y^2    xy
1     8     1      64     8
3     12    9      144    36
5     15    25     225    75
7     17    49     289    119
8     18    64     324    144
10    20    100    400    200
Sum:  $\sum x = 34$,  $\sum y = 90$,  $\sum x^2 = 248$,  $\sum y^2 = 1446$,  $\sum xy = 582$

Karl Pearson's coefficient of correlation is given by

$$r(x, y) = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{6(582) - (34)(90)}{\sqrt{6(248) - (34)^2}\,\sqrt{6(1446) - (90)^2}} = 0.9879$$

Shortcut Method for Karl Pearson Coefficient of Correlation

• We can also find the Karl Pearson coefficient of correlation by taking assumed means: take $X = x - a$, $Y = y - b$, where $a$ and $b$ are assumed means of the $x$ and $y$ data values.
• If the $x$ values are equispaced with spacing $h$, we can take $u = \dfrac{x - a}{h}$. Similarly, if the $y$ values are equispaced with spacing $k$, we can take $v = \dfrac{y - b}{k}$.
Then
$$r(x, y) = \frac{n\sum uv - \sum u \sum v}{\sqrt{n\sum u^2 - \left(\sum u\right)^2}\,\sqrt{n\sum v^2 - \left(\sum v\right)^2}}.$$
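The shortcut works because $r$ is unchanged by a shift of origin and by a change of scale with positive $h$ and $k$. The sketch below checks this numerically by computing $r$ once from raw values and once from the coded values $u$, $v$. The function name `pearson_r` is an assumption for this illustration, and the data and the choices $a = 22$, $h = 4$, $b = 24$, $k = 6$ follow the worked example on the shortcut method in this section.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx * sx) * (n * syy - sy * sy)
    )


xs = [10, 14, 18, 22, 26, 30]
ys = [18, 12, 24, 6, 30, 36]

# Coded variables: assumed means 22 and 24, spacings 4 and 6.
us = [(x - 22) / 4 for x in xs]
vs = [(y - 24) / 6 for y in ys]

r_raw = pearson_r(xs, ys)
r_coded = pearson_r(us, vs)
print(abs(r_raw - r_coded) < 1e-12)
```

The coded sums are much smaller than the raw ones, which is the whole point of the method when working by hand.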

Example 5. Find the coefficient of correlation for the following table:

x    10    14    18    22    26    30
y    18    12    24    6     30    36

Solution: Let $u = \dfrac{x - 22}{4}$, $v = \dfrac{y - 24}{6}$.

x     y     u     v     u^2    v^2    uv
10    18    -3    -1    9      1      3
14    12    -2    -2    4      4      4
18    24    -1    0     1      0      0
22    6     0     -3    0      9      0
26    30    1     1     1      1      1
30    36    2     2     4      4      4
Sum:  $\sum u = -3$,  $\sum v = -3$,  $\sum u^2 = 19$,  $\sum v^2 = 19$,  $\sum uv = 12$

Here $n = 6$.
The Karl Pearson coefficient of correlation is
$$r(x, y) = \frac{n\sum uv - \sum u \sum v}{\sqrt{n\sum u^2 - \left(\sum u\right)^2}\,\sqrt{n\sum v^2 - \left(\sum v\right)^2}},$$
i.e.,
$$r(x, y) = \frac{6(12) - (-3)(-3)}{\sqrt{6(19) - (-3)^2}\,\sqrt{6(19) - (-3)^2}} = \frac{63}{\sqrt{105}\,\sqrt{105}} = \frac{63}{105} = \frac{3}{5}.$$
Therefore, $r(x, y) = 0.6$.

Example 6. Establish the formula $r = \dfrac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\sigma_x \sigma_y}$,
where $r$ is the correlation coefficient between $x$ and $y$. Using the above formula, calculate the coefficient of correlation from the following data:

x:    21    23    30    54    57     58    72     78    87     90
y:    60    71    72    83    110    84    100    92    113    135

Solution: Let $z = x - y$; then $\bar{z} = \bar{x} - \bar{y}$.

Therefore, $z - \bar{z} = (x - y) - (\bar{x} - \bar{y})$,
i.e., $z - \bar{z} = (x - \bar{x}) - (y - \bar{y})$.
Squaring both sides, we get
$(z - \bar{z})^2 = (x - \bar{x})^2 + (y - \bar{y})^2 - 2(x - \bar{x})(y - \bar{y})$.
Summing over all observations, we get
$\sum (z - \bar{z})^2 = \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 - 2\sum (x - \bar{x})(y - \bar{y})$.
Dividing by $n$,
$$\frac{\sum (z - \bar{z})^2}{n} = \frac{\sum (x - \bar{x})^2}{n} + \frac{\sum (y - \bar{y})^2}{n} - \frac{2\sum (x - \bar{x})(y - \bar{y})}{n}$$
$$\Rightarrow\; \sigma_z^2 = \sigma_x^2 + \sigma_y^2 - 2r\sigma_x\sigma_y, \qquad \text{since } r = \frac{\sum (x - \bar{x})(y - \bar{y})}{n\,\sigma_x\sigma_y}.$$
Therefore, with $z = x - y$,
$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\sigma_x\sigma_y},$$
which is the required result.
Home work: Using the above formula, calculate the coefficient of correlation from the given data. Try it yourself and submit by e-mail (nanjundappace@gmail.com).

3.5 Regression
Regression is a statistical method, used in finance, investing and other disciplines, that attempts to determine the strength and character of the relationship between one dependent variable (usually denoted $y$) and one or more other variables (known as independent variables, usually denoted $x$).

Use of Regression Analysis

(i) In the field of Business, this tool of statistical analysis is widely used. Businessmen are
interested in predicting future production, consumption, investment, prices, profits and sales etc.

(ii) In the field of economic planning and sociological studies, projections of population, birth
rates, death rates and other similar variables are of great use.

3.5.1 Linear Regression

Regression describes the functional relationship between dependent and independent variables, which helps us to make estimates of one variable from the other. Correlation quantifies the association between the two variables, whereas linear regression finds the best line that predicts $y$ from $x$, and likewise $x$ from $y$. The difference between correlation and regression is illustrated in the adjoining figure.
[Figure: Correlation (left panel) and Regression (right panel)]

3.5.2 Lines of Regression

A line of regression is the straight line which gives the best fit, in the least squares sense, to the given data.
In the case of $n$ pairs $(x_i, y_i)$, $i = 1, 2, 3, \ldots, n$, from bivariate data, we have no reason or justification to assume $y$ as the dependent variable or $x$ as the independent variable. Either of the two may be estimated for given values of the other. Thus if we wish to estimate $y$ for given values of $x$, we shall have a regression equation of the form $y = a + bx$, called the regression line of $y$ on $x$. If we wish to estimate $x$ for given values of $y$, we shall have a regression line of the form $x = a + by$, called the regression line of $x$ on $y$.
Thus, in general, we always have two lines of regression.

3.5.3 Derivation of Lines of Regression

3.5.3(i) Line of Regression of y on x
To obtain the line of regression of $y$ on $x$, we take $y$ as the dependent variable and $x$ as the independent variable. Let the equation of the regression line of $y$ on $x$ be
$y = a + bx$ ……………………………………………(i)
The normal equations, as derived by the method of least squares, are
$\sum y = na + b\sum x$ …………………(ii)
$\sum xy = a\sum x + b\sum x^2$ …………………(iii)
Solving (ii) and (iii) for $a$ and $b$, we get
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \quad \text{and} \quad a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \bar{y} - b\bar{x}.$$
Substituting the value of $a$ in Eq. (i), we get
$y - \bar{y} = b(x - \bar{x})$ ………………………(iv)
Equation (iv) is called the regression line of $y$ on $x$. Here $b$ is called the regression coefficient of $y$ on $x$ and is usually denoted by $b_{yx}$.
Hence Eq. (iv) can be written as
$$y - \bar{y} = b_{yx}(x - \bar{x}),$$
which is the regression line of $y$ on $x$, where $\bar{x}$, $\bar{y}$ are the mean values of $x$ and $y$ respectively, while
$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}. \qquad \text{(Remember)}$$

In equation (iii), shifting the origin to $(\bar{x}, \bar{y})$, we get

$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2$ ……………….(v)

We know that $\sum (x - \bar{x}) = 0$,
$$\sigma_x^2 = \frac{\sum X^2}{n} = \frac{\sum (x - \bar{x})^2}{n} \;\Rightarrow\; \sum (x - \bar{x})^2 = n\sigma_x^2,$$
and
$$r = \frac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n\,\sigma_x\sigma_y} \;\Rightarrow\; \sum (x - \bar{x})(y - \bar{y}) = r\,n\,\sigma_x\sigma_y.$$
Equation (v) therefore reduces to
$$r\,n\,\sigma_x\sigma_y = a(0) + b\,n\,\sigma_x^2.$$
Therefore $b = r\dfrac{\sigma_y}{\sigma_x}$.
That is, $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ is the regression coefficient (slope of the line of regression) of $y$ on $x$.
Here $r$ is the coefficient of correlation, and $\sigma_x$ and $\sigma_y$ are the standard deviations of the $x$ and $y$ series respectively.
Note: The regression line of $y$ on $x$ is $y - \bar{y} = b_{yx}(x - \bar{x})$ (remember),
where $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ is the regression coefficient (slope of the line of regression) of $y$ on $x$.

3.5.3(ii) Line of Regression of x on y

Proceeding in the same way as in 3.5.3(i), we can derive the regression line of $x$ on $y$ as
$$x - \bar{x} = b_{xy}(y - \bar{y}).$$
Here $b_{xy}$ is the regression coefficient of $x$ on $y$ and is given by
$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2} \quad \text{or} \quad b_{xy} = r\,\frac{\sigma_x}{\sigma_y},$$
where the terms have their usual meanings.

Here $b_{yx}$ and $b_{xy}$ are known as the coefficients of regression and are connected by the relation
$$b_{yx} \cdot b_{xy} = \left( r\,\frac{\sigma_y}{\sigma_x} \right)\left( r\,\frac{\sigma_x}{\sigma_y} \right) = r^2.$$
Note: If $r = 0$, the two lines of regression become $x = \bar{x}$ and $y = \bar{y}$, which are two straight lines parallel to the y-axis and x-axis respectively, passing through the means $\bar{x}$ and $\bar{y}$. They are mutually perpendicular.
If $r = \pm 1$, the two lines of regression coincide.
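Both regression coefficients come from the same sums, differing only in the denominator. The sketch below computes the means and both coefficients and confirms the relation $b_{yx} \cdot b_{xy} = r^2$ on a small data set. The function name `regression_lines` and the dictionary keys are assumptions for this illustration; the data are nine pairs used in a correlation example earlier in this unit.

```python
def regression_lines(xs, ys):
    """Return the means and both regression coefficients.

    Line of y on x:  y - y_mean = b_yx * (x - x_mean)
    Line of x on y:  x - x_mean = b_xy * (y - y_mean)
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    cxy = n * sxy - sx * sy
    dx = n * sxx - sx * sx
    dy = n * syy - sy * sy
    if dx == 0 or dy == 0:
        raise ValueError("regression is undefined when a variable is constant")
    return {"x_mean": sx / n, "y_mean": sy / n,
            "b_yx": cxy / dx, "b_xy": cxy / dy}


xs = [9, 8, 7, 6, 5, 4, 3, 2, 1]
ys = [15, 16, 14, 13, 11, 12, 10, 8, 9]
fit = regression_lines(xs, ys)

# The product of the two coefficients equals r squared.
r_squared = fit["b_yx"] * fit["b_xy"]
print(round(r_squared, 4))
```

For these data $r = 0.95$, so the product of the two coefficients is $0.9025$.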

3.5.4 Properties of Regression Coefficients

• As $b_{yx} \cdot b_{xy} = r^2$, i.e., $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$, the coefficient of correlation ($r$) is the geometric mean of the two regression coefficients.
• Since $\dfrac{b_{yx} + b_{xy}}{2} \ge \sqrt{b_{yx} \cdot b_{xy}} = r$ for positive coefficients, the arithmetic mean of the two regression coefficients is greater than or equal to the correlation coefficient ($r$).
• If there is a perfect correlation between the two variables under consideration, then $b_{yx} \cdot b_{xy} = r^2 = 1$, and the two lines of regression coincide. The converse is also true, i.e., if the two lines of regression coincide, then there is a perfect correlation.
• Since $b_{yx} \cdot b_{xy} = r^2 > 0$, the signs of both regression coefficients $b_{yx}$ and $b_{xy}$ and of the coefficient of correlation ($r$) must be the same: either all three negative or all three positive.
• Since $b_{yx} \cdot b_{xy} = r^2 \le 1$, if one of the regression coefficients is greater than unity, the other must be less than unity.
• The point of intersection of the two lines of regression is $(\bar{x}, \bar{y})$, where $\bar{x}$ and $\bar{y}$ are the means of the $x$ and $y$ series respectively.
• If the two lines of regression cut each other at right angles, there is no correlation between the two variables, i.e., $r = 0$.

3.5.5 Angle between the Lines of Regression

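If $\theta$ is the acute angle between the two regression lines for two variables $x$ and $y$, then
$$\tan\theta = \frac{1 - r^2}{r}\left( \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2} \right).$$
Proof: The two lines of regression are given by
$y - \bar{y} = r\dfrac{\sigma_y}{\sigma_x}(x - \bar{x})$ …………………..(i)
and $x - \bar{x} = r\dfrac{\sigma_x}{\sigma_y}(y - \bar{y})$, or $y - \bar{y} = \dfrac{\sigma_y}{r\sigma_x}(x - \bar{x})$ ………………….(ii)
If $m_1$ and $m_2$ are the slopes of lines (i) and (ii), then
$$\tan\theta = \frac{m_2 - m_1}{1 + m_1 m_2}, \quad \text{where } m_1 = \frac{r\sigma_y}{\sigma_x} \text{ and } m_2 = \frac{\sigma_y}{r\sigma_x}.$$
$$\Rightarrow\; \tan\theta = \frac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \left( \dfrac{r\sigma_y}{\sigma_x} \right)\left( \dfrac{\sigma_y}{r\sigma_x} \right)} = \frac{\left( \dfrac{1}{r} - r \right)\dfrac{\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y^2}{\sigma_x^2}}$$
Therefore,
$$\tan\theta = \frac{1 - r^2}{r}\left( \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2} \right) \qquad \text{(iii)}$$

• When $r = 0$, $\tan\theta = \infty \;\Rightarrow\; \theta = \dfrac{\pi}{2}$.
Therefore, the two lines of regression are perpendicular to each other.
• When $r = \pm 1$, $\tan\theta = 0 \;\Rightarrow\; \theta = 0$.
Therefore, the two lines of regression are coincident.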
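Formula (iii) and its two limiting cases can be checked numerically. In this sketch the function name `regression_angle` is an assumption for this illustration, and $|r|$ is used in the denominator so that the returned angle is acute for negative $r$ as well; that choice is an assumption on top of the formula as written.

```python
import math


def regression_angle(r, sigma_x, sigma_y):
    """Acute angle (in radians) between the two lines of regression."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return math.pi / 2  # tan(theta) is infinite: perpendicular lines
    tan_theta = ((1 - r * r) / abs(r)) * (
        sigma_x * sigma_y / (sigma_x ** 2 + sigma_y ** 2)
    )
    return math.atan(tan_theta)


# Limiting cases discussed in the text.
print(regression_angle(1.0, 2.0, 8.0) == 0.0)
print(regression_angle(0.0, 2.0, 8.0) == math.pi / 2)

# An intermediate case with hypothetical values r = 0.5, sigma_x = 2, sigma_y = 8.
theta = regression_angle(0.5, 2.0, 8.0)
print(round(math.degrees(theta), 2))
```

For the intermediate case, $\tan\theta = \dfrac{0.75}{0.5} \times \dfrac{16}{68} \approx 0.3529$.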

Example 1. The two regression equations of the variables $x$ and $y$ are $x = 18.13 - 0.87y$ and $y = 11.64 - 0.54x$. Find (1) the means of the $x$'s and $y$'s, (2) the coefficient of correlation between $x$ and $y$.
Solution: Given $x = 18.13 - 0.87y$ ……………..(i)
and $y = 11.64 - 0.54x$ …………(ii)
Since the point $(\bar{x}, \bar{y})$ lies on both regression lines, we have
$\bar{x} = 18.13 - 0.87\bar{y}$ ……………………………..(iii)
and $\bar{y} = 11.64 - 0.54\bar{x}$ ……………........(iv)
(1) Substituting (iv) in (iii) gives $\bar{x} = 18.13 - 0.87(11.64 - 0.54\bar{x}) = 8.0032 + 0.4698\bar{x}$, so $\bar{x} \approx 15.09$ and then $\bar{y} \approx 3.49$.
(2) The regression coefficient of $y$ on $x$ from Eq. (ii) is $b_{yx} = -0.54$ and the regression coefficient of $x$ on $y$ from Eq. (i) is $b_{xy} = -0.87$.
Since the coefficient of correlation is the geometric mean of the two regression coefficients,
$r = \pm\sqrt{b_{yx} \cdot b_{xy}} = \pm\sqrt{(-0.54)(-0.87)} = \pm\sqrt{0.4698} \approx -0.69$
(here the minus sign is taken, since both regression coefficients are negative).

Example 2. In a partially destroyed laboratory record, only the lines of regression of $y$ on $x$ and of $x$ on $y$ are available, as $4x - 5y + 33 = 0$ and $20x - 9y = 107$ respectively. Calculate (i) $\bar{x}$ and $\bar{y}$, and (ii) the coefficient of correlation between $x$ and $y$.
Solution: Given $4x - 5y = -33$ ……………..(i)
and $20x - 9y = 107$ ………………(ii)
(i) Since the point $(\bar{x}, \bar{y})$ lies on both regression lines (i) and (ii), we have
$4\bar{x} - 5\bar{y} = -33$
$20\bar{x} - 9\bar{y} = 107$.
On solving these equations, we have $\bar{x} = 13$ and $\bar{y} = 17$.
(ii) The regression line of $y$ on $x$ from Eq. (i) is $y = \dfrac{4}{5}x + \dfrac{33}{5}$ …………..(iii)
The regression coefficient of $y$ on $x$ is $b_{yx} = \dfrac{4}{5}$.
Similarly, the regression line of $x$ on $y$ from Eq. (ii) is $x = \dfrac{9}{20}y + \dfrac{107}{20}$ …………..(iv)
The regression coefficient of $x$ on $y$ is $b_{xy} = \dfrac{9}{20}$.
Since the coefficient of correlation is the geometric mean of the two regression coefficients,
$r = \pm\sqrt{b_{yx} \cdot b_{xy}} = \pm\sqrt{\left( \dfrac{4}{5} \right)\left( \dfrac{9}{20} \right)} = \pm\sqrt{0.36} = \pm 0.6 = 0.6$
(here the plus sign is taken, since both regression coefficients are positive).

Example 3. The following table records the test scores made by salesmen on an intelligence test and their weekly sales:

Salesman           1      2      3      4      5      6      7      8      9      10
Test score = x     40     70     50     60     80     50     90     40     60     60
Sales (000) = y    2.5    6.0    4.5    5.0    4.5    2.0    5.5    3.0    4.5    3.0

Calculate the line of regression of sales ($y$) on test scores ($x$) and estimate the most probable weekly sales volume if a salesman makes a score of 70.
Solution: The regression line of sales ($y$) on test scores ($x$) is $y - \bar{y} = b_{yx}(x - \bar{x})$,
where $b_{yx} = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$.

Test score = x    Sales (000) = y    xy     x^2
40                2.5                100    1600
70                6.0                420    4900
50                4.5                225    2500
60                5.0                300    3600
80                4.5                360    6400
50                2.0                100    2500
90                5.5                495    8100
40                3.0                120    1600
60                4.5                270    3600
60                3.0                180    3600
Sum:  $\sum x = 600$,  $\sum y = 40.5$,  $\sum xy = 2570$,  $\sum x^2 = 38400$

From the table, with $n = 10$, we have

$\bar{x}$ = mean of $x$ (test scores) $= \dfrac{\sum x}{n} = \dfrac{600}{10} = 60$ and

$\bar{y}$ = mean of $y$ (sales) $= \dfrac{\sum y}{n} = \dfrac{40.5}{10} = 4.05$.

Therefore, $b_{yx} = \dfrac{10(2570) - (600)(40.5)}{10(38400) - (600)^2} = \dfrac{1400}{24000} \approx 0.06$.

The required regression line of $y$ on $x$ is $y - 4.05 = (0.06)(x - 60)$,

i.e., $y = 0.06x + 0.45$.
When $x = 70$, $y = 0.06(70) + 0.45 = 4.65$.
Thus the most probable weekly sales volume if a salesman makes a score of 70 is 4.65 (thousand).

Example 4. Following data depicts the statistical values of rainfall and production of wheat in a
region for a specified time period.
Mean Standard Deviation
Production of Wheat (kg. per unit area) =y 10 8
Rainfall (cm) =x 8 2
Estimate the production of wheat (y) when rainfall (x) is 9cm if correlation coefficient between
production and rainfall is given to be 0.5.
Solution: Let the variables x and y denote rainfall and production respectively.
Given that $\bar{x} = 8$, $\sigma_x = 2$; $\bar{y} = 10$, $\sigma_y = 8$; $r = 0.5$.
The equation of the line of regression of y on x is
$y - \bar{y} = r\dfrac{\sigma_y}{\sigma_x}(x - \bar{x})$
$\Rightarrow y - 10 = \dfrac{(0.5)(8)}{2}(x - 8)$
$\Rightarrow y - 10 = 2(x - 8)$
$\Rightarrow y - 2x + 6 = 0$
$\Rightarrow y = 2x - 6$.
This is the line of regression of wheat production on rainfall.
When the rainfall is $x = 9$ cm, the production of wheat is estimated to be $y = 2(9) - 6 = 12$ kg per unit area.
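
When only the means, standard deviations and r are known, as in Example 4, the regression line of y on x follows from $b_{yx} = r\,\sigma_y/\sigma_x$. A small Python sketch of that step is given here for illustration; the function name `regression_from_stats` is an assumption and does not appear in the notes.

```python
def regression_from_stats(mean_x, sd_x, mean_y, sd_y, r):
    """Return (slope, intercept) of the line of regression of y on x."""
    slope = r * sd_y / sd_x              # b_yx = r * sigma_y / sigma_x
    intercept = mean_y - slope * mean_x  # line passes through (x-bar, y-bar)
    return slope, intercept


# Example 4: rainfall x (mean 8, s.d. 2), production y (mean 10, s.d. 8), r = 0.5.
slope, intercept = regression_from_stats(8, 2, 10, 8, 0.5)
production_at_9 = slope * 9 + intercept
print(slope, intercept, production_at_9)
```

The slope and intercept reproduce the line $y = 2x - 6$ and the estimate of 12 kg per unit area at 9 cm of rainfall.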
Example 5. Find the coefficient of correlation and the lines of regression for the data given below:
$n = 18$, $\sum x = 12$, $\sum y = 18$, $\sum x^2 = 60$, $\sum y^2 = 96$, $\sum xy = 48$.
n xy   x y
Solution: (i) The coefficient of correlation: r ( x, y )  and
n x    x  n y    y 
2 2 2 2

(ii) Equations of regression lines: y  y  byx ( x  x ) , x  x  bxy ( y  y ) .

Now x 
 x  12  0.67, y
 y  18  1.
n 18 n 18

 x2  
  x  16  12 
2 2 2
x
        2.9   x  1.7.
n  n  18  18 
 y 2    y 
2 2
96  18 
y 2
        4.33   y  2.08.
n  n  18  18 
y  2.08    1.7 
byx  r  (0.57)    0.7 ; bxy  r x  (0.57)    0.47 .
x  1.7  y  2.08 

18(48)  (12)(18)
(i) The coefficient of correlation: r   0.57
18(60)  12  18(96)  18 
2 2

(ii) Equations of regression lines :

y  y  byx ( x  x ) , x  x  bxy ( y  y )
 y  1  0.7( x  0.67) , x  0.67  0.47( y  1)
 y  0.7 x  0.53 , x  0.47 y  0.2 .
Example 6. Find the correlation coefficient between x and y when the two lines of regression are $2x - 9y + 6 = 0$ and $x - 2y + 1 = 0$.
Solution: Let the line of regression of x on y be $2x - 9y + 6 = 0$ ... (i)
Then the line of regression of y on x is $x - 2y + 1 = 0$ ... (ii)
From Eq. (i), we have $x = \dfrac{9}{2}y - 3 \Rightarrow b_{xy} = \dfrac{9}{2}$, and
from Eq. (ii), we have $y = \dfrac{1}{2}x + \dfrac{1}{2} \Rightarrow b_{yx} = \dfrac{1}{2}$.
$\therefore r = \sqrt{b_{yx} \cdot b_{xy}} = \sqrt{\dfrac{1}{2} \cdot \dfrac{9}{2}} = \dfrac{3}{2} = 1.5$, which is not possible since $-1 \le r \le 1$.
So our choice of regression lines is incorrect.
Therefore, the line of regression of y on x is $2x - 9y + 6 = 0 \Rightarrow y = \dfrac{2}{9}x + \dfrac{2}{3} \Rightarrow b_{yx} = \dfrac{2}{9}$.
Also, the line of regression of x on y is $x - 2y + 1 = 0 \Rightarrow x = 2y - 1 \Rightarrow b_{xy} = 2$.
$\therefore r = \sqrt{b_{yx} \cdot b_{xy}} = \sqrt{2 \cdot \dfrac{2}{9}} = \dfrac{2}{3}$, taking the positive root since both regression coefficients are positive.
Hence the coefficient of correlation between x and y is $\dfrac{2}{3}$.
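
The trial-and-check reasoning of Example 6 can be expressed as a short Python sketch. It is offered only as an illustration: the function name `correlation_from_lines` and the representation of each line as a tuple (A, B, C) for Ax + By + C = 0 are assumptions, and the sketch presumes that all x and y coefficients are non-zero. It tries both assignments of the two lines and keeps the one for which the product $b_{yx} b_{xy}$ lies between 0 and 1.

```python
import math


def correlation_from_lines(line1, line2):
    """Each line is a tuple (A, B, C) representing A*x + B*y + C = 0."""
    for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
        b_yx = -y_on_x[0] / y_on_x[1]  # solve for y: y = -(A/B) x - C/B
        b_xy = -x_on_y[1] / x_on_y[0]  # solve for x: x = -(B/A) y - C/A
        product = b_yx * b_xy          # must equal r^2, so 0 <= product <= 1
        if 0 <= product <= 1:
            # r takes the common sign of the regression coefficients.
            r = math.copysign(math.sqrt(product), b_yx)
            return r, b_yx, b_xy
    raise ValueError("no valid assignment of regression lines")


# Example 6: 2x - 9y + 6 = 0 and x - 2y + 1 = 0.
r, b_yx, b_xy = correlation_from_lines((2, -9, 6), (1, -2, 1))
print(r, b_yx, b_xy)
```

The two possible products are reciprocals of each other, so unless the product is exactly 1 only one assignment can satisfy the condition; here it is the assignment with $b_{yx} = 2/9$ and $b_{xy} = 2$.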

************************************END******************************
