while for Fig. (6.2) we could try a parabola or quadratic curve:
\[ y = a + bx + cx^2 \tag{6.2} \]
Sometimes it helps to plot scatter diagrams in terms of transformed variables. For example, if log y
vs x leads to a straight line, we would try log y = a + b x as an equation for the approximating curve.
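As a rough illustration of the transformed-variable idea, the sketch below checks whether log y is linear in x by comparing the slopes between consecutive transformed points. The data are hypothetical (generated from y = 2 e^{0.3x}), and the natural logarithm is an assumption; any base would give a straight line as well.

```python
import math

# Hypothetical data generated from y = 2 * exp(0.3 x), for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2 * math.exp(0.3 * x) for x in xs]

# Transformed variable: natural log of y (the choice of base is an assumption).
log_ys = [math.log(y) for y in ys]

# Slopes between consecutive points of the transformed data (x, log y).
# If they are (nearly) equal, log y vs x is (nearly) a straight line.
slopes = [
    (log_ys[i + 1] - log_ys[i]) / (xs[i + 1] - xs[i])
    for i in range(len(xs) - 1)
]
print(slopes)
```

With real measurements the slopes would only agree approximately, so in practice one would judge linearity from the scatter diagram of the transformed data.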
6.2 The Method of Least Squares
Consider Fig. 6.2, in which the data points are $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. For a given value of $x$,
say $x_1$, there will be a difference between the value $y_1$ and the corresponding value as determined
from the curve $C$. We denote this difference by $d_1$, which is sometimes referred to as a deviation, error,
or residual and may be positive, negative, or zero. Similarly, corresponding to the values $x_2, \ldots, x_n$,
we obtain the deviations $d_2, \ldots, d_n$.
Fig. 6.2: Showing the deviations
A measure of the goodness of fit of the curve C to the set of data is provided by the quantity
\[ S = d_1^2 + d_2^2 + \cdots + d_n^2 \tag{6.3} \]
If S is small, the fit is good; if it is large, the fit is bad. We therefore make the following definition:
Definition. Of all curves in a given family of curves approximating a set of n data points, a curve
having the property that
\[ S = d_1^2 + d_2^2 + \cdots + d_n^2 = \text{a minimum} \tag{6.4} \]
is called a best-fitting curve in the family.
A curve having this property is said to fit the data in the least-squares sense and is called a least-
squares curve. A line having this property is called a least-squares line; a parabola with this property
is called a least-squares parabola, etc.
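The quantity S can be computed directly for any candidate curve. The following is a minimal sketch, assuming the deviation is taken as d_i = y_i - C(x_i) (the sign does not affect S); the data and the two candidate lines are hypothetical and serve only to compare fits.

```python
def sum_squared_deviations(xs, ys, curve):
    """Return S = d_1^2 + ... + d_n^2, where d_i = y_i - curve(x_i)."""
    return sum((y - curve(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]


def line_a(x):
    """Candidate curve y = 2x."""
    return 2.0 * x


def line_b(x):
    """Candidate curve y = 1 + 1.5x."""
    return 1.0 + 1.5 * x


s_a = sum_squared_deviations(xs, ys, line_a)
s_b = sum_squared_deviations(xs, ys, line_b)
print(s_a, s_b)  # the smaller S indicates the better fit of the two
```

Here the first candidate gives the smaller S, so of these two lines it is the better fit in the least-squares sense. It is not necessarily the least-squares line itself, which minimizes S over the whole family of lines.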
It is customary to employ the above definition when x is the independent variable and y is the
dependent variable. Unless otherwise specified, we shall consider y as the dependent and x as the
independent variable.
and
\[ \sum xy = a \sum x + b \sum x^2 \tag{6.7} \]
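Solving eqs. (6.6) and (6.7), we get
\[ a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2} \tag{6.8} \]
and
\[ b = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2} \tag{6.9} \]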
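The closed-form expressions (6.8) and (6.9) translate directly into code. The sketch below is one possible implementation; the function name `least_squares_line` and the sample data (points lying exactly on y = 1 + 2x) are hypothetical and used only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b x using eqs. (6.8) and (6.9)."""
    n = len(xs)
    sx = sum(xs)                               # sum of x
    sy = sum(ys)                               # sum of y
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x*y
    sxx = sum(x * x for x in xs)               # sum of x^2
    denom = n * sxx - sx ** 2
    if denom == 0:
        # All x values are equal, so the line is not uniquely determined.
        raise ValueError("x values must not all be identical")
    a = (sy * sxx - sx * sxy) / denom
    b = (n * sxy - sx * sy) / denom
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
a, b = least_squares_line(xs, ys)
print(a, b)
```

The guard on the denominator covers the degenerate case in which every x is the same, where the two formulas would divide by zero.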
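Note: From eq. (6.6), we have
\[
\begin{aligned}
\sum y &= na + b \sum x \\
\implies \frac{1}{n} \sum y &= a + b\, \frac{1}{n} \sum x \\
\implies \bar{y} &= a + b \bar{x} \qquad \left(\text{since } \bar{z} = \frac{1}{n} \sum z\right) \\
\implies a &= \bar{y} - b \bar{x}
\end{aligned}
\]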
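where
\[
\begin{aligned}
b &= \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2} \\
&= \frac{n\left(\sum xy\right) - (n\bar{x})(n\bar{y})}{n\left(\sum x^2\right) - (n\bar{x})^2}
\qquad \left(\text{since } \bar{z} = \frac{1}{n}\sum z \implies \sum z = n\bar{z}\right) \\
&= \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} \\
&= \frac{\sum xy - n\bar{x}\bar{y} + n\bar{x}\bar{y} - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2 + n\bar{x}^2 - n\bar{x}^2} \\
&= \frac{\sum xy - \bar{x}\sum y + \sum \bar{x}\bar{y} - \bar{y}\sum x}{\sum x^2 - \bar{x}\sum x + \sum \bar{x}^2 - \bar{x}\sum x}
\qquad \left(\text{since } \sum z = n\bar{z} = \sum \bar{z}\right) \\
&= \frac{\sum \left[xy - \bar{x}y + \bar{x}\bar{y} - \bar{y}x\right]}{\sum \left[x^2 - 2x\bar{x} + \bar{x}^2\right]} \\
&= \frac{\sum \left[x(y - \bar{y}) - \bar{x}(y - \bar{y})\right]}{\sum (x - \bar{x})^2} \\
&= \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
\end{aligned}
\]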
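The equivalence of the raw-sum form of b and the centered form derived above can be checked numerically. The sketch below uses hypothetical data and hypothetical helper names (`slope_raw`, `slope_centered`); it is a sanity check of the algebra, not a proof.

```python
def slope_raw(xs, ys):
    """Slope b from the raw-sum formula, eq. (6.9)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx ** 2)


def slope_centered(xs, ys):
    """Slope b as sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.1, 7.9, 10.2]

b_raw = slope_raw(xs, ys)
b_centered = slope_centered(xs, ys)

# Intercept from a = ybar - b * xbar.
a = sum(ys) / len(ys) - b_centered * sum(xs) / len(xs)
print(b_raw, b_centered, a)
```

Both forms agree up to rounding. In floating-point arithmetic the centered form is often preferred when the x values are large compared with their spread, because it avoids subtracting two nearly equal large sums.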
6.4 Examples
Example 1. Find the best values of a and b so that y = a + b x fits the data given in the table:
x 1 2 3 4 5
y 14 27 40 55 68
Solution. Let the least-squares line for the given data be
\[ y = a + bx \tag{6.10} \]
The normal equations (n = 5) are then
\[
\begin{aligned}
\sum y &= na + b \sum x \\
\sum xy &= a \sum x + b \sum x^2
\end{aligned}
\tag{6.11}
\]
Consider the following table:
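\[
\begin{array}{cccc}
x & y & xy & x^2 \\ \hline
1 & 14 & 14 & 1 \\
2 & 27 & 54 & 4 \\
3 & 40 & 120 & 9 \\
4 & 55 & 220 & 16 \\
5 & 68 & 340 & 25 \\ \hline
\sum x = 15 & \sum y = 204 & \sum xy = 748 & \sum x^2 = 55
\end{array}
\]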
Eqs. (6.11) become
\[
\begin{aligned}
204 &= 5a + 15b \\
748 &= 15a + 55b
\end{aligned}
\tag{6.12}
\]
Solving eqs. (6.12), we get
\[ a = 0 \quad \text{and} \quad b = \frac{68}{5} \tag{6.13} \]
Thus, the required line is
\[ y = \frac{68}{5}\, x \]
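The arithmetic of Example 1 can be reproduced with exact rational numbers. The sketch below rebuilds the column sums from the data table and solves the two normal equations by Cramer's rule; the variable names are hypothetical, and `fractions.Fraction` is used only to avoid rounding.

```python
from fractions import Fraction

# Data from the table of Example 1.
xs = [1, 2, 3, 4, 5]
ys = [14, 27, 40, 55, 68]

n = len(xs)
sx = sum(xs)                               # sum of x
sy = sum(ys)                               # sum of y
sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x*y
sxx = sum(x * x for x in xs)               # sum of x^2

# Normal equations:  sy  = n*a  + sx*b
#                    sxy = sx*a + sxx*b
# Solved by Cramer's rule with exact fractions.
det = n * sxx - sx * sx
a = Fraction(sy * sxx - sx * sxy, det)
b = Fraction(n * sxy - sx * sy, det)

print(sx, sy, sxy, sxx)  # column sums
print(a, b)              # intercept and slope
```

Running it gives the same column sums as the table, together with a = 0 and b = 68/5.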
Example 2. Find the best values of a and b so that $y = a e^{bx}$ fits the data given in the table:
x 2 4 6 8 10
y 4.077 11.084 30.128 81.897 222.62
Solution. Given $y = a e^{bx}$, taking logarithms of both sides gives
\[ \ln y = \ln a + bx \]
Let the least-squares line for the transformed data be
\[ Y = A + bx \tag{6.14} \]
where $Y = \ln y$ and $A = \ln a$.
The normal equations (n = 5) are then
\[
\begin{aligned}
\sum Y &= nA + b \sum x \\
\sum xY &= A \sum x + b \sum x^2
\end{aligned}
\tag{6.15}
\]
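As a numerical cross-check of the transformed problem, the sketch below forms Y = ln y for the data of Example 2, solves the normal equations for A and b using the closed forms (6.8) and (6.9) with A in place of a, and recovers a = e^A. The variable names are hypothetical, and the printed values are rounded by floating-point arithmetic.

```python
import math

# Data from the table of Example 2.
xs = [2, 4, 6, 8, 10]
ys = [4.077, 11.084, 30.128, 81.897, 222.62]

# Transformed variable Y = ln y, so that Y = A + b x with A = ln a.
Ys = [math.log(y) for y in ys]

n = len(xs)
sx = sum(xs)                               # sum of x
sY = sum(Ys)                               # sum of Y
sxY = sum(x * Y for x, Y in zip(xs, Ys))   # sum of x*Y
sxx = sum(x * x for x in xs)               # sum of x^2

# Solve the two normal equations for A and b.
det = n * sxx - sx ** 2
A = (sY * sxx - sx * sxY) / det
b = (n * sxY - sx * sY) / det

# Undo the transformation to recover a.
a = math.exp(A)
print(a, b)
```

Note that this minimizes the squared deviations in ln y, not in y itself, which is the sense in which the transformed fit is "best".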
Consider the following table: