Chapter 2
Solution of equations in one variable
Contents
2.1 Bisection method
2.2 Fixed point iteration
2.3 Newton's method (or Newton-Raphson method)
2.4 Accelerating convergence
One of the most important problems in mathematics is the root-finding problem.
Definition 2.1. A number $x$ satisfying $f(x) = 0$ for a given function $f$ is called a root (or a solution) of the equation, or a zero of the function $f$.
2.1 Bisection method
• Suppose that $f\colon [a,b] \to \mathbb{R}$ is continuous and $f(a)f(b) < 0$.
• By the intermediate value theorem,
$$\exists\, p \in (a,b) : \quad f(p) = 0.$$
• We set
$$a_1 = a, \qquad b_1 = b, \qquad p_1 = \frac{a_1 + b_1}{2}.$$
• The bisection algorithm proceeds as follows:
Algorithm 2.1.
• Suppose $a_n$, $b_n$, and $p_n = \dfrac{a_n + b_n}{2}$ are determined with
$$f(a_n)f(b_n) < 0 \qquad (n \ge 1).$$
• If $f(p_n) = 0$, then $p = p_n$ and the algorithm terminates. In this case, we set $p_k = p_n$ for all $k \ge n$.
• Otherwise, there are precisely two possibilities:
• $f(a_n)f(p_n) < 0$: in this case, set
$$a_{n+1} = a_n, \qquad b_{n+1} = p_n, \qquad p_{n+1} = \frac{a_{n+1} + b_{n+1}}{2}.$$
• $f(p_n)f(b_n) < 0$: in this case, set
$$a_{n+1} = p_n, \qquad b_{n+1} = b_n, \qquad p_{n+1} = \frac{a_{n+1} + b_{n+1}}{2}.$$
Theorem 2.1 (global convergence of bisection method). Suppose $f \in C[a,b]$ and $f(a)f(b) < 0$. The bisection method generates a sequence $\{p_n\}_{n=1}^{\infty}$ approximating a zero $p$ of $f$ in $[a,b]$ with
$$|p_n - p| \le \frac{b - a}{2^n}, \qquad n \ge 1.$$
Proof. For $n \ge 1$, we have
$$b_n - a_n = \frac{b - a}{2^{n-1}} \quad\text{and}\quad p \in (a_n, b_n).$$
Since $p_n = \dfrac{a_n + b_n}{2}$ is the midpoint of $[a_n, b_n]$, we thus get
$$|p_n - p| \le \frac{b_n - a_n}{2} = \frac{b - a}{2^n}. \qquad\blacksquare$$
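The scheme above can be sketched in Python, using the error bound of Theorem 2.1 to decide how many iterations guarantee a given tolerance (the function name and tolerance here are illustrative, not from the text):

```python
import math

def bisect(f, a, b, tol=1e-10):
    """Bisection method: assumes f is continuous on [a, b] with f(a)f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # By Theorem 2.1, n iterations guarantee |p_n - p| <= (b - a) / 2^n.
    n_max = math.ceil(math.log2((b - a) / tol))
    for _ in range(n_max):
        p = (a + b) / 2
        if f(p) == 0:            # exact root found
            return p
        if f(a) * f(p) < 0:      # root lies in [a, p]
            b = p
        else:                    # root lies in [p, b]
            a = p
    return (a + b) / 2

# Example: the root of x^3 + 4x^2 - 10 on [1, 2] (approximately 1.36523).
root = bisect(lambda x: x**3 + 4 * x**2 - 10, 1, 2)
```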
2.2 Fixed point iteration
Definition 2.2. A point p is a fixed point of a function g if g(p) = p.
Remark 2.1. Fixed point problems and root finding problems are equivalent in the following sense:
(i) If $p$ is a fixed point of the function $g$, then $p$ is a zero of the function $f(x) = g(x) - x$ or $f(x) = x - g(x)$.
(ii) If $p$ is a root of the function $f$, then we can construct functions $g$ with $p$ as their fixed point as follows:
$$g(x) = x + h(x)\, f(x),$$
where $h$ is any function defined at $p$.
Theorem 2.2 (a fixed point theorem). Suppose $g\colon [a,b] \to [a,b]$ is continuous. Then $g$ has a fixed point.
Proof. If $g(a) = a$ or $g(b) = b$, there is nothing to prove. Otherwise
$$f\colon [a,b] \to \mathbb{R}, \quad x \mapsto x - g(x)$$
is continuous and
$$f(a) = a - g(a) < 0 \quad\text{and}\quad f(b) = b - g(b) > 0,$$
so that $f(a)f(b) < 0$. Thus, by the intermediate value theorem, there exists $p \in (a,b)$ such that
$$f(p) = p - g(p) = 0. \qquad\blacksquare$$
Definition 2.3. Let $U$ be a subset of a metric space $(X,d)$ (e.g. $X = \mathbb{R}$). A function $g\colon U \to X$ is called Lipschitz continuous if there exists a constant $\lambda \ge 0$ such that
$$\forall x, y \in U: \quad d(g(x), g(y)) \le \lambda\, d(x, y).$$
Here $\lambda$ is called the Lipschitz constant. If $\lambda \in [0,1)$, $g$ is called a contraction with contraction constant $\lambda$.
Remark 2.2. Every Lipschitz function is continuous.
Lemma 2.1. Every contraction mapping has at most one fixed point.
Proof. If $p$ and $q$ are two fixed points of the contraction $g$, then
$$d(p, q) = d(g(p), g(q)) \le \lambda\, d(p, q)$$
with $\lambda < 1$ implies $d(p,q) = 0$, that is, $p = q$. $\blacksquare$
Theorem 2.3 (Banach fixed point theorem or contraction mapping theorem). Let $U$ be a complete subset of a metric space $(X,d)$, and let $g\colon U \to U$ be a contraction with contraction constant $\lambda$. Then:
(i) $g$ has a unique fixed point.
(ii) For any $x_0 \in U$, the sequence $\{x_n\}$ defined by
$$x_n = g(x_{n-1}), \qquad n \ge 1,$$
converges to this unique fixed point, say $p$. Moreover, we have the a priori error estimate
$$d(x_n, p) \le \frac{\lambda^n}{1 - \lambda}\, d(x_1, x_0) \tag{2.1}$$
and the a posteriori error estimate
$$d(x_n, p) \le \frac{\lambda}{1 - \lambda}\, d(x_n, x_{n-1}) \tag{2.2}$$
for all $n \ge 1$.
Proof. We have
$$d(x_{n+1}, x_n) = d(g(x_n), g(x_{n-1})) \le \lambda\, d(x_n, x_{n-1}),$$
so that by induction
$$d(x_{n+1}, x_n) \le \lambda^n\, d(x_1, x_0), \qquad n = 1, 2, \dots.$$
Hence, for $m > n$, we have
$$d(x_m, x_n) \le d(x_m, x_{m-1}) + \cdots + d(x_{n+1}, x_n) \le \left(\lambda^{m-1} + \cdots + \lambda^n\right) d(x_1, x_0) = \lambda^n\, \frac{1 - \lambda^{m-n}}{1 - \lambda}\, d(x_1, x_0) \le \frac{\lambda^n}{1 - \lambda}\, d(x_1, x_0).$$
Since $\lambda \in [0,1)$, this shows $\{x_n\}$ is a Cauchy sequence in $U$. Since $U$ is complete, $\{x_n\}$ converges to a point $p \in U$; by the continuity of $g$, $p = \lim_n g(x_{n-1}) = g(p)$, so $p$ is a fixed point, and it is unique by Lemma 2.1. Letting $m \to \infty$ in the preceding estimate yields (2.1); and (2.2) follows from (2.1) (with $n = 1$) applied to the sequence $\{y_n\}$ initiating from $y_0 = x_{n-1}$. $\blacksquare$
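As a numerical illustration (not from the text), the estimates (2.1) and (2.2) can be checked for the contraction $g(x) = \cos x$ on $[0,1]$, whose contraction constant is $\lambda = \sin 1 \approx 0.8415$:

```python
import math

g = math.cos              # maps [0, 1] into [cos 1, 1], a subset of [0, 1]
lam = math.sin(1.0)       # sup of |g'(x)| = |sin x| on (0, 1) is sin 1 < 1

# Fixed point iteration x_n = g(x_{n-1}) from x0 = 1.
xs = [1.0]
for _ in range(60):
    xs.append(g(xs[-1]))
p = xs[-1]                # highly converged iterate, used here as "exact" p

# Both error estimates of Theorem 2.3 hold along the iteration.
for n in range(1, 20):
    err = abs(xs[n] - p)
    a_priori = lam**n / (1 - lam) * abs(xs[1] - xs[0])        # estimate (2.1)
    a_posteriori = lam / (1 - lam) * abs(xs[n] - xs[n - 1])   # estimate (2.2)
    assert err <= a_priori + 1e-12 and err <= a_posteriori + 1e-12
```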
CHAPTER 2. SOLUTION OF EQUATIONS IN ONE VARIABLE 8
Example 2.1. The function $f\colon [0, \infty) \to [0, \infty)$ given by
$$f(x) = x + \frac{1}{1 + x}$$
fulfills the condition
$$|f(x) - f(y)| < |x - y| \qquad \text{for } x \ne y,$$
as a consequence of
$$f(x) - f(y) = \left(x + \frac{1}{1+x}\right) - \left(y + \frac{1}{1+y}\right) = (x - y) - \frac{x - y}{(1+x)(1+y)} = \frac{x + y + xy}{1 + x + y + xy}\,(x - y),$$
where the factor in front of $(x - y)$ lies in $[0,1)$ for $x, y \ge 0$. However, because of
$$f(x) - x = \frac{1}{1+x} > 0 \qquad \text{for } x \ge 0,$$
it does not have a fixed point. (Note that $f$ is not a contraction: no single constant $\lambda < 1$ works for all $x, y$, so Theorem 2.3 does not apply.)
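A quick numerical check of this example (an illustration, not part of the text): distances between iterates shrink strictly, yet the iteration drifts off to infinity, so no fixed point exists:

```python
def f(x):
    return x + 1 / (1 + x)

# Consecutive images are strictly closer than the originals ...
x, y = 0.0, 5.0
for _ in range(100):
    assert abs(f(x) - f(y)) < abs(x - y)
    x, y = f(x), f(y)

# ... yet the sequence x_{n+1} = f(x_n) grows without bound (roughly like sqrt(2n)),
# so it cannot converge to a fixed point.
x = 0.0
for _ in range(10_000):
    x = f(x)
assert x > 100
```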
Corollary 2.1. Suppose $g\colon [a,b] \to [a,b]$ is continuous on $[a,b]$, differentiable on $(a,b)$, and
$$\lambda := \sup_{x \in (a,b)} |g'(x)| < 1.$$
Then $g$ is a contraction.
Proof. By the mean value theorem, for $x, y \in [a,b]$,
$$g(x) - g(y) = g'(\xi)\,(x - y)$$
for some $\xi$ between $x$ and $y$. Thus
$$|g(x) - g(y)| \le \lambda\,|x - y|. \qquad\blacksquare$$
Algorithm 2.2 (the fixed point algorithm or the method of successive approximations). If $g$ has the fixed point $p$, then the fixed point algorithm generates the sequence $\{x_n\}$ defined by
$$x_0:\ \text{arbitrary}, \qquad x_n = g(x_{n-1}), \quad n = 1, 2, \dots,$$
to approximate $p$.
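A minimal sketch of the method of successive approximations, stopping when consecutive iterates are close (in the spirit of the a posteriori estimate (2.2); the function name and tolerance are illustrative):

```python
def fixed_point(g, x0, tol=1e-12, max_iter=1000):
    """Successive approximations x_n = g(x_{n-1}) until iterates stop moving."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:   # small step => small error, cf. (2.2)
            return x_next
        x = x_next
    return x

# Example: g(x) = (x + 2/x) / 2 has fixed point sqrt(2) (Heron's method).
p = fixed_point(lambda x: (x + 2 / x) / 2, x0=1.0)
```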
2.3 Newton’s method (or Newton-Raphson method)
Turning back to the root finding problem $f(x) = 0$, suppose that $f$ is a differentiable function on a neighborhood of its root. We start with an initial guess $x_0$. In Newton's method, once the $n$th iterate is computed, $x_{n+1}$ is determined by requiring that the tangent line to the graph of $f$ at $(x_n, f(x_n))$ pass through the point $(x_{n+1}, 0)$. Thus $x_{n+1}$ must satisfy the equation
$$\frac{0 - f(x_n)}{x_{n+1} - x_n} = f'(x_n),$$
or equivalently
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad n = 0, 1, \dots. \tag{2.3}$$
For a given point x0 , the sequence {xn } defined by the recursion (2.3) is called the Newton iteration or
Newton-Raphson iteration for the approximation of the root of f .
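The recursion (2.3) can be sketched in Python (the function names, tolerance, and iteration cap are illustrative choices):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration (2.3): x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x = x - fx / fprime(x)   # where the tangent at (x, f(x)) meets the x-axis
    return x

# Example: sqrt(2) as the positive root of f(x) = x^2 - 2.
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
```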
Theorem 2.4 (local convergence of Newton's method). Let $I$ be an open interval and $p \in I$. Suppose that $f \in C^2(I)$, $f(p) = 0$, and $f'(p) \ne 0$. Then there exist $C, \delta > 0$ such that for every $x_0 \in [p - \delta, p + \delta]$ the sequence $\{x_n\}$ given by
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad n = 0, 1, 2, \dots,$$
is well defined and converges to $p$; moreover,
$$|x_{n+1} - p| \le C\,|x_n - p|^2, \qquad n = 0, 1, 2, \dots.$$
Proof. Consider the "function"
$$g(x) = x - \frac{f(x)}{f'(x)}.$$
Since $f'(p) \ne 0$ and $f'$ is continuous, there exists $\delta_1 > 0$ such that $g$ is well defined on $[p - \delta_1, p + \delta_1]$. Note that
$$g(p) = p - \frac{f(p)}{f'(p)} = p$$
and
$$g'(x) = 1 - \frac{[f'(x)]^2 - f(x) f''(x)}{[f'(x)]^2} = \frac{f(x) f''(x)}{[f'(x)]^2}, \qquad x \in [p - \delta_1, p + \delta_1],$$
so that
$$g'(p) = 0.$$
This, in turn, implies that
$$\exists\, \lambda \in (0,1) \text{ and } 0 < \delta \le \delta_1: \quad x \in [p - \delta, p + \delta] \implies |g'(x)| \le \lambda.$$
For $x \in [p - \delta, p + \delta]$, we have by the mean value theorem
$$g(x) = g(p) + g'(\xi_x)(x - p) = p + g'(\xi_x)(x - p)$$
for some $\xi_x$ between $x$ and $p$. Thus,
$$|g(x) - p| = |g'(\xi_x)|\,|x - p| \le \lambda\,|x - p|.$$
The consequence of this is twofold:
(i) $g$ maps $[p - \delta, p + \delta]$ into itself.
(ii) $g$ is a contraction (with contraction constant $\lambda$) on this interval.
Thus, by the Banach fixed point theorem (since $g(p) = p$ as well), $p$ is the unique fixed point of $g$ on this interval, and for any $x_0 \in [p - \delta, p + \delta]$ the sequence $\{x_n\}$ given by
$$x_{n+1} = g(x_n) = x_n - \frac{f(x_n)}{f'(x_n)}$$
(which is nothing but the Newton-Raphson iteration) converges to $p$.
Now, by Taylor's theorem,
$$f(x) = f(p) + f'(p)(x - p) + \frac{f''(\xi_x)}{2}(x - p)^2 = f'(p)(x - p) + \frac{f''(\xi_x)}{2}(x - p)^2$$
for some $\xi_x$ between $x$ and $p$, and
$$f'(x) = f'(p) + f''(\eta_x)(x - p)$$
for some $\eta_x$ between $x$ and $p$. It follows that
$$x_{n+1} - p = x_n - p - \frac{f(x_n)}{f'(x_n)} = \frac{f'(x_n)(x_n - p) - f(x_n)}{f'(x_n)} = \frac{2 f''(\eta_{x_n}) - f''(\xi_{x_n})}{2 f'(x_n)}\,(x_n - p)^2,$$
and since $f''$ is bounded and $|f'|$ is bounded away from zero on $[p - \delta, p + \delta]$, we obtain
$$|x_{n+1} - p| \le C\,|x_n - p|^2. \qquad\blacksquare$$
2.4 Accelerating convergence
Theorem 2.5 (Aitken's $\Delta^2$ method). Suppose that
(i) $\{x_n\}$ is a sequence with $x_n \ne p$ for all $n$.
(ii) There exist a constant $c$ and a sequence $\{\delta_n\}$ such that
• $|c| \ne 1$,
• $\lim_{n \to \infty} \delta_n = 0$,
• $x_{n+1} - p = (c + \delta_n)(x_n - p)$ for all $n$.
Then:
(i) $\{x_n\}$ converges to $p$ iff $|c| < 1$.
(ii) If $|c| < 1$, then
$$y_n = \frac{x_{n+2}\, x_n - x_{n+1}^2}{x_{n+2} - 2 x_{n+1} + x_n}$$
is well defined for all sufficiently large $n$. Moreover, $\{y_n\}$ converges to $p$ faster than $\{x_n\}$ in the sense that
$$\lim_{n \to \infty} \frac{y_n - p}{x_n - p} = 0.$$
Proof. Let $e_n = x_n - p$. Then we compute
$$y_n - p = \frac{(e_{n+2} + p)(e_n + p) - (e_{n+1} + p)^2}{(e_{n+2} + p) - 2(e_{n+1} + p) + (e_n + p)} - p = \frac{e_{n+2}\, e_n - e_{n+1}^2}{e_{n+2} - 2 e_{n+1} + e_n}.$$
Since $e_{n+1} = x_{n+1} - p = (c + \delta_n)(x_n - p) = (c + \delta_n)\, e_n$ for all $n \ge 0$, we get that
$$y_n - p = \frac{(c + \delta_{n+1})(c + \delta_n) - (c + \delta_n)^2}{(c + \delta_{n+1})(c + \delta_n) - 2(c + \delta_n) + 1}\,(x_n - p) = \frac{(c + \delta_n)(\delta_{n+1} - \delta_n)}{(c - 1)^2 + c(\delta_{n+1} + \delta_n) + \delta_n(\delta_{n+1} - 2)}\,(x_n - p).$$
As $n \to \infty$, the numerator of the last fraction tends to $0$ while its denominator tends to $(c - 1)^2 \ne 0$; hence $y_n$ is well defined for all sufficiently large $n$ and $(y_n - p)/(x_n - p) \to 0$.
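The transform $y_n$ of Theorem 2.5 can be sketched in Python and applied to a linearly convergent sequence; the iteration $x_{n+1} = \cos x_n$ used below is an illustrative choice, not from the text:

```python
import math

def aitken(xs):
    """Aitken's Delta^2 transform: y_n = (x_{n+2} x_n - x_{n+1}^2) / (x_{n+2} - 2 x_{n+1} + x_n)."""
    ys = []
    for x0, x1, x2 in zip(xs, xs[1:], xs[2:]):
        ys.append((x2 * x0 - x1 * x1) / (x2 - 2 * x1 + x0))
    return ys

# Linearly convergent sequence x_{n+1} = cos(x_n), limit p ~ 0.7390851332.
xs = [1.0]
for _ in range(12):
    xs.append(math.cos(xs[-1]))
ys = aitken(xs)

# The accelerated value y_10 is much closer to p than the iterate x_10 it was built from.
p = 0.7390851332151607
assert abs(ys[-1] - p) < abs(xs[-3] - p)
```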