Real Analysis
– in a nutshell –
This book was created and used for the lecture Mathematical Analysis at Hamburg
University of Technology in the summer term 2019 for General Engineering Science and
Computer Science students.

                                                  Julian P. Großmann
                                                  Hamburg University of Technology
                                                  julian.grossmann@tuhh.de
Contents

2 Infinite Series
  2.1 Basic Definitions, Convergence Criteria and Examples
  2.2 Absolute Convergence and Criteria
  2.3 The Cauchy Product of Series
3 Continuous Functions
  3.1 Bounded Functions, Pointwise and Uniform Convergence
  3.2 Continuity
4 Elementary Functions
  4.1 Exponential Function
  4.2 Logarithm
  4.3 Hyperbolic and Trigonometric Functions
      4.3.1 Hyperbolic Functions
      4.3.2 Area Functions
      4.3.3 Trigonometric Functions
  4.4 Arcus Functions
  4.5 Polynomials and Rational Functions
      4.5.1 Polynomials
      4.5.2 Rational Functions
  4.6 Power Series
5 Differentiation of Functions
  5.1 Differentiability and Derivatives
  5.2 Mean Value Theorems and Consequences
  5.3 Higher Derivatives, Curve Sketching
  5.4 Taylor’s Formula
Index
                                                                                    Some words
This text should help you to understand the course Real Analysis. It goes hand in hand
with a video course you can find on YouTube. Throughout, you will find links and QR
codes that take you to the corresponding videos.
To expand your knowledge even more, you can look into the following books:
      •   Jonathan Lewin: An Interactive Introduction to Mathematical Analysis,
      •   A. N. Kolmogorov: Introductory Real Analysis,
      •   Claudio Canuto, Anita Tabacco: Mathematical Analysis I,
      •   Vladimir A. Zorich: Mathematical Analysis I.
Real Analysis is also known as calculus with real numbers. It is needed for a lot of other
topics in mathematics and is the foundation of every new career in mathematics or in
fields that use mathematics as a tool. We¹ will discuss simple examples later. Some of its
central topics are limits, continuity, derivatives and integrals. In order to describe these
things, we need a good understanding of the real numbers. They form the foundation of
a real analysis course.
For this reason, the first step in a Real Analysis course is to define the real number line.
After this, we will be able to work with these numbers and understand the field as a whole.
Of course, this is not an easy task and it will be a hiking tour that we will do together. The
 ¹ In mathematical texts, usually, the first-person plural is used even if there is only one author. Most of
   the time it simply means “we” = “I (the author) and the reader”.
summit and goal is to understand why working with real numbers is indeed a meaningful
mathematical theory.
We start in the valley of mathematics and will shortly scale the first hills. Always stay
in shape, practise and don’t hesitate to ask about the ways up. It is not an easy trip but
you can do it. Maybe the following tips can guide you:
    • You will need a lot of time for this course if you really want to understand everything
      you learn. Hence, make sure that you have enough time each week to do mathematics
      and keep these time slots clear of everything else.
    • Work in groups, solve problems together and discuss your solutions. Learning math-
      ematics is not a competition.
    • Explain the content of the lectures to your fellow students. You have only really
      understood something when you can illustrate and explain it to others.
    • Learn the Greek letters that we use in mathematics:
       α alpha      β beta       γ gamma     Γ Gamma
       δ delta      ϵ epsilon    ε epsilon   ζ zeta
       η eta        θ theta      Θ Theta     ϑ theta
       ι iota       κ kappa      λ lambda    Λ Lambda
       µ mu         ν nu         ξ xi        Ξ Xi
       π pi         Π Pi         ρ rho       σ sigma
       Σ Sigma      τ tau        υ upsilon   Υ Upsilon
       φ phi        Φ Phi        ϕ phi       χ chi
       ψ psi        Ψ Psi        ω omega     Ω Omega
    • Choosing a book is a matter of taste. Look into different ones and choose the book
      that really convinces you.
   • Keep interested, fascinated and eager to learn. However, do not expect to under-
     stand everything at once.
DON’T PANIC                                                                   J.P.G.
1 Sequences and Limits
                          I’m a gym member. I try to go four times a week, but I’ve missed the
                          last twelve hundred times.
                                                                                  Chandler Bing
Before we start with the Real Analysis course, we need to lay down some foundations. You
only need some knowledge about working with sets and maps to get started. From this,
we will briefly introduce all the number sets we need in this course; in particular, we
quickly arrive at the real numbers R that we work with throughout this course.
However, if you are interested in a more detailed discussion, I can recommend my video
series about the foundations of mathematics:
   Video: Start Learning Mathematics
In order to construct the real number line, we need to generalise the equality sign. We
get a more abstract notion that we can use for sets to put similar elements into the same
box. In the end, we want to calculate with these boxes.
It turns out that we just need three properties from the equality sign to get the general
concept of an equivalence relation.
By having this, we can now put equivalent elements into the corresponding boxes. These
boxes are called equivalence classes.
    Proposition & Definition 1.2. Equivalence classes
    An equivalence relation ∼ on X gives a partition of X into disjoint subsets. For all
    a ∈ X, we define
                                 [a]∼ := {x ∈ X : x ∼ a}
    and call it an equivalence class. We have the disjoint union:
                                 X = ⋃a∈X [a]∼ .
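This partition can be illustrated with a short Python sketch (not part of the original notes); the concrete relation, congruence modulo 3 on a small set of integers, is just an illustrative choice of an equivalence relation:

```python
# Illustration: partition a finite set X into the equivalence classes
# [a]~ = {x in X : x ~ a}.  The relation "x ~ y iff 3 divides x - y"
# is one arbitrary example of an equivalence relation.

def equivalent(x, y):
    """x ~ y  :<=>  x - y is divisible by 3 (reflexive, symmetric, transitive)."""
    return (x - y) % 3 == 0

def equivalence_classes(X, rel):
    """Group the elements of X into disjoint classes with respect to rel."""
    classes = []
    for x in X:
        for cls in classes:
            if rel(x, cls[0]):      # x is equivalent to this class's representative
                cls.append(x)
                break
        else:                       # no matching class: x starts a new one
            classes.append([x])
    return classes

classes = equivalence_classes(range(10), equivalent)
print(classes)   # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

The classes are pairwise disjoint and their union gives back all of X, exactly as the proposition states.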
In the same way as we generalised the equality sign, we can also generalise the greater or
equal sign you might have seen often for numbers. It turns out that we just need some
defining properties there to get an abstract notion of such an ordering.
    Definition 1.3. Ordering
    Let X be a set. Let R≤ ⊂ X × X be a relation on X where we write x ≤ y if
    (x, y) ∈ R≤ . A relation R≤ is called a partial order if it satisfies the following:
     (i) reflexive: x ≤ x for all x ∈ X;
     (ii) antisymmetric: x ≤ y and y ≤ x imply x = y;
     (iii) transitive: x ≤ y and y ≤ z imply x ≤ z.
If, in addition, all elements are comparable, i.e. for all x, y ∈ X we have x ≤ y or y ≤ x,
then the relation is called a total order.
    Remark: Notation
    If one has an ordering relation ≤, one usually also defines the following symbols:
                               x ≥ y : ⇐⇒ y ≤ x
                               x < y : ⇐⇒ x ≤ y and x ≠ y
1.1 Just Numbers
For example, the following pairs are all equivalent to each other (each of them represents
the integer −5) with respect to the relation
                           (a, b) ∼ (c, d) : ⇐⇒ a + d = b + c :
                (0, 5) ∼ (1, 6) ∼ (101, 106) ∼ (56, 61) ∼ (77, 82) ∼ (91, 96).
   The addition defined on the equivalence classes by [(a, b)] + [(c, d)] := [(a + c, b + d)]
   is well-defined, associative and commutative and [(0, 0)] defines the neutral element.
   The set of all equivalence classes with this new addition is called the integers and
   denoted by Z.
    (A) Addition
       (A1) associative: x + (y + z) = (x + y) + z
       (A2) neutral element: There is a (unique) element 0 with x + 0 = x for all x.
       (A3) inverse element: For all x there is a (unique) y with x + y = 0. We write
            for this element simply −x.
       (A4) commutative: x + y = y + x
    (M) Multiplication
      (M1) associative: x · (y · z) = (x · y) · z
      (M2) neutral element: There is a (unique) element 1 with 1 · x = x for all x.
      (M3) commutative: x · y = y · x
(D) Distributivity: x · (y + z) = x · y + x · z.
Similarly, the following pairs are all equivalent to each other (each of them represents the
rational number 1/5) with respect to the relation
                           (a, b) ∼ (c, d) : ⇐⇒ a · d = b · c :
                (1, 5) ∼ (2, 10) ∼ (101, 505) ∼ (50, 250) ∼ (11, 55) ∼ (500, 2500).
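As a quick numerical sanity check (an illustration, not from the notes), one can verify that the listed pairs are pairwise equivalent under this relation:

```python
# All pairs below represent the rational number 1/5, i.e. they are
# pairwise equivalent under (a, b) ~ (c, d)  :<=>  a * d = b * c.

pairs = [(1, 5), (2, 10), (101, 505), (50, 250), (11, 55), (500, 2500)]

def equivalent(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c

all_equivalent = all(equivalent(p, q) for p in pairs for q in pairs)
print(all_equivalent)   # True
```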
   (A) Addition
      (A1) associative: x + (y + z) = (x + y) + z
      (A2) neutral element: There is a (unique) element 0 with x + 0 = x for all x.
      (A3) inverse element: For all x there is a (unique) y with x + y = 0. We write
           for this element simply −x.
      (A4) commutative: x + y = y + x
   (M) Multiplication
      (M1) associative: x · (y · z) = (x · y) · z
      (M2) neutral element: There is a (unique) element 1 ≠ 0 with x · 1 = x for all x.
      (M3) inverse element: For all x ≠ 0 there is a (unique) y with x · y = 1. We
           write for this element simply x⁻¹.
      (M4) commutative: x · y = y · x
(D) Distributivity: x · (y + z) = x · y + x · z.
   (O) Ordering
      (O1) x ≤ x is true for all x.
      (O2) If x ≤ y and y ≤ x, then x = y.
      (O3) transitive: x ≤ y and y ≤ z imply x ≤ z.
      (O4) For all x, y ∈ X, we have either x ≤ y or y ≤ x.
      (O5) x ≤ y implies x + z ≤ y + z for all z.
      (O6) x ≤ y implies x · z ≤ y · z for all z ≥ 0.
      (O7) x > 0 and ε > 0 implies x < ε + · · · + ε for sufficiently many summands.
   (C) Let X, Y ⊂ R be two non-empty subsets with the property x ≤ y for all x ∈ X
       and y ∈ Y . Then there is a c ∈ R with x ≤ c ≤ y for all x ∈ X and y ∈ Y .
   Remark:
   We will later reformulate the completeness axiom with the help of sequences. It then
   reads:
   Completeness: Every sequence (an)n∈N with the property [For all ε > 0 there is an
   N ∈ N with |an − am| < ε for all n, m > N] has a limit.
  Exercise 1.9.
  Use the axioms to show:
   (1) : 0 · x = 0
(2) : −x = (−1) · x
(4) : 1 > 0
    Definition 1.12.
    Let M be a set. A sequence in M is a map a : N → M or a : N0 → M .
    Remark:
    Usually, M is a subset of the real numbers (M ⊂ R), but M can also be a subset
    of the complex numbers (M ⊂ C) or of some normed space (or M = R, M = C, or
    M a normed space itself).
Example 1.13. (a) an = (−1)ⁿ, then (an)n∈N = ((−1)ⁿ)n∈N = (−1, 1, −1, 1, −1, 1, . . .);
(b) an = 1/n, then (an)n∈N = (1/n)n∈N = (1, 1/2, 1/3, 1/4, 1/5, 1/6, . . .);
(c) an = iⁿ (i is the imaginary unit), then (an)n∈N = (iⁿ)n∈N = (i, −1, −i, 1, i, −1, . . .);
(d) an = 1/2ⁿ, then (an)n∈N = (1/2ⁿ)n∈N = (1, 1/2, 1/4, 1/8, 1/16, 1/32, . . .);
(e) Approximation of π:
    Consider a circle with radius r = 1.
     The area is given by A = πr2 = π.
     Now: Approximation of the circle by a regular polygon (all edges have equal length):
     Area of a “piece of cake”: Aᶜn = sin(π/n) cos(π/n) = (1/2) sin(2π/n) (the latter
     equation holds true due to the general equality sin(2x) = 2 sin(x) cos(x) (will be
     treated later)).
     The area of the polygon is therefore given by
                                An = n · Aᶜn = (n/2) · sin(2π/n).
     Now consider the sequence (An )n∈N . Some values for An are listed in the following
     table:
                     n            An                 π − An
                     3            1.299038           1.84255
                     6            2.598076           0.54351
                     12           3.000000           0.14159
                     3072         3.14159046         0.00000219
                     50331648     3.141592653589     8.16 · 10⁻¹⁶
     The sequence (An)n∈N is a sequence of approximations of π. The method is “good”
     if (An)n∈N “converges”.
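The table above can be reproduced with a few lines of Python (a sketch, not part of the notes; it uses only the formula An = (n/2) · sin(2π/n) derived above):

```python
import math

# Area of the regular n-gon inscribed in the unit circle, as derived above:
# A_n = (n/2) * sin(2*pi/n).
def A(n):
    return (n / 2) * math.sin(2 * math.pi / n)

for n in [3, 6, 12, 3072, 50331648]:
    print(f"n = {n:>9}:  A_n = {A(n):.12f},  pi - A_n = {math.pi - A(n):.3e}")
```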
Since most (but not all) results stay the same in the real and complex case, we make the
following definition:
   Definition 1.14.
   The symbol F stands for either the real or complex numbers, i.e., F ∈ {R, C}.
      – (an )n∈N is convergent to a ∈ F if for all ε > 0 there exists some N = N (ε) ∈ N
        such that for all n ≥ N holds |an − a| < ε. In this case, we write
                                               lim an = a.
                                              n→∞
      – (an )n∈N is divergent if it is not convergent, i.e., for all a ∈ F holds: There
        exists some ε > 0 such that for all N there exists some n > N with |an −a| ≥ ε.
Convergence for real sequences means that if you give any small distance ε, one finds that
all sequence members an lie in the interval (a − ε, a + ε) with the exception of only finitely
many.
Example 1.16.     • Show: (an)n∈N with an = 1/n is convergent with limit 0.
   • Show: (bn)n∈N with bn = 1/√n is convergent with limit 0.
     For a given ε > 0, choose N ∈ N with N > 1/ε². Then for all n ≥ N:
                                |bn − 0| = 1/√n ≤ 1/√N < ε.
       This means bn is arbitrarily close to 0, eventually.
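The ε-N definition can also be explored numerically. The following Python sketch (an illustration, not part of the notes) finds, for a given ε, the first index beyond which the terms of an = 1/n and bn = 1/√n stay below ε; this simple scan is valid here because both sequences are decreasing:

```python
import math

# For a given eps, find the first index N after which all terms stay below eps.
# This scan works here because both sequences are monotonically decreasing.
def N_for(eps, seq):
    n = 1
    while seq(n) >= eps:
        n += 1
    return n

a = lambda n: 1 / n               # |a_n - 0| < eps  once  n > 1/eps
b = lambda n: 1 / math.sqrt(n)    # |b_n - 0| < eps  once  n > 1/eps**2

eps = 0.01
print(N_for(eps, a), N_for(eps, b))   # 101 10001
```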
   Remark:
   (a) It can be shown that for a complex sequence (an )n∈N , convergence to a ∈ C holds
       true if, and only if, (Re(an ))n∈N converges to Re(a) and (Im(an ))n∈N converges
        to Im(a).
   (b) In fact, convergence can also be defined for sequences in some arbitrary normed
          vector space V (for the definition of a normed space, e.g. consult the linear
          algebra script). Then one has to replace the absolute value by the norm (e.g.,
          “‖an − a‖ < ε”).
      (c) Due to the fact that for any x ∈ R, we can find some N ∈ N with N > x, we
          can equivalently reformulate the convergence definition as follows: “(an)n∈N is
          convergent to a ∈ R if for all ε > 0 there exists some N = N (ε) ∈ R such that
          for all n ≥ N holds |an − a| < ε”. In the following, we just write “there exists
          some N ”.
Outside any ε-neighbourhood of a only finitely many elements of the sequence exist.
                                    |(−1)ⁿ − a| < 1/10.
       This is a contradiction.                                                        □
(c) For q ∈ C\{0} with |q| < 1, the complex sequence (qⁿ)n∈N converges to 0.
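Statement (c) is easy to observe numerically; in the following sketch (not from the notes) the concrete value q = 0.8i is an arbitrary illustrative choice with |q| = 0.8 < 1:

```python
# For q in C \ {0} with |q| < 1, the sequence q**n tends to 0, since
# |q**n| = |q|**n is a real geometric sequence with ratio |q| < 1.
q = 0.8j                       # illustrative choice: |q| = 0.8 < 1

for n in [1, 10, 50, 100]:
    print(n, abs(q ** n))      # the moduli decrease geometrically towards 0

assert abs(q ** 100) < 1e-9    # |q|**100 = 0.8**100 is of order 1e-10
```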
   Remark:
   The choice of the N often seems “to appear from nowhere”. However, there is
   a systematic way to formulate the proof. For instance in a), we need to end up with
   the inequality |1/n − 0| < ε or, equivalently, 1/n < ε. Inverting this expression leads
   to n > 1/ε. Therefore, if N is chosen as N = 1/ε + 1 = (1 + ε)/ε, the desired
   statement follows.
   If one has to formulate such a proof (for instance, in some exercise), then the above
   calculations should first be done “on some extra sheet”, and the convergence proof
   then be formulated in the style of a) or c).
– bounded if there exists some c ∈ R such that for all n ∈ N holds |an | ≤ c;
   Theorem 1.19.
  Let (an )n∈N be a convergent sequence in F. Then (an )n∈N is bounded.
Proof. Suppose that limn→∞ an = a. Take ε = 1. Then there exists some N such that for
all n ≥ N holds |an − a| < 1. Thus, for all n ≥ N holds
                       |an| = |an − a + a| ≤ |an − a| + |a| < 1 + |a|.
Now choose
                             c = max{|a1 |, |a2 |, . . . , |aN −1 |, |a| + 1}
and consider some arbitrary sequence element ak .
If k < N , then |ak | ≤ max{|a1 |, |a2 |, . . . , |aN −1 |} ≤ c.
In the case k ≥ N , the above calculations lead to |ak | < |a| + 1 ≤ c.
Altogether, this implies that |ak | ≤ c for all k ∈ N, so (an )n∈N is bounded by c.
     Remark:
     For a convergent sequence (an )n∈N that is bounded by c, we can also deduce from
     the above argumentation that for the limit a holds |a| ≤ c:
     Suppose that limn→∞ an = a ∈ F, then for an arbitrary ε > 0 there exists an N ∈ N
     such that |a−an | < ε for all n ≥ N . Hence |a| = |a−an +an | ≤ |a−an |+|an | ≤ ε+c.
     Since ε > 0 can be chosen arbitrarily small this implies |a| ≤ c.
      (iii) If limn→∞ bn ≠ 0 and bn ≠ 0 for all n ∈ N, then the sequence (an/bn)n∈N is
            convergent with
                            limn→∞ (an/bn) = (limn→∞ an) / (limn→∞ bn).
1.2 Convergence of Sequences
      Remark:
      (a) Since the constant sequence (a)n∈N = (a, a, a, . . .) is, of course, convergent to a,
          statement (ii) also implies the formula
                               limn→∞ (a · bn) = a · limn→∞ bn.
      (b) For k ∈ N, a k-times application of statement (ii) yields that for some conver-
          gent sequence (an)n∈N, also the sequence (anᵏ)n∈N is convergent with
                               limn→∞ anᵏ = (limn→∞ an)ᵏ.
                          limn→∞ an = a,        limn→∞ bn = b.
Further, assume that for all n ∈ N holds an ≤ bn . Then the following holds true:
(i) a ≤ b;
      (ii) If a = b and (cn )n∈N is another sequence with an ≤ cn ≤ bn for all n ∈ N, then
           (cn )n∈N is convergent with
                                            lim cn = a = b.
                                               n→∞
(Sandwich-Theorem)
Proof. (i) Consider the sequence of differences between bn and an , i.e., (bn − an )n∈N . By
     Theorem 1.21, it suffices to show that
                                      b − a = lim (bn − an ) ≥ 0.
                                                 n→∞
        Assume the converse statement, i.e., b − a < 0. Then a − b > 0 and bn − an ≥ 0
        for all n ∈ N, and thus
                     |(bn − an) − (b − a)| = (bn − an) + (a − b) ≥ a − b > 0.
        Hence, with ε = a − b, there exists no n ∈ N such that |(bn − an) − (b − a)| < ε,
        contradicting the convergence of (bn − an)n∈N to b − a.
     This implies that (cn − an )n∈N is convergent with limn→∞ (cn − an ) = 0. Hence
     a = 0 + a = limn→∞ (cn − an ) + limn→∞ an = limn→∞ cn .
   Remark:
   Since the modification of finitely many sequence elements does not change the limits
   (take a closer look at Definition 1.15), the statements of Theorem 1.22 can be slightly
   generalised by only claiming that there exists some n0 such that for all n ≥ n0 holds
   an ≤ bn (resp. for all n ≥ n0 holds an ≤ cn ≤ bn in (ii)). In the proof of (i), one
   has to replace the words “there exists no n ∈ N such that” by “there exists no n ≥ n0
   such that” and in the proof of (ii) the number N has to be replaced by max{N, n0 }.
   Attention!
   From the fact that we have the strict inequality an < bn , we cannot con-
   clude that the limits satisfy a < b. To see this, consider the sequences
    (an)n∈N = (0, 0, 0, . . .) and (bn)n∈N = (1/n)n∈N. In this case, we have a = b = 0
    though the strict inequality an = 0 < 1/n = bn holds true for all n ∈ N.
Example 1.23. a) Consider (1/nᵏ)n∈N for some k ∈ N. We state two alternative ways to
   show that this sequence tends to zero. The first possibility is, of course, an argumentation
   as in statement (b) in Remark on page 18. The second way to treat this problem
   is making use of the inequality
                                      1/n ≥ 1/nᵏ > 0.
   Since we know from Example 1.17 a) that the sequence (1/n)n∈N tends to zero, statement
   (ii) of Theorem 1.22 directly leads to the fact that (1/nᵏ)n∈N also tends to zero.
b) Consider (an)n∈N with
                             an = (2n² + 5n − 1) / (−5n² + n + 1).
   Rewriting
                             an = (2 + 5/n − 1/n²) / (−5 + 1/n + 1/n²),
   and using that both (1/n)n∈N and (1/n²)n∈N tend to zero, we can apply Theorem 1.21
   to obtain that
       limn→∞ an = limn→∞ (2n² + 5n − 1)/(−5n² + n + 1)
                 = limn→∞ (2 + 5/n − 1/n²)/(−5 + 1/n + 1/n²) = −2/5.
c) Consider (an)n∈N with an = √(n² + 1) − n. At first glance, none of the so far presented
   results seem to help to analyse convergence of this sequence. However, we can compute
       an = √(n² + 1) − n = (√(n² + 1) − n) · (√(n² + 1) + n) / (√(n² + 1) + n)
          = (n² + 1 − n²) / (√(n² + 1) + n) = 1 / (√(n² + 1) + n) < 1/n.
   By Theorem 1.22, we now get that limn→∞ an = 0.
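The computation above can be checked numerically (a sketch, not from the notes); as a bonus, the rewritten form 1/(√(n² + 1) + n) also avoids the floating-point cancellation that the naive form √(n² + 1) − n suffers from for large n:

```python
import math

# a_n = sqrt(n^2 + 1) - n  equals  1 / (sqrt(n^2 + 1) + n)  after expanding
# with the conjugate; the second form also avoids cancellation numerically.

def a_naive(n):
    return math.sqrt(n * n + 1) - n

def a_stable(n):
    return 1 / (math.sqrt(n * n + 1) + n)

for n in [1, 10, 1000]:
    # both forms agree, and the bound a_n < 1/n from the text holds
    print(n, a_naive(n), a_stable(n), a_stable(n) < 1 / n)
```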
(e) bounded from above if there exists some c ∈ R with an ≤ c for all n ∈ N.
     (g) divergent to ∞ if for all c ∈ R there exists some N with an ≥ c for all n ≥ N .
         In this case, we write
                                           lim an = ∞.
                                           n→∞
     (h) divergent to −∞ if for all c ∈ R there exists some N with an ≤ c for all n ≥ N .
         In this case, we write
                                          lim an = −∞.
                                          n→∞
     Remark:
     It can be readily seen from the definition that a sequence is bounded if and only if
     it is both bounded from above and bounded from below.
Example 1.25. (a) For k ∈ N, the sequence (1/nᵏ)n∈N is strongly monotonically decreasing
   due to 1/nᵏ > 1/(n + 1)ᵏ and, moreover, both bounded from above and bounded from
   below.
(b) The sequence (n³)n∈N is bounded from below and divergent to ∞.
    Proof: The fact that this sequence is bounded from below directly follows from n³ ≥ 0
    for all n ∈ N. To show that this sequence is divergent to ∞, let c ∈ R be arbitrary
    and choose
                        N = ∛c + 1   if c ≥ 0,   and   N = 0   else.
    Then for n ≥ N, we have that n³ > c and thus, the sequence (n³)n∈N tends to ∞.
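The choice of N in this proof can be tested directly (an illustrative sketch, not part of the notes):

```python
import math

# For the sequence n**3 and an arbitrary bound c, the proof chooses
# N = c**(1/3) + 1 for c >= 0 (and N = 0 otherwise), so that n**3 > c
# holds for every natural number n >= N.
def N_for(c):
    return c ** (1 / 3) + 1 if c >= 0 else 0

for c in [0, 10, 1e6]:
    n = math.ceil(N_for(c))    # first natural number >= N
    print(c, n, n ** 3 > c)    # the last entry is True in each case
```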
     Reminder:
     Definition of intervals:
                                [a, b) := {x ∈ R | a ≤ x < b}
                                (a, b] := {x ∈ R | a < x ≤ b}
                               [a, b] := {x ∈ R | a ≤ x ≤ b}
                               (−∞, b) := {x ∈ R | x < b}
                               (−∞, b] := {x ∈ R | x ≤ b}
                                (a, ∞) := {x ∈ R | a < x}
                                [a, ∞) := {x ∈ R | a ≤ x}
      Remark: Difference between sup and max (resp. inf and min)
      In contrast to the maximum, the supremum does not need to belong to the respective
      set. For instance, we have 1 = sup(0, 1), but max(0, 1) does not exist. The ana-
      logous statement holds true for inf and min. However, we can make the following
      statement: If max M (min M ) exists, then max M = sup M (min M = inf M ).
The next result concerns the special property of the real numbers that the supremum
and infimum exist for every non-empty bounded subset of the real numbers. This theorem
goes back to Julius Wilhelm Richard Dedekind (1831–1916). It follows from the
completeness axiom (C):
      Theorem 1.31. Dedekind’s Theorem
      Every non-empty bounded set M ⊂ R has a supremum and an infimum with
      sup M, inf M ∈ R.
Proof: Let us first assume that (an)n∈N is monotonically increasing and bounded from
above. Define the set M = {an : n ∈ N}. Since M is bounded, Dedekind’s theorem
implies that there exists some K ∈ R such that
                                       K = sup M.
Let ε > 0. Since K − ε is not an upper bound of M, there exists some N ∈ N with
aN > K − ε. By monotonicity, for all n ≥ N we have
                                  K − ε < aN ≤ an ≤ K,
and hence |an − K| < ε. Thus (an)n∈N converges to K.
Example 1.33. a) Consider the sequence (an)n∈N which is recursively defined via a1 = 1
   and
                          an+1 = (an + 2/an) / 2   for n ≥ 1.
  We now prove that this sequence is convergent by showing that it is bounded from
  below and for all n ≥ 2 holds an+1 ≤ an .
   Proof: To show boundedness from below, we use the inequality √(xy) ≤ (x + y)/2 for
   all nonnegative x, y ∈ R. This inequality is a consequence of
                        0 ≤ (√x − √y)²/2 = (x + y)/2 − √(xy).
   The first inequality is a consequence of the fact that squares of real numbers cannot
   be negative.
   Using this inequality, we obtain for n ≥ 1
                       an+1 = (an + 2/an)/2 ≥ √(an · (2/an)) = √2.
   Thus, (an) is bounded from below. For showing monotonicity, we consider
                   an+1 − an = (an + 2/an)/2 − an = (1/(2an)) (2 − an²).
   In particular, if n ≥ 2, we have that an > 0 and 2 − a2n ≤ 0. Thus, an+1 − an ≤ 0
   for n ≥ 2. An application of Theorem 1.31 (resp. the slight generalisation in Remark
   from above) now leads to the existence of some a ∈ R with a = limn→∞ an .
   To compute the limit, we make use of the relation limn→∞ an = limn→∞ an+1 (follows
   directly from Definition 1.15) and the formulae for limits in Theorem 1.21. This yields
             a = limn→∞ an = limn→∞ an+1 = limn→∞ (an + 2/an)/2 = (a + 2/a)/2.
   This relation leads to the equation 2 − a² = 0, i.e., we either have a = √2 or a = −√2.
   However, the latter solution cannot be a limit since all sequence elements are positive.
   Therefore, we have
                                    limn→∞ an = √2.
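The recursion converges very quickly; the following sketch (this is Heron's method for computing √2, not code from the notes) shows a few iterations:

```python
import math

# a_1 = 1,  a_{n+1} = (a_n + 2/a_n) / 2:  Heron's method for sqrt(2).
a = 1.0
for _ in range(6):
    a = (a + 2 / a) / 2
    print(a)               # 1.5, 1.41666..., 1.41421..., then stable

assert abs(a - math.sqrt(2)) < 1e-12
```

The convergence is quadratic: the number of correct digits roughly doubles with every step.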
b) Let x ∈ R with x > 1. Consider the sequence (ⁿ√x)n∈N. It can be directly seen that
   (ⁿ√x)n∈N is monotonically decreasing and bounded from below by one. Therefore, the
   limit
                                    a = limn→∞ ⁿ√x
exists with a ≥ 1. To show that a = 1, we assume that a > 1 and derive a contradiction.
The inequality a > 1 leads to the existence of some n ∈ N with aⁿ > x (note that (aⁿ)_{n∈N} is unbounded for a > 1), and thus a > ⁿ√x. On the other hand, the monotone decrease of (ⁿ√x)_{n∈N} implies that

a = lim_{n→∞} ⁿ√x = inf{ⁿ√x : n ∈ N} < a,
     which is a contradiction.
c) Let x ∈ R with 0 < x < 1. Consider the sequence (a_n)_{n∈N} = (ⁿ√x)_{n∈N}. Then we have by Example b) (applied to 1/x > 1) and Theorem 1.21 that

lim_{n→∞} ⁿ√x = 1 / (lim_{n→∞} ⁿ√(1/x)) = 1/1 = 1.
d) Let (a_n) be a nonnegative sequence with a_n → a and k ∈ N. Then for all ε > 0 there exists N > 0 such that |a_n − a| < ε^k for all n ≥ N. From this it follows that

|ᵏ√a_n − ᵏ√a| ≤ ᵏ√|a_n − a| < ε.

Thus (ᵏ√a_n) is convergent with limit ᵏ√a.
e) The sequence (a_n)_{n∈N} defined as a_n := (1 + 1/n)^n is convergent.
     Remark:
The limit of the sequence

(a_n)_{n∈N} = ((1 + 1/n)^n)_{n∈N},

i.e. e := lim_{n→∞} (1 + 1/n)^n, is well known as Euler’s number. Later on we will define the exponential function exp. It holds that e = exp(1) ≈ 2.7182818... . Indeed, we will show later on that e^z = lim_{n→∞} (1 + z/n)^n = exp(z).
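The rather slow convergence of this sequence towards e can be observed numerically; a small sketch (the function name is ours):

```python
import math

def euler_element(n: int) -> float:
    """The n-th element a_n = (1 + 1/n)^n of the sequence."""
    return (1.0 + 1.0 / n) ** n

# The elements increase towards e, but slowly (error roughly e/(2n)).
for n in (10, 1000, 100000):
    print(n, euler_element(n))
print("e =", math.e)
```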
Example 1.35. Consider the sequence (an )n∈N = ( n1 )n∈N . Then some subsequences are
given by
   • (a_{n_k})_{k∈N} = (a_{2k})_{k∈N} = (1/2, 1/4, 1/6, 1/8, . . .);
   • (a_{n_k})_{k∈N} = (a_{k²})_{k∈N} = (1, 1/4, 1/9, 1/16, 1/25, . . .);
   • (a_{n_k})_{k∈N} = (a_{2^k})_{k∈N} = (1/2, 1/4, 1/8, 1/16, 1/32, . . .);
   • (a_{n_k})_{k∈N} = (a_{k!})_{k∈N} = (1, 1/2, 1/6, 1/24, 1/120, 1/720, . . .).
   Theorem 1.36. Convergence of subsequences
   Let (a_n)_{n∈N} be a convergent sequence with a = lim_{n→∞} a_n. Then every subsequence (a_{n_k})_{k∈N} is also convergent with

lim_{k→∞} a_{n_k} = a.
Proof: Since 1 ≤ n1 < n2 < n3 < . . . and nk ∈ N for all k ∈ N, we have that nk ≥ k
for all k ∈ N. Let ε > 0. By the convergence of (an )n∈N , there exists some N such that
|ak − a| < ε for all k ≥ N . Due to nk ≥ k, we thus also have that |ank − a| < ε for all
k ≥ N.                                                                                2
   Attention!
   The existence of a convergent subsequence (ank )k∈N does in general not imply the
   convergence of (an )n∈N . For instance, consider (an )n∈N = ((−1)n )n∈N . Both sub-
   sequences
                     (a2k )k∈N = ((−1)2k )k∈N = (1, 1, 1, 1, . . .)
                        (a2k+1 )k∈N = ((−1)2k+1 )k∈N = (−1, −1, −1, −1, . . .)
   are convergent though (an )n∈N = ((−1)n )n∈N is divergent (see Example 1.17 b)).
However, we can “rescue” this statement by additionally claiming that (an )n∈N is mono-
tonic.
   Theorem 1.37. Subsequences of monotonic sequences
   Let (a_n)_{n∈N} be a sequence in R. If (a_n)_{n∈N} is monotonic and there exists a convergent subsequence (a_{n_k})_{k∈N}, then (a_n)_{n∈N} is convergent with

lim_{n→∞} a_n = lim_{k→∞} a_{n_k}.
Proof: Denote a = limk→∞ ank . We just consider the case where (an )n∈N is monotonically
increasing (the remaining part can be done analogously to the argumentations at the end
of the proof of Theorem 1.31). Since (ank )k∈N is also monotonically increasing, we have
that a = sup{ank : k ∈ N}.
Let ε > 0. Due to the convergence and monotonicity of (a_{n_k})_{k∈N}, there exists some K ∈ N with a − a_{n_K} < ε. Since (a_n)_{n∈N} is monotonically increasing with a = sup{a_{n_k} : k ∈ N}, we have for all n ≥ n_K that a_{n_K} ≤ a_n ≤ a, and therefore

|a − a_n| = a − a_n ≤ a − a_{n_K} < ε.   □
Next we present the famous Theorem of Bolzano-Weierstraß.
     Theorem 1.38. Theorem of Bolzano-Weierstraß
     Let (an )n∈N be a bounded sequence in F. Then there exists some convergent sub-
     sequence (ank )k∈N .
Proof: First we consider the case F = R. Since (an )n∈N is bounded, there exist some
A, B ∈ R such that for all n ∈ N holds A ≤ an ≤ B. We will now successively construct
subintervals [An , Bn ] ⊂ [A, B] which still include infinitely many sequence elements of
(an )n∈N .
Inductively define A_0 = A, B_0 = B and for k ≥ 1,

   a) A_k = A_{k−1}, B_k = (A_{k−1} + B_{k−1})/2, if the interval [A_{k−1}, (A_{k−1} + B_{k−1})/2] contains infinitely many sequence elements of (a_n)_{n∈N}, and

   b) A_k = (A_{k−1} + B_{k−1})/2, B_k = B_{k−1}, else.
By the construction of Ak and Bk , we have that each interval [Ak , Bk ] has infinitely many
sequence elements of (a_n)_{n∈N}. We furthermore have B_1 − A_1 = (1/2)(B − A), B_2 − A_2 = (1/4)(B − A), . . ., B_k − A_k = (1/2^k)(B − A). Moreover, the sequence (A_n)_{n∈N} is monotonically increasing and bounded from above by B, i.e., it is convergent by Theorem 1.32. The relation B_k − A_k = (1/2^k)(B − A) moreover implies that (B_n)_{n∈N} is also convergent and has the same limit as (A_n)_{n∈N}. Denote
                                        a = lim An = lim Bn .
                                             n→∞         n→∞
Define a subsequence (ank )k∈N by n1 = 1 and nk with nk > nk−1 and ank ∈ [Ak , Bk ]
(which is possible since [Ak , Bk ] contains infinitely many elements of (an )n∈N ). Then
Ak ≤ ank ≤ Bk . Theorem 1.22 then implies that
                                              a = lim ank .
                                                   k→∞
Finally we consider the case F = C. Write an = bn + icn where i is the imaginary unit,
b_n := Re(a_n) denotes the real part and c_n := Im(a_n) denotes the imaginary part of a_n. Since |a_n| = √(b_n² + c_n²) ≥ max{|b_n|, |c_n|} ≥ 0, the boundedness of the complex sequence (a_n)_{n∈N} implies the boundedness of both real sequences (b_n)_{n∈N} and (c_n)_{n∈N}. Then, by the previous, we know that (b_n)_{n∈N} has a convergent subsequence (b_{n_k})_{k∈N}. Since the
subsequence (cnk )k∈N of the bounded sequence (cn )n∈N is also bounded, it also has a con-
vergent subsequence (cnkm )m∈N . The subsequence (bnkm )m∈N of the convergent sequence
(bnk )k∈N also converges. Hence (ankm )m∈N = (bnkm +icnkm )m∈N is a convergent subsequence
of (an )n∈N with limm→∞ ankm = limm→∞ bnkm + i · limm→∞ cnkm .                          2
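The interval-halving construction from the proof can be imitated on a finite prefix of a bounded sequence. The condition “contains infinitely many elements” can only be approximated on a prefix by “contains more elements”, so this is a heuristic illustration of the construction, not a proof; all names are ours:

```python
import math

def halving_subsequence(seq, steps):
    """Imitate the proof's bisection for a sequence with values in [-1, 1]."""
    A, B = -1.0, 1.0
    indices = list(range(len(seq)))
    values = []
    last = -1
    for _ in range(steps):
        mid = (A + B) / 2
        left = [i for i in indices if seq[i] <= mid]
        # keep the half containing more elements of the prefix (heuristic
        # stand-in for "infinitely many elements")
        if len(left) >= len(indices) - len(left):
            indices, B = left, mid
        else:
            indices, A = [i for i in indices if seq[i] > mid], mid
        # pick the next index n_k > n_{k-1} inside the current interval
        last = next(i for i in indices if i > last)
        values.append(seq[last])
    return values, (A, B)

prefix = [math.sin(n) for n in range(1, 20001)]
values, (A, B) = halving_subsequence(prefix, 10)
print(B - A)  # 0.001953125, i.e. 2 / 2**10
```

The extracted values cluster in the final interval [A, B], whose length is halved in every step, mirroring B_k − A_k = (1/2^k)(B − A).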
1.3 Subsequences and accumulation values                                                      27
   Definition 1.39. Accumulation value
   Let (a_n)_{n∈N} be a sequence in F. Then a ∈ F is called an accumulation value of (a_n)_{n∈N} if there exists a subsequence (a_{n_k})_{k∈N} with

a = lim_{k→∞} a_{n_k}.
  Attention! Names
   Accumulation values are often called by other names, like accumulation points, limit points or cluster points.
   Proposition 1.40.
  a ∈ F is an accumulation value if and only if in every ε-neighbourhood of a, there
  are infinitely many elements of the sequence (an )n∈N .
   Definition 1.42. Limit superior and limit inferior
   Let (a_n)_{n∈N} be a real sequence. Then a ∈ R ∪ {−∞, ∞} is called
      • limit superior of (an )n∈N if a is the largest accumulation value of (an )n∈N . In
        this case, we write
                                          a = lim sup an .
                                               n→∞
      • limit inferior of (an )n∈N if a is the smallest accumulation value of (an )n∈N . In
        this case, we write
                                            a = lim inf an .
                                               n→∞
  Remark:
  Almost needless to say, we define the ordering between infinity and real numbers by
  −∞ < a < ∞ for all a ∈ R. It can be shown that (in contrast to the limit) the limit
  superior and limit inferior always exist for any real sequence. This will follow from
  the subsequent results.
   Lemma 1.43.
   Let (a_n)_{n∈N} be a real sequence. Then the following statements hold:
      c) The sequence is convergent if and only if lim inf_{n→∞} a_n = lim sup_{n→∞} a_n ∉ {±∞}. In this case holds lim_{n→∞} a_n = lim inf_{n→∞} a_n = lim sup_{n→∞} a_n.
Proof:
a) If (an ) is not bounded from below, then, by Definition 1.41, −∞ is an accumula-
   tion value of (an ) which necessarily must be the smallest one. By Definition 1.42
   lim inf an = −∞. On the other hand, the unboundedness from below of (an ) implies
   sn := inf{ak | k ≥ n} = −∞ for all n ∈ N and therefore also limn→∞ sn = −∞. Note
   that formally we only defined limits for sequences with values in R and not with values
   in R ∪ {−∞, ∞}. Here we implicitly used the obvious extension, namely we say that the sequence (s_n) which is constantly −∞ has the limit −∞.
     Next we consider the case where (an ) is divergent to +∞. In particular, (an ) is not
     bounded from above and therefore +∞ is an accumulation value by Definition 1.41.
     This is also the only accumulation value, since each subsequence of (an ) also diverges
     to +∞. Hence, by Definition 1.42, lim inf an = +∞. On the other hand for each c > 0
   there is an N ∈ N such that a_n ≥ c for all n ≥ N. Therefore s_n = inf{a_k | k ≥ n} ≥ c
     for all n ≥ N which shows that also (sn ) diverges to +∞, i.e. limn→∞ sn = +∞.
     Finally we consider the remaining case where (an ) is bounded from below and not
     divergent to +∞. Then there exist constants c1 , c2 ∈ R such that c1 ≤ an for all n ∈ N
     and an ≤ c2 for infinitely many n ∈ N. This implies
                                  c1 ≤ sn = inf{ak | k ≥ n} ≤ c2
     for all n ∈ N, i.e. (sn ) is bounded. Since (sn ) is also monotonically increasing as
       sn+1 = inf{ak | k ≥ n + 1} ≥ min{inf{ak | k ≥ n + 1}, an } = inf{ak | k ≥ n} = sn ,
     it must be convergent. Set s := limn→∞ sn . We can recursively define a subsequence
     (ank ) of (an ) with n1 = 1 and nk+1 > nk such that
s_{n_k+1} = inf{a_m | m ≥ n_k + 1} ≤ a_{n_{k+1}} ≤ s_{n_k+1} + 1/k.
     Since the right- and left-hand sides of this inequality converge to s for k → ∞, we
     also have limk→∞ ank = s which shows that s is an accumulation value of (an ). On the
     other hand, if x is any other accumulation value of (an ) and if (ajk ) is a corresponding
     subsequence such that limk→∞ ajk = x, then
                                  sjk = inf{am | m ≥ jk } ≤ ajk
     shows that s = limk→∞ sjk ≤ limk→∞ ajk = x which means that s is indeed the smallest
     accumulation value of (an ), that is lim inf an = s.
1.4 Cauchy Sequences                                                                         29
b) Analogous to a).
c) “⇒”: Since the sequence (a_n) is convergent, every subsequence is convergent with the same limit. By Definition 1.39, there exists only one accumulation value and thus lim inf a_n = lim sup a_n.
   “⇐”: Let s := lim inf an = lim sup an . Then for all ε > 0 there exists an N ∈ N such
   that for all n ≥ N we have s − ε < an < s + ε. This implies convergence of (an )n∈N to
   s.
d) Let sn := inf{ak : k ≥ n}.
   “⇒”: For any c > 0, there is an N ∈ N such that a_n > c + 1 for all n ≥ N. Thus s_n > c for all n ≥ N.
   “⇐”: By definition of sn we have an ≥ sn . Thus an → ∞ since sn → ∞.
e) Analogous to d).
                                                                                              2
Example 1.44. (a) (an )n∈N = (n)n∈N . Then ∞ is the only accumulation value and
   consequently lim supn→∞ an = lim inf n→∞ an = ∞.
(b) (an )n∈N = ((−1)n n)n∈N = (−1, 2, −3, 4, −5, 6, . . .). Then ∞ and −∞ are the only
    accumulation values and consequently lim sup an = ∞ and lim inf an = −∞.
(c) (an )n∈N = ((−1)n )n∈N . Then 1 and −1 are the only accumulation values and con-
    sequently lim sup an = 1 and lim inf an = −1.
(d) (a_n)_{n∈N} with

                              a_n = { (−1)^n : if n is divisible by 3,
                                      n      : else.

    Then we have (a_n)_{n∈N} = (1, 2, −1, 4, 5, 1, 7, 8, −1, 10, 11, . . .) and the set of accumulation values is given by {−1, 1, ∞}. Thus, we have lim sup a_n = ∞ and lim inf a_n = −1.
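For finite prefixes, one can watch the tail-infima s_n = inf{a_k : k ≥ n} and the corresponding tail-suprema, which approximate lim inf and lim sup; a small sketch for example (c), with names of our choosing:

```python
def tail_inf_sup(seq, n):
    """inf and sup of the tail {a_k : k >= n} of a finite prefix."""
    tail = seq[n:]
    return min(tail), max(tail)

# Example (c): a_n = (-1)^n has lim inf = -1 and lim sup = 1,
# and indeed every tail already attains both values.
a = [(-1) ** n for n in range(1, 101)]
print(tail_inf_sup(a, 50))  # (-1, 1)
```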
   Definition 1.45. Cauchy sequence
   A sequence (a_n)_{n∈N} in F is called a Cauchy sequence if for all ε > 0 there exists some N such that for all n, m ≥ N holds

|a_n − a_m| < ε.
   Remark:
   By the expression “n, m ≥ N”, we mean that both n and m are greater than or equal to N, i.e., n ≥ N and m ≥ N.
   Theorem 1.46. Convergent sequences are Cauchy sequences
   Every convergent sequence (a_n)_{n∈N} in F is a Cauchy sequence.
Proof: Let a = limn→∞ an and ε > 0. Then there exists some N such that for all k ≥ N
holds |a − a_k| < ε/2. Hence, for all m, n ≥ N holds

|a_n − a_m| = |(a_n − a) + (a − a_m)| ≤ |a_n − a| + |a − a_m| < ε/2 + ε/2 = ε.
                                                                                                   2
The following theorem is closely related to Theorem 1.19.
     Theorem 1.47. Cauchy sequences are bounded
     Let (an )n∈N be a Cauchy sequence. Then (an )n∈N is bounded.
Proof: Take ε = 1. Then there exists some N such that for all n, m ≥ N holds |an − am | <
1. Thus, for all n ≥ N holds

|a_n| = |(a_n − a_N) + a_N| ≤ |a_n − a_N| + |a_N| < 1 + |a_N|.

Now choose
                          c = max{|a1 |, |a2 |, . . . , |aN −1 |, |aN | + 1}
and consider some arbitrary sequence element ak .
If k < N , we have that |ak | ≤ max{|a1 |, |a2 |, . . . , |aN −1 |} ≤ c.
If k ≥ N , we have, by the above calculations, that |ak | < |aN | + 1 ≤ c.
Altogether, this implies that |ak | ≤ c for all k ∈ N, so (an )n∈N is bounded by c.                2
Now we show that Cauchy sequences in F are even convergent:
     Theorem 1.48.
     Every Cauchy sequence (an )n∈N in F converges.
Proof: By Theorem 1.47, (a_n)_{n∈N} is bounded. Hence, by the Theorem of Bolzano-Weierstraß, there exists a convergent subsequence (a_{n_k})_{k∈N} with limit a. Let ε > 0. Choose N such that |a_n − a_m| < ε/2 for all n, m ≥ N, and choose some k with n_k ≥ N and |a_{n_k} − a| < ε/2. Then for all n ≥ N holds

|a_n − a| ≤ |a_n − a_{n_k}| + |a_{n_k} − a| < ε/2 + ε/2 = ε.   □
Theorem 1.48 is not true for arbitrary normed F-vector spaces. Those normed F-vector
spaces (V, || · ||) for which every Cauchy sequence has a limit in V are called complete or
Banach spaces (in honour of the Polish mathematician Stefan Banach). Without proof
we state that all finite dimensional normed F-vector spaces are Banach spaces.
For x ∈ F and ε > 0, the set B_ε(x) := {y ∈ F : |y − x| < ε} is called the ε-neighbourhood of x. In the case F = R, this is the interval B_ε(x) = (x − ε, x + ε). A set M ⊂ F is called

    (i) bounded if there exists some c ∈ R such that for all x ∈ M holds: |x| ≤ c.

    (ii) open if for all x ∈ M there exists some ε > 0 such that B_ε(x) ⊂ M.
   (iii) closed if for all convergent sequences (an )n∈N with an ∈ M for all n ∈ N holds:
         limn→∞ an = a ∈ M .
   (iv) compact if for all sequences (an )n∈N with an ∈ M for all n ∈ N holds: There
        exists some convergent subsequence (ank )k∈N with limk→∞ ank = a ∈ M .
For C ⊂ F, the following two statements are equivalent:

(i) C is open;

(ii) F\C is closed.
Proof:
“(i)⇒(ii)”: Let C be open. Consider a convergent sequence (an )n∈N with an ∈ F\C. We
have to show that for a = limn→∞ an holds a ∈ F\C. Assume the converse, i.e., a ∈ C.
Since C is open, we have that Bε (a) ⊂ C for some ε > 0. By the definition of convergence,
there exists some N such that for all n ≥ N holds |a − an | < ε, i.e.,
                                       an ∈ Bε (a) ⊂ C.
However, this is a contradiction to an ∈ F\C.
“(ii)⇒(i)”: Let F\C be closed. We have to show that C is open. Assume the converse,
i.e., C is not open. In particular, this means that there exists some a ∈ C such that for all n ∈ N holds B_{1/n}(a) ⊄ C. This means that for all n ∈ N, we can find some a_n ∈ F\C with a_n ∈ B_{1/n}(a), i.e., |a − a_n| < 1/n. As a consequence, for the sequence (a_n)_{n∈N} holds that

lim_{n→∞} a_n = a ∈ C,

which contradicts the closedness of F\C: the convergent sequence (a_n)_{n∈N} lies in F\C, but its limit does not.   □
For C ⊂ F, the following two statements are equivalent:

(i) C is compact;

(ii) C is closed and bounded.
Proof:
“(i)⇒(ii)”: Let C be compact.
Let (an )n∈N be a convergent sequence in F with an ∈ C and a := limn→∞ an ∈ F. Since
C is compact, there is a subsequence (ank )k∈N such that b := limk→∞ ank ∈ C. By
Theorem 1.36 we have a = b ∈ C. Hence, C is closed.
1.5 Bounded, Open, Closed and Compact Sets                                                33
Now assume that C is unbounded. Then for all n ∈ N, there exists some an ∈ C with
|an | ≥ n. Consider an arbitrary subsequence (ank )k∈N . Due to |ank | ≥ nk ≥ k, we have
that (ank )k∈N is unbounded, i.e., it cannot be convergent. This is also a contradiction to
compactness.
“(ii)⇒(i)”: Let C be closed and bounded. Let (an )n∈N be a sequence in C. The bounded-
ness of C then implies the boundedness of (an )n∈N . By the Theorem of Bolzano-Weierstraß,
there exists a convergent subsequence (ank )k∈N , i.e.,
                                       lim ank = a
                                       k→∞
for some a ∈ F. For compactness, we now have to show that a ∈ C. However, this is
guaranteed by the closedness of C.                                             2
   Remark:
   Taking a closer look at the proof of “(i)⇒(ii)”, we did not explicitly use that we are
   dealing with one of the spaces R or C. Indeed, the implication that compact sets are
   bounded and closed holds true for all normed spaces. However, “(ii)⇒(i)” does not
   hold true in arbitrary normed spaces. Indeed, there are examples of normed spaces
   that have bounded and closed subsets which are not compact.
   Remark:
   The relation C̊ ⊂ C ⊂ C̄ holds true for arbitrary subsets C ⊂ F. The first inclusion holds true by definition of C̊. To verify C ⊂ C̄, we take an arbitrary x ∈ C and consider the constant sequence (x)_{n∈N}. Since this sequence is completely contained in C and converges to x, we must have that x ∈ C̄.
   It can be shown that for all sets C, the interior C̊ is always open and the closure C̄ and the boundary ∂C are always closed sets. In particular, if C is open (closed), then we have C̊ = C (resp. C̄ = C).
∑_{k=1}^∞ a_k

for some sequence (a_n)_{n∈N}. Before we present a mathematically precise definition, we present “a little paradox” that aims to show that one really has to be careful with series.
Consider the case where (a_n)_{n∈N} = ((−1)^n)_{n∈N}. On the one hand we can compute

∑_{k=1}^∞ (−1)^k = −1 + 1 − 1 + 1 − 1 + 1 − 1 + 1 − . . .
                 = (−1 + 1) + (−1 + 1) + (−1 + 1) + (−1 + 1) + . . .
                 = 0 + 0 + 0 + 0 + . . . = 0,

and on the other hand

∑_{k=1}^∞ (−1)^k = −1 + 1 − 1 + 1 − 1 + 1 − 1 + 1 − . . .
                 = −1 + (1 − 1) + (1 − 1) + (1 − 1) + (1 − 1) + . . .
                 = −1 + 0 + 0 + 0 + 0 + . . . = −1.
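The apparent paradox disappears once one looks at the partial sums, which oscillate and therefore have no limit; a quick check:

```python
from itertools import accumulate

# partial sums s_n = sum_{k=1}^n (-1)^k oscillate between -1 and 0,
# so neither regrouped "value" 0 nor -1 is a limit of the series
partial_sums = list(accumulate((-1) ** k for k in range(1, 11)))
print(partial_sums)  # [-1, 0, -1, 0, -1, 0, -1, 0, -1, 0]
```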
36                                                                                          2 Infinite Series
   Let (a_n)_{n∈N} be a sequence in F and let s_n := ∑_{k=1}^n a_k. Then the sequence (s_n)_{n∈N} is called infinite series (or just “series”). The sequence element s_n is called n-th partial sum of (a_n)_{n∈N}. The series is called convergent if (s_n)_{n∈N} is convergent. In this case, we write

∑_{k=1}^∞ a_k := lim_{n→∞} s_n.
     Remark:
   In the literature, the symbol ∑_{k=1}^∞ a_k is also called series. So this symbol has a twofold meaning, namely the limit of the series (if existent) and the series itself. At the relevant places of this manuscript, the concrete meaning will be clear from the context.
   The above definition formally does not include infinite sums of the kind

∑_{k=0}^∞ a_k,   ∑_{k=2}^∞ a_k,   or   ∑_{k=n_0}^∞ a_k for some n_0 ∈ N.

   However, their meaning is straightforward to define and we will call these expressions infinite series, too.
Before we give some criteria for the convergence of series, we first present probably the most important series and analyze their convergence.
Example 2.2. a) Geometric series: Let q ∈ F. Then the series ∑_{k=0}^∞ q^k is convergent if and only if |q| < 1.
Proof: We can show that the n-th partial sum is given by

s_n = ∑_{k=0}^n q^k = { (1 − q^{n+1})/(1 − q) : if q ≠ 1,
                        n + 1                : if q = 1.

Hence, (s_n)_{n∈N} is convergent if and only if |q| < 1. In this case we have

∑_{k=0}^∞ q^k = lim_{n→∞} s_n = lim_{n→∞} (1 − q^{n+1})/(1 − q) = 1/(1 − q).
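The closed form can be checked numerically; for |q| < 1 the partial sums approach 1/(1 − q) quickly (function name ours):

```python
def geometric_partial_sum(q: float, n: int) -> float:
    """s_n = sum_{k=0}^n q^k, computed term by term."""
    return sum(q ** k for k in range(n + 1))

q = 0.5
print(geometric_partial_sum(q, 50))  # ≈ 2.0 (exactly 2 - 2**-50)
print(1 / (1 - q))                   # closed-form limit 1/(1-q) = 2.0
```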
2.1 Basic Definitions, Convergence Criteria and Examples                                                      37
   Now we take a closer look at the number s_{2^j} − s_{2^{j−1}}: By definition of s_n, we have

s_{2^j} − s_{2^{j−1}} = ∑_{k=2^{j−1}+1}^{2^j} 1/k ≥ ∑_{k=2^{j−1}+1}^{2^j} 1/2^j = 2^{j−1} · (1/2^j) = 1/2.

   The inequality in the above formula holds true since every summand is replaced by the smallest summand 1/2^j. The second last equality sign then comes from the fact that the number 1/2^j is summed up 2^{j−1}-times. Now using this inequality together with the above sum representation for s_{2^l}, we obtain

s_{2^l} = s_1 + ∑_{j=1}^l (s_{2^j} − s_{2^{j−1}}) ≥ 1 + ∑_{j=1}^l 1/2 = 1 + l/2.

   Hence, the sequence (s_{2^l})_{l∈N} is unbounded. This implies the desired result.   □
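The estimate s_{2^l} ≥ 1 + l/2 can be verified numerically; it also shows how slowly the partial sums of the harmonic series grow (function name ours):

```python
def harmonic_partial_sum(n: int) -> float:
    """s_n = sum_{k=1}^n 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))

# s_{2^l} grows without bound, but only like l/2 (i.e. logarithmically in n)
for l in (1, 5, 10, 15):
    print(l, harmonic_partial_sum(2 ** l), ">=", 1 + l / 2)
```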
     Remark:
     Except for the first example, we have not computed the limits of the other stated
     convergent series. We only proved existence or non-existence of limits. Indeed, the
     computation of limits of series is, in general, a very difficult issue and is not possible
     in many cases.
   The function

ζ(α) = ∑_{k=1}^∞ 1/k^α

   is very popular in analytic number theory under the name Riemann Zeta Function. In b) and c), we have implicitly proven that ζ(·) is defined on the interval (1, ∞) and has a pole at 1. This function is the subject of the Riemann hypothesis, which is one of the most important unsolved problems in modern mathematics. Some known values of the Zeta function are (without proof)

∑_{k=1}^∞ 1/k² = ζ(2) = π²/6,   ∑_{k=1}^∞ 1/k⁴ = ζ(4) = π⁴/90,   ∑_{k=1}^∞ 1/k⁶ = ζ(6) = π⁶/945.
Next we consider sums of convergent series and multiplication of a series by a scalar. The proof just consists of a straightforward application of Theorem 1.21 and is therefore skipped.
   Theorem 2.3. Formulae for convergent series
   Let λ ∈ F and let ∑_{k=1}^∞ a_k and ∑_{k=1}^∞ b_k be convergent series. Then the following series converge and fulfill

    (i) ∑_{k=1}^∞ (a_k + b_k) = ∑_{k=1}^∞ a_k + ∑_{k=1}^∞ b_k;

    (ii) ∑_{k=1}^∞ (λ a_k) = λ ∑_{k=1}^∞ a_k.
Cauchy criterion: A series ∑_{k=1}^∞ a_k converges if and only if for all ε > 0 there exists some N such that for all n ≥ m ≥ N holds |∑_{k=m}^n a_k| < ε.
Proof: By Theorem 1.46 and Theorem 1.48, a series converges if and only if the sequence (s_n)_{n∈N} of its partial sums is a Cauchy sequence in F. Due to ∑_{k=m}^n a_k = s_n − s_{m−1}, the Cauchy criterion is really equivalent to the fact that (s_n)_{n∈N} is a Cauchy sequence in F.   □
   Remark:
   Reconsidering the example at the very beginning of this chapter, the divergence of this series can be directly verified by employing the Cauchy criterion.
If a series ∑_{k=1}^∞ a_k converges, then its summands necessarily tend to zero, i.e.,

lim_{n→∞} a_n = 0.
Proof: Since the series converges, the Cauchy criterion implies that for all ε > 0, there exists some N such that for all n ≥ m ≥ N holds

|∑_{k=m}^n a_k| < ε.

Now considering the special case n = m, we have that for all n ≥ N holds

|a_n| < ε.   □
Leibniz criterion: Let (a_n)_{n∈N} be a monotonically decreasing real sequence with

lim_{n→∞} a_n = 0.

Then the alternating series ∑_{k=1}^∞ (−1)^k a_k converges.
Proof: Since (an )n∈N is convergent to zero and monotonically decreasing, we have an ≥ 0
for all n ∈ N. Let (sn )n∈N be the corresponding sequence of partial sums. Then we have
for all l ∈ N that

s_{2(l+1)} − s_{2l} = a_{2l+2} − a_{2l+1} ≤ 0   and   s_{2(l+1)+1} − s_{2l+1} = a_{2l+2} − a_{2l+3} ≥ 0,

i.e., the subsequence (s_{2l})_{l∈N} is monotonically decreasing and (s_{2l+1})_{l∈N} is monotonically increasing. Furthermore, due to s_{2l+1} − s_{2l} = −a_{2l+1} ≤ 0 holds

s_{2l+1} ≤ s_{2l}.
Altogether, the subsequence (s_{2l})_{l∈N} is monotonically decreasing and bounded from below, and (s_{2l+1})_{l∈N} is monotonically increasing and bounded from above. By Theorem 1.32, both subsequences are convergent. Due to s_{2l+1} − s_{2l} = −a_{2l+1} → 0, an application of Theorem 1.21 yields that both subsequences have the same limit, i.e.,

lim_{l→∞} s_{2l} = lim_{l→∞} s_{2l+1} = s

for some s ∈ R. Let ε > 0. Then there exists some N_1 such that for all l ≥ N_1 holds |s_{2l} − s| < ε. Furthermore, there exists some N_2 such that for all l ≥ N_2 holds |s_{2l+1} − s| < ε. Now choosing N = max{2N_1, 2N_2 + 1}, we can say the following for any m ≥ N:
In the case where m is even, we have some l ∈ N with m = 2l. By the choice of N , we
also have l ≥ N1 and thus
                                |sm − s| = |s2l − s| < ε.
In the case where m is odd, we have some l ∈ N with m = 2l + 1. By the choice of N , we
also have l ≥ N2 and thus
                               |sm − s| = |s2l+1 − s| < ε.
                                                                                               2
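With a_n = 1/n, the criterion shows that the alternating harmonic series ∑_{k=1}^∞ (−1)^k/k converges. Numerically, the even and odd partial sums bracket the limit, exactly as in the proof; the limit equals −ln 2, a known value that is not derived in this text:

```python
import math
from itertools import accumulate

terms = [(-1) ** k / k for k in range(1, 10001)]
s = list(accumulate(terms))  # s[m-1] is the partial sum s_m

print(s[0], s[2])           # odd partial sums s_1, s_3, ... increase
print(s[1], s[3])           # even partial sums s_2, s_4, ... decrease
print(s[-1], -math.log(2))  # both tend to -ln 2 ≈ -0.6931
```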
We will later treat the topic of Taylor series. Thereafter we will be able to determine
some further limits of sequences.
   Remark:
   We will see that absolute convergence is really a stronger requirement than convergence. However, for real series ∑_{k=1}^∞ a_k with a_k ≥ 0 for all k ∈ N, absolute convergence and convergence are equivalent. This is a direct consequence of the fact that a_k ≥ 0 implies |a_k| = a_k.
Every absolutely convergent series ∑_{k=1}^∞ a_k in F is convergent.
Proof: Let ε > 0. By the absolute convergence of the series, the necessity of Cauchy’s
convergence criterion implies that there exists some N such that for all n ≥ m ≥ N holds
∑_{k=m}^n |a_k| < ε.

Since |∑_{k=m}^n a_k| ≤ ∑_{k=m}^n |a_k| < ε for all n ≥ m ≥ N, the sufficiency of Cauchy’s convergence criterion implies the convergence of ∑_{k=1}^∞ a_k.   □
The following criterion can be seen as a “series version” of the comparison criterion for
sequences presented in Theorem 1.22.
   Theorem 2.11. Majorant criterion
   Let ∑_{k=1}^∞ a_k be a series in F. Moreover, let n_0 ∈ N and let ∑_{k=1}^∞ b_k be a convergent series with |a_k| ≤ b_k for all k ≥ n_0. Then ∑_{k=1}^∞ a_k converges absolutely.
Proof: Let ε > 0. By the Cauchy criterion applied to ∑_{k=1}^∞ b_k, there is an N ≥ n_0 such that for all n ≥ m ≥ N holds

0 ≤ ∑_{k=m}^n |a_k| ≤ ∑_{k=m}^n b_k = |∑_{k=m}^n b_k| < ε.

The Cauchy criterion now implies that ∑_{k=1}^∞ |a_k| converges.   □
     Remark:
   The series ∑_{k=1}^∞ b_k with the properties as stated in Theorem 2.11 is called a majorant of ∑_{k=1}^∞ a_k.
Now we present a kind of reversed majorant criterion that gives us a sufficient criterion
for divergence.
     Theorem 2.12. Minorant criterion
   Let ∑_{k=1}^∞ a_k be a real series. Moreover, let n_0 ∈ N and let ∑_{k=1}^∞ b_k be a divergent series such that a_k ≥ b_k ≥ 0 for all k ≥ n_0. Then ∑_{k=1}^∞ a_k diverges.
Proof: We prove the result by contradiction: Let ∑_{k=1}^{∞} b_k be divergent. Assume that
∑_{k=1}^{∞} a_k converges. Then, due to a_k ≥ b_k ≥ 0, the majorant criterion implies the
convergence of ∑_{k=1}^{∞} b_k, too. This contradicts our assumption.    □
   Remark:
   The series ∑_{k=1}^{∞} b_k with the properties as stated in Theorem 2.12 is called a
   minorant of ∑_{k=1}^{∞} a_k.
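The majorant and minorant criteria can be illustrated numerically on concrete examples. The following sketch (the helper `partial_sum` and the chosen example series are ours, not from the lecture; partial sums only suggest, and never prove, convergence) uses the telescoping series ∑ 2/(k(k+1)) as a majorant of ∑ 1/k², and the harmonic series as a minorant of ∑ 1/√k.

```python
# Numerical illustration of the majorant criterion (Theorem 2.11) and the
# minorant criterion (Theorem 2.12); illustrative only, no substitute
# for the proofs above.

def partial_sum(term, n):
    """n-th partial sum of the series with k-th term term(k)."""
    return sum(term(k) for k in range(1, n + 1))

# Majorant: 1/k^2 <= 2/(k*(k+1)) for all k >= 1, and the telescoping
# majorant sums to exactly 2, so sum 1/k^2 converges.
for n in (10, 100, 1000):
    s = partial_sum(lambda k: 1 / k**2, n)
    m = partial_sum(lambda k: 2 / (k * (k + 1)), n)
    assert s <= m <= 2          # partial sums respect the termwise bound

# Minorant: 1/sqrt(k) >= 1/k >= 0 and the harmonic series diverges,
# so sum 1/sqrt(k) diverges as well -- its partial sums keep growing.
print(partial_sum(lambda k: k**-0.5, 10**4))
```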
– a_k ≠ 0 for all k ≥ n_0;
– |a_{k+1}| / |a_k| ≤ q for all k ≥ n_0.

   Then ∑_{k=1}^{∞} a_k converges absolutely.
Therefore, the series ∑_{k=1}^{∞} q^{k−n_0} |a_{n_0}| is a majorant of ∑_{k=1}^{∞} a_k. Moreover,
this majorant is convergent due to

    ∑_{k=1}^{∞} q^{k−n_0} |a_{n_0}| = |a_{n_0}| / ((1 − q) q^{n_0−1})    (see Example 2.2 a)).

□
   Observing that

       |a_{k+1}| / |a_k| = (k/(k+1))^2,

   the fact that this expression converges to 1 implies that there does not exist some
   q < 1 for which the quotient criterion is fulfilled. However, this series is convergent,
   as we have proven in Example 2.2 c).
   We could also formulate "an alternative quotient criterion" that gives us a sufficient
   criterion for divergence. Namely, consider a real series ∑_{k=1}^{∞} a_k with positive a_k
   and assume that a_{k+1}/a_k ≥ 1 for all k ≥ n_0 for some fixed n_0 ∈ N. This gives us
   0 < a_k ≤ a_{k+1}, i.e., the sequence (a_n)_{n∈N} is positive and monotonically increasing.
   Such a sequence cannot converge to zero and thus, ∑_{k=1}^{∞} a_k is divergent.
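The two faces of the quotient criterion can be seen numerically. In the sketch below (helper `ratios` and the example series k/2^k are our illustrative choices), a fixed q < 1 works for ∑ k/2^k, while for ∑ 1/k² the quotients tend to 1, so no such q exists, even though the series converges: the criterion is sufficient, not necessary.

```python
# Quotients |a_{k+1}| / |a_k| for two series.

def ratios(term, ks):
    return [abs(term(k + 1)) / abs(term(k)) for k in ks]

# a_k = k / 2^k: ratio (k+1)/(2k) -> 1/2, so q = 3/4 works for all k >= 3
# and the quotient criterion gives absolute convergence.
r_geom = ratios(lambda k: k / 2**k, range(1, 20))
assert all(r <= 0.75 for r in r_geom[2:])

# a_k = 1/k^2: ratio (k/(k+1))^2 -> 1, so no q < 1 works for all large k,
# although the series converges.
print(ratios(lambda k: 1 / k**2, [10, 100, 1000]))  # approaches 1 from below
```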
holds

    |a_{k+1}| / |a_k| < c + ε = c + (1 − c)/2 = (1 + c)/2.

Thus, the quotient criterion holds true for q := (1 + c)/2, which satisfies q < 1 due to c < 1.    □
   Remark:
   Since, in case of convergence of |a_{k+1}|/|a_k|, the limit and limes superior coincide,
   the criterion

       lim_{k→∞} |a_{k+1}| / |a_k| < 1

   is also sufficient for absolute convergence of ∑_{k=1}^{∞} a_k. However, this criterion
   requires the convergence of the quotient sequence and is therefore weaker than the
   above one.
   Theorem 2.16.
   Let x ∈ F. Then

       lim_{n→∞} (1 + x/n)^n = ∑_{k=0}^{∞} x^k / k!.
Note that n! / ((n−k)! n^k) is well defined for n ≥ k. For such an n ∈ N we have

    1 ≥ ∏_{j=0}^{k−1} (n−j)/n = n! / ((n−k)! n^k) ≥ (n−k+1)^k / n^k = (1 − (k−1)/n)^k.    (2.2)
For n ≥ N we estimate

    |(1 + x/n)^n − ∑_{k=0}^{∞} x^k/k!|
        = |∑_{k=0}^{n} (n choose k) x^k/n^k − ∑_{k=0}^{∞} x^k/k!|
        ≤ ∑_{k=0}^{K−1} |(n choose k) x^k/n^k − x^k/k!| + ∑_{k=K}^{n} (n choose k) |x|^k/n^k + ∑_{k=K}^{∞} |x|^k/k!
        = ∑_{k=0}^{K−1} |n!/((n−k)! n^k) − 1| · |x|^k/k! + ∑_{k=K}^{n} (n!/((n−k)! n^k)) · |x|^k/k! + ∑_{k=K}^{∞} |x|^k/k!
        < ε/3 + ε/3 + ε/3 = ε.

Here the first sum is smaller than ε/3 for all large n by (2.2); the second sum is bounded by
∑_{k=K}^{∞} |x|^k/k! < ε/3, since n!/((n−k)! n^k) ≤ 1; and the third sum is smaller than ε/3
by the choice of K.    □
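Theorem 2.16 can be watched at work numerically. In this sketch (the helpers `compound` and `exp_series` and the truncation at 50 terms are our illustrative choices; `math.exp` serves only as a reference value), the error between (1 + x/n)^n and the partial sums of the series shrinks as n grows.

```python
import math

# Numerical check of Theorem 2.16: (1 + x/n)^n approaches sum_k x^k / k!.

def compound(x, n):
    return (1 + x / n) ** n

def exp_series(x, K=50):
    """Partial sum sum_{k=0}^{K-1} x^k / k! (50 terms suffice for small x)."""
    return sum(x**k / math.factorial(k) for k in range(K))

x = 1.3
for n in (10, 100, 10_000):
    print(n, abs(compound(x, n) - exp_series(x)))  # error shrinks with n

assert abs(exp_series(x) - math.exp(x)) < 1e-12    # the series limit is e^x
```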
The following criterion of Raabe refines the quotient criterion.
   Theorem 2.17. Raabe criterion
   Let (a_k)_{k∈N} be a sequence in F and assume that there exist β > 1 and k_0 ∈ N such
   that a_k ≠ 0 and

       |a_{k+1}| / |a_k| ≤ 1 − β/k

   for all k ≥ k_0. Then ∑_{k=1}^{∞} a_k converges absolutely.
From (2.3) we conclude that (1/(β−1)) ∑_{k=k_0}^{∞} b_k is a convergent majorant of
∑_{k=k_0}^{∞} |a_k|. By the majorant criterion, ∑_{k=1}^{∞} a_k is absolutely convergent.    □
Proof: Taking the k-th power of the inequality |a_k|^{1/k} < q, we obtain that for all k ≥ n_0
holds

    |a_k| < q^k.

Therefore, the convergent geometric series ∑_{k=1}^{∞} q^k is a majorant of ∑_{k=1}^{∞} a_k
and thus, we have absolute convergence.    □
   Theorem 2.19. Root criterion (limit form)
   Let ∑_{k=1}^{∞} a_k be a series in F and assume that

       limsup_{k→∞} |a_k|^{1/k} < 1.

   Then ∑_{k=1}^{∞} a_k converges absolutely.
Then we have

    limsup_{k→∞} (k^5 / 3^k)^{1/k} = limsup_{k→∞} (k^{1/k})^5 / 3.

Since we know (from the tutorial) that k^{1/k} converges to 1, the whole expression converges
to 1/3 < 1. Hence, the series converges.
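The limit 1/3 from this example can also be observed numerically. In the sketch below (the helper name and the use of logarithms are our choices; logarithms avoid the overflow of 3^k for large k), the k-th roots visibly approach 1/3.

```python
import math

# Numerical look at the example above: the k-th root of k^5 / 3^k tends
# to 1/3, since k^(1/k) -> 1. Illustrative only.

def kth_root_of_term(k):
    # log of (k^5 / 3^k)^(1/k) = 5*log(k)/k - log(3)
    return math.exp(5 * math.log(k) / k - math.log(3))

for k in (10, 100, 1000):
    print(k, kth_root_of_term(k))   # approaches 1/3 ≈ 0.3333
```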
                                                                                           P∞
We will now state two convergence criteria for series of the form                           k=1   ak bk . They are
easily deduced from the following lemma.
   Lemma 2.21. Abel's partial sums
   For n ∈ N and a_1, ..., a_n, b_1, ..., b_{n+1} ∈ F holds

       ∑_{k=1}^{n} a_k b_k = A_n b_{n+1} + ∑_{k=1}^{n} A_k (b_k − b_{k+1}),

   where A_k := ∑_{i=1}^{k} a_i (and A_0 := 0).

Proof: We compute

    ∑_{k=1}^{n} a_k b_k = ∑_{k=1}^{n} (A_k − A_{k−1}) b_k
                        = ∑_{k=1}^{n} A_k b_k − ∑_{k=1}^{n} A_{k−1} b_k
                        = ∑_{k=1}^{n} A_k b_k − ∑_{k=1}^{n−1} A_k b_{k+1}
                        = ∑_{k=1}^{n} A_k b_k − ∑_{k=1}^{n} A_k b_{k+1} + A_n b_{n+1}
                        = ∑_{k=1}^{n} A_k (b_k − b_{k+1}) + A_n b_{n+1}.    □
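Since Abel's identity is a finite algebraic statement, it can be verified directly for arbitrary numbers. The following sketch (random test data is our choice; any a's and b's would do) checks the identity of Lemma 2.21 term by term.

```python
import random

# Direct numerical check of Lemma 2.21 (Abel's partial summation):
# sum_{k=1}^n a_k b_k = A_n b_{n+1} + sum_{k=1}^n A_k (b_k - b_{k+1})
# with A_k = a_1 + ... + a_k.

random.seed(0)
n = 12
a = [random.uniform(-1, 1) for _ in range(n)]        # a_1, ..., a_n
b = [random.uniform(-1, 1) for _ in range(n + 1)]    # b_1, ..., b_{n+1}

A, s = [], 0.0
for ak in a:                  # A[k-1] = a_1 + ... + a_k
    s += ak
    A.append(s)

lhs = sum(a[k] * b[k] for k in range(n))
rhs = A[-1] * b[n] + sum(A[k] * (b[k] - b[k + 1]) for k in range(n))
assert abs(lhs - rhs) < 1e-12
print("Abel identity verified:", lhs)
```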
Proof: Set A_k := ∑_{i=1}^{k} a_i. By assumption both sequences (A_k)_{k∈N} and (b_k)_{k∈N}
converge, so that (A_k b_{k+1})_{k∈N} converges as well. Since (b_k)_{k∈N} is monotonic, the
telescoping series ∑_{k=1}^{∞} (b_k − b_{k+1}) converges absolutely, as

    ∑_{k=1}^{n} |b_k − b_{k+1}| = |∑_{k=1}^{n} (b_k − b_{k+1})| = |b_1 − b_{n+1}| → |b_1 − lim_{k→∞} b_k|   (n → ∞).

Lemma 2.21 then implies that ∑_{k=1}^{∞} a_k b_k converges with

    ∑_{k=1}^{∞} a_k b_k = lim_{n→∞} A_n b_{n+1} + ∑_{k=1}^{∞} A_k (b_k − b_{k+1}).    □
Proof: Set A_k := ∑_{i=1}^{k} a_i. By assumption (A_k)_{k∈N} is bounded. Hence (A_k b_{k+1})_{k∈N}
converges to zero. By the same argument as in the proof of Theorem 2.22, the series
∑_{k=1}^{∞} A_k (b_k − b_{k+1}) is absolutely convergent. Lemma 2.21 again implies that the series
∑_{k=1}^{∞} a_k b_k converges to the limit

    ∑_{k=1}^{∞} a_k b_k = lim_{n→∞} A_n b_{n+1} + ∑_{k=1}^{∞} A_k (b_k − b_{k+1}) = ∑_{k=1}^{∞} A_k (b_k − b_{k+1}),

since the first term is zero.    □
Note that the Leibniz criterion 2.6 follows from the Dirichlet criterion by taking a_k := (−1)^k.
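The Dirichlet/Leibniz mechanism can be observed on the alternating harmonic series. In the sketch below (the reference value −ln 2 is standard and used purely for comparison), the partial sums of a_k = (−1)^k are bounded and b_k = 1/k decreases monotonically to zero, so ∑ (−1)^k/k converges.

```python
import math

# Leibniz as a special case of Dirichlet: a_k = (-1)^k has bounded partial
# sums (always -1 or 0), b_k = 1/k decreases to 0, so sum (-1)^k / k converges.

def partial(n):
    return sum((-1)**k / k for k in range(1, n + 1))

A = [sum((-1)**k for k in range(1, n + 1)) for n in range(1, 50)]
assert set(A) <= {-1, 0}           # partial sums of a_k are bounded

print(partial(10**5))              # close to -ln 2 ≈ -0.6931
```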
   Definition 2.24. Reordering
   Let ∑_{k=1}^{∞} a_k be a series in F and let τ : N → N be a bijective mapping. Then the
   series

       ∑_{k=1}^{∞} a_{τ(k)}

   is called a reordering of ∑_{k=1}^{∞} a_k.
Proof: Let a = ∑_{k=1}^{∞} a_k and let τ : N → N be bijective. Let ε > 0. Since we have absolute
convergence, there exists some N_1 such that

    ∑_{k=N_1}^{∞} |a_k| < ε/2

and thus

    |a − ∑_{k=1}^{N_1−1} a_k| = |∑_{k=N_1}^{∞} a_k| ≤ ∑_{k=N_1}^{∞} |a_k| < ε/2.
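Why absolute convergence matters here can be seen numerically. The sketch below (the reordering pattern and the classical reference values are our illustrative choices) reorders two alternating series by the same pattern, one positive term followed by two negative ones: the conditionally convergent alternating harmonic series changes its limit, while the absolutely convergent series ∑ (−1)^{k+1}/k² does not.

```python
import math

# A reordering can change the limit of a conditionally convergent series,
# but not of an absolutely convergent one. Pattern: one positive term,
# then two negative terms.

def rearranged(term, blocks):
    """Sum `blocks` rounds of: next odd index, then next two even indices."""
    total, pos, neg = 0.0, 1, 2
    for _ in range(blocks):
        total += term(pos); pos += 2
        total += term(neg); neg += 2
        total += term(neg); neg += 2
    return total

alt_harm = lambda k: (-1) ** (k + 1) / k       # conditionally convergent
alt_sq   = lambda k: (-1) ** (k + 1) / k**2    # absolutely convergent

s_cond = rearranged(alt_harm, 10**4)
s_abs  = rearranged(alt_sq, 10**4)
print(s_cond)   # near (1/2) ln 2 ≈ 0.3466, not ln 2 ≈ 0.6931
print(s_abs)    # still near pi^2 / 12 ≈ 0.8225
```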
   Remark:
   Note that we considered series with lower summation index 0. The Cauchy product
   can also be defined for series ∑_{k=n_0}^{∞} a_k, ∑_{k=n_1}^{∞} b_k with arbitrary
   n_0, n_1 ∈ N (or even n_0, n_1 ∈ Z). In this case, the Cauchy product is given by

       ∑_{k=n_0+n_1}^{∞} c_k    with    c_k = ∑_{l=n_0}^{k−n_1} a_l b_{k−l}.

   In order to "keep the set of indices manageable", this is not treated further here.
   Note that the following results about convergence properties of the Cauchy product
   still hold true in this more general case.
The following theorem justifies the name “product” in the above definition.
   Theorem 2.27. Convergence of the Cauchy product
   Let ∑_{k=0}^{∞} a_k, ∑_{k=0}^{∞} b_k be series in F. Assume that ∑_{k=0}^{∞} a_k is
   absolutely convergent and ∑_{k=0}^{∞} b_k is convergent. Then the Cauchy product
   ∑_{k=0}^{∞} c_k is convergent. Moreover, the limit satisfies

       ∑_{k=0}^{∞} c_k = (∑_{k=0}^{∞} a_k) · (∑_{k=0}^{∞} b_k).
Let ε > 0. Since (B_n)_{n∈N} converges to b, there exists some N_0 such that for all n ≥ N_0
holds

    |B_n − b| < ε / (4 (∑_{k=0}^{∞} |a_k| + 1)).

Since (a_n)_{n∈N} converges to zero (see Theorem 2.5), there exists some N_1 such that for all
n ≥ N_1 holds

    |a_n| < ε / (4 N_0 (sup{|b − B_l| : l ∈ N} + 1)).

Also there exists some N_2 such that for all n ≥ N_2 holds

    |A_n − a| < ε / (2 (|b| + 1)).
Therefore, with N = max{N_0 + N_1, N_2}, we have that for all n ≥ N holds

    |C_n − ab| = |∑_{k=0}^{n} a_{n−k} (B_k − b) + b (A_n − a)|
               ≤ ∑_{k=0}^{n} |a_{n−k}| |B_k − b| + |b| |A_n − a|
               = ∑_{k=0}^{N_0−1} |a_{n−k}| |B_k − b| + ∑_{k=N_0}^{n} |a_{n−k}| |B_k − b| + |b| |A_n − a|
               < ∑_{k=0}^{N_0−1} ε |B_k − b| / (4 N_0 (sup{|b − B_l| : l ∈ N} + 1))
                   + ∑_{k=N_0}^{n} ε |a_{n−k}| / (4 (∑_{j=0}^{∞} |a_j| + 1))
                   + ε |b| / (2 (|b| + 1))
               < ε/4 + ε/4 + ε/2 = ε.    □
Example 2.28. Let x, y ∈ F and consider the series

    ∑_{k=0}^{∞} x^k/k!,    ∑_{k=0}^{∞} y^k/k!,

which are absolutely convergent as shown in Example 2.15. Then the Cauchy product of
both series is given by ∑_{k=0}^{∞} c_k with

    c_k = ∑_{l=0}^{k} (x^l/l!) · (y^{k−l}/(k−l)!) = (1/k!) ∑_{l=0}^{k} (k choose l) x^l y^{k−l}.

By the Binomial Theorem (see tutorial), we obtain

    ∑_{l=0}^{k} (k choose l) x^l y^{k−l} = (x + y)^k.

Hence, by Theorem 2.27, we have

    (∑_{k=0}^{∞} x^k/k!) · (∑_{k=0}^{∞} y^k/k!) = ∑_{k=0}^{∞} (x + y)^k/k!.

Altogether, this means that the function

    f(x) = ∑_{k=0}^{∞} x^k/k!

fulfills f(x + y) = f(x) · f(y) for all x, y ∈ R (and even x, y ∈ C). This property is, for
instance, fulfilled by the exponential function. Indeed, we have that f as defined above
fulfills f(x) = e^x.
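The functional equation from Example 2.28 is easy to check numerically. In this sketch (the truncation at 40 terms and the sample points are our choices; `math.exp` is used only as a reference), the truncated series satisfies f(x)·f(y) ≈ f(x + y) up to rounding.

```python
import math

# Numerical check of Example 2.28: truncating f(x) = sum x^k / k! still
# gives f(x) * f(y) ≈ f(x + y), as predicted by the Cauchy product.

def f(x, K=40):
    return sum(x**k / math.factorial(k) for k in range(K))

x, y = 0.7, -1.2
lhs = f(x) * f(y)
rhs = f(x + y)
assert abs(lhs - rhs) < 1e-12
assert abs(f(x) - math.exp(x)) < 1e-12   # f is indeed the exponential
print(lhs, math.exp(x + y))
```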
3 Continuous Functions

                             Now do you think you can beat the champ?
                                     I can take him blindfolded.
                             What if he's not blindfolded?
                                                              Police Squad!
The central notion of this chapter is continuity, which is a property of functions. Very
roughly speaking, it means that the function has no jumps. This is again a very abstract
concept. However, the remaining parts of the lecture will get easier!
Pointwise convergence means nothing else but that for all x ∈ I, the sequence (f_n(x))_{n∈N}
in F converges to f(x).
We now present an alternative convergence concept for functions:
     Definition 3.3. Uniform convergence
     A sequence (fn )n∈N of functions fn : I → F is called uniformly convergent to
     f : I → F if for all ε > 0 there exists some N such that for all n ≥ N and
     x ∈ I holds
                                    |f (x) − fn (x)| < ε.
     Using logical quantifiers this reads (in contrast to (3.1)):
As a rule of thumb, you can think of pushing one quantifier to the right; of course, this
changes a lot. The interpretation is that we can measure the distance between two
functions f and g as the largest distance between the two graphs, that is, the supremum
of the pointwise distances |f(x) − g(x)|:
We have uniform convergence if this measured distance between f_n and f converges to
zero. (See below.)
[Figure: graphs of f_1, f_2, f_3 on [−1, 1]]
One sees that the functions f_n converge pointwise to a limit function. We also see that
the limit function acquires a jump as n increases. The distance between the limit function
and each member of the sequence does not become small, since there is always an x ∈ R
which can be chosen close enough to 0 to approximate this distance. This means that the
sequence of functions (f_n)_{n∈N} does not converge uniformly in spite of being pointwise
convergent. Uniform convergence is in fact a much stronger notion.
We will now see that uniform convergence is a stronger property than pointwise conver-
gence.
   Theorem 3.5.
   A sequence (fn )n∈N of functions fn : I → F converges uniformly to f : I → F, if,
   and only if,
   This means that uniform convergence is nothing but convergence with respect to the
   infinity norm || · ||∞ .
[Figure: graph of f together with the functions f_+ = f + ε and f_− = f − ε enclosing f_n]
Proof. Assume that (f_n)_{n∈N} converges uniformly to f. Let ε > 0. Then there exists some
N such that for all n ≥ N and x ∈ I holds

    |f(x) − f_n(x)| < ε/2.

Therefore, for all n ≥ N, we have

    sup{|f(x) − f_n(x)| : x ∈ I} ≤ ε/2 < ε,

and thus, (3.3) holds true.
Conversely, assuming that (3.3) holds true, we obtain that for ε > 0, there exists some N
with the property that for all n ≥ N holds
                                 sup{|f (x) − fn (x)| : x ∈ I} < ε.
This means that for all n ≥ N and x ∈ I, there holds that
                                          |f (x) − fn (x)| < ε.
However, this statement is nothing but uniform convergence of (f_n)_{n∈N} towards f.    □
Example 3.6. a) Let I = [0, 1] and consider the sequence f_n(x) = x^n. Then we have
   the pointwise limit

       f(x) = lim_{n→∞} f_n(x) = { 0, if x ∈ [0, 1),
                                   1, if x = 1.

b) We now consider the same sequence on the smaller interval [0, 1/2]. The pointwise limit
   is now f = 0. For n ∈ N, we have

       sup{|f(x) − f_n(x)| : x ∈ [0, 1/2]} = sup{x^n : x ∈ [0, 1/2]} = 1/2^n.

   Therefore

       lim_{n→∞} sup{|f(x) − f_n(x)| : x ∈ [0, 1/2]} = 0

   and hence, we have uniform convergence.
c) Define the function f_n : [0, 1] → R by

       f_n(x) = { n^2 x (1 − n x), if x ∈ [0, 1/n),
                  0,               if x ∈ [1/n, 1].

   The pointwise limit is f = 0, since f_n(0) = 0 and f_n(x) = 0 if x > 1/n. The sequence
   (f_n)_{n∈N} is however not uniformly convergent to f = 0, since

       sup{|f_n(x) − 0| : x ∈ [0, 1]} ≥ f_n(1/(2n)) = n/4.
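The sup-distances of Example 3.6 can be approximated on a grid. In the following sketch (the helper `sup_on_grid` and the grid resolution are our choices; a grid maximum only approximates the supremum), the distance 1/2^n on [0, 1/2] tends to zero, while the bump functions of part c) keep a distance of at least n/4.

```python
# Grid approximation of sup-distances from Example 3.6 b) and c).

def sup_on_grid(f, a, b, m=10**4):
    return max(abs(f(a + (b - a) * i / m)) for i in range(m + 1))

# b) x^n on [0, 1/2]: sup = 1/2^n -> 0, i.e., uniform convergence to 0.
for n in (1, 5, 10):
    d = sup_on_grid(lambda x: x**n, 0.0, 0.5)
    assert abs(d - 0.5**n) < 1e-12       # sup is attained at x = 1/2

# c) the bumps n^2 x (1 - n x) on [0, 1/n): sup >= f_n(1/(2n)) = n/4,
# so the sequence cannot converge uniformly to 0.
def bump(n):
    return lambda x: n**2 * x * (1 - n * x) if x < 1 / n else 0.0

for n in (4, 8, 16):
    assert sup_on_grid(bump(n), 0.0, 1.0) >= n / 4 - 1e-3
print("checked")
```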
   Theorem 3.7.
   Let (fn )n∈N be a sequence of bounded functions fn : I → F. Assume that (fn )n∈N
   converges uniformly to f : I → F. Then f is bounded.
Proof: For ε = 1, there exists some N such that for all n ≥ N and x ∈ I holds

    |f(x) − f_n(x)| < 1.

In particular, we have |f_N(x) − f(x)| < 1 for all x ∈ I. This implies that for all x ∈ I,
there holds

    |f(x)| < |f_N(x)| + 1.

The boundedness of f_N then implies the boundedness of f.    □
   Remark:
   Note that the assumption of uniform convergence is essential for the boundedness of
   f. For instance, consider the sequence (f_n)_{n∈N} of bounded functions f_n : [0, ∞) → R
   with

       f_n(x) = { x, if x < n,
                  0, else.

   First we argue that (f_n)_{n∈N} converges pointwise to f : [0, ∞) → R with f(x) = x:
   Let x ∈ [0, ∞). Then there exists some N ∈ N with x < N. Hence, for all n ≥ N,
   we have f_n(x) = x. This implies the claimed convergence.
   Second, we state that each f_n is bounded: This is a consequence of the fact that, by
   the definition of f_n, there holds f_n(x) < n for all x ∈ [0, ∞).
   Altogether, we have found a sequence of bounded functions converging pointwise to
   an unbounded function. Hence, Theorem 3.7 is no longer valid if we replace the
   phrase "uniformly convergent" by "pointwise convergent".
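The counterexample from this remark can be put into numbers. The sketch below (the sample grids and test points are our illustrative choices) confirms that each f_n stays below n, while for every fixed x the values f_n(x) eventually equal x, so the pointwise limit f(x) = x is unbounded.

```python
# The remark's counterexample: f_n(x) = x for x < n and 0 otherwise is
# bounded by n for each n, yet the pointwise limit f(x) = x is unbounded.

def f_n(n):
    return lambda x: x if x < n else 0.0

for n in (3, 10):
    xs = [i * 0.1 for i in range(20 * n)]            # sample [0, 2n)
    assert max(f_n(n)(x) for x in xs) < n            # f_n is bounded by n

# pointwise limit: for fixed x, f_n(x) = x as soon as n > x
x = 7.3
assert all(f_n(n)(x) == x for n in range(8, 20))
print("each f_n bounded, limit f(x) = x unbounded")
```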
3.2 Continuity
Now we begin to introduce the concept of continuity.
   Definition 3.8.
   Let I ⊂ F, let f : I → F be a function, and let x0 ∈ I. Then we define
    (i) the limit of f as x tends to x0 by c ∈ F if for all sequences (xn )n∈N in I\{x0 }
        with limn→∞ xn = x0 holds limn→∞ f (xn ) = c. In this case, we write
                                          lim_{x→x_0} f(x) = c,

                                          lim_{x↗x_0} f(x) = c,

                                          lim_{x↘x_0} f(x) = c.
     In all three cases we assume that at least one sequence (xn )n∈N with the stated
     property exists.
      Remark:
      From the above definition, we can also conclude that lim_{x→x_0} f(x) exists in the case
      I ⊂ R if and only if lim_{x↗x_0} f(x) and lim_{x↘x_0} f(x) exist and are equal. In this
      case, there holds

          lim_{x↗x_0} f(x) = lim_{x↘x_0} f(x) = lim_{x→x_0} f(x).
     Though not explicitly introduced in the above definition, it should be intuitively clear
     what is meant by the following expressions
      Then we have lim_{x↗0} H(x) = 0, since for all x_n ∈ R with x_n < 0 holds H(x_n) = 0.
      Further, lim_{x↘0} H(x) = 1, since for all x_n ∈ R with x_n > 0 holds H(x_n) = 1. The
      limit lim_{x→0} H(x) does not exist: take, e.g., the sequence x_n = (−1)^n/n. Then

          H(x_n) = { 1, if n is even,
                     0, if n is odd.
     Then for all sequences (xn )n∈N in R\{0} holds that (f (xn ))n∈N is a constant zero
     sequence. Hence, limx→0 f (x) = 0.
c) Consider a polynomial p : R → R with p(x) = an xn + . . . + a1 x + a0 for some given
   a0 , . . . , an ∈ R. Let x0 ∈ R. By Theorem 1.21, we have that for all real sequences
   (x_n)_{n∈N} converging to x_0 holds that p(x_n) converges to p(x_0), i.e.,

       lim_{x→x_0} p(x) = p(x_0).
   Remark:
   Sometimes we will just say f : I → F is continuous whereby we mean it is continuous
   on I.
   is continuous on R.
g) The function f : R → R with
       f(x) = { 1, if x ∈ Q,
                0, if x ∉ Q
   is everywhere discontinuous.
   Proof: Let x0 ∈ R:
   First case: x_0 ∈ Q. Then take a sequence (x_n)_{n∈N} with lim_{n→∞} x_n = x_0 and
   x_n ∈ R\Q (for instance, x_n = x_0 + √2/n). Then f(x_n) = 0 for all n ∈ N and thus
   lim_{n→∞} f(x_n) = 0 ≠ 1 = f(x_0).
   Second case: x_0 ∈ R\Q. Then take a sequence (x_n)_{n∈N} with lim_{n→∞} x_n = x_0 and
   x_n ∈ Q (this exists since every real number can be approximated by rational numbers
   with arbitrarily good precision). Then f(x_n) = 1 for all n ∈ N and thus
   lim_{n→∞} f(x_n) = 1 ≠ 0 = f(x_0).    □
(i) f is continuous in x0 ;
     (ii) For all ε > 0 there exists some δ > 0 such that for all x ∈ I with |x − x0 | < δ
          holds
                                        |f (x) − f (x0 )| < ε.
Proof:
“(i)⇒(ii)”: Assume that (ii) is not fulfilled, i.e., there exists some ε > 0, such that for
all δ > 0, there exists some x ∈ I \ {x0 } with |x − x0 | < δ and |f (x) − f (x0 )| ≥ ε. As
a consequence, for all n ∈ N, there exists some xn ∈ I \ {x0 } with
                          |x0 − xn | < 1/n   and   |f (xn ) − f (x0 )| ≥ ε.
Therefore, the sequence (xn )n∈N converges to x0 , but |f (xn ) − f (x0 )| ≥ ε for all n ∈ N,
i.e., (f (xn ))n∈N is not converging to f (x0 ).
“(ii)⇒(i)”: Let (xn )n∈N be a sequence in I\{x0 } that converges to x0 . Let ε > 0. Then
there exists some δ > 0 such that for all x ∈ I with |x − x0 | < δ holds |f (x) − f (x0 )| <
ε. Since (xn )n∈N converges to x0 , there exists some N such that for all n ≥ N holds
|xn − x0 | < δ. Hence, for all n ≥ N we have |f (xn ) − f (x0 )| < ε, i.e., (f (xn ))n∈N
converges to f (x0 ).
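The ε-δ-criterion can be explored numerically. A small sketch (our own toy example, not from the notes): for f(x) = x² at x0 = 1 and a given ε, halve a candidate δ until the condition holds on a grid of sample points. This is a heuristic sampled check, not a proof.

```python
# Numerical illustration of the epsilon-delta criterion for f(x) = x^2
# at x0 = 1 (an arbitrary example; the function name is ours).

def find_delta(f, x0, eps, start=1.0):
    """Halve a candidate delta until |f(x) - f(x0)| < eps holds for all
    sampled x with |x - x0| < delta (a sampled check, not a proof)."""
    delta = start
    while not all(abs(f(x0 + delta * k / 100.0) - f(x0)) < eps
                  for k in range(-99, 100)):
        delta /= 2.0
    return delta

f = lambda x: x * x
delta = find_delta(f, 1.0, 0.1)
assert delta > 0
assert abs(f(1.0 + 0.9 * delta) - f(1.0)) < 0.1
```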
g(I2 ) = {g(x) : x ∈ I2 } ⊂ I1 .
Proof: Let (xn )n∈N be a sequence in I2 with limn→∞ xn = x0 . By the continuity of g holds
limn→∞ g(xn ) = g(x0 ). Then, by the continuity of f holds limn→∞ f (g(xn )) = f (g(x0 )),
i.e., f ◦ g is continuous in x0 .                                                        2
In the following, we collect very important properties of continuous functions. The first
result is that continuous functions are bounded as long as they are defined on some compact
set. Thereafter, we present the famous Intermediate Value Theorem which basically states
that continuous functions attain every value between f (x0 ) and f (x1 ) for some arbitrary
x0 , x1 ∈ I. This result leads us to think about continuous functions as “those functions
whose graph can be drawn without putting down the pencil”.
   Theorem 3.15. Continuous functions defined on a compact set
   Let I ⊂ F be compact and let f : I → F be continuous. Then f (I) is compact. In
   particular, by the Theorem of Heine-Borel, f (I) is bounded and closed. If further
   f (I) ⊂ R, then there exist x+ , x− ∈ I such that f (x− ) ≤ f (x) ≤ f (x+ ) for all x ∈ I.
Proof. Let (yn )n∈N be a sequence in f (I). Then for each n ∈ N there is an xn ∈ I such
that yn = f (xn ). Since I is compact, there exists a subsequence (xnk )k∈N that converges
to some x ∈ I. Now, since f is continuous, we have limk→∞ ynk = limk→∞ f (xnk ) = f (x) ∈ f (I).
Hence we found a subsequence (ynk )k∈N of (yn )n∈N that converges in f (I). Therefore f (I)
is compact.
Now we show that a uniformly convergent sequence of continuous functions has to converge
to a continuous function.
     Theorem 3.16.
     Let I ⊂ F and let (fn )n∈N be a sequence of continuous functions fn : I → F that
     uniformly converges to some f : I → F. Then f is continuous.
Proof: Let ε > 0 and let x0 ∈ I. Since (fn )n∈N converges uniformly to f : I → F, there
exists some N such that for all n ≥ N and x ∈ I holds
                                    |f (x) − fn (x)| < ε/3.
Since fN is continuous on I, there exists some δ > 0 such that for all x ∈ I with |x−x0 | < δ
holds
                                    |fN (x0 ) − fN (x)| < ε/3.
Altogether, we then have for all x ∈ I with |x − x0 | < δ that
       |f (x) − f (x0 )| ≤ |f (x) − fN (x)| + |fN (x) − fN (x0 )| + |fN (x0 ) − f (x0 )|
                         < ε/3 + ε/3 + ε/3 = ε,
i.e., f is continuous in x0 .
Proof. Without loss of generality, we assume that x0 < x1 . Moreover, we can assume
without loss of generality that y = 0 (otherwise, consider the function f (x) − y instead of
f ). Furthermore, we can assume without loss of generality that f (x0 ) ≤ 0 and f (x1 ) ≥ 0
(otherwise, consider −f instead of f ).
We will construct x̂ by nested intervals (compare the proof of Theorem 1.38).
Inductively define A0 = x0 , B0 = x1 and for k ≥ 1,
  a) Ak = Ak−1 , Bk = (Ak−1 + Bk−1 )/2, if f ((Ak−1 + Bk−1 )/2) ≥ 0, and
  b) Ak = (Ak−1 + Bk−1 )/2, Bk = Bk−1 , if f ((Ak−1 + Bk−1 )/2) ≤ 0.
Then we have
                                limn→∞ An = limn→∞ Bn =: x̂.
By the continuity of f and the fact that f (An ) ≤ 0, f (Bn ) ≥ 0 for all n ∈ N, we have
                           0 ≥ limn→∞ f (An ) = f (x̂) = limn→∞ f (Bn ) ≥ 0,
and thus f (x̂) = 0.
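The nested intervals Ak, Bk from the proof are exactly the well-known bisection method. A minimal sketch (the example function x² − 2 and the interval are our own choices, not from the notes):

```python
# Bisection following the proof: keep f(A_k) <= 0 <= f(B_k) and halve.
def bisect(f, a, b, n=60):
    """Assumes f(a) <= 0 <= f(b); returns an approximation of a zero of f."""
    A, B = a, b
    for _ in range(n):
        m = (A + B) / 2
        if f(m) >= 0:
            B = m          # case a) of the proof
        else:
            A = m          # case b) of the proof
    return (A + B) / 2

# Example: f(x) = x**2 - 2 on [0, 2] has the zero sqrt(2).
x_hat = bisect(lambda x: x * x - 2, 0.0, 2.0)
assert abs(x_hat ** 2 - 2) < 1e-12
```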
     Note that a normed vector space (V, || · ||) is a special case of a metric space since we can define
     X := V and d||·|| (v, w) := ||v − w|| for all v, w ∈ V to obtain a metric space (X, d||·|| ) with the
     same so-called topological properties. Note also that a metric space has no algebraic structure,
     that is, there is no summation or scalar multiplication defined on X. In particular, this means
     that generalizations of introduced concepts and theorems to metric spaces only make sense in
     those cases where no algebraic properties are needed.
     For example, the (ε-δ)-criterion for continuity generalizes to a function f : X1 → X2 from one
     metric space (X1 , d1 ) to another metric space (X2 , d2 ) in the following way: f is continuous
     in x ∈ X1 if and only if for each ε > 0 there is a δ > 0 such that for all y ∈ X1 with
     d1 (x, y) < δ holds d2 (f (x), f (y)) < ε.
     Finally we mention that metric spaces are still not the end of the road for possible
     generalizations. Even more general so-called topological spaces (X, τ ) can be considered,
     where X ≠ ∅
     is a set and τ is a set of subsets of X, i.e. each U ∈ τ is a subset of X. Then τ is called a topology
     on X if the following properties are satisfied:
        a) ∅, X ∈ τ
        b) The union of each subset W of τ is again an element of τ , that is
           ∪W := {x ∈ X | ∃W ∈ W : x ∈ W } ∈ τ .
        c) The intersection of each finite subset W of τ is again an element of τ , that is, if |W| ∈ N,
           then ∩W := {x ∈ X | ∀W ∈ W : x ∈ W } ∈ τ .
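For a finite set X the three axioms can be checked mechanically. A small sketch (the example topology is our own; for a finite collection τ it suffices to check ∅, X and closure under pairwise unions and intersections, since all unions and finite intersections reduce to these):

```python
def is_topology(X, tau):
    """Check the topology axioms for a finite collection tau of subsets of X.
    For finite tau, closure under pairwise union/intersection together with
    the presence of the empty set and X implies axioms a) - c)."""
    tau = {frozenset(U) for U in tau}
    if frozenset() not in tau or frozenset(X) not in tau:
        return False          # axiom a) violated
    for U in tau:
        for V in tau:
            if U | V not in tau or U & V not in tau:
                return False  # axiom b) or c) violated
    return True

X = {1, 2, 3}
assert is_topology(X, [set(), {1}, {1, 2}, X])
assert not is_topology(X, [set(), {1}, {2}, X])  # {1} ∪ {2} = {1, 2} is missing
```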
     The elements of τ are called open sets and the complements of open sets X\U where U ∈ τ are
     called closed sets. A metric space (X, d) is a special case of a topological space. If we set
τd := {U ⊂ X | ∀x ∈ U ∃ε > 0 : Bε (x) ⊂ U },
     where Bε (x) := {y ∈ X | d(x, y) < ε} is the open d-ball with radius ε around x, then (X, τd ) is a
     topological space with the same topological properties as (X, d).
     Now, for example, a function f : X1 → X2 from one topological space (X1 , τ1 ) to an-
     other topological space (X2 , τ2 ) is continuous, if for each U2 ∈ τ2 holds f −1 (U2 ) ∈ τ1 , which
     means that each preimage of an open set in X2 is an open set in X1 . Indeed, this generalizes the
     (ε-δ)-criterion for continuity for metric spaces to general topological spaces.
     Good references to metric and topological spaces are the books “Mengentheoretische Topologie”
     of B.v. Querenburg, and “General Topology” of R. Engelking.
4  Elementary Functions
                           Her lips said no, but her eyes said ’Read my lips’ !
                                                                                  Niles Crane
Here we will introduce some important functions like polynomials, rational functions, exp,
sin, cos, log, sinh, cosh, power series etc. We will also consider the extension of some of
the aforementioned functions to the complex plane.
   Definition 4.1.
    The exponential function exp : F → F is defined as
                                     exp(x) = Σ_{k=0}^∞ x^k /k! .
    The number
                                     e = exp(1) = Σ_{k=0}^∞ 1/k!
    is called Euler’s number.
e ≈ 2.718281828459046.
We already know from Example 2.15 that the above defined series converges for all x ∈ F.
Now we present an estimate for exp(x) if the series is replaced by a finite sum.
     Theorem 4.2.
      For n ∈ N and x ∈ F with |x| ≤ 1 + n/2 holds
                  exp(x) = Σ_{k=0}^n x^k /k! + rn (x)   with   |rn (x)| ≤ 2 |x|^(n+1) /(n + 1)! .
Proof:
              |rn (x)| = | Σ_{k=n+1}^∞ x^k /k! | ≤ Σ_{k=n+1}^∞ |x|^k /k!
                       = (|x|^(n+1) /(n + 1)!) · Σ_{k=0}^∞ |x|^k /((n + 2)(n + 3) · · · (n + 1 + k))
                       ≤ (|x|^(n+1) /(n + 1)!) · Σ_{k=0}^∞ (|x|/(n + 2))^k
                       = (|x|^(n+1) /(n + 1)!) · 1/(1 − |x|/(n + 2)).
Since |x| ≤ n/2 + 1 = (n + 2)/2, the geometric series above converges and we have
              |rn (x)| ≤ (|x|^(n+1) /(n + 1)!) · 1/(1 − 1/2) = 2 |x|^(n+1) /(n + 1)! .
                                                                                         2
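The remainder estimate of Theorem 4.2 can be checked numerically. A sketch (our own illustration), using Python's math.exp as the reference value:

```python
import math

def exp_partial(x, n):
    """Partial sum sum_{k=0}^{n} x**k / k! of the exponential series."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

x, n = 1.0, 5                       # |x| <= 1 + n/2 is satisfied here
r = math.exp(x) - exp_partial(x, n)             # the remainder r_n(x)
bound = 2 * abs(x) ** (n + 1) / math.factorial(n + 1)
assert abs(r) <= bound              # |r_n(x)| <= 2 |x|^(n+1) / (n+1)!
```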
     Theorem 4.3. Properties of the Exponential Function
       (ii) For all x ∈ C, exp(x̄) is the complex conjugate of exp(x) (here ȳ denotes the
            complex conjugate of y ∈ C).
Proof:
 (i) Already shown in Example 2.28.
 (iii) By definition of exp, we have exp(0) = 1. From (i), we get exp(x) · exp(−x) =
       exp(x − x) = exp(0) = 1. Then the statement follows.
 (iv) From x ≥ 0, we get
                         exp(x) = Σ_{k=0}^∞ x^k /k! = 1 + Σ_{k=1}^∞ x^k /k! ≥ 1.
      It is immediately clear that equality only holds for x = 0.
 (v) If x < 0, we get from (iv) that exp(−x) > 1. Then, by (iii), exp(x) = 1/ exp(−x) < 1.
 (vi) First we show that exp is continuous at 0. Let (xn )n∈N be a sequence converging to
       0. By Theorem 4.2 holds for (small enough) xn that | exp(xn ) − 1| = |r0 (xn )| with
                                  |r0 (xn )| ≤ 2 |xn |/1! = 2|xn |.
      Therefore,
                lim | exp(xn ) − exp(0)| = lim | exp(xn ) − 1| = lim |r0 (xn )| = 0
                n→∞                              n→∞                          n→∞
      which proves the continuity at 0. In order to show continuity on the whole real axis,
      we assume that (xn )n∈N converges to x0 ∈ R and make use of
                            lim exp(xn ) = exp(x0 ) lim exp(xn − x0 ).
                            n→∞                             n→∞
      Since (xn − x0 )n∈N converges to 0, the above limit on the right hand side converges
      to 1 (due to the continuity in 0). Therefore, we have
                                       lim exp(xn ) = exp(x0 )
                                      n→∞
(viii) The fact limx→∞ exp(x) = ∞ follows, since we have for x > 0 that
                       exp(x) = Σ_{k=0}^∞ x^k /k! = 1 + x + Σ_{k=2}^∞ x^k /k! > 1 + x.
        If we now show that | exp(ix2 )| = 1, the result is proven: Making use of (ii), we
        obtain | exp(ix2 )|² = exp(ix2 ) · exp(−ix2 ) = exp(0) = 1, since by (ii) the complex
        conjugate of exp(ix2 ) is exp(−ix2 ).
[Figure: graph of the exponential function on [−3, 3].]
4.2 Logarithm
We will now define the logarithm as the inverse function of the exponential function.
As we have seen, the exponential function is bijective as a map from R to (0, ∞). This
justifies the following definition.
     Definition 4.4.
     The (natural) logarithm log : (0, ∞) → R is defined as the inverse function of
     exp : R → (0, ∞), i.e., for all x ∈ R holds log(exp(x)) = x and for all y ∈ (0, ∞)
     holds exp(log(y)) = y.
In many books, the above defined function is called logarithmus naturalis and denoted by ln.
Before we collect some properties, a general result about continuity of inverse functions
is presented.
Proof: First we show strict monotonic increase. Let y1 , y2 ∈ J with y1 < y2 . Then
                           f (f −1 (y1 )) = y1 < y2 = f (f −1 (y2 ))
shows that, since f is strictly monotonically increasing, f −1 (y1 ) ≥ f −1 (y2 ) cannot hold,
so that f −1 (y1 ) < f −1 (y2 ). But this means that f −1 is strictly monotonically increasing.
Now we show continuity. Let ε > 0 and y0 ∈ J = f (I). Set x0 := f −1 (y0 ) ∈ I. Since I is
an open interval, there is an ε0 > 0 with ε0 < ε such that [x0 − ε0 , x0 + ε0 ] ⊂ I. Since f is
strictly monotonically increasing, f (x0 − ε0 ) < f (x0 ) = y0 < f (x0 + ε0 ). Hence we can
choose δ > 0 such that f (x0 − ε0 ) < y0 − δ and y0 + δ < f (x0 + ε0 ).
Then for y ∈ J with |y − y0 | < δ holds f (x0 − ε0 ) < y < f (x0 + ε0 ) and the intermediate
value theorem yields x0 − ε0 < f −1 (y) < x0 + ε0 ,
i.e. |f −1 (y) − f −1 (y0 )| < ε0 < ε. By the ε-δ criterion this means that f −1 is continuous in y0 . 2
We want to remark that Theorem 4.5 also holds for intervals I and J which are not
open. In this case the proof of continuity of f −1 in y0 ∈ J has to be slightly adapted for
boundary points x0 := f −1 (y0 ). We omit this here for simplicity.
   Theorem 4.6. Properties of the Logarithm
Proof:
  (i) Define x1 = log(x) and y1 = log(y). Then, using x = exp(x1 ) and y = exp(y1 ), we
      obtain log(xy) = log(exp(x1 ) exp(y1 )) = log(exp(x1 + y1 )) = x1 + y1 = log(x) + log(y).
[Figure: graph of the natural logarithm.]
     However, formally this is a quite delicate issue and not treated in this lecture in
     much detail. For further information on complex logarithms we refer to books on
     Complex Analysis (German: Funktionentheorie).
By means of the exponential function, the general power ax (for a > 0, x ∈ C) can be
defined as follows:
                                ax := exp(log(a) · x).
This definition indeed makes sense as a0 = exp(log(a) · 0) = 1, a1 = exp(log(a) · 1) = a
and (for n ∈ N)
                  aⁿ = exp(log(a) + . . . + log(a)) = exp(log(a))ⁿ        (n summands).
It can further be seen that this definition implies a^(1/n) = ⁿ√a. The definition of the
general power also justifies the notation exp(x) = eˣ .
A remaining question is how to solve the equation
ax = y
for given a > 0, y ∈ R and unknown x ∈ R. Using the definition of the general power,
this equation becomes
                                exp(log(a) · x) = y.
Performing the logarithm on both sides of this equation, we obtain log(a) · x = log(y) and
thus
                                      x = log(y)/ log(a).
In some literature, this expression is known as the logarithm of y to the base a and
abbreviated by
                                    loga (y) := log(y)/ log(a).
For instance, the decimal logarithm is
                                    log10 (x) = log(x)/ log(10).
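The definition aˣ := exp(log(a)·x) and the base-change formula above can be sketched directly (the helper names are ours, not from the notes):

```python
import math

def power(a, x):
    """General power via the definition a^x = exp(log(a) * x), for a > 0."""
    return math.exp(math.log(a) * x)

def log_base(a, y):
    """Logarithm of y to the base a: log_a(y) = log(y) / log(a)."""
    return math.log(y) / math.log(a)

assert abs(power(2, 10) - 1024) < 1e-9          # 2^10 = 1024
assert abs(log_base(2, 1024) - 10) < 1e-12      # solves 2^x = 1024
assert abs(power(2, 0.5) - math.sqrt(2)) < 1e-12  # a^(1/2) = sqrt(a)
```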
          sinh(x) = (1/2)(exp(x) − exp(−x)),        cosh(x) = (1/2)(exp(x) + exp(−x)).
[Figure: graphs of sinh and cosh on [−5, 5].]
          sinh(x) = (1/2)(eˣ − e⁻ˣ) = (1/2)( Σ_{k=0}^∞ x^k /k! − Σ_{k=0}^∞ (−x)^k /k! )
                  = (1/2)( Σ_{k=0}^∞ x^k /k! − Σ_{k=0}^∞ (−1)^k x^k /k! )
                  = (1/2) · 2 Σ_{k=1,3,5,...} x^k /k! = Σ_{k=0}^∞ x^(2k+1) /(2k + 1)! .
 sinh(x + a) = (1/2)(e^(x+a) − e^(−(x+a)) ) = (1/2)(eᵃ eˣ − e⁻ᵃ e⁻ˣ ) > (1/2)(eˣ − e⁻ˣ ) = sinh(x).
               cosh(−x) = (1/2)(e⁻ˣ + eˣ ) = (1/2)(eˣ + e⁻ˣ ) = cosh(x).
Now let x ≥ 0. From (iii) we have cosh2 (x) = 1 + sinh2 (x) and since cosh(x) ≥ 1 and
sinh(x) ≥ 0 for x ≥ 0 it follows directly from (v) that cosh is strictly increasing on [0, ∞).
Now let x ≤ 0. Since cosh(−x) = cosh(x) it follows from the first part that cosh is strictly
decreasing on (−∞, 0].
   Remark:
   The definition of the hyperbolic functions implies that sinh(−x) = − sinh(x) (resp.
   cosh(−x) = cosh(x)). A function with this property is called odd (resp. even).
   The monotonicity property of cosh together with the fact that cosh(0) = 1 imply
   that cosh does not have any real zeros. The hyperbolic sine function has only one
   zero at the origin.
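The parity statements above and the identity cosh²(x) − sinh²(x) = 1 from (iii) are easy to verify numerically. A sketch (our own illustration):

```python
import math

def sinh(x):
    """sinh(x) = (1/2)(exp(x) - exp(-x))"""
    return 0.5 * (math.exp(x) - math.exp(-x))

def cosh(x):
    """cosh(x) = (1/2)(exp(x) + exp(-x))"""
    return 0.5 * (math.exp(x) + math.exp(-x))

for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    assert abs(cosh(x) ** 2 - sinh(x) ** 2 - 1) < 1e-9  # (iii)
    assert abs(sinh(-x) + sinh(x)) < 1e-12              # sinh is odd
    assert abs(cosh(-x) - cosh(x)) < 1e-12              # cosh is even
```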
   Remark:
   Why are these functions called hyperbolic functions?
      [Figure 4.4: the hyperbola parametrized by (cosh(t), sinh(t)).]
      The curve {(cosh(t), sinh(t)) | t ∈ R} describes a hyperbola, see Figure 4.4. (In
      analogy, the curve {(cos(t), sin(t)) | t ∈ R} describes the unit circle.)
   Remark:
   Since sinh and cosh map real numbers to real numbers and, moreover, cosh has no
   zero in R, the hyperbolic tangent is defined on the whole real axis. Furthermore, it
   can be seen that tanh is continuous, strictly monotonically increasing, and satisfies
   limx→∞ tanh(x) = 1 and limx→−∞ tanh(x) = −1.
     (ii) The area hyperbolic cosine or area cosinus hyperbolicus, denoted by
          arcosh : [1, ∞) → [0, ∞), is defined as the inverse function of cosh.
    (iii) The area hyperbolic tangent or area tangens hyperbolicus, denoted by
          artanh : (−1, 1) → R, is defined as the inverse function of tanh.
[Figure: graphs of arsinh and arcosh.]
[Figure: graph of artanh.]
   Remark:
    By the above definition, we have that for all x ∈ C holds cos²(x) + sin²(x) = 1.
     In particular, the equation cos2 (x) + sin2 (x) = 1 implies for x ∈ R that | sin(x)| ≤ 1
     and | cos(x)| ≤ 1.
[Figure: graphs of sin and cos.]
The following result gives formulas for sine and cosine applied to sums of (complex)
numbers. These results can be readily verified by making use of Definition 4.11 and the
equation exp(x1 + x2 ) = exp(x1 ) exp(x2 ).
     Theorem 4.13. Trigonometric identities
     For arbitrary x, y ∈ C the trigonometric functions fulfill
and
        cos(x) cos(y) − sin(x) sin(y)
     =  (1/2)(exp(ix) + exp(−ix)) · (1/2)(exp(iy) + exp(−iy))
           − (1/(2i))(exp(ix) − exp(−ix)) · (1/(2i))(exp(iy) − exp(−iy))
     =  (1/4)(exp(ix) + exp(−ix)) · (exp(iy) + exp(−iy))
           + (1/4)(exp(ix) − exp(−ix)) · (exp(iy) − exp(−iy))
     =  (1/4)(exp(i(x + y)) + exp(i(y − x)) + exp(i(x − y)) + exp(−i(x + y))
           + exp(i(x + y)) − exp(i(y − x)) − exp(i(x − y)) + exp(−i(x + y)))
     =  (1/4)(2 exp(i(x + y)) + 2 exp(−i(x + y)))
     =  (1/2)(exp(i(x + y)) + exp(−i(x + y)))
     =  cos(x + y).
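Since sin and cos are defined via exp, the identity just derived can be double-checked numerically with complex arithmetic. A sketch (the helper names cos_ and sin_ are ours):

```python
import cmath

def cos_(z):
    """cos(z) = (exp(iz) + exp(-iz)) / 2"""
    return 0.5 * (cmath.exp(1j * z) + cmath.exp(-1j * z))

def sin_(z):
    """sin(z) = (exp(iz) - exp(-iz)) / (2i)"""
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

# check cos(x)cos(y) - sin(x)sin(y) = cos(x + y), also for complex arguments
for x, y in [(0.3, 1.1), (2.0, -0.7), (1 + 2j, 0.5j)]:
    lhs = cos_(x) * cos_(y) - sin_(x) * sin_(y)
    assert abs(lhs - cos_(x + y)) < 1e-9
```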
We now define the famous number π by the double of the first positive zero of the cosine
function. The following result shows that this definition indeed makes sense.
   Theorem 4.14. First positive zero of cos, definition of π
   The function cos : R → R has exactly one zero in the interval [0, 2]. This zero is
    called π/2.
The proof for this is not presented here. It basically consists of three steps: The first step
consists of showing that cos(2) < 0. This can be shown by using the series representation
in Theorem 4.12 (v). In the second step we have to show that cos is strictly monotonically
decreasing in the interval [0, 2]. This can be achieved by showing that sin is positive on
Proof. (i) follows from the trigonometric identity together with sin(π/2) = 1 and
cos(π/2) = 0, namely
                 sin(x + π/2) = sin(x) cos(π/2) + cos(x) sin(π/2) = cos(x).
Item (ii) can be shown analogously.
(iii) is a consequence of
                           sin(x + π) = cos(x + π/2) = − sin(x),
(iv) is analogous.
(v) and (vi) follow by a double application of (iii) (resp. (iv)).
                                   tan(x) = sin(x)/ cos(x).
One can show that the set of zeros of the cosine function is {(2n + 1)π/2 : n ∈ Z}.
Therefore, the tangent is defined on
                                C \ {(2n + 1)π/2 : n ∈ Z}.
4.4 Arcus functions
[Figure: graph of the tangent function.]
Since sin, cos and tan are furthermore continuous, we can apply Theorem 4.5 to see that
the following definition makes sense:
   Definition 4.17.
     (i) The inverse sine or arcus sinus arcsin : [−1, 1] → R is defined as the inverse
         function of sin : [− π2 , π2 ] → [−1, 1].
    (ii) The inverse cosine or arcus cosinus arccos : [−1, 1] → R is defined as the
         inverse function of cos : [0, π] → [−1, 1].
80                                                                                  4 Elementary Functions
     (iii) The inverse tangent or arcus tangens arctan : R → R is defined as the inverse
           function of tan : (− π2 , π2 ) → R.
[Figure: graphs of arcsin and arccos on [−1, 1].]
4.5.1 Polynomials
   Remark:
   Since sums and scalar multiples of polynomials are again polynomials, they form
   a vector space.
where ar := 0 =: bs for r ∉ {0, ..., n} and s ∉ {0, ..., m}. For the proof of the formula
for deg(p + q), we assume without loss of generality that n ≥ m. Then
                    p(x) + q(x) = Σ_{k=0}^m (ak + bk )x^k + Σ_{k=m+1}^n ak x^k .
     Remark:
     As the example p(x) = x and q(x) = −x + 1 shows, it may indeed happen that
     deg(p + q) < max{deg(p), deg(q)}.
      Since deg(0 · p) = deg(0) = −∞ = −∞ + deg(p) and deg(0 + p) = deg(p) =
      max{−∞, deg(p)}, the choice of deg(0) = −∞ indeed makes sense for preserving the
      above formulas. However, this belongs to the “not so important facts” of mathematical
      analysis.
for some constants bn−1 , . . . , b0 , c. It can be directly seen that p(x0 ) = c. Collecting powers
of x yields
                                             bn−1 = an ,
                                   bn−2 − bn−1 x0 = an−1 ,
                                                  …
                                       b0 − b1 x0 = a1 ,
                                        c − b0 x0 = a0 .
                                       bn−1 = an ,
                                       bn−2 = an−1 + bn−1 x0 ,
                                            …
                                         b0 = a1 + b1 x0 ,
                                          c = a0 + b0 x0 .
   Example 4.20.
   Consider the polynomial p(x) = x3 − 6x2 + 7x and determine p(2).
                                      1     −6      7      0
                  (mult. with 2)             2     −8     −2
                                      1     −4     −1     −2
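The tableau of Example 4.20 is exactly Horner's scheme. A small sketch (the function name is ours) reproducing the bottom row of the table:

```python
def horner(coeffs, x0):
    """Evaluate a polynomial with coefficients [a_n, ..., a_1, a_0] at x0
    using Horner's scheme; returns (value, bottom row of the tableau)."""
    row = [coeffs[0]]
    for a in coeffs[1:]:
        row.append(a + row[-1] * x0)   # add a_k and the carried product
    return row[-1], row

# Example 4.20: p(x) = x^3 - 6x^2 + 7x at x0 = 2
value, row = horner([1, -6, 7, 0], 2)
assert row == [1, -4, -1, -2]          # bottom row of the table
assert value == -2                     # p(2) = -2
```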
For integers, “division with remainder” is well known. The same procedure can now be
applied for polynomials and is called polynomial division. First we present an existence
result. Afterwards, we will present a method to compute the polynomials in question.
   Theorem 4.21. Polynomial Division
   Let p, q ∈ F[x] be given. Furthermore, assume that q is not the zero polynomial.
   Then there exist polynomials g and r with deg r < deg q, such that
p = q · g + r. (4.1)
To compute the polynomials g and r as in Theorem 4.21, we use the method of polynomial
division. This method is not stated as a formal theorem but is best explained by means
of concrete examples. Polynomial division is of great importance, in particular for some
special representations of rational functions.
                      −x2 + 4x + 5
                    −(−x2        − 1)
                            4x + 6
Note that a polynomial p is divisible by x−x0 if and only if x0 is a zero of p, i.e., p(x0 ) = 0.
This follows from the fact that the division with remainder theorem implies that there
exists some polynomial q ∈ F[x] and a constant c ∈ F (i.e., a polynomial whose degree is
less than 1), such that
                                 p(x) = (x − x0 )q(x) + c.
For the sake of completeness, we say that the order of the zero x0 is zero if x0 is not a
zero of p.
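Division with remainder as in Theorem 4.21 can be sketched for coefficient lists (highest power first). This is a simplified routine of our own, not the tableau method of the notes:

```python
def poly_divmod(p, q):
    """Divide p by q (coefficient lists, highest power first).
    Returns (g, r) with p = q * g + r and deg r < deg q."""
    p = list(p)
    g = []
    while len(p) >= len(q):
        factor = p[0] / q[0]           # next coefficient of the quotient
        g.append(factor)
        for i, c in enumerate(q):
            p[i] -= factor * c         # subtract factor * q, aligned left
        p.pop(0)                       # leading coefficient is now zero
    return g, p

# (x^3 - 6x^2 + 7x) = (x - 2) * (x^2 - 4x - 1) + (-2)
g, r = poly_divmod([1, -6, 7, 0], [1, -2])
assert g == [1.0, -4.0, -1.0]
assert r == [-2.0]
```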
As the following picture shows, a zero of even order touches the x-axis, while a zero
of odd order crosses the x-axis. Finally, we present a result which is classical in poly-
nomial algebra. It states that any nonconstant complex polynomial can be represented
as a product of linear factors. Algebraists call this property the algebraic closedness of C.
   tion                                            n
                                 p(x) = an ∏_{k=1}^n (x − ck )
for some c1 , . . . , cn ∈ C.
This question is far from simple for arbitrary polynomials, since there is no “universal
strategy”. However, for polynomials of degree at most two, we can give simple explicit
formulas:
   • Polynomials of degree 1: For p(x) = a1 x + a0 with a1 ≠ 0, the only zero is
     obviously given by x1 = −a0 /a1 .
   • Polynomials of degree 2: For p(x) = a2 x2 + a1 x + a0 with a2 ≠ 0, the zeros
     x1 , x2 ∈ C are given by
                            x1 = (−a1 + √(a1 ² − 4a0 a2 ))/(2a2 ),
                            x2 = (−a1 − √(a1 ² − 4a0 a2 ))/(2a2 ).
     Note that for p(x) = x2 + px + q, the expressions for the zeros read
                            x1/2 = −p/2 ± √(p²/4 − q).
     In high school mathematics, this is known under the name “pq formula”.
     Note that in the case where a1² − 4·a0·a2 is a negative real number, the square root has to be understood as the complex number(s) whose square equals a1² − 4·a0·a2. For instance, the zeros of the polynomial p(x) = x² + 1 are given by

         x_{1/2} = ±√(−1) = ±i.
In particular, x² + 1 = (x + i)(x − i).
Altogether, we now have p(x) = (x2 + 2x − 4) · (x − 1). The zeros of p are therefore given
by x1 = 1 and the set of zeros of the polynomial q(x) = x2 + 2x − 4. The latter ones can
now be obtained by the pq formula, i.e.,
                             x_{2/3} = −1 ± √(1 + 4) = −1 ± √5.
The above strategy can be used for arbitrary polynomials as long as one has enough
successful guesses for zeros. In exercises (or examinations), there is oftentimes a hint
given for such guesses.
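The degree-2 formulas translate directly into code; `cmath.sqrt` handles a negative (or complex) discriminant. This is a small sketch, with the helper name `quadratic_zeros` chosen here purely for illustration.

```python
import cmath

def quadratic_zeros(a2, a1, a0):
    """Zeros of a2*x^2 + a1*x + a0 with a2 != 0; complex square root allowed."""
    d = cmath.sqrt(a1 * a1 - 4 * a0 * a2)
    return (-a1 + d) / (2 * a2), (-a1 - d) / (2 * a2)

# x^2 + 1 has zeros ±i, as in the text:
z1, z2 = quadratic_zeros(1, 0, 1)
# x^2 + 2x - 4 has zeros -1 ± sqrt(5), as in the example above:
w1, w2 = quadratic_zeros(1, 2, -4)
```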
   Definition 4.25.
   Let p, q ∈ F[x], where q is not the constant zero polynomial. Then the function f : D(f) = {x ∈ F : q(x) ≠ 0} → F with

                                   f(x) = p(x)/q(x)

   is called a rational function.
[Figure: graphs of two rational functions on the interval [−5, 5]; panel b) shows f(x) = 1/x².]
We will now take a closer look at the places where f is not defined, i.e., the zeros of the denominator polynomial.
Let f = p/q be given, let x0 be a zero of q and let r ∈ N0, s ∈ N be the multiplicities of the zero x0 of p and q, respectively. This means that we have factorizations

    p(x) = (x − x0)^r · p1(x),    q(x) = (x − x0)^s · q1(x)

for some p1, q1 ∈ F[x] with p1(x0) ≠ 0 and q1(x0) ≠ 0. This means that for x ∈ D(f) we have

    f(x) = p(x)/q(x) = ((x − x0)^r · p1(x)) / ((x − x0)^s · q1(x)) = (x − x0)^{r−s} · p1(x)/q1(x).
We distinguish two cases:
1. Case: r ≥ s: Then f can be continuously extended to x0; we say that f has a hole at x0.
2. Case: r < s: f has a pole of order s − r at x0.
Example 4.27. a) The rational function f(x) = x²/x, defined on R\{0}, has a hole at x0 = 0.
b) The rational function f(x) = x/x², defined on R\{0}, has a pole of first order at x0 = 0.
   Definition 4.28.
   A rational function f = p/q is called proper if deg(p) ≤ deg(q), and strictly proper if deg(p) < deg(q).
Proof: Let f = p/q. Applying polynomial division, we obtain that there exist g, r ∈ F[x] with deg r < deg q and p = qg + r. Division by q gives

                          f(x) = p(x)/q(x) = g(x) + r(x)/q(x).
Since the latter addend is a strictly proper rational function, the result is proven.
                                                                                              □
Next we show that any strictly proper rational function has a representation as a sum of partial fractions. Note that for rational functions which are not strictly proper, we first have to perform an additive decomposition into a polynomial and a strictly proper rational function according to Theorem 4.29.
   Theorem 4.31.
   Let polynomials p(x) = an·xⁿ + … + a1·x + a0, q(x) = bm·x^m + … + b1·x + b0 with deg(p) = n < m = deg(q) be given. Assume that q has a representation

                          q(x) = bm · ∏_{j=0}^{k} (x − xj)^{kj},    ∑_{j=0}^{k} kj = m,

   with pairwise distinct zeros x0, …, xk. Then there exist coefficients Ajl such that the rational function f = p/q has the representation

                          f(x) = ∑_{j=0}^{k} ∑_{l=1}^{kj} Ajl / (x − xj)^l.
Note that a combination of the above result with Theorem 4.29 yields that any rational
function can be represented as the sum of a polynomial and some partial fractions.
The above theorem looks more complicated than it actually is. The polynomial q oftentimes has only simple zeros, i.e., kj = 1 for all j. In that case, the double sum becomes a single sum of the form

                              f(x) = ∑_{j=0}^{k} Aj1 / (x − xj).
We will give some examples. After that, we discuss how to compute the coefficients Ajl.
(b)
                                   f(x) = 1/(x³ − 3x − 2).

   By determining the zeros of the denominator polynomial, we obtain the factorization x³ − 3x − 2 = (x + 1)²(x − 2). Therefore, there exists a partial fraction decomposition of the form

                       f(x) = A11/(x + 1) + A12/(x + 1)² + A21/(x − 2).
For the computation of the coefficients, we make use of the fact that two polynomials coincide if and only if all their coefficients coincide. This technique is called comparison of coefficients.
For the first example, multiplying the ansatz (x + 1)/(x² + 1) = A11/(x + i) + A21/(x − i) by x² + 1 = (x + i)(x − i) gives x + 1 = A11(x − i) + A21(x + i). Comparing the coefficients of x and of 1 yields

                                          A11 + A21 = 1,
                                        −i·A11 + i·A21 = 1,

i.e.,

                                    [ 1    1 ] [A11]   [1]
                                    [ −i   i ] [A21] = [1].

This leads to the solution A11 = (1 + i)/2, A21 = (1 − i)/2. Therefore, we have the partial fraction decomposition

                 (x + 1)/(x² + 1) = ((1 + i)/2)/(x + i) + ((1 − i)/2)/(x − i).
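The 2×2 system obtained from the comparison of coefficients can also be solved mechanically; here is a small sketch using NumPy's complex linear solver.

```python
import numpy as np

# The linear system from the comparison of coefficients for (x+1)/(x^2+1):
#   [ 1   1 ] [A11]   [1]
#   [-i   i ] [A21] = [1]
M = np.array([[1, 1], [-1j, 1j]])
b = np.array([1, 1])
A11, A21 = np.linalg.solve(M, b)
```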
For the second example, an analogous comparison of coefficients can be set up.
In the following, we present a nice trick to compute some of the coefficients without solving
a linear system of equations.
       Theorem 4.33.
       Let the assumptions of Theorem 4.31 be valid. Then the coefficients A_{i,ki} are given by

                          A_{i,ki} = p(xi) / ( bm · ∏_{j=0, j≠i}^{k} (xi − xj)^{kj} ).
Again, this formula looks more complicated than it really is. The determination of A_{i,ki} can be done as follows:
       • Cover (delete) the factor (x − xi)^{ki} in the denominator of f.
       • Plug xi into the remaining expression.
      This is the same result that we obtained by solving the associated linear system.
(b)
              f(x) = 1/((x + 1)²(x − 2)) = A11/(x + 1) + A12/(x + 1)² + A21/(x − 2).
       We now compute the coefficients A12 and A21 by the formula given in Theorem 4.33:

                          A12 = [ 1/(x − 2) ]_{x=−1} = −1/3,
                          A21 = [ 1/(x + 1)² ]_{x=2} = 1/9.
For the coefficient A11 we now only need to solve the reduced equation

                  1 + (1/3)(x − 2) − (1/9)(x + 1)² = A11·(x + 1)(x − 2).
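The cover-up evaluations, together with a spot check of the resulting decomposition, can be sketched as follows; the value A11 = −1/9 used below is obtained from the reduced equation above by comparing the x² coefficients.

```python
# Cover-up method for f(x) = 1/((x+1)^2 (x-2)), as in Theorem 4.33:
A12 = 1 / (-1 - 2)         # cover (x+1)^2, evaluate the rest at x = -1
A21 = 1 / (2 + 1) ** 2     # cover (x-2),   evaluate the rest at x =  2
A11 = -1 / 9               # from comparing x^2 coefficients in the reduced equation

def f(x):
    return 1 / ((x + 1) ** 2 * (x - 2))

def pfd(x):
    # the partial fraction decomposition of f
    return A11 / (x + 1) + A12 / (x + 1) ** 2 + A21 / (x - 2)
```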
   Definition 4.35.
   Let a sequence (ak)k∈N in F be given and let x0 ∈ F. Then the function f : D(f) → F defined by the series

                                  f(x) = ∑_{k=0}^{∞} ak·(x − x0)^k

   is called a power series with centre x0 and coefficients ak, where D(f) ⊂ F is its domain of convergence, i.e., the set of all x ∈ F for which the series converges. The domain of convergence at least includes x0, since ∑_{k=0}^{∞} ak·(x0 − x0)^k = a0.
Example 4.36. a) The exponential function is defined via the power series

                                  exp(x) = ∑_{k=0}^{∞} x^k / k!,

   i.e., (ak)k∈N = (1/0!, 1/1!, 1/2!, …)k∈N and x0 = 0. Here D(f) = C.
b) The sine function is defined via the power series

                                  sin(x) = ∑_{k=0}^{∞} (−1)^k·x^{2k+1} / (2k + 1)!,

   i.e., (ak)k∈N = (0, 1/1!, 0, −1/3!, 0, 1/5!, 0, −1/7!, …)k∈N and x0 = 0. Again D(f) = C.
c) cos, cosh, sinh are defined via the power series...
d) The function

                                  f(x) = ∑_{k=1}^{∞} (x − 1)^k / k

   is a power series.
   Let a power series f(x) = ∑_{k=0}^{∞} ak·(x − x0)^k be given. Let

                                  r := 1 / ( lim sup_{k→∞} |ak|^{1/k} ),

   where we formally define 1/∞ := 0 and 1/0 := ∞. Then for all x ∈ F with |x − x0| < r holds x ∈ D(f). Furthermore, for all x ∈ F with |x − x0| > r holds x ∉ D(f).
   The number r as defined above is called the radius of convergence.
[Figure: circle of convergence with midpoint x0 and radius r in the complex plane; convergence inside, divergence outside.]
 (ii) For all x ∈ F with |x − x0| · lim sup_{k→∞} |ak|^{1/k} > 1, the power series is divergent.
Statement (i) just follows from the limit form of the root criterion (Theorem 2.19), namely

        lim sup_{k→∞} |(x − x0)^k·ak|^{1/k} = |x − x0| · lim sup_{k→∞} |ak|^{1/k} < 1.

In case (ii), the root criterion implies that the sequence ((x − x0)^k·ak)k∈N does not converge to 0 and therefore, the power series cannot converge.                          □
Geometrically, the above result implies that for all x inside a circle with midpoint x0 and
radius r, the series is convergent and outside this circle, we have divergence.
The Cauchy–Hadamard Theorem characterizes convergence or divergence of the power series depending on whether x lies inside or outside the circle around x0 with radius r. In the case |x − x0| = r, this result does not tell us anything. Indeed, we may have points on the circle with x ∈ D(f) and also points on the circle with x ∉ D(f).
So we have convergence for all x ∈ (0, 2) and divergence for all x ∈ (−∞, 0) ∪ (2, ∞). The remaining real points which are not characterized by the Theorem of Cauchy–Hadamard are x = 0 and x = 2. In the case x = 0, we obtain the series

                                  ∑_{k=1}^{∞} (−1)^k / k,

which is convergent by the Leibniz criterion. Plugging in x = 2, the power series becomes the harmonic series

                                  ∑_{k=1}^{∞} 1/k,

which is well-known to be divergent.
Example 4.38. a) For the power series defining the exponential function, we have

                          (ak)k∈N = (1/k!)k∈N,    x0 = 0.

   The radius of convergence is then given by

                          r = 1 / ( lim sup_{k→∞} (1/k!)^{1/k} ) = 1/0 = ∞.

   As a consequence, the series converges for every x ∈ C. The same holds true for the series of sin, cos, sinh, cosh.
b) As we have already seen above, the radius of convergence of the power series

                                  f(x) = ∑_{k=1}^{∞} (x − 1)^k / k

   is r = 1.
c) Consider the power series

                                  f(x) = ∑_{k=0}^{∞} k!·x^k.

   The radius of convergence is given by

                          r = 1 / ( lim sup_{k→∞} (k!)^{1/k} ) = 1/∞ = 0.
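The limiting behaviour of |ak|^{1/k} in these three examples can be probed numerically; working with logarithms (via `math.lgamma` for ln k!) avoids overflow for large k. A small sketch:

```python
import math

def root_term(log_abs_ak, k):
    # computes |a_k|^(1/k) = exp(log|a_k| / k) without overflow
    return math.exp(log_abs_ak / k)

k = 1000
t_exp = root_term(-math.lgamma(k + 1), k)   # a_k = 1/k!  -> tends to 0        (r = infinity)
t_geom = root_term(-math.log(k), k)         # a_k = 1/k   -> tends to 1        (r = 1)
t_fact = root_term(math.lgamma(k + 1), k)   # a_k = k!    -> tends to infinity (r = 0)
```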
For the computation of the radius of convergence, the following criterion is oftentimes better suited; it follows from the quotient criterion and is stated without proof.
      Theorem 4.39.
      Suppose that ∑_{n=1}^{∞} an·(x − x0)^n is a power series with coefficients an ∈ F such that an ≠ 0 for almost all n and the limit lim_{n→∞} |an/a_{n+1}| exists in [0, ∞]. Then the radius of convergence is given by

                          r = lim_{n→∞} |an / a_{n+1}|.
[Figure: graphs of two functions f and g near the origin on the interval [−1, 1].]
5 Differentiation of Functions
A straight line y(t) going through the points (x0, f(x0)) and (x, f(x)) is called a secant¹ of f through these points. It is given by

                          y(t) = f(x0) + ((f(x0) − f(x))/(x0 − x)) · (t − x0).
In particular, the slope of y is the difference quotient

                          (f(x) − f(x0)) / (x − x0).

[Figure: the secant of f through the points (x0, f(x0)) and (x, f(x)).]
This observation leads to the following definition.
       Definition 5.1.
       Let I ⊂ R be an interval with more than one point or an open set. Let f : I → R be a function. Then f is called differentiable at x0 ∈ I if there exists a function ∆_{f,x0} : I → R which is continuous at x0 and satisfies

           f(x) = f(x0) + (x − x0) · ∆_{f,x0}(x)   for all x ∈ I.

       In this case, f′(x0) := ∆_{f,x0}(x0) is called the derivative of f at x0.
 1
     from the Latin word secare = “to cut”
By solving the above equation for ∆_{f,x0}(x), we get for x ≠ x0 that

                          ∆_{f,x0}(x) = (f(x) − f(x0)) / (x − x0),
Common alternative notations for the derivative f′(x0) are

    (d/dx) f(x0),   (∂/∂x) f(x0),   df/dx |_{x=x0},   ∂f/∂x |_{x=x0},   ∂x f(x0).
The next result states that differentiability is a stronger property than continuity.
   Theorem 5.2.
   Let f : I → R be differentiable at x0 ∈ I. Then f is continuous in x0 .
Proof: By writing
                                   f (x) = f (x0 ) + (x − x0 ) · ∆f,x0 (x),
the continuity of ∆_{f,x0} at x0 implies the continuity of f at x0.                         □
As the following example shows, the opposite implication cannot be made, i.e., not every
continuous function is differentiable.
Example 5.3. Consider the absolute value function | · | : R → R. We already know that
it is continuous. For the analysis of differentiability, we distinguish between three cases:
1st Case: x0 > 0.
Then we have that |x0 | = x0 and, moreover, for x in some neighbourhood of x0 holds
|x| = x. Therefore, we have
                                          |x| − |x0 |        x − x0
                                    lim               = lim         = 1.
                                   x→x0    x − x0       x→x0 x − x0
2nd Case: x0 < 0.
Then we have that |x0| = −x0 and, moreover, for x in some neighbourhood of x0 holds |x| = −x. Therefore, we have

                  lim_{x→x0} (|x| − |x0|)/(x − x0) = lim_{x→x0} (−x + x0)/(x − x0) = −1.
3rd Case: x0 = 0.
Then the two sequences (xn)n∈N = (1/n)n∈N and (yn)n∈N = (−1/n)n∈N both tend to x0 = 0. However, we have

         lim_{n→∞} (|xn| − |x0|)/(xn − x0) = lim_{n→∞} (|1/n| − |0|)/(1/n − 0) = 1

and

         lim_{n→∞} (|yn| − |x0|)/(yn − x0) = lim_{n→∞} (|−1/n| − |0|)/(−1/n − 0) = −1.
Therefore, the limit

                          lim_{x→0} (|x| − |0|)/(x − 0)

does not exist, i.e., the absolute value function is not differentiable at x0 = 0.
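The two different one-sided limits can be observed numerically: the difference quotients of |·| at 0 are identically 1 from the right and −1 from the left. A small sketch:

```python
# difference quotients of the absolute value function at x0 = 0
def dq(f, x0, h):
    return (f(x0 + h) - f(x0)) / h

right = [dq(abs, 0.0, 10.0 ** -n) for n in range(1, 8)]    # h -> 0 from the right
left = [dq(abs, 0.0, -(10.0 ** -n)) for n in range(1, 8)]  # h -> 0 from the left
```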
The derivative a := f′(x0) of f in x0 can be interpreted in the following way: The linear mapping ϕ : R → R, x ↦ ax fulfills

    lim_{h→0} |f(x0 + h) − (f(x0) + ϕ(h))| / |h| = lim_{h→0} | (f(x0 + h) − f(x0))/h − a | = 0.
This means that the affine linear mapping t(h) := f (x0 ) + ϕ(h) = f (x0 ) + ah, which
actually is the tangent of f at x0 , approximates f (x) linearly in a neighbourhood of x0 in
a best possible way.
This “local linearisation” can be generalised to functions between arbitrary normed R-
vector spaces.
   Definition 5.4. Total Derivative
   Let (E, || · ||E ) and (F, || · ||F ) be two normed R-vector spaces and let U be an open
   subset of E. Then a function f : U → F is said to be differentiable in a point
   x0 ∈ U if there is a continuous linear function ϕ : E → F such that
       lim_{h→0} ||f(x0 + h) − f(x0) − ϕ(h)||_F / ||h||_E = 0.
   In this case ϕ is called the (total) derivative of f in x0 which is denoted by f 0 (x0 ).
   If f is differentiable in all points of U then f is called differentiable in U or just
   differentiable and f 0 : U → L(E, F ), x 7→ f 0 (x) is called the derivative of f .
Note carefully that the total derivative f′(x0) is a linear function. If E and F are finite dimensional R-vector spaces, say E = R^m, F = R^n for some m, n ∈ N, then f′(x0) can be identified with its matrix representation M ∈ R^{n,m} with respect to the standard bases of R^m and R^n.
Many of the subsequent results carried out for the cases E = R = F or E = R and F = C can be generalised to arbitrary E and F in a straightforward manner; the proofs are analogous and sometimes become even clearer in terms of total derivatives.
This will be part of the exercises.
c) For determining the derivatives of exp, sinh, cosh, sin and cos, we first determine the following limit for λ ∈ C:

                          lim_{h→0} (exp(λh) − 1)/h.

   By Theorem ??, we know that for h ∈ R with |λh| < 2,

                          exp(λh) = 1 + λh + r2(λh),

   where the remainder satisfies r2(λh)/h → 0 as h → 0; hence the above limit equals λ.
   This has manifold consequences for the derivatives of exponential, hyperbolic and
   trigonometric functions:
    exp′(x0) = lim_{h→0} (exp(x0 + h) − exp(x0))/h = exp(x0),

    sinh′(x0) = lim_{h→0} (sinh(x0 + h) − sinh(x0))/h
              = (1/2) · lim_{h→0} [ (exp(x0 + h) − exp(x0))/h − (exp(−(x0 + h)) − exp(−x0))/h ]
              = (1/2) · (exp(x0) + exp(−x0)) = cosh(x0).
Analogously, we can show that cosh0 = sinh. Now consider the trigonometric functions:
    sin′(x0) = lim_{h→0} (sin(x0 + h) − sin(x0))/h
             = (1/(2i)) · lim_{h→0} [ (exp(i(x0 + h)) − exp(ix0))/h − (exp(−i(x0 + h)) − exp(−ix0))/h ]
             = (1/(2i)) · (i·exp(ix0) + i·exp(−ix0)) = cos(x0)
and
    cos′(x0) = lim_{h→0} (cos(x0 + h) − cos(x0))/h
             = (1/2) · lim_{h→0} [ (exp(i(x0 + h)) − exp(ix0))/h + (exp(−i(x0 + h)) − exp(−ix0))/h ]
             = (1/2) · (i·exp(ix0) − i·exp(−ix0))
             = −(1/(2i)) · (exp(ix0) − exp(−ix0)) = −sin(x0).
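These derivative formulas can be sanity-checked with finite differences: the forward difference quotient at a sample point should approximate the claimed derivative. A sketch (the sample point and step size below are arbitrary choices):

```python
import math

h = 1e-6
x0 = 0.7   # arbitrary sample point

exp_dq = (math.exp(x0 + h) - math.exp(x0)) / h
sin_dq = (math.sin(x0 + h) - math.sin(x0)) / h
cos_dq = (math.cos(x0 + h) - math.cos(x0)) / h
sinh_dq = (math.sinh(x0 + h) - math.sinh(x0)) / h
```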
Now we consider rules for the derivatives of sums, products and quotients of functions.
   Theorem 5.6. Summation Rule, Product Rule, Quotient Rule
   Let f, g : I → R be differentiable in x0 ∈ I.
      (i) Then f + g is differentiable in x0 with (f + g)′(x0) = f′(x0) + g′(x0).
      (ii) Then f · g is differentiable in x0 with (f · g)′(x0) = f′(x0) · g(x0) + f(x0) · g′(x0).
      (iii) If moreover g(x0) ≠ 0, then f/g is differentiable in x0 with (f/g)′(x0) = (f′(x0)·g(x0) − f(x0)·g′(x0)) / g(x0)².
Proof: Let f (x) = f (x0 ) + (x − x0 ) · ∆f,x0 (x), g(x) = g(x0 ) + (x − x0 ) · ∆g,x0 (x). Then
  (i)
              (f + g)(x) = f (x) + g(x) = (f + g)(x0 ) + (x − x0 ) · (∆f,x0 (x) + ∆g,x0 (x)).
 (ii)
        (f · g)(x) = (f(x0) + (x − x0) · ∆_{f,x0}(x)) · (g(x0) + (x − x0) · ∆_{g,x0}(x))
                   = f(x0)g(x0) + (x − x0)(∆_{f,x0}(x)g(x0) + ∆_{g,x0}(x)f(x0))
                     + (x − x0)² ∆_{f,x0}(x)∆_{g,x0}(x).
      Thus,
        lim_{x→x0} (f(x)g(x) − f(x0)g(x0))/(x − x0)
          = lim_{x→x0} ((x − x0)(∆_{f,x0}(x)g(x0) + ∆_{g,x0}(x)f(x0)) + (x − x0)² ∆_{f,x0}(x)∆_{g,x0}(x))/(x − x0)
          = lim_{x→x0} (∆_{f,x0}(x)g(x0) + ∆_{g,x0}(x)f(x0) + (x − x0)∆_{f,x0}(x)∆_{g,x0}(x))
          = f′(x0)g(x0) + g′(x0)f(x0).
(iii) For convenience, we assume that f ≡ 1 (the general result follows by an application
      of the product rule). Then
        1/g(x) − 1/g(x0) = (g(x0) − g(x))/(g(x0)g(x)) = −(x − x0)∆_{g,x0}(x)/(g(x0)g(x))
      and thus
        lim_{x→x0} (1/(x − x0)) · (1/g(x) − 1/g(x0)) = −g′(x0)/g²(x0).
                                                                                              □
c) f(x) = x^{−n} = 1/xⁿ, n ∈ N. Then for x ∈ R\{0},

                          f′(x) = −n·x^{n−1}/x^{2n} = −n/x^{n+1} = −n·x^{−n−1}.
Now we introduce differentiation rules for composition of functions and inverse functions.
    lim_{n→∞} (f⁻¹(yn) − f⁻¹(y0))/(yn − y0) = lim_{n→∞} (xn − x0)/(f(xn) − f(x0)) = 1/f′(x0) = 1/f′(f⁻¹(y0)).
                                                                                              □
Example 5.9. a) g(x) = ⁿ√x = x^{1/n} is the inverse function of f : R+ → R+, f(x) = xⁿ. Then for x > 0 holds

    g′(x) = 1/f′(g(x)) = 1/(n·(g(x))^{n−1}) = 1/(n·(ⁿ√x)^{n−1}) = (1/n)·x^{−(n−1)/n} = (1/n)·x^{1/n − 1}.
b) log : (0, ∞) → R is the inverse function of exp. Then for x > 0 holds

    log′(x) = 1/exp′(log(x)) = 1/exp(log(x)) = 1/x.
c) arcsin : (−1, 1) → R is the inverse function of sin (restricted to (−π/2, π/2)). Then for x ∈ (−1, 1) holds

    arcsin′(x) = 1/sin′(arcsin(x)) = 1/cos(arcsin(x)).

   Since cos is positive on (−π/2, π/2), we have cos(arcsin(x)) = √(1 − sin²(arcsin(x))). Therefore

    arcsin′(x) = 1/√(1 − sin²(arcsin(x))) = 1/√(1 − x²).
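The inverse-function rule results can likewise be verified numerically against the closed forms 1/x and 1/√(1 − x²):

```python
import math

h = 1e-7
x = 0.4   # arbitrary sample point in (0, 1)

log_dq = (math.log(x + h) - math.log(x)) / h
arcsin_dq = (math.asin(x + h) - math.asin(x)) / h
```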
Proof: By assumption there are functions ∆_{g,x0} : I → R and ∆_{f,g(x0)} : J → R which are continuous in x0 and g(x0), respectively, such that

    g(x) = g(x0) + (x − x0) · ∆_{g,x0}(x),    f(y) = f(g(x0)) + (y − g(x0)) · ∆_{f,g(x0)}(y).

Then the composition fulfills

    f(g(x)) = f(g(x0)) + (x − x0) · ∆_{g,x0}(x) · ∆_{f,g(x0)}(g(x)),

and the function x ↦ ∆_{g,x0}(x) · ∆_{f,g(x0)}(g(x)) is continuous in x0 with value g′(x0) · f′(g(x0)).                          □

5.2 Mean Value Theorems and Consequences
   Definition 5.12.
   Let I be an interval and f : I → R be a function. Then x0 ∈ I is called a local maximum (local minimum) if there exists some neighbourhood U of x0 such that

       f(x0) ≥ f(x)   for all x ∈ U ∩ I      (f(x0) ≤ f(x)   for all x ∈ U ∩ I).
   Theorem 5.13.
   Let I be an interval and f : I → R be a function that is differentiable in x0 ∈ I. Assume that x0 is an interior point of I and that x0 is a local extremum. Then f′(x0) = 0.
Proof: We assume that x0 is a local maximum (the case of minimum is shown analogously).
Let U be a neighbourhood of x0 with U ⊂ I and f(x0) = max{f(x) : x ∈ U}.
Assume first that f′(x0) = ∆_{f,x0}(x0) > 0. Since ∆_{f,x0} is continuous in x0, there exists a neighbourhood V ⊂ U of x0 such that ∆_{f,x0}(x) > 0 for all x ∈ V. Then for all x1 ∈ V
with x1 > x0 holds f (x1 ) = f (x0 ) + (x1 − x0 ) · ∆f,x0 (x1 ) > f (x0 ). This is a contradiction.
On the other hand, assume that f′(x0) = ∆_{f,x0}(x0) < 0. Since ∆_{f,x0} is continuous in x0, there exists a neighbourhood V ⊂ U such that ∆_{f,x0}(x) < 0 for all x ∈ V. Then for
all x1 ∈ V with x1 < x0 holds f (x1 ) = f (x0 ) + (x1 − x0 ) · ∆f,x0 (x1 ) > f (x0 ). This is also
a contradiction.                                                                               □
As a consequence, we formulate the following result, which states that the derivative of a function with equal boundary values has at least one zero.
   Theorem 5.14. Rolle's Theorem
   Let f : [a, b] → R be differentiable with f(a) = f(b). Then there exists some x̂ ∈ (a, b) such that f′(x̂) = 0.
Proof: If f is constant, the statement is clear (since then f′(x) = 0 for all x ∈ (a, b)). If
f is not constant, consider the maximum and the minimum of f on [a, b] (we know by
Theorem 3.15 that they exist). So, let x− , x+ ∈ [a, b] such that
f (x+ ) = max{f (x) : x ∈ [a, b]}, f (x− ) = min{f (x) : x ∈ [a, b]}.
Then we have that x+ ∈ (a, b) or x− ∈ (a, b) since otherwise f(x+) = f(a) = f(b) = f(x−), which would mean that f is constant.
Then f′(x−) = 0 or f′(x+) = 0.                                                       □
As a corollary, we have that for a function f differentiable in some interval I, the following holds: Between two zeros of f, there always exists some point x0 with f′(x0) = 0.
Now we present the famous mean value theorem.
   Theorem 5.15. Mean Value Theorem
   Let f : [a, b] → R be differentiable. Then there exists some x̂ ∈ (a, b) such that

       f(b) − f(a) = f′(x̂) · (b − a).
Before the proof is presented, we give some graphical interpretation: A division of the
above equation by b − a gives
    (f(b) − f(a))/(b − a) = f′(x̂).
The quantity on the left-hand side is equal to the slope of the secant of f through a and
b, whereas f′(x̂) corresponds to the slope of the tangent of f at x̂. Therefore, the secant
of f through a and b is parallel to a tangent of f.
Proof: Consider the function F : [a, b] → R with
    F(x) := f(x) − f(a) − (f(b) − f(a))/(b − a) · (x − a).
Then we have F (a) = F (b) = 0. By Rolle’s Theorem, we get that there exists some
x̂ ∈ (a, b) with
    0 = F′(x̂) = f′(x̂) − (f(b) − f(a))/(b − a)
and thus
    f′(x̂) = (f(b) − f(a))/(b − a).
                                                                                             2
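To make the statement concrete, here is a small numerical sketch (not part of the notes; the sample function f(x) = x³ on [0, 2] is our choice): the secant slope is (f(2) − f(0))/2 = 4, and since f′(x) = 3x² is increasing on this interval, the point x̂ with f′(x̂) = 4 can be located by bisection.

```python
# Numerical illustration of the mean value theorem (a sketch, not a proof):
# for f(x) = x^3 on [0, 2], the theorem predicts x_hat with f'(x_hat) = 4,
# i.e. x_hat = 2 / sqrt(3).
import math

def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # secant slope, here 4.0

# Bisection on f'(x) - slope, which changes sign on (a, b) since f' is increasing.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if f_prime(mid) < slope:
        lo = mid
    else:
        hi = mid
x_hat = (lo + hi) / 2

print(x_hat, 2 / math.sqrt(3))  # both approximately 1.1547
```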
The mean value theorem leads us to determine monotonicity properties of a function by
means of its derivative.
108                                                                  5 Differentiation of Functions
[Figure: the secant of f through a and b is parallel to the tangent of f at x̂ ∈ (a, b).]
   Theorem 5.16.
   Let f : [a, b] → R be a differentiable function. Then the following holds true.
(i) If f′(x) > 0 for all x ∈ (a, b), then f is strictly monotonically increasing.
(ii) If f′(x) < 0 for all x ∈ (a, b), then f is strictly monotonically decreasing.
Proof: (i) By the mean value theorem we have that for x1, x2 ∈ (a, b) with x1 < x2, there
exists some x̂ ∈ (x1, x2) with
    f(x2) − f(x1) = f′(x̂) · (x2 − x1) > 0,
and thus f(x1) < f(x2). Statement (ii) follows analogously.                                2
The mean value theorem can be generalised as follows.
   Theorem 5.17. Generalised mean value theorem
   Let f, g : [a, b] → R be differentiable with g′(x) ≠ 0 for all x ∈ (a, b). Then
   g(a) ≠ g(b) and there exists some x̂ ∈ (a, b) with
       (f(b) − f(a))/(g(b) − g(a)) = f′(x̂)/g′(x̂).
Proof: With the function F : [a, b] → R defined by
    F(x) := f(x) − f(a) − (f(b) − f(a))/(g(b) − g(a)) · (g(x) − g(a)),
we get F (a) = F (b) = 0. Now using the Theorem of Rolle, the result follows immediately.
2
     Theorem 5.18. Theorem of l’Hospital
     Let I be an interval and let f, g : I → R be differentiable. Let x0 ∈ I and assume
     that f (x0 ) = g(x0 ) = 0 and there exists some neighbourhood U ⊂ I of x0 such that
   g′(x) ≠ 0 for all x ∈ U\{x0}. Then, if lim_{x→x0} f′(x)/g′(x) exists, then also
   lim_{x→x0} f(x)/g(x) exists and
     exists and
       lim_{x→x0} f(x)/g(x) = lim_{x→x0} f′(x)/g′(x).
Proof: Let (xn )n∈N be a sequence with limn→∞ xn = x0 and xn 6= x0 for all n ∈ N. Then,
by the generalised mean value theorem, there exists a sequence (x̂n )n∈N with x̂n between
x0 and xn such that
    f(xn)/g(xn) = (f(xn) − f(x0))/(g(xn) − g(x0)) = f′(x̂n)/g′(x̂n).
In particular, since (x̂n )n∈N converges to x0 , we have
    lim_{x→x0} f(x)/g(x) = lim_{x→x0} f′(x)/g′(x).
                                                                                                         2
a)
    lim_{x→0, ax>0} log(1 + ax)/x = lim_{x→0, ax>0} a/(1 + ax) = a.
b)
    lim_{x→0, ax>0} (1 + ax)^(1/x) = lim_{x→0, ax>0} exp(log(1 + ax)/x)
                                   = exp(lim_{x→0, ax>0} log(1 + ax)/x) = exp(a).
c)
    lim_{x→0} (1 − cos(x))/x² = lim_{x→0} sin(x)/(2x) = lim_{x→0} cos(x)/2 = 1/2.
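The three limits above can be checked numerically (a quick sanity check, not a proof; the value a = 3 is an arbitrary sample choice):

```python
# Evaluate each quotient at points approaching 0 from the right; the values
# should approach a, exp(a) and 1/2 respectively.
import math

a = 3.0  # arbitrary sample value for the parameter a

for x in (1e-2, 1e-3, 1e-4):
    q1 = math.log(1 + a * x) / x        # example a): tends to a
    q2 = (1 + a * x) ** (1 / x)         # example b): tends to exp(a)
    q3 = (1 - math.cos(x)) / x ** 2     # example c): tends to 1/2
    print(x, q1, q2, q3)
```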
  with
       lim_{x→x0} f(x)/g(x) = lim_{x→x0} f′(x)/g′(x).
  • Expressions of type 0/0, limit as x → ∞:
    Let f, g : [t0, ∞) → R be differentiable with lim_{x→∞} f(x) = 0 and
    lim_{x→∞} g(x) = 0. Then, if lim_{x→∞} f′(x)/g′(x) exists, then also
    lim_{x→∞} f(x)/g(x) exists with
       lim_{x→∞} f(x)/g(x) = lim_{x→∞} f′(x)/g′(x).
  • Expressions of type ∞/∞, limit as x → ∞:
    Let f, g : [t0, ∞) → R be differentiable with lim_{x→∞} f(x) = ∞ and
    lim_{x→∞} g(x) = ∞. Then, if lim_{x→∞} f′(x)/g′(x) exists, then also
    lim_{x→∞} f(x)/g(x) exists with
       lim_{x→∞} f(x)/g(x) = lim_{x→∞} f′(x)/g′(x).
The second and third statement follow by the substitution y = 1/x and the consideration of
    lim_{x→∞} f(x)/g(x) = lim_{y↘0} f(1/y)/g(1/y)
                        = lim_{y↘0} (−(1/y²) f′(1/y)) / (−(1/y²) g′(1/y))
                        = lim_{y↘0} f′(1/y)/g′(1/y) = lim_{x→∞} f′(x)/g′(x).
                                                                                          2
Note that we can also treat expressions of type “∞ − ∞” by l'Hospital's Theorem. Namely,
for f, g with lim_{x→x0} f(x) = 0 and lim_{x→x0} g(x) = 0, we get that
    lim_{x→x0} (1/f(x) − 1/g(x)) = lim_{x→x0} (g(x) − f(x))/(f(x) · g(x))
                                 = lim_{x→x0} (g′(x) − f′(x))/(f′(x) · g(x) + f(x) · g′(x)).
Also, expressions of type “0 · ∞” can be treated by a special trick. Namely, for f, g with
lim_{x→x0} f(x) = 0 and lim_{x→x0} g(x) = ∞, we get that
    lim_{x→x0} f(x) · g(x) = lim_{x→x0} f(x)/(1/g(x)) = lim_{x→x0} f′(x)/(−g′(x)/(g(x))²)
                           = −lim_{x→x0} f′(x) (g(x))²/g′(x).
You do not have to keep the above two formulas in mind; they can always be derived in
concrete examples.
b)
    lim_{x→0} (1/sin(x) − 1/x) = lim_{x→0} (x − sin(x))/(x sin(x))
                               = lim_{x→0} (1 − cos(x))/(sin(x) + x cos(x))
                               = lim_{x→0} sin(x)/(cos(x) − x sin(x) + cos(x)) = 0.
     Theorem 5.23.
   Let I := [a, b] and f : I → R be differentiable. Furthermore let x0 ∈ I such that
   f′(x0) = 0 and f′ is differentiable in x0. Then the following holds true:
   (i) If f″(x0) > 0, then f has a local minimum in x0.
   (ii) If f″(x0) < 0, then f has a local maximum in x0.
Proof: We only show the case f 00 (x0 ) > 0 (the opposite case is analogous). By definition,
we have
    f″(x0) = lim_{x→x0} (f′(x) − f′(x0))/(x − x0) > 0.
Since this limit is positive, there exists some ε > 0 such that for all
x ∈ I\{x0} with |x − x0| < ε holds
    (f′(x) − f′(x0))/(x − x0) > 0.
Since f′(x0) = 0, we have that
    f′(x) < 0 for all x ∈ (x0 − ε, x0),
    f′(x) > 0 for all x ∈ (x0, x0 + ε).
Therefore, f is monotonically decreasing in (x0 − ε, x0 ) and monotonically increasing in
(x0 , x0 + ε). Therefore, f has a local minimum in x0 .                                2
   Remark:
   Note that in the case f″(x0) = 0, we cannot decide whether f has a local
   extremum there. For instance, consider the three functions f1(x) = x³, f2(x) = x⁴
   and f3(x) = −x⁴. We have f1′(0) = f2′(0) = f3′(0) = 0 and, furthermore, f1″(0) =
   f2″(0) = f3″(0) = 0. However, f1 has no local extremum in 0, f2 has a local minimum
   in 0 and f3 has a local maximum in 0.
   For xn = (2n+1)/2 · π, we have f″(xn) = −sin((2n+1)/2 · π) = (−1)^(n+1). As a
   consequence, sin has a local maximum in xn = (2n+1)/2 · π if n is even and a local
   minimum in xn = (2n+1)/2 · π if n is odd.
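This alternating sign pattern can be confirmed numerically (a small sketch): with f = sin we have f″(x) = −sin(x), and the values at the points xn alternate between −1 and +1.

```python
# Second derivative of sin at x_n = (2n+1)/2 * pi: f''(x_n) = -sin(x_n),
# which alternates -1, +1, -1, +1 for n = 0, 1, 2, 3.
import math

values = []
for n in range(4):
    xn = (2 * n + 1) / 2 * math.pi
    values.append(-math.sin(xn))
print(values)
```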
    f(x) ≥ (f(x2) − f(x1))/(x2 − x1) · (x − x1) + f(x1).                           (5.2)
   If the inequalities in (5.1) or (5.2) are strict then f is called strictly convex/concave.
   Geometrically this means that the graph of a convex (concave) function f : [a, b] → R
   restricted to any subinterval [x1 , x2 ] of [a, b] lies below (above) the secant
    s(x) := (f(x2) − f(x1))/(x2 − x1) · (x − x1) + f(x1).
[Figure: left, a convex function lying below the secant on [x1, x2]; right, a concave
function lying above it.]
   Theorem 5.26.
   Let f : [a, b] → R, a, b ∈ R, a < b, be twice differentiable. Then the following
   holds true:
   a) If f″(x) ≥ 0 for all x ∈ (a, b), then f is convex.
   b) If f″(x) > 0 for all x ∈ (a, b), then f is strictly convex.
   c) If f″(x) ≤ 0 for all x ∈ (a, b), then f is concave.
   d) If f″(x) < 0 for all x ∈ (a, b), then f is strictly concave.
Proof: We only prove a). The other results follow analogously. Let x1 < x < x2 in [a, b].
Since f 00 ≥ 0 we know that f 0 is monotonically increasing. By the intermediate value
This implies
   Theorem 5.28.
   Let f : [a, b] → R be three times continuously differentiable and x0 ∈ (a, b). If
   f″(x0) = 0 and f‴(x0) ≠ 0, then f has an inflection point in x0.
An asymptote y = ax + b of f for x → ±∞ (if it exists) can be determined via
    a = lim_{x→±∞} f(x)/x,    b = lim_{x→±∞} (f(x) − ax).
  5. Zeros
  6. Extrema, monotonicity behaviour
  7. Inflection points, convexity/concavity behaviour
  8. Function graph
Example 5.29. We want to give a complete curve discussion for the rational function
                                                  2x2 + 3x − 4
                                        f (x) =                .
                                                       x2
  1. Domain of definition: D = R\{0}
  2. Symmetries: f is neither an even nor an odd function.
  3. Poles: x0 = 0 is a pole of order 2, limx%0 f (x) = −∞ = limx&0 f (x)
  4. Behaviour for x −→ ±∞, asymptotes: lim_{x→±∞} f(x)/x = 0, lim_{x→±∞} f(x) = 2. Thus
     the horizontal line at y = 2 is an asymptote of f for x −→ ∞ and also for x −→ −∞.
  5. Zeros: f(x) = 0 ⇔ 2x² + 3x − 4 = 0 ⇔ x = x1,2 = (−3 ± √41)/4,
     x1 ≈ −2.35, x2 ≈ 0.85
  6. Extrema, monotonicity behaviour:
     f′(x) = (−3x + 8)/x³ = 0 ⇔ x = x3 = 8/3, y3 := f(x3) ≈ 2.56
     f″(x) = (6x − 24)/x⁴
     f″(x3) < 0 ⇒ f has a local maximum at x3.
  7. Inflection points, convexity/concavity behaviour:
     f″(x) > 0 for 4 < x < ∞    (strictly convex),
     f″(x) < 0 for 0 < x < 4    (strictly concave),
     f″(x) < 0 for −∞ < x < 0   (strictly concave).
     Thus f has an inflection point at x4 = 4 with f(x4) = 2.5.
8. Function graph
[Figure: graph of f with pole at x = 0, zeros x1 ≈ −2.35 and x2 ≈ 0.85, local maximum
at x3 = 8/3 and horizontal asymptote y = 2.]
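The numbers from this curve discussion can be cross-checked numerically (a sketch; f′ below is the derivative computed in step 6):

```python
# Cross-check of the curve discussion for f(x) = (2x^2 + 3x - 4)/x^2.
import math

def f(x):
    return (2 * x ** 2 + 3 * x - 4) / x ** 2

def f_prime(x):
    return (-3 * x + 8) / x ** 3

x1 = (-3 - math.sqrt(41)) / 4   # zero, approximately -2.35
x2 = (-3 + math.sqrt(41)) / 4   # zero, approximately 0.85
x3 = 8 / 3                      # critical point from step 6

print(f(x1), f(x2))        # both approximately 0
print(f_prime(x3), f(x3))  # 0 and approximately 2.56
```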
                                                       g(x0 + h) − g(x0 )
                        F (x0 + h) − F (x0 ) =                            · F 0 (x0 + θh).               (5.3)
                                                          g 0 (x0 + θh)
Then we have
    F′(x) = Σ_{k=0}^{n} (f^(k+1)(x)/k!) (x0 + h − x)^k − Σ_{k=1}^{n} (f^(k)(x)/(k−1)!) (x0 + h − x)^(k−1)
          = (f^(n+1)(x)/n!) (x0 + h − x)^n
and
    g′(x) = −(n + 1)(x0 + h − x)^n.
Moreover, we have
    F(x0) = Σ_{k=0}^{n} (f^(k)(x0)/k!) h^k,   F(x0 + h) = f(x0 + h),   g(x0) = h^(n+1),   g(x0 + h) = 0.
The application of Taylor’s formula is twofold: First, it gives a polynomial that approx-
imates a given function quite fine in some neighbourhood. The second application is the
computation of values of “complicated functions”. We will present examples for both kinds
of application.
Example 5.32. Consider the function f(x) = sin(x). We want to determine the Taylor
polynomial of degree 6 with expansion point x0 = π/2. Since sin^(2k)(π/2) = (−1)^k and
sin^(2k+1)(π/2) = 0 for all k ∈ N0, we obtain
    T6(x) = 1 − (x − π/2)²/2! + (x − π/2)⁴/4! − (x − π/2)⁶/6!.
[Figure: graphs of sin(x) and T6(x); they are nearly indistinguishable around x0 = π/2.]
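The degree-6 Taylor polynomial of sin with expansion point π/2 can be evaluated directly (a small sketch comparing it with sin at a few points near π/2):

```python
# Degree-6 Taylor polynomial of sin around x0 = pi/2, compared with sin itself.
import math

def T6(x):
    u = x - math.pi / 2
    return 1 - u ** 2 / 2 + u ** 4 / math.factorial(4) - u ** 6 / math.factorial(6)

for x in (1.0, 1.5, 2.0):
    print(x, math.sin(x), T6(x))  # close agreement near pi/2
```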
In particular, we have
    T3(1.2) = 0.2 − (1/2) · 0.2² + (1/3) · 0.2³ = 137/750.
Now we estimate |log(1.2) − 137/750|: The remainder term is given by
    R3(x, x0) = −(3!/x̂⁴) · (x − x0)⁴/4! = −(x − x0)⁴/(4x̂⁴)
for some x̂ between x and x0 . For x = 1.2, x0 = 1 we have 1 < x̂ < 1.2 and therefore
    |R3(1.2, 1)| = (0.2)⁴/(4x̂⁴) = 4 · 10⁻⁴ · (1/x̂⁴) ≤ 4 · 10⁻⁴.
This leads to
                         | log(1.2) − 0.1826| = |R3 (1.2, 1)| ≤ 4 · 10−4 ,
so we have determined log(1.2) up to three digits.
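This bound can be verified numerically (a small sketch comparing T3(1.2) = 137/750 with the true value of log(1.2)):

```python
# Degree-3 Taylor polynomial of log at x0 = 1, evaluated at x = 1.2.
import math

t = 0.2                            # x - x0 with x = 1.2, x0 = 1
T3 = t - t ** 2 / 2 + t ** 3 / 3   # equals 137/750
error = abs(math.log(1.2) - T3)
print(T3, error)  # the error stays below the bound 4e-4
```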
   Theorem 5.34.
   Let I ⊂ R be an open interval, n ∈ N and f : I → R an n-times continuously
   differentiable function. Suppose that for a ∈ I holds
       f′(a) = f″(a) = · · · = f^(n−1)(a) = 0 and f^(n)(a) ≠ 0.
   If n is odd, then a is not a local extremum. If n is even and f^(n)(a) > 0, then a is
   a local minimum. If n is even and f^(n)(a) < 0, then a is a local maximum.
    f(xl) = f(a) + (f^(n)(zl)/n!) (xl − a)^n,
    f(xr) = f(a) + (f^(n)(zr)/n!) (xr − a)^n,
and 0 ≠ f^(n)(a), f^(n)(zl), f^(n)(zr) have the same sign. If n is odd, then
(xl − a)^n < 0 < (xr − a)^n and therefore either f(xl) < f(a) < f(xr) or
f(xl) > f(a) > f(xr), so that a is not a local extremum. If n is even and f^(n)(a) > 0,
then (xl − a)^n, (xr − a)^n > 0 and f(xl), f(xr) > f(a), so that a is a local minimum.
Finally, if n is even and f^(n)(a) < 0, then (xl − a)^n, (xr − a)^n > 0 and
f(xl), f(xr) < f(a), so that a is a local maximum.                                      2
or to
                                     f˜(x) := f (x) − y = 0 .                                 (5.7)
Thus, solving Equation (5.6) means finding a fixed point of f̃ and (5.7) means finding a
zero of f̃. These are, in a sense, the most common normalised formulations for “solving”
an equation. First we will present a simple numerical method for finding a fixed point of a
given function based on Banach's fixed-point theorem. Afterwards we will state Newton's
method for finding a zero of a given differentiable function.
5.5 Simple methods for the numerical solution of equations                                                           121
   Theorem 5.35. Weissinger's fixed-point theorem
   Let (X, d) be a complete metric space, f : X → X, and let (an)n∈N0 be a sequence of
   nonnegative real numbers with Σ_{n=0}^{∞} an < ∞ such that for all x0, y0 ∈ X the
   recursively defined sequences xn+1 := f(xn), yn+1 := f(yn), n ∈ N0, satisfy
       d(xn, yn) ≤ an · d(x0, y0).
   Then f possesses exactly one fixed point z ∈ X and for an arbitrary starting point
   z0 ∈ X the recursively defined sequence zn+1 := f (zn ), n ∈ N0 , converges to z.
   Moreover, the following error estimates hold:
       d(z, zk) ≤ (Σ_{n=k}^{∞} an) · d(z0, z1)       (a priori estimate)       (5.8)
       d(z, zk) ≤ (Σ_{n=1}^{∞} an) · d(zk−1, zk)     (a posteriori estimate)   (5.9)
for all k ∈ N.
Proof: First of all we show that f is continuous. Let x ∈ X, ε > 0 and set
δ := ε/(a1 + 1) > 0. Then for y ∈ X with d(x, y) < δ holds by assumption
    d(f(x), f(y)) ≤ a1 · d(x, y) ≤ a1 · ε/(a1 + 1) < ε.
From the ε-δ-criterion it follows that f is continuous in x.
Let x0 ∈ X and consider the sequence defined by xn+1 := f(xn), so that
d(xn, xn+1) ≤ an · d(x0, x1) for all n ∈ N. We will now show that (xn)n∈N is a Cauchy
sequence. Let ε > 0. Since Σ_{n=0}^{∞} an converges, there is an N ∈ N such that for all
k, m > N with m ≥ k holds
    Σ_{n=k}^{m} an < ε/(d(x0, x1) + 1).
Thus (xn)n∈N is a Cauchy sequence. Since (X, d) is complete, (xn)n∈N converges to some
limit x and since f is continuous we conclude
    x = lim_{n→∞} xn+1 = lim_{n→∞} f(xn) = f(lim_{n→∞} xn) = f(x),
i.e. x is a fixed point of f . For any other starting point z0 ∈ X and recursively defined
zn+1 := f (zn ), n ∈ N0 , we obtain by the previous that z := limn→∞ zn is also a fixed
point of f . But the assumption applied to u0 := x, v0 := z, un+1 := f (un ) = x,
vn+1 := f (vn ) = z for n ∈ N0 yields
                         0 ≤ d(x, z) = d(un , vn ) ≤ an · d(u0 , v0 ) = an · d(x, z).
Since (an )n∈N0 converges to zero, the right-hand side converges to zero which implies
d(x, z) = 0, i.e. x = z. Thus f has a unique fixed point z ∈ X.
Next we derive (5.8). Let k ∈ N. For arbitrary m ≥ k holds
    d(z, zk) ≤ d(z, zm) + d(zm, zk) ≤ d(z, zm) + Σ_{n=k}^{m−1} d(zn, zn+1)
             ≤ d(z, zm) + (Σ_{n=k}^{m−1} an) · d(z0, z1).
For m → ∞, the first summand tends to 0 and the sum tends to Σ_{n=k}^{∞} an, which
yields (5.8).
Finally, if we define z̃0 := zk−1 and z̃j+1 := f(z̃j) = zk+j, j ∈ N0, (5.9) follows from (5.8)
applied to the sequence (z̃j)j∈N0 for j := 1, namely:
    d(z, zk) = d(z, z̃1) ≤ (Σ_{n=1}^{∞} an) · d(z̃0, z̃1) = (Σ_{n=1}^{∞} an) · d(zk−1, zk).
                                                                                          2
The so-called Banach fixed-point theorem is a special case of Weissinger’s fixed point
theorem.
   Theorem 5.36. Banach fixed-point theorem
   Let (X, d) be a complete metric space and f : X → X be a function on X such that
   d(f(x), f(y)) ≤ q · d(x, y) for all x, y ∈ X, where q ∈ [0, 1) is a fixed nonnegative
   constant less than one a . Then f has exactly one fixed point z ∈ X and for an
   arbitrary z0 ∈ X the recursively defined sequence zn+1 := f (zn ), n ∈ N0 , converges
   to z. Moreover the following error estimates hold for k ∈ N:
                                            qk
                             d(z, zk ) ≤       · d(z0 , z1 )            (a priori estimate)                        (5.10)
                                           1−q
                                    q
                     d(z, zk ) ≤       · d(zk−1 , zk )              (a posteriori estimate)                        (5.11)
                                   1−q
      a
          Functions with this property are called contractions and q is called a contraction constant for f .
Proof: With an := q^n for n ∈ N0 the assumptions of Weissinger's fixed point theorem are
fulfilled. The estimates (5.10) and (5.11) follow directly from (5.8) and (5.9), respectively.
                                                                                          2
We want to reformulate Banach’s fixed-point theorem for the special case where X is
a closed subset of R and d is the Euclidean metric on X, i.e. d(x, y) := |x − y| for
x, y ∈ X. Recall that in this case (X, d) is complete.
   Theorem 5.37.
   Let X ⊂ R be closed and f : X → X be a function such that |f(x) − f(y)| ≤ q · |x − y|
   for all x, y ∈ X, where q ∈ [0, 1) is a fixed nonnegative constant less than one, i.e. f
   is a contraction on X with contraction constant q. Then f has exactly one fixed point
   z ∈ X and for an arbitrary z0 ∈ X the recursively defined sequence zn+1 := f (zn ),
   n ∈ N0 , converges to z. Moreover the following error estimates hold for k ∈ N:
                                       qk
                        |z − zk | ≤       |z1 − z0 |         (a priori estimate)            (5.12)
                                      1−q
                                q
                 |z − zk | ≤       |zk−1 − zk |        (a posteriori estimate)              (5.13)
                               1−q
In practical applications the function f is given, but in order to apply Theorem 5.37 an
appropriate domain X with f(X) ⊂ X and a contraction constant q ∈ [0, 1) must be
determined. In practice, a closed set X is guessed where a fixed point of the given function
f might be located. Then f(X) ⊂ X must be verified and a contraction constant must
be found. If this is not possible, the guessed domain X must be changed. The following
theorem states a standard procedure for finding a contraction constant q on a given
closed interval domain X by using an upper bound of the first derivative of f, which
requires that f is continuously differentiable on X.
   Theorem 5.38.
   Let X ⊂ R be a closed interval and f : X → X be continuously differentiable on
   X. If q := ||f 0 ||∞ = sup{|f 0 (x)| | x ∈ X} < 1, then f is a contraction on X
   with contraction constant q. In particular, by Banach’s fixed-point theorem, f has
   exactly one fixed point in X and for an arbitrary z0 ∈ X the recursively defined
   sequence zn+1 := f (zn ), n ∈ N0 , converges to z and the error estimates (5.12) and
   (5.13) hold.
Proof: Let x, y ∈ X with x < y. By the mean value theorem there is a ξ ∈ (x, y) such
that
    |f(x) − f(y)| = |f′(ξ) · (y − x)| = |f′(ξ)| · |y − x| ≤ q · |x − y|.
This implies that f is a contraction on X with contraction constant q and the conclusion
follows from Theorem 5.37.                                                              2
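As an illustration of Theorems 5.37 and 5.38 (a sketch with a sample function that is our choice, not from the notes): f(x) = cos(x) maps X = [0, 1] into itself, and q = sup{|−sin(x)| : x ∈ [0, 1]} = sin(1) < 1, so the fixed-point iteration converges and the error bounds (5.12) and (5.13) apply.

```python
# Fixed-point iteration for f(x) = cos(x) on X = [0, 1], with the a priori
# and a posteriori error bounds from (5.12) and (5.13).
import math

q = math.sin(1.0)          # contraction constant via Theorem 5.38
z0 = 0.5                   # arbitrary starting point in X
z1 = math.cos(z0)
first_gap = abs(z1 - z0)   # |z1 - z0| for the a priori estimate

z_prev, z = z0, z1
for k in range(2, 31):
    z_prev, z = z, math.cos(z)

k = 30
apriori = q ** k / (1 - q) * first_gap        # bound (5.12)
aposteriori = q / (1 - q) * abs(z - z_prev)   # bound (5.13)
print(z, apriori, aposteriori)  # z is near the fixed point of cos
```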
Now we will state Newton’s method for finding zeros of differentiable functions f . New-
ton’s method is based on the following simple iteration principle: If x0 is an approximate
solution of f (x) = 0, then f is replaced in a vicinity of x0 by the tangent of f in x0 ,
namely by
    t(x) := f′(x0)(x − x0) + f(x0).
The zero of the tangent t,
    x1 := x0 − f(x0)/f′(x0),
is then taken as the (hopefully improved) approximation of the zero of f.
[Figure: one Newton step: the tangent t(x) of f at x0 intersects the x-axis at the
improved approximation x1.]
Proof: a) By the intermediate value Theorem f has a root ξ ∈ (a, b). In order to prove
uniqueness, we assume that f has two distinct roots ξ, η ∈ (a, b) with ξ < η. Since f is
convex, for x1 := a < x := ξ < x2 := η holds
Since f is convex, f′ is monotonically increasing so that for all x ∈ [ξ, b] holds f′(x) ≥
f′(η) > 0, i.e. f′ is positive on [ξ, b]. In particular, f is strictly monotonically increasing
on [ξ, b] so that f(x) > f(ξ) = 0 for all x ∈ (ξ, b].
    x1 := x0 − f(x0)/f′(x0)
Since t(x0) = f(x0) ≥ 0 and t′(x0) = f′(x0) > 0 we immediately see that x1 ≤ x0.
Moreover, (f − t)(x0) = 0 and (f − t)′(x) = f′(x) − f′(x0) ≤ 0 for x ≤ x0 imply
f(x) ≥ t(x) for x ≤ x0. In particular t(ξ) ≤ f(ξ) = 0 ≤ f(x0) = t(x0) and therefore
ξ ≤ x1 ≤ x0 as t(x) is a straight line. Replacing x0 by x1 we conclude inductively that
the sequence
    xn+1 := xn − f(xn)/f′(xn),   n ∈ N0,
is well-defined, monotonically decreasing and bounded from below by ξ. Thus
                                            η := lim xn
                                                   n→∞
exists and fulfils ξ ≤ η. Since f′ is bounded from below on [ξ, b] by some positive constant,
we also have
    η = η − f(η)/f′(η),
which implies f(η) = 0. By a) this proves η = ξ.
c) Since f′ is monotonically increasing we have f′(x) ≥ f′(ξ) ≥ C > 0 for all x ∈ [ξ, b].
Therefore, f(x) ≥ C(x − ξ) for all x ∈ [ξ, b]. In particular, this implies
    |xn − ξ| = xn − ξ ≤ f(xn)/C   for all n ∈ N.
In order to estimate f(xn) the following function g(x) is considered:
    g(x) := f(x) − f(xn−1) − f′(xn−1)(x − xn−1) − (K/2)(x − xn−1)²,
    g′(x) = f′(x) − f′(xn−1) − K(x − xn−1),
    g″(x) = f″(x) − K ≤ 0 for x ∈ (ξ, b).
Thus g′ is monotonically decreasing on [ξ, b]. Since g′(xn−1) = 0, this implies g′(x) ≥ 0
for x ∈ [ξ, xn−1]. Since g(xn−1) = 0, this implies g(x) ≤ 0 for x ∈ [ξ, xn−1]. In particular,
    0 ≥ g(xn) = f(xn) − f(xn−1) − f′(xn−1)(xn − xn−1) − (K/2)(xn − xn−1)²
              = f(xn) − f(xn−1) + f(xn−1) − (K/2)(xn − xn−1)²
              = f(xn) − (K/2)(xn − xn−1)²,
that is
    f(xn) ≤ (K/2)(xn − xn−1)²
and we conclude
    |xn − ξ| ≤ f(xn)/C ≤ (K/(2C)) |xn − xn−1|².
Finally, by b) (xn )n∈N0   decreases monotonically with limit ξ which directly yields |xn+1 −
xn | ≤ |ξ − xn |.
                                                                                              2
   Remark:
   a) The error estimate given in Theorem 5.39 c) says that Newton's method (locally)
      converges quadratically. If f' is bounded from below by some c_1 > 0 and
      if |f''| is bounded from above by some c_2, then we may choose C := c_1 and
      K := c_2.
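The quadratic error estimate can be illustrated numerically. A minimal sketch (not from the notes; the example f(x) = x² − 2 on [1, 2] is an arbitrary choice): here ξ = √2, and we may take C = 2 (since f'(x) = 2x ≥ 2 on [1, 2]) and K = 2 (since f'' ≡ 2), so the theorem predicts |x_n − ξ| ≤ (K/(2C)) |x_n − x_{n-1}|² = (1/2) |x_n − x_{n-1}|²:

```python
import math

# Newton's method for f(x) = x^2 - 2 (illustrative choice); its zero is sqrt(2).
def newton_step(x):
    # x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1})
    return x - (x * x - 2.0) / (2.0 * x)

xi = math.sqrt(2.0)
x = 2.0                              # starting value x_0 = b
errors = []
for _ in range(4):
    x_new = newton_step(x)
    bound = 0.5 * (x_new - x) ** 2   # (K/(2C)) * |x_n - x_{n-1}|^2 with K = C = 2
    errors.append((abs(x_new - xi), bound))
    x = x_new

# the quadratic error bound from Theorem 5.39 c) holds in every step
assert all(err <= bound for err, bound in errors)
```

The error roughly squares in each step, which is what "quadratic convergence" means in practice.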
6 The Riemann Integral

Definition 6.1.
Let [a, b] ⊂ R. A set {x_0, x_1, ..., x_n} is called a decomposition or partition of [a, b]
if

    a = x_0 < x_1 < x_2 < ... < x_{n-1} < x_n = b.
Definition 6.2. Step functions
A function φ : [a, b] → R is called a step function if there exists a decomposition
{x_0, ..., x_n} of [a, b] such that φ is constant on each open subinterval (x_{i-1}, x_i),
i = 1, ..., n. The set of all step functions on [a, b] is denoted by T([a, b]).

[Figure: graph of a step function on [a, b]]

It can be readily verified that for two step functions f_1, f_2 ∈ T([a, b]) we have
f_1 + f_2 ∈ T([a, b]). Likewise, λf_1 ∈ T([a, b]) for every λ ∈ R. Hence, T([a, b]) is a
vector space. Furthermore, since step functions attain only finitely many values, they are
bounded.
Definition 6.3. Integral of step functions
Let φ ∈ T([a, b]) and a decomposition {x_0, ..., x_n} of [a, b] be given such that for
all i = 1, ..., n

    φ(x) = c_i   for all x ∈ (x_{i-1}, x_i).

Then the integral of φ is defined as

    ∫_a^b φ(x) dx := Σ_{j=1}^n c_j (x_j − x_{j-1}).
However, to be sure that the integral is well-defined we need additionally that it is inde-
pendent of the special choice of a decomposition.
Lemma 6.4.
∫_a^b φ(x) dx is independent of the choice of the decomposition, i.e., ∫_a^b φ(x) dx is
well-defined for all φ ∈ T([a, b]).
Proof: Let

    Z_1 : a = x_0 < x_1 < x_2 < ... < x_{n-1} < x_n = b,
    Z_2 : a = y_0 < y_1 < y_2 < ... < y_{m-1} < y_m = b

be two decompositions of [a, b] such that φ is constant on (x_{i-1}, x_i) and φ is constant
on (y_{j-1}, y_j) for all i = 1, ..., n, j = 1, ..., m, say with values c_i and d_j,
respectively. We distinguish between two cases:

1st Case: Z_2 is a refinement of Z_1. This means that for all i ∈ {1, ..., n} there exists
some j(i) ∈ {1, ..., m} with x_i = y_{j(i)} (and j(0) := 0). Since d_k = c_i for
k = j(i−1)+1, ..., j(i), we obtain

    Σ_{j=1}^m d_j (y_j − y_{j-1}) = Σ_{i=1}^n Σ_{k=j(i-1)+1}^{j(i)} c_i (y_k − y_{k-1})
                                  = Σ_{i=1}^n c_i Σ_{k=j(i-1)+1}^{j(i)} (y_k − y_{k-1})
                                  = Σ_{i=1}^n c_i (y_{j(i)} − y_{j(i-1)})
                                  = Σ_{i=1}^n c_i (x_i − x_{i-1}).

2nd Case: Z_1 and Z_2 are arbitrary. Consider the common refinement Z of Z_1 and Z_2
consisting of all points of Z_1 and Z_2. By the first case, the integrals with respect to
Z_1 and Z_2 both coincide with the integral with respect to Z, and hence with each other.
                                                                                                                        2
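As a quick illustration (not part of the notes), one can evaluate the same step function over a partition and over a refinement of it and observe that the integral agrees; the partition points and values below are arbitrary choices:

```python
# Integral of a step function: sum of value * subinterval length.
def step_integral(points, values):
    # points: x_0 < ... < x_n, values: c_1, ..., c_n on the open subintervals
    return sum(c * (points[j + 1] - points[j]) for j, c in enumerate(values))

# phi = 1 on (0, 1), 3 on (1, 2.5), -2 on (2.5, 4)
coarse = step_integral([0.0, 1.0, 2.5, 4.0], [1.0, 3.0, -2.0])
# the same phi evaluated on the refinement {0, 0.5, 1, 2, 2.5, 3, 4}
fine = step_integral([0.0, 0.5, 1.0, 2.0, 2.5, 3.0, 4.0],
                     [1.0, 1.0, 3.0, 3.0, -2.0, -2.0])
# both equal 1*1 + 3*1.5 - 2*1.5 = 2.5
assert abs(coarse - fine) < 1e-12
```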
The integral can be seen as a mapping from the space T ([a, b]) to R. In the literature,
mappings from vector spaces to a field (in this case R) are called functionals. The following
result shows that the integral is a linear and monotone functional.
Theorem 6.5.
∫_a^b : T([a, b]) → R is linear and monotonic, that is, for all φ, ψ ∈ T([a, b]) and all
λ, µ ∈ R holds

 (i) ∫_a^b (λφ(x) + µψ(x)) dx = λ ∫_a^b φ(x) dx + µ ∫_a^b ψ(x) dx;

 (ii) if φ(x) ≤ ψ(x) for all x ∈ [a, b], then ∫_a^b φ(x) dx ≤ ∫_a^b ψ(x) dx.
[Fig. 6.1 and Fig. 6.2: graphs of step functions on [a, b], with positive and negative parts]

It can be seen from Fig. 6.1 and Fig. 6.2 that the integral can be seen as the "signed area"
between the function graph and the x-axis. "Signed" means that the area of the negative
parts of the function is counted negatively.

Definition 6.6. Upper and lower integral
For a bounded function f : [a, b] → R, the upper integral and the lower integral of f
are defined as

    \overline{∫_a^b} f(x) dx := inf{ ∫_a^b ψ(x) dx : ψ ∈ T([a, b]) with ψ ≥ f },
    \underline{∫_a^b} f(x) dx := sup{ ∫_a^b φ(x) dx : φ ∈ T([a, b]) with φ ≤ f }.

By the above definition, we can directly deduce that for φ ∈ T([a, b]) holds

    \underline{∫_a^b} φ(x) dx = ∫_a^b φ(x) dx = \overline{∫_a^b} φ(x) dx.
Note that in general the upper integral does not coincide with the lower integral. For
instance, consider the function f : [0, 1] → R with

    f(x) = { 1 : x ∈ Q
           { 0 : x ∈ R\Q.

Then we have

    \overline{∫_0^1} f(x) dx = 1 > 0 = \underline{∫_0^1} f(x) dx.
Theorem 6.7.
Let f, g : [a, b] → R be bounded. Then

 (i) \overline{∫_a^b} (f(x) + g(x)) dx ≤ \overline{∫_a^b} f(x) dx + \overline{∫_a^b} g(x) dx;

 (ii) \underline{∫_a^b} (f(x) + g(x)) dx ≥ \underline{∫_a^b} f(x) dx + \underline{∫_a^b} g(x) dx;

 (iii) \overline{∫_a^b} λf(x) dx = λ \overline{∫_a^b} f(x) dx for all λ ≥ 0;

 (iv) \overline{∫_a^b} λf(x) dx = λ \underline{∫_a^b} f(x) dx for all λ ≤ 0.
Proof:
(i) Let ε > 0. Then there exist φ, ψ ∈ T([a, b]) with φ ≥ f, ψ ≥ g and

    \overline{∫_a^b} f(x) dx + ε/2 ≥ ∫_a^b φ(x) dx,   \overline{∫_a^b} g(x) dx + ε/2 ≥ ∫_a^b ψ(x) dx.

Then f + g ≤ φ + ψ and

    \overline{∫_a^b} (f(x) + g(x)) dx = inf{ ∫_a^b ζ(x) dx : ζ ∈ T([a, b]) with ζ ≥ f + g }
        ≤ ∫_a^b (φ(x) + ψ(x)) dx
        ≤ (\overline{∫_a^b} f(x) dx + ε/2) + (\overline{∫_a^b} g(x) dx + ε/2)
        = \overline{∫_a^b} f(x) dx + \overline{∫_a^b} g(x) dx + ε.

Since ε > 0 was arbitrary, (i) follows. The proof of (ii) is analogous.
(iii): For λ = 0 the statement is clear, so let λ > 0 and ε > 0, and choose φ ∈ T([a, b])
with φ ≥ f and ∫_a^b φ(x) dx ≤ \overline{∫_a^b} f(x) dx + ε. Then λφ ≥ λf and

    \overline{∫_a^b} λf(x) dx ≤ ∫_a^b λφ(x) dx = λ ∫_a^b φ(x) dx ≤ λ \overline{∫_a^b} f(x) dx + λε.

The opposite inequality follows from the previous one applied to g(x) := λf(x) and
µ := 1/λ > 0:

    \overline{∫_a^b} f(x) dx = \overline{∫_a^b} µg(x) dx ≤ µ \overline{∫_a^b} g(x) dx = (1/λ) \overline{∫_a^b} λf(x) dx.
(iv): This result can be shown by using (iii) and the fact

    \overline{∫_a^b} (−f(x)) dx = −\underline{∫_a^b} f(x) dx.
                                                                                                                                               2
Definition 6.8.
A bounded function f : [a, b] → R is called Riemann-integrable if

    \underline{∫_a^b} f(x) dx = \overline{∫_a^b} f(x) dx.

In this case, the common value is denoted by ∫_a^b f(x) dx, and the set of all
Riemann-integrable functions on [a, b] is denoted by R([a, b]).
We obviously have that T([a, b]) ⊂ R([a, b]). In the following we state that monotonic
functions as well as continuous functions are Riemann-integrable. The proof is not
presented here.
   Remark:
   As it holds true for summation, the integration variable can be renamed without
   changing the integral. That is, for f ∈ R([a, b]) holds
    ∫_a^b f(x) dx = ∫_a^b f(t) dt.
   Theorem 6.9.
   Let f : [a, b] → R be continuous or monotonic. Then f is Riemann-integrable.
So far we did not compute any integrals. We will now give an example of an integral that
is computed directly from the definition. This will turn out to be rather laborious even
for this quite simple example.
Example 6.11. Consider the function f : [0, 1] → R with f(x) = x. Determine
∫_0^1 f(x) dx = ∫_0^1 x dx. First we consider two sequences of step functions
(φ_n)_{n∈N}, (ψ_n)_{n∈N} with

    φ_n(x) = (k − 1)/n   for x ∈ ((k − 1)/n, k/n),   k ∈ {1, ..., n},
    ψ_n(x) = k/n         for x ∈ ((k − 1)/n, k/n),   k ∈ {1, ..., n}.

Note that φ_n ≤ f ≤ ψ_n on [0, 1]. Then

    ∫_0^1 φ_n(x) dx = Σ_{k=1}^n (k − 1)/n · (k/n − (k − 1)/n)
                    = Σ_{k=1}^n (k − 1)/n · 1/n
                    = (1/n²) Σ_{k=1}^n (k − 1)
                    = (1/n²) · n(n − 1)/2 = 1/2 − 1/(2n)
and

    ∫_0^1 ψ_n(x) dx = Σ_{k=1}^n k/n · (k/n − (k − 1)/n)
                    = Σ_{k=1}^n k/n · 1/n
                    = (1/n²) Σ_{k=1}^n k
                    = (1/n²) · n(n + 1)/2 = 1/2 + 1/(2n).
In particular, we have for all n ∈ N that

    1/2 − 1/(2n) = ∫_0^1 φ_n(x) dx ≤ ∫_0^1 x dx ≤ ∫_0^1 ψ_n(x) dx = 1/2 + 1/(2n)

and thus, letting n → ∞,

    ∫_0^1 x dx = 1/2.
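The two step-function integrals from the example can be reproduced numerically. A sketch (not part of the notes; the value of n is an arbitrary illustrative choice):

```python
# Lower and upper sums from Example 6.11 for f(x) = x on [0, 1].
def lower_sum(n):
    # integral of phi_n: sum over k of (k - 1)/n times the length 1/n
    return sum((k - 1) / n * (1 / n) for k in range(1, n + 1))

def upper_sum(n):
    # integral of psi_n: sum over k of k/n times the length 1/n
    return sum(k / n * (1 / n) for k in range(1, n + 1))

n = 1000
assert abs(lower_sum(n) - (0.5 - 1 / (2 * n))) < 1e-12   # = 1/2 - 1/(2n)
assert abs(upper_sum(n) - (0.5 + 1 / (2 * n))) < 1e-12   # = 1/2 + 1/(2n)
```

Both sums squeeze the value 1/2 as n grows, matching the computation above.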
By the definition of the integral, it is not difficult to obtain that for f ∈ R([a, b]) and
c ∈ (a, b) holds

    ∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.

To make this formula also valid for c ≥ b or c ≤ a, we define for a ≥ b that

    ∫_a^b f(x) dx := −∫_b^a f(x) dx.
Now we present the mean value theorem of integration and its manifold consequences.
Theorem 6.12. Mean Value Theorem of Integration
Let f, g : [a, b] → R be continuous and let g(x) ≥ 0 for all x ∈ [a, b] (i.e., g ≥ 0).
Then there exists some x̂ ∈ [a, b] such that

    ∫_a^b f(x)g(x) dx = f(x̂) · ∫_a^b g(x) dx.
Proof: Let m := min{f(x) : x ∈ [a, b]} and M := max{f(x) : x ∈ [a, b]}, which exist since
f is continuous. Since g ≥ 0, for all x ∈ [a, b] holds mg(x) ≤ f(x)g(x) ≤ M g(x) and, by
the monotonicity of the integral,

    m ∫_a^b g(x) dx = ∫_a^b mg(x) dx ≤ ∫_a^b f(x)g(x) dx ≤ ∫_a^b M g(x) dx = M ∫_a^b g(x) dx.

If ∫_a^b g(x) dx = 0, then also ∫_a^b f(x)g(x) dx = 0 and the claim holds for any
x̂ ∈ [a, b]. Otherwise, set

    µ := (∫_a^b f(x)g(x) dx) / (∫_a^b g(x) dx).

Since

    min{f(x) : x ∈ [a, b]} ≤ µ ≤ max{f(x) : x ∈ [a, b]},

the intermediate value theorem implies that there exists some x̂ ∈ [a, b] with µ = f(x̂)
and thus

    f(x̂) ∫_a^b g(x) dx = ∫_a^b f(x)g(x) dx.
                                                                                       2
Corollary 6.13.
Let f : [a, b] → R be continuous. Then there exists some x̂ ∈ [a, b] such that

    ∫_a^b f(x) dx = f(x̂) · (b − a).
   Definition 6.14.
   Let I be an interval and f : I → R be continuous. Then a differentiable function
   F : I → R is called an antiderivative of f if F 0 = f .
Theorem 6.15.
Let I be an interval, f : I → R be continuous and a ∈ I. For x ∈ I define

    F(x) = ∫_a^x f(ξ) dξ.

Then F is differentiable with F' = f, i.e., F is an antiderivative of f.
Proof: Let x ∈ I and h ≠ 0 such that x + h ∈ I. Then, by using the mean value theorem
of integration (Corollary 6.13), we obtain

    (1/h)(F(x + h) − F(x)) = (1/h)(∫_a^{x+h} f(ξ) dξ − ∫_a^x f(ξ) dξ)
                           = (1/h) ∫_x^{x+h} f(ξ) dξ = (1/h) · h f(x̂) = f(x̂)

for some x̂ between x and x + h. If h tends to 0, then x̂ → x and thus, by the continuity
of f,

    lim_{h→0} (1/h)(F(x + h) − F(x)) = f(x).

                                                                                     2
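The limit above can be illustrated numerically. A sketch (the choice f = cos and the midpoint-rule approximation of F are assumptions for illustration, not part of the text):

```python
import math

def F(x, steps=100000):
    # midpoint-rule approximation of F(x) = integral_0^x cos(xi) d(xi)
    h = x / steps
    return sum(math.cos((i + 0.5) * h) for i in range(steps)) * h

x = 1.0
for h in (1e-2, 1e-3):
    # difference quotient of F approaches f(x) = cos(x) as h shrinks
    dq = (F(x + h) - F(x)) / h
    assert abs(dq - math.cos(x)) < 2 * h
```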
   Theorem 6.16.
   Let I be an interval, f : I → R be given and let F : I → R be an antiderivative of
   f , i.e., F 0 = f . Then G : I → R is an antiderivative of f if and only if F − G is
   constant.
The next result now states that integrals can be determined by inversion of differentiation.

Theorem 6.17. Fundamental Theorem of Calculus
Let I be an interval, f : I → R be continuous, let F : I → R be an antiderivative of f,
and let a, b ∈ I. Then

    ∫_a^b f(x) dx = F(b) − F(a).

We write

    ∫_a^b f(x) dx = F(x)|_{x=a}^{x=b}.
Proof: By Theorem 6.15, we know that F_0 : I → R, F_0(x) := ∫_a^x f(ξ) dξ, is an
antiderivative of f. In particular, we have F_0(a) = ∫_a^a f(ξ) dξ = 0 and thus
F_0(b) − F_0(a) = F_0(b) = ∫_a^b f(ξ) dξ = ∫_a^b f(x) dx. Let F : I → R be an
antiderivative of f. Theorem 6.16 now implies that there exists some c ∈ R with
F(x) = F_0(x) + c for all x ∈ I. Therefore

    F(b) − F(a) = (F_0(b) + c) − (F_0(a) + c) = F_0(b) − F_0(a) = ∫_a^b f(x) dx.
                                                                                               2
The above result gives rise to the following notation for an antiderivative:

    ∫ f(x) dx := F(x).
    f(x)                              ∫ f(x) dx
    -----------------------------------------------
    x^n,  n ∈ N                       x^{n+1}/(n+1)
    x^{-1},  x ≠ 0                    log(|x|)
    x^{-n},  x ≠ 0, n ∈ N, n ≠ 1      x^{1-n}/(1-n)
    exp(x)                            exp(x)
    sinh(x)                           cosh(x)
    cosh(x)                           sinh(x)
    1/√(1 + x²)                       arsinh(x)
    1/√(x² − 1),  x > 1               arcosh(x)
    1/(1 − x²),  |x| < 1              artanh(x)
    sin(x)                            −cos(x)
    cos(x)                            sin(x)
    1/cos²(x) = 1 + tan²(x)           tan(x)
    1/√(1 − x²),  |x| < 1             arcsin(x)
    −1/√(1 − x²),  |x| < 1            arccos(x)
    1/(1 + x²)                        arctan(x)
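Some table entries can be spot-checked by numerical differentiation of the right-hand column. A small illustration (the selected entries, sample points, step size and tolerance are arbitrary choices, not part of the notes):

```python
import math

# Pairs (f, F) from the antiderivative table; the central difference
# (F(x+h) - F(x-h)) / (2h) should approximate f(x).
pairs = [
    (lambda x: x ** 3, lambda x: x ** 4 / 4),             # x^n with n = 3
    (lambda x: 1 / x, lambda x: math.log(abs(x))),        # x^(-1)
    (lambda x: math.cos(x), lambda x: math.sin(x)),       # cos
    (lambda x: 1 / (1 + x * x), lambda x: math.atan(x)),  # arctan
]

h = 1e-6
for f, F in pairs:
    for x in (0.5, 1.3, 2.0):
        dF = (F(x + h) - F(x - h)) / (2 * h)
        assert abs(dF - f(x)) < 1e-6
```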
Therefore,

    ∫_a^b f(φ(t)) φ'(t) dt = ∫_a^b (F ∘ φ)'(t) dt = (F ∘ φ)(t)|_{t=a}^{t=b}
                           = F(x)|_{x=φ(a)}^{x=φ(b)} = ∫_{φ(a)}^{φ(b)} f(x) dx.
                                                                                                                                               2
As a direct conclusion of this result, we can formulate the following:

Theorem 6.19. Integration by Substitution II
Let I be an interval, g : I → R be continuously differentiable and injective with
inverse function g^{-1} : g(I) → R. Let f : J → R with J ⊂ g(I). Then

    ∫_a^b f(x) dx = ∫_{g^{-1}(a)}^{g^{-1}(b)} f(g(t)) g'(t) dt.
Example 6.20. We can use the substitution rule to determine the area of an ellipse. The
equation of an ellipse is given by

    x²/a² + y²/b² = 1.

This leads to

    y = ±b √(1 − x²/a²),   x ∈ [−a, a].

As a consequence, the area of the ellipse is given by

    A = 2 ∫_{−a}^a b √(1 − x²/a²) dx = 2b ∫_{−a}^a √(1 − x²/a²) dx.
Now we set g(t) = a sin(t) and f(x) = √(1 − x²/a²). According to the substitution rule, we
now have

    ∫_{−a}^a √(1 − x²/a²) dx = ∫_{arcsin(−a/a)}^{arcsin(a/a)} √(1 − a² sin²(t)/a²) (a sin)'(t) dt
                             = ∫_{−π/2}^{π/2} √(1 − sin²(t)) a cos(t) dt
                             = a ∫_{−π/2}^{π/2} cos²(t) dt.
With

    cos²(t) = (1/4)(exp(it) + exp(−it))² = (1/4)(exp(2it) + 2 + exp(−2it)) = (1/2) cos(2t) + 1/2,
we obtain

    A = 2 ∫_{−a}^a b √(1 − x²/a²) dx = 2ab ∫_{−π/2}^{π/2} cos²(t) dt
      = 2ab ∫_{−π/2}^{π/2} ((1/2) cos(2t) + 1/2) dt = ab ∫_{−π/2}^{π/2} (cos(2t) + 1) dt
      = ab · ((1/2) sin(2t) + t)|_{t=−π/2}^{t=π/2}
      = ab · ((1/2) sin(π) − (1/2) sin(−π) + π/2 + π/2) = πab.
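The value A = πab can be confirmed by numerical quadrature. An illustrative sketch with arbitrarily chosen semi-axes a = 3, b = 2 (the midpoint rule is an assumption for the check, not the method of the text):

```python
import math

a, b = 3.0, 2.0
n = 100000
h = 2 * a / n
# area = 2 * integral_{-a}^{a} b * sqrt(1 - x^2/a^2) dx, midpoint rule;
# max(0.0, ...) guards against tiny negative values from rounding at the endpoints
area = 2 * sum(
    b * math.sqrt(max(0.0, 1 - ((-a + (i + 0.5) * h) / a) ** 2)) * h
    for i in range(n)
)
assert abs(area - math.pi * a * b) < 1e-3
```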
b) For a ≥ 0, b ≥ 0, determine

    ∫_a^b exp(√x) dx.
Consider the substitution x = t². Then dx/dt = 2t and thus dx = 2t dt. For the
integration bounds, consider a = t_l² and b = t_u², which yields t_l = √a, t_u = √b.
We now get

    ∫_a^b exp(√x) dx = ∫_{√a}^{√b} exp(t) · 2t dt
                     = 2 ∫_{√a}^{√b} t exp(t) dt
                     = 2 exp(t)(t − 1)|_{t=√a}^{t=√b} = 2 exp(√x)(√x − 1)|_{x=a}^{x=b}.
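The closed form obtained by the substitution can be checked against a simple Riemann-sum approximation. A sketch (the limits a = 1, b = 4 and the midpoint rule are arbitrary illustrative choices):

```python
import math

def antiderivative(x):
    # 2 * exp(sqrt(x)) * (sqrt(x) - 1), the result of the substitution above
    s = math.sqrt(x)
    return 2 * math.exp(s) * (s - 1)

a, b = 1.0, 4.0
n = 100000
h = (b - a) / n
# midpoint-rule approximation of integral_a^b exp(sqrt(x)) dx
approx = sum(math.exp(math.sqrt(a + (i + 0.5) * h)) for i in range(n)) * h
exact = antiderivative(b) - antiderivative(a)
assert abs(approx - exact) < 1e-6
```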
By using the substitution rule, we can also integrate expressions of the type g'(x)/g(x).
Corollary 6.22.
For a differentiable function g : [a, b] → R with g(x) ≠ 0 for all x ∈ [a, b] holds

    ∫_a^b g'(x)/g(x) dx = log(|g(x)|)|_{x=a}^{x=b}.
a) For a, b ∈ R with cos(x) ≠ 0 for all x ∈ [a, b] holds

    ∫_a^b tan(x) dx = ∫_a^b sin(x)/cos(x) dx = ∫_a^b (−cos'(x))/cos(x) dx
                    = −log(| cos(x)|)|_{x=a}^{x=b}.
b) For a, b ∈ R holds

    ∫_a^b x/(x² + 1) dx = (1/2) ∫_a^b 2x/(x² + 1) dx
                        = (1/2) ∫_a^b (x² + 1)'/(x² + 1) dx
                        = (1/2) log(|x² + 1|)|_{x=a}^{x=b}.
We use integration by parts with f'(x) = f(x) = exp(x) and g(x) = x. Then

    ∫_a^b exp(x) x dx = x exp(x)|_{x=a}^{x=b} − ∫_a^b exp(x) dx
                      = x exp(x)|_{x=a}^{x=b} − exp(x)|_{x=a}^{x=b}
                      = (x − 1) exp(x)|_{x=a}^{x=b}.
It is very important to note that an unlucky choice of f and g may be misleading. For
instance, choose f'(x) = x and g(x) = exp(x). Then integration by parts gives

    ∫_a^b x exp(x) dx = (1/2) x² exp(x)|_{x=a}^{x=b} − ∫_a^b (1/2) x² exp(x) dx.

This formula is mathematically correct, but it does not lead to the explicit
determination of the integral.
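The successful choice of f and g can be verified numerically against a Riemann sum. A sketch (the limits a = 0, b = 1 and the midpoint rule are arbitrary illustrative choices):

```python
import math

a, b = 0.0, 1.0
n = 100000
h = (b - a) / n
# midpoint-rule approximation of integral_a^b x * exp(x) dx
approx = sum((a + (i + 0.5) * h) * math.exp(a + (i + 0.5) * h) for i in range(n)) * h
# (x - 1) * exp(x) evaluated between the limits
exact = (b - 1) * math.exp(b) - (a - 1) * math.exp(a)
assert abs(approx - exact) < 1e-6
```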
142                                                                                                       6 The Riemann Integral
Defining f(x) = x^{λ+1}/(λ+1), g(x) = log(x), integration by parts leads to

    ∫_a^b x^λ log(x) dx
        = x^{λ+1}/(λ+1) · log(x)|_{x=a}^{x=b} − ∫_a^b x^{λ+1}/(λ+1) · 1/x dx
        = x^{λ+1}/(λ+1) · log(x)|_{x=a}^{x=b} − ∫_a^b x^λ/(λ+1) dx
        = x^{λ+1}/(λ+1) · log(x)|_{x=a}^{x=b} − x^{λ+1}/(λ+1)²|_{x=a}^{x=b}
        = x^{λ+1}/(λ+1) · (log(x) − 1/(λ+1))|_{x=a}^{x=b}.
Example 6.26. We want to integrate $f(x) = \frac{x^5+2x^3+4x^2-3x}{(x^2+1)^2}$. First of all we decompose $f$ into a polynomial part and a strictly proper part using polynomial division:
\[
f(x) = x + \frac{4x^2-4x}{(x^2+1)^2}.
\]
The polynomial part integrates to
\[
\int x\,dx = \frac{x^2}{2} + \text{const}.
\]
The strictly proper part has a partial fraction decomposition
\[
\frac{4x^2-4x}{(x+i)^2(x-i)^2} = \frac{A}{x+i} + \frac{B}{(x+i)^2} + \frac{\overline{A}}{x-i} + \frac{\overline{B}}{(x-i)^2},
\]
and determining the coefficients (here $A = i$, $B = 1-i$) and integrating yields
\[
\int f(x)\,dx = \frac{1}{2}x^2 + 2\arctan(x) - \frac{1-i}{x+i} - \frac{1+i}{x-i} + \text{const}
= \frac{1}{2}x^2 + 2\arctan(x) - \frac{2x-2}{x^2+1} + \text{const}.
\]
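The result of Example 6.26 can be checked the same way: differentiating the computed antiderivative numerically should reproduce $f$. A sketch using only the standard library:

```python
import math

def f(x):
    return (x**5 + 2*x**3 + 4*x**2 - 3*x) / (x**2 + 1)**2

def F(x):
    # Antiderivative from Example 6.26 (constant of integration omitted).
    return x**2 / 2 + 2 * math.atan(x) - (2*x - 2) / (x**2 + 1)

# Central difference quotients of F should match f at sample points.
h = 1e-6
for x in [-2.0, -0.5, 0.3, 1.0, 4.0]:
    difference_quotient = (F(x + h) - F(x - h)) / (2 * h)
    assert abs(difference_quotient - f(x)) < 1e-5
```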
Example 6.28. a) For integrating the function $\exp(-x)$ on the interval $[0,\infty)$, we compute
\[
\int_0^\infty \exp(-x)\,dx = \lim_{b\to\infty} \int_0^b \exp(-x)\,dx
= \lim_{b\to\infty} -\exp(-x)\Big|_{x=0}^{x=b}
= 1 - \lim_{b\to\infty} \exp(-b) = 1.
\]
b) For $\alpha \in \mathbb{R}$ we have
\[
\int_1^\infty \frac{1}{x^\alpha}\,dx =
\begin{cases}
\frac{1}{\alpha-1} & : \alpha > 1,\\
\infty & : \alpha \le 1.
\end{cases}
\]
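The dichotomy can be illustrated numerically with the closed form of the truncated integral (a sketch; `truncated` is just a helper name of ours):

```python
# Truncated integral ∫_1^b x^(-alpha) dx = (b^(1-alpha) - 1)/(1 - alpha)
# for alpha != 1; its behaviour as b grows shows the dichotomy.
def truncated(alpha, b):
    return (b**(1 - alpha) - 1) / (1 - alpha)

# alpha = 2 > 1: the value approaches 1/(alpha - 1) = 1.
assert abs(truncated(2.0, 1e6) - 1.0) < 1e-5
# alpha = 1/2 <= 1: the value grows without bound.
assert truncated(0.5, 1e6) > 1e3
```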
We now define the integral of functions defined on the whole real axis.

Definition 6.29.
Let $f : \mathbb{R} \to \mathbb{R}$ be a function with the property that for all $a, b \in \mathbb{R}$ the restriction of $f$ to $[a,b]$ belongs to $R([a,b])$. If there exists some $c \in \mathbb{R}$ such that both integrals
\[
\int_c^\infty f(x)\,dx, \qquad \int_{-\infty}^c f(x)\,dx
\]
exist, then we define
\[
\int_{-\infty}^\infty f(x)\,dx := \int_{-\infty}^c f(x)\,dx + \int_c^\infty f(x)\,dx.
\]
(This value does not depend on the choice of $c$.)
b) The integral
\[
\int_{-\infty}^\infty x\,dx
\]
does not exist, since already $\int_0^\infty x\,dx = \infty$, although $\lim_{b\to\infty}\int_{-b}^b x\,dx = 0$.
The majorant criterion for series says that if the absolute values of the addends of a given series can be bounded from above by the addends of a convergent series, then (absolute) convergence of the given series can be concluded.
Conversely, the minorant criterion for series says that if the addends of a given series can be bounded from below by the addends of a series that diverges to $+\infty$, then also the given series diverges to $+\infty$.
Analogous criteria hold true for integrals. We skip the proofs since they are completely analogous to those of the majorant and minorant criteria.
6.4 Integration on Unbounded Domains and Integration of Unbounded Functions                       147
Theorem 6.31.
Let $f, g : [a,\infty) \to \mathbb{R}$ such that for all $b \in [a,\infty)$, the restrictions of $f$ and $g$ to $[a,b]$ are Riemann-integrable.

(i) If $|f(x)| \le g(x)$ for all $x \in [a,\infty)$ and $\int_a^\infty g(x)\,dx$ converges, then also $\int_a^\infty f(x)\,dx$ converges and it holds that
\[
\Big|\int_a^\infty f(x)\,dx\Big| \le \int_a^\infty |f(x)|\,dx \le \int_a^\infty g(x)\,dx.
\]

(ii) If $g(x) \le f(x)$ for all $x \in [a,\infty)$ and $\int_a^\infty g(x)\,dx = +\infty$, then $\int_a^\infty f(x)\,dx = +\infty$.
Example 6.32. a) Consider
\[
\int_1^\infty \frac{x}{x^2+1}\,dx.
\]
For $x \ge 1$ we have $x^2 + 1 \le 2x^2$ and thus
\[
\frac{x}{x^2+1} \ge \frac{1}{2x}.
\]
Since
\[
\int_1^\infty \frac{1}{2x}\,dx
\]
is divergent,
\[
\int_1^\infty \frac{x}{x^2+1}\,dx
\]
is divergent, too.
b) Consider
\[
\int_1^\infty \frac{\sqrt{x}}{x^2+1}\,dx.
\]
We have
\[
\lim_{x\to\infty} x^{3/2}\cdot\frac{\sqrt{x}}{x^2+1} = 1.
\]
For large enough $x \in \mathbb{R}$, we therefore have
\[
x^{3/2}\cdot\frac{\sqrt{x}}{x^2+1} = \frac{x^2}{x^2+1} \le 2
\]
and thus
\[
\frac{\sqrt{x}}{x^2+1} \le \frac{2}{x^{3/2}}.
\]
Since
\[
\int_1^\infty \frac{2}{x^{3/2}}\,dx
\]
is convergent,
\[
\int_1^\infty \frac{\sqrt{x}}{x^2+1}\,dx
\]
is convergent, too.
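Numerically, the truncated integrals of $\sqrt{x}/(x^2+1)$ stabilise as the upper limit grows, consistent with convergence. A rough midpoint-rule sketch (the helper name `midpoint_integral` is ours):

```python
import math

def midpoint_integral(f, a, b, n=200000):
    # Midpoint-rule approximation of the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = lambda x: math.sqrt(x) / (x**2 + 1)
I_100  = midpoint_integral(f, 1, 100)
I_1000 = midpoint_integral(f, 1, 1000)
# The tail ∫_100^1000 is bounded by ∫_100^∞ 2*x^(-3/2) dx = 0.4,
# in line with the majorant estimate above.
assert 0 < I_1000 - I_100 < 0.4
```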
Now we use integrals on unbounded domains to check whether series are convergent or
not.
Theorem 6.33. Integral Criterion for Series
Let $f : [0,\infty) \to \mathbb{R}$ be monotonically decreasing and non-negative. Then the series
\[
\sum_{k=0}^\infty f(k)
\]
converges if and only if the integral $\int_0^\infty f(x)\,dx$ converges.
Therefore
\[
\int_1^{n+2} f(x)\,dx \le \sum_{k=1}^{n+1} f(k) \le \int_0^{n+1} f(x)\,dx \le \sum_{k=0}^{n} f(k).
\]
Using this inequality, we can directly conclude that, if one of the limits (integral or sum)
as n → ∞ exists, then the other limit (sum or integral) also exists.
In case of convergence, $(f(k))_{k\in\mathbb{N}_0}$ is necessarily a zero sequence. Therefore
\[
0 \le \sum_{k=0}^{n} f(k) - \int_0^{n+1} f(x)\,dx \le \sum_{k=0}^{n} f(k) - \sum_{k=1}^{n+1} f(k) = f(0) - f(n+1).
\]
Remark:
Since the convergence of $\int_a^\infty f(x)\,dx$ does not depend on $a \in \mathbb{R}$, the above result can also be slightly generalised: the convergence of the series $\sum_{k=a}^\infty f(k)$ for $a \in \mathbb{N}$ is equivalent to that of the integral $\int_a^\infty f(x)\,dx$.
Example 6.34. a) Using the results of Example 6.28 b), we see that
\[
\sum_{k=1}^\infty \frac{1}{k^\alpha}
\]
converges if and only if $\alpha > 1$.
b) We have
\[
\int_2^\infty \frac{1}{x(\log(x))^\alpha}\,dx = \lim_{n\to\infty}
\begin{cases}
\frac{(\log(x))^{1-\alpha}}{1-\alpha}\Big|_{x=2}^{x=n} & : \alpha \ne 1,\\[4pt]
\log(\log(x))\Big|_{x=2}^{x=n} & : \alpha = 1
\end{cases}
=
\begin{cases}
-\frac{(\log(2))^{1-\alpha}}{1-\alpha} & : \alpha > 1,\\[4pt]
\infty & : \alpha \le 1.
\end{cases}
\]
Hence, by the integral criterion, $\sum_{k=2}^\infty \frac{1}{k(\log(k))^\alpha}$ converges if and only if $\alpha > 1$.
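The interplay between partial sums and integrals behind the criterion can be observed directly; here is a sketch for $f(x) = 1/x^2$, using the standard estimate $\int_1^{N+1} f(x)\,dx \le \sum_{k=1}^N f(k) \le f(1) + \int_1^N f(x)\,dx$:

```python
import math

# Integral bounds for the partial sums of the series sum 1/k^2:
#   ∫_1^{N+1} x^(-2) dx <= sum_{k=1}^N 1/k^2 <= 1 + ∫_1^N x^(-2) dx
N = 1000
partial_sum = sum(1 / k**2 for k in range(1, N + 1))
lower = 1 - 1 / (N + 1)   # value of ∫_1^{N+1} x^(-2) dx
upper = 1 + (1 - 1 / N)   # 1 + value of ∫_1^N x^(-2) dx
assert lower <= partial_sum <= upper
# Consistent with convergence: the limit of the series is pi^2/6.
assert abs(partial_sum - math.pi**2 / 6) < 2e-3
```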
Example 6.36. a) For integrating the function $\log(x)$ from $0$ to $1$, we first compute
\[
\int_\varepsilon^1 \log(x)\,dx = \int_\varepsilon^1 1\cdot\log(x)\,dx
= x\log(x)\Big|_{x=\varepsilon}^{x=1} - \int_\varepsilon^1 x\cdot\frac{1}{x}\,dx
= x(\log(x)-1)\Big|_{x=\varepsilon}^{x=1}.
\]
Therefore
\[
\begin{aligned}
\int_0^1 \log(x)\,dx &= \lim_{\varepsilon\searrow 0}\int_\varepsilon^1 \log(x)\,dx
= \log(1) - 1 - \lim_{\varepsilon\searrow 0}\varepsilon(\log(\varepsilon)-1)\\
&= -1 - \lim_{\varepsilon\searrow 0}\varepsilon\log(\varepsilon)
= -1 - \lim_{\varepsilon\searrow 0}\frac{\log(\varepsilon)}{\frac{1}{\varepsilon}}
= -1 - \lim_{\varepsilon\searrow 0}\frac{\frac{1}{\varepsilon}}{-\frac{1}{\varepsilon^2}} = -1.
\end{aligned}
\]
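The value $-1$ can be confirmed numerically; the midpoint rule never samples the singular endpoint $x = 0$, so it handles this improper integral gracefully. A rough sketch:

```python
import math

# Midpoint-rule approximation of the improper integral ∫_0^1 log(x) dx.
# The midpoint rule never evaluates at x = 0, where log(x) is singular.
n = 100000
h = 1.0 / n
approx = h * sum(math.log((i + 0.5) * h) for i in range(n))
assert abs(approx - (-1.0)) < 1e-4
```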
Again we can formulate a majorant and a minorant criterion for the convergence of integrals of unbounded functions.

Theorem 6.37.
Let $f, g : (a,b] \to \mathbb{R}$ such that for all $\varepsilon > 0$, the restrictions of $f$ and $g$ to $[a+\varepsilon, b]$ are Riemann-integrable.

(i) If $|f(x)| \le g(x)$ for all $x \in (a,b]$ and $\int_a^b g(x)\,dx$ converges, then also $\int_a^b f(x)\,dx$ converges.
(ii) If $g(x) \le f(x)$ for all $x \in (a,b]$ and $\int_a^b g(x)\,dx = +\infty$, then $\int_a^b f(x)\,dx = +\infty$.
\[
\lim_{b\to\infty} \frac{\sin(x)}{\sqrt{x}}\Big|_{x=1}^{x=b} = \lim_{b\to\infty}\frac{\sin(b)}{\sqrt{b}} - \sin(1) = -\sin(1).
\]
Therefore, the integral $\int_0^\infty \frac{\cos(x)}{\sqrt{x}}\,dx$ converges.
Example 6.39.
\[
\int_{-1}^1 \frac{1}{\sqrt{|x|}}\,dx
= \lim_{\varepsilon\nearrow 0}\int_{-1}^\varepsilon \frac{1}{\sqrt{|x|}}\,dx + \lim_{\varepsilon\searrow 0}\int_\varepsilon^1 \frac{1}{\sqrt{|x|}}\,dx
= \lim_{\varepsilon\nearrow 0} -2\sqrt{-x}\,\Big|_{x=-1}^{x=\varepsilon} + \lim_{\varepsilon\searrow 0} 2\sqrt{x}\,\Big|_{x=\varepsilon}^{x=1} = 4.
\]
where the integrand $f$ additionally depends on some free parameter $x \in I$, where $I$ is some real interval. More precisely, the function $f : I \times [a,b] \to \mathbb{R}$ has the property that for each fixed $x \in I$, the function $f(x,\cdot) : [a,b] \to \mathbb{R}$, $y \mapsto f(x,y)$ is Riemann-integrable.
We will investigate continuity and differentiability of the integral function
\[
F : I \to \mathbb{R}, \quad x \mapsto \int_a^b f(x,y)\,dy.
\]
Note that a uniformly continuous function is continuous. In general the converse is not true, but if the domain of a continuous function is compact, then it is automatically uniformly continuous.
   Theorem 6.41.
   Let D ⊂ Rn , n ∈ N, be compact. A continuous function f : D → Rm , m ∈ N, is
   uniformly continuous.
Proof: Suppose that $f$ is not uniformly continuous. Then there exists an $\varepsilon > 0$ such that for each $n \in \mathbb{N}$ there are $x_n, y_n \in D$ with $||x_n - y_n|| < \frac{1}{n}$ and $||f(x_n) - f(y_n)|| > \varepsilon$. Since $D$ is compact, there are convergent subsequences $(x_{n_k})_{k\in\mathbb{N}}$ and $(y_{n_k})_{k\in\mathbb{N}}$ of $(x_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}}$ which necessarily converge to the same limit $x \in D$. By continuity of $f$, both $f(x_{n_k})$ and $f(y_{n_k})$ converge to $f(x)$, so $||f(x_{n_k}) - f(y_{n_k})|| \to 0$, contradicting $||f(x_{n_k}) - f(y_{n_k})|| > \varepsilon$. □
Proof:
Since f is continuous, for each fixed x ∈ I, the function f (x, ·) : [a, b] → R, y 7→ f (x, y) is
continuous too and therefore integrable on [a, b]. Hence F is well-defined. Now let x0 ∈ I
and I0 ⊂ I be a compact interval that contains x0 . If x0 is an inner point of I then we
can also choose I0 such that x0 is an inner point of I0 . Now I0 × [a, b] is compact in R2
and hence $f$ is uniformly continuous on $I_0 \times [a,b]$ by Theorem 6.41. This means that for given $\varepsilon > 0$ there is a $\delta > 0$ such that $|f(x,y) - f(x_0,y)| < \varepsilon$ for all $x \in I_0$ with $|x - x_0| < \delta$ and all $y \in [a,b]$.
Thus for $x \in I_0$ with $|x - x_0| < \delta$ we have
\[
|F(x) - F(x_0)| = \Big|\int_a^b \big(f(x,y) - f(x_0,y)\big)\,dy\Big| \le \int_a^b |f(x,y) - f(x_0,y)|\,dy \le \varepsilon(b-a).
\]
Therefore $F$ is continuous at $x_0$. □
Proof:
Let $x_0 \in I$. Then for $x \in I\setminus\{x_0\}$ and each $y \in [a,b]$, the mean value theorem applied to $f(\cdot,y)$ provides a $\xi_{x,y}$ between $x$ and $x_0$ such that
\[
\frac{F(x) - F(x_0)}{x - x_0} = \int_a^b \frac{f(x,y) - f(x_0,y)}{x - x_0}\,dy = \int_a^b \frac{\partial f}{\partial x}(\xi_{x,y}, y)\,dy.
\]
Example 6.44. Consider
\[
F(x) := \int_1^\pi \frac{\sin(tx)}{t}\,dt.
\]
Differentiation under the integral sign gives
\[
F'(x) = \int_1^\pi \cos(tx)\,dt \qquad\text{and}\qquad F''(x) = -\int_1^\pi t\sin(tx)\,dt.
\]
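Differentiation under the integral sign can be tested numerically: a difference quotient of $F$ should match the integral of $\cos(tx)$. A sketch with a simple midpoint rule (the helper name `midpoint_integral` is ours):

```python
import math

def midpoint_integral(g, a, b, n=20000):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

F  = lambda x: midpoint_integral(lambda t: math.sin(t * x) / t, 1.0, math.pi)
dF = lambda x: midpoint_integral(lambda t: math.cos(t * x), 1.0, math.pi)

# The difference quotient of F matches differentiation under the integral.
x, h = 0.7, 1e-5
assert abs((F(x + h) - F(x - h)) / (2 * h) - dF(x)) < 1e-5
```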
For parameter-dependent improper integrals, analogous statements for continuity and differentiability hold true, which we state without proof.

Definition 6.45.
Let $a \in \mathbb{R}$, $I \subset \mathbb{R}$ be an interval and $f : I \times [a,\infty) \to \mathbb{R}$. Suppose that for each fixed $x \in I$ the improper integral $\int_a^\infty f(x,y)\,dy$ exists. Then the integral $\int_a^\infty f(x,y)\,dy$ is called uniformly convergent if for each $\varepsilon > 0$ there is a constant $K > a$ such that
\[
\Big|\int_{b_1}^{b_2} f(x,y)\,dy\Big| < \varepsilon
\]
for all $x \in I$ and all $b_1, b_2 \ge K$.
Theorem 6.46.
Let $a \in \mathbb{R}$ and $I \subset \mathbb{R}$ be an interval. If $f : I \times [a,\infty) \to \mathbb{R}$ is continuous and continuously differentiable with respect to the first variable $x$, and if the integrals
\[
\int_a^\infty f(x,y)\,dy \quad\text{and}\quad \int_a^\infty \frac{\partial f}{\partial x}(x,y)\,dy
\]
converge uniformly on each compact subset of $I$, then
\[
F : I \to \mathbb{R}, \quad x \mapsto \int_a^\infty f(x,y)\,dy
\]
is continuously differentiable with $F'(x) = \int_a^\infty \frac{\partial f}{\partial x}(x,y)\,dy$.
the graph of $f$ is rotated around the $x$-axis, then the enclosed volume $V_{\mathrm{rot}}$ can easily be computed by
\[
V_{\mathrm{rot}} = \int_a^b \pi f(x)^2\,dx = \pi\int_a^b f(x)^2\,dx.
\]
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,
\]
with semi-axes $a, b > 0$. In this case $f : [-a,a] \to \mathbb{R}$, $x \mapsto b\sqrt{1 - \left(\frac{x}{a}\right)^2}$. Hence
\[
V_{\mathrm{rot}} = \pi\int_{-a}^a f(x)^2\,dx = \pi\int_{-a}^a b^2\Big(1 - \Big(\frac{x}{a}\Big)^2\Big)\,dx
= \pi b^2\Big(x - \frac{1}{3a^2}x^3\Big)\Big|_{x=-a}^{x=a} = \frac{4}{3}\pi a b^2.
\]
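The value $\frac{4}{3}\pi a b^2$ can be reproduced numerically from the volume formula (a sketch; `vrot` is our helper name):

```python
import math

def vrot(f, a, b, n=100000):
    # Midpoint-rule approximation of V_rot = pi * ∫_a^b f(x)^2 dx.
    h = (b - a) / n
    return math.pi * h * sum(f(a + (i + 0.5) * h)**2 for i in range(n))

# Ellipsoid of revolution with semi-axes a = 2 and b = 1.5.
a, b = 2.0, 1.5
f = lambda x: b * math.sqrt(1 - (x / a)**2)
assert abs(vrot(f, -a, a) - 4/3 * math.pi * a * b**2) < 1e-6
```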
Now we will derive a formula for computing the lateral surface Mrot of solids of revolution.
First recall that the lateral surface MC of a cone with circular ground face of radius r and
lateral height l is given by
                                         MC = πrl .
Thus the lateral surface $M_{tC}$ of a truncated cone with circular ground face of radius $r_1$, circular top face of radius $r_2 < r_1$ and lateral height $l$ is given by
\[
M_{tC} = \pi(r_1 l_1 - r_2 l_2) = \pi(r_1 + r_2)\,l,
\]
where $l_1$ is the lateral height of the complete cone and $l_2$ that of its truncated top. (Recall that $\frac{r_1}{l_1} = \frac{r_2}{l_2}$.)
Now let f : [a, b] → R≥0 be a non-negative function which is continuously differentiable
on (a, b). We want to approximate its lateral surface area by summing up lateral surfaces
of certain truncated cones. Precisely, consider a decomposition Z: a = x0 < x1 < ... <
xn−1 < xn = b of [a, b] and define yi := f (xi ), ∆xi = xi+1 − xi and ∆yi = yi+1 − yi . Then
the sum $M(Z)$ of all lateral surfaces of the $n$ truncated cones with circular faces of radii $y_i, y_{i+1}$ and lateral heights $l_i := \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2}$, $i = 0, \dots, n-1$, is given by
\[
M(Z) = \sum_{i=0}^{n-1} \pi(y_i + y_{i+1})\sqrt{(\Delta x_i)^2 + (\Delta y_i)^2}
= 2\pi\sum_{i=0}^{n-1} \frac{y_i + y_{i+1}}{2}\cdot\sqrt{1 + \Big(\frac{\Delta y_i}{\Delta x_i}\Big)^2}\cdot\Delta x_i.
\]
Now, roughly speaking, for $\Delta x_i \to 0$ the right-hand side converges to the integral
\[
M_{\mathrm{rot}} = 2\pi\int_a^b f(x)\sqrt{1 + f'(x)^2}\,dx.
\]
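As a check, rotating $f(x) = \sqrt{r^2 - x^2}$ gives a sphere, whose surface area $4\pi r^2$ the formula reproduces; here the integrand $f\sqrt{1+f'^2}$ is even constant. A sketch (the helper name is ours; the midpoint rule conveniently avoids the endpoint singularity of $f'$):

```python
import math

def lateral_surface(f, df, a, b, n=100000):
    # Midpoint rule for M_rot = 2*pi * ∫_a^b f(x)*sqrt(1 + f'(x)^2) dx.
    h = (b - a) / n
    return 2 * math.pi * h * sum(
        f(x) * math.sqrt(1 + df(x)**2)
        for x in (a + (i + 0.5) * h for i in range(n))
    )

# Sphere of radius r: f(x) = sqrt(r^2 - x^2) on [-r, r].
r = 2.0
f  = lambda x: math.sqrt(r * r - x * x)
df = lambda x: -x / math.sqrt(r * r - x * x)
assert abs(lateral_surface(f, df, -r, r) - 4 * math.pi * r**2) < 1e-6
```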
c) A $C^1$-curve $c \in C^1([a,b], \mathbb{R}^n)$ is called smooth or regular if for all $t \in [a,b]$ we have
\[
c'(t) = (c_1'(t), \dots, c_n'(t)) \ne 0.
\]
Example 6.51.       a) The curve c : [0, 2π] → R2 , t 7→ (r cos(t), r sin(t))T , r > 0, de-
    scribes a circle in R2 of radius r. It is a closed C 1 -curve with derivative c0 (t) =
    (−r sin(t), r cos(t))T . Since cosine and sine do not have common roots, c0 (t) 6= 0 for
    all t so that c is also smooth.
b) The curve $c : [0,T] \to \mathbb{R}^2$, $t \mapsto (rt - a\sin(t), r - a\cos(t))^T$, $T, a, r > 0$, is called a cycloid. It is a $C^1$-curve with derivative $c'(t) = (r - a\cos(t), a\sin(t))^T$. If $a = r$, then $c'(2\pi k) = 0$ for all $k \in \mathbb{N}$. Hence $c$ is not smooth in this case.
6.6 Solids of revolution, path integrals                                                                    157
$\tilde{c} : [\alpha,\beta] \to \mathbb{R}^n$, $\tau \mapsto c(h(\tau))$
has the same shape and the same oriented direction. In this case the function $h$ is called a reparametrisation. For $C^1$-curves, only $C^1$-reparametrisations $h$ with $h' > 0$ are permitted.
In general, curves $c_1$ and $c_2$ which differ only by a reparametrisation are considered "equal".
Now we want to compute the length of a $C^1$-curve $c : [a,b] \to \mathbb{R}^n$. This is done by approximation by polygonal paths. For a given decomposition $Z := \{a = t_0 < t_1 < \dots < t_m = b\}$, $m \in \mathbb{N}$, of the interval $[a,b]$, the length $L(Z)$ of the polygonal path with corners $c(t_i)$ is given by
\[
L(Z) = \sum_{i=0}^{m-1} ||c(t_{i+1}) - c(t_i)|| = \sum_{i=0}^{m-1} \Big|\Big|\frac{c(t_{i+1}) - c(t_i)}{t_{i+1} - t_i}\Big|\Big| \cdot (t_{i+1} - t_i).
\]
If the right-hand side converges for $t_{i+1} - t_i \to 0$, then the limit is the curve length $L(c)$, which is given by
\[
L(c) = \int_a^b ||c'(t)||\,dt.
\]
Definition 6.52.
If the set $\{L(Z) \mid Z \text{ is a decomposition of } [a,b]\}$ is bounded from above, then the curve $c : [a,b] \to \mathbb{R}^n$ is called rectifiable, and
\[
L(c) := \sup\{L(Z) \mid Z \text{ is a decomposition of } [a,b]\}
\]
is called the length of $c$.
Theorem 6.53.
Each $C^1$-curve $c$ is rectifiable and
\[
L(c) = \int_a^b ||c'(t)||\,dt.
\]
Proof: Let $Z = \{a = t_0 < t_1 < \dots < t_m = b\}$, $m \in \mathbb{N}$, be a decomposition of the interval $[a,b]$. Using the mean value theorem for each component $c_k$, we find $\tau_{kj} \in (t_j, t_{j+1})$ such that
\[
L(Z) = \sum_{j=0}^{m-1} \sqrt{\sum_{k=1}^{n} (c_k(t_{j+1}) - c_k(t_j))^2}
= \sum_{j=0}^{m-1} \sqrt{\sum_{k=1}^{n} c_k'(\tau_{kj})^2}\,(t_{j+1} - t_j).
\]
Set $R(Z) := \sum_{j=0}^{m-1} ||c'(t_j)||\,(t_{j+1} - t_j)$, a Riemann sum for $\int_a^b ||c'(t)||\,dt$.
We will estimate $|L(Z) - R(Z)|$. Let $\varepsilon > 0$. Since $c_k'$ is uniformly continuous on $[a,b]$, there is a $\delta > 0$ such that $|c_k'(\tilde{t}) - c_k'(t)| < \varepsilon$ for all $t, \tilde{t} \in [a,b]$ with $|t - \tilde{t}| < \delta$ and all $k = 1, \dots, n$. Thus if $Z$ fulfils $||Z|| < \delta$, then
\[
\begin{aligned}
|L(Z) - R(Z)| &= \Big|\sum_{j=0}^{m-1} \big(||c'(\tau_j)|| - ||c'(t_j)||\big)(t_{j+1} - t_j)\Big|\\
&\le \sum_{j=0}^{m-1} \big|\,||c'(\tau_j)|| - ||c'(t_j)||\,\big|\,(t_{j+1} - t_j)\\
&\le \sum_{j=0}^{m-1} ||c'(\tau_j) - c'(t_j)||\,(t_{j+1} - t_j)
\le \sqrt{n}\,\varepsilon\,(b-a) \longrightarrow 0 \quad (\varepsilon\searrow 0),
\end{aligned}
\]
where $\tau_j := (\tau_{1j}, \dots, \tau_{nj})$ and $c'(\tau_j) := (c_1'(\tau_{1j}), \dots, c_n'(\tau_{nj}))$.
Since $R(Z) \to \int_a^b ||c'(t)||\,dt$ for $||Z|| \to 0$, this also holds for $L(Z)$. □
Example 6.54. The length of a cycloid $c(t) = (r(t - \sin(t)), r(1 - \cos(t)))^T$, $0 \le t \le 2\pi$, can be calculated as follows:
\[
||c'(t)|| = r\sqrt{(1-\cos(t))^2 + \sin(t)^2} = r\sqrt{2 - 2\cos(t)} = 2r\,\Big|\sin\Big(\frac{t}{2}\Big)\Big|,
\]
hence
\[
L(c) = \int_0^{2\pi} 2r\sin\Big(\frac{t}{2}\Big)\,dt = -4r\cos\Big(\frac{t}{2}\Big)\Big|_{t=0}^{t=2\pi} = 8r.
\]
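The value $8r$ is easy to confirm numerically from the length formula (a sketch; `curve_length` is our helper name):

```python
import math

def curve_length(dc, a, b, n=100000):
    # Midpoint rule for L(c) = ∫_a^b ||c'(t)|| dt.
    h = (b - a) / n
    return h * sum(math.hypot(*dc(a + (i + 0.5) * h)) for i in range(n))

# Cycloid c(t) = (r(t - sin t), r(1 - cos t)) for 0 <= t <= 2*pi.
r = 1.5
dc = lambda t: (r * (1 - math.cos(t)), r * math.sin(t))
assert abs(curve_length(dc, 0.0, 2 * math.pi) - 8 * r) < 1e-4
```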
Definition 6.55.
Let $c : [a,b] \to \mathbb{R}^n$ be a $C^1$-curve. The function
\[
S : [a,b] \to \mathbb{R}, \quad t \mapsto \int_a^t ||c'(\tau)||\,d\tau
\]
is called the arc length function of $c$.
If c is a smooth C 1 -curve, then the arc length function S : [a, b] → [0, L(c)] is a C 1 -
function. In particular, the inverse function S −1 : [0, L(c)] → [a, b] exists and is a C 1 -
reparametrisation. The parametrisation c̃ := c◦S −1 is called parametrisation with respect
to the arc length.
Its derivative is given by
\[
\tilde{c}'(s) = c'(S^{-1}(s)) \cdot \frac{1}{||c'(S^{-1}(s))||}.
\]
Obviously, this is a vector in $\mathbb{R}^n$ of length one. This means that the "speed" of the curve is always constant one and that $\tilde{c}'(s)$ is the unit tangent vector to the curve at the point $t = S^{-1}(s)$.
Differentiation of $1 = ||\tilde{c}'(s)||^2 = \langle \tilde{c}'(s), \tilde{c}'(s)\rangle$ yields
\[
0 = \big(\langle \tilde{c}'(s), \tilde{c}'(s)\rangle\big)' = \Big(\sum_{k=1}^{n} \tilde{c}_k'(s)^2\Big)'
= \sum_{k=1}^{n} 2\,\tilde{c}_k'(s)\,\tilde{c}_k''(s) = 2\langle \tilde{c}''(s), \tilde{c}'(s)\rangle.
\]
This means that the acceleration vector $\tilde{c}''(s)$ is perpendicular to the velocity vector $\tilde{c}'(s)$. The vector
\[
n(s) := \frac{\tilde{c}''(s)}{||\tilde{c}''(s)||}
\]
is called the unit normal vector, and $||\tilde{c}''(s)||$ is called the curvature of $c(t)$ at the point $t = S^{-1}(s)$.
Finally, the plane spanned by the tangent vector $\tilde{c}'(s)$ and the normal vector $\tilde{c}''(s)$ is called the osculating plane of $c(t)$ at the point $t = S^{-1}(s)$.
Example 6.56. For a curve $c : [a,b] \to \mathbb{R}^2$, $t \mapsto (t, y(t))^T$ corresponding to the graph of a $C^2$-function $y : [a,b] \to \mathbb{R}$, the curvature at the point $t$ is given by
\[
\kappa(t) = \frac{|y''(t)|}{\big(1 + y'(t)^2\big)^{3/2}}.
\]
Now, for a given $C^1$-curve $c : [a,b] \to \mathbb{R}^2$, we want to compute the signed area $A(c)$ consisting of all points "between" the curve and the origin, i.e. all points $P = \lambda c(t)$, $t \in [a,b]$, $\lambda \in [0,1]$. The area $A(c)$ is called the area enclosed by the curve $c$.
If $Z = \{a = t_0, t_1, \dots, t_{m-1}, t_m = b\}$, $m \in \mathbb{N}$, is a decomposition of $[a,b]$, then $A(c)$ can be approximated by the sum of the signed areas $A_i$ of all triangles with corners $c(t_i)$, $c(t_{i+1})$, $0_{\mathbb{R}^2}$, $i = 0, \dots, m-1$. The absolute values of these triangle areas can easily be calculated using the cross product:
\[
|A_i| = \frac{1}{2}\big|\big|(c_1(t_i), c_2(t_i), 0)^T \times (c_1(t_{i+1}), c_2(t_{i+1}), 0)^T\big|\big| = \frac{1}{2}\big|c_1(t_i)c_2(t_{i+1}) - c_1(t_{i+1})c_2(t_i)\big|,
\]
and the signed areas are
\[
A_i = \frac{1}{2}\big(c_1(t_i)c_2(t_{i+1}) - c_1(t_{i+1})c_2(t_i)\big).
\]
Setting Δtᵢ := t_{i+1} − tᵢ and Δc_{j,i} := c_j(t_{i+1}) − c_j(tᵢ), j = 1, 2, i = 0, ..., m − 1, the sum A(Z)
of all signed triangle areas is
\[
A(Z) := \frac12 \sum_{i=0}^{m-1} \big( c_1(t_i)\, c_2(t_{i+1}) - c_1(t_{i+1})\, c_2(t_i) \big)
      = \frac12 \sum_{i=0}^{m-1} \frac{c_1(t_i)\, c_2(t_{i+1}) - c_1(t_{i+1})\, c_2(t_i)}{\Delta t_i} \cdot \Delta t_i
      = \frac12 \sum_{i=0}^{m-1} \Big( c_1(t_i)\, \frac{\Delta c_{2,i}}{\Delta t_i} - c_2(t_i)\, \frac{\Delta c_{1,i}}{\Delta t_i} \Big) \cdot \Delta t_i .
\]
Example 6.57. For given a, b > 0 and t₁, t₂ ∈ [0, 2π], t₁ < t₂, we want to compute the area
A_{t₁,t₂} of the corresponding sector of the ellipse parameterised by c(t) = (a cos(t), b sin(t))ᵀ:
\[
A_{t_1,t_2} = \frac12 \int_{t_1}^{t_2} ab \big( \cos^2(t) + \sin^2(t) \big) \,dt
            = \frac{ab}{2} \int_{t_1}^{t_2} 1 \,dt
            = \frac{ab\,(t_2 - t_1)}{2} .
\]
In particular, if a = b =: r, then the sector of the circle has the area (t₂ − t₁)r²/2.
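As a quick sanity check (function names are mine), the triangle sum A(Z) from above can be evaluated numerically for the ellipse and compared with the exact sector area ab(t₂ − t₁)/2:

```python
import math

def area_triangle_sum(c, a, b, m=20000):
    """Signed area enclosed by the curve c and the origin,
    approximated by the sum A(Z) of signed triangle areas."""
    ts = [a + (b - a) * i / m for i in range(m + 1)]
    total = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        x0, y0 = c(t0)
        x1, y1 = c(t1)
        total += 0.5 * (x0 * y1 - x1 * y0)  # cross-product formula for A_i
    return total

A, B = 3.0, 2.0  # semi-axes of the ellipse
ellipse = lambda t: (A * math.cos(t), B * math.sin(t))
t1, t2 = 0.3, 1.7
exact = A * B * (t2 - t1) / 2  # Example 6.57
assert abs(area_triangle_sum(ellipse, t1, t2) - exact) < 1e-6
```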
The final topic of this section is so-called curve integrals. Consider the following problem:
given a curved wire with inhomogeneous density, we want to determine its total mass by
integration. Assume that the position of the wire is parameterised by a C¹-curve
c : [a, b] → Rⁿ, n := 3. The density ρ(c(t)) of the wire at the point c(t) is defined as
\[
\rho(c(t)) := \frac{\text{mass}}{\text{length unit}} .
\]
6.6 Solids of revolution, path integrals
Now, in order to compute the total mass of the wire, we consider a decomposition Z =
{a = t₀, t₁, ..., t_{m−1}, t_m = b} of the interval [a, b] and approximate the density in the
interval [tᵢ, t_{i+1}] by the constant value ρ(c(tᵢ)) [= density at the left boundary point c(tᵢ)].
Furthermore, the length of the wire in the interval [tᵢ, t_{i+1}] is approximated by the
length of the straight line between c(tᵢ) and c(t_{i+1}) which, by the mean value theorem, is
\[
\| c(t_{i+1}) - c(t_i) \|
= \sqrt{\sum_{k=1}^{n} \big( c_k(t_{i+1}) - c_k(t_i) \big)^2}
= \sqrt{\sum_{k=1}^{n} c_k'(\tau_{k,i})^2}\; (t_{i+1} - t_i)
\]
for suitable τ_{k,i} ∈ [tᵢ, t_{i+1}]. Thus the total mass of the wire is approximated by
\[
\sum_{i=0}^{m-1} \rho(c(t_i))\, \| c(t_{i+1}) - c(t_i) \|
= \sum_{i=0}^{m-1} \rho(c(t_i)) \sqrt{\sum_{k=1}^{n} c_k'(\tau_{k,i})^2}\; (t_{i+1} - t_i) .
\]
For ||Z|| → 0 this sum converges to ∫ₐᵇ ρ(c(t)) ||c′(t)|| dt, so this integral
is used instead.
The path integral of the first kind is invariant with respect to reparametrisations, since
for a C¹-reparametrisation h : [α, β] → [a, b] with h′ > 0 holds
\[
\int_{c \circ h} f(x) \,ds
= \int_{\alpha}^{\beta} f\big( (c \circ h)(\tau) \big)\, \| (c \circ h)'(\tau) \| \,d\tau
= \int_{\alpha}^{\beta} f\big( c(h(\tau)) \big)\, \| c'(h(\tau))\, h'(\tau) \| \,d\tau
\overset{h' > 0}{=} \int_{\alpha}^{\beta} f\big( c(h(\tau)) \big)\, \| c'(h(\tau)) \|\, h'(\tau) \,d\tau
= \int_{a}^{b} f(c(t))\, \| c'(t) \| \,dt \qquad (\text{substitution } t := h(\tau))
= \int_{c} f(x) \,ds .
\]
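The invariance can be illustrated numerically. The helper below (names and the midpoint quadrature are my choice) approximates ∫_c f ds with c′ obtained by central differences, and evaluates the same integral for a reparametrised curve:

```python
import math

def path_integral(f, c, a, b, m=20000):
    """Approximate the path integral of the first kind,
    int_c f ds = int_a^b f(c(t)) ||c'(t)|| dt,
    by the midpoint rule; c'(t) via central finite differences."""
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        d = 1e-6
        p0, p1 = c(t - d), c(t + d)
        speed = math.hypot((p1[0] - p0[0]) / (2 * d), (p1[1] - p0[1]) / (2 * d))
        total += f(c(t)) * speed * h
    return total

# density growing with height on the upper unit half circle
rho = lambda p: 1.0 + p[1]
circle = lambda t: (math.cos(t), math.sin(t))
mass = path_integral(rho, circle, 0.0, math.pi)
# reparametrised by h(tau) = tau**2, which has h' > 0 on (0, sqrt(pi))
reparam = lambda tau: circle(tau**2)
mass2 = path_integral(rho, reparam, 0.0, math.sqrt(math.pi))
assert abs(mass - mass2) < 1e-4
assert abs(mass - (math.pi + 2.0)) < 1e-4  # exact: int_0^pi (1 + sin t) dt
```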
162                                                                                    6 The Riemann Integral
Example 6.59 (center of gravity). For a system of N mass points with point masses mᵢ
at positions xᵢ ∈ Rⁿ the center of gravity x_s ∈ Rⁿ is given by
\[
x_s = \frac{\sum_{i=1}^{N} m_i x_i}{\sum_{i=1}^{N} m_i} .
\]
For computing the center of gravity x_s of a wire, the total mass of the piece of wire
between two points c(tᵢ) and c(t_{i+1}) is approximated by ρ(c(tᵢ)) ||c(t_{i+1}) − c(tᵢ)||. Thus
the approximation of x_s reads
\[
x_s \approx \frac{\sum_{i=0}^{m-1} \rho(c(t_i))\, \big\| \tfrac{c(t_{i+1}) - c(t_i)}{\Delta t_i} \big\|\, c(t_i)\, \Delta t_i}
                 {\sum_{i=0}^{m-1} \rho(c(t_i))\, \big\| \tfrac{c(t_{i+1}) - c(t_i)}{\Delta t_i} \big\|\, \Delta t_i} .
\]
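A minimal numerical sketch of this approximation (helper names are mine): for a half-circle wire of radius 1 with constant density, the discrete center of gravity should approach the classical value (0, 2/π).

```python
import math

def wire_center_of_gravity(rho, c, a, b, m=20000):
    """Discrete approximation of the center of gravity of a wire,
    weighting each point c(t_i) by rho(c(t_i)) ||c(t_{i+1}) - c(t_i)||."""
    ts = [a + (b - a) * i / m for i in range(m + 1)]
    sx = sy = total = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        p0, p1 = c(t0), c(t1)
        w = rho(p0) * math.hypot(p1[0] - p0[0], p1[1] - p0[1])
        sx += w * p0[0]
        sy += w * p0[1]
        total += w
    return sx / total, sy / total

half_circle = lambda t: (math.cos(t), math.sin(t))
xs, ys = wire_center_of_gravity(lambda p: 1.0, half_circle, 0.0, math.pi)
# classical result: the centroid of a half-circle wire of radius r is (0, 2r/pi)
assert abs(xs) < 1e-3 and abs(ys - 2 / math.pi) < 1e-3
```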
Example 6.60 (moment of inertia). If a mass point with mass m rotates around an axis
at distance r with angular velocity ω, then its kinetic energy is
\[
E_{\mathrm{kin}} = \frac12 m v^2 = \frac12 m r^2 \omega^2 = \frac12 \theta \omega^2 .
\]
The term θ := mr² is called the moment of inertia of the mass point with respect to the given
axis of rotation. For a system of N mass points with masses mᵢ and distances rᵢ to the
axis of rotation, the single moments of inertia θᵢ = mᵢrᵢ² simply add up to a total moment
of inertia
\[
\theta = \sum_{i=1}^{N} m_i r_i^2 .
\]
Here r(c(t)) denotes the orthogonal distance of c(t) to the axis of rotation. For ||Z|| → 0
the right-hand side converges to
\[
\theta = \int_{a}^{b} \rho(c(t))\, \| c'(t) \|\, r^2(c(t)) \,dt = \int_{c} \rho(x)\, r^2(x) \,ds .
\]
6.7 Fourier series
For example, if the wire has constant density ρ and is placed along a straight line of length
l > 0 in the (x, z)-plane that encloses an angle α with the x-axis, then r(c(t)) = t sin(α) and
\[
\theta_{x\text{-axis}} = \int_{0}^{l} \rho \cdot 1 \cdot \big( t \sin(\alpha) \big)^2 \,dt = \frac13\, l^3 \rho \sin^2(\alpha) .
\]
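A short numerical check of this example (helper names are mine): evaluating θ = ∫ ρ ||c′|| r² dt by the midpoint rule for the straight wire reproduces l³ρ sin²(α)/3.

```python
import math

def wire_moment_of_inertia(rho, c, r, a, b, m=20000):
    """theta = int_a^b rho(c(t)) ||c'(t)|| r(c(t))**2 dt  (midpoint rule,
    c'(t) via central finite differences)."""
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        d = 1e-6
        p0, p1 = c(t - d), c(t + d)
        speed = math.hypot((p1[0] - p0[0]) / (2 * d), (p1[1] - p0[1]) / (2 * d))
        total += rho(c(t)) * speed * r(c(t)) ** 2 * h
    return total

l, alpha, dens = 2.0, 0.7, 1.0
# straight wire in the (x, z)-plane enclosing the angle alpha with the x-axis
line = lambda t: (t * math.cos(alpha), t * math.sin(alpha))
dist = lambda p: abs(p[1])  # orthogonal distance to the x-axis
theta = wire_moment_of_inertia(lambda p: dens, line, dist, 0.0, l)
assert abs(theta - dens * l**3 * math.sin(alpha)**2 / 3) < 1e-6
```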
For example, sin(t), cos(t) and exp(it) = cos(t) + i sin(t) are 2π-periodic functions. If a
function f is T-periodic, then it is also kT-periodic for each k ∈ Z. Without proof we
mention that if f is a nonconstant continuous periodic function, then there always exists
a smallest positive period T > 0 of f.
   Definition 6.62.
   A series of the form
   \[
   f(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \big( a_k \cos(k\omega t) + b_k \sin(k\omega t) \big)
   \]
   is called a trigonometric series.

For the partial sums, Euler's formula e^{ix} = cos(x) + i sin(x) gives
\[
f_n(t) = \frac{a_0}{2} + \sum_{k=1}^{n} \big( a_k \cos(k\omega t) + b_k \sin(k\omega t) \big)
       = \frac{a_0}{2} + \sum_{k=1}^{n} \Big( \frac{a_k}{2} \big( e^{ik\omega t} + e^{-ik\omega t} \big) + \frac{b_k}{2i} \big( e^{ik\omega t} - e^{-ik\omega t} \big) \Big)
       = \frac{a_0}{2} + \sum_{k=1}^{n} \Big( \frac{a_k - i b_k}{2}\, e^{ik\omega t} + \frac{a_k + i b_k}{2}\, e^{-ik\omega t} \Big)
       = \sum_{k=-n}^{n} \gamma_k e^{ik\omega t}
\]
with
\[
\gamma_0 := \frac{a_0}{2} , \qquad (6.4)
\]
\[
\gamma_k := \frac{a_k - i b_k}{2} , \qquad (6.5)
\]
\[
\gamma_{-k} := \frac{a_k + i b_k}{2} \qquad (6.6)
\]
for k ∈ N. Conversely, if coefficients γ_k ∈ C, k = −n, −n + 1, ..., n, are given, then
corresponding coefficients a_k ∈ C, k = 0, ..., n, and b_k ∈ C, k = 1, ..., n, are computed as
\[
a_0 := 2\gamma_0 , \qquad (6.7)
\]
\[
a_k := \gamma_k + \gamma_{-k} , \qquad (6.8)
\]
\[
b_k := i(\gamma_k - \gamma_{-k}) . \qquad (6.9)
\]
If these series converge for each t ∈ R, then clearly the function f(t) is well-defined and
periodic with period T := 2π/ω.
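The conversions (6.4) to (6.9) are easy to check in code. This sketch (function names are mine) performs a round trip (a, b) → γ → (a, b):

```python
def ab_to_gamma(a, b):
    """(6.4)-(6.6): a = [a0, a1, ..., an], b = [b1, ..., bn]  ->  dict k -> gamma_k."""
    n = len(b)
    g = {0: a[0] / 2}
    for k in range(1, n + 1):
        g[k] = (a[k] - 1j * b[k - 1]) / 2
        g[-k] = (a[k] + 1j * b[k - 1]) / 2
    return g

def gamma_to_ab(g):
    """(6.7)-(6.9): the inverse conversion."""
    n = max(g)
    a = [2 * g[0]] + [g[k] + g[-k] for k in range(1, n + 1)]
    b = [1j * (g[k] - g[-k]) for k in range(1, n + 1)]
    return a, b

a = [1.0, 2.0, -0.5]
b = [0.25, 3.0]
a2, b2 = gamma_to_ab(ab_to_gamma(a, b))
assert all(abs(x - y) < 1e-12 for x, y in zip(a, a2))
assert all(abs(x - y) < 1e-12 for x, y in zip(b, b2))
```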
For two complex-valued Riemann integrable functions f, g ∈ R([a, b], C), a, b ∈ R, a < b,
the integrals are defined componentwise by
\[
\int_{a}^{b} f(x) \,dx := \int_{a}^{b} \operatorname{Re} f(x) \,dx + i \int_{a}^{b} \operatorname{Im} f(x) \,dx ,
\qquad
\int_{a}^{b} g(x) \,dx := \int_{a}^{b} \operatorname{Re} g(x) \,dx + i \int_{a}^{b} \operatorname{Im} g(x) \,dx .
\]
   Definition 6.63.
   Let c > 0. The mapping
   \[
   \langle f, g \rangle = c \int_{a}^{b} \overline{f(x)}\, g(x) \,dx
   = c \int_{a}^{b} \big( \operatorname{Re} f(x) \operatorname{Re} g(x) + \operatorname{Im} f(x) \operatorname{Im} g(x) \big) \,dx
   + i c \int_{a}^{b} \big( \operatorname{Re} f(x) \operatorname{Im} g(x) - \operatorname{Im} f(x) \operatorname{Re} g(x) \big) \,dx .
   \]
If f ∈ C([a, b], C) \ {0} is a non-zero complex-valued continuous function on [a, b], then
there exists a ξ ∈ (a, b) such that |f(ξ)| > 0, and by continuity there is an ε > 0 such that
[ξ − ε, ξ + ε] ⊂ [a, b] and |f(x)| ≥ ½|f(ξ)| for all x ∈ [ξ − ε, ξ + ε]. Thus
\[
\langle f, f \rangle = c \int_{a}^{b} |f(x)|^2 \,dx
\ge c \int_{\xi - \varepsilon}^{\xi + \varepsilon} |f(x)|^2 \,dx
\ge \frac12\, c\, \varepsilon\, |f(\xi)|^2 > 0 .
\]
This shows that ⟨·, ·⟩ restricted to C([a, b], C) ⊂ R([a, b], C) is positive definite, i.e. for
f ∈ C([a, b], C) holds ⟨f, f⟩ = 0 if and only if f = 0.
In other words, ⟨·, ·⟩ is an inner product on the complex vector space C([a, b], C) which,
equipped with this inner product, therefore becomes a complex inner product space.
Restricted to the R-subspace C([a, b], R) consisting of all real-valued continuous functions
on [a, b], the inner product becomes ⟨f, g⟩ = c ∫ₐᵇ f(x)g(x) dx, and equipped with this inner
product C([a, b], R) is a real inner product space.
The constant c is simply a scaling factor which, for example, can be set to the reciprocal
of the length of the integration interval, that is c := 1/(b − a).
Similar arguments as given above show that ⟨·, ·⟩ restricted to the C-/R-vector space of
piecewise complex-/real-valued continuous functions on [a, b], denoted by C_p([a, b], C)/C_p([a, b], R),
makes this space a unitary/Euclidean space, respectively.
   Theorem 6.64.
   Let T > 0 and ω := 2π/T.
If k = l, then
\[
\langle e^{ik\omega t}, e^{il\omega t} \rangle
= \frac{1}{T} \int_{0}^{T} e^{-ik\omega t} e^{ik\omega t} \,dt
= \frac{1}{T} \int_{0}^{T} 1 \,dt = 1 .
\]
If k ≠ l, then
\[
\langle e^{ik\omega t}, e^{il\omega t} \rangle
= \frac{1}{T} \int_{0}^{T} e^{i(l-k)\omega t} \,dt
= \frac{1}{i(l-k)\omega T} \cdot e^{i(l-k)\omega t} \,\Big|_{t=0}^{t=T} = 0 .
\]
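These orthonormality relations can be verified numerically. For trigonometric integrands over a full period the midpoint rule is essentially exact; the helper names below are mine.

```python
import cmath

def inner(f, g, T, m=20000):
    """<f, g> = (1/T) int_0^T conj(f(t)) g(t) dt, midpoint rule."""
    h = T / m
    s = sum(f(h * (i + 0.5)).conjugate() * g(h * (i + 0.5)) for i in range(m))
    return s * h / T

T = 2.0
w = 2 * cmath.pi / T
e = lambda k: (lambda t: cmath.exp(1j * k * w * t))  # t -> e^{i k w t}

assert abs(inner(e(3), e(3), T) - 1) < 1e-9   # k = l
assert abs(inner(e(3), e(5), T)) < 1e-9       # k != l
assert abs(inner(e(-2), e(4), T)) < 1e-9      # k != l
```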
\[
b_k = 2\, \langle \sin(k\omega t), f(t) \rangle = \frac{2}{T} \int_{0}^{T} \sin(k\omega t)\, f(t) \,dt , \qquad k \in \mathbb{N}. \qquad (6.12)
\]
Proof: For n ∈ N set f_n := Σ_{k=−n}^{n} γ_k e^{ikωt} ∈ C([0, T], C). By assumption, the sequence
(f_n)_{n∈N} converges uniformly to f. By Theorem 3.16 the function f must be continuous.
Furthermore, for fixed k ∈ N₀, e^{−ikωt} f_n converges uniformly to e^{−ikωt} f.
Analogously, using Theorem 6.64 b) we obtain the stated formulas for a_k, k ≥ 0, and b_k,
k > 0. □
Note that for arbitrary f ∈ R([0, T ], C) the coefficients γk , ak , bk defined in (6.10) to (6.12)
fulfil the relations (6.4) to (6.9). We will mainly restrict to piecewise continuous functions.
   Definition 6.66 (Fourier series).
   For a piecewise continuous function f : [0, T] → C the Fourier coefficients (γ_k)_{k∈Z},
   (a_k)_{k∈N₀}, (b_k)_{k∈N} of f are defined by (6.10) to (6.12), and the Fourier series of f is
   defined by
   \[
   F(f)(t) := \sum_{k=-\infty}^{\infty} \gamma_k e^{ik\omega t}
            = \frac{a_0}{2} + \sum_{k=1}^{\infty} \big( a_k \cos(k\omega t) + b_k \sin(k\omega t) \big) .
   \]
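As an illustration (the helper names and the test function are mine), the coefficients a_k, b_k can be computed by numerical integration; for the 2π-periodic square wave the classical values b_k = 4/(πk) for odd k and 0 otherwise appear.

```python
import math

def fourier_ab(f, T, n, m=20000):
    """a_k = (2/T) int_0^T f(t) cos(k w t) dt  for k = 0, ..., n,
    b_k = (2/T) int_0^T f(t) sin(k w t) dt  for k = 1, ..., n  (midpoint rule)."""
    w = 2 * math.pi / T
    h = T / m
    ts = [h * (i + 0.5) for i in range(m)]
    a = [2 / T * sum(f(t) * math.cos(k * w * t) for t in ts) * h for k in range(n + 1)]
    b = [2 / T * sum(f(t) * math.sin(k * w * t) for t in ts) * h for k in range(1, n + 1)]
    return a, b

T = 2 * math.pi
square = lambda t: 1.0 if (t % T) < T / 2 else -1.0
a, b = fourier_ab(square, T, 5)
# classical square-wave series: b_k = 4/(pi k) for odd k, all other coefficients 0
assert all(abs(x) < 1e-3 for x in a)
assert abs(b[0] - 4 / math.pi) < 1e-3          # b_1
assert abs(b[1]) < 1e-3                        # b_2
assert abs(b[2] - 4 / (3 * math.pi)) < 1e-3    # b_3
```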
   Theorem 6.67.
   Let f : R → C be a piecewise continuous T-periodic function. If f is odd, i.e.
   f(−t) = −f(t), then
   \[
   a_k = 0 , \qquad
   b_k = \frac{4}{T} \int_{0}^{T/2} f(t) \sin(k\omega t) \,dt .
   \]
\[
\cdots = -\frac{2}{T} \int_{-T}^{0} f(\tau) \sin(k\omega \tau) \,d\tau
       = -\frac{2}{T} \int_{0}^{T} f(\tau) \sin(k\omega \tau) \,d\tau = -b_k .
\]
In the following we list some calculation rules for Fourier series which are easily deduced
and can be proved as an exercise.
   Lemma 6.68.
   Suppose that f, g : R → C are T-periodic piecewise continuous functions with
   \[
   F(f) = F(f)(t) = \sum_{k=-\infty}^{\infty} \gamma_k e^{ik\omega t}
        = \frac{a_0}{2} + \sum_{k=1}^{\infty} \big( a_k \cos(k\omega t) + b_k \sin(k\omega t) \big) ,
   \qquad
   F(g) = F(g)(t) = \sum_{k=-\infty}^{\infty} \delta_k e^{ik\omega t} .
   \]
   a) linearity:
   \[
   F(\alpha f + \beta g) = \alpha F(f) + \beta F(g) = \sum_{k=-\infty}^{\infty} (\alpha \gamma_k + \beta \delta_k) e^{ik\omega t}
   \]
   b) conjugation:
   \[
   F(\overline{f}) = \sum_{k=-\infty}^{\infty} \overline{\gamma_{-k}}\, e^{ik\omega t}
   \]
   c) time reversal:
   \[
   F(f(-t)) = \sum_{k=-\infty}^{\infty} \gamma_{-k} e^{ik\omega t}
   \]
   d) stretch:
   \[
   F(f(ct)) = \sum_{k=-\infty}^{\infty} \gamma_k e^{ik(c\omega)t} , \qquad c > 0
   \]
   e) shift:
   \[
   F(f(t + a)) = \sum_{k=-\infty}^{\infty} \big( \gamma_k e^{ik\omega a} \big) e^{ik\omega t} , \qquad a \in \mathbb{R}
   \]
   f) modulation:
   \[
   F(e^{in\omega t} f(t)) = \sum_{k=-\infty}^{\infty} \gamma_{k-n} e^{ik\omega t} , \qquad n \in \mathbb{Z}
   \]
   g) integration: If 0 = γ₀ = (1/T) ∫₀ᵀ f(t) dt, then
   \[
   F\Big( \int_{0}^{t} f(\tau) \,d\tau \Big)
   = -\frac{1}{T} \int_{0}^{T} t f(t) \,dt
     - \sum_{k=1}^{\infty} \Big( \frac{b_k}{k\omega} \cos(k\omega t) - \frac{a_k}{k\omega} \sin(k\omega t) \Big) .
   \]
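Rule e) can be tested numerically for a trigonometric polynomial (helper names are mine): the coefficients of f(t + a) must equal γ_k e^{ikωa}.

```python
import cmath

def gamma(f, T, k, m=20000):
    """gamma_k = (1/T) int_0^T f(t) e^{-i k w t} dt, midpoint rule."""
    w = 2 * cmath.pi / T
    h = T / m
    s = sum(f(h * (i + 0.5)) * cmath.exp(-1j * k * w * h * (i + 0.5)) for i in range(m))
    return s * h / T

T = 1.0
w = 2 * cmath.pi / T
f = lambda t: cmath.cos(w * t) + 0.5 * cmath.sin(2 * w * t)  # a trigonometric polynomial
a = 0.3
for k in range(-3, 4):
    lhs = gamma(lambda t: f(t + a), T, k)           # coefficient of the shifted function
    rhs = gamma(f, T, k) * cmath.exp(1j * k * w * a)  # shift rule e)
    assert abs(lhs - rhs) < 1e-9
```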
and let
\[
S_n(t) := \frac{a_0}{2} + \sum_{k=1}^{n} \big( a_k \cos(k\omega t) + b_k \sin(k\omega t) \big) .
\]
b)
c) By using the rule for differentiation of Lemma 6.68 it is sufficient to show the assertion
for m = 0. Therefore let f be piecewise continuously differentiable. Choose a decom-
position 0 = t₀ < t₁ < ... < t_m = T such that f|_{[tᵢ, t_{i+1}]}, i = 0, ..., m − 1, is continuously
differentiable. Then integration by parts yields
\[
\gamma_k = \frac{1}{T} \int_{0}^{T} f(t) e^{-ik\omega t} \,dt
= -\frac{1}{ik\omega T} \sum_{j=0}^{m-1} \bigg( f(t) e^{-ik\omega t} \Big|_{t=t_j}^{t=t_{j+1}} - \int_{t_j}^{t_{j+1}} f'(t) e^{-ik\omega t} \,dt \bigg)
\]
and therefore
\[
|\gamma_k| \le \frac{1}{k}\,
\underbrace{\frac{1}{\omega T} \bigg( \sum_{j=0}^{m-1} \big( |f(t_{j+1}^-)| + |f(t_j^+)| \big) + \int_{0}^{T} |f'(t)| \,dt \bigg)}_{=:\,C} . \qquad \Box
\]
Finally, without proof, we state the following theorem on uniqueness of Fourier series:
   Theorem 6.71.
   If two T-periodic, piecewise continuous functions f, g : R → C have the same
   Fourier series and if they fulfill
   \[
   f(t) = \frac12 \big( f(t^-) + f(t^+) \big) , \qquad
   g(t) = \frac12 \big( g(t^-) + g(t^+) \big)
   \]
   for all t ∈ R, then f = g.