Riemann-Stieltjes Integral

S Kumaresan
kumaresa@gmail.com
27 January 2021

Contents

1 Darboux Approach to Riemann Integration
2 Darboux Approach to Riemann-Stieltjes Integral
3 Riemann’s Criterion for Integrability
4 Examples of Integrable Functions
5 Riemann’s Approach to Integration
6 Class of Integrable Functions
7 Two Important Examples of Riemann-Stieltjes Integrals
8 Fundamental Theorems of Calculus

Abstract

We treat the case of Riemann integrals first and motivate the results geometrically. When we prove the results analytically, we observe that the proofs can be adapted to prove the corresponding results for Riemann-Stieltjes integrals where the integrator is an increasing function. The aim of this article is to give an outline of the theory of the Riemann-Stieltjes integral, closely following the development of the Riemann integral in our book [1]. The approach may develop geometric ideas of Riemann integrals and hone analytical skills.
You may also like to watch the videos in [5] on which this article is based.

1 Darboux Approach to Riemann Integration

Definition 1. A partition $P$ of an interval $[a, b]$ is a finite set $\{x_0, x_1, \ldots, x_n\}$ such that $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$. The points $x_i$ are called the nodes of $P$.

Example 2. (i) $P = \{a = x_0, x_1 = b\}$ is the trivial partition of $[a, b]$.

(ii) For any $n \in \mathbb{N}$, let $x_i = a + \frac{i}{n}(b - a)$ for $0 \le i \le n$. Then $\{x_0, \ldots, x_n\}$ is a partition, say, $P_n$. Note that $P_n$ divides $[a, b]$ into subintervals of equal length. (One often says that $P_n$ divides $[a, b]$ into equal parts.)

Given two partitions $P$ and $Q$ of $[a, b]$, we say that $Q$ is a refinement of $P$ if $P \subset Q$. In the example (ii) above, the partition $P_{2^{k+1}}$ is a refinement of $P_{2^k}$. More generally, $P_{mn}$ is a refinement of $P_n$ where $m, n \in \mathbb{N}$.
If $P = \{a = s_0, \ldots, s_m = b\}$ and $Q = \{a = t_0, \ldots, t_n = b\}$ are two partitions of $[a, b]$, let us consider $P \cup Q$. It is a finite set of points in $[a, b]$. So we can arrange them in increasing order $\{x_0, \ldots, x_N\}$. (Here $x_0 = s_0 = t_0 = a$ and $x_N = s_m = t_n = b$.) This is a partition of $[a, b]$ and by an abuse of notation it is denoted by $P \cup Q$. Note that $P \cup Q$ is a refinement of both $P$ and $Q$. For example, if $P = \{0, 1/2, 3/4, 1\}$ and $Q = \{0, 1/3, 1/2, 2/3, 1\}$, then $P \cup Q = \{0, 1/3, 1/2, 2/3, 3/4, 1\}$.
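
The common refinement is easy to experiment with. The following is a minimal Python sketch, not part of the original notes: partitions are represented as lists of exact fractions, and the helper names `union` and `is_refinement` are made up for illustration.

```python
from fractions import Fraction as F

def union(P, Q):
    """Common refinement: merge the nodes of P and Q and sort them."""
    return sorted(set(P) | set(Q))

def is_refinement(Q, P):
    """Q refines P when every node of P is also a node of Q."""
    return set(P) <= set(Q)

# The two partitions of [0, 1] from the example in the text.
P = [F(0), F(1, 2), F(3, 4), F(1)]
Q = [F(0), F(1, 3), F(1, 2), F(2, 3), F(1)]
R = union(P, Q)

print([str(x) for x in R])
print(is_refinement(R, P), is_refinement(R, Q), is_refinement(P, Q))
```

The union refines both partitions, while neither of the two original partitions refines the other.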

Definition 3. Given a bounded function $f : [a, b] \to \mathbb{R}$ and a partition $P = \{x_0, \ldots, x_n\}$ of $[a, b]$, we let
\[ L(f, P) := \sum_{i=1}^{n} m_i(f)(x_i - x_{i-1}), \quad \text{where } m_i(f) := \operatorname{GLB}\{f(x) : x \in [x_{i-1}, x_i]\}, \]
\[ U(f, P) := \sum_{i=1}^{n} M_i(f)(x_i - x_{i-1}), \quad \text{where } M_i(f) := \operatorname{LUB}\{f(x) : x \in [x_{i-1}, x_i]\}. \]

Observe that, if $f \ge 0$, $L(f, P)$ (respectively, $U(f, P)$) is the sum of areas of rectangles inscribed inside (respectively, circumscribing) the region bounded by the curves $x = a$, $x = b$, $y = 0$ and $y = f(x)$. For simplicity, we refer to this region as the region under the graph in the case of nonnegative functions. The numbers $L(f, P)$ and $U(f, P)$ are respectively called the lower and the upper Darboux sums of $f$ with respect to the partition $P$. They approximate the area under the graph from below and from above.
See Figure 6.3 on page 177 in [1].
We let $m = m(f) := \operatorname{GLB}\{f(x) : x \in [a, b]\}$ and $M = M(f) := \operatorname{LUB}\{f(x) : x \in [a, b]\}$.

We now look at some examples of upper and lower sums. They also teach us the importance of choosing partitions smartly depending on the nature of the functions.

Example 4 (Constant functions). Let $f : [a, b] \to \mathbb{R}$ be a constant, say, $c$. Let $P$ be any partition. Then we see that $m_i = M_i = c$. Hence $L(f, P) = \sum_{i=1}^{n} c(x_i - x_{i-1}) = c(b - a)$. Similarly $U(f, P) = c(b - a)$. This validates our intuition that the “area” under the graph should be $c(b - a)$.

Example 5 (Step Functions). Let $f : [-1, 1] \to \mathbb{R}$ be defined by $f(x) = 1$ for $-1 \le x < 0$, $f(0) = 10$ and $f(x) = 2$ for $0 < x \le 1$. Draw a picture of the graph of $f$. Can you convince yourself that the area under the graph should be 3? Our intuition says that the point $x = 0$ needs special attention. Let us consider the partition $P_n := \{-1, x_1 = -\frac{1}{n}, x_2 = \frac{1}{n}, 1\}$. We see that $m_1 = M_1 = 1$, $m_2 = 1$ and $M_2 = 10$ and $m_3 = M_3 = 2$. Hence we find that
\[ L(f, P_n) = 1\left(1 - \frac{1}{n}\right) + 1 \times \frac{2}{n} + 2\left(1 - \frac{1}{n}\right) = 3\left(1 - \frac{1}{n}\right) + \frac{2}{n}, \]
\[ U(f, P_n) = 1\left(1 - \frac{1}{n}\right) + 10 \times \frac{2}{n} + 2\left(1 - \frac{1}{n}\right) = 3\left(1 - \frac{1}{n}\right) + 10 \times \frac{2}{n}. \]

Do you observe that $L(f, P_n) \to 3$ and $U(f, P_n) \to 3$?
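
To see the convergence numerically, here is a small Python sketch (an illustration added here, not taken from the source). The function name `sums` is hypothetical, and the infima and suprema on the three subintervals are hard-coded from the values $m_i$, $M_i$ found above rather than computed.

```python
from fractions import Fraction as F

def sums(n):
    """Lower and upper Darboux sums of the step function of Example 5
    for the partition {-1, -1/n, 1/n, 1}; requires n >= 2."""
    nodes = [F(-1), F(-1, n), F(1, n), F(1)]
    m = [F(1), F(1), F(2)]    # m_1, m_2, m_3 as read off in the example
    M = [F(1), F(10), F(2)]   # M_1, M_2, M_3 as read off in the example
    widths = [right - left for left, right in zip(nodes, nodes[1:])]
    lower = sum(mi * w for mi, w in zip(m, widths))
    upper = sum(Mi * w for Mi, w in zip(M, widths))
    return lower, upper

for n in (10, 100, 1000):
    lower, upper = sums(n)
    print(n, float(lower), float(upper))
```

Both columns approach 3, the lower sums from below and the upper sums from above.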

Example 6 (Monotone Functions). Let $f : [a, b] \to \mathbb{R}$ be increasing. Let $P$ be any partition. Then we have $m_i = f(x_{i-1})$ and $M_i = f(x_i)$ for each $i$. Hence we see that
\[ L(f, P) = \sum_i f(x_{i-1})(x_i - x_{i-1}) \quad\text{and}\quad U(f, P) = \sum_i f(x_i)(x_i - x_{i-1}). \]

(If $f$ is decreasing what are the lower and upper sums?) If we take finer partitions do the upper and lower sums come close to each other? That is, we ask: can we make $U(f, P) - L(f, P)$ arbitrarily small? Note that
\[ U(f, P) - L(f, P) = \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})](x_i - x_{i-1}). \]

Note that if all the lengths $x_i - x_{i-1}$ of the subintervals are equal, say, $(b - a)/n$, then we have
\[ U(f, P) - L(f, P) = \frac{b - a}{n} \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})] = \frac{b - a}{n}\,(f(b) - f(a)). \tag{1} \]

In the last equality we used the fact that the sum
\[ \sum_i [f(x_i) - f(x_{i-1})] = [f(x_1) - f(x_0)] + [f(x_2) - f(x_1)] + \cdots + [f(x_n) - f(x_{n-1})] = -f(x_0) + f(x_n) \]
is a telescopic sum. Is our question above answered?
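
Identity (1) can be checked exactly on a concrete case. The sketch below is an added illustration: the function name `darboux_sums_increasing` and the choice $f(x) = x^2$ on $[0, 1]$ are assumptions made here, and the code relies on $f$ being increasing so that $m_i = f(x_{i-1})$ and $M_i = f(x_i)$.

```python
from fractions import Fraction as F

def darboux_sums_increasing(f, a, b, n):
    """Lower and upper sums of an INCREASING f on the uniform partition P_n.
    For increasing f the infimum sits at the left node, the supremum at the right."""
    xs = [a + F(i, n) * (b - a) for i in range(n + 1)]
    lower = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    upper = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return lower, upper

f = lambda x: x * x          # increasing on [0, 1]
a, b = F(0), F(1)
for n in (10, 100, 1000):
    lower, upper = darboux_sums_increasing(f, a, b, n)
    # The gap is exactly (b - a)/n * (f(b) - f(a)), as in (1).
    assert upper - lower == (b - a) / n * (f(b) - f(a))
    print(n, float(lower), float(upper))
```

Since the gap is $(b-a)(f(b)-f(a))/n$, it can be made smaller than any given $\varepsilon$ by taking $n$ large.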

These examples lead us to believe that as we take finer and finer partitions, we seem to
approximate the desired area well.

Example 7 (Dirichlet’s Function). Let $f : [0, 1] \to \mathbb{R}$ be the function $f(x) = 1$ if $x \in \mathbb{Q} \cap [0, 1]$ and $f(x) = 0$ if $x \in [0, 1] \setminus \mathbb{Q}$. Let $P$ be any partition. Then $m_i = 0$ and $M_i = 1$ thanks to the density of rational numbers as well as the irrational numbers. Hence we find that
\[ L(f, P) = \sum_i 0 \times (x_i - x_{i-1}) = 0 \quad\text{and}\quad U(f, P) = \sum_i 1 \times (x_i - x_{i-1}) = 1. \]

This is true for any partition P . What do we infer? If at all there is meaning to the area under
the graph of the function, our approach is not going to give approximations to the ‘area’. So
there are two possible policy decisions we have to take. We love our approach and so we
declare that the area of the region under the graph of f has no meaning. Or, we should devise
a better system. We adopt the first and leave the second to a later course!
Ex. 8. Prove that m(b − a) ≤ L( f , P ) ≤ U ( f , P ) ≤ M (b − a) for any partition P of [a, b].
Ex. 9. Let f : [a, b] → R be bounded. Let c ∈ (a, b). Let P 1 (respectively, P 2 ) be a partition
of [a, c] (respectively, of [c, b]). Show that P := P 1 ∪ P 2 is a partition of [a, b]. Show also that
L( f , P ) = L( f , P 1 ) + L( f , P 2 ) and U ( f , P ) = U ( f , P 1 ) +U ( f , P 2 ).
A very pedantic formulation is as follows. Let f 1 (respectively, f 2 ) be the restriction of f to
[a, c] (respectively, to [c, b]). Prove that L( f , P ) = L( f 1 , P 1 ) + L( f 2 , P 2 ) and so on.
Definition 10. Given a partition $P = \{x_0, \ldots, x_n\}$, we insert a new node, say, $t$ such that $x_i < t < x_{i+1}$ for some $i$ and get a new partition $Q$. Then, drawing pictures of a non-negative function, it is clear that $L(f, Q) \ge L(f, P)$. Similarly, $U(f, Q) \le U(f, P)$. Draw pictures. See Figures 6.4–6.5 on Page 178 in [1]. (We shall prove this later.) Thus, $Q$ produces a better approximation to the area bounded by the graph. This suggests that to get the “real” area we should look at
\[ L(f) := \operatorname{LUB}\{L(f, P) : P \text{ is a partition of } [a, b]\}, \]
\[ U(f) := \operatorname{GLB}\{U(f, P) : P \text{ is a partition of } [a, b]\}. \]

These numbers exist (why?) and are called respectively the lower and upper integral of $f$ on $[a, b]$. The upper integral of $f$ on $[a, b]$ may be understood as the best possible approximation to the area of the region under the graph as approximated from above. How do we understand the lower integral?
Ex. 11. Show that L( f ) ≤ U ( f ) for any bounded function f : [a, b] → R.
Definition 12. We say that $f$ is Darboux integrable (or simply integrable) on $[a, b]$ if the upper and lower integrals coincide. (This intuitively says that we require that the area should be approximable both from below and from above.) If $f$ is integrable, the common value of the upper and lower integrals is denoted by the symbol $\int_a^b f(x)\,dx$. This is just a notation; we may as well use $I_a^b(f)$ or $\int_a^b f$ etc.

2 Darboux Approach to Riemann-Stieltjes Integral

Let $\alpha : [a, b] \to \mathbb{R}$ be an increasing (non-decreasing) function. Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Let $P := \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a, b]$. Let $m_i \equiv m_i(f)$ and $M_i = M_i(f)$ be as usual. Let $\Delta\alpha_i := \alpha(x_i) - \alpha(x_{i-1})$. Note that $\Delta\alpha_i \ge 0$. Also $\sum_{i=1}^{n} \Delta\alpha_i = \alpha(b) - \alpha(a)$. (Verify this.)
We define upper and lower Riemann-Stieltjes sums of $f$ relative to the partition $P$ in an obvious way:
\[ U(f, P, \alpha) := \sum_{i=1}^{n} M_i \Delta\alpha_i \quad\text{and}\quad L(f, P, \alpha) := \sum_{i=1}^{n} m_i \Delta\alpha_i. \]

Since $m_i \le M_i$ and $\Delta\alpha_i \ge 0$, we see that
\[ L(f, P, \alpha) = \sum_{i=1}^{n} m_i \Delta\alpha_i \le \sum_{i=1}^{n} M_i \Delta\alpha_i = U(f, P, \alpha). \]

Let m and M be such that m ≤ f (x) ≤ M for x ∈ [a, b]. Then m ≤ m i and M i ≤ M for all i .
Hence it follows that

m(α(b) − α(a)) ≤ L( f , P, α) ≤ U ( f , P, α) ≤ M (α(b) − α(a)).

In particular, the set {U ( f , P, α)} is bounded below and hence its GLB exists. In a similar way,
{L( f , P, α)} is bounded above and hence its LUB exists. As in the case of Riemann integral, let
U ( f , α) := GLB {U ( f , P, α)} be the upper α-integral of f on [a, b]. How is L( f , α) defined? What
do you call it?
The following theorem is a collection of results which are proved by mimicking the proofs
of the corresponding results in the theory of Riemann integration.

Theorem 13. The following are true:


(i) If Q is a refinement of the partition P , then L( f , P, α) ≤ L( f ,Q, α) and U ( f , P, α) ≥ U ( f ,Q, α).
(ii) For any partitions P and Q of [a, b], we have L( f , P, α) ≤ U ( f ,Q, α).
(iii) We have L( f , α) ≤ U ( f , α).

Proof. Let $P := \{x_0, \ldots, x_{i-1}, x_i, \ldots, x_n\}$ be a partition. Let $t \in (x_{i-1}, x_i)$ and
\[ Q := P \cup \{t\} = \{x_0, x_1, \ldots, x_{i-1}, t, x_i, \ldots, x_n\}. \]

Note that $M_i = \operatorname{LUB}\{f(x) : x \in [x_{i-1}, x_i]\} \ge \operatorname{LUB}\{f(x) : x \in [x_{i-1}, t]\}$ and $M_i = \operatorname{LUB}\{f(x) : x \in [x_{i-1}, x_i]\} \ge \operatorname{LUB}\{f(x) : x \in [t, x_i]\}$. Hence we find that
\[ \begin{aligned}
M_i \Delta\alpha_i &= M_i(\alpha(t) - \alpha(x_{i-1})) + M_i(\alpha(x_i) - \alpha(t)) \\
&\ge \operatorname{LUB}\{f(x) : x \in [x_{i-1}, t]\}(\alpha(t) - \alpha(x_{i-1})) + \operatorname{LUB}\{f(x) : x \in [t, x_i]\}(\alpha(x_i) - \alpha(t)).
\end{aligned} \]

Can you state and prove the analogous results for $m_i \Delta\alpha_i$? We then have
\[ \begin{aligned}
U(f, P, \alpha) &= \sum_{j \ne i} M_j \Delta\alpha_j + M_i \Delta\alpha_i \\
&\ge \sum_{j \ne i} M_j \Delta\alpha_j + \operatorname{LUB}\{f(x) : x \in [x_{i-1}, t]\}(\alpha(t) - \alpha(x_{i-1})) + \operatorname{LUB}\{f(x) : x \in [t, x_i]\}(\alpha(x_i) - \alpha(t)) \\
&= U(f, Q, \alpha).
\end{aligned} \]

The reader should prove that $L(f, P, \alpha) \le L(f, Q, \alpha)$. For a general refinement, let $Q = P \cup \{t_1, \ldots, t_r\}$. Let $Q_i := P \cup \{t_1, \ldots, t_i\}$, $1 \le i \le r$. Let $Q_0 = P$. Each $Q_i$ is a refinement of $Q_{i-1}$ obtained by inserting one node. Hence $U(f, Q_r, \alpha) \le U(f, Q_{r-1}, \alpha) \le \cdots \le U(f, Q_1, \alpha) \le U(f, P, \alpha)$. The analogous result for lower sums follows in a similar fashion.
(ii) We know that $L(f, P, \alpha) \le U(f, P, \alpha)$ for any partition $P$. Let $P' = P \cup Q$. Observe that
\[ L(f, P, \alpha) \le L(f, P', \alpha) \le U(f, P', \alpha) \le U(f, Q, \alpha). \]

(Can you justify the first and the third inequalities?)


(iii) is an easy exercise in LUB and GLB. Consider the two subsets A and B of real numbers.
Assume that for any pair of points (x, y) ∈ A × B , x ≤ y. Then each y is an upper bound of A.
Hence LUB A ≤ y. Since this holds true for each y ∈ B , we see that LUB A is a lower bound
for B . Hence LUB A ≤ GLB B . Let A := {L( f , P, α) : P a partition of [a, b]} and B := {U ( f ,Q, α) :
Q a partition of [a, b]}. Then A and B satisfy our assumption in view of (ii). The result (iii)
follows.
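
Parts (i) and (ii) of Theorem 13 can be observed on a small exact computation. The following Python sketch is an added illustration under stated assumptions: $f(x) = x^2$ and $\alpha(x) = x^3$ on $[0, 1]$ are chosen here, both increasing, so that $m_i = f(x_{i-1})$ and $M_i = f(x_i)$; the helper name `rs_sums` is hypothetical.

```python
from fractions import Fraction as F

def rs_sums(f, alpha, nodes):
    """Lower and upper Riemann-Stieltjes sums for an INCREASING f,
    where the infimum and supremum on each subinterval are endpoint values."""
    lower = upper = F(0)
    for left, right in zip(nodes, nodes[1:]):
        d_alpha = alpha(right) - alpha(left)   # Delta alpha_i >= 0
        lower += f(left) * d_alpha
        upper += f(right) * d_alpha
    return lower, upper

f = lambda x: x * x
alpha = lambda x: x ** 3

P = [F(0), F(1, 2), F(1)]
Q = [F(0), F(1, 4), F(1, 2), F(3, 4), F(1)]   # a refinement of P
R = [F(0), F(1, 3), F(2, 3), F(1)]            # unrelated to P

LP, UP = rs_sums(f, alpha, P)
LQ, UQ = rs_sums(f, alpha, Q)
LR, UR = rs_sums(f, alpha, R)

assert LP <= LQ <= UQ <= UP        # (i): refining raises L and lowers U
assert LP <= UR and LR <= UP       # (ii): any lower sum <= any upper sum
print(LP, LQ, UQ, UP)
```

Here $P$ and $R$ are not refinements of one another, yet the inequality in (ii) still holds, as the common refinement argument predicts.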

Definition 14. Keep the notation above. We say that $f$ is $\alpha$-integrable if $L(f, \alpha) = U(f, \alpha)$. We also say that the Riemann-Stieltjes integral of $f$ with respect to the integrator $\alpha$ exists. The common value of the upper and lower integrals is denoted by $\int_a^b f\,d\alpha$.
Let us repeat: if $\alpha(x) = x$ then we simply say that $f$ is integrable on $[a, b]$.

Note that when $\alpha$ is the identity function, this means that we can approximate the area under the graph of a positive function both from inside and outside and get the same result. Thus, this definition insists on a kind of “symmetry”.

3 Riemann’s Criterion for Integrability

Theorem 15 (Riemann’s Criterion). $f$ is $\alpha$-integrable iff for any $\varepsilon > 0$ there exists a partition $P$ such that $U(f, P, \alpha) - L(f, P, \alpha) < \varepsilon$. (See the video in [5]: Riemann Integration - 3, especially 30:57 – 46:25.)

Proof. Let the condition be satisfied. We are required to prove that $f$ is $\alpha$-integrable. Let $I_1$ and $I_2$ be the lower and upper integrals. We need to show $I_1 = I_2$. It is enough to show that for any $\varepsilon > 0$, $|I_1 - I_2| < \varepsilon$.
Since $I_1 \le I_2$, it is enough to show that $I_2 < I_1 + \varepsilon$ for any $\varepsilon > 0$. Given $\varepsilon > 0$, let $P$ be as in the hypothesis. Observe that
\[ I_1 \ge L(f, P, \alpha) \quad\text{and}\quad I_2 \le U(f, P, \alpha). \]

Hence
\[ I_2 - I_1 \le U(f, P, \alpha) - L(f, P, \alpha) < \varepsilon. \]
Thus we have proved that $f$ is $\alpha$-integrable.

To prove the converse, let ε > 0 be given. Let I 1 and I 2 be the lower and upper integrals.
Then we have I 1 = I 2 . Since I 1 is the LUB of L( f , P, α)’s, there exists a partition P 1 of [a, b] such
that I 1 − ε/2 < L( f , P 1 , α). That is,

I 1 − L( f , P 1 , α) < ε/2. (2)

Similarly, there exists a partition P 2 of [a, b] such that

U ( f , P 2 , α) < I 2 + ε/2 so that U ( f , P 2 , α) − I 2 < ε/2. (3)

Let P = P 1 ∪ P 2 . Since P is a refinement of P 1 and P 2 , we have

L( f , P 1 , α) ≤ L( f , P, α) ≤ U ( f , P, α) ≤ U ( f , P 2 , α). (4)

Now,

U ( f , P, α) − L( f , P, α)
= U ( f , P, α) − I 2 + I 2 − L( f , P, α), adding and subtracting I 2
= U ( f , P, α) − I 2 + I 1 − L( f , P, α), since I 1 = I 2
≤ U ( f , P 2 , α) − I 2 + I 1 − L( f , P 1 , α), by (4)
< ε/2 + ε/2, by (2) and (3)
= ε.

Thus, the condition is necessary.


Remark 16. Let $f : [a, b] \to \mathbb{R}$ be continuous. Let $\alpha : [a, b] \to \mathbb{R}$ be increasing. If we wish to use Riemann’s criterion to prove the integrability of $f$, for any $\varepsilon > 0$ we need to find a partition $P$ such that $U(f, P, \alpha) - L(f, P, \alpha) = \sum_i (M_i - m_i)\Delta\alpha_i < \varepsilon$. Look at the sum. If we wish to estimate it, we may wish to estimate each of the summands $(M_i - m_i)\Delta\alpha_i$. If the function is “well-behaved” (for example, continuous) in $[x_{i-1}, x_i]$, then it is easier to estimate $M_i - m_i$. So a natural way of estimating the sum is to use the “Divide and Conquer” trick. (See the video in [5]: Riemann Integration - 5, especially 8:3–21:23.)
Let $B$ (B for bad!) be the set of indices $i$ for which we have no good control over $M_i - m_i$. We break the sum into two parts:
\[ \sum_i (M_i - m_i)\Delta\alpha_i = \sum_{i \notin B} (M_i - m_i)\Delta\alpha_i + \sum_{i \in B} (M_i - m_i)\Delta\alpha_i. \]
For the latter we have a crude estimate $\sum_{i \in B} (M_i - m_i)\Delta\alpha_i \le 2C \sum_{i \in B} \Delta\alpha_i$, where $|f(x)| \le C$ for $x \in [a, b]$. Thus to control the second, we need to find a partition $P$ so that $\sum_{i \in B} \Delta\alpha_i$ is “small”. Keep these vague ideas in mind while going through the proofs which use Riemann’s criterion. You will not miss the wood for trees! Look at Examples 18–19, Theorem 22 (i), Theorem 24, Example 27, and Theorem 32.

Observation 17. As explained in the Remark above, we need to look at $M_i(f) - m_i(f)$. Alternate descriptions of $M_i(f) - m_i(f)$ are useful in the sequel. Let us start with a general observation.
Let $A$ be a nonempty bounded set of real numbers and let $A - A := \{x - y : x, y \in A\}$. We claim that $\operatorname{LUB} A - \operatorname{GLB} A = \operatorname{LUB}(A - A)$. Let $C = \operatorname{LUB}(A - A)$. Let $M := \operatorname{LUB} A$ and $m = \operatorname{GLB} A$. Then $x - y \le M - m$ for $x, y \in A$. Hence $C \le M - m$. Fix $y \in A$. Since $x - y \le C$, we see that $x \le C + y$ for all $x \in A$. Thus $C + y$ is an upper bound for $A$ and so $M \le C + y$. It follows from this that $M - C \le y$. This is true for any $y \in A$. Therefore, $M - C$ is a lower bound for $A$. We conclude that $M - C \le m$ or $M - m \le C$. Thus, $M - m = \operatorname{LUB}(A - A)$.
Next question is: Is there a better description of $\operatorname{LUB}(A - A)$? Note that both $x - y,\ y - x \in A - A$ for $x, y \in A$. Since $|x - y| = \max\{x - y, y - x\}$, we suspect that $C := \operatorname{LUB}(A - A) = \operatorname{LUB}\{|x - y| : x, y \in A\} =: D$. Since $x - y \le |x - y| \le D$, it follows that $D$ is an upper bound of $A - A$ and so $C \le D$. Conversely, since $|x - y|$ is either $x - y$ or $y - x$ and since both lie in $A - A$, we see that $x - y \le C$ and $y - x \le C$ and hence $|x - y| \le C$. That is, $C$ is an upper bound for the set $\{|x - y| : x, y \in A\}$. Hence $D \le C$. Thus we have established
\[ \operatorname{LUB} A - \operatorname{GLB} A = \operatorname{LUB}\{x - y : x, y \in A\} = \operatorname{LUB}\{|x - y| : x, y \in A\}. \tag{5} \]

We shall apply these observations to the set $A := \{f(x) : x \in [x_{i-1}, x_i]\}$. We then arrive at the following:
\[ \begin{aligned}
M_i(f) - m_i(f) &= \operatorname{LUB}\{f(s) - f(t) : s, t \in [x_{i-1}, x_i]\} \\
&= \operatorname{LUB}\{|f(s) - f(t)| : s, t \in [x_{i-1}, x_i]\}. \tag{6}
\end{aligned} \]
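
For a finite set the least upper bound is the maximum and the greatest lower bound is the minimum, so identity (5) can be sanity-checked directly. This short Python sketch is an added illustration; the sample set `A` is arbitrary.

```python
from fractions import Fraction as F
from itertools import product

# A finite sample set, so LUB = max and GLB = min.
A = [F(3, 2), F(-1, 4), F(2), F(1, 3)]

spread = max(A) - min(A)                          # LUB A - GLB A
diffs = [x - y for x, y in product(A, A)]         # the set A - A
max_diff = max(diffs)                             # LUB (A - A)
max_abs_diff = max(abs(d) for d in diffs)         # LUB {|x - y|}

assert spread == max_diff == max_abs_diff
print(spread)
```

The finite case does not replace the proof, which is needed for infinite sets such as $\{f(x) : x \in [x_{i-1}, x_i]\}$.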

4 Examples of Integrable Functions

Example 18 (Step Functions). Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Assume that $|f(x)| \le M$ for $x \in [a, b]$. Let $c \in (a, b)$. Assume that $f(x) = A$ on $[a, c)$ and $f(x) = B$ on $(c, b]$. Let $f(c) = C$. We claim that $f$ is integrable. Let $P$ be a partition such that $c \in (x_{j-1}, x_j)$. Then we have
\[ U(f, P) - L(f, P) = (\max\{A, B, C\} - \min\{A, B, C\})(x_j - x_{j-1}) \le 2M(x_j - x_{j-1}). \]

If $\varepsilon$ is given, we need only choose a partition $P$ such that the length of the subinterval containing $c$ in its interior is sufficiently small.
For example, choose $N \in \mathbb{N}$ such that $\frac{4M}{N} < \varepsilon$. Consider the partition $P = \{a, c - \frac{1}{N}, c + \frac{1}{N}, b\}$. We assume that $N$ is so large that $a < c - \frac{1}{N}$ and $c + \frac{1}{N} < b$. Then $U(f, P) - L(f, P) \le 2M \times \frac{2}{N} < \varepsilon$.
This argument is easily adapted to step functions. A function $f : [a, b] \to \mathbb{R}$ is a step function if there exists a finite set of points $\{t_i : 1 \le i \le r\} \subset [a, b]$ such that on each of the subintervals in $[a, b] \setminus \{t_i : 1 \le i \le r\}$, the function is a constant.
Let us be explicit. Assume without loss of generality $a \le t_1 < t_2 < \cdots < t_r \le b$. We have subintervals of the form $[a, t_1)$, $(t_{i-1}, t_i)$ and $(t_r, b]$. Then the assumption is that $f$ is a constant $c_i$ on the $i$-th interval.
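
The choice of $N$ in the single-jump case of Example 18 can be made mechanical. The sketch below is an added illustration, not the author's code: the function names `gap_partition` and `gap` are hypothetical, and the sample values reuse the step function of Example 5 ($A = 1$, $B = 2$, $C = 10$ on $[-1, 1]$ with $c = 0$ and bound $M = 10$).

```python
import math
from fractions import Fraction as F

def gap_partition(a, b, c, M, eps):
    """Return N and the partition {a, c - 1/N, c + 1/N, b} with 4M/N < eps."""
    N = math.floor(4 * M / eps) + 1   # smallest integer strictly above 4M/eps
    # Enlarge N until both inner nodes lie strictly inside (a, b).
    while not (a < c - F(1, N) and c + F(1, N) < b):
        N += 1
    return N, [a, c - F(1, N), c + F(1, N), b]

def gap(A, B, C, partition):
    """U(f, P) - L(f, P) for f = A on [a, c), f(c) = C, f = B on (c, b]:
    only the middle subinterval, which contains c in its interior, contributes."""
    _, s, t, _ = partition
    return (max(A, B, C) - min(A, B, C)) * (t - s)

eps = F(1, 10)
N, P = gap_partition(F(-1), F(1), F(0), F(10), eps)
print(N, gap(F(1), F(2), F(10), P))
assert gap(F(1), F(2), F(10), P) < eps
```

The bound $2M \times \frac{2}{N}$ is crude; the actual gap uses $\max\{A, B, C\} - \min\{A, B, C\}$, which is at most $2M$.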
Example 19. We now look at a very important but easy example. Fix $c$ such that $a < c \le b$. Define the Heaviside function at $c$ as follows:
\[ H_c(x) := \begin{cases} 0 & \text{if } x < c, \\ 1 & \text{if } x \ge c. \end{cases} \]

Note that $H_c$ is an increasing function. We let $\alpha := H_c$. Let $f : [a, b] \to \mathbb{R}$ be a bounded function which is continuous at $c$. We claim that $f$ is $\alpha$-integrable and we have $\int_a^b f\,d\alpha = f(c)$. Let $P$ be any partition. Let $k$ be the unique index such that $x_{k-1} < c \le x_k$. Then we have the following:

For $i < k$, $\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) = 0 - 0 = 0$,
For $i = k$, $\Delta\alpha_k = \alpha(x_k) - \alpha(x_{k-1}) = 1 - 0 = 1$,
For $i > k$, $\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) = 1 - 1 = 0$.

It follows that $U(f, P, \alpha) = M_k$ and $L(f, P, \alpha) = m_k$. We wish to apply Riemann’s criterion. We have $U(f, P, \alpha) - L(f, P, \alpha) = M_k - m_k$. Let $\varepsilon > 0$ be given. By the continuity of $f$ at $c$, there exists $\delta > 0$ such that $f(c) - \varepsilon < f(t) < f(c) + \varepsilon$ for $t \in (c - \delta, c + \delta) \cap [a, b]$. We choose a partition $P$ such that each of the subintervals is of length less than $\delta$. If $k$ is as above, then we have $f(c) - \varepsilon \le m_k \le M_k \le f(c) + \varepsilon$. As a consequence, we have $U(f, P, \alpha) - L(f, P, \alpha) \le 2\varepsilon$. This proves the $\alpha$-integrability of $f$. What is $\int_a^b f\,d\alpha$? Observe that for such partitions we have
\[ f(c) - \varepsilon \le L(f, P, \alpha) \le L(f, \alpha) = U(f, \alpha) \le U(f, P, \alpha) \le f(c) + \varepsilon. \]
Thus, $\left| \int_a^b f\,d\alpha - f(c) \right| \le \varepsilon$. Since $\varepsilon > 0$ is arbitrary, the result follows.
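
The way the Heaviside integrator picks out a single subinterval can be seen in a short computation. The following Python sketch is an added illustration; the names `heaviside` and `rs_sum`, the test function $f(x) = x^2 + 1$, the point $c = 1/3$ and the use of right endpoints as tags are all assumptions made here.

```python
from fractions import Fraction as F

def heaviside(c):
    """H_c(x) = 0 for x < c and 1 for x >= c."""
    return lambda x: 1 if x >= c else 0

def rs_sum(f, alpha, nodes, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(f(t) * (alpha(r) - alpha(l))
               for l, r, t in zip(nodes, nodes[1:], tags))

a, b, c = F(0), F(1), F(1, 3)
f = lambda x: x * x + 1          # continuous at c
alpha = heaviside(c)

n = 1000
nodes = [a + F(i, n) * (b - a) for i in range(n + 1)]
tags = nodes[1:]                 # right endpoint of each subinterval

value = rs_sum(f, alpha, nodes, tags)
# Only the subinterval with x_{k-1} < c <= x_k has a nonzero Delta alpha,
# so the sum is f evaluated at the single tag of that subinterval.
print(float(value), float(f(c)))
```

Refining the partition moves the surviving tag closer to $c$, so by continuity of $f$ at $c$ the sums approach $f(c)$.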
¯

Ex. 20. Prove that the result of the last example holds true if we relax the assumption of f by
requiring that f is continuous from the left. Hint: We may assume that c is the right endpoint
of the subinterval (x k−1 , x k ].

Ex. 21. This is an exploratory question. (i) What happens if one assumes right continuity?
(ii) What happens if we replace ≥ by ≤ in the definition of the Heaviside function?

Theorem 22. (i) If f is continuous on [a, b], then f is α-integrable.


(ii) If f is monotone and if α is continuous, then f is α-integrable.

Proof. (i) Please review Remark 16. Given $\varepsilon > 0$, we need to find a partition $P$ of $[a, b]$ such that $\sum_{i=1}^{n} (M_i - m_i)\Delta\alpha_i < \varepsilon$. Since $f$ is continuous on any subinterval $[x_{i-1}, x_i]$, there exist $s_i, t_i \in [x_{i-1}, x_i]$ such that $f(t_i) = M_i$ and $f(s_i) = m_i$. Hence, as explained in the Remark quoted above, if $s_i, t_i$ are close to each other we may ensure that $f(t_i) - f(s_i) = M_i - m_i < \varepsilon$. A little thought will convince us to invoke the uniform continuity of $f$ on the closed and bounded interval $[a, b]$. Let $\delta > 0$ be such that if $|s - t| < \delta$, then $|f(s) - f(t)| < \varepsilon$. Choose $N \in \mathbb{N}$ such that $(b - a)/N < \delta$. Let $P$ be the partition whose $i$-th node is $x_i := a + \frac{i}{N}(b - a)$. Then if $s_i$ and $t_i$ are as above, it is clear that $M_i - m_i < \varepsilon$. Hence we have
\[ U(f, P, \alpha) - L(f, P, \alpha) = \sum_{i=1}^{N} (M_i - m_i)[\alpha(x_i) - \alpha(x_{i-1})] \le \varepsilon[\alpha(b) - \alpha(a)]. \]

Since $\varepsilon > 0$ is arbitrary, we conclude that $f$ is $\alpha$-integrable.


(ii) Revisit Example 6. Note that (1) establishes the integrability of $f$ (where $\alpha(x) = x$). What made the proof work? We focused on partitions in which all subintervals have the same length. This facilitated the estimate of $U(f, P) - L(f, P)$. If we wish to extend the result to an arbitrary continuous increasing integrator $\alpha$, we need to find a partition of $[\alpha(a), \alpha(b)]$ into subintervals of equal length and this, in turn, gives rise to a partition of $[a, b]$. Let us attend to the details.
Let $y_k := \alpha(a) + \frac{k}{n}(\alpha(b) - \alpha(a))$, $0 \le k \le n$. Since $\alpha$ is increasing and continuous, we note that $\alpha([a, b]) = [\alpha(a), \alpha(b)]$. Now $y_k \in [\alpha(a), \alpha(b)]$. Hence by the intermediate value theorem, there exists $x_k \in [a, b]$ such that $\alpha(x_k) = y_k$; we take $x_0 = a$ and $x_n = b$. Note also that $x_k \le x_{k+1}$ since $\alpha$ is increasing. Let $P = \{x_0, \ldots, x_k, \ldots, x_n\}$ be the partition of $[a, b]$. We can adapt the argument in Example 6 (for increasing $f$) to arrive at $U(f, P, \alpha) - L(f, P, \alpha) = \frac{\alpha(b) - \alpha(a)}{n}[f(b) - f(a)]$.
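
The construction in (ii) can be sketched numerically. This is an added illustration under stated assumptions: the integrator $\alpha(x) = x^3$ and the function $f(x) = x^2$ on $[0, 1]$ are chosen here, the nodes $x_k$ are found by bisection (a numerical stand-in for the intermediate value theorem), and floating-point arithmetic means the identity holds only up to rounding. The helper names are hypothetical.

```python
def solve_increasing(alpha, y, lo, hi, iters=60):
    """Bisection: approximate x in [lo, hi] with alpha(x) = y,
    for a continuous increasing alpha with alpha(lo) <= y <= alpha(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if alpha(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def equal_alpha_partition(alpha, a, b, n):
    """Nodes x_k with alpha(x_k) = alpha(a) + (k/n)(alpha(b) - alpha(a))."""
    ys = [alpha(a) + k * (alpha(b) - alpha(a)) / n for k in range(n + 1)]
    inner = [solve_increasing(alpha, y, a, b) for y in ys[1:-1]]
    return [a] + inner + [b]          # x_0 = a and x_n = b by choice

alpha = lambda x: x ** 3              # continuous and increasing on [0, 1]
f = lambda x: x * x                   # increasing on [0, 1]
a, b, n = 0.0, 1.0, 50

xs = equal_alpha_partition(alpha, a, b, n)
# For increasing f: U - L = sum of (f(x_k) - f(x_{k-1})) * Delta alpha_k.
gap = sum((f(r) - f(l)) * (alpha(r) - alpha(l)) for l, r in zip(xs, xs[1:]))
expected = (alpha(b) - alpha(a)) / n * (f(b) - f(a))
print(gap, expected)
```

The nodes are not equally spaced in $x$; they are equally spaced in $\alpha$, which is what makes the sum telescope.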

Remark 23. Go through the proof of (ii) of Theorem 22 again. You may be tempted to say that
it suffices to assume that α : [a, b] → [α(a), α(b)] is onto. If you remember your real analysis
well, especially the study of monotone functions, you may recall that a monotone function on
an interval is continuous iff its range is an interval. (This follows from the intermediate value
theorem. See pages 88-89 of [1], especially Proposition 3.5.5 and its corollary. It is always
worth reviewing as it makes you gain mastery over the subject.) So, nothing gained!

The next theorem improves (i) of Theorem 22.

Theorem 24. Let f : [a, b] → R be a bounded function. Assume that f is continuous on [a, b]
except at a finite number of points. Let α : [a, b] → R be increasing. Assume that α is continuous
at the points of discontinuity of f . Then f is α-integrable.

Proof. We shall walk through a proof assuming that $f$ is discontinuous at only one point $c \in (a, b)$. The general case can be proved along similar lines.
As per the strategy outlined in Remark 16, $c$ is a “bad” point. Let $P$ be a partition such that $c \in [x_{j-1}, x_j]$. Then $U(f, P, \alpha) - L(f, P, \alpha) = \sum_{i \ne j} (M_i - m_i)\Delta\alpha_i + (M_j - m_j)(\alpha(x_j) - \alpha(x_{j-1}))$. Thanks to the continuity of $f$ on $[a, x_{j-1}]$ and $[x_j, b]$ we are confident of ensuring that the first summand on the right is as small as we please. If $C > 0$ is such that $|f(x)| \le C$ for $x \in [a, b]$, then the term $(M_j - m_j)[\alpha(x_j) - \alpha(x_{j-1})] \le 2C[\alpha(x_j) - \alpha(x_{j-1})]$. Thus we should aim to “control” $\alpha(x_j) - \alpha(x_{j-1})$. This is where we invoke the continuity of $\alpha$ at $c$. Choose $s < c < t$ so close to $c$ that $\alpha(t) - \alpha(s)$ is as small as we please. Let us look at $[a, b] = [a, s] \cup [s, t] \cup [t, b]$. The textbook proof is ready.
Let $\varepsilon > 0$ be given. Let $C > 0$ be such that $|f(x)| \le C$ for $x \in [a, b]$. Since $\alpha$ is continuous at $c$, there exists $\delta > 0$ such that $|x - c| < \delta \implies |\alpha(x) - \alpha(c)| < \frac{\varepsilon}{12C}$. Let $s := c - \frac{\delta}{2}$ and $t := c + \frac{\delta}{2}$. Then $|\alpha(t) - \alpha(s)| = \alpha(t) - \alpha(s) < \frac{\varepsilon}{6C}$. Consider the intervals $[a, s]$ and $[t, b]$. Since $f$ is continuous on each of them there exist partitions $P_1 := \{a = s_0 < \cdots < s_m = s\}$ and $P_2 := \{t = t_0 < \cdots < t_n = b\}$ such that $U(f, P_i, \alpha) - L(f, P_i, \alpha) < \frac{\varepsilon}{3}$ for $i = 1, 2$. (If you are very pedantic, then the “$f$” in $U(f, P_1, \alpha)$ is the restriction of $f$ to $[a, s]$ and so on!) Consider $P = P_1 \cup P_2$. Let $M' = \operatorname{LUB}\{f(x) : x \in [s, t]\}$ and $m' = \operatorname{GLB}\{f(x) : x \in [s, t]\}$. We then have
\[ \begin{aligned}
U(f, P, \alpha) - L(f, P, \alpha)
&= [U(f, P_1, \alpha) - L(f, P_1, \alpha)] + (M' - m')[\alpha(t) - \alpha(s)] + [U(f, P_2, \alpha) - L(f, P_2, \alpha)] \\
&< \frac{\varepsilon}{3} + 2C \cdot \frac{\varepsilon}{6C} + \frac{\varepsilon}{3} = \varepsilon.
\end{aligned} \]

This proves that $f$ is $\alpha$-integrable.
Can you now work out the proof of the general case? If $c_1, \ldots, c_r$ are the points of discontinuity of $f$, choose $\delta > 0$ in such a way that $\{(c_i - \delta, c_i + \delta) : 1 \le i \le r\}$ are pairwise disjoint. Let $s_i := c_i - \frac{\delta}{2}$ etc. What do we want about $\alpha(t_i) - \alpha(s_i)$? What kind of estimate do you want on $U(f, P_i, \alpha) - L(f, P_i, \alpha)$ where $P_i$ is a partition of $[t_{i-1}, s_i]$? Draw pictures for the final partition $P$. On how many subintervals is the ‘control’ of $U(f, P_i, \alpha) - L(f, P_i, \alpha)$ easy? What kind of estimate would you like to have on each of them?

Remark 25. Revisit Example 18. Let f : [a, b] → R be a step function. Then f is continuous
except at finitely many points. If we let α(x) = x, then f is integrable on [a, b] by the last
theorem. This was proved already in Example 18.

Remark 26. There is a result due to Lebesgue which characterizes the integrability of f in
terms of the ‘size’ (measure or length) of the set of discontinuities. (See a forthcoming book.)

A particular case: if f is bounded and the set of discontinuities of f is countable, then the function is integrable.
For example, Thomae’s function is integrable. See Example 27.

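Example 27 (Integrability of Thomae’s Function). Let $f : [0, 1] \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational}, \\ 1/q, & \text{if } x \in \mathbb{Q},\ x = p/q \text{ with } p, q \in \mathbb{N} \text{ and } \gcd(p, q) = 1. \end{cases} \]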

Given $\varepsilon > 0$, we must find a partition $P$ such that $U(f, P) - L(f, P) < \varepsilon$. By the density of irrationals, in any subinterval $[x_i, x_{i+1}]$ of a partition $P$ of $[0, 1]$, irrationals exist and hence the $m_i$’s are zero and hence $L(f, P) = 0$. Hence we need only show $U(f, P) < \varepsilon$. Let $n \in \mathbb{N}$ be such that $1/n < \varepsilon$. The set $A_n := \{r \in \mathbb{Q} \cap [0, 1] : f(r) > 1/n\}$ is finite. If $r \in A_n$ and $r \in [x_{i-1}, x_i]$, then $M_i(f) \ge \frac{1}{n}$. We call such an interval ‘bad’. We employ the divide and conquer method to show that the sum of the lengths of such bad intervals is ‘small’.
Let $\varepsilon > 0$ be given. Choose $k \in \mathbb{N}$ such that $\frac{1}{k} < \varepsilon/2$. There exists a finite number, say, $N$ of rational numbers $p/q$ in $[0, 1]$ with $q \le k$. Denote them by $\{r_j : 1 \le j \le N\}$. Let $0 < \delta < \varepsilon/(4N)$. Choose a partition $P = \{x_0, \ldots, x_n\}$ of $[0, 1]$ such that
\[ \max\{|x_{i+1} - x_i| : 0 \le i \le n - 1\} < \delta. \]

Let $A := \{i : r_j \in [x_i, x_{i+1}] \text{ for some } j\}$, and $B := \{0, \ldots, n - 1\} \setminus A$. Note that the number of elements in $A$ will be at most $2N$. (Why $2N$? Some $r_j$ could be the right endpoint of one subinterval and the left endpoint of the next!) For $i \in A$, we have $M_i \le 1$. For $j \in B$, we have $M_j < 1/k$. Hence
\[ \begin{aligned}
U(f, P) &= \sum_{i=0}^{n-1} M_i (x_{i+1} - x_i) \\
&= \sum_{i \in A} M_i (x_{i+1} - x_i) + \sum_{j \in B} M_j (x_{j+1} - x_j) \\
&\le (2N)\delta + \frac{1}{k} \sum_{j \in B} (x_{j+1} - x_j) \\
&\le (2N)\delta + \frac{1}{k} \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned} \]
In the above, we used the fact that $\sum_{j \in B} (x_{j+1} - x_j)$ is the sum of the lengths of the non-overlapping subintervals indexed by $B$, and hence it is at most 1.
Thus, for any $\varepsilon > 0$, we have found a partition $P_\varepsilon$ such that $U(f, P_\varepsilon) < \varepsilon$. It follows that $\operatorname{GLB}\{U(f, P) : P \text{ is a partition of } [0, 1]\} = 0$. Hence $\int_0^1 f = 0$. Of course, we could use a simpler argument. Since $f$ is integrable, and since each lower sum is zero, it follows that the lower integral is 0. Hence $\int_0^1 f = 0$. (Always explore different ways of looking at the same thing!)
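
The upper sums of Thomae's function on uniform partitions can be computed exactly, which makes the decay visible. The sketch below is an added illustration and not the author's argument: on a subinterval with rational endpoints, the supremum of $f$ is $1/q$ for the smallest denominator $q$ of a rational in that subinterval. The helper names are hypothetical, and the code assumes the convention $f(0) = 1$ (treating $0 = 0/1$), which the definition above leaves open.

```python
from fractions import Fraction as F
from math import ceil, floor

def sup_thomae(left, right):
    """LUB of Thomae's function on [left, right] for rational endpoints in [0, 1]:
    1/q for the smallest q such that some integer p has left <= p/q <= right.
    Assumption: 0 is treated as 0/1, so f(0) = 1."""
    q = 1
    while ceil(q * left) > floor(q * right):
        q += 1
    return F(1, q)

def upper_sum(n):
    """U(f, P_n) for the uniform partition of [0, 1] into n equal parts."""
    return sum(sup_thomae(F(i, n), F(i + 1, n)) * F(1, n) for i in range(n))

for n in (4, 20, 100):
    print(n, float(upper_sum(n)))
```

Since $P_{20}$ refines $P_4$ and $P_{100}$ refines $P_{20}$, the printed values cannot increase, in line with Theorem 13 (i).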

5 Riemann’s Approach to Integration

We now introduce the Riemann sum approach to α-integrability.


Definition 28. Let $f, \alpha : [a, b] \to \mathbb{R}$ be bounded functions. (At present we do not insist on the monotonicity of $\alpha$.) Let $P = \{x_0, \ldots, x_n\}$ be a partition of $[a, b]$. Let $t_i \in [x_{i-1}, x_i]$ for $1 \le i \le n$ be arbitrary. We let $\mathbf{t} := (t_1, \ldots, t_n)$ and call it a set of tags. The Riemann-Stieltjes sum is defined as follows:
\[ S(f, P, \mathbf{t}, \alpha) = \sum_{i=1}^{n} f(t_i)\Delta\alpha_i. \]
We say that $f$ is $\alpha$-integrable on $[a, b]$ if there exists $A \in \mathbb{R}$ such that for any given $\varepsilon > 0$, there exists a partition $P$ such that for any refinement $Q$ of $P$ and for any set of tags $\mathbf{t}$ in $Q$, we have $|S(f, Q, \mathbf{t}, \alpha) - A| < \varepsilon$. It is easy to see that such an $A$ is unique. We call $A$ the Riemann-Stieltjes integral of $f$ with respect to the integrator $\alpha$.
We refer to this as the Riemann sum approach to integration. The earlier one will be referred to as the Darboux sum (upper/lower sum) approach to integration.

We now show that the present definition of α-integral is equivalent to the earlier definition
if the integrator α is increasing.
Theorem 29. Let f : [a, b] → R be bounded. Let α : [a, b] → R be increasing. Then f is α-
integrable according to the first definition iff it is α-integrable according to Definition 28. In
such a case, both the integrals are the same.

Proof. Let $f$ be integrable according to Definition 28 with the integral $A$. We need to show that $f$ is $\alpha$-integrable according to the first definition. We plan to use Riemann’s criterion to achieve this. Let $\varepsilon > 0$ be given. Then by definition there exists a partition $P$ such that for any partition $Q$ which refines $P$ and for any set of tags $\mathbf{t}$, we have $\left| \sum_{i=1}^{n} f(t_i)\Delta\alpha_i - A \right| < \varepsilon$. In particular, this holds for $Q = P$.
We now choose a sequence $\mathbf{t}_k := (t_{k1}, \ldots, t_{kn})$ of tags in $P$ as follows: $f(t_{ki}) > M_i - \frac{1}{k}$ for $k \in \mathbb{N}$. Note that we have $M_i - \frac{1}{k} < f(t_{ki}) \le M_i$. It follows that $\lim_k \sum_i f(t_{ki})\Delta\alpha_i = U(f, P, \alpha)$. Since $\left| \sum_{i=1}^{n} f(t_{ki})\Delta\alpha_i - A \right| < \varepsilon$ for each $k$, it leads us to conclude that $|U(f, P, \alpha) - A| \le \varepsilon$. In a similar way, we arrive at $|L(f, P, \alpha) - A| \le \varepsilon$. Hence $|U(f, P, \alpha) - L(f, P, \alpha)| \le 2\varepsilon$. This establishes the $\alpha$-integrability of $f$ according to the first definition. Also it shows that $A = U(f, \alpha) = L(f, \alpha)$.


We now prove the converse. Let $A := U(f, \alpha) = L(f, \alpha)$. We shall show that $f$ is $\alpha$-integrable as per Definition 28. Let $\varepsilon > 0$ be given. Then there exist partitions $P_1$ and $P_2$ such that
\[ A - \varepsilon < L(f, P_1, \alpha) \quad\text{and}\quad U(f, P_2, \alpha) < A + \varepsilon. \tag{7} \]
Note that (7) remains true for any partition $P$ which refines both $P_1$ and $P_2$. Let $\mathbf{t}$ be a set of tags in such a partition $P$. We then have
\[ -\varepsilon < L(f, P, \alpha) - A \le \sum_{i=1}^{n} f(t_i)\Delta\alpha_i - A \le U(f, P, \alpha) - A < \varepsilon. \]

We conclude that $f$ is $\alpha$-integrable as per Definition 28 with the integral $A$.
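
The sandwich $L(f, P, \alpha) \le S(f, P, \mathbf{t}, \alpha) \le U(f, P, \alpha)$ used in the proof can be checked on an example with arbitrary tags. The Python sketch below is an added illustration: $f(x) = x^2$ and $\alpha(x) = x^3$ on $[0, 1]$ are chosen here (both increasing, so infima and suprema are endpoint values), the tags are drawn pseudo-randomly, and the function name is hypothetical.

```python
import random
from fractions import Fraction as F

def sums_for_increasing_f(f, alpha, nodes, tags):
    """Return (L, S, U): lower sum, tagged Riemann-Stieltjes sum, upper sum.
    Assumes f is increasing, so m_i = f(x_{i-1}) and M_i = f(x_i)."""
    lower = tagged = upper = F(0)
    for left, right, tag in zip(nodes, nodes[1:], tags):
        d_alpha = alpha(right) - alpha(left)
        lower += f(left) * d_alpha
        tagged += f(tag) * d_alpha
        upper += f(right) * d_alpha
    return lower, tagged, upper

random.seed(0)
f = lambda x: x * x
alpha = lambda x: x ** 3

n = 200
nodes = [F(i, n) for i in range(n + 1)]
# One arbitrary tag in each subinterval [x_{i-1}, x_i].
tags = [l + (r - l) * F(random.randint(0, 100), 100)
        for l, r in zip(nodes, nodes[1:])]

L, S, U = sums_for_increasing_f(f, alpha, nodes, tags)
assert L <= S <= U
print(float(L), float(S), float(U))
```

Whatever tags are chosen, the tagged sum is trapped between the two Darboux-type sums, so it is forced towards the common value as the partition is refined.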


Remark 30. Definition 28 has certain advantages. (i) We do not have to impose any condition on the integrator except that it is bounded. (ii) The analogues of (i)–(iii) of Theorem 31 are easier to prove using the second definition, and that too in the more general setting of any integrator α! But a look at (iv) and (v) of the same theorem will show that we cannot hope to prove analogous results in the general setting.

6 Class of Integrable Functions

You may like to watch the videos 7 and 8 in [5] either as you read this section or you may read
this section after watching them.
Theorem 31. Let α : [a, b] → R be a bounded function. Let R(α) denote the set of α-integrable
functions on [a, b]. Then the following are true.
i) $R(\alpha)$ is a vector space over $\mathbb{R}$ and the map $f \mapsto \int_a^b f\,d\alpha$ is linear.
ii) Let $\beta : [a, b] \to \mathbb{R}$ be bounded. Let $\sigma := \lambda\alpha + \mu\beta$ where $\lambda, \mu \in \mathbb{R}$. If $f$ is $\alpha$- as well as
$\beta$-integrable, then $f$ is $\sigma$-integrable and we have
\[ \int_a^b f\,d\sigma = \lambda \int_a^b f\,d\alpha + \mu \int_a^b f\,d\beta. \tag{8} \]

iii) Let $a < c < b$. Assume that any two of the integrals in (9) exist. Then the third integral exists
and we have
\[ \int_a^b f\,d\alpha = \int_a^c f\,d\alpha + \int_c^b f\,d\alpha. \tag{9} \]

iv) Assume that $\alpha$ is increasing. Let $f \in R(\alpha)$ and $f \ge 0$ on $[a, b]$. Then $\int_a^b f\,d\alpha \ge 0$. More
generally, let $f, g \in R(\alpha)$. Assume that $f(x) \le g(x)$ for $x \in [a, b]$. Then we have
\[ \int_a^b f\,d\alpha \le \int_a^b g\,d\alpha. \tag{10} \]

v) Assume that $\alpha$ is increasing. Let $f \in R(\alpha)$. Assume that $|f(x)| \le M$ for $x \in [a, b]$. Then
$|f| \in R(\alpha)$ and we have
\[ \left| \int_a^b f\,d\alpha \right| \le \int_a^b |f|\,d\alpha \le M(\alpha(b) - \alpha(a)). \tag{11} \]

vi) Assume that $\alpha$ is increasing. Let $f \in R(\alpha)$. Then $f^2 \in R(\alpha)$.

vii) Assume that $\alpha$ is increasing. Let $f, g \in R(\alpha)$. Then $fg \in R(\alpha)$.

Proof. Just to make sure that you are alert while reading the statements: (i)–(iii) are true for
any integrator α. We do not require α to be increasing for the validity of (i)–(iii).
The proofs will be instructive in the sense that we shall employ one of the equivalent
definitions (for the case on hand) which makes our lives easy!
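(i) Let $\lambda, \mu \in \mathbb{R}$. Observe that
\[ S(\lambda f + \mu g, P, t, \alpha) = \sum_{i=1}^n (\lambda f(t_i) + \mu g(t_i))\Delta\alpha_i = \lambda S(f, P, t, \alpha) + \mu S(g, P, t, \alpha). \]

Hence we obtain
\[ \left| S(\lambda f + \mu g, P, t, \alpha) - \left( \lambda \int_a^b f\,d\alpha + \mu \int_a^b g\,d\alpha \right) \right|
\le |\lambda| \left| S(f, P, t, \alpha) - \int_a^b f\,d\alpha \right| + |\mu| \left| S(g, P, t, \alpha) - \int_a^b g\,d\alpha \right|. \tag{12} \]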

Do you see how to write a textbook proof now?


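Given $\varepsilon > 0$, choose partitions $P_1$ and $P_2$ such that, for any partition $Q$ finer than $P_1$
(respectively, finer than $P_2$) and any set of tags $t$ in $Q$, we have
\[ \left| S(f, Q, t, \alpha) - \int_a^b f\,d\alpha \right| < \frac{\varepsilon}{2(1 + |\lambda|)} \quad\text{and}\quad \left| S(g, Q, t, \alpha) - \int_a^b g\,d\alpha \right| < \frac{\varepsilon}{2(1 + |\mu|)}. \]

Let $P := P_1 \cup P_2$. Then, for any partition $Q$ finer than $P$ and any set of tags $t$ in $Q$, it follows from (12) that
\[ \left| S(\lambda f + \mu g, Q, t, \alpha) - \left( \lambda \int_a^b f\,d\alpha + \mu \int_a^b g\,d\alpha \right) \right| < \varepsilon. \]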

This completes the proof of (i).


Note that ii) has no analogue in the theory of Riemann integrals.
The proof of (ii) is very similar to that of (i). Start with the question: How is
$S(f, P, t, \lambda\alpha + \mu\beta)$ expressed in terms of $\lambda$, $\mu$, $S(f, P, t, \alpha)$ and $S(f, P, t, \beta)$?
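If you wish to test your answer to this question numerically, here is a minimal Python sketch. It checks, on one partition, that the Riemann-Stieltjes sum is linear in the integrator. The helper `rs_sum`, the functions, the partition and the tags are all hypothetical choices made for illustration; they are not taken from the article.

```python
def rs_sum(f, alpha, nodes, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(x1) - alpha(x0))
        for x0, x1, t in zip(nodes, nodes[1:], tags)
    )

# Hypothetical data, chosen only for illustration.
f = lambda x: x * x
alpha = lambda x: x ** 3        # an increasing integrator on [0, 1]
beta = lambda x: -2.0 * x       # a decreasing one: monotonicity is not needed here
lam, mu = 1.5, -0.5
sigma = lambda x: lam * alpha(x) + mu * beta(x)

nodes = [0.0, 0.2, 0.5, 0.9, 1.0]   # a partition P of [0, 1]
tags = [0.1, 0.3, 0.7, 1.0]         # one tag in each subinterval

lhs = rs_sum(f, sigma, nodes, tags)
rhs = lam * rs_sum(f, alpha, nodes, tags) + mu * rs_sum(f, beta, nodes, tags)
print(lhs, rhs)   # the two numbers agree up to rounding
```

The agreement holds for every partition and every set of tags, which is why the Riemann sum definition makes (ii) painless.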

Remark: This may be omitted on first reading.
Just for fun: If you wish to prove (ii) using Darboux sum approach, you need to
assume λ, µ are positive. Let us go through a proof when α and β are increasing and
λ = µ = 1.
To prove ii), we start with an observation: ∆σi = ∆αi + ∆βi . An immediate conse-
quence is: L( f , P, σ) = L( f , P, α) + L( f , P, β) for any partition P . Similar result holds
for the upper sums.
We first show that f is σ-integrable, using Riemann’s criterion. Let ε > 0 be given.
Then we can find partitions P 1 and P 2 so that

U ( f , P 1 , α) − L( f , P 1 , α) < ε/2 and U ( f , P 2 , β) − L( f , P 2 , β) < ε/2.

For any common refinement P of P 1 and P 2 , we have

U ( f , P, σ) − L( f , P, σ) = U ( f , P, α) − L( f , P, α) +U ( f , P, β) − L( f , P, β) < ε.

This implies that f is σ-integrable.


We now prove (8). For any partition P , we have

\begin{align*}
L(f, P, \sigma) &= L(f, P, \alpha) + L(f, P, \beta) \\
&\le \int_a^b f\,d\alpha + \int_a^b f\,d\beta \\
&\le U(f, P, \alpha) + U(f, P, \beta) \\
&= U(f, P, \sigma).
\end{align*}

Can you see how (8) follows from this?

Let A, B ⊂ R be nonempty. Assume that for each x ∈ A and y ∈ B, we
have x ≤ y. Let α := LUB A and β := GLB B. (Why do they exist?) Let us
assume that α = β. Let c ∈ R be such that x ≤ c ≤ y for x ∈ A and y ∈ B.
Then we claim that α = c = β.
For, c is an upper bound for A and a lower bound for B. Hence α ≤ c
and c ≤ β. Since α = β, the claim follows.

Thus we have proved ii).

We shall now prove one of the three cases of (iii). Assume that $\int_a^c f\,d\alpha$ and $\int_c^b f\,d\alpha$ exist.
Then we are required to show that $f$ is $\alpha$-integrable on $[a, b]$ and that (9) holds.
Let $P$ be a partition of $[a, b]$ such that $c$ is a node, that is, $c \in P$. Let $P_1 = P \cap [a, c]$ and
$P_2 = P \cap [c, b]$. Then $P_1$ and $P_2$ provide partitions of $[a, c]$ and $[c, b]$ respectively.
What is the relation between $S(f, P_1, t_1, \alpha)$, $S(f, P_2, t_2, \alpha)$ and $S(f, P, t, \alpha)$? Of course, it is
easily seen that
\[ S(f, P, t, \alpha) = S(f, P_1, t_1, \alpha) + S(f, P_2, t_2, \alpha). \]

Let $\varepsilon > 0$ be given. Choose partitions $P_1$ and $P_2$ of $[a, c]$ and $[c, b]$ respectively with the
property that for any partitions $Q_1$ finer than $P_1$ and $Q_2$ finer than $P_2$, and for any sets of tags $t_1$ and $t_2$, we have
\[ \left| S(f, Q_1, t_1, \alpha) - \int_a^c f\,d\alpha \right| < \varepsilon/2 \quad\text{and}\quad \left| S(f, Q_2, t_2, \alpha) - \int_c^b f\,d\alpha \right| < \varepsilon/2. \]

Let $P = P_1 \cup P_2$. Let $Q$ be any partition finer than $P$. Note that $Q_1 = Q \cap [a, c]$ and $Q_2 = Q \cap [c, b]$ will be
finer than $P_1$ and $P_2$ respectively. Hence for any set of tags in $Q$ we have
\[ \left| S(f, Q, t, \alpha) - \int_a^c f\,d\alpha - \int_c^b f\,d\alpha \right| < \varepsilon. \]

(Can you justify the inequality quoting the observation made above?) This proves (iii).
Let $f \in R(\alpha)$ and $f \ge 0$. To prove (iv), we need only ask the question: Which of the
definitions is likely to yield a simple proof? Yes, the Darboux sum approach. Again, the next
question is: since $\int_a^b f\,d\alpha$ is the LUB of lower sums and the GLB of upper sums, which sum should
we use? An obvious choice is lower sums. Observe that $L(f, P, \alpha) \ge 0$. Since $\int_a^b f\,d\alpha \ge L(f, P, \alpha)$,
the result follows.

An aside. You could work with upper sums as well as Riemann sums. In each of
these cases, assume that the integral is negative, say $A < 0$. Choose $\varepsilon = -\frac{A}{2}$ to
arrive at a contradiction. Recall how you prove: If $(x_n)$ is a sequence of nonnegative
terms converging to x, then x ≥ 0.

The general case can be deduced easily from this. Consider $g - f$. By (i), $g - f \in R(\alpha)$ and
also $g - f \ge 0$. Hence $\int_a^b (g - f)\,d\alpha \ge 0$. Again by the linearity of the integral,
$\int_a^b (g - f)\,d\alpha = \int_a^b g\,d\alpha - \int_a^b f\,d\alpha \ge 0$. Thus we have proved (iv).

(v) I am sure all of us will think of Darboux sum approach to prove this. The key idea of
the proof is to find the relation between $M_i(f) - m_i(f)$ and $M_i(|f|) - m_i(|f|)$. If you think for
a while, we can formulate a more general question. Let A ⊂ R be a nonempty set which is
both bounded above and bounded below. Let B := {|x| : x ∈ A}. Recall the relation between
LUB A − GLB A and LUB B − GLB B made in Observation 17.
We are now ready to prove (v). We let A := { f (x) : x ∈ [x i −1 , x i ]} and apply the observations
made in the last two paragraphs above. Recall that for any s, t ∈ R, we have ||s| − |t || ≤ |s − t |.
Observe that for $s, t \in [x_{i-1}, x_i]$ we have
\[ \big| |f(s)| - |f(t)| \big| \le |f(s) - f(t)|. \]
The LUB of the left side is $M_i(|f|) - m_i(|f|)$. Similarly the LUB of the right side is $M_i(f) - m_i(f)$.
It follows that
\[ M_i(|f|) - m_i(|f|) \le M_i(f) - m_i(f). \]
Since $\Delta\alpha_i \ge 0$, we see that $[M_i(|f|) - m_i(|f|)]\Delta\alpha_i \le [M_i(f) - m_i(f)]\Delta\alpha_i$. Hence
\[ U(|f|, P, \alpha) - L(|f|, P, \alpha) \le U(f, P, \alpha) - L(f, P, \alpha). \]

Do you see how to prove $|f| \in R(\alpha)$? If yes, which results are needed to establish (11)?

We now prove (vi). The key observation is $x^2 = |x|^2$ for $x \in \mathbb{R}$. We have
\begin{align*}
M_i(f^2) - m_i(f^2) &= (M_i(|f|))^2 - (m_i(|f|))^2 \\
&= [M_i(|f|) + m_i(|f|)] \cdot [M_i(|f|) - m_i(|f|)] \\
&\le 2C\,[M_i(|f|) - m_i(|f|)].
\end{align*}
Here $C := \mathrm{LUB}\,\{|f(x)| : x \in [a, b]\}$. Can you complete the proof now?
The proof of (vii) depends on the algebraic identity: $ab = \frac{1}{2}[(a + b)^2 - a^2 - b^2]$. We have
$2f(x)g(x) = [f(x) + g(x)]^2 - f(x)^2 - g(x)^2$. Since $f, g \in R(\alpha)$, $f + g \in R(\alpha)$ and hence $(f + g)^2, f^2, g^2 \in R(\alpha)$.
Hence the right side lies in $R(\alpha)$ and so the left side $2fg \in R(\alpha)$.
You may compare the proofs of i), iii)–v) with those in Section 6.2 of [1].

The next theorem is a very useful result.

Theorem 32. Let α : [a, b] → R be increasing. Let f ∈ R(α). Assume that m ≤ f (x) ≤ M for
x ∈ [a, b]. Let g : [m, M ] → R be continuous. Then g ◦ f ∈ R(α).

Proof. Let $C > 0$ be such that $|g(y)| \le C$ for $y \in [m, M]$. Let $\varepsilon > 0$ be given. We need to find a
partition $P$ such that
\[ U(g \circ f, P, \alpha) - L(g \circ f, P, \alpha) = \sum_i (M_i(g \circ f) - m_i(g \circ f))\Delta\alpha_i < \varepsilon. \]

We shall use the "Divide and Conquer" trick to estimate the sum. By the uniform continuity of
$g$ on $[m, M]$, for a given $\theta > 0$ (to be chosen later), there corresponds a $\delta > 0$ such that $|g(s) - g(t)| < \theta$
whenever $|s - t| < \delta$. Hence if we let $G := \{i : M_i(f) - m_i(f) < \delta\}$, then we have a control over
$M_i(g \circ f) - m_i(g \circ f)$. Let $B := \{i : M_i(f) - m_i(f) \ge \delta\}$. The sum over $B$ is
\[ \sum_{i \in B} (M_i(g \circ f) - m_i(g \circ f))\Delta\alpha_i \le 2C \sum_{i \in B} \Delta\alpha_i. \]

As per our strategy outlined in Remark 16, we need to choose the partition $P$ so that $\sum_{i \in B} \Delta\alpha_i$
is "small". Note that $\sum_{i \in G} (M_i(g \circ f) - m_i(g \circ f))\Delta\alpha_i \le \sum_{i \in G} \theta\,\Delta\alpha_i \le \theta(\alpha(b) - \alpha(a))$. Let $P$ be
a partition of $[a, b]$ such that $U(f, P, \alpha) - L(f, P, \alpha) < \eta$. Presumably $\eta$ will be something to do
with $\varepsilon$ and will be decided later. We then arrive at the following estimate
\begin{align*}
U(g \circ f, P, \alpha) - L(g \circ f, P, \alpha) &= \sum_{i \in G} (M_i(g \circ f) - m_i(g \circ f))\Delta\alpha_i + \sum_{i \in B} (M_i(g \circ f) - m_i(g \circ f))\Delta\alpha_i \\
&\le \sum_{i \in G} \theta\,\Delta\alpha_i + 2C \sum_{i \in B} \Delta\alpha_i \\
&\le \theta(\alpha(b) - \alpha(a)) + 2C \sum_{i \in B} \Delta\alpha_i.
\end{align*}

We want the first term on the last right expression to be less than $\varepsilon/2$. So, you choose
$0 < \theta < \frac{\varepsilon}{2[1 + \alpha(b) - \alpha(a)]}$.

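We need to estimate $\sum_{i \in B} \Delta\alpha_i$ so that the second term $2C \sum_{i \in B} \Delta\alpha_i$ is less than $\varepsilon/2$. Observe that
\[ \eta > U(f, P, \alpha) - L(f, P, \alpha) \ge \sum_{i \in B} (M_i - m_i)\Delta\alpha_i \ge \delta \sum_{i \in B} \Delta\alpha_i. \]

We find that $\sum_{i \in B} \Delta\alpha_i < \eta/\delta$. So this suggests that we choose a partition $P$ so that
$U(f, P, \alpha) - L(f, P, \alpha) < \frac{\varepsilon\delta}{4C}$.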
Do you think you can write a textbook proof now?
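Let $\varepsilon > 0$ be given. Let $C > 0$ be such that $|g(t)| \le C$ for $t \in [m, M]$. Since $g$ is uniformly
continuous on $[m, M]$, there exists $\delta > 0$ such that
\[ |s - t| < \delta \implies |g(s) - g(t)| < \frac{\varepsilon}{2[1 + \alpha(b) - \alpha(a)]}, \qquad s, t \in [m, M]. \]

Since $f$ is integrable, there exists a partition $P$ of $[a, b]$ such that
\[ U(f, P, \alpha) - L(f, P, \alpha) < \frac{\varepsilon\delta}{4C}. \]
Let $G := \{i : M_i(f) - m_i(f) < \delta\}$ and $B := \{i : M_i(f) - m_i(f) \ge \delta\}$. Then we have
\[ \delta \sum_{i \in B} \Delta\alpha_i \le \sum_{i \in B} (M_i - m_i)\Delta\alpha_i < \frac{\varepsilon\delta}{4C} \implies \sum_{i \in B} \Delta\alpha_i < \frac{\varepsilon}{4C}. \tag{13} \]

For $x, y \in [x_{i-1}, x_i]$, $i \in G$, we see that $|f(x) - f(y)| \le M_i - m_i < \delta$ so that
$|g(f(x)) - g(f(y))| < \frac{\varepsilon}{2[1 + \alpha(b) - \alpha(a)]}$. Observation 17 implies that
\[ M_i(g \circ f) - m_i(g \circ f) \le \frac{\varepsilon}{2[1 + \alpha(b) - \alpha(a)]} \quad\text{for } i \in G. \tag{14} \]

Let us estimate $U(g \circ f, P, \alpha) - L(g \circ f, P, \alpha)$.
\begin{align*}
U(g \circ f, P, \alpha) - L(g \circ f, P, \alpha) &= \sum_{i \in G} [M_i(g \circ f) - m_i(g \circ f)]\Delta\alpha_i + \sum_{i \in B} [M_i(g \circ f) - m_i(g \circ f)]\Delta\alpha_i \\
&\le \frac{\varepsilon}{2[1 + \alpha(b) - \alpha(a)]} \sum_{i \in G} \Delta\alpha_i + 2C \sum_{i \in B} \Delta\alpha_i && \text{by (14)} \\
&< \frac{\varepsilon}{2[1 + \alpha(b) - \alpha(a)]} \sum_{i=1}^n \Delta\alpha_i + 2C\,\frac{\varepsilon}{4C} && \text{by (13)} \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}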
We hope that you enjoyed the proof. This is an example of how one thinks through a proof
and then writes a textbook proof.

Remark 33. Do you realize that (v) and (vi) of Theorem 31 are easy consequences of the last
Theorem 32?

7 Two Important Examples of Riemann-Stieltjes Integrals

You may like to watch the video 11 in [5] either as you read this section or you may read this
section after watching it.
We now give two important examples of Riemann-Stieltjes integrals which are very useful.
Also, the first example shows that an infinite series is a special case of α-integrals.
Example 34. Let $\{t_k : 1 \le k \le N\}$ be a finite set of points in $(a, b]$. (Did you notice the brackets?
Do you see why they are different?) Let $\{c_k : 1 \le k \le N\}$ be a set of nonnegative numbers. Let
$\alpha(x) := \sum_{k=1}^N c_k H_{t_k}(x)$. It is clear that $\alpha$ is increasing. Let $f : [a, b] \to \mathbb{R}$ be continuous. We then
have
\[ \int_a^b f\,d\alpha = \sum_k c_k f(t_k). \tag{15} \]
(What results did we use to assert (15)?)
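Ex. 35. Let $\alpha : [0, N] \to \mathbb{R}$ be the greatest integer function $\alpha(x) = [x]$. Write it as “a sum of
Heaviside functions”. Let $f : [0, N] \to \mathbb{R}$ be continuous. Prove that $\int_0^N f\,d\alpha = \sum_{k=1}^N f(k)$.
(Note that the jumps of $[x]$ on $[0, N]$ occur at $1, \dots, N$, all of which lie in $(0, N]$.)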
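Before attempting the exercise, you may like to experiment numerically. The following is a minimal Python sketch, not a proof: it evaluates a Riemann-Stieltjes sum for the integrator $[x]$ on a fine uniform partition of $[0, N]$ and compares it with the sum of the values of $f$ at the jump points $1, \dots, N$ of $[x]$. The helper `rs_sum`, the choice $f = \cos$, and the values of `N` and `m` are assumptions made for illustration.

```python
import math

def rs_sum(f, alpha, nodes, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(x1) - alpha(x0))
        for x0, x1, t in zip(nodes, nodes[1:], tags)
    )

N, m = 4, 1000                                  # [0, N] cut into N*m equal pieces
nodes = [i / m for i in range(N * m + 1)]
tags = [(x0 + x1) / 2 for x0, x1 in zip(nodes, nodes[1:])]   # midpoint tags

f = math.cos                                    # a hypothetical continuous integrand
approx = rs_sum(f, math.floor, nodes, tags)     # integrator alpha(x) = [x]
jump_sum = sum(f(k) for k in range(1, N + 1))   # f evaluated at the jumps 1, ..., N
print(approx, jump_sum)
```

Only the subintervals that contain a jump of $[x]$ contribute to the sum, so refining the partition pushes the tags of those subintervals towards the integers.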

Let now $\{t_k : k \in \mathbb{N}\}$ be a countable subset of $(a, b]$. Let $(c_k)$ be a sequence of nonnegative
numbers such that the associated series $\sum_k c_k$ is convergent. We define
\[ \alpha(x) := \sum_k c_k H_{t_k}(x), \qquad x \in [a, b]. \tag{16} \]

Then $\alpha(x)$ makes sense (why?) and is an increasing function. We now claim that for any
continuous function $f$ on $[a, b]$ we have $\int_a^b f\,d\alpha = \sum_k c_k f(t_k)$. This is an easy exercise with the
following hint: Given $\varepsilon > 0$, choose $N \in \mathbb{N}$ such that $\sum_{k \ge N+1} c_k < \varepsilon/2$. Write
$\alpha = \sum_{k=1}^N c_k H_{t_k} + \sum_{k \ge N+1} c_k H_{t_k} = \beta_N + \sigma_N$, say.

We need to show that $\sum_{k=1}^n c_k f(t_k) \to \int_a^b f\,d\alpha$. Let $\varepsilon > 0$ be given. Then we need to
find $N$ such that for $n \ge N$, $\left| \int_a^b f\,d\alpha - \sum_{k=1}^n c_k f(t_k) \right| < \varepsilon$. Keep the notation of the
hint. For $n \ge N$, let us write $\alpha = \beta_n + \sigma_n$, in an obvious (?) notation. Then we have
$\int_a^b f\,d\alpha = \int_a^b f\,d\beta_n + \int_a^b f\,d\sigma_n$. We know that $\int_a^b f\,d\beta_n = \sum_{k=1}^n c_k f(t_k)$. Hence we
have, for $n \ge N$,
\begin{align*}
\left| \int_a^b f\,d\alpha - \sum_{k=1}^n c_k f(t_k) \right| &= \left| \int_a^b f\,d\alpha - \int_a^b f\,d\beta_n \right| && \text{(Why?)} \\
&= \left| \int_a^b f\,d\sigma_n \right| && \text{(Why?)} \\
&\le \int_a^b |f|\,d\sigma_n && \text{(Why?)} \\
&\le M(\sigma_n(b) - \sigma_n(a)) && \text{(What is $M$? Why?)} \\
&= M \sum_{k \ge n+1} c_k < M\varepsilon.
\end{align*}
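The estimate above can be watched at work numerically. The following Python sketch is only an illustration under stated assumptions: it takes the hypothetical data $c_k = 2^{-k}$, $t_k = 1/k$ on $[0, 1]$, truncates the series at `K` terms (so it works with $\beta_K$ rather than with $\alpha$ itself), and assumes the convention that $H_t(x) = 1$ exactly when $x \ge t$. The helper `rs_sum` and the integrand $f = \cos$ are likewise choices made here.

```python
import math

def rs_sum(f, alpha, nodes, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(x1) - alpha(x0))
        for x0, x1, t in zip(nodes, nodes[1:], tags)
    )

K = 30                                        # truncation level (the tail is below 2**-30)
c = [2.0 ** -k for k in range(1, K + 1)]      # weights c_k = 2^{-k}
t = [1.0 / k for k in range(1, K + 1)]        # jump points t_k = 1/k in (0, 1]

def beta_K(x):
    """Truncated integrator: sum of c_k over those k with t_k <= x."""
    return sum(ck for ck, tk in zip(c, t) if x >= tk)

n = 2000
nodes = [i / n for i in range(n + 1)]         # uniform partition of [0, 1]
tags = nodes[1:]                              # right end points as tags

f = math.cos
approx = rs_sum(f, beta_K, nodes, tags)
series = sum(ck * f(tk) for ck, tk in zip(c, t))
print(approx, series)
```

With right end point tags, each jump $t_k$ is at most one mesh width away from the tag of its subinterval, so the two numbers differ by at most about $1/n$ here.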

Hence we have proved the following theorem.

Theorem 36. Let $\{t_k : k \in \mathbb{N}\}$ be a countable subset of $(a, b]$. Let $(c_k)$ be a sequence of nonnegative
numbers such that the associated series $\sum_k c_k$ is convergent. Define
\[ \alpha(x) := \sum_k c_k H_{t_k}(x), \qquad x \in [a, b]. \tag{17} \]

Then for any continuous function $f : [a, b] \to \mathbb{R}$, we have
\[ \int_a^b f\,d\alpha = \sum_k c_k f(t_k). \tag{18} \]

The second example is given by the following theorem.

Theorem 37. Let $f, \alpha : [a, b] \to \mathbb{R}$ be bounded. Assume that $\alpha$ is continuously differentiable and
that $f$ is $\alpha$-integrable. Then $f\alpha'$ is integrable on $[a, b]$ and we have $\int_a^b f\,d\alpha = \int_a^b f\alpha'\,dx$.

Proof. It behooves us to consider the Riemann sums. Let $P$ be any partition with tags $t$. Then
we look at
\[ \sum_{i=1}^n f(t_i)\Delta\alpha_i - \sum_{i=1}^n f(t_i)\alpha'(t_i)\Delta x_i. \tag{19} \]

Again it is clear that we wish to apply the mean value theorem to $\Delta\alpha_i$: $\Delta\alpha_i = \alpha'(s_i)\Delta x_i$. Using
this in (19) we arrive at
\[ \sum_{i=1}^n f(t_i)\Delta\alpha_i - \sum_{i=1}^n f(t_i)\alpha'(t_i)\Delta x_i = \sum_{i=1}^n f(t_i)(\alpha'(s_i) - \alpha'(t_i))\Delta x_i. \tag{20} \]

Our aim is to show that these two Riemann sums are close to each other. So the next obvious
step is to use the uniform continuity of $\alpha'$. Given $\varepsilon > 0$, by the uniform continuity of $\alpha'$ there
exists a $\delta > 0$ such that $|s - t| < \delta \implies |\alpha'(s) - \alpha'(t)| < \varepsilon$. Then for any partition $Q$, which is
a refinement of $P$, with the property that the maximum length of the subintervals is less than
$\delta$, it follows from (20) that
\[ \left| \sum_{i=1}^n f(t_i)\Delta\alpha_i - \sum_{i=1}^n f(t_i)\alpha'(t_i)\Delta x_i \right| \le M\varepsilon(b - a). \tag{21} \]

($M$ has the usual meaning!) We choose any partition $Q'$ which is a refinement of $Q$ so that
$\left| \int_a^b f\,d\alpha - \sum_{i=1}^n f(t_i)\Delta\alpha_i \right| < \varepsilon$ for any set of tags. We then see that for any such $Q'$ and any set
of tags
\[ \left| \int_a^b f\,d\alpha - \sum_{i=1}^n f(t_i)\alpha'(t_i)\Delta x_i \right| < (M(b - a) + 1)\varepsilon. \]
This completes the proof.
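As a sanity check on the statement just proved, here is a small Python sketch comparing the two kinds of Riemann sums on a fine partition. The example $f(x) = x$, $\alpha(x) = x^3$ on $[0, 1]$ is a hypothetical choice (for it, both integrals equal $\int_0^1 3x^3\,dx = 3/4$), and `rs_sum` is a helper introduced only for this illustration.

```python
def rs_sum(f, alpha, nodes, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(x1) - alpha(x0))
        for x0, x1, t in zip(nodes, nodes[1:], tags)
    )

f = lambda x: x
alpha = lambda x: x ** 3             # continuously differentiable integrator
dalpha = lambda x: 3.0 * x * x       # its derivative alpha'
identity = lambda x: x               # integrator for the ordinary Riemann integral

n = 1000
nodes = [i / n for i in range(n + 1)]
tags = [(x0 + x1) / 2 for x0, x1 in zip(nodes, nodes[1:])]

stieltjes = rs_sum(f, alpha, nodes, tags)                          # sum f(t_i) * Delta alpha_i
riemann = rs_sum(lambda x: f(x) * dalpha(x), identity, nodes, tags)  # sum f(t_i) alpha'(t_i) Delta x_i
print(stieltjes, riemann)   # both are close to 0.75
```

The gap between the two sums is exactly the right side of (20); it shrinks as the mesh of the partition goes to zero.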

Remark 38. The result above can be proved without assuming that $\alpha'$ is continuous but under
the hypothesis that $\alpha'$ is integrable. See Theorem 39.

Theorem 39. Let $f \in R[a, b]$. Let $\alpha : [a, b] \to \mathbb{R}$ be differentiable. Assume further that
$\alpha' \in R[a, b]$. Then $f \in R(\alpha)$ and we have
\[ \int_a^b f\,d\alpha = \int_a^b f\alpha'\,dx. \tag{22} \]

Proof. Note that $f, \alpha' \in R[a, b]$ implies $f\alpha' \in R[a, b]$. Hence the right side of (22) makes sense.
Let $A := \int_a^b f\alpha'$. We wish to show that $\int_a^b f\,d\alpha = A$. Let $|f(x)| \le C$ for $x \in [a, b]$.

Let $\varepsilon > 0$ be given. Since $f\alpha'$ is integrable and $\alpha'$ is integrable, there exists a (common)
partition $P$ such that for any partition $Q \ge P$, and for any set of tags in $Q$, we have
\[ |S(f\alpha', Q, t) - A| = \left| \sum_i f(t_i)\alpha'(t_i)(x_i - x_{i-1}) - A \right| < \frac{\varepsilon}{2}, \quad\text{and}\quad U(\alpha', Q) - L(\alpha', Q) < \frac{\varepsilon}{2C}. \tag{23} \]

We claim that $|S(f, Q, t, \alpha) - A| < \varepsilon$. Applying the mean value theorem to $\alpha(x_i) - \alpha(x_{i-1})$, we
find $s_i \in [x_{i-1}, x_i]$ such that
\[ S(f, Q, t, \alpha) = \sum_i f(t_i)\Delta\alpha_i = \sum_i f(t_i)\alpha'(s_i)(x_i - x_{i-1}). \]

We have
\begin{align*}
|S(f\alpha', Q, t) - S(f, Q, t, \alpha)| &= \left| \sum_i f(t_i)(\alpha'(t_i) - \alpha'(s_i))(x_i - x_{i-1}) \right| \\
&\le \sum_i |f(t_i)|\,|\alpha'(t_i) - \alpha'(s_i)|\,(x_i - x_{i-1}) \\
&\le \sum_i |f(t_i)|\,(M_i(\alpha') - m_i(\alpha'))(x_i - x_{i-1}) && \text{(Why?)} \\
&\le C\,(U(\alpha', Q) - L(\alpha', Q)) < \varepsilon/2. && \text{(Why?)} \tag{24}
\end{align*}

It follows from (23)-(24) that
\begin{align*}
|S(f, Q, t, \alpha) - A| &= |S(f, Q, t, \alpha) - S(f\alpha', Q, t) + S(f\alpha', Q, t) - A| \\
&\le |S(f, Q, t, \alpha) - S(f\alpha', Q, t)| + |S(f\alpha', Q, t) - A| \\
&< \varepsilon.
\end{align*}
This shows that $\int_a^b f\,d\alpha = A$.

Remark 40. Did you observe something interesting about the proof above? We did not
hesitate to use both the definitions of the integral! To show that $f$ is $\alpha$-integrable, we used the
Riemann sum approach and to estimate a sum, we used Riemann's criterion for the
Darboux sum approach. I thought of the proof after I sent you the file! I have given a textbook
proof above. Can you ‘discover’ the proof starting from a wish to estimate $|S(f, Q, t, \alpha) - A|$?

8 Fundamental Theorems of Calculus

You may like to watch the videos 9 and 10 in [5] either as you read this section or you may read
this section after watching them.
We now look at one of the most important results in the theory of integration, namely, the
fundamental theorems of calculus. These theorems establish the validity of the computation
of integrals via Newtonian calculus, as learned in high school. In some sense, they justify the
high-school way of defining integration as finding an anti-derivative.

Theorem 41 (First Fundamental Theorem of Calculus). Let $f : [a, b] \to \mathbb{R}$ be differentiable.
Assume that $f'$ is integrable on $[a, b]$. Then
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]

Proof. We wish to use the Riemann sum approach. A Riemann sum for the integral $\int_a^b f'(t)\,dt$
looks like $\sum_{i=1}^n f'(t_i)(x_i - x_{i-1})$. The summands remind us of the mean value theorem of
differential calculus. Let $\varepsilon > 0$ be given. Then there exists a partition $P$ such that for any partition
$Q \ge P$ and set of tags $t$, we have
\[ \left| S(f', Q, t) - \int_a^b f'(t)\,dt \right| < \varepsilon. \tag{25} \]

Let $Q = \{x_0, \dots, x_n\} \ge P$ be a partition. Then observe that
\[ f(b) - f(a) = \sum_{i=1}^n [f(x_i) - f(x_{i-1})] = \sum_{i=1}^n f'(t_i)(x_i - x_{i-1}), \tag{26} \]

where we applied the mean value theorem to $f(x_i) - f(x_{i-1})$, $1 \le i \le n$, to obtain the tags
$t_i \in (x_{i-1}, x_i)$. But the term on the extreme right is $S(f', Q, t)$. Hence it follows from (25)-(26) that
\[ \left| f(b) - f(a) - \int_a^b f'(t)\,dt \right| < \varepsilon. \]

Since ε > 0 is arbitrary, the result follows.
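The telescoping step (26) can be seen concretely in a case where the mean value points are known explicitly. The Python sketch below uses the hypothetical example $f(x) = x^2$ on $[0, 2]$, for which the mean value theorem point of each subinterval is its midpoint; the partition is an arbitrary choice made for illustration.

```python
f = lambda x: x * x
df = lambda x: 2.0 * x                      # f'

nodes = [0.0, 0.3, 0.45, 1.1, 2.0]          # an arbitrary partition of [0, 2]
# For f(x) = x^2 we have f(x_i) - f(x_{i-1}) = f'(t_i)(x_i - x_{i-1})
# with t_i the midpoint of [x_{i-1}, x_i].
tags = [(x0 + x1) / 2 for x0, x1 in zip(nodes, nodes[1:])]

riemann_sum = sum(df(t) * (x1 - x0) for x0, x1, t in zip(nodes, nodes[1:], tags))
print(riemann_sum, f(nodes[-1]) - f(nodes[0]))   # equal up to rounding
```

With these special tags the Riemann sum equals $f(b) - f(a)$ on every partition, however coarse, which is precisely what the proof exploits.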

Remark 42. Note that this justifies what you learned in school about the integral being an
anti-derivative. That is, to find $\int_a^b f(x)\,dx$, we find a function $g$ such that $g' = f$ and then in this
case we have $\int_a^b f(x)\,dx = g(b) - g(a)$.

Let $f : [a, b] \to \mathbb{R}$ be integrable. Then for any $x \in [a, b]$, we know that $f$ is integrable on
$[a, x]$. Hence we have a function $F : x \mapsto \int_a^x f(t)\,dt$, $x \in [a, b]$. The new function $F$ is called the
indefinite integral of $f$. When $f \ge 0$, $F(x)$ is the area under the curve $y = f(t)$ between the $t$-axis
and the lines $t = a$ and $t = x$. Draw a picture.

Theorem 43 (Second Fundamental Theorem of Calculus). Let $f : [a, b] \to \mathbb{R}$ be integrable. The
indefinite integral $F$ of $f$ is continuous (in fact, Lipschitz) on $[a, b]$ and is differentiable at $x$ if
$f$ is continuous at $x \in [a, b]$. In fact, $F'(x) = f(x)$, if $f$ is continuous at $x$.
Why is this result plausible? Draw a picture. It seems that $\int_x^{x+h} f(t)\,dt$ is approximately the
area of the rectangle whose base is $h$ and height is $f(x)$. Hence $\frac{1}{h}\int_x^{x+h} f(t)\,dt \approx f(x)$. Observe
that
\[ \frac{1}{h}\left( \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \right) = \frac{1}{h}\int_x^{x+h} f(t)\,dt \approx f(x). \]
Proof. Since $f$ is bounded, there exists $M$ such that $|f(x)| \le M$ for $x \in [a, b]$. Then we have
\begin{align*}
|F(x) - F(y)| &= \left| \int_a^x f(t)\,dt - \int_a^y f(t)\,dt \right| \\
&= \left| \int_x^y f(t)\,dt \right| \\
&\le M \left| \int_x^y 1\,dt \right| \\
&= M\,|x - y|.
\end{align*}

Thus F is Lipschitz and hence continuous on [a, b].


Let $f$ be continuous at $c \in [a, b]$. We shall show that $F$ is differentiable at $c$ and $F'(c) = f(c)$.
Observe that, for $x > c$,
\[ \frac{F(x) - F(c)}{x - c} = \frac{1}{x - c}\int_c^x f(t)\,dt \quad\text{and}\quad f(c) = \frac{1}{x - c}\int_c^x f(c)\,dt. \]
Hence, we obtain
\begin{align*}
\left| \frac{F(x) - F(c)}{x - c} - f(c) \right| &= \left| \frac{1}{x - c}\int_c^x f(t)\,dt - \frac{1}{x - c}\int_c^x f(c)\,dt \right| \\
&= \left| \frac{1}{x - c}\int_c^x [f(t) - f(c)]\,dt \right| \\
&\le \frac{1}{x - c}\int_c^x |f(t) - f(c)|\,dt. \tag{27}
\end{align*}

Given $\varepsilon > 0$, by the continuity of $f$ at $c$, we can find a $\delta > 0$ such that $|f(t) - f(c)| < \varepsilon$ for
$|t - c| < \delta$. Hence for $x \in [a, b]$ such that $|x - c| < \delta$, we see that the RHS of (27) is estimated
above by $\varepsilon$. A similar argument applies when $x < c$.
This shows that $F$ is differentiable at $c$ and $F'(c) = f(c)$.
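To see the averages $\frac{1}{h}\int_c^{c+h} f(t)\,dt$ approach $f(c)$ in a case where $f$ is continuous but not differentiable at $c$, you may try the following Python sketch. The function $f(t) = |t - 0.5|$, the point $c = 0.5$, the step sizes, and the midpoint-rule helper `midpoint_integral` are all assumptions made for illustration; the midpoint rule merely stands in for the integral.

```python
def midpoint_integral(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = lambda t: abs(t - 0.5)      # continuous everywhere, not differentiable at 0.5
c = 0.5
steps = [0.1, 0.01, 0.001]

# (F(c + h) - F(c)) / h is the average of f over [c, c + h].
errors = [abs(midpoint_integral(f, c, c + h) / h - f(c)) for h in steps]
print(errors)   # decreasing towards 0
```

Only the continuity of $f$ at $c$ is used in the proof, and the experiment is consistent with that: the difference quotients of $F$ settle down at $f(c) = 0$ even though $f$ has a corner there.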

Remark 44. Look at the inequality (27). One of the terms in LHS is an integral while the other
is a number $f(c)$. We re-wrote this as a sum of two integrals by observing that $f(c)$ is the
average $\frac{1}{x - c}\int_c^x f(c)\,dt$ and then applied the linearity, continuity, and the standard estimate
for the integral. Learn this well as this trick is often used.

Remark 45. We can deduce a weaker version of the first fundamental theorem (Theorem 41) from the
second fundamental theorem of calculus.
Let $f : [a, b] \to \mathbb{R}$ be differentiable with $f'$ continuous on $[a, b]$. Then
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]

Proof. Since $f'$ is continuous, it is integrable and its indefinite integral, say, $G(x) = \int_a^x f'(t)\,dt$,
exists. By Theorem 43, $G$ is differentiable with derivative $G' = f'$. Hence the derivative of $f - G$
is zero on $[a, b]$ and hence the function $f - G$ is a constant on $[a, b]$. In particular,
$f(a) - G(a) = f(b) - G(b)$, that is, $f(a) = f(b) - \int_a^b f'(x)\,dx$, since $G(a) = 0$.

Theorem 46 (Integration by Parts). Let $u, v : [a, b] \to \mathbb{R}$ be differentiable. Assume that $u', v'$ are
integrable on $[a, b]$. Then
\[ \int_a^b u(x)v'(x)\,dx = u(x)v(x)\Big|_a^b - \int_a^b u'(x)v(x)\,dx. \tag{28} \]

Strategy: Let $g := uv$. Then $g$ is integrable, and $g' = u'v + uv'$ is integrable. (Why? If we assume
that $u$ and $v$ are continuously differentiable, then the integrability of $g'$ etc. are clear.) Apply the
first fundamental theorem of calculus to $\int_a^b g'(x)\,dx$ to arrive at the result.

Proof. We assume that $u$ and $v$ are continuously differentiable functions. Then $g = uv$ is
continuous and hence integrable. Also $g' = u'v + uv'$. Furthermore, $g'$ is continuous and
hence integrable. Applying the first fundamental theorem of calculus, we obtain
\begin{align*}
g(b) - g(a) &= u(b)v(b) - u(a)v(a) \\
&= \int_a^b u(x)v'(x)\,dx + \int_a^b u'(x)v(x)\,dx.
\end{align*}

The term $u(b)v(b) - u(a)v(a)$, we write as $u(x)v(x)\big|_a^b$. Hence we have
\[ \int_a^b u(x)v'(x)\,dx = u(x)v(x)\Big|_a^b - \int_a^b u'(x)v(x)\,dx. \]

One of the most basic tools for computing integrals in high school is integration by
substitution or the change of variables. The following result justifies this process.

Theorem 47 (Change of Variables). Let $I, J$ be closed and bounded intervals. Let $u : J \to \mathbb{R}$ be
continuously differentiable. Let $u(J) \subset I$ and $f : I \to \mathbb{R}$ be continuous. Then $f \circ u$ is continuous
on $J$ and we have
\[ \int_a^b f(u(x))u'(x)\,dx = \int_{u(a)}^{u(b)} f(y)\,dy, \qquad a, b \in J. \tag{29} \]

Proof. Fix $c \in I$. Let $F(y) := \int_c^y f(t)\,dt$. Then by the second fundamental theorem of calculus
(Theorem 43), $F$ is differentiable and $F'(y) = f(y)$. Let $g(x) := (F \circ u)(x)$. Then $g$ is
differentiable, and by the chain rule we have
\[ g'(x) = F'(u(x))u'(x) = f(u(x))u'(x). \]

We apply the first fundamental theorem of calculus to $g'$:
\begin{align*}
\int_a^b f(u(x))u'(x)\,dx &= \int_a^b g'(x)\,dx \\
&= g(b) - g(a) \\
&= F(u(b)) - F(u(a)) \\
&= \int_c^{u(b)} f(t)\,dt - \int_c^{u(a)} f(t)\,dt \\
&= \int_{u(a)}^{u(b)} f(t)\,dt.
\end{align*}

This completes the proof.
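Notice that the theorem does not ask $u$ to be monotone. A quick numerical experiment in Python makes this believable. The choices $f = \exp$, $u = \sin$ on $J = [0, 3]$ (so that $u$ goes up and then comes down) and the midpoint-rule helper `midpoint_integral` are hypothetical, made only for this sketch.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = math.exp          # continuous on I = [0, 1], which contains u(J)
u = math.sin          # continuously differentiable, not monotone on J = [0, 3]
du = math.cos         # u'
a, b = 0.0, 3.0

lhs = midpoint_integral(lambda x: f(u(x)) * du(x), a, b)   # left side of (29)
rhs = midpoint_integral(f, u(a), u(b))                     # right side of (29)
exact = math.exp(math.sin(b)) - math.exp(math.sin(a))      # F(u(b)) - F(u(a)) with F = exp
print(lhs, rhs, exact)
```

The three numbers agree to several decimal places: the part of the left integral where $u$ retraces its values cancels, exactly as $F(u(b)) - F(u(a))$ in the proof predicts.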

We now prove a general version of the integration by parts formula for Riemann-Stieltjes
integrals.

Theorem 48 (Integration by Parts Formula). Let $f, g : [a, b] \to \mathbb{R}$ be bounded functions. Assume
that $f$ is $g$-integrable on $[a, b]$, that is, assume that $\int_a^b f\,dg$ exists. Then $g$ is $f$-integrable and
we have
\[ \int_a^b g\,df = f(b)g(b) - f(a)g(a) - \int_a^b f\,dg. \tag{30} \]

Strategy: Let $A := \int_a^b f\,dg$. Let $\varepsilon > 0$ be given. Choose $P = \{x_0, \dots, x_n\}$ such that for
any partition $Q$ finer than $P$ and for any set of tags $t$ in $Q$, we have $|A - S(f, Q, t, g)| < \varepsilon$.
We wish to show that $|-A + f(b)g(b) - f(a)g(a) - S(g, P, t, f)| < \varepsilon$. Note that we
need to move from a Riemann sum of the form
\[ S(f, P, t, g) = \sum_i f(t_i)(g(x_i) - g(x_{i-1})) \quad\text{to}\quad S(g, P, t, f) = \sum_i g(t_i)(f(x_i) - f(x_{i-1})). \]

This suggests that we consider a partition $Q$ in which the nodes are the nodes of
$P$ along with the tags in $P$. Also, the tags of $Q$ should include the tags $t$ and the
nodes of $P$. Let $Q = P \cup t$. Let
$s = \{s_1 = a, s_2 = x_1, s_3 = x_1, \dots, x_{n-1}, x_{n-1}, s_{2n} = b\}$ be
the set of tags in $Q$. Then $|A - S(f, Q, s, g)| < \varepsilon$. Keeping in mind what we want,
rearrange the terms in $S(f, Q, s, g)$ to arrive at (31).
Proof. Let $A := \int_a^b f\,dg$. Let $\varepsilon > 0$ be given. Choose $P = \{x_0, \dots, x_n\}$ such that for any partition
$Q$ finer than $P$ and for any set of tags $t$ in $Q$, we have $|A - S(f, Q, t, g)| < \varepsilon$.

We wish to show that $|-A + f(b)g(b) - f(a)g(a) - S(g, P, t, f)| < \varepsilon$.

Let $Q = P \cup t$. Let $s = \{s_1 = a, s_2 = x_1, s_3 = x_1, \dots, x_{n-1}, x_{n-1}, s_{2n} = b\}$ be the set of tags in $Q$.
Then $|A - S(f, Q, s, g)| < \varepsilon$.
We claim that
\[ S(f, Q, s, g) = f(b)g(b) - f(a)g(a) - S(g, P, t, f). \tag{31} \]
The result follows from this.
In order to make the ideas clear, we look at a specific case where $P = \{a = x_0, x_1, x_2, x_3 = b\}$.
Let $Q = \{a, t_1, x_1, t_2, x_2, t_3, b\}$. Note that there are six subintervals. We assume that the set
$s = \{a, x_1, x_1, x_2, x_2, b\}$ of tags is chosen. (Can you say to which subinterval the third tag, $x_1$, belongs?
Draw pictures. The point $x_i$ is the tag of the consecutive subintervals $[t_i, x_i]$ and $[x_i, t_{i+1}]$.)
\begin{align*}
S(f, Q, s, g) &= f(a)(g(t_1) - g(a)) + f(x_1)(g(x_1) - g(t_1)) + f(x_1)(g(t_2) - g(x_1)) \\
&\quad + f(x_2)(g(x_2) - g(t_2)) + f(x_2)(g(t_3) - g(x_2)) + f(b)(g(b) - g(t_3)) \\
&= f(b)g(b) - f(a)g(a) - g(t_1)[f(x_1) - f(x_0)] - g(t_2)[f(x_2) - f(x_1)] - g(t_3)[f(b) - f(x_2)] \\
&= f(b)g(b) - f(a)g(a) - S(g, P, t, f).
\end{align*}
Thus (31) is proved. From it, we obtain
\[ |S(f, Q, s, g) - A| = |f(b)g(b) - f(a)g(a) - S(g, P, t, f) - A|. \]

Since $Q \ge P$, the left side is less than $\varepsilon$. So the right side is also less than $\varepsilon$. We deduce that
\[ -\int_a^b g\,df = \int_a^b f\,dg - f(b)g(b) + f(a)g(a). \]
Thus (30) follows.
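The rearrangement (31) is a purely algebraic identity, so it can be checked on any concrete data. Here is a minimal Python sketch doing so; the functions, the partition and the tags are hypothetical, `rs_sum` is a helper introduced only for this illustration, and the tags are assumed to lie strictly inside their subintervals so that $Q = P \cup t$ has $2n$ subintervals.

```python
import math

def rs_sum(f, g, nodes, tags):
    """Riemann-Stieltjes sum S(f, P, t, g): sum of f(t_i) * (g(x_i) - g(x_{i-1}))."""
    return sum(
        f(t) * (g(x1) - g(x0))
        for x0, x1, t in zip(nodes, nodes[1:], tags)
    )

f, g = math.sin, math.exp            # hypothetical bounded functions on [0, 1.5]
P = [0.0, 0.4, 1.0, 1.5]             # nodes x_0 < x_1 < x_2 < x_3
t = [0.1, 0.7, 1.2]                  # tags, strictly inside the subintervals

Q = sorted(set(P) | set(t))          # nodes of Q = P ∪ t: x_0, t_1, x_1, t_2, ...
# Tags of Q: a, x_1, x_1, x_2, x_2, b (each interior node of P is used twice).
s = [P[0]] + [x for x in P[1:-1] for _ in range(2)] + [P[-1]]

lhs = rs_sum(f, g, Q, s)
rhs = f(P[-1]) * g(P[-1]) - f(P[0]) * g(P[0]) - rs_sum(g, f, P, t)
print(lhs, rhs)   # equal up to rounding
```

Changing `f`, `g`, `P` or `t` does not disturb the agreement, which is a good sign that no analysis, only bookkeeping, is hidden in (31).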

Acknowledgment. I thank Vikram Aithal, Tamoghna Kar, Manmohan Sahu, Shivam Bajpai,
V Balakumar and Ruby Pinto for a very meticulous proof-reading of a (deliberately minimally
proofread) draft. If this article of mine has fewer irritating typos, you should thank
them. I also thank Tamoghna for suggesting specific references to our book on real analysis
and the YouTube videos.

References
[1] Ajit Kumar and S Kumaresan, A Basic Course in Real Analysis, CRC Press.
[2] Tom Apostol, Analysis, Indian Reprint.
[3] Walter Rudin, Principles of Mathematical Analysis, International Student Edition.
[4] S Kumaresan and G Santhanam, Analysis on $\mathbb{R}^n$, Forthcoming.
[5] This article is based on the series of my videos in the YouTube playlist on Riemann
Integration:
https://www.youtube.com/playlist?list=PLDzvuf9Uf4FNveNqPpjBtZqtfWEIj3DKr
You may either watch them first and then read this article or the other way around.
