DIFFERENTIATION
Instructor: Mr. Kigahe Orestas
SUBTOPICS
• 1.1 Mean-value theorems
• 1.2 Taylor’s theorem
• 1.3 Partial differentiation
• 1.4 Jacobian transformation
The Mean Value Theorem
• The Mean Value Theorem is one of the
most important theorems in calculus. We
look at some of its implications at the
end of this section. First, let’s start with a
special case of the Mean Value Theorem,
called Rolle’s theorem.
Rolle’s Theorem
• Informally, Rolle’s theorem states that
if the outputs of a differentiable function
f are equal at the endpoints of an
interval, then there must be an interior
point c where f′(c) = 0. Figure 1
illustrates this theorem.
Contd.
Figure 1. If a differentiable function f satisfies f(a) = f(b), then its
derivative must be zero at some point(s) between a and b.
Contd.
Rolle’s Theorem
• Let 𝑓 be a continuous function over the
closed interval [𝑎, 𝑏] and differentiable
over the open interval (𝑎, 𝑏) such that
𝑓(𝑎) = 𝑓(𝑏). Then there exists at least
one 𝑐 ∈ (𝑎, 𝑏) such that
𝑓′(𝑐) = 0.
Contd.
Proof
• Let 𝑘 = 𝑓(𝑎) = 𝑓(𝑏). We consider three
cases:
1. 𝑓(𝑥) = 𝑘 for all 𝑥 ∈ (a, b)
2. There exists 𝑥 ∈ (a, b) such that
𝑓(𝑥) > 𝑘.
3. There exists 𝑥 ∈ (a, b) such that
𝑓(𝑥) < 𝑘.
Contd.
• Case 1: If 𝑓(𝑥) = 𝑘 for all 𝑥 ∈ (a, b),
then 𝑓 is constant on (a, b), so 𝑓′(𝑥) = 0 for all 𝑥 ∈ (a, b).
• Case 2: Since 𝑓 is a continuous
function over the closed, bounded
interval [𝑎, 𝑏], by the extreme value
theorem, it has an absolute
maximum. Also, since there is a point
𝑥 ∈ (a, b) such that 𝑓(𝑥) > 𝑘 , the
absolute maximum is greater than k.
Contd.
• Therefore, the absolute maximum does not
occur at either endpoint. As a result, the
absolute maximum must occur at an
interior point 𝑐 ∈ (a, b). Because 𝑓 has a
maximum at an interior point c, and 𝑓 is
differentiable at c, by Fermat’s theorem,
𝑓′(𝑐) = 0 .
• Case 3: The case when there exists a
point 𝑥 ∈ (a, b) such that 𝑓(𝑥) < 𝑘 is
analogous to case 2, with maximum
replaced by minimum.
Contd.
• An important point about Rolle’s theorem is
that the differentiability of the function 𝑓 is
critical. If f is not differentiable, even at a
single point, the result may not hold. For
example, the function 𝑓(𝑥) = |𝑥| − 1 is
continuous over [−1, 1] and 𝑓(−1) = 0 =
𝑓(1), but there is no 𝑐 ∈ (−1, 1) with 𝑓′(𝑐) = 0:
the derivative is −1 for 𝑥 < 0, +1 for 𝑥 > 0, and undefined at 𝑥 = 0, as
shown in the following figure.
Contd.
Figure 2. Since 𝑓(𝑥) = |𝑥| − 1 is not differentiable at x=0, the conditions of
Rolle’s theorem are not satisfied. In fact, the conclusion does not hold here;
there is no c ∈ (−1, 1) such that 𝑓′(𝑐) = 0.
Contd.
• Let’s now consider functions that satisfy
the conditions of Rolle’s theorem and
calculate explicitly the points c where
f′(c) = 0.
• For each of the following functions, verify
that the function satisfies the criteria
stated in Rolle’s theorem and find all
values c in the given interval where
f′(c) = 0.
a. f(x) = x² + 2x over [−2, 0]
b. f(x) = x³ − 4x over [−2, 2]
Contd.
Solution a
• Since f is a polynomial, it is continuous
and differentiable everywhere. In
addition, 𝑓(−2) = 0 = 𝑓(0). Therefore, f
satisfies the criteria of Rolle’s theorem.
We conclude that there exists at least
one value 𝑐 ∈ (−2,0) such that 𝑓′ (𝑐) =
0 . Since 𝑓’ (𝑥) = 2𝑥 + 2 = 2(𝑥 + 1), we
see that 𝑓′(𝑐) = 2(𝑐 + 1) = 0 implies 𝑐 =
− 1 as shown in the following graph.
Contd.
Figure 3. This function is continuous and differentiable over [−2, 0], and
𝑓′(𝑐) = 0 when c = −1.
Contd.
Solution b
• As in part a, f is a polynomial and
therefore is continuous and differentiable
everywhere. Also, f(−2) = 0 = f(2), so
f satisfies the criteria of Rolle’s
theorem. Differentiating, we find that
f′(x) = 3x² − 4. Therefore, f′(c) = 0
when c = ±2/√3. Both points are in the
interval [−2, 2], and, therefore, both points
satisfy the conclusion of Rolle’s theorem as
shown in the following graph.
Contd.
Figure 4. For this polynomial over [−2, 2], f′(c) = 0 at x = ±2/√3.
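The two worked solutions can be checked with a short script. This is only a sketch: the helper names `critical_points_quadratic` and `rolle_points` are made up for illustration, and the derivative coefficients are typed in by hand from f′(x) = 2x + 2 and f′(x) = 3x² − 4.

```python
def critical_points_quadratic(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0 (a may be zero), sorted ascending."""
    if a == 0:
        return [] if b == 0 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = disc ** 0.5
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


def rolle_points(f, fprime_coeffs, lo, hi):
    """Check f(lo) == f(hi), then return the zeros of f' strictly inside (lo, hi)."""
    assert abs(f(lo) - f(hi)) < 1e-12, "endpoint values differ"
    return [c for c in critical_points_quadratic(*fprime_coeffs) if lo < c < hi]


# (a) f(x) = x^2 + 2x on [-2, 0], with f'(x) = 2x + 2
points_a = rolle_points(lambda x: x**2 + 2 * x, (0, 2, 2), -2, 0)
# (b) f(x) = x^3 - 4x on [-2, 2], with f'(x) = 3x^2 - 4
points_b = rolle_points(lambda x: x**3 - 4 * x, (3, 0, -4), -2, 2)
print(points_a, points_b)
```

Part a should give the single point c = −1, and part b the two points c = ±2/√3 ≈ ±1.1547.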
The Mean Value Theorem and
Its Meaning
• Rolle’s theorem is a special case of the Mean
Value Theorem. In Rolle’s theorem, we
consider differentiable functions 𝑓 that take equal
values at the endpoints. The Mean Value Theorem
generalizes Rolle’s theorem by considering
functions whose values at the endpoints are not
necessarily equal. Consequently, we can view the
Mean Value Theorem as a slanted version of
Rolle’s theorem (Figure 5).
Contd.
• The Mean Value Theorem states that if f is
continuous over the closed interval [a,b] and
differentiable over the open interval (a,b), then
there exists a point c ∈ (a, b) such that the
tangent line to the graph of f at c is parallel to
the secant line connecting (𝑎, 𝑓(𝑎)) and
(𝑏, 𝑓(𝑏)).
Figure 5. The Mean Value Theorem says that for a function
that meets its conditions, at some point the tangent line has
the same slope as the secant line between the ends. For this
function, there are two values c1 and c2 such that the tangent
line to f at c1 and c2 has the same slope as the secant line
Contd.
• Let 𝑓 be continuous over the closed
interval [𝑎, 𝑏] and differentiable over the
open interval (𝑎, 𝑏). Then, there exists at
least one point 𝑐 ∈ (a, b) such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
Contd.
Proof
• The proof follows from Rolle’s theorem
by introducing an appropriate function
that satisfies the criteria of Rolle’s
theorem. Consider the line connecting
(𝑎, 𝑓(𝑎)) and (𝑏, 𝑓(𝑏)). Since the slope of
that line is
$$\frac{f(b) - f(a)}{b - a}$$
and the line passes through the point (a, f(a)),
the equation of that line can be written as
$$y = \frac{f(b) - f(a)}{b - a}(x - a) + f(a).$$
Contd.
• Let 𝑔(𝑥) denote the vertical difference
between the point (𝑥, 𝑓(𝑥)) and the point
(𝑥, 𝑦) on that line. Therefore,
$$g(x) = f(x) - \left[\frac{f(b) - f(a)}{b - a}(x - a) + f(a)\right].$$
Contd.
Figure 6. The value 𝑔(𝑥) is the vertical difference
between the point (𝑥, 𝑓(𝑥)) and the point (𝑥, 𝑦) on
the secant line connecting (𝑎, 𝑓(𝑎)) and (𝑏, 𝑓(𝑏)).
Contd.
• Since the graph of f intersects the secant
line when 𝑥 = 𝑎 and 𝑥 = 𝑏, we see that
𝑔(𝑎) = 0 = 𝑔(𝑏). Since 𝑓 is a
differentiable function over (𝑎, 𝑏) , g is
also a differentiable function over (a,b).
Contd.
• Furthermore, since f is continuous over
[𝑎, 𝑏], 𝑔 is also continuous over
[𝑎, 𝑏]. Therefore, 𝑔 satisfies the criteria of
Rolle’s theorem. Consequently, there exists
a point 𝑐 ∈ (a, b) such that 𝑔′(𝑐) = 0.
Contd.
Since
$$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a},$$
we see that
$$g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}.$$
Since 𝑔′(𝑐) = 0, we conclude that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
Contd.
• In the next example, we show how the Mean
Value Theorem can be applied to the function
f(x) = √x over the interval [0, 9]. The method
is the same for other functions, although
sometimes with more interesting
consequences.
Verifying that the Mean Value
Theorem Applies
• For f(x) = √x over the interval [0, 9], show
that f satisfies the hypotheses of the Mean
Value Theorem, and therefore there exists at
least one value 𝑐 ∈ (0, 9) such that f′(c) is
equal to the slope of the line connecting
(0, 𝑓(0)) and (9, 𝑓(9)). Find the values c
guaranteed by the Mean Value Theorem.
Contd.
• We know that f(x) = √x is continuous over
[0, 9] and differentiable over (0, 9). Therefore, f
satisfies the hypotheses of the Mean Value
Theorem, and there must exist at least one
value 𝑐 ∈ (0, 9) such that f′(c) is equal to the
slope of the line connecting (0, 𝑓(0)) and
(9, 𝑓(9)) (Figure 7). To determine which
value(s) of c are guaranteed, first calculate the
derivative of f.
Contd.
• The derivative is f′(x) = 1/(2√x). The slope of the line
connecting (0, f(0)) and (9, f(9)) is given by
$$\frac{f(9) - f(0)}{9 - 0} = \frac{\sqrt{9} - \sqrt{0}}{9 - 0} = \frac{3}{9} = \frac{1}{3}.$$
• We want to find c such that f′(c) = 1/3. That is, we want
to find c such that
$$\frac{1}{2\sqrt{c}} = \frac{1}{3}.$$
• Solving this equation for c, we obtain c = 9/4. At this
point, the slope of the tangent line equals the slope of
the line joining the endpoints.
Contd.
Figure 7. The slope of the tangent line at c=9/4 is the same as the slope
of the line segment connecting (0,0) and (9,3).
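The same value of c can be found numerically. The sketch below uses bisection on f′(c) − slope; the helper name `mvt_point` and the choice to start the search just to the right of 0 (where f′ is undefined) are assumptions made for illustration.

```python
import math


def mvt_point(fprime, slope, lo, hi, tol=1e-12):
    """Bisection for fprime(c) = slope on (lo, hi); assumes one sign change of fprime - slope."""
    g = lambda c: fprime(c) - slope
    a, b = lo, hi
    while b - a > tol:
        mid = (a + b) / 2
        if g(a) * g(mid) <= 0:   # root lies in [a, mid]
            b = mid
        else:                    # root lies in [mid, b]
            a = mid
    return (a + b) / 2


f = math.sqrt
slope = (f(9) - f(0)) / (9 - 0)            # secant slope, equal to 1/3
fprime = lambda x: 1 / (2 * math.sqrt(x))  # derivative of sqrt on (0, 9)
c = mvt_point(fprime, slope, 1e-9, 9.0)    # start just right of 0
print(slope, c)
```

The printed value of c should agree with 9/4 to within the bisection tolerance.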
Rolle’s and Mean-value
theorems
Theorem 1 (Rolle’s). Let f be a function
such that:
(i) f is continuous on [a, b];
(ii) f is differentiable on (a,b);
(iii) f(a) = f(b).
Then there exists some c ∈ (a, b) such
that f′(c) = 0.
Contd.
Theorem 2 (Mean Value). Let f be a function
such that:
(i) f is continuous on [a, b];
(ii) f is differentiable on (a, b).
Then there exists some c ∈ (a, b) such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
Taylor’s theorem
• We now look at a result which allows us
to compute the values of elementary
functions like sin, exp and log. This
theorem can be used to approximate
these functions by polynomials (which
are easy to compute) and provides an
estimate of the error involved in the
approximation.
Contd.
Taylor’s Theorem. Let 𝑓 be an (𝑛 + 1) times
differentiable function on an open interval
containing the points 𝑎 and 𝑥. Then
$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_n(x),$$
where
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$$
for some number 𝑐 between 𝑎 and 𝑥.
The function 𝑇𝑛 defined by
$$T_n(x) = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots + a_n(x - a)^n, \quad \text{where } a_r = \frac{f^{(r)}(a)}{r!},$$
is called the Taylor polynomial of degree
𝑛 of 𝑓 at 𝑎. This can be thought of as a
polynomial which approximates the function
𝑓 in some interval containing 𝑎. The error in
the approximation is given by the remainder
term 𝑅𝑛(𝑥). If we can show 𝑅𝑛(𝑥) → 0 as
𝑛 → ∞ then we get a sequence of better
and better approximations to 𝑓 leading to a
power series expansion
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n,$$
which is known as the Taylor series for 𝑓.
In general this series will converge only
for certain values of 𝑥 determined by the
radius of convergence of the power
series. When the Taylor polynomials
converge rapidly enough, they can be
used to compute approximate values of
the function.
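As an illustration of how the Taylor polynomial and the remainder estimate work together, the following sketch approximates exp about a = 0. The function names are made up for this example, and exp is chosen only because every derivative of exp at a equals exp(a).

```python
import math


def taylor_exp(x, n, a=0.0):
    """Degree-n Taylor polynomial of exp about a (all derivatives at a equal exp(a))."""
    return sum(math.exp(a) * (x - a) ** r / math.factorial(r) for r in range(n + 1))


def lagrange_bound_exp(x, n, a=0.0):
    """Upper bound on |R_n(x)|: max of exp between a and x, times |x-a|^(n+1)/(n+1)!."""
    m = math.exp(max(a, x))
    return m * abs(x - a) ** (n + 1) / math.factorial(n + 1)


x = 1.0
for n in (2, 4, 8):
    approx = taylor_exp(x, n)
    err = abs(math.exp(x) - approx)
    # The actual error should never exceed the remainder bound.
    print(n, approx, err, lagrange_bound_exp(x, n))
```

In each row the actual error stays below the bound, and both shrink quickly as n grows.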
Contd.
Connection with Mean Value
Theorem.
• When n = 0, Taylor’s theorem reduces to
the Mean Value Theorem which is itself a
consequence of Rolle’s theorem. A similar
approach can be used to prove Taylor’s
theorem.
Proof of Taylor’s Theorem.
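• The remainder term is given by
$$R_n(x) = f(x) - \sum_{r=0}^{n} \frac{f^{(r)}(a)}{r!}(x - a)^r.$$
• Fix 𝑥 and 𝑎. For 𝑡 between 𝑥 and 𝑎 set
$$F(t) = f(x) - \sum_{r=0}^{n} \frac{f^{(r)}(t)}{r!}(x - t)^r,$$
so that 𝐹(𝑎) = 𝑅𝑛(𝑥). Then, since the terms of the differentiated sum cancel in pairs,
$$F'(t) = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n.$$
• Now defining
$$G(t) = F(t) - \left(\frac{x - t}{x - a}\right)^{n+1} F(a),$$
• we have 𝐺(𝑎) = 0 and 𝐺(𝑥) = 𝐹(𝑥) =
0. Applying Rolle’s theorem to the
function G shows that there is a 𝑐
between 𝑎 and 𝑥 with 𝐺′(𝑐) = 0.
• Now
$$G'(c) = F'(c) + \frac{(n+1)(x - c)^n}{(x - a)^{n+1}} F(a) = 0.$$
• But 𝐹(𝑎) = 𝑅𝑛(𝑥) and rearranging the last
equation gives
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}.$$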
Maclaurin Series
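• Taking a = 0 in Taylor’s theorem gives us
the expansion
$$f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \cdots + \frac{f^{(n)}(0)}{n!}x^n + R_n(x),$$
• where
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1}$$
for some number c between 0 and x. For
those values of 𝑥 for which
lim𝑛→∞ 𝑅𝑛(𝑥) = 0, we then obtain the
following power series expansion for
𝑓, which is known as the Maclaurin series
of 𝑓:
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}x^n.$$
• Here 𝑓^(0)(0) is defined to be 𝑓(0).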
Contd.
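• Example. To find the Maclaurin series of
the sine function we need to find its
derivative of order n. With 𝑓(𝑥) = sin 𝑥,
$$f'(x) = \cos x, \quad f''(x) = -\sin x, \quad f'''(x) = -\cos x, \quad f^{(4)}(x) = \sin x,$$
and the pattern then repeats.
• It follows that 𝑓^(𝑛)(0) = 0 if n is even, and
alternates as 1, −1, 1, −1, . . . for n = 1, 3, 5,
7, . . . . Hence the Maclaurin series expansion
is
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
• To find values of x for which this is valid, we
need to consider the remainder term, which is
given by
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1}.$$
• Now for each n, 𝑓^(𝑛+1)(𝑐) is given by ± sin c or
± cos c. The values of the sine and cosine
functions always lie between −1 and 1, so
$$|R_n(x)| \le \frac{|x|^{n+1}}{(n+1)!},$$
which tends to 0 as 𝑛 → ∞ for every 𝑥. The expansion is therefore valid for all 𝑥.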
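The behaviour of the remainder for the sine series can be seen numerically. This is a minimal sketch; `maclaurin_sin` and `remainder_bound` are illustrative names, and the bound used is |x|^(n+1)/(n+1)!, which follows from the sine and cosine values lying between −1 and 1.

```python
import math


def maclaurin_sin(x, n):
    """Maclaurin polynomial of sin of degree at most n: x - x^3/3! + x^5/5! - ..."""
    total = 0.0
    for r in range(1, n + 1, 2):                # only odd powers contribute
        sign = 1 if (r // 2) % 2 == 0 else -1   # derivatives at 0 cycle 1, -1, 1, -1
        total += sign * x ** r / math.factorial(r)
    return total


def remainder_bound(x, n):
    """|R_n(x)| <= |x|^(n+1) / (n+1)!, because |f^(n+1)(c)| <= 1."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)


x = 2.0
for n in (3, 7, 11):
    err = abs(math.sin(x) - maclaurin_sin(x, n))
    print(n, err, remainder_bound(x, n))
```

For a fixed x the bound shrinks towards zero as n increases, which is why the series represents sin x for every x.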
Differentiation of Functions of
Several Variables
We shall mainly be concerned with
differentiation and integration of functions
of more than one variable. We describe
• how each process can be done;
• why it is interesting, in terms of
applications; and
• how to interpret the process geometrically.
Functions of Several Variables
• You did a significant amount of work
studying functions, typically written as y
= f(x), which represented the variation
that occurred in some (dependent)
variable y, as another (independent)
variable, x, changed. For example you
might have been interested in the height
y of a particle after a given time x, or the area y,
enclosed by a rectangle with sides x and
10 - x.
Contd.
Once the function was known, the usual
rules of calculus could be applied, and
results such as the time when the particle
hits the ground, or the maximum possible
area of the rectangle, could be calculated.
Contd.
• We are going to do the same thing now
for functions of several variables. For
example the height y of a particle may
depend on the position x and the time t,
so we have y=f(x, t); the volume V of a
cylinder depends on the radius r of the
base and its height h, and indeed, as you
know, V = πr²h;
or the pressure of a gas may depend on
its volume V and temperature T, so
P = P(V, T). Note the trick I have just used;
it is often convenient to use P both for the
(dependent) variable, and for the function
itself: we don't always need separate
symbols as in the y = f(x) example.
Contd.
• When studying the real world, it is
unusual to have functions which depend
solely on a single variable. Of course the
single variable situation is a little simpler
to study.
Contd.
Figure 2.1: Graph of a simple function of one variable
Contd.
• We shall usually have a “standard”
function name; instead of y = f(x), we
often work with z = f(x, y), since most of
the extra complications occur when we
have two, rather than one (independent)
variable, and we don't need to consider
more general cases like w = f(x,y,z), or
even y = f (x1,x2,... ,xn).
Graphing functions of Several
Variables
• One way we tried to understand the
function y = f(x) was by drawing its
graph, as shown in Fig 2.1. We then
used such a graph to pick out points
such as the local minimum at x=3/2, and
to see how we could get the same result
using calculus.
Contd.
• Working with two or more independent
variables is more complicated, but the
ideas are familiar. To plot z = f(x, y) we
think of z as the height of the function f
at the point (x, y ), and then try to
sketch the resulting surface in three
dimensions. So we represent a function
as a surface rather than a curve.
Contd.
Example. Sketch the surface given by
𝑧 = 2 − 𝑥/2 − 2𝑦/3
Solution. We know the surface will be a
plane, because z is a linear function of x
and y. Thus it is enough to plot three
points that the plane passes through. This
gives Fig. 8.2.
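One convenient choice of three points is the set of axis intercepts, found by setting two coordinates to zero at a time. The sketch below checks them; the choice of intercepts and the function name `plane_height` are assumptions made for this illustration, since the text does not say which three points were plotted.

```python
def plane_height(x, y):
    """z = 2 - x/2 - 2y/3, the plane from the example."""
    return 2 - x / 2 - 2 * y / 3


# Axis intercepts: set two of the three coordinates to zero and solve for the third.
z_intercept = (0.0, 0.0, plane_height(0.0, 0.0))  # x = y = 0 gives z = 2
x_intercept = (4.0, 0.0, plane_height(4.0, 0.0))  # z = 0 and y = 0 give x = 4
y_intercept = (0.0, 3.0, plane_height(0.0, 3.0))  # z = 0 and x = 0 give y = 3
print(x_intercept, y_intercept, z_intercept)
```

Joining (4, 0, 0), (0, 3, 0) and (0, 0, 2) gives the triangular piece of the plane that lies in the first octant.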
Contd.
Partial Differentiation
• The usual rules for differentiation apply
when dealing with several variables, but
we now need to treat the variables one
at a time, keeping the others constant. It
is for this reason that a new symbol for
differentiation is introduced. Consider the
function
$$f(x, y) = \frac{2y}{2y + \cos x}.$$
We can consider y fixed, and so treat it as
a constant, to get a partial derivative
$$\frac{\partial f}{\partial x} = \frac{2y \sin x}{(2y + \cos x)^2},$$
where we have differentiated with respect
to x as usual. Or we can treat x as a
constant, and differentiate with respect to
y, to get
$$\frac{\partial f}{\partial y} = \frac{(2y + \cos x) \cdot 2 - 2y \cdot 2}{(2y + \cos x)^2} = \frac{2 \cos x}{(2y + \cos x)^2}.$$
Although a partial derivative is itself a
function of several variables, we often
want to evaluate it at some fixed
point, such as (x0,y0).
Contd.
We thus often write the partial derivative
as
$$\frac{\partial f}{\partial x}(x_0, y_0).$$
There are a number of different notations
in use to try to help understanding in
different situations.
Contd.
All of the following mean the
same thing:
$$\frac{\partial f}{\partial x}(x_0, y_0), \quad f_1(x_0, y_0), \quad f_x(x_0, y_0) \quad \text{and} \quad D_1 f(x_0, y_0).$$
Contd.
• Note also that there is a simple definition
of the derivative in terms of a Newton
quotient:-
$$\frac{\partial f}{\partial x}(x_0, y_0) = \lim_{\delta x \to 0} \frac{f(x_0 + \delta x, y_0) - f(x_0, y_0)}{\delta x}.$$
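The Newton quotient can be evaluated directly for shrinking values of δx. The sketch below assumes the example function is f(x, y) = 2y/(2y + cos x), the form consistent with the partial derivatives quoted earlier; the helper names are illustrative.

```python
import math


def f(x, y):
    """Assumed example function: f(x, y) = 2y / (2y + cos x)."""
    return 2 * y / (2 * y + math.cos(x))


def newton_quotient_x(func, x0, y0, dx):
    """(f(x0 + dx, y0) - f(x0, y0)) / dx, with y held fixed at y0."""
    return (func(x0 + dx, y0) - func(x0, y0)) / dx


def exact_fx(x, y):
    """Closed form of df/dx for the assumed f: 2y sin x / (2y + cos x)^2."""
    return 2 * y * math.sin(x) / (2 * y + math.cos(x)) ** 2


x0, y0 = 1.0, 2.0
for dx in (1e-1, 1e-3, 1e-5):
    # The quotient should approach the closed-form value as dx shrinks.
    print(dx, newton_quotient_x(f, x0, y0, dx), exact_fx(x0, y0))
```

Only x is perturbed in the quotient, which is exactly what "keeping y constant" means.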
Contd.
• Some other common notations for partial
derivatives are
Contd.
Example. Let z = sin(x/y). Compute
Solution
• Treating first y and then x as constants, we
have
• Thus
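The formulas for this example did not survive in the text, so the following is only a numerical sketch. Differentiating z = sin(x/y) by hand gives ∂z/∂x = cos(x/y)/y and ∂z/∂y = −x cos(x/y)/y², and the code checks these by central differences. It also evaluates x ∂z/∂x + y ∂z/∂y, which is an assumed guess at the expression the example asks for.

```python
import math


def z(x, y):
    return math.sin(x / y)


def partial(func, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative in coordinate `index`."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)


x0, y0 = 0.7, 1.3
zx = partial(z, (x0, y0), 0)   # compare with cos(x/y) / y
zy = partial(z, (x0, y0), 1)   # compare with -x cos(x/y) / y^2
print(zx, math.cos(x0 / y0) / y0)
print(zy, -x0 * math.cos(x0 / y0) / y0 ** 2)
print(x0 * zx + y0 * zy)       # assumed target expression; analytically zero
```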
Contd.
• Theorem. Assume that f and its
partial derivatives fx and fy are
continuous, and that x = x(t) and y =
y(t) are themselves differentiable
functions of t. Let F(t) = f(x(t), y(t)).
• Then F is differentiable and
$$\frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$
Contd.
Example. Let f(x, y) = xy, and let x = cos
t, y = sin t. Compute
Solution
From the chain rule,
Contd.
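• Proposition. Let x = x(u, v), y = y(u, v)
and z = z(u, v), and let f be a function
defined on a subset U ⊂ ℝ³, and suppose
that all the partial derivatives of f are
continuous.
Write
$$F(u, v) = f(x(u, v), y(u, v), z(u, v)).$$
• Then
$$\frac{\partial F}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial u}, \qquad \frac{\partial F}{\partial v} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial v}.$$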
Contd.
Example. Assume that f(u, v, w) has
continuous partial derivatives, and that
u = x − y, v = y − z, w = z − x.
Let
F(x, y, z) = f(u(x, y, z), v(x, y, z), w(x, y, z)).
Show that
Solution
• We apply the chain rule, noting first that
from the change of variable formulae,
• we have
Contd.
• Then
• Adding them gives the result claimed.
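The identity to be shown is missing from the text. Since each of u, v, w is a difference of two of x, y, z, a natural candidate is ∂F/∂x + ∂F/∂y + ∂F/∂z = 0, and the sketch below tests that assumed identity numerically. The function f used here is a hypothetical smooth example, not one taken from the notes.

```python
import math


def f(u, v, w):
    """Hypothetical smooth test function; any differentiable f would do."""
    return math.sin(u) + u * v ** 2 + math.exp(w) * v


def F(x, y, z):
    """F(x, y, z) = f(u, v, w) with u = x - y, v = y - z, w = z - x."""
    return f(x - y, y - z, z - x)


def partial(func, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative in coordinate `index`."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)


p = (0.3, -1.1, 0.8)
total = sum(partial(F, p, i) for i in range(3))  # F_x + F_y + F_z at p
print(total)
```

The sum vanishes because shifting x, y and z by the same amount leaves u, v and w unchanged.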
Higher Derivatives
• Note that a partial derivative is itself a
function of two variables, and so further
partial derivatives can be calculated.
• We write
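$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}, \qquad \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y},$$
$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}.$$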
Contd.
• This notation generalizes to more than
two variables, and to more than two
derivatives in the way you would expect.
There is a complication that does not
occur when dealing with functions of a
single variable; there are four derivatives
of second order, as follows:
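$$\frac{\partial^2 f}{\partial x^2}, \quad \frac{\partial^2 f}{\partial x \partial y}, \quad \frac{\partial^2 f}{\partial y \partial x} \quad \text{and} \quad \frac{\partial^2 f}{\partial y^2}.$$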
Contd.
• Proposition. Assume that all second
order derivatives of f exist and are
continuous. Then the mixed second order
partial derivatives of f are equal. i.e.
$$\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}.$$
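The equality of the mixed partial derivatives can be illustrated numerically by differentiating in the two possible orders. The test function below is a hypothetical smooth example, and the nested central differences are only an approximation, so the two results agree up to rounding and truncation error.

```python
import math


def f(x, y):
    """Hypothetical smooth test function."""
    return math.exp(x * y) + x ** 3 * math.sin(y)


def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g in x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g in y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)


d2f_dxdy = d_dx(d_dy(f))   # differentiate in y first, then in x
d2f_dydx = d_dy(d_dx(f))   # differentiate in x first, then in y
x0, y0 = 0.5, 1.2
exact = (1 + x0 * y0) * math.exp(x0 * y0) + 3 * x0 ** 2 * math.cos(y0)
print(d2f_dxdy(x0, y0), d2f_dydx(x0, y0), exact)
```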
Contd.
• The third-order and higher partial
derivatives are defined similarly
Contd.
• Example. Suppose that f(x, y) is written
in terms of u and v where x = u + log v
and y = u − log v. Show that, with the
usual convention
Solution
• Using the chain rule, we have
• Thus using both these and their operator form,
we have
Contd.
• while differentiating with respect to v, we have
Maxima and Minima
• As in one variable calculations, one use
for derivatives in several variables is in
calculating maxima and minima. Again as
for one variable, we shall rely on the
theorem that if 𝑓 is continuous on a
closed bounded subset of ℝ², then it has
a global maximum and a global
minimum.
Contd.
• Definition. Say that f(x, y) has a critical
point at (a, b) if and only if
$$\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0.$$
• It is clear by comparison with the single
variable result that a necessary
condition for f to have a local extremum
at (a, b) is that it have a critical point
there, although that is not a sufficient
condition. We refer to this as the first
derivative test.
Contd.
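• We can get more information by looking
at the second derivative.
• Theorem (Second Derivative Test). Assume
that (a, b) is a critical point for f, and write
∆ = fxx fyy − fxy², evaluated at (a, b).
Then
1. if ∆ > 0 and fxx > 0, f has a local minimum at (a, b);
2. if ∆ > 0 and fxx < 0, f has a local maximum at (a, b);
3. if ∆ < 0, f has a saddle point at (a, b);
4. if ∆ = 0, the test gives no information.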
Contd.
• Example. Show that f(x, y) = x² + y²
has a minimum at (0, 0).
Solution
• We have fx = 2x and fy = 2y, so fx = fy = 0 precisely
when x = y = 0, and this is the only critical
point.
• We have fxx = fyy = 2 and fxy = 0, so ∆ = fxx fyy −
fxy² = 4 > 0. Since also fxx > 0, there is a local minimum at
(0, 0).
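The classification step can be written as a small function. This sketch assumes the usual form of the second derivative test (the sign of ∆ = fxx fyy − fxy², then the sign of fxx); the function name and the saddle-point comparison case are illustrative additions.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the second derivative test to second partials evaluated at a critical point."""
    delta = fxx * fyy - fxy ** 2
    if delta > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if delta < 0:
        return "saddle point"
    return "inconclusive"


# f(x, y) = x^2 + y^2 at (0, 0): fxx = fyy = 2, fxy = 0
print(classify_critical_point(2, 2, 0))
# Hypothetical comparison case f(x, y) = x^2 - y^2 at (0, 0): fxx = 2, fyy = -2, fxy = 0
print(classify_critical_point(2, -2, 0))
```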
Jacobian transformation
• Definition
Contd.
• The Jacobian is defined as a determinant
of a 2x2 matrix. Here is how to compute
the determinant.
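The determinant formula itself is missing from the text; for a 2 × 2 matrix with rows (a, b) and (c, d) it is ad − bc. The sketch below applies that to the change to polar coordinates x = r cos θ, y = r sin θ, a standard illustration that is not taken from these notes. The function names are made up for the example.

```python
import math


def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]: ad - bc."""
    return a * d - b * c


def polar_jacobian(r, theta):
    """Jacobian d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta)."""
    dx_dr, dx_dtheta = math.cos(theta), -r * math.sin(theta)
    dy_dr, dy_dtheta = math.sin(theta), r * math.cos(theta)
    return det2(dx_dr, dx_dtheta, dy_dr, dy_dtheta)


print(det2(1, 2, 3, 4))          # 1*4 - 2*3
print(polar_jacobian(2.0, 0.7))  # analytically equals r
```

For the polar map the determinant simplifies to r(cos²θ + sin²θ) = r, so the second printed value should be close to 2.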
Contd.