Lecture 9 (17.4.2019)
(translated and adapted from lecture notes by Martin Klazar)
Theorem 35 (∂ ⇒ differential). Let $U \subset \mathbb{R}^m$ be a neighborhood of a point
$a \in \mathbb{R}^m$. If a function $f : U \to \mathbb{R}$ has all partial derivatives on $U$ and they
are continuous at $a$, then $f$ is differentiable at $a$.
Proof. We consider only the case of two variables $x$ and $y$ ($m = 2$). For more
variables, the proof is similar (but more technical). We may assume that
$a = o$ and that $U$ is a ball $B(o, \gamma)$ for some $\gamma > 0$. Let $h = (h_1, h_2) \in U$ (so
$\|h\| < \gamma$) and $h' = (h_1, 0)$. The difference $f(h) - f(o)$ can be expressed as a sum
of differences along the two coordinate directions:
\[ f(h) - f(o) = (f(h) - f(h')) + (f(h') - f(o)) . \]
The segments $h'h$ and $oh'$ lie inside $U$, so $f$ is defined on them. Moreover, $f$
depends only on the variable $y$ on the former and only on the variable $x$ on the latter
segment. Thus, the Lagrange mean value theorem (for a single variable) yields
\[ f(h) - f(o) = \frac{\partial f}{\partial y}(\zeta_2) \cdot h_2 + \frac{\partial f}{\partial x}(\zeta_1) \cdot h_1 , \]
where $\zeta_1$ and $\zeta_2$ are internal points of the segments $oh'$ and $h'h$, respectively.
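In particular, the points $\zeta_1$ and $\zeta_2$ lie inside $B(o, \|h\|)$, so by continuity of both
partial derivatives at $o$, we have
\[ \frac{\partial f}{\partial y}(\zeta_2) = \frac{\partial f}{\partial y}(o) + \alpha(\zeta_2) \quad \text{and} \quad \frac{\partial f}{\partial x}(\zeta_1) = \frac{\partial f}{\partial x}(o) + \beta(\zeta_1) , \]
where $\alpha(h)$ and $\beta(h)$ are $o(1)$ as $h \to o$ (i.e., for every $\varepsilon > 0$ there is $\delta > 0$ such
that $\|h\| < \delta \Rightarrow |\alpha(h)| < \varepsilon \cdot 1 = \varepsilon$, and the same holds for $\beta(h)$). Thus
\[ f(h) - f(o) = \frac{\partial f}{\partial y}(o) \cdot h_2 + \frac{\partial f}{\partial x}(o) \cdot h_1 + \alpha(\zeta_2) h_2 + \beta(\zeta_1) h_1 . \]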
By the triangle inequality and the inequalities $0 < \|\zeta_1\|, \|\zeta_2\| < \|h\|$ and
$|h_1|, |h_2| \le \|h\|$, it follows that if $\|h\| < \delta$, then
\[ |\alpha(\zeta_2) h_2 + \beta(\zeta_1) h_1| \le |\alpha(\zeta_2)| \cdot \|h\| + |\beta(\zeta_1)| \cdot \|h\| \le 2 \varepsilon \|h\| . \]
Thus, $\alpha(\zeta_2) h_2 + \beta(\zeta_1) h_1 = o(\|h\|)$ for $h \to o$. So by the definition of the total
differential, $f$ is differentiable at $o$.
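As an illustration of Theorem 35 (a worked example that is not part of the
original notes; the function $f(x, y) = x^2 y$ and the point $a = (1, 2)$ are chosen
only for concreteness), both partial derivatives are continuous, so the theorem
predicts differentiability, and the remainder can be checked by hand:

```latex
\begin{align*}
\frac{\partial f}{\partial x}(x, y) &= 2xy, \qquad
\frac{\partial f}{\partial y}(x, y) = x^2, \\
Df(1, 2)(h) &= 4 h_1 + h_2, \\
f(1 + h_1, 2 + h_2) - f(1, 2) - Df(1, 2)(h)
  &= 2 h_1^2 + 2 h_1 h_2 + h_1^2 h_2, \\
\left| 2 h_1^2 + 2 h_1 h_2 + h_1^2 h_2 \right|
  &\le 4 \|h\|^2 + \|h\|^3 = o(\|h\|) \quad (h \to o).
\end{align*}
```

The last line uses $|h_1|, |h_2| \le \|h\|$, the same estimate as in the proof.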
The Lagrange mean value theorem can be generalized to functions of several
variables as follows.
Theorem 36 (Lagrange Mean Value Theorem for several variables). Let
$U \subset \mathbb{R}^m$ be an open set containing a segment $u = ab$ with endpoints $a$ and $b$, and
let $f : U \to \mathbb{R}$ be a function which is continuous at every point of $u$ and
differentiable at every internal point of $u$. Then there exists an internal point
$\zeta$ of $u$ satisfying
\[ f(b) - f(a) = Df(\zeta)(b - a) . \]
In other words, the difference of the values of $f$ at the endpoints of the segment
equals the value of the differential at some internal point of the segment, applied
to the vector $b - a$ of the segment.
Proof. Idea: Apply the Lagrange mean value theorem for a single variable to the
auxiliary function $F(t) = f(a + t(b - a))$, $t \in [0, 1]$.
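The idea can be spelled out as follows. This is only a sketch: the formula for
$F'(t)$ is assumed here (it follows from the rule for the differential of a
composed mapping, Theorem 39, or directly from the definition of the
differential), and $t_0$ is the intermediate point supplied by the single-variable
theorem.

```latex
\begin{align*}
F(t) &= f(a + t(b - a)), \qquad t \in [0, 1], \\
F'(t) &= Df(a + t(b - a))(b - a), \qquad t \in (0, 1), \\
f(b) - f(a) &= F(1) - F(0) = F'(t_0)(1 - 0) = Df(\zeta)(b - a),
  \qquad \zeta = a + t_0 (b - a), \ t_0 \in (0, 1).
\end{align*}
```

Continuity of $f$ on the segment gives continuity of $F$ on $[0, 1]$, which is what
the single-variable theorem requires.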
We say that an open set $D \subset \mathbb{R}^m$ is connected if every two of its points can
be connected by a broken line contained in $D$. Examples of connected open
sets: an open ball in $\mathbb{R}^m$, the whole $\mathbb{R}^m$, and $\mathbb{R}^3 \setminus L$, where $L$ is the union of finitely
many lines. On the other hand, $B \setminus R$, where $B$ is an open ball in $\mathbb{R}^3$ and $R$ is a
plane intersecting $B$, is an open set which is not connected.
Corollary 37 (∂ = 0 ⇒ f ≡ const.). If a function $f$ of $m$ variables has zero
differential at every point of an open connected set $U$, then $f$ is constant on
$U$. The same conclusion holds if all partial derivatives of $f$ are zero on $U$.
Proof. Idea: Consider two points of $U$ and a broken line connecting them.
Apply the Lagrange mean value theorem for several variables (Theorem 36) to each
segment of the broken line.
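In symbols, the argument might be sketched like this. The names $p_0, \dots, p_r$
for the vertices of the broken line and $\zeta_j$ for the intermediate points are
introduced here only for illustration.

```latex
\begin{align*}
a = p_0, \, p_1, \, \dots, \, p_r = b
  &\quad \text{(vertices of a broken line in } U \text{)}, \\
f(p_j) - f(p_{j-1}) &= Df(\zeta_j)(p_j - p_{j-1}) = 0, \qquad j = 1, \dots, r, \\
f(b) - f(a) &= \sum_{j=1}^{r} \bigl( f(p_j) - f(p_{j-1}) \bigr) = 0 .
\end{align*}
```

For the second claim, partial derivatives that are identically zero are
continuous, so Theorem 35 gives differentiability at every point of $U$, and the
differential, whose coefficients are the partial derivatives, is zero.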
Calculating partial derivatives and differentials. For two functions $f, g :
U \to \mathbb{R}$ defined on a neighborhood $U \subset \mathbb{R}^m$ of a point $a \in U$ that have a
partial derivative with respect to $x_i$ at the point $a$, the formulae for the partial
derivatives of their sum, product and quotient are analogous to those for a single variable:
\[ \partial_i (\alpha f + \beta g)(a) = \alpha \, \partial_i f(a) + \beta \, \partial_i g(a) , \]
\[ \partial_i (fg)(a) = g(a) \, \partial_i f(a) + f(a) \, \partial_i g(a) , \]
\[ \partial_i (f/g)(a) = \frac{g(a) \, \partial_i f(a) - f(a) \, \partial_i g(a)}{g(a)^2} \quad (\text{if } g(a) \ne 0) . \]
Similarly, for differentials, we have:
Theorem 38 (Arithmetic of differentials). Let $U \subset \mathbb{R}^m$ be a neighborhood of
$a$ and let $f, g : U \to \mathbb{R}$ be functions differentiable at $a$.
(i) For any $\alpha, \beta \in \mathbb{R}$, $\alpha f + \beta g$ is differentiable at $a$ and
\[ D(\alpha f + \beta g)(a) = \alpha \, Df(a) + \beta \, Dg(a) . \]
(ii) $fg$ is differentiable at $a$ and
\[ D(fg)(a) = g(a) \, Df(a) + f(a) \, Dg(a) . \]
(iii) If $g(a) \ne 0$, then $f/g$ is differentiable at $a$ and
\[ D(f/g)(a) = \frac{1}{g(a)^2} \bigl( g(a) \, Df(a) - f(a) \, Dg(a) \bigr) . \]
Proof. Follows from Theorem 32 and formulae for partial derivatives.
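A quick check of the product rule (ii) on a concrete pair of functions may help.
The functions $f(x, y) = x + y$, $g(x, y) = xy$ and the point $a = (1, 2)$ are
illustrative choices, not taken from the original notes.

```latex
\begin{align*}
f(a) &= 3, \qquad g(a) = 2, \qquad
Df(a)(h) = h_1 + h_2, \qquad Dg(a)(h) = 2 h_1 + h_2, \\
D(fg)(a)(h) &= g(a) \, Df(a)(h) + f(a) \, Dg(a)(h)
  = 2 (h_1 + h_2) + 3 (2 h_1 + h_2) = 8 h_1 + 5 h_2, \\
(fg)(x, y) &= x^2 y + x y^2, \qquad
\partial_x (fg)(a) = 2xy + y^2 \big|_{(1,2)} = 8, \qquad
\partial_y (fg)(a) = x^2 + 2xy \big|_{(1,2)} = 5 .
\end{align*}
```

The coefficients obtained from the product rule agree with the partial
derivatives of the product computed directly.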
The formula for a linear combination can be easily generalized to vector
valued functions $f, g : U \to \mathbb{R}^n$.
Next, we generalize the formula for the derivative of a composed function to a
composition of multivariable mappings. We use $\circ$ to denote composition,
where $(g \circ f)(x) = g(f(x))$.
Theorem 39 (Differential of a composed mapping). Let
\[ f : U \to V, \qquad g : V \to \mathbb{R}^k \]
be two mappings, where $U \subset \mathbb{R}^m$ is a neighborhood of $a$ and $V \subset \mathbb{R}^n$ is a
neighborhood of $b = f(a)$. If the mapping $f$ is differentiable at $a$ and $g$ is
differentiable at $b$, then the composed mapping
\[ g \circ f = g(f) : U \to \mathbb{R}^k \]
is differentiable at $a$ and its total differential is the composition of the differentials
of $f$ and $g$:
\[ D(g \circ f)(a) = Dg(b) \circ Df(a) . \]
Since composition of linear mappings corresponds to multiplication of
matrices, the total differential of a composed mapping corresponds to the product of
the Jacobi matrices.
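A small example of the matrix form, under assumptions made only for
illustration: the mappings $f(x, y) = (xy, x + y)$ and $g(u, v) = u^2 + v$ are not
from the original notes, and $J_f$, $J_g$ are used here as ad hoc names for the
Jacobi matrices.

```latex
\begin{align*}
f(x, y) &= (xy, \; x + y), \qquad g(u, v) = u^2 + v, \qquad
(g \circ f)(x, y) = x^2 y^2 + x + y, \\
J_f(x, y) &= \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix}, \qquad
J_g(u, v) = \begin{pmatrix} 2u & 1 \end{pmatrix}, \\
J_g(f(x, y)) \, J_f(x, y)
  &= \begin{pmatrix} 2xy & 1 \end{pmatrix}
     \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix}
   = \begin{pmatrix} 2xy^2 + 1 & 2x^2 y + 1 \end{pmatrix} .
\end{align*}
```

The entries of the product are exactly the partial derivatives of
$x^2 y^2 + x + y$ with respect to $x$ and $y$, as the theorem predicts.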
Partial derivatives of higher orders. If a function $f : U \to \mathbb{R}$ defined
on a neighborhood $U \subset \mathbb{R}^m$ of a point $a$ has a partial derivative $F = \partial f / \partial x_i$
at each point of $U$, and this function $F : U \to \mathbb{R}$ has at $a$ the partial derivative
$\partial F / \partial x_j (a)$, we say that $f$ has at the point $a$ a partial derivative of the second
order with respect to the variables $x_i$ and $x_j$, and we denote it
\[ \frac{\partial^2 f}{\partial x_j \, \partial x_i}(a) \]
or shortly by $\partial_j \partial_i f(a)$.
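Similarly, we define higher order partial derivatives: if $f = f(x_1, x_2, \ldots, x_m)$
has the partial derivative ($i_1, i_2, \ldots, i_{k-1}, j \in \{1, 2, \ldots, m\}$)
\[ F = \frac{\partial^{k-1} f}{\partial x_{i_{k-1}} \, \partial x_{i_{k-2}} \cdots \partial x_{i_1}}(x) \]
at every point $x \in U$, and $F$ has the partial derivative $\partial F / \partial x_j (a)$ at $a$, we say
that $f$ has a partial derivative of order $k$ with
respect to the variables $x_{i_1}, \ldots, x_{i_{k-1}}, x_j$ at the point $a$, and we denote its value by
\[ \frac{\partial^k f}{\partial x_j \, \partial x_{i_{k-1}} \cdots \partial x_{i_1}}(a) . \]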
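In general, the order of the variables in higher order derivatives matters. You can
verify for yourself that the function $f : \mathbb{R}^2 \to \mathbb{R}$,
\[ f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{for } x^2 + y^2 \ne 0 , \\[1ex] 0 & \text{for } x^2 + y^2 = 0 , \end{cases} \]
has different mixed (i.e., with respect to two different variables) second order
partial derivatives at the origin:
\[ \frac{\partial^2 f}{\partial x \, \partial y}(0, 0) = 1 \quad \text{and} \quad \frac{\partial^2 f}{\partial y \, \partial x}(0, 0) = -1 . \]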
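The verification can be carried out along the coordinate axes. The sketch below
computes the first order partial derivatives from the definition as limits, using
the convention above that in $\partial^2 f / \partial x \, \partial y$ the derivative with respect to
$y$ is taken first.

```latex
\begin{align*}
\frac{\partial f}{\partial y}(x, 0)
  &= \lim_{t \to 0} \frac{f(x, t) - f(x, 0)}{t}
   = \lim_{t \to 0} \frac{x (x^2 - t^2)}{x^2 + t^2} = x
  \quad (x \ne 0), \qquad \frac{\partial f}{\partial y}(0, 0) = 0, \\
\frac{\partial f}{\partial x}(0, y)
  &= \lim_{t \to 0} \frac{f(t, y) - f(0, y)}{t}
   = \lim_{t \to 0} \frac{y (t^2 - y^2)}{t^2 + y^2} = -y
  \quad (y \ne 0), \qquad \frac{\partial f}{\partial x}(0, 0) = 0, \\
\frac{\partial^2 f}{\partial x \, \partial y}(0, 0)
  &= \frac{d}{dx} \Bigl[ \frac{\partial f}{\partial y}(x, 0) \Bigr]_{x = 0} = 1, \qquad
\frac{\partial^2 f}{\partial y \, \partial x}(0, 0)
  = \frac{d}{dy} \Bigl[ \frac{\partial f}{\partial x}(0, y) \Bigr]_{y = 0} = -1 .
\end{align*}
```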
However, the order does not matter if the partial derivatives are continuous.
Theorem 40 (Usually ∂x ∂y f = ∂y ∂x f ). Let $f : U \to \mathbb{R}$ be a function with
second order partial derivatives $\partial_j \partial_i f$ and $\partial_i \partial_j f$, $i \ne j$, on a neighborhood $U \subset \mathbb{R}^m$
of a point $a$, which are continuous at $a$. Then
\[ \partial_j \partial_i f(a) = \partial_i \partial_j f(a) . \]
Proof. We prove the statement for $m = 2$; for $m > 2$ the proof is
analogous but more tedious. Without loss of generality, we may assume that
$a = o = (0, 0)$. By continuity of the partial derivatives at the origin, it
is enough to find, for arbitrarily small $h > 0$, two points $\sigma, \tau$ in the square
$[0, h]^2$ satisfying $\partial_x \partial_y f(\sigma) = \partial_y \partial_x f(\tau)$. Then $\sigma, \tau \to o$ as $h \to 0^+$, and
passing to the limit and using continuity of the partial derivatives we get
$\partial_x \partial_y f(o) = \partial_y \partial_x f(o)$.
Given $h$, we find $\sigma$ and $\tau$ as follows. We denote the corners of the square by
$a = (0, 0)$, $b = (0, h)$, $c = (h, 0)$, $d = (h, h)$ and we consider the value $f(d) -
f(b) - f(c) + f(a)$. It can be expressed in two different ways:
\[ \begin{aligned} f(d) - f(b) - f(c) + f(a) &= (f(d) - f(b)) - (f(c) - f(a)) = \psi(h) - \psi(0) \\ &= (f(d) - f(c)) - (f(b) - f(a)) = \varphi(h) - \varphi(0) , \end{aligned} \]
where
\[ \psi(t) = f(h, t) - f(0, t) \quad \text{and} \quad \varphi(t) = f(t, h) - f(t, 0) . \]
We have $\psi'(t) = \partial_y f(h, t) - \partial_y f(0, t)$ and $\varphi'(t) = \partial_x f(t, h) - \partial_x f(t, 0)$.
The Lagrange mean value theorem gives two expressions
\[ \begin{aligned} f(d) - f(b) - f(c) + f(a) &= \psi'(t_0) h = (\partial_y f(h, t_0) - \partial_y f(0, t_0)) h \\ &= \varphi'(s_0) h = (\partial_x f(s_0, h) - \partial_x f(s_0, 0)) h , \end{aligned} \]
where $0 < s_0, t_0 < h$ are intermediate points. Applying the theorem once more
to the differences of partial derivatives of $f$, we obtain
\[ f(d) - f(b) - f(c) + f(a) = \partial_x \partial_y f(s_1, t_0) h^2 = \partial_y \partial_x f(s_0, t_1) h^2 , \qquad s_1, t_1 \in (0, h) . \]
The points $\sigma = (s_1, t_0)$ and $\tau = (s_0, t_1)$ belong to $[0, h]^2$ and we have $\partial_x \partial_y f(\sigma) =
\partial_y \partial_x f(\tau)$ (since both sides are equal to $(f(d) - f(b) - f(c) + f(a)) / h^2$).
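Two short checks of Theorem 40, offered as illustrations rather than as part of
the original notes. The polynomial $p(x, y) = x^2 y^3$ is an arbitrary choice with
continuous second order partial derivatives. The second computation returns to
the function $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$ from the example preceding the
theorem and shows why the continuity assumption fails there.

```latex
\begin{align*}
p(x, y) &= x^2 y^3: \qquad
\partial_x \partial_y p = \partial_x (3 x^2 y^2) = 6 x y^2, \qquad
\partial_y \partial_x p = \partial_y (2 x y^3) = 6 x y^2, \\
f(x, y) &= \frac{xy(x^2 - y^2)}{x^2 + y^2}: \qquad
\partial_x \partial_y f(x, 0) = \frac{d}{dx} \, x = 1 \quad (x \ne 0), \qquad
\partial_y \partial_x f(0, y) = \frac{d}{dy} \, (-y) = -1 \quad (y \ne 0) .
\end{align*}
```

Away from the origin, $f$ is a rational function with nonzero denominator, so its
mixed partial derivatives are continuous and coincide there by Theorem 40. Their
common value is $1$ on the $x$-axis and $-1$ on the $y$-axis, so it has no limit at the
origin, and the mixed partial derivatives cannot be continuous at $(0, 0)$.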