CS 726: Nonlinear Optimization 1
Lecture 3: Differentiability
Michael C. Ferris
Computer Sciences Department
University of Wisconsin-Madison
January 29, 2021
Background Material
Since we spent part of the lecture on the Background Quiz, I would
like you to review the whole of [Wright and Recht(2020), Chapter 2].
For the definitions of local and global minimizers, see
[Wright and Recht(2020), Section 2.1].
Reading through this additional material, along with the Bertsekas
material mentioned last time, may be helpful.
Taxonomy of Solutions
2.1 Taxonomy of Solutions to Optimization Problems
Before we can begin designing algorithms, we must determine what it means
to solve an optimization problem. Suppose that f is a function mapping some
domain D ⊂ R^n to the real line R. We have the following definitions.
• x* ∈ D is a local minimizer of f if there is a neighborhood N of x* such that
f(x) ≥ f(x*) for all x ∈ N ∩ D.
• x* ∈ D is a global minimizer of f if f(x) ≥ f(x*) for all x ∈ D.
• x* ∈ D is a strict local minimizer if it is a local minimizer for some
neighborhood N of x*, and in addition f(x) > f(x*) for all x ∈ N with x ≠ x*.
• x* is an isolated local minimizer if there is a neighborhood N of x* such
that f(x) ≥ f(x*) for all x ∈ N ∩ D and, in addition, N contains no local
minimizers other than x*.
• x* is the unique minimizer if it is the only global minimizer.
For the constrained optimization problem
    min_{x ∈ Ω} f(x),    (2.1)
where Ω ⊂ D ⊂ R^n is a closed set, we modify the terminology slightly to use
the word “solution” rather than “minimizer.” That is, we have the following
definitions.
Note the distinction between a local solution and a global solution.
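To make the taxonomy concrete, here is a small sketch (my own illustration, not from the text or the slides): a one-dimensional quartic with two local minimizers, where the starting point determines which minimizer a local method finds.

```python
import numpy as np
from scipy.optimize import minimize

# A quartic with two local minimizers; only one is the global minimizer.
f = lambda x: x[0] ** 4 - 3 * x[0] ** 2 + x[0]

for x0 in (-2.0, 2.0):                # two different starting points
    res = minimize(f, np.array([x0]))
    print(f"start {x0:+.1f} -> x* = {res.x[0]:+.4f}, f(x*) = {res.fun:+.4f}")
# Both answers are local minimizers; the run with the smaller objective
# value (the one started at -2.0) has found the global minimizer.
```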
Taylor’s Theorem and Differentiability
Convention
f : R^n → R.
  ▸ Df(x) is a 1 × n row vector.
  ▸ ∇f(x) = [Df(x)]^T (a column vector).
g : R^n → R^m.
  ▸ Dg(x) ∈ R^{m×n}.
  ▸ [Dg(x)]^T = ∇g(x) ∈ R^{n×m}.
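A quick way to keep these shape conventions straight is to check them numerically. The sketch below (my own, using an arbitrary test map) forms a forward-difference Jacobian of g : R^3 → R^2 and confirms that Dg(x) ∈ R^{2×3} while ∇g(x) = [Dg(x)]^T ∈ R^{3×2}.

```python
import numpy as np

def g(x):
    # An arbitrary test map g : R^3 -> R^2, chosen only for illustration.
    return np.array([x[0] * x[1], np.sin(x[2])])

def jacobian_fd(g, x, h=1e-6):
    """Forward-difference approximation of Dg(x), an m-by-n matrix."""
    m, n = g(x).size, x.size
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (g(x + e) - g(x)) / h   # column j: partials w.r.t. x_j
    return J

x = np.array([1.0, 2.0, 0.5])
J = jacobian_fd(g, x)
print(J.shape)     # (2, 3): Dg(x) is m x n
print(J.T.shape)   # (3, 2): grad g(x) = [Dg(x)]^T is n x m
```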
Theorem (First-order Taylor)
([Wright and Recht(2020), Theorem 2.1])
Let f ∈ C¹. Then
1. f(x + p) = f(x) + ∇f(x + γp)^T p for some γ ∈ (0, 1].
2. f(x + p) = f(x) + ∇f(x)^T p + o(‖p‖), where lim_{t↓0} o(t)/t = 0.
3. f(x + p) = f(x) + ∫₀¹ ∇f(x + γp)^T p dγ.
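A numerical sanity check of part 2 (my own sketch, not from the text): the remainder f(x + p) − f(x) − ∇f(x)^T p should vanish faster than ‖p‖, so the ratio printed below should go to zero as t ↓ 0.

```python
import numpy as np

f = lambda x: np.exp(x[0]) + x[1] ** 2               # a smooth test function
grad = lambda x: np.array([np.exp(x[0]), 2 * x[1]])  # its gradient

x = np.array([0.3, -1.0])
d = np.array([1.0, 2.0])                             # a fixed direction
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    p = t * d
    remainder = f(x + p) - f(x) - grad(x) @ p
    print(t, remainder / np.linalg.norm(p))          # ratio -> 0 as t -> 0
```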
Theorem (Second-order Taylor)
([Wright and Recht(2020), Theorem 2.1])
Let f ∈ C². Then
1. f(x + p) = f(x) + ∇f(x)^T p + ½ p^T ∇²f(x + γp) p for some γ ∈ (0, 1].
2. f(x + p) = f(x) + ∇f(x)^T p + ½ p^T ∇²f(x) p + o(‖p‖²).
3. ∇f(x + p) = ∇f(x) + ∫₀¹ ∇²f(x + γp) p dγ.
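The same kind of check (again my own sketch) for part 2 of the second-order expansion: with the Hessian term included, the remainder is o(‖p‖²), so even after dividing by ‖p‖² the ratio still vanishes.

```python
import numpy as np

f = lambda x: np.exp(x[0]) + x[1] ** 2
grad = lambda x: np.array([np.exp(x[0]), 2 * x[1]])
hess = lambda x: np.array([[np.exp(x[0]), 0.0], [0.0, 2.0]])

x = np.array([0.3, -1.0])
d = np.array([1.0, 2.0])
for t in (1e-1, 1e-2, 1e-3):
    p = t * d
    model = f(x) + grad(x) @ p + 0.5 * p @ hess(x) @ p
    print(t, (f(x + p) - model) / np.linalg.norm(p) ** 2)  # -> 0 as t -> 0
```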
Theorem
First Order Sufficiency Condition: Let f : R^n → R̄ (an extended
real-valued function). Suppose f is convex and x̄ is a local minimizer of
f. Then x̄ is a global minimizer of f over R^n, that is, f(x̄) = min_{x ∈ R^n} f(x).
Proof.
Suppose f is convex and x̄ is a local minimizer of f, and suppose ∃ x̂
which is better (i.e., f(x̂) < f(x̄)). Construct an ε-ball around the point x̄
such that f(x) ≥ f(x̄) for all x ∈ B_ε(x̄).
[Figure: the segment joining x̂ and x̄, with x_λ inside the ε-ball]
Now define x_λ = x̄ + λ(x̂ − x̄). Note this is equivalent to λx̂ + (1 − λ)x̄,
so x_λ is on the segment joining x̄ and x̂. If we take λ > 0 sufficiently
small, we can ensure x_λ ∈ B_ε(x̄). Therefore f(x_λ) ≥ f(x̄).
Proof (cont.)
But we also know f(x̄) ≤ f(x_λ), and
f(x_λ) = f(λx̂ + (1 − λ)x̄)
by our definition of x_λ. Since f is convex,
f(λx̂ + (1 − λ)x̄) ≤ λf(x̂) + (1 − λ)f(x̄).
By assumption, f(x̂) < f(x̄), so
f(x_λ) ≤ λf(x̂) + (1 − λ)f(x̄) < λf(x̄) + (1 − λ)f(x̄) = f(x̄).
This produces a point in the neighborhood we defined that is strictly
better than the local minimizer, a contradiction; so x̄ must be a global
minimizer.
It is worth noting that this result holds regardless of differentiability; it
depends only on the convexity of f.
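A tiny numerical rendering of the argument (my own sketch): for a convex f with a better point x̂, every λ > 0 already yields f(x_λ) < f(x̄), so no ball around x̄ can certify a local minimum.

```python
f = lambda x: (x - 2.0) ** 2      # convex; global minimizer at x = 2
xbar, xhat = 0.0, 2.0             # pretend xbar is a local min; xhat is better

for lam in (0.5, 0.1, 0.01, 0.001):
    x_lam = xbar + lam * (xhat - xbar)
    print(lam, f(x_lam) < f(xbar))   # True for every lam > 0
# Arbitrarily close to xbar there are strictly better points, so xbar
# cannot be a local minimizer: exactly the contradiction in the proof.
```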
Uniqueness of Global Minimizers
Theorem
Uniqueness Result: Let f : R^n → R̄ be strictly convex. Then, if f has a
global minimizer, that global minimizer is unique.
Proof.
Let x¹, x² be distinct global minimizers. For 0 < λ < 1, because f is
strictly convex,
f(λx¹ + (1 − λ)x²) < λf(x¹) + (1 − λ)f(x²).
For both x¹ and x² to be global minimizers we must have f(x¹) = f(x²), so
replace f(x²) with f(x¹) on the right-hand side of the above inequality:
f(λx¹ + (1 − λ)x²) < λf(x¹) + (1 − λ)f(x¹) = f(x¹).
This says ∃ a point x between x¹ and x² such that f(x) < f(x¹), a
contradiction. Hence f must have a unique global minimizer.
As we will see throughout this text, a crucial quantity in optimization is the
Lipschitz constant L for the gradient of f, which is defined to satisfy
    ‖∇f(x) − ∇f(y)‖ ≤ L‖x − y‖, for all x, y ∈ dom(f).    (2.7)
We say that a continuously differentiable function f with this property is
L-smooth or has L-Lipschitz gradients. We say that f is L₀-Lipschitz if
    |f(x) − f(y)| ≤ L₀‖x − y‖, for all x, y ∈ dom(f).    (2.8)
From (2.2), we have
    f(y) − f(x) − ∇f(x)^T (y − x) = ∫₀¹ [∇f(x + γ(y − x)) − ∇f(x)]^T (y − x) dγ.
By using (2.7), we have
    [∇f(x + γ(y − x)) − ∇f(x)]^T (y − x) ≤ ‖∇f(x + γ(y − x)) − ∇f(x)‖ ‖y − x‖ ≤ Lγ‖y − x‖².
By substituting this bound into the previous integral, we obtain the following
result.
Lemma 2.2 Given an L-smooth function f, we have for any x, y ∈ dom(f)
that
    f(y) ≤ f(x) + ∇f(x)^T (y − x) + (L/2)‖y − x‖².    (2.9)
Lemma 2.2 asserts that f can be upper bounded by a quadratic function
whose value at x is equal to f(x).
When f is twice continuously differentiable, we can characterize the
constant L in terms of the eigenvalues of the Hessian ∇²f(x): we can take L
to be any bound on the absolute values of the eigenvalues of ∇²f(x) over dom(f).
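A numerical illustration of Lemma 2.2 (my own sketch): for the least-squares function f(x) = ½‖Ax − b‖², the Hessian is AᵀA everywhere, so L = λ_max(AᵀA) works in (2.7), and the quadratic upper bound (2.9) should hold at any pair of points.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.eigvalsh(A.T @ A).max()   # largest Hessian eigenvalue

for _ in range(5):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    bound = f(x) + grad(x) @ (y - x) + 0.5 * L * np.linalg.norm(y - x) ** 2
    assert f(y) <= bound + 1e-9         # the bound (2.9) holds
print("Lemma 2.2 bound verified at random point pairs")
```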
Strict and Strong Convexity
Definition
From now on, x_1, x_n, etc. will be used to refer to components of vectors,
and x¹, xⁿ, etc. will be used to refer to distinct points.
Definition
Strictly Convex: A function f : R^n → R̄ is strictly convex if for all x, y
with x ≠ y and all α ∈ (0, 1),
f((1 − α)x + αy) < (1 − α)f(x) + αf(y).
Note this definition is identical to that of convex functions, except the
inequality is now strict (the endpoints α = 0, 1 are excluded, since equality
always holds there). For example, f(x) = ‖x‖² is strictly convex, while an
affine function is convex but not strictly convex.
Strong Convexity
Let f : Ω → R and h : Ω → R^n, where Ω ⊂ R^n is an open convex set.
Definition
f is strongly convex with modulus ρ on Ω if ∃ ρ > 0 such that for all
x, y ∈ Ω and λ ∈ [0, 1],
    f((1 − λ)x + λy) ≤ (1 − λ)f(x) + λf(y) − (ρ/2)λ(1 − λ)‖x − y‖².
Definition
h is strongly monotone with modulus ρ on Ω if ∃ ρ > 0 such that for all
x, y ∈ Ω,
    ⟨h(x) − h(y), x − y⟩ ≥ ρ‖x − y‖².
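As a concrete instance (my own sketch, not from the slides): f(x) = ½xᵀQx with Q symmetric positive definite is strongly convex with modulus ρ = λ_min(Q), and the defining inequality can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
Q = M.T @ M + 0.5 * np.eye(4)        # symmetric positive definite
f = lambda x: 0.5 * x @ Q @ x
rho = np.linalg.eigvalsh(Q).min()    # strong convexity modulus

for _ in range(5):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    lam = rng.uniform()
    lhs = f((1 - lam) * x + lam * y)
    rhs = ((1 - lam) * f(x) + lam * f(y)
           - 0.5 * rho * lam * (1 - lam) * np.linalg.norm(x - y) ** 2)
    assert lhs <= rhs + 1e-9         # the strong convexity inequality
print("strong convexity inequality holds with rho =", rho)
```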
Theorem (Strong convexity)
If f is continuously differentiable on Ω, then the following are equivalent:
(a) f is strongly convex with modulus ρ on Ω;
(b) for all x, y ∈ Ω,
    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (ρ/2)‖x − y‖²;
(c) ∇f is strongly monotone with modulus ρ on Ω.
If f is twice continuously differentiable on Ω, then
(d) for all x, y, z ∈ Ω, ⟨x − y, ∇²f(z)(x − y)⟩ ≥ ρ‖x − y‖²
is equivalent to the above.
Proof.
We show (a) ⟺ (b) ⟺ (c).
(a) ⟹ (b): The hypothesis gives
    [f(x + λ(y − x)) − f(x)]/λ ≤ f(y) − f(x) − (1 − λ)(ρ/2)‖x − y‖²,
so taking the limit as λ ↓ 0,
    ⟨∇f(x), y − x⟩ ≤ f(y) − f(x) − (ρ/2)‖x − y‖².
Proof (cont.)
(b) ⟹ (c): Applying (b) twice gives
    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (ρ/2)‖x − y‖²
    f(x) ≥ f(y) + ⟨∇f(y), x − y⟩ + (ρ/2)‖x − y‖².
Adding these inequalities gives
    f(y) + f(x) ≥ f(x) + f(y) + ⟨∇f(x) − ∇f(y), y − x⟩ + ρ‖x − y‖²,
from which the result follows.
Proof (cont.)
(c) ⟹ (b): The hypothesis gives
    ∫₀¹ ⟨∇f(x + t(y − x)) − ∇f(x), y − x⟩ dt ≥ ∫₀¹ ρ t ‖x − y‖² dt,
which implies the result: the left-hand side equals
f(y) − f(x) − ⟨∇f(x), y − x⟩ by the integral form of Taylor's theorem,
and the right-hand side equals (ρ/2)‖x − y‖².
Proof (cont.)
(b) ⟹ (a): Letting y = u and x = (1 − λ)u + λv in (b) gives
    f(u) ≥ f((1 − λ)u + λv) + ⟨∇f((1 − λ)u + λv), λ(u − v)⟩ + (ρ/2)‖λ(u − v)‖².    (1)
Also letting y = v and x = (1 − λ)u + λv in (b) implies
    f(v) ≥ f((1 − λ)u + λv) + ⟨∇f((1 − λ)u + λv), (1 − λ)(v − u)⟩ + (ρ/2)‖(1 − λ)(v − u)‖².    (2)
Adding (1 − λ) times (1) to λ times (2), the inner-product terms cancel and
the quadratic terms combine to (ρ/2)λ(1 − λ)‖u − v‖², which gives the
required result.
Proof (cont.)
To complete the proof, we assume that f is twice continuously
differentiable on Ω.
(d) ⟹ (c): This follows from the hypothesis since
    ⟨∇f(x) − ∇f(y), x − y⟩ = ⟨∫₀¹ ∇²f(y + t(x − y))(x − y) dt, x − y⟩ ≥ ρ‖x − y‖².
(c) ⟹ (d): Let x, y, z ∈ Ω. Then z + τ(x − y) ∈ Ω for sufficiently small τ > 0, so
    ⟨x − y, ∇²f(z + τ(x − y))(x − y)⟩ = ⟨x − y, ∇f(z + τ(x − y)) − ∇f(z)⟩/τ + o(1)
                                      ≥ ρ‖x − y‖² + o(1).
The result follows in the limit as τ → 0.
Then we give the definition of the strong convexity modulus m:
    f((1 − α)x + αy) + ½ m α(1 − α)‖x − y‖₂² ≤ (1 − α)f(x) + αf(y).
Strong convexity clearly implies strict convexity.
Lemma
If f is differentiable and strictly convex, that is, for x ≠ y and λ ∈ (0, 1),
    f(x + λ(y − x)) < f(x) + λ(f(y) − f(x)),
then
    f(y) > f(x) + ∇f(x)^T (y − x).
Proof.
Suppose f(y) = f(x) + ⟨∇f(x), y − x⟩ for some x ≠ y. Let
    φ(t) = f(x + t(y − x)) − f(x) − t⟨∇f(x), y − x⟩.
The supposition says φ(1) = φ(0) = 0, and we note that φ′(0) = 0. Since φ is
convex, φ(t) ≤ (1 − t)φ(0) + tφ(1) = 0 for t ∈ [0, 1], while the gradient
inequality gives φ(t) ≥ φ(0) + tφ′(0) = 0. So φ(t) = φ(0) for all t ∈ [0, 1],
which contradicts φ being strictly convex.
Since convexity already gives f(y) ≥ f(x) + ⟨∇f(x), y − x⟩, it follows that
f(y) > f(x) + ⟨∇f(x), y − x⟩ for all x ≠ y.
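A last quick check (my own sketch): f(x) = x⁴ is strictly convex but not strongly convex, and the strict gradient inequality of the lemma holds at distinct points.

```python
import numpy as np

f = lambda x: x ** 4        # strictly convex on R, but not strongly convex
df = lambda x: 4 * x ** 3   # derivative

rng = np.random.default_rng(2)
for _ in range(5):
    x, y = rng.uniform(-2.0, 2.0, size=2)
    if x != y:
        assert f(y) > f(x) + df(x) * (y - x)   # strict inequality of the lemma
print("f(y) > f(x) + f'(x)(y - x) verified at random distinct points")
```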
S. J. Wright and B. Recht.
Optimization for Data Analysis.
in proof, 2020.