BV21
Key words. Riemannian least-squares problem, sensitivity, condition number, local minimizers,
Weingarten map, second fundamental form
In [15] we called this type of problem a parameter identification problem (PIP). Ap-
plication areas where PIPs originate are data completion with low-rank matrix and
tensor decompositions [16, 23, 38, 44, 60, 63], geometric modeling [18, 48, 61], computer
vision [32, 37, 50], and phase retrieval problems in signal processing [9, 12].
The manifold O is called the output manifold because its elements are the outputs
of the optimization problem. The outputs are the quantities of interest; for instance,
they could be the unknown system parameters that one wants to identify based on a
given observation a and the forward description of the system F . In order to distin-
guish between the inputs x ∈ I of the IP and the inputs a ∈ Rn of the optimization
formulation (PIP), we refer to the latter as ambient inputs. In this paper, we study
the sensitivity of the output of the optimization problem (PIP) (and an implicit gen-
eralization thereof) when the ambient input a ∈ Rn is perturbed by considering its
condition number κ at global and local minimizers and critical points.
1.1. The condition number. Condition numbers are one of the cornerstones of
numerical analysis [7,17,26,36,40,68,69]. Their definition was introduced for Riemannian
manifolds already in the early days of numerical analysis. In 1966, Rice [58] presented
the definition in the case where input and output space are general nonlinear metric
spaces; the special case of Riemannian manifolds is [58, Theorem 3]. Let I and O be
Riemannian manifolds, and let f : I → O be a map between them. The condition number of f at x ∈ I is

(1.1)   κ[f](x) := lim_{ε→0} sup_{x′∈I : dist_I(x,x′)≤ε} dist_O(f(x), f(x′)) / dist_I(x, x′),

where dist_I(·, ·) denotes the distance given by the Riemannian metric on I and likewise
for O. It follows immediately that the condition number yields an asymptotically
sharp upper bound on the perturbation of the output of a computational problem
when subjected to a perturbation of the input x. Indeed, we have

dist_O(f(x), f(x′)) ≤ κ[f](x) · dist_I(x, x′) + o(dist_I(x, x′)).
The number κ[f](x) is an absolute condition number. On the other hand, if I is a
submanifold of a vector space equipped with the norm ‖·‖_I and analogously for O
and ‖·‖_O, then the relative condition number is κ_rel[f](x) := κ[f](x) ‖x‖_I / ‖f(x)‖_O. We
focus on computing the absolute condition number, as the relative condition number
can be inferred from the absolute one.
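Rice's definitions are easy to probe numerically. The following sketch is our own illustration (not from this paper): for the scalar map f(x) = x², the absolute condition number is |f′(x)| = 2|x|, and the relative condition number is constant equal to 2.

```python
# Toy illustration (not from the paper): estimate Rice's absolute condition
# number of f(x) = x^2 at x by sampling small perturbations, then form the
# relative condition number kappa_rel = kappa * |x| / |f(x)|.

def f(x):
    return x**2

def estimate_abs_condition(f, x, eps=1e-6):
    # worst-case ratio dist(f(x'), f(x)) / dist(x', x) over small perturbations
    return max(abs(f(x + s*eps) - f(x)) / eps for s in (1.0, -1.0))

x = 1.7
kappa = estimate_abs_condition(f, x)      # ~ |f'(x)| = 2|x| = 3.4
kappa_rel = kappa * abs(x) / abs(f(x))    # ~ 2
print(kappa, kappa_rel)
```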
1.2. Condition number of approximation problems. Our starting point
is Rice’s classic definition (1.1), which is often easy to compute for differentiable
maps, rather than Demmel and Renegar’s inverse distance to ill-posedness, which is
generally more difficult to compute. Nevertheless, Rice’s definition cannot be applied
unreservedly to measure how sensitive a solution of (PIP) is to perturbations of the
ambient input a ∈ Rn . At first glance, it seems that we could apply (1.1) to
φ : R^n → O,  a ↦ argmin_{y∈O} ‖F(y) − a‖².
The tangent bundle is a smooth manifold of dimension 2m. The normal bundle is
constructed similarly. The normal space of M in R^n at p is the orthogonal complement
in R^n of the tangent space T_p M, namely N_p M := (T_p M)^⊥. The normal bundle is a
smooth embedded submanifold of R^n × R^n of dimension n. Formally, we write
NM := ⊔_{p∈M} N_p M = {(p, η) | p ∈ M and η ∈ N_p M}.
Given a positive, continuous function δ : M → R, consider the subset V_δ := {(p, η) ∈ NM | ‖η‖ < δ(p)}, where the norm is the one induced from the Riemannian metric; in our setting it is the standard norm on R^n. The map E : NM → R^n, (p, η) ↦ p + η is smooth and there exists a δ such that the restriction E|_{V_δ} becomes a diffeomorphism onto its image
[47, Theorem 6.24]. Consequently, T = E(Vδ ) is an n-dimensional, open, smooth,
embedded submanifold of Rn that forms a neighborhood of M. The submanifold T
is called a tubular neighborhood of M and δ is called its height (function).
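The construction can be made concrete for the unit circle M ⊂ R², where N_pM = span{p} and the constant height δ = 1 works. The sketch below is our own toy example, not from the paper:

```python
import numpy as np

# Toy example (not from the paper): the tubular-neighborhood map
# E(p, eta) = p + eta for the unit circle M in R^2, where N_pM = span{p}.
# With height delta = 1, E maps V_delta bijectively onto the annulus
# 0 < ||a|| < 2, with inverse a -> (a/||a||, a - a/||a||).

def E(p, eta):
    return p + eta

def E_inv(a):
    p = a / np.linalg.norm(a)   # closest point on the circle
    return p, a - p             # base point and normal vector

theta, s = 0.7, -0.4            # |s| < delta = 1
p = np.array([np.cos(theta), np.sin(theta)])
eta = s * p                     # a normal vector at p with ||eta|| < 1
q, zeta = E_inv(E(p, eta))
print(np.allclose(q, p), np.allclose(zeta, eta))
```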
2.3. Vector fields. A smooth vector field is a smooth map X : M → TM
from the manifold to its tangent bundle taking a point p to a tangent vector vp . By
the notation X|p we mean X(p). A smooth vector field X on a properly embedded
submanifold M ⊂ N can be extended to a smooth vector field X̂ on N such that it agrees on M: X̂|_M = X.
A smooth frame of Mm is a tuple of m smooth vector fields (X1 , . . . , Xm ) that
are linearly independent: (X1 |p , . . . , Xm |p ) is linearly independent for all p ∈ M. If
these tangent vectors are orthonormal for all p, then the frame is called orthonormal.
Let X be a smooth vector field on M. An integral curve of X is a smooth curve
γ ⊂ M such that Dt γ = X|γ(t) for all t ∈ (−1, 1). For every p ∈ M the vector field X
generates an integral curve γ with starting point p = γ(0). There is always a smooth
curve realizing a tangent vector v_p ∈ T_p M, i.e., a curve γ with p = γ(0) and v_p = D_0 γ.
2.4. Riemannian manifolds. A Riemannian manifold is a smooth manifold
M equipped with a Riemannian metric g: a positive definite symmetric bilinear form
g_p : T_p M × T_p M → R that varies smoothly with p. The metric g_p induces a norm on
T_p M denoted ‖t_p‖_M := √(g_p(t_p, t_p)). The only Riemannian metric we will explicitly
use in computations is the standard Riemannian metric of Rn , i.e., the Euclidean
inner product gp (x, y) = hx, yi = xT y.
If M and N are Riemannian manifolds with induced norms k · kM and k · kN ,
and if F : M → N is a differentiable map, then the spectral norm of Dp F is
(2.2)   ‖D_p F‖_{M→N} := sup_{t_p ∈ T_p M \ {0}} ‖(D_p F)(t_p)‖_N / ‖t_p‖_M .
If the manifolds are clear from the context we can drop the subscript and write kDp F k.
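When M ⊂ R^n is embedded and N = R^q carries the Euclidean metric, (2.2) reduces to the largest singular value of the ambient Jacobian restricted to an orthonormal tangent basis. A sketch under these assumptions, with a toy map of our choosing on the unit circle:

```python
import numpy as np

# Sketch (our toy example, not from the paper): when M ⊂ R^n is embedded and
# N = R^q is Euclidean, (2.2) equals sigma_1 of the ambient Jacobian of F
# restricted to an orthonormal basis Q of T_pM. Here M is the unit circle
# in R^2 and F(x, y) = (2x, y).

theta = 0.3
p = np.array([np.cos(theta), np.sin(theta)])
Q = np.array([[-np.sin(theta)], [np.cos(theta)]])  # orthonormal basis of T_pM
J = np.array([[2.0, 0.0], [0.0, 1.0]])             # ambient Jacobian of F at p

spectral_norm = np.linalg.svd(J @ Q, compute_uv=False)[0]
# closed form for this example: sqrt(4 sin^2(theta) + cos^2(theta))
print(spectral_norm)
```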
2.5. Riemannian gradient and Hessian. Let M be a Riemannian manifold
with metric g, and let f : M → R be a smooth function. The Riemannian gradient
of f at p ∈ M is the tangent vector ∇f |p ∈ Tp M that is dual to the derivative
Dp f : Tp M → R under g, i.e., gp (∇f |p , t) = (Dp f )(t) for all t ∈ Tp M. It is known
that ∇f is a vector field, hence explaining the notation ∇f |p for its value at p ∈ M.
When M ⊂ R^n is a Riemannian embedded submanifold with the inherited Euclidean metric, the Riemannian Hessian [2, Definition 5.5.1] can be defined without explicit reference to connections, as in [13, Chapter 5]:

Hess_p(f) : T_p M → T_p M,  η_p ↦ P_{T_p M}((D_p ∇f)(η_p)),

where ∇f now has to be viewed as a vector field on R^n and P_{T_p M} projects orthogonally onto the tangent space T_p M. In other words, the Hessian takes η_p to P_{T_p M} (d/dt)|_{t=0} ∇f|_{γ(t)}, where γ(t) ⊂ M is a curve realizing the tangent vector η_p.
The Hessian Hessp (f ) is a symmetric bilinear form on the tangent space Tp M
whose (real) eigenvalues contain essential information about the objective function f
at p. If the Hessian is positive definite, negative definite, or indefinite at a critical
point p of f (i.e., ∇f |p = 0 or, equivalently, Dp f = 0), then p is respectively a local
minimum, local maximum, or saddle point of f .
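The projected-derivative description of the Hessian is easy to check numerically. The sketch below is our own toy example (the height function f(p) = ⟨p, e₁⟩ on the unit circle); at the critical point p = e₁ it returns ≈ −1, flagging a local maximum:

```python
import numpy as np

# Toy check (not from the paper) of the projected-derivative formula for the
# Riemannian Hessian on an embedded submanifold: M is the unit circle in R^2
# and f(p) = <p, e1>. Its Riemannian gradient is grad f|_p = P_{T_pM}(e1);
# the Hessian applies P_{T_pM} to the derivative of the gradient field along
# a curve realizing the tangent vector eta.

def proj_tangent(p, v):
    # T_pM is the orthogonal complement of p for the unit circle
    return v - np.dot(p, v) * p

def riemannian_grad(p):
    return proj_tangent(p, np.array([1.0, 0.0]))

def hess_quadratic_form(p, eta, h=1e-6):
    # gamma(t) = cos(t) p + sin(t) eta realizes the unit tangent vector eta
    gamma = lambda t: np.cos(t)*p + np.sin(t)*eta
    dgrad = (riemannian_grad(gamma(h)) - riemannian_grad(gamma(-h))) / (2*h)
    return np.dot(eta, proj_tangent(p, dgrad))

p = np.array([1.0, 0.0])    # critical point of f: grad f|_p = 0
eta = np.array([0.0, 1.0])  # unit tangent vector at p
val = hess_quadratic_form(p, eta)
print(val)                  # ~ -1: negative definite, so p is a local maximum
```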
3. Measuring curvature with the Weingarten map. A central role in this
paper is played by the Riemannian Hessian of the distance function
(3.1)   d : I → R,  x ↦ ½‖x − a‖²,
which measures the Euclidean distance between a fixed point a ∈ Rn and points on the
embedded submanifold I ⊂ Rn . It is the inverse of this Riemannian Hessian matrix
that will appear as an additional factor to be multiplied with D(x,y) πO (D(x,y) πI )−1
from (1.2) in the condition numbers of the problems studied in Sections 4 to 6.
The Riemannian Hessian of d contains classic differential-geometric information
about the way I curves inside of Rn [54]. Indeed, from [3, Equation (10)] we conclude
that it can be expressed at x ∈ I as
Hess_x(d) = 1_{T_x I} + II_x(P_{N_x I}(x − a)),

where P_{N_x I} is the projection onto the normal space, and II_x : N_x I → S²(T_x I) is
a classical differential invariant called the second fundamental form [27, 46, 51, 52, 54],
which takes a normal vector in Nx M and sends it to a symmetric bilinear form on
the tangent space Tx M. If we let ηx := a − x, then at a critical point of d we have
ηx ∈ Nx I, so that the Hessian of interest satisfies
(H) Hη := Hessx (d) = 1 − Sη , where Sη := II x (ηx ).
Note that we dropped the base point x ∈ I from the notation in Hη and Sη . In the
above, Sη is interpreted as a symmetric endomorphism on Tx I and it is classically
called the shape operator or Weingarten map [27, 46, 51, 52, 54].
The Weingarten map measures how much the embedding into the ambient space
R^n bends the manifold I^m in the following concrete way [46, Chapter 8]. For η ≠ 0, let w := η/‖η‖ and let c₁, …, c_m be the eigenvalues of S_w = (1/‖η‖) S_η. As S_w is self-adjoint, all c_i are real. The eigenvalues of S_η are then c₁‖η‖, …, c_m‖η‖. The c_i are
called the principal curvatures of I at x in direction of η. They have the following
classic geometric meaning [59, Chapter 1]: If u_i ∈ T_x I is a unit length eigenvector of the eigenvalue c_i, then locally around x and in the direction of u_i the manifold I contains an infinitesimal arc of a circle with center x + c_i⁻¹ w = x + (c_i‖η‖)⁻¹ η through x. This circle is called an osculating circle. The radii r_i := |c_i|⁻¹ of those circles are called the critical radii of I at x in direction η. See also Figure 4.1.
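These notions are easy to verify numerically on a parabola, the example used in Figure 4.1. The sketch below (our own finite-difference computation, using the normal-field characterization of the Weingarten map, cf. (8.1)) recovers the principal curvature c₁ = 1 and critical radius 1 at x = 0:

```python
import numpy as np

# Toy check (not the paper's code): principal curvature of the parabola
# I = {(t, t^2/2)} at x = 0 via the normal-field characterization of the
# Weingarten map. The unit normal field is N(t) = (-t, 1)/sqrt(1 + t^2) and
# S_w(v) = P_{T_xI}(-dN/dt) in the direction of the upward unit normal w.

h = 1e-6
N = lambda t: np.array([-t, 1.0]) / np.hypot(t, 1.0)
dN = (N(h) - N(-h)) / (2*h)      # derivative of the normal field at t = 0
tangent = np.array([1.0, 0.0])   # T_0 I is spanned by e1
c1 = np.dot(-dN, tangent)        # principal curvature: ~ 1
r1 = 1.0 / abs(c1)               # critical radius (osculating circle): ~ 1
print(c1, r1)
```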
For several submanifolds I ⊂ R^n relevant in optimization applications, the Weingarten map S_η has been computed in the literature. Absil, Mahoney, and Trumpf [3] provide expressions
for the sphere, Stiefel manifold, orthogonal group, Grassmann manifold, and low-rank
matrix manifold. Feppon and Lermusiaux [33] further extended these results by com-
puting the Weingarten maps and explicit expressions for the principal curvatures of
the isospectral manifold and “bi–Grassmann” manifold, among others. Heidel and
Schultz [38] computed the Weingarten map of the manifold of higher-order tensors of
fixed multilinear rank [42].
4. The condition number of critical point problems. Prior to treating
(IPIP) in general, we focus on an important and simpler special case. This case arises
when the solution manifold S ⊂ I × O is the graph of the identity map 1 : I → I, so
O = I. Given an ambient input a ∈ Rn , the problem (IPIP) now comprises finding
an output x ∈ I so that the distance function d from (3.1) is minimized. This can
be formulated as the standard Riemannian least-squares problem

(4.1)   min_{x∈I} ½‖x − a‖².
Computing global minimizers of this optimization problem is usually too ambitious
a task when I is a complicated nonlinear manifold. Most Riemannian optimization
methods will converge to local minimizers by seeking to satisfy the first-order necessary
optimality conditions of (4.1), namely a−x ∈ Nx I. We say that such methods attempt
to solve the critical point problem (CPP) whose implicit formulation is

S_CPP := {(a, x) ∈ R^n × I | a − x ∈ N_x I}.

We call S_CPP the critical point locus. It is an exercise to show that S_CPP is linearly diffeomorphic to the normal bundle NI, so that we have the following result.
Lemma 4.1. SCPP is an embedded submanifold of Rn × I of dimension n.
Consequently, the CPP falls into the realm of the theory of condition from (1.2).
We denote the coordinate projections of SCPP by ΠRn : SCPP → Rn and ΠI : SCPP →
I for distinguishing them from the projections associated to S. The corresponding
submanifold of well-posed tuples is denoted by W_CPP. The fact that it is a manifold is the following result, proved in Appendix A.1.
Lemma 4.2. WCPP is an open dense embedded submanifold of SCPP .
We show in Appendix A.1 that the condition number of this problem can be
characterized succinctly as follows.2
Theorem 4.3 (Critical points). Let (a, x) ∈ S_CPP and η := a − x. Then,

κ[S_CPP](a, x) = ‖H_η⁻¹‖₂,

where H_η is given by (H). Moreover, if (a, x) ∈ S_CPP \ W_CPP, then H_η is not invertible.
Corollary 4.4. κ[S_CPP] : S_CPP → R ∪ {+∞} is continuous, and κ[S_CPP](a, x)
is finite if and only if (a, x) ∈ W_CPP.
Curvature has entered the picture in Theorem 4.3 in the form of the Riemannian
Hessian Hη from (H). It is an extrinsic property of I, however, so that the condition
number depends on the Riemannian embedding of I into Rn . Intuitively this is
coherent, as different embeddings give rise to different computational problems each
with their own sensitivity to perturbations in the ambient space.
Let us investigate the meaning of the foregoing theorem a little further. Recall,
from Section 3, the definition of the principal curvatures ci as the eigenvalues of the
Weingarten map S_w, where w = η/‖η‖ and η = a − x. Then, Theorem 4.3 implies that

(4.2)   κ[S_CPP](a, x) = max_{1≤i≤m} 1/|1 − c_i‖η‖| .
Figure 4.1. The graphs show ambient inputs a for the CPP of the parabola. The osculating
circle with critical radius lies above the parabola, and its center (the gray point) is a focal point of
the parabola, i.e., it is a point in ΣCPP = SCPP \ WCPP . In graph (a), the ambient input a also
lies above the parabola, so η = a − x points towards the focal point. This means that the principal
curvature at x in direction of η is c1 > 0. Moreover, c1 kηk < 1, and thus, by (4.2), we have
κ[SCPP ](a, x) > 1. In other words, the curvature of the parabola amplifies the perturbation k∆ak.
In graph (b), the ambient input a lies below the parabola, so η = a − x points away from the focal
point. This means that the principal curvature at x in direction of η is negative, so that by (4.2) we
have κ[SCPP ](a, x) < 1. The curvature of the parabola shrinks the perturbation k∆ak. In graph (c),
the ambient input a lies above the parabola, so that the corresponding critical curvature is positive.
However, now c1 kηk > 1, and so by (4.2) we have κ[SCPP ](a, x) < 1. Again, the curvature of the
parabola shrinks the perturbation k∆ak.
The locus of ill-posed inputs for the CPP is Σ_CPP = S_CPP \ W_CPP. The
projection ΠRn (ΣCPP ) is well studied in the literature. Thom [62] calls it the target
envelope of SCPP , while Porteous [55] calls it the focal set of I, and for curves in the
plane it is called the evolute; see Chapter 10 of [51]. The points in the intersection of
(x + Rη) × {x} with Σ_CPP are given by the multiples of the normal vector η whose
length equals one of the critical radii r1 , . . . , rm ; this is also visible in Figure 9.3.
Comparing with (4.2) we find
1/κ[S_CPP](a, x) = min_{η′ ∈ Rη : (x+η′, x) ∈ Σ_CPP} |‖η′‖ − ‖η‖| / ‖η′‖ ,  where η = a − x.
This is an interpretation of Rice’s condition number in the spirit of Demmel [24, 25]
and Renegar [56, 57].
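Formula (4.2) can also be validated experimentally. The sketch below is our own construction on the parabola I = {(s, s²/2)} of Figure 4.1: at x = 0 the principal curvature in the upward normal direction is c₁ = 1, so the ambient input a = (0, t) with t < 1 has condition number 1/(1 − t), which we recover by perturbing a and tracking the closest point:

```python
import numpy as np

# Toy validation (our construction) of formula (4.2) on the parabola
# I = {(s, s^2/2)}: for a = (0, t) with t < 1 the predicted condition
# number is kappa = 1/(1 - t). We estimate it by perturbing a along the
# tangent direction at x = 0 and tracking the closest point on I.

def closest_point(a, s0=0.0, iters=50):
    # Newton iteration on the critical equation (s - a1) + (s^2/2 - a2) s = 0
    s = s0
    for _ in range(iters):
        g = (s - a[0]) + (s**2/2 - a[1]) * s
        dg = 1.0 + (s**2/2 - a[1]) + s**2
        s -= g / dg
    return np.array([s, s**2/2])

t = 0.5                          # a lies between x = 0 and the focal point (0, 1)
a = np.array([0.0, t])
eps = 1e-7
delta = np.array([eps, 0.0])     # worst-case perturbation: along T_xI
x0 = closest_point(a)
x1 = closest_point(a + delta)
kappa_est = np.linalg.norm(x1 - x0) / eps
print(kappa_est)                 # ~ 1/(1 - t) = 2
```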
Remark 4.5. If I is a smooth algebraic variety in Rn , then solving SCPP reduces
to solving a system of polynomial equations. For almost all inputs a ∈ Rn this system
has a constant number of solutions over the complex numbers, called the Euclidean
distance degree [29]. Theorem 4.3 thus characterizes the condition number of real
solutions of this special system of polynomial equations.
5. The condition number of approximation problems. Next, we analyse
the condition number of global minimizers of (IPIP). This setting will be crucial for
a finer interpretation of Theorems 4.3 and 6.2 in Section 7.
By suitably choosing the height function δ [41, Chapter 4, Section 5], we can
assume for each ambient input a in a tubular neighborhood T ⊂ Rn of I that there
is a unique point x on I that minimizes the distance from I to a. In this case, the
computational problem that consists of finding global minimizers of (IPIP) can be
modeled implicitly with the graph
κ[S](x, y) / max_{1≤i≤m} (1 − c_i‖η‖)  ≤  κ[S_GCPP](a, x, y)  ≤  κ[S](x, y) / min_{1≤i≤m} (1 − c_i‖η‖),
Our main motivation for studying critical points instead of only the global min-
imizer as in (AP) stems from our desire to predict the sensitivity of outputs of Rie-
mannian optimization methods for solving approximation problems like (4.1), (PIP),
and (IPIP). When applied to a well-posed optimization problem, these methods in
general only guarantee convergence to critical points [2]. However, in practice, con-
vergence to critical points that are not local minimizers is extremely unlikely [2]. The
reason is that they are unstable outputs of these algorithms: Applying a tiny per-
turbation to such critical points will cause the optimization method to escape their
vicinity, and with high probability, converge to a local minimizer instead.
Remark 7.2. One should be careful to separate condition from numerical stabil-
ity. The former is a property of a problem, the latter a property of an algorithm. We
do not claim that computing critical points of GCPPs other than local minimizers is
an ill-conditioned problem. We only state that many Riemannian optimization methods
are unstable algorithms for computing them via (IPIP). This is not a bad property,
since the goal was to find minimizers, not critical points!
For critical points of the GCPP that are local minimizers of the problem in Propo-
sition 7.1 we can characterize their condition number as the condition number of the
unique global minimizer of a localized problem. This result is proved in Appendix A.4.
Theorem 7.3 (Condition of local minimizers). Let the input manifold I, the
output manifold O, and well-posed loci W and WGCPP be as in (GCPP). Assume that
(a⋆, x⋆, y⋆) ∈ W_GCPP and x⋆ is a local minimizer of d_{a⋆} : I → R, x ↦ ½‖a⋆ − x‖².
Then, there exist open neighborhoods Aa? ⊂ Rn around a? and N(x? ,y? ) ⊂ W around
(x? , y ? ), so that
ρ_{(a⋆,x⋆,y⋆)} : A_{a⋆} → O,  a ↦ π_O( argmin_{(x,y) ∈ N_{(x⋆,y⋆)}} ½‖a − π_I(x, y)‖² )

is well defined and its condition number satisfies

κ[ρ_{(a⋆,x⋆,y⋆)}](a) = κ[S_GCPP](a, P_{I_{x⋆}}(a), ρ_{(a⋆,x⋆,y⋆)}(a)),
where I_{x⋆} = π_I(N_{(x⋆,y⋆)}) is the image of N_{(x⋆,y⋆)} under the projection onto the first factor, and P_{I_{x⋆}}(a) = argmin_{x∈I_{x⋆}} ‖a − x‖² is the nonlinear projection onto I_{x⋆}.
Taking I = O and S as the graph of the identity map 1, the next result follows.
Corollary 7.4. Let the input manifold I, and well-posed loci W and WCPP
be as in (CPP). Assume that (a? , x? ) ∈ WCPP is well-posed and that x? is a local
minimizer of d_{a⋆} : I → R, x ↦ ½‖a⋆ − x‖². Then, there exist open neighborhoods
Aa? ⊂ Rn around a? and Ix? ⊂ I around x? , so that
ρ_{(a⋆,x⋆)} : A_{a⋆} → I,  a ↦ argmin_{x ∈ I_{x⋆}} ½‖a − x‖²
is an (AP) and its condition number is κ[ρ(a? ,x? ) ](a) = κ[SCPP ](a, ρ(a? ,x? ) (a)).
An interpretation of these results is as follows. The condition number of a gen-
eralized critical point (a? , x? , y ? ) that happens to correspond to a local minimum of
(IPIP) equals the condition number of the map ρ(a? ,x? ,y? ) . This smooth map mod-
els a Riemannian AP on the localized computational problem N(x? ,y? ) ⊂ W which
has a unique global minimum for all ambient inputs from Aa? . Consequently, every
local minimum of (IPIP) is robust in the sense that it moves smoothly according to
ρ(a? ,x? ,y? ) in neighborhoods of the input a? and (x? , y ? ) ∈ W. In this open neighbor-
hood, the critical point remains a local minimum.
The foregoing entails that if a local minimum (x? , y ? ) ∈ W is computed for
(IPIP) with ambient input a? using Riemannian optimization, then computing a local
minimum (x, y) for (IPIP) with perturbed input a? + ∆ and starting point (x? , y ? )
results in distO (y, y ? ) ≤ κ[SGCPP ](a? , x? , y ? )k∆k + o(k∆k), where distO is as in (1.1),
provided that k∆k is sufficiently small.
8. Computing the condition number. Three main ingredients are required
for evaluating the condition number of (IPIP) numerically:
(i) the inverse of the Riemannian Hessian Hη from (H),
(ii) the derivative of the computational problem (D(x,y) πO )(D(x,y) πI )−1 , and
(iii) the spectral norm (2.2).
The only additional item for evaluating the condition number of (IPIP) relative to
the idealized inverse problem “given x ∈ I, find a y ∈ O such that (x, y) ∈ S” is the
curvature term Hη−1 from item (i). Therefore, we focus on evaluating (i) and only
briefly mention the standard techniques for (ii) and (iii).
The derivative in (ii) is often available analytically as in Remark 5.4. If no
analytical expression can be derived, it can be numerically approximated when O ⊂ Rq
is an embedded submanifold by computing the standard Jacobian of S, viewed as the
graph of a map Rn → Rq , and composing it with the projections onto Tx I and Ty O.
Evaluating the spectral norm in (iii) is a linear algebra problem. Let I ⊂ Rn be
embedded with the standard inherited metric and let the Riemannian metric of O be
given in coordinates by a matrix G. If G = RT R is its Cholesky factorization [36],
then ‖A‖_{I→O} = ‖RA‖₂ = σ₁(RA), where A : T_x I → T_y O is a linear map and
k · k2 is the usual spectral norm, i.e., the largest singular value σ1 (·). The largest
singular value can be computed numerically in O(min{m2 p, p2 m}) operations, where
m = dim I and p = dim O, using standard algorithms [36]. If the linear map A can
be applied in fewer than O(mp) operations, a power iteration can be more efficient [6].
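A sketch of item (iii) under the stated assumptions (the metric on O given in coordinates by a positive definite G with Cholesky factor R; all matrices below are invented for illustration):

```python
import numpy as np

# Sketch of item (iii): ||A||_{I->O} = ||RA||_2 = sigma_1(RA) when the metric
# on O is given in coordinates by G = R^T R. The matrices below are invented
# for illustration; we check sigma_1(RA)^2 = lambda_max(A^T G A).

rng = np.random.default_rng(0)
m, p = 4, 3
A = rng.standard_normal((p, m))       # coordinate matrix of A : T_xI -> T_yO
M = rng.standard_normal((p, p))
G = M @ M.T + p * np.eye(p)           # some positive definite metric on T_yO

R = np.linalg.cholesky(G).T           # numpy returns lower L with G = L L^T
norm_A = np.linalg.svd(R @ A, compute_uv=False)[0]
lam_max = np.linalg.eigvalsh(A.T @ G @ A).max()
print(norm_A, np.sqrt(lam_max))       # equal up to rounding
```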
The Riemannian Hessian Hη can be computed analytically using standard tech-
niques from differential geometry for computing Weingarten maps [27, 46, 51, 52, 54],
or it can be approximated numerically [11]. Numerical approximation of the Hes-
sian with forward differences is implemented in the Matlab package Manopt [14]. Its
inverse can in both cases be computed with O(m3 ) operations.
We found two main analytical approaches helpful to compute Weingarten maps
of embedded submanifolds I ⊂ Rn . The first relies on the following characterization
of the Weingarten map Sηx in the direction of a normal vector ηx = a−x. It is defined
in [27, Section 6.2] as
(8.1)   S_{η_x} : T_x I → T_x I,  v_x ↦ P_{T_x I}( −(∇̃_{v_x} N)|_x ),

where N is a smooth normal vector field on I with N|_x = η_x, and ∇̃ denotes the Euclidean directional derivative

(∇̃_{v_x} Y)|_x = (d/dt)|_{t=0} Y|_{γ(t)},

where Y is an arbitrary vector field on I and γ(t) is a smooth curve realizing v_x [46, Lemma 4.9]. In other words, S_{η_x} maps v_x to the tangential part of the usual directional derivative of N in direction of v_x ∈ T_x I ⊂ R^n.
The second approach relies on the second fundamental form of I, which is defined
as the projection of the Euclidean covariant derivative onto the normal space Nx I:
II_x(X, Y) := (II(X, Y))|_x := P_{N_x I}( (∇̃_{X|_x} Y)|_x ),
where now both X and Y are vector fields on I. The second fundamental form is sym-
metric in X and Y , and II x (X, Y ) only depends on the tangent vectors X|x and Y |x .
It can be viewed as a smooth map II_x : T_x I × T_x I → N_x I. Contracting the second
fundamental form with a normal vector ηx ∈ Nx I yields an alternative definition of
the Weingarten map: For all v, w ∈ Tx I we have hSηx (v), wi = hII x (v, w), ηx i, where
h·, ·i is the Euclidean inner product. Computing the second fundamental form is often
facilitated by pushing forward vector fields through a diffeomorphism F : M → N .
In this case, there exists a vector field Y on N that is F -related to a vector field X
Figure 9.1. Setup of the 2-camera triangulation problem. The world coordinates of y ∈ R3 are
to be reconstructed from the projections x1 , x2 ∈ R2 (in the respective image coordinates) of y onto
the image planes of the cameras with centers at c1 and c2 (in world coordinates) respectively.
on M: Y |p = (F∗ X)|p := (DF −1 (p) F )(X|F −1 (p) ). The integral curves generated by X
and Y = F∗ X are related by Proposition 9.6 of [47]. This approach will be illustrated
in the next section.
As a rule of thumb, we find that computation of the condition number of (IPIP) is
approximately as expensive as one iteration of a Riemannian Newton method for solv-
ing this problem. Typically, the cost is not worse than O(m3 +nm2 +min{m2 p, p2 m})
operations, where the terms correspond one-to-one to items (i)–(iii), and where I m ⊂
Rn and p = dim O.
9. Triangulation in computer vision. A rich source of approximation prob-
lems whose condition can be studied with the proposed framework is multiview ge-
ometry in computer vision; for an introduction to this domain see [32, 37, 50]. The
tasks consist of recovering information from r ≥ 2 camera projections of a scene in the
world R3 . We consider the pinhole camera model. In this model the image formation
process is modeled by the following transformation, as visualised in Figure 9.1:
μ_r : y ↦ ( (A_ℓ y + b_ℓ)/(c_ℓᵀ y + d_ℓ) )_{ℓ=1}^{r} ,

where A_ℓ ∈ R^{2×3}, b_ℓ ∈ R², c_ℓ ∈ R³, and d_ℓ ∈ R.
Consequently, the triangulation problem can be cast as a GCPP provided that two
conditions hold: (i) the set of consistent point correspondences is an embedded man-
ifold, and (ii) intersecting the back-projected rays is a smooth map from aforemen-
tioned manifold to R3 . We verify both conditions in the following subsection.
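For concreteness, the transformation μ_r and its homogeneous-coordinate reformulation via camera matrices can be sketched as follows; all camera data below is invented for illustration:

```python
import numpy as np

# Sketch: the pinhole projection mu_r for r = 3 invented cameras, and the same
# computation via 3x4 camera matrices P_l = [A_l, b_l; c_l^T, d_l] acting on
# homogeneous coordinates. All camera data is made up for illustration.

cams = [
    (np.array([[1., 0., 0.], [0., 1., 0.]]), np.array([0., 0.]),
     np.array([0., 0., 1.]), 4.0),
    (np.array([[0., 1., 0.], [0., 0., 1.]]), np.array([0.1, 0.2]),
     np.array([1., 0., 0.]), 5.0),
    (np.array([[1., 0., 1.], [0., 1., 1.]]), np.array([1., -1.]),
     np.array([0., 1., 1.]), 6.0),
]

def mu(y):
    # mu_r : y -> ((A_l y + b_l)/(c_l^T y + d_l))_{l=1}^r
    return np.concatenate([(A @ y + b) / (c @ y + d) for A, b, c, d in cams])

y = np.array([0.5, -1.0, 2.0])
x = mu(y)

# the same via z^l = P_l [y; 1] and x_l = (z_1/z_3, z_2/z_3)
x_hom = []
for A, b, c, d in cams:
    P = np.vstack([np.hstack([A, b[:, None]]), np.append(c, d)])
    z = P @ np.append(y, 1.0)
    x_hom.extend(z[:2] / z[2])
print(np.allclose(x, np.array(x_hom)))
```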
9.1. The multiview manifold. The image formation process can be interpreted as a projective transformation from the projective space P³ to P² in terms of the 3 × 4 camera matrices P_ℓ = [A_ℓ, b_ℓ; c_ℓᵀ, d_ℓ] [32, 37, 50]. It yields homogeneous coordinates of x_ℓ = (z_1^ℓ/z_3^ℓ, z_2^ℓ/z_3^ℓ) ∈ R², where z^ℓ = P_ℓ [y; 1]. Note that if z_3^ℓ = 0 then the point y has no projection onto the image plane of the ℓ-th camera, which occurs precisely
when y lies on the principal plane of the camera [37, p. 160]. This is the plane parallel
to the image plane through the camera center c` ; see Figure 9.1. It is also known that
points on the baseline of two cameras, i.e., the line connecting the camera centers
visualised by the dashed line in Figure 9.1, all project to the same two points on the
two cameras, called the epipoles [37, Section 10.1]. Such points cannot be triangulated
from only two images. For simplicity, let B ⊂ R3 be the union of the principal planes
of the first two cameras and their baseline, so B is a 2-dimensional subvariety. The
next result shows that a subset of the consistent point correspondences outside of B
forms a smooth embedded submanifold of R2r .
Lemma 9.1. Let OMV = R3 \ B with B as above. The map µr : OMV → R2r is a
diffeomorphism onto its image IMV = µr (OMV ).
Proof. Clearly µr is a smooth map between manifolds (cT` y + d` = z3` 6= 0). It
only remains to show that it has a smooth inverse. Theorem 4.1 of [39] states that
µ2 ’s projectivization is a birational map, entailing that µ2 is a diffeomorphism onto
its image. Let π1:4 : R2r → R4 denote projection onto the first 4 coordinates. Then,
μ₂⁻¹ ∘ π_{1:4} is a smooth map such that (μ₂⁻¹ ∘ π_{1:4}) ∘ μ_r = μ₂⁻¹ ∘ μ₂ = 1_{O_MV}, having
used that the domain OMV is the same for all r. For the right inverse, we see that
any element of IMV can be written as µr (y) for some y ∈ OMV . Hence,
(μ_r ∘ (μ₂⁻¹ ∘ π_{1:4}))(μ_r(y)) = (μ_r ∘ 1_{O_MV})(y) = μ_r(y),
The components of the second fundamental form at x = µr (y) ∈ IMV are then
II_x(E_i, E_j) = P_{N_x I_MV}( (d/dt)|_{t=0} E_j|_{γ_i(t)} )
 = P_{N_x I_MV}( [ (d/dt)|_{t=0} ( (A_ℓ e_j)/(c_ℓᵀ(y+te_i)+d_ℓ) − c_{ℓ,j} (A_ℓ(y+te_i)+b_ℓ)/(c_ℓᵀ(y+te_i)+d_ℓ)² ) ]_{ℓ=1}^{r} )
 = P_{N_x I_MV}( [ −(c_{ℓ,i}/α_ℓ(y)²) A_ℓ e_j − (c_{ℓ,j}/α_ℓ(y)²) A_ℓ e_i + 2 (c_{ℓ,i} c_{ℓ,j}/α_ℓ(y)³)(A_ℓ y + b_ℓ) ]_{ℓ=1}^{r} ),

where α_ℓ(y) := c_ℓᵀ y + d_ℓ.
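The per-camera derivative above can be checked against finite differences; the camera data (A, b, c, d) and the point y below are invented for illustration:

```python
import numpy as np

# Finite-difference check of the per-camera derivative of the frame field
# along t -> y + t e_i, for a single invented camera (A, b, c, d) and point y;
# alpha(y) = c^T y + d.

A = np.array([[1., 0., 2.], [0., 1., -1.]])
b = np.array([0.5, -0.3])
c = np.array([0.2, -0.4, 0.1])
d = 1.5
y = np.array([0.3, -0.2, 0.7])
i, j = 0, 1
e = np.eye(3)

def component(t):
    # l-th block of E_j evaluated along the curve t -> mu_r(y + t e_i)
    yt = y + t * e[i]
    alpha = c @ yt + d
    return A @ e[j] / alpha - c[j] * (A @ yt + b) / alpha**2

h = 1e-6
fd = (component(h) - component(-h)) / (2*h)
alpha = c @ y + d
closed = (-c[i]/alpha**2) * (A @ e[j]) - (c[j]/alpha**2) * (A @ e[i]) \
         + 2 * (c[i]*c[j]/alpha**3) * (A @ y + b)
print(np.allclose(fd, closed, atol=1e-4))
```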
‖(D_{(x,y)} π_{O_MV})(D_{(x,y)} π_{I_MV})⁻¹ H_η⁻¹‖_{I→O} = ‖R⁻¹(I − S_η)⁻¹‖₂ = 1/σ₃((I − S_η)R),
where I is the 3 × 3 identity matrix, and σ3 is the third largest singular value.
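The last equality is the linear-algebra identity ‖R⁻¹(I − S_η)⁻¹‖₂ = ‖((I − S_η)R)⁻¹‖₂ = 1/σ₃((I − S_η)R), which can be sanity-checked numerically; the matrices below are invented for illustration:

```python
import numpy as np

# Sanity check (invented matrices) of the identity used above:
# ||R^{-1}(I - S)^{-1}||_2 = ||((I - S)R)^{-1}||_2 = 1/sigma_3((I - S)R),
# for a symmetric 3x3 "Weingarten matrix" S and an invertible triangular R.

S = np.array([[0.2,  0.1,  0.0],
              [0.1, -0.3,  0.05],
              [0.0,  0.05, 0.4]])
R = np.array([[2.0, 0.5, -0.3],
              [0.0, 1.5,  0.7],
              [0.0, 0.0,  1.2]])
I3 = np.eye(3)

lhs = np.linalg.svd(np.linalg.inv(R) @ np.linalg.inv(I3 - S), compute_uv=False)[0]
rhs = 1.0 / np.linalg.svd((I3 - S) @ R, compute_uv=False)[-1]
print(lhs, rhs)   # equal up to rounding
```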
The computational complexity of this algorithm grows linearly with the number
of cameras r. Indeed, the Weingarten map Ŝ_η can be constructed in O(r) operations, the smooth frame E is orthogonalized in O(r) operations, the change of basis from Ŝ_η to S_η requires a constant number of operations, and computing the singular values of a 3 × 3 matrix adds another constant number of operations.
9.4. Numerical experiments. The computations below were implemented in
Matlab R2017b [49]. The code we used is provided as supplementary files accompany-
ing the arXiv version of this article. It uses functionality from Matlab’s optimization
toolbox. The experiments were performed on a computer running Ubuntu 18.04.3 LTS with an Intel Core i7-5600U CPU and 8GB main memory.
We present a few numerical examples illustrating the behavior of the condition
number of triangulation. The basic setup is described next. We take all 10 cam-
era matrices Pi ∈ R3×4 from the “model house” data set of the Visual Geometry
Group of the University of Oxford [66]. These cameras are all pointing roughly in the
same direction. In our experiments we reconstruct the point y := p + 1.510 v ∈ R3 ,
where p = (−1.85213, −0.532959, −5.65752) ∈ R3 is one point of the house and
v = (−0.29292, −0.08800, −0.95208) is the unit-norm vector that points from the
center of the first camera to p. Given a data point a ∈ R2r , it is triangulated as fol-
lows. First a linear triangulation method is applied, finding the right singular vector
corresponding to the least singular value of a matrix whose construction is described
(for two points) in [37, Section 12.2]. We then use it as a starting point for solving
optimization problem (9.1) with Matlab’s nonlinear least squares solver lsqnonlin.
For this solver, the following settings were used: TolFun and TolX set to 10−28 ,
StepTolerance equal to 10−14 , and both MaxFunEvals and MaxIter set to 104 . We
provided the Jacobian to the algorithm, which it can evaluate numerically.
Since the multiview manifold IMV is only a three-dimensional submanifold of R2r
and the computational complexity is linear in r, we do not focus on the computational
efficiency of computing κ[S_GCPP]. In practical cases, with r ≤ 1000, the performance
is excellent. For example, the average time to compute the condition number for r =
1000 cameras was less than 0.1 seconds in our setup. This includes the time to compute
the frame E, the Weingarten map expressed with respect to E, the orthogonalization
of the frame, and the computation of the smallest singular value.
Experiment 1. In the first experiment, we provide numerical evidence for the
correctness of our theoretical results. We project y to the image planes of the r = 10
cameras: x = µr (y). A random normal direction η ∈ Nx IMV is sampled by taking a
random vector with i.i.d. standard normal entries, projecting it to the normal space,
and then scaling it to unit norm. We investigate the sensitivity of the points along
the ray a(t) := x − tη.
The theoretical condition number of the triangulation problem at a(t) can be
computed numerically for every t using the algorithm from subsection 9.3. We can
also estimate the condition number experimentally, for example by generating a large
number of small perturbations a(t) + E(t) with kE(t)k ≈ 0, solving (9.1) numerically
using lsqnonlin and then checking the distance to the true critical point y. However,
from the theory we know that the worst direction of infinitesimal perturbation is
given by the left singular vector of (I − S_{−tη})R corresponding to the smallest (i.e., third)
singular value, where R is as in subsection 9.3. Let u(t) ∈ R3 denote this vector,
which contains the coordinates of the perturbation relative to the orthonormal basis
Q = (Q1 , Q2 , Q3 ); see subsection 9.3. Then, it suffices to consider only a small
(relative) perturbation in this direction; we took E(t) := 10−6 ka(t)kQu(t). We solve
(9.1) with input a(t)+E(t) and output yest (t) using lsqnonlin with the exact solution
of the unperturbed problem y as starting point. The experimental estimate of the
condition number is then κ_est(t) = ‖y − y_est(t)‖ / ‖E(t)‖.
In Figure 9.2 we show the experimental results for the above setup, where we chose
t = 10^i ‖x‖ with 100 values of i uniformly spaced between −3 and 2. It is visually
evident that the experimental results support the theory. Excluding the three last
values of t, the arithmetic and geometric means of the condition number divided
by the experimental estimate are approximately 1.006664 and 1.006497 respectively.
This indicates a near perfect match with the theory.
There appears to be one anomaly in Figure 9.2, however: the three crosses to the
right of the singularity at t⋆ = 79.64416‖x‖. What happens is that past the singularity, y is no longer a local minimizer of the optimization problem min_{y∈O_MV} ½‖a(t) − μ₁₀(y)‖². This is because x(t) is no longer a local minimizer of min_{x∈I_MV} ½‖a(t) − x‖².
Since we are applying an optimization method, it moves away from the nearby critical point and instead converges to a different point. Indeed,
one can show that the Riemannian Hessians (see [2, Chapter 5.5]) of these problems
Figure 9.2. A comparison of theoretical and experimental data of the condition number of
the triangulation problem with all cameras for several points along the ray x − tη. Herein, η is a
randomly chosen unit-norm normal direction at x ∈ IMV . The numerically computed theoretical
condition numbers are indicated by circles, while the experimentally estimated condition numbers,
as described in the text, are plotted with crosses.
t
Figure 9.3. The GCPP condition number in function of the relative distance kxk from a
specified x ∈ IMV and a randomly chosen unit-length normal direction η. Each of the four lines
corresponds to a different number k of cameras taking the pictures, namely k = 2, 3, 5, 10.
are positive definite for small t, but become indefinite past the singularity t∗ .
Experiment 2. The next experiment illustrates how the condition number varies
with the distance t along a normal direction tη ∈ NIMV . We consider the setup from
the previous paragraph. The projection of the point y onto the image planes of the
first k cameras is xk := µk (y) = (P1 y, . . . , Pk y). A random unit-norm normal vector
ηk in Nxk IMV is chosen as described before. We investigate the sensitivity of the
perturbed inputs ak (t) := xk + tηk for t ∈ R. The condition number is computed for
these inputs for k = 2, 3, 5, 10 and |t| = 10^i ‖xk‖ for 104 values of i uniformly spaced
between −3 and 4. The results are shown in Figure 9.3. We have 3 peaks for a fixed k
because dim IMV = 3. The two-camera system has only two singularities in the range
|t| < 10^4 ‖x2‖. These figures illustrate the continuity of κGCPP from Corollary 6.3.
Up to about |t| ≤ 10^-2 ‖xk‖ we observe that curvature hardly affects the condition
number of the GCPP for the triangulation problem. Beyond this value, curvature
plays the dominant role: both the singularities and the tapering off for high relative
errors are caused by it. Moreover, the condition number decreases as the number of
cameras increases. This indicates that the use of minimal problems [45] in computer
vision could potentially introduce numerical instability.
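The qualitative behavior observed here, a flat condition number for small |t| followed by a blow-up at focal points along the normal ray, can be reproduced on a toy manifold. The sketch below uses the unit circle (an assumed stand-in for IMV, not the triangulation problem itself), where the curvature formula predicts κ(t) = 1/|1 − t| with a singularity at the focal point t = 1:

```python
import numpy as np

def project(a):
    # closest-point projection onto the unit circle
    return a / np.linalg.norm(a)

x = np.array([1.0, 0.0])    # point on the circle
eta = -x                    # inward unit normal at x (toward the focal point)
for t in [0.0, 0.5, 0.9, 0.99]:
    a = x + t * eta                      # input moved along the normal ray
    E = 1e-8 * np.array([0.0, 1.0])      # worst (tangential) perturbation
    k_est = np.linalg.norm(project(a + E) - project(a)) / np.linalg.norm(E)
    k_theory = 1.0 / abs(1.0 - t)        # 1/(1 - t*kappa) with curvature kappa = 1
    print(f"t={t:5.2f}  estimated={k_est:8.3f}  theory={k_theory:8.3f}")
```

The single peak reflects that the circle has one curvature; for IMV, with dim IMV = 3, three principal curvatures produce up to three peaks, as in Figure 9.3.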
On the one hand, if (a, x) ∈ WCPP then the map D(a,x) ΠRn is invertible. By
multiplying with its inverse we get PTx I = Hη (D(a,x) ΠI )(D(a,x) ΠRn )−1 . As Hη is an
endomorphism on Tx I, which is the range of the left-hand side, the equality requires
Hη to be an automorphism, i.e., it is invertible.
On the other hand, if (a, x) ∉ WCPP, there is some (ȧ, ẋ) ≠ 0 in the kernel
of D(a,x)ΠRn. Since ȧ = (D(a,x)ΠRn)(ȧ, ẋ) = 0, we must have ẋ ≠ 0. Consequently,
applying both sides of (A.1) to (ȧ, ẋ) gives 0 = Hη ẋ, which implies that Hη is singular.
This finishes the proof.
The next result follows almost directly from Proposition A.1 and appears several
times as an independent statement in the literature. The earliest reference we could
locate is Abatzoglou [1, Theorem 4.1], who presented it for C 2 embedded submanifolds
of Rn in a local coordinate chart.
Corollary A.3. Let I ⊂ Rn be a Riemannian embedded submanifold. Then,
there exists a tubular neighborhood T of I such that the projection map PI : T →
I, a ↦ argminx∈I ‖a − x‖² is a smooth submersion with derivative
Da PI = Hη⁻¹ PTx I,
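For a concrete instance of this formula, one can check it numerically on the unit sphere in R3, where (assuming the sign conventions used above) Hη acts on Tx S² as multiplication by ‖a‖, so that Da PI = (1/‖a‖)(I − xxᵀ). A minimal finite-difference sketch:

```python
import numpy as np

def P(a):
    # closest-point projection onto the unit sphere S^2
    return a / np.linalg.norm(a)

a = np.array([0.0, 0.0, 1.5])   # ambient input at distance ||a|| = 1.5
x = P(a)

# Central finite-difference Jacobian of the projection at a
h = 1e-6
J = np.column_stack([(P(a + h * e) - P(a - h * e)) / (2 * h)
                     for e in np.eye(3)])

# Predicted derivative: H_eta^{-1} P_{T_x I} = (1/||a||)(I - x x^T)
J_theory = (np.eye(3) - np.outer(x, x)) / np.linalg.norm(a)
assert np.allclose(J, J_theory, atol=1e-6)
```

The factor 1/‖a‖ < 1 for a outside the sphere is the curvature damping predicted by Hη⁻¹; for a inside the sphere the same formula amplifies perturbations, blowing up at the focal point a = 0.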
Figure A.1. A sketch of the geometric construction in the proof of Theorem 7.3.
A.4. Proofs for Section 7.
Proof of Proposition 7.1. Taking the derivative of the objective function at a
point (x, y) ∈ W we see that the critical points satisfy ⟨a − x, D(x,y)πI(ẋ, ẏ)⟩ = 0
for all (ẋ, ẏ) ∈ T(x,y)W. By the definition of W, the derivative D(x,y)πI is surjective,
which implies that (D(x,y)πI)(T(x,y)W) = Tx I. Therefore, the condition of being a
critical point is equivalent to a − x ∈ Nx I.
Proof of Theorem 7.3. By assumption (a⋆, x⋆, y⋆) ∈ WGCPP. In particular, this
implies (a⋆, x⋆) ∈ WCPP, so that D(a⋆,x⋆)ΠRn : T(a⋆,x⋆)WCPP → Rn is invertible.
By the inverse function theorem [47, Theorem 4.5], there is an open neighborhood
U(a⋆,x⋆) ⊂ WCPP of (a⋆, x⋆) such that ΠRn restricts to a diffeomorphism from
U(a⋆,x⋆) onto its image, which we denote by Aa⋆.
Let ΠRn⁻¹ denote the inverse of ΠRn on Aa⋆. Combining (A.1) and Lemma A.2, the
derivative of ΠI ◦ ΠRn⁻¹ is given by (D(a,x)ΠI)(D(a,x)ΠRn)⁻¹ = Hη⁻¹ PTx I and thus has
constant rank on Aa⋆ equal to dim I. This implies that
⟨a − πI(x, y), D(x,y)πI(ẋ, ẏ)⟩ = 0 for all (ẋ, ẏ) ∈ T(x,y)N(x⋆,y⋆).
where the first bound is the triangle inequality, and where Pδ(x⋆) ⊂ Ix⋆ is the
preimage of Bδ(x⋆) ∩ Tx⋆Ix⋆ under the projection P restricted to Ix⋆.
Now let x ∈ ∂Ix⋆ be a point in the boundary of Ix⋆ and, again, x′ = P(x). We
want to show that the distance ‖a − x‖ is strictly larger than (A.3). This would imply
that the local minimizer of da is indeed contained in the interior of N(x⋆,y⋆). We have
‖a − x‖ = ‖a − a⋆ + a⋆ − x⋆ + x⋆ − x′ + x′ − x‖
≥ ‖a⋆ − x⋆ + x⋆ − x′ + x′ − x‖ − ‖a − a⋆‖.
Note that x⋆ = P(a⋆). Hence we have the orthogonal decomposition
a⋆ − x⋆ + x⋆ − x′ + x′ − x = (1 − P)(a⋆ − x) + P(a⋆ − x) = a⋆ − x.
Plugging this into the above, we find
‖a − x‖ ≥ ‖(1 − P)(a⋆ − x) + P(a⋆ − x)‖ − δ
= √(‖(1 − P)(a⋆ − x)‖² + ‖P(a⋆ − x)‖²) − δ
≥ √(‖(1 − P)(a⋆ − x)‖² + ε²) − δ,
where ε := min_{z∈P(∂Ix⋆)} ‖x⋆ − z‖ > 0.
Observe that a⋆ − x⋆ points from x⋆ to a⋆ and likewise x′ − x points from x to
x′. Therefore, ⟨a⋆ − x⋆, x′ − x⟩ ≥ 0 because both point away from the manifold Ix⋆,
which lies entirely on the side of x by assumption (ii) above; see also Figure A.1.
Consequently, ‖(1 − P)(a⋆ − x)‖ = ‖a⋆ − x⋆ + x′ − x‖ ≥ ‖a⋆ − x⋆‖, which implies
‖a − x‖ ≥ √(‖a⋆ − x⋆‖² + ε²) − δ = ‖a⋆ − x⋆‖ + ε²/(2‖a⋆ − x⋆‖) − δ + O(ε⁴).
Note that h is a continuous function with limx→x⋆ h(x) = 0. Hence, as δ → 0 the
maximum height tends to zero as well. Moreover, ε and ‖a⋆ − x⋆‖ are independent of
δ. All this implies that we can choose δ > 0 sufficiently small so that
ε²/(2‖a⋆ − x⋆‖) − δ + O(ε⁴) ≥ 2δ + max_{z∈Pδ(x⋆)} h(z).
It follows that ‖a − x‖ is bounded below by (A.3), so that the minimizer for da must
be contained in the interior of N(x⋆,y⋆).
The foregoing shows that on Bδ(a⋆) ∩ Aa⋆ the map ΠI ◦ ΠRn⁻¹ equals the nonlinear
projection PIx⋆ onto the manifold Ix⋆. Putting everything together, we obtain
ρ(a⋆,x⋆,y⋆)(a) = (ΠO ◦ ΠRn⁻¹)(a) = (πO ◦ πI⁻¹ ◦ ΠI ◦ ΠRn⁻¹)(a) = (πO ◦ πI⁻¹ ◦ PIx⋆)(a),
where the second equality follows from (A.2). The condition number of this smooth
map is given by (1.2). Comparing with Theorems 5.2 and 6.2 concludes the proof.
REFERENCES
applications, With contributions from Théo Papadopoulo.
[33] F. Feppon and P. F. J. Lermusiaux, The extrinsic geometry of dynamical systems tracking
nonlinear matrix projections, SIAM J. Matrix Anal. Appl., 40 (2019), pp. 814–844.
[34] R. M. Freund and F. Ordóñez, On an extension of condition number theory to nonconic
convex optimization, Math. Oper. Res., 30 (2005), pp. 173–194.
[35] R. M. Freund and J. R. Vera, Condition-based complexity of convex optimization in conic
linear form via the ellipsoid algorithm, SIAM J. Optim., 10 (1999), pp. 155–176.
[36] G. H. Golub and C. Van Loan, Matrix Computations, The Johns Hopkins University Press,
4 ed., 2013.
[37] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge
University Press, Cambridge, second ed., 2003. With a foreword by Olivier Faugeras.
[38] G. Heidel and V. Schulz, A Riemannian trust-region method for low-rank tensor completion,
Numer. Linear Algebra Appl., 25 (2018).
[39] A. Heyden and K. Åström, Algebraic properties of multilinear constraints, Math. Meth. Appl.
Sci., 20 (1997), pp. 1135–1162.
[40] N. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, PA, 1996.
[41] M. W. Hirsch, Differential Topology, no. 33 in Graduate Text in Mathematics, Springer-Verlag,
1976.
[42] S. Holtz, T. Rohwedder, and R. Schneider, On manifolds of tensors of fixed TT-rank,
Numer. Math., 120 (2012), pp. 701–731.
[43] J. Humpherys, T. J. Jarvis, and E. J. Evans, Foundations of Applied Mathematics, SIAM,
2017.
[44] D. Kressner, M. Steinlechner, and B. Vandereycken, Low-rank tensor completion by
Riemannian optimization, BIT Numer. Math., 54 (2014), pp. 447–468.
[45] Z. Kúkelová, Algebraic Methods in Computer Vision, PhD thesis, Czech Technical University
in Prague, 2013.
[46] J. M. Lee, Riemannian Manifolds: Introduction to Curvature, Springer-Verlag, 1997.
[47] J. M. Lee, Introduction to Smooth Manifolds, Springer, New York, USA, 2 ed., 2013.
[48] Y. Liu and W. Wang, A revisit to least squares orthogonal distance fitting of parametric curves
and surfaces, in Advances in Geometric Modeling and Processing, F. Chen and B. Juttler,
eds., Lecture Notes in Computer Science, 2008.
[49] MATLAB, R2017b. Natick, Massachusetts, 2017.
[50] S. Maybank, Theory of Reconstruction from Image Motion, vol. 28 of Springer Series in In-
formation Sciences, Springer-Verlag, Berlin, 1993.
[51] B. O’Neill, Semi-Riemannian Geometry, Academic Press, 1983.
[52] , Elementary Differential Geometry, Elsevier, revised second ed., 2001.
[53] J. Peña and V. Roshchina, A data-independent distance to infeasibility for linear conic sys-
tems, SIAM J. Optim., 30 (2020), pp. 1049–1066.
[54] P. Petersen, Riemannian Geometry, vol. 171 of Graduate Texts in Mathematics, Springer,
New York, second ed., 2006.
[55] I. R. Porteous, The normal singularities of a submanifold, J. Differ. Geom., 5 (1971).
[56] J. Renegar, Incorporating condition measures into the complexity theory of linear program-
ming, SIAM J. Optim., 5 (1995), pp. 506–524.
[57] , Linear programming, complexity theory and elementary functional analysis, Math. Pro-
gram., 70 (1995), pp. 279–351.
[58] J. R. Rice, A theory of condition, SIAM J. Numer. Anal., 3 (1966), pp. 287–310.
[59] M. Spivak, A Comprehensive Introduction to Differential Geometry, vol. 2, Publish or Perish,
Inc., 1999.
[60] M. Steinlechner, Riemannian optimization for high-dimensional tensor completion, SIAM
J. Sci. Comput., 38 (2016), pp. S461–S484.
[61] S.-J. Ahn, Geometric fitting of parametric curves and surfaces, J. Inf. Process. Syst., 4
(2008), pp. 153–158.
[62] R. Thom, Sur la theorie des enveloppes, J. Math. Pures Appl., 41 (1962), pp. 177–192.
[63] B. Vandereycken, Low-rank matrix completion by Riemannian optimization, SIAM J. Optim.,
23 (2013), pp. 1214–1236.
[64] J. R. Vera, Ill-posedness and the complexity of deciding existence of solutions to linear pro-
grams, SIAM J. Optim., 6 (1996), pp. 549–569.
[65] , Geometric measures of convex sets and bounds on problem sensitivity and robustness
for conic linear optimization, Math. Program., 147 (2014), pp. 47–79.
[66] Visual Geometry Group, University of Oxford, Model house data set, Last accessed 28
September 2020. Available online at https://www.robots.ox.ac.uk/~vgg/data/data-mview.
html.
[67] H. Weyl, On the volume of tubes, Amer. J. Math., 61 (1939), pp. 461–472.
[68] J. H. Wilkinson, Rounding Errors in Algebraic Processes, Prentice-Hall, Englewood Cliffs,
New Jersey, 1963.
[69] , The Algebraic Eigenvalue Problem, Oxford University Press, Amen House, London,
United Kingdom, 1965.
[70] H. Woźniakowski, Numerical stability for solving nonlinear equations, Numer. Math., 27
(1976/77), pp. 373–390.