THE MULTI–FACILITY LOCATION PROBLEM: A DECOMPOSITION METHOD

CEM IYIGUN AND ADI BEN-ISRAEL
Abstract. A generalized Weiszfeld method is proposed for the multi–facility location problem. The problem is relaxed using probabilistic assignments, and is decomposed into single facility location problems that are coupled by these assignments and can be solved in parallel. The probabilistic assignments are updated at each iteration, using the distances to the current centers. The method thus iterates between assignment and center updates, with the probabilistic assignments gradually approaching the pure assignments to the computed centers. The method also provides intrinsic criteria for the quality of the solution, and for the optimal number of facilities to serve the given customers. A duality theorem allows verifying the optimality of the cluster centers by solving a dual problem. Numerical experience with several problems from the literature is presented.
1. Introduction
The Fermat–Weber (or single facility) location problem is to locate a facility that optimally serves a set of customers, given by their locations and weights, in the sense of minimizing the weighted sum of distances traveled by the customers, see [14] and [16]. A well-known method for solving the problem is the Weiszfeld method [44], a gradient method that expresses and updates the sought center as a convex combination of the data points.
The multi–facility location problem (MFLP) is to locate a (given) number of facilities to serve the customers as above. Each customer is assigned to a single facility, and the problem (also called location–allocation) is to determine the optimal locations of the facilities, as well as the optimal assignments of customers (assignment is not an issue in the single facility case).
In some MFLP cases the facility locations are constrained to lie in a given subset of the plane, in
particular a given subset of the customers locations. This constrained, or discrete, MFLP model was
first considered by Hakimi [20], and is often solved by mixed integer programming, see [39].
We study here the continuous, or unconstrained, MFLP model, where the facilities can be anywhere
in the plane. We also assume that the facilities can handle all the customers assigned to them. The
capacitated problem, where some facilities may have limited capacities, see, e.g., [45], will be treated
elsewhere using the results of [23].
The continuous MFLP is a clustering problem, where customers and facility locations correspond
to data points and centers, and the clustering criterion is the sum of distances from each data point to its
cluster center, to be minimized. We use interchangeably center = facility location, and cluster = the
set of customers assigned to a facility.
This paper solves the continuous MFLP using the probabilistic clustering approach of [4], and is a
continuation of [26]. We approximate the MFLP (which is NP-hard, [37]) by replacing rigid assignments
with probabilistic assignments, as in [5] and [12]. The resulting iteration is a natural generalization of the
Weiszfeld method [44] to several facilities.
The plan of the paper is as follows. Section 2 describes the multi–facility location problem. The special
case of a single facility is discussed in Section 3.
Section 4 introduces probabilistic assignments, with cluster membership probabilities that depend on
the distances involved. Section 5 introduces the probabilistic location problem (P.K), an approximation
of MFLP. The updates of the centers of (P.K) are explained in Section 6.
A generalized Weiszfeld algorithm for solving (P.K) is presented in Section 9. The cluster membership
probabilities are used in Section 10 to chart the territories of the facilities, a Voronoi diagram of the centers.
The gradient of the objective is modified in Section 11 to guarantee its existence everywhere. The modified
gradient is used to characterize optimality and to prove convergence, Theorem 1.
Section 12 gives a duality theory for (P.K), based on the geometric duality of Kuhn ([29], [30]). A
duality theorem, Theorem 2, allows verifying the optimality of any feasible solution (centers and their
assignments) by solving a dual problem.
Section 13 recalls two useful concepts from probabilistic clustering, the joint distance function (JDF)
and the classification uncertainty function (CUF), and applies them to the problem (P.K).
Section 14 presents numerical results from our method applied to several problems in the literature.
A proof of Theorem 2 is given in Appendix A. Appendix B shows the connection between the CUF
(introduced in Section 13) and the Kullback-Leibler entropic distance. The probabilistic model used here
is related to Luce’s choice axiom [36] in Appendix C.
Notation: We denote the index set $\{1, 2, \dots, K\}$ by $1{:}K$. For a vector $x = (x_j) \in \mathbb{R}^n$, $\|x\|$ denotes the Euclidean norm, $\|x\| := \left( \sum_{j=1}^{n} |x_j|^2 \right)^{1/2}$. The distance between the vectors $x$ and $y$ is denoted $d(x, y)$. The Euclidean distance
$$d(x, y) = \|x - y\| \eqno(1)$$
is used throughout.
2. The multi–facility location problem

Consider the data
$$\{X, W, K\}, \quad \text{where } X = \{x_i : i \in 1{:}N\} \subset \mathbb{R}^n,\; W = \{w_i > 0 : i \in 1{:}N\},\; \text{and } K \text{ is the number of facilities}. \eqno(2)$$
Typically, the points $\{x_i\}$ are the locations of customers¹, and the weights $\{w_i\}$ are their demands.
¹Most location problems are planar, i.e., $n = 2$, but the results below hold for general $n$, and are stated for $\mathbb{R}^n$.
The multi–facility location problem (MFLP), defined next, uses the data (2), and is to locate $K$ facilities, and assign customers to these facilities, so as to minimize the sum of weighted distances
$$\min_{c_1, c_2, \dots, c_K}\; \sum_{k=1}^{K} \sum_{x_i \in X_k} w_i\, d(x_i, c_k), \eqno{(L.K)}$$
where $\{c_k : k \in 1{:}K\}$ are the centers (facility locations), and $X_k$ is the cluster of customers assigned to the $k$-th facility. For $K = 1$ this reduces to the Fermat–Weber location problem, see § 3, where assignment is absent. The other extreme case, $K = N$ (every point is a center), is of no interest.
The MFLP (L.K) can be broken into two problems:
(a) the assignment problem: given K centers {ck }, assign each of the N points {xi } to the nearest
center, and
(b) the location problem: given the K clusters Xk formed in (a), find the center of each cluster (using,
say, the Weiszfeld method.)
A natural heuristic ([11], [33]) for solving (L.K) is to iterate between these two problems: the clusters
are updated (assignment), requiring updating the centers (location), then the points are re-assigned, etc.
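To make the heuristic concrete, here is a minimal sketch in Python (assuming Euclidean distances and the data conventions above; it is not the Algorithm 1 developed later in the paper). The initialization, iteration counts, and test data are illustrative assumptions.

```python
import numpy as np

def weiszfeld_step(X, w, c, eps=1e-12):
    # One Weiszfeld update: a convex combination of the points with
    # coefficients proportional to w_i / ||x_i - c||.
    d = np.maximum(np.linalg.norm(X - c, axis=1), eps)
    lam = w / d
    return (lam[:, None] * X).sum(axis=0) / lam.sum()

def alternating_heuristic(X, w, K, outer=50, inner=20, seed=0):
    # Iterate between (a) assignment to the nearest center and
    # (b) recentering each cluster by Weiszfeld steps.
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), K, replace=False)].astype(float)
    for _ in range(outer):
        D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = D.argmin(axis=1)                  # (a) assignment
        for k in range(K):                         # (b) location
            if (labels == k).any():
                for _ in range(inner):
                    C[k] = weiszfeld_step(X[labels == k], w[labels == k], C[k])
    d_near = np.linalg.norm(X - C[labels], axis=1)
    return C, labels, (w * d_near).sum()

# The four corners of the unit square (Example 1 below), unit weights:
X = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
C, labels, obj = alternating_heuristic(X, np.ones(4), K=2)
print(labels, round(obj, 4))   # may stop at a stable, non-optimal solution
```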
A solution {(Xk , ck ) : k ∈ 1:K} is stable if it cannot be improved by re–assigning any point xi to
another cluster. The heuristic method described above may terminate in a non–optimal stable solution,
declaring it to be optimal, as illustrated in Example 1.
Example 1. Given N = 4 points, x1 = (0, 0), x2 = (1, 0), x3 = (1, 1), x4 = (0, 1), and all weights wi = 1,
consider the problem (L.2), and three of its solutions:
Solution 1: Each cluster is a pair of adjacent points, see e.g. Fig. 1(a). The centers may lie anywhere
on the segments joining the points, and the value of the objective function is 2.
Solution 2: Unequal clusters with centers in opposite points, for example, c1 = (0, 0), c2 = (1, 1), see,
e.g., Fig. 1(b). The value of the objective is 2.
Solution 3: Unequal clusters, see, e.g., Fig. 1(b), with centers $c_1 = \left( \frac{1}{3+\sqrt{3}}, \frac{1}{3+\sqrt{3}} \right)$, $c_2 = (1, 1)$, and the value of the objective is $\tfrac{1}{2}(\sqrt{2}+\sqrt{6}) \approx 1.9318$.
Solution 3 is optimal, but Solutions 1 and 2 are stable: for example, in Solution 1, the point x2 is not
closer to the center of X1 than to the center of X2 , and should not be re–assigned (to get a better solution)
on the basis of distance information alone.
A fourth solution is where each cluster is a pair of opposite points, the centers are anywhere on the diagonals, and the objective function value is $2\sqrt{2}$. This solution is unstable, except in the (uninteresting) case where the two centers coincide at $(\tfrac{1}{2}, \tfrac{1}{2})$.
If a simple example like this has stable solutions that are not optimal, one would expect many such
solutions in bigger problems, say hundreds of customers and several facilities.
Figure 1. The points $x_1, x_2, x_3, x_4$ of Example 1: (a) a solution with clusters of adjacent points (Solution 1); (b) the unequal clusters of Solutions 2 and 3.
3. The single facility location problem

Given $\{X, W, 1\}$ as above, the Fermat–Weber location problem is to find a point $c \in \mathbb{R}^n$ minimizing the sum of weighted distances,
$$\min_{c \in \mathbb{R}^n}\; \sum_{i=1}^{N} w_i\, d(x_i, c), \eqno{(L.1)}$$
see [17], [35], [45] and their references.
If the points $\{x_i\}$ are not collinear, as is assumed throughout, the objective function
$$f(c) = \sum_{i=1}^{N} w_i\, d(x_i, c) \eqno(3)$$
is strictly convex, and the optimal center $c^*$, if not in $X$, is characterized by $\nabla f(c^*) = 0$, expressing it as a convex combination of the points $x_i$,
$$c^* = \sum_{i=1}^{N} \lambda_i^*\, x_i, \quad \text{with coefficients } \lambda_i^* = \frac{w_i / \|x_i - c^*\|}{\sum_{m=1}^{N} w_m / \|x_m - c^*\|} \text{ that depend on } c^*. \eqno(4)$$
The Weiszfeld method computes the optimal center as the limit of the iterations
$$c^{+} := T(c), \eqno(5)$$
where, for $c \notin X$,
$$T(c) := \sum_{i=1}^{N} \lambda_i(c)\, x_i, \quad \lambda_i(c) = \frac{w_i / \|x_i - c\|}{\sum_{m=1}^{N} w_m / \|x_m - c\|}. \eqno(6)$$
In order to extend $\nabla f(c)$ to all $c$, Kuhn [30] modified its definition as follows: $\nabla f(c) := -R(c)$, where
$$R(c) := \begin{cases} -\nabla f(c), & \text{if } c \notin X; \\[4pt] \max\{0,\, \|R^j\| - w_j\}\, \dfrac{R^j}{\|R^j\|}, & \text{if } c = x_j \in X, \end{cases} \eqno(7)$$
and
$$R^j := \sum_{i \neq j} \frac{w_i}{\|x_i - x_j\|}\,(x_i - x_j) \eqno(8)$$
is the resultant force of $N - 1$ forces of magnitude $w_i$ and direction $x_i - x_j$, $i \neq j$.
3.1. The results of Kuhn, [30]. The following properties of the mappings $R(\cdot)$, $T(\cdot)$, and the optimal center $c^*$ were proved by Kuhn [30]. By “$= c^*$” we mean “is optimal”.
(a) $c = c^* \iff R(c) = 0$.
(b) $c^* \in \operatorname{conv} X$ (the convex hull of $X$).
(c) If $c = c^*$ then $T(c) = c$. Conversely, if $c \notin X$ and $T(c) = c$, then $c = c^*$.
(d) If $T(c) \neq c$ then $f(T(c)) < f(c)$.
(e) $x_j = c^* \iff w_j \geq \|R^j\|$.
(f) If $x_j \neq c^*$, the direction of steepest descent of $f$ at $x_j$ is $R^j / \|R^j\|$.
(g) If $x_j \neq c^*$, there exists $\delta > 0$ such that
$$0 < \|c - x_j\| \leq \delta \implies \|T^s(c) - x_j\| > \delta \text{ for some } s.$$
(h) $$\lim_{c \to x_j} \frac{\|T(c) - x_j\|}{\|c - x_j\|} = \frac{\|R^j\|}{w_j}.$$
(i) For any $c^0$, if no $c^r := T^r(c^0) \in X$, then $\lim_{r \to \infty} c^r = c^*$.
These results are generalized in Theorem 1 to the case of several facilities.
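For instance, property (e) gives a simple numerical optimality test at a data point: compute $R^j$ from (8) and compare its norm with $w_j$. A sketch (with made-up data):

```python
import numpy as np

def resultant(X, w, j):
    # R^j of (8): resultant of the N-1 unit-direction forces of
    # magnitude w_i pulling x_j towards the other points.
    diffs = np.delete(X, j, axis=0) - X[j]
    dists = np.linalg.norm(diffs, axis=1)
    return (np.delete(w, j)[:, None] * diffs / dists[:, None]).sum(axis=0)

# A heavy customer among light ones (illustrative data):
X = np.array([[0., 0.], [1., 0.], [0., 1.], [-1., -1.]])
w = np.array([5., 1., 1., 1.])
Rj = resultant(X, w, j=0)
# By (e): x_0 is the optimal center iff w_0 >= ||R^0||.
print(np.linalg.norm(Rj), "<=", w[0], ":", np.linalg.norm(Rj) <= w[0])
```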
Kuhn [30] also claimed that, for all but denumerably many initial points, no iterate of (5) lands in $X$; this claim was refuted by Chandrasekaran and Tamir [9]. Convergence can be assured by modifying the algorithm (5)–(6) at a sticky, non–optimal, center that coincides with a data point $x_j$. Balas and Yu [3] suggested moving from $x_j$ in the direction $R^j$ (8) of steepest descent, assuring a decrease of the objective, and non–return to $x_j$ by (g) above. Vardi and Zhang [43] guaranteed leaving $x_j$ by adding to the objective function a quadratic term in the distances to the other data points.
a quadratic term in the distances to the other data points. Convergence was also addressed by Brimberg
[7], Drezner [15], Eckhardt [18], Katz [28] and others.
For $1 < K < N$, the problem (L.K) is NP-hard, [37]. It can be solved in time polynomial in $N$ for $K = 2$, see [13], and possibly for other given $K$.
4. Cluster membership probabilities

We relax the assignment problem in (L.K) by using probabilistic assignments, see § 5. First we require the concept of cluster membership probabilities, $p_k(x)$ = the probability that the point $x$ belongs to the cluster $X_k$, $k \in 1{:}K$, assumed to depend only on the distances $\{d(x, c_k) : k \in 1{:}K\}$ of the point $x$ from the $K$ centers. A reasonable assumption is³
$$w\, p_k(x)\, d(x, c_k) = D(x), \quad k \in 1{:}K, \eqno(9)$$
where $w$ is the weight of $x$, and $D(\cdot)$ is a function of $x$ that does not depend on $k$. $D(x)$ is called the joint distance function (JDF) at $x$.
Equations (9) are optimality conditions for the extremum problem
$$\min\left\{ w \sum_{k=1}^{K} p_k(x)^2\, d(x, c_k) \;:\; \sum_{k=1}^{K} p_k(x) = 1,\; p_k(x) > 0,\; k \in 1{:}K \right\} \eqno{(E)}$$
in the probabilities {p1 (x), · · · , pK (x)}. The squares of probabilities in the objective of (E) serve to smooth
the underlying non–smooth problem (L.K), see the seminal paper by Teboulle [41].
Since probabilities add to one, we get from (9)
$$p_k(x) = \frac{\prod_{j \neq k} d(x, c_j)}{\sum_{\ell=1}^{K} \prod_{m \neq \ell} d(x, c_m)}, \quad k \in 1{:}K, \eqno(10)$$
and the JDF
$$D(x) = \frac{w \prod_{j=1}^{K} d(x, c_j)}{\sum_{\ell=1}^{K} \prod_{m \neq \ell} d(x, c_m)} = \frac{w}{\sum_{j=1}^{K} 1/d(x, c_j)}\,. \eqno(11)$$

³There are other ways to model Assumption (A), e.g. [4], but the simple model (9) works well enough for our purposes.
Given the probabilistic assignments for each $x_i \in X$, we denote the set of $KN$ numbers $\{q_k(x_i)\}$ by
$$Q(X) := \{q_k(x_i) : k \in 1{:}K,\; i \in 1{:}N\}. \eqno(14)$$
Probabilistic assignments are assumed to have the following property: if a data point $x_j$ coincides with a center $c_k$, i.e. if $d(x_j, c_k) = 0$, then $q(x_j)$ is a pure assignment,
$$q_k(x_j) = 1, \quad q_\ell(x_j) = 0 \text{ for } \ell \neq k. \eqno(15)$$
For the cluster membership probabilities $\{p_k(x)\}$ this is assured by (10).
5. The probabilistic location problem

We use a probabilistic approximation of the problem (L.K) in the form
$$\min_{\substack{\{c_1, \dots, c_K\} \\ \{\{q_1(x_i), \dots, q_K(x_i)\} \,:\, i \in 1:N\}}}\; \sum_{k=1}^{K} \sum_{i=1}^{N} w_i\, q_k(x_i)\, d(x_i, c_k), \eqno{(P.K)}$$
with two sets of variables, the centers $\{c_k\}$ and the probabilistic assignments $\{q_k(x_i)\}$, that are updated iteratively. The problem (P.K) uses the same data, (2), as problem (L.K).
For any centers and probabilistic assignments, the objective function of (P.K) is an upper bound on the optimal value of (L.K),
$$\sum_{k=1}^{K} \sum_{i=1}^{N} w_i\, q_k(x_i)\, d(x_i, c_k) \geq \min\,(L.K), \eqno(16)$$
since for each $x_i$ the assignments $\{q_k(x_i)\}$ sum to one, so the inner sum is a convex combination of distances, each at least the distance to the nearest center. Therefore the optimal value of (P.K) is also an upper bound,
$$\min\,(P.K) \geq \min\,(L.K). \eqno(17)$$
The probabilistic assignments $\{q_k(x)\}$ used below are calculated by raising the cluster membership probabilities $\{p_k(x)\}$ to a power $\nu \geq 1$ and normalizing,
$$q_k(x_i) = \frac{p_k^{\nu}(x_i)}{\sum_{j=1}^{K} p_j^{\nu}(x_i)}, \quad i \in 1{:}N,\; k \in 1{:}K, \eqno{(18a)}$$
or, using (10),
$$q_k(x_i) = \frac{\prod_{j \neq k} d(x_i, c_j)^{\nu}}{\sum_{\ell=1}^{K} \prod_{m \neq \ell} d(x_i, c_m)^{\nu}}\,. \eqno{(18b)}$$
To record the dependence on $\{p_k(x)\}$ and the exponent $\nu$, we denote the probabilistic assignments (18a) by $\{p_k^{(\nu)}(x_i)\}$. In previous work ([4] and [23]–[26]) we used the exponent $\nu = 2$ without renormalization.
For a discussion of the exponent ν see § 8 below.
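In code, (10) and (18a)–(18b) reduce to normalizing reciprocal distances raised to the power $\nu$, since $p_k(x) \propto 1/d(x, c_k)$. A small sketch with made-up centers, illustrating how raising $\nu$ sharpens the assignments:

```python
import numpy as np

def memberships(x, C, nu=1.0, eps=1e-12):
    # nu = 1 gives the membership probabilities (10); nu > 1 gives the
    # assignments (18a)-(18b), since p_k is proportional to 1/d(x, c_k).
    d = np.maximum(np.linalg.norm(C - x, axis=1), eps)
    p = (1.0 / d) ** nu
    return p / p.sum()

x = np.array([1.0, 0.5])
C = np.array([[0., 0.], [3., 0.], [0., 4.]])   # illustrative centers
for nu in (1, 2, 6, 20):
    q = memberships(x, C, nu)
    print(nu, q.round(4), (q * np.linalg.norm(C - x, axis=1)).sum())
# The weighted distance sum decreases as nu grows (cf. Lemma 1 below),
# and the assignment approaches the pure assignment to the nearest center.
```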
5.1. Bounds for (L.K) with given centers. Given a set of K centers {c1 , . . . , cK }, not necessarily
optimal, the minimal value of (L.K) is obtained by assigning each xi to its nearest center,
$$m(c_1, \dots, c_K) = \sum_{i=1}^{N} w_i \min\{d(x_i, c_k) : k \in 1{:}K\}. \eqno(19)$$
This value (not necessarily optimal since the centers are arbitrary) can be bounded as follows.
Corollary 1. Given a set of centers {c1 , . . . , cK }, the corresponding cluster membership probabilities
{pk (xi ) : k ∈ 1:K, i ∈ 1:N } of (10), and the values of the JDF (11), {D(xi ) : i ∈ 1:N }, at the points
{xi : i ∈ 1:N },
$$\sum_{i=1}^{N} D(x_i) \;\leq\; m(c_1, \dots, c_K) \;\leq\; \sum_{i=1}^{N} \frac{D(x_i)}{\sum_{j=1}^{K} p_j^2(x_i)} \eqno{(20a)}$$
$$\leq\; K \sum_{i=1}^{N} D(x_i). \eqno{(20b)}$$
To prove the right inequality, note that for any probabilistic assignments $\{q_k(x_i) : i \in 1{:}N, k \in 1{:}K\}$,
$$\sum_{k=1}^{K} \sum_{i=1}^{N} w_i\, q_k(x_i)\, d(x_i, c_k) \geq m(c_1, \dots, c_K). \eqno(21)$$
In particular, for $q_k(x) = p_k^{(2)}(x)$,
$$\begin{aligned}
\text{LHS}(21) &= \sum_{k=1}^{K} \sum_{i=1}^{N} w_i\, \frac{p_k^2(x_i)}{\sum_{j=1}^{K} p_j^2(x_i)}\, d(x_i, c_k) \\
&= \sum_{i=1}^{N} \frac{w_i}{\sum_{j=1}^{K} p_j^2(x_i)} \sum_{k=1}^{K} p_k(x_i)^2\, d(x_i, c_k) \\
&= \sum_{i=1}^{N} \frac{1}{\sum_{j=1}^{K} p_j^2(x_i)} \left( \sum_{k=1}^{K} p_k(x_i) \right) D(x_i), \quad \text{by (9)}, \\
&= \sum_{i=1}^{N} \frac{D(x_i)}{\sum_{j=1}^{K} p_j^2(x_i)} = \text{RHS}(20a).
\end{aligned}$$
Finally, (20b) follows from $\sum_{j=1}^{K} p_j^2(x_i) \geq \frac{1}{K}$, for all $i$.
Note that the bounds (20a) become tighter as the $\{p_j(x_i)\}$ get closer to pure assignments.
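These bounds are easy to evaluate; a sketch, using the harmonic-mean form of the JDF from (11), with made-up data and centers:

```python
import numpy as np

def check_bounds(X, w, C):
    # Verify sum D(x_i) <= m(c_1,...,c_K) <= sum D(x_i)/sum_j p_j(x_i)^2
    # <= K sum D(x_i), i.e. (19)-(20b).
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)   # N x K
    p = (1.0 / d) / (1.0 / d).sum(axis=1, keepdims=True)        # (10)
    D = w / (1.0 / d).sum(axis=1)                               # (11)
    m = (w * d.min(axis=1)).sum()                               # (19)
    lower, upper = D.sum(), (D / (p ** 2).sum(axis=1)).sum()
    print(lower, "<=", m, "<=", upper, "<=", len(C) * D.sum())

rng = np.random.default_rng(1)
X, w = rng.random((15, 2)) * 10, np.ones(15)
C = rng.random((3, 2)) * 10                    # arbitrary, non-optimal centers
check_bounds(X, w, C)
```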
6. Updates of centers
Given probabilistic assignments $\{q_k(x_i)\}$, the objective function of (P.K) is a separable function of the centers,
$$f(c_1, \dots, c_K) := \sum_{k=1}^{K} f_k(c_k), \eqno{(22a)}$$
where
$$f_k(c) := \sum_{i=1}^{N} w_i\, q_k(x_i)\, \|x_i - c\|, \quad k \in 1{:}K. \eqno{(22b)}$$
The centers problem thus separates into $K$ problems of type (L.1), coupled by the probabilistic assignments $\{q_k(x_i)\}$. If the data points are not collinear, as we assume throughout, the centers problem has a unique optimal solution $\{c_k^* : k \in 1{:}K\}$.
The gradient of (22b), wherever it exists, is
$$\nabla f_k(c) = -\sum_{i=1}^{N} \frac{w_i\, q_k(x_i)}{\|x_i - c\|}\,(x_i - c), \quad k \in 1{:}K. \eqno(23)$$
Zeroing the gradients (23) we get, in analogy with the Weiszfeld method, the optimal centers $\{c_1^*, \dots, c_K^*\}$ as convex combinations of the data points,
$$c_k^* = \sum_{i=1}^{N} \lambda_k^*(x_i)\, x_i, \quad \text{with } \lambda_k^*(x_i) = \frac{w_i\, q_k(x_i)/\|x_i - c_k^*\|}{\sum_{m=1}^{N} w_m\, q_k(x_m)/\|x_m - c_k^*\|}\,. \eqno{(24a,b)}$$
This suggests the iteration maps
$$T_k(c) := \sum_{i=1}^{N} \lambda_{ki}(c)\, x_i, \quad \lambda_{ki}(c) = \frac{w_i\, q_k(x_i)/\|x_i - c\|}{\sum_{m=1}^{N} w_m\, q_k(x_m)/\|x_m - c\|}, \quad \text{for } c \notin X, \eqno{(25a)}$$
$$T_k(x_j) := x_j, \quad \text{for } x_j \in X, \eqno{(25b)}$$
and the center updates
$$c_k^{+} := T_k(c_k), \quad \text{for } k \in 1{:}K. \eqno{(26)}$$
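One pass of the updates (26), as a sketch (here q is an N × K array of assignments, e.g., computed from (18b); the demo values are illustrative):

```python
import numpy as np

def update_centers(X, w, q, C, eps=1e-12):
    # c_k+ = T_k(c_k) of (25a)-(26): a convex combination of the data
    # points with coefficients lambda_ki ~ w_i q_k(x_i) / ||x_i - c_k||.
    C_new = C.copy()
    for k in range(C.shape[0]):
        d = np.maximum(np.linalg.norm(X - C[k], axis=1), eps)
        lam = w * q[:, k] / d
        C_new[k] = (lam[:, None] * X).sum(axis=0) / lam.sum()
    return C_new

X = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
q = np.array([[.9, .1], [.6, .4], [.1, .9], [.4, .6]])   # illustrative
print(update_centers(X, np.ones(4), q, np.array([[.2, .3], [.8, .7]])))
```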
The gradient (23) is undefined (0/0) if c coincides with any of the data points. In order to define the
gradient everywhere, we modify its definition following Kuhn ([29]–[30]), and denote the modified gradient
by −Rk .
If a center $c_k \notin \{x_j : j \in 1{:}N\}$,
$$R_k(c_k) := \sum_{i=1}^{N} \frac{w_i\, q_k(x_i)}{\|x_i - c_k\|}\,(x_i - c_k); \eqno{(27a)}$$
if $c_k$ coincides with a data point $x_j$,
$$R_k(x_j) := \max\{\|R_{jk}\| - w_j\, q_k(x_j),\, 0\}\, \frac{R_{jk}}{\|R_{jk}\|}\,, \eqno{(27b)}$$
where
$$R_{jk} = \sum_{i \neq j} \frac{w_i\, q_k(x_i)}{\|x_i - x_j\|}\,(x_i - x_j), \eqno{(27c)}$$
and $q_k(x_j) = 1$ in (27b), by (15). Therefore, $R_k(x_j) = 0$ if $\|R_{jk}\| \leq w_j$; otherwise, $R_k(x_j)$ is a vector with magnitude $\|R_{jk}\| - w_j$ and direction $R_{jk}$. As in (8), the vector $R_{jk}$ is the resultant of $N - 1$ forces of magnitude $w_i\, q_k(x_i)$ and direction $x_i - x_j$, $i \neq j$.
As is typical of gradient methods, the iterations (26) make big steps at first, approaching their fixed points; then the iterations slow down, and the movement at each iteration is small. The “fast” iterations are few in number; the number of “slow” iterations is determined by the stopping rule.
Most progress towards solving the location problem occurs in the first few iterations. The slow iterations
at the end deal mainly with the assignment problem.
With a low value of $\nu$, such as $\nu = 2$, the distributions $\{p_k^{(\nu)}(x)\}$ may be far from pure assignments, and require rounding to the nearest integer, 0 or 1.
High values of the exponent $\nu$ in (18b), say $\nu > 10$, produce $\{p_k^{(\nu)}(x_i)\}$ that are close to pure assignments. For this reason, high values of $\nu$ are useful in the slow iterations at the end. In contrast, using high values of $\nu$ at the beginning may cause premature convergence to a solution that is not necessarily optimal.
This suggests increasing the exponent $\nu$ at each iteration. We use here the simple update
$$\nu^{+} := \nu + \Delta, \eqno(28)$$
where $\Delta > 0$ is the increment per iteration. If $\nu_0$ is the initial exponent, the $k$-th exponent is thus $\nu_0 + k\Delta$. This amounts to a simultaneous approximation of the location and assignment problems of § 2, concentrating on location first, then switching gradually to assignment.
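Pulling the pieces together — assignments (18b), center updates (26), and the exponent update (28) — the following sketch shows the resulting method. (Algorithm 1 itself is presented in Section 9 and fixes the details; the initialization, the guard against zero distances, and the iteration cap below are assumptions.)

```python
import numpy as np

def generalized_weiszfeld(X, w, K, nu0=5.0, delta=0.5, eps=1e-3,
                          max_iter=1000, seed=0):
    # Iterate between probabilistic assignments and center updates,
    # raising nu by delta each iteration (28); stop when the total
    # center movement sum_k d(c_k+, c_k) falls below eps.
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), K, replace=False)].astype(float)
    nu = nu0
    for _ in range(max_iter):
        d = np.maximum(np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2), 1e-12)
        r = d / d.min(axis=1, keepdims=True)   # scale to avoid overflow
        q = r ** (-nu)                         # assignments, as in (18b)
        q /= q.sum(axis=1, keepdims=True)
        C_new = np.empty_like(C)
        for k in range(K):                     # center updates (26)
            lam = w * q[:, k] / d[:, k]
            C_new[k] = (lam[:, None] * X).sum(axis=0) / lam.sum()
        move = np.linalg.norm(C_new - C, axis=1).sum()
        C, nu = C_new, nu + delta              # exponent update (28)
        if move < eps:
            break
    return C, q

# Example 1 data: the unit square, K = 2
X = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
C, q = generalized_weiszfeld(X, np.ones(4), K=2, seed=3)
print(C.round(4))
```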
Increasing the exponent $\nu$ decreases the value of the objective function (22), as shown in the following lemma.
Lemma 1. Let $\{c_1, \dots, c_K\}$ be any centers, $x$ any point, $\{d(x, c_k) : k \in 1{:}K\}$ the distances, not all equal, and $\{p_k^{(\nu)}(x) : k \in 1{:}K\}$ the probabilities in (18a). If $\nu_2 > \nu_1 \geq 1$ then
$$\sum_{k=1}^{K} p_k^{(\nu_2)}(x)\, d(x, c_k) < \sum_{k=1}^{K} p_k^{(\nu_1)}(x)\, d(x, c_k), \eqno(29)$$
and
$$\lim_{\nu \to \infty} \sum_{k=1}^{K} p_k^{(\nu)}(x)\, d(x, c_k) = \min\{d(x, c_k) : k \in 1{:}K\}. \eqno(30)$$
Proof. To prove (29), assume that the distances are ordered, $d(x, c_1) \leq d(x, c_2) \leq \cdots \leq d(x, c_K)$, and are not all equal. Raising the exponent from $\nu = \nu_1$ to $\nu = \nu_2$ increases the highest probabilities and decreases the lowest ones; in particular, there is an index $j$ such that
$$p_k^{(\nu_2)}(x) > p_k^{(\nu_1)}(x) \text{ for } k \leq j, \quad \text{and} \quad p_k^{(\nu_2)}(x) \leq p_k^{(\nu_1)}(x) \text{ for } k > j.$$
Let $\alpha_k := p_k^{(\nu_2)}(x) - p_k^{(\nu_1)}(x)$. Then $\alpha_k > 0$ for $k \leq j$, and $\alpha_k \leq 0$ for $k > j$. Also $\sum_{k=1}^{K} \alpha_k = 0$, and therefore
$$\sum_{k=1}^{j} \alpha_k = -\sum_{k=j+1}^{K} \alpha_k = \alpha, \text{ say.}$$
$$\therefore\quad \tfrac{1}{\alpha}\left( \text{LHS}(29) - \text{RHS}(29) \right) = \sum_{k=1}^{j} \left( \tfrac{\alpha_k}{\alpha} \right) d(x, c_k) - \sum_{k=j+1}^{K} \left( -\tfrac{\alpha_k}{\alpha} \right) d(x, c_k) < 0,$$
since $\sum_{k=1}^{j} (\alpha_k/\alpha)\, d(x, c_k)$ is a convex combination of low values, and $\sum_{k=j+1}^{K} (-\alpha_k/\alpha)\, d(x, c_k)$ is a convex combination of higher values.
To prove (30), let $M$ be the index set of minimal distances,
$$M := \{k \in 1{:}K : d(x, c_k) = \min\{d(x, c_\ell) : \ell \in 1{:}K\}\}.$$
Then
$$\lim_{\nu \to \infty} p_i^{(\nu)} = 0, \text{ for } i \notin M, \qquad \lim_{\nu \to \infty} p_i^{(\nu)} = \frac{1}{|M|}, \text{ for } i \in M,$$
where $|M|$ is the number of elements of $M$. Therefore
$$\lim_{\nu \to \infty} \sum_{k=1}^{K} p_k^{(\nu)}(x)\, d(x, c_k) = \sum_{k=1}^{K} \left( \lim_{\nu \to \infty} p_k^{(\nu)} \right) d(x, c_k) = \min\{d(x, c_k) : k \in 1{:}K\}. \qquad \square$$
Remark 2.
(a) Convergence can be established as in the Weiszfeld algorithm, by forcing a move away from non–optimal data points (where the algorithm gets stuck by (25b)), using the ideas of Balas and Yu [3], and Vardi and Zhang [43].
Example 2. The dependence of Algorithm 1 on the exponent ν used in (18b) is illustrated here for the
simple Example 1.
Fig. 2 shows three trajectories, starting at the same initial solution (a pair of close points at the center of the unit square $[0, 1]^2$), using a fixed exponent $\nu$ throughout the iterations, and converging to the three solutions of Example 1. For $\nu = 6$, $\nu = 12$, and $\nu = 18$, the iterates converged to Solutions 2, 3 and 1, respectively.
Table 1 shows the results for 1,000 initial solutions (each a pair of points chosen randomly in $[0, 1]^2$), using different values of $\nu$ (held constant throughout), and a stopping criterion of $\epsilon = 0.0001$. The table shows the percentage of cases ending with Solutions 1, 2 or 3 of Example 1, and the average number of iterations. The non–optimal Solution 2 is present only for small $\nu$, and occurs 100% of the time for $\nu = 6$. The optimal Solution 3 dominates for intermediate values of $\nu$ (it occurs 83% of the time for $\nu = 13$), and is present for all larger $\nu$. The non–optimal Solution 1 dominates for large $\nu$.
10. Territories of the facilities

Each center $c_k$ serves the customers in a territory that is closer to it than to any other center, see, e.g., [21]–[22]. This territory is expressed naturally using the membership probabilities (10).
For any two (distinct) centers $c_j, c_k$, the locus of the points $x$ with $p_j(x) = p_k(x)$ is, by (10), represented by the equation
$$\|x - c_j\| = \|x - c_k\|, \eqno(32)$$
Figure 3. (a) The 15 data points of Problem 9 in Table 2; (b) probability level sets and Voronoi diagram of the centers.
and is thus a hyperplane, the orthogonal bisector of the segment joining $c_j$ and $c_k$. On this hyperplane one is indifferent between the two centers. The portion of the hyperplane (32) where
$$p_j(x) = p_k(x) \geq p_\ell(x), \quad \ell \in 1{:}K, \eqno(33)$$
is either empty, bounded or unbounded. We call it the common boundary of the clusters $X_j$ and $X_k$.
The common boundaries constitute a Voronoi diagram, [2], of the centers. Each Voronoi cell is a polyhedron, and is the territory served best by its facility.
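Charting the territories is straightforward: on a grid, the largest membership probability of (10) selects the same center as the smallest distance, so nearest-center labels trace the Voronoi cells, and near-equality of the two smallest distances locates the common boundaries (32)–(33). A sketch with made-up centers:

```python
import numpy as np

# Territories (Voronoi cells) of three illustrative centers.
C = np.array([[10., 35.], [30., 10.], [45., 40.]])
xs, ys = np.meshgrid(np.linspace(0, 50, 200), np.linspace(0, 50, 200))
P = np.stack([xs, ys], axis=-1)                    # 200 x 200 x 2 grid
d = np.linalg.norm(P[..., None, :] - C, axis=-1)   # distances to centers
territory = d.argmin(axis=-1)                      # Voronoi labels
# Points near a common boundary: two smallest distances almost equal.
two = np.sort(d, axis=-1)
boundary = (two[..., 1] - two[..., 0]) < 0.25
print(territory.shape, boundary.sum())
```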
Example 3. Problem 9 in Table 2, due to Cooper [10], has N = 15 data points, see Figure 3(a), and K = 3
centers. Figure 3(b) shows the optimal centers, found by Algorithm 1, some level sets of the membership
probability for each cluster, and the Voronoi diagram of the centers, partitioning the plane into territories
served by each facility.
Figure 4. (a) The 50 data points of Problem 30 in Table 2; (b) probability level sets and Voronoi diagram of the centers.
Example 4. Figure 4(a) shows the 50 data points of Problem 30 in Table 2 below. This problem, with $K = 5$, was solved by Eilon et al. in [19, p. 83]. Figure 4(b) shows the optimal centers (found by Algorithm 1), some probability level sets for each cluster, and the common boundaries of the territories served by the facilities (two of the common boundaries are empty).
11. Convergence
Algorithm 1 decomposes the problem (P.K) into $K$ problems of type (L.1), coupled by the probabilistic assignments. The convergence proof of Kuhn [29]–[30] (with modifications, e.g., [3], [7] or [43]) can be adapted here.
While the algorithm converges from any initial set of centers $\{c_1, \cdots, c_K\}$, it does not necessarily converge to an optimal solution of the original problem (L.K).
Writing (25a) as
$$T_k(c) = \sum_{i=1}^{N} \lambda_{ki}\, x_i = \sum_{i=1}^{N} \lambda_{ki}\,(x_i - c) + c,$$
we get, from (27a),
$$T_k(c) = c + h_k(c)\, R_k(c), \quad \text{where } h_k(c) := \left( \sum_{m=1}^{N} \frac{w_m\, q_k(x_m)}{\|x_m - c\|} \right)^{-1}. \eqno(35)$$
Theorem 1. Let the data $\{X, W, K\}$, and a set $Q(X)$ of probabilistic assignments (14), be given.
(a) For any set of points $\{c_k : k \in 1{:}K\} \subset \mathbb{R}^n$, the condition⁷
$$R_k(c_k) = 0, \quad k \in 1{:}K, \eqno(36)$$
is necessary and sufficient for the points $\{c_1, \dots, c_K\}$ to minimize $f(c_1, \dots, c_K)$ in (22a).
(b) The optimal centers $\{c_k^* : k \in 1{:}K\}$ are in the convex hull of $X$.
(c) If the point $c$ is an optimal location of the $k$-th facility (for the given $Q(X)$), then it is a fixed point⁸ of $T_k$,
$$T_k(c) = c, \quad \text{for some } k \in 1{:}K. \eqno(37)$$
Conversely, if $c \notin X$ and satisfies (37), then $c$ is optimal.
(d) If $T_k(c) \neq c$, then the function $f_k(\cdot)$ of (22b) satisfies
$$f_k(T_k(c)) < f_k(c). \eqno(38)$$
(e) $x_j = c_k^*$ if and only if $w_j\, q_k(x_j) \geq \|R_{jk}\|$.
(f) If $x_j \neq c_k^*$, the direction of steepest descent of $f_k$ at $x_j$ is $R_{jk}/\|R_{jk}\|$.
(g) If $x_i \neq c_k^*$, there exists $\delta > 0$ such that
$$0 < \|x_i - c\| \leq \delta \;\text{ implies }\; \|x_i - T_k^{s-1}(c)\| \leq \delta \text{ and } \|x_i - T_k^{s}(c)\| > \delta$$
for some positive integer $s$.
(h) $$\lim_{c \to x_j} \frac{\|x_j - T_k(c)\|}{\|x_j - c\|} = \frac{\|R_{jk}\|}{w_j\, q_k(x_j)} \quad \text{for } k \in 1{:}K,\; j \in 1{:}N. \eqno(39)$$
(i) Given any $c_k^0$, $k \in 1{:}K$, define the sequence $\{c_k^r = T_k^r(c_k^0) : r = 1, 2, \dots\}$. If no $c_k^r$ is a data point, then $\lim_{r \to \infty} c_k^r = c_k^*$, for some optimal centers $\{c_1^*, \cdots, c_K^*\}$ (for the approximate problem (P.K)).
This theorem can be proved by adapting the proof of [26, Theorem 1] for probabilistic assignments. We
give a proof in the supplementary material section at the end.
Remark 3.
(a) If the assignments Q(X) are pure, then Theorem 1 reduces to the results (a)–(i) of § 3.1, for each of
the clusters defined by (13).
(b) Theorem 1(d) guarantees a decrease of the function $f_k$ by the iteration (26), but the function $f_k$ on the left side of (38) may be different from the function $f_k$ on the right side of (38) if the probabilistic assignments $\{q_k(x_i)\}$ are updated, as is the case in Algorithm 1. However, Lemma 1 guarantees a decrease if the $\{p_k^{(\nu)}(x_i)\}$ of (18b) are used, with the exponent $\nu$ raised at each iteration.
(c) Theorem 1(g) states that each non–optimal data point has a neighborhood that eventually repulses the
sequence of approximations. Therefore the iterates may not get arbitrarily close to any non–optimal data
point.
⁷Recall the definition of $R_k(\cdot)$ in (27a)–(27b), and note that $R_k(\cdot)$ and (22a)–(22b) use the same (given) probabilistic assignments $Q(X)$.
⁸Recall the definition in (25a)–(25b).
12. Duality

We abbreviate the probabilistic assignments $q_k(x_i)$ by $q_{ki}$, for $k \in 1{:}K$, $i \in 1{:}N$. A dual problem (D) for (P.K) is now given, following [29]. It uses the data
$$\{X, W, K, Q(X)\}, \eqno(40)$$
consisting of the data points $X = \{x_i\}$, the weights $W = \{w_i\}$, the number of facilities $K$, and the probabilistic assignments $Q(X) = \{q_{ki}\}$. The dual variables are $KN$ vectors $\{u_{ki} : k \in 1{:}K, i \in 1{:}N\}$, one for each center and data point. We denote the set of dual variables by $\mathcal{U}$.
The dual problem is:
$$\max\; g(\mathcal{U}) = \sum_{k=1}^{K} \sum_{i=1}^{N} u_{ki} \cdot x_i \eqno{(D)}$$
$$\text{s.t.} \quad \sum_{i=1}^{N} u_{ki} = 0, \quad k \in 1{:}K, \eqno(41)$$
$$\|u_{ki}\| \leq w_i\, q_{ki}, \quad i \in 1{:}N,\; k \in 1{:}K. \eqno(42)$$
Problem (D) is a generalization of the dual problem for the single facility location problem, see also [29],
[17, § 1.1.2], [27] and [42].
Remark 4.
(a) The dual problem (D) uses the same data as (P.K), namely the triple {X, W, K}, and in addition the
probabilistic assignments Q(X).
(b) Unlike problems (L.K) and (P.K) that have locally optimal solutions, the dual problem (D) has a
unique optimal value (for the given data (40)), and its optimal solutions form a closed convex set.
(c) The dual problem (D) has a natural geometric interpretation. Let the weight $w_i$ of the $i$-th customer be split among the facilities, so that the $k$-th facility gets the weight $q_{ki} w_i$. The vector $u_{ki}$ is interpreted as a force with direction $(x_i - c_k)$, and magnitude $\|u_{ki}\| \leq q_{ki} w_i$. The location of the $k$-th facility is then the equilibrium point of these forces, as expressed by the constraint (41). The objective of the problem (D) is
$$\sum_{k=1}^{K} \sum_{i=1}^{N} u_{ki} \cdot x_i = \sum_{k=1}^{K} \sum_{i=1}^{N} u_{ki} \cdot (x_i - c_k), \quad \text{by (41)},$$
and maximizing it is equivalent to minimizing the sum of the terms $u_{ki} \cdot (c_k - x_i)$, representing the work done by the force $u_{ki}$ along $c_k - x_i$.
Theorem 2.
(a) Let the data (40) be given. Then for any set of points $\{c_1, \dots, c_K\}$, and any set of feasible dual variables $\mathcal{U} = \{u_{ki}\}$, the inequality
$$g(\mathcal{U}) \leq f(c_1, \dots, c_K) \eqno(43)$$
holds, where $f(c_1, \dots, c_K)$ is the objective function (22a) of the primal problem (P.K).
(b) Given the data (40), and an optimal solution $\{c_1, \dots, c_K\}$ of the primal problem (P.K) (with the same data), there is a feasible solution $\mathcal{U} = \{u_{ki}\}$ of the dual problem (D) with
$$g(\mathcal{U}) = f(c_1, \dots, c_K). \eqno(44)$$
(c) Let $\mathcal{U}$ be an optimal solution of the dual problem (D). Then there exist $\{c_1, \dots, c_K\}$ such that (44) holds.
Remark 5.
(a) Theorem 2 establishes duality between the centers–problem of (P.K) and (D). Part (a) is weak duality, in the sense that any feasible solution $\mathcal{U}$ of (D) gives a lower bound for the optimal value of (P.K), and conversely, any set of centers $\{c_k\}$ for (P.K) gives an upper bound on the optimal value of (D). Parts (b)–(c) show that there is no duality gap.
(b) If $Q(X)$ are pure assignments, then Theorem 2(b) can be used to check the optimality of the centers for each of the clusters defined by (13). Indeed, if
$$\max\,(D) = f(c_1, \dots, c_K), \eqno(45)$$
then the centers $\{c_1, \cdots, c_K\}$ are optimal for their (not necessarily optimal) clusters, see, e.g., the following example.
Example 5. Consider the 4 points {x1 , · · · , x4 } of Example 1, and the 3 solutions given there (for K = 2),
starting with Solution 3.
Solution 3. Here the 2 clusters are X1 = {x1 , x2 , x4 }, X2 = {x3 }, see Fig. 1(b).
For $k \in 1{:}2$ and $i \in 1{:}4$ let $u_{ki}$ be the dual variables, and let $q_{ki}$ be given by the pure assignments: $q_{ki} = 1$ if $x_i$ is assigned to cluster $X_k$, and 0 otherwise. Then
$$q_{11} = q_{12} = q_{14} = q_{23} = 1, \quad \text{all other } q_{ki} = 0$$
($q_{13} = 0$ since $x_3$ does not belong to $X_1$, etc.), and therefore, by the constraints (42), the dual problem reduces to
$$\begin{aligned}
\max \quad & u_{11} \cdot x_1 + u_{12} \cdot x_2 + u_{14} \cdot x_4 + u_{23} \cdot x_3 = u_{12}[1] + u_{14}[2] + u_{23}[1] + u_{23}[2] \\
\text{s.t.} \quad & u_{11} + u_{12} + u_{14} = 0, \hspace{12em} (46)\\
& u_{23} = 0, \\
& \|u_{ki}\| \leq 1, \quad \text{for } ki = 11, 12, 14 \text{ and } 23,
\end{aligned}$$
where $u_{ki}[j]$ is the $j$-th component of $u_{ki}$, $j = 1, 2$. This problem has the unique solution
$$u_{11} = -\frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad u_{12} = \frac{1}{4} \begin{pmatrix} \sqrt{2} + \sqrt{6} \\ \sqrt{2} - \sqrt{6} \end{pmatrix}, \quad u_{14} = \frac{1}{4} \begin{pmatrix} \sqrt{2} - \sqrt{6} \\ \sqrt{2} + \sqrt{6} \end{pmatrix}, \quad u_{23} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
and the optimal value is
$$u_{12}[1] + u_{14}[2] = \tfrac{1}{2}\left( \sqrt{2} + \sqrt{6} \right) \approx 1.931852,$$
Figure 5. Illustration of Example 5: The resultant of the forces u11 , u12 and u14 is zero
which equals the value of Solution 3 in Example 1. By Theorem 2, the centers in Solution 3 are optimal
for their clusters. Figure 5 illustrates that the equilibrium condition (46) is satisfied.
Solution 2. We have the same clusters, pure assignments, and dual problem (D) as in Solution 3, and get the same optimal value of (D), which does not satisfy (45):
$$\max\,(D) \approx 1.9318 < 2 = f(c_1, c_2),$$
and therefore the centers in Solution 2 are not optimal for their clusters.
Solution 1. The dual problem for Solution 1 is different, and its optimal value $g(\mathcal{U})$ satisfies (45), showing that the centers (chosen arbitrarily on the segments joining the paired points) are optimal for their clusters.
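The closed-form dual solution of Solution 3 is easy to check numerically — feasibility (42), the equilibrium condition (46), and the dual value:

```python
import numpy as np

s2, s6 = np.sqrt(2), np.sqrt(6)
u11 = -np.array([1, 1]) / s2
u12 = np.array([s2 + s6, s2 - s6]) / 4
u14 = np.array([s2 - s6, s2 + s6]) / 4
u23 = np.zeros(2)

# Equilibrium (46) and feasibility (42): unit bounds since all w_i q_ki = 1.
print(np.allclose(u11 + u12 + u14, 0))                         # True
print([round(np.linalg.norm(u), 6) for u in (u11, u12, u14)])  # all 1.0
# Dual objective: sum of u_ki . x_i over the assigned pairs.
x1, x2, x3, x4 = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
g = u11 @ x1 + u12 @ x2 + u14 @ x4 + u23 @ x3
print(g, (s2 + s6) / 2)                                        # both 1.931852...
```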
13. The joint distance function and the classification uncertainty function

The JDF $D(x)$ of (11) depends on the centers $\{c_k : k \in 1{:}K\}$, and has the following useful property: for optimal centers, most of the data points are contained in the lower level sets of $D(x)$, see [1], [25]. We call this property contour approximation of the data.
The JDF of the set $X = \{x_i : i \in 1{:}N\}$ is defined as the sum of the values $D(x)$ over $X$,
$$D(X) := \sum_{i=1}^{N} D(x_i). \eqno(47)$$
From the contour approximation property it follows that D(X) is a measure of the proximity of customers
to their respective facilities, and the lower the value of D(X), the better is the set of centers {ck }.
The JDF has the dimension of weighted distance. Normalizing it, we get the dimensionless function
$$E(x) = \frac{K\, D(x)}{w \left( \prod_{j=1}^{K} d(x, c_j) \right)^{1/K}}\,, \eqno(48)$$
with $0/0$ interpreted as zero. $E(x)$ is the harmonic mean of the distances $\{d(x, c_j) : j \in 1{:}K\}$ divided by their geometric mean. It follows that $0 \leq E(x) \leq 1$, with $E(x) = 0$ if any $d(x, c_j) = 0$, i.e. if $x$ is a cluster center, and $E(x) = 1$ if and only if the distances $d(x, c_j)$ (and therefore the probabilities $p_j(x)$) are all equal.
$E(x)$ can be written, using (10)–(11), as the geometric mean of the cluster membership probabilities (up to a constant),
$$E(x) = K \left( \prod_{j=1}^{K} p_j(x) \right)^{1/K}. \eqno(49)$$
In particular, for $K = 2$,
$$E(x) = \frac{2\sqrt{d(x, c_1)\, d(x, c_2)}}{d(x, c_1) + d(x, c_2)} = 2\sqrt{p_1(x)\, p_2(x)}\,. \eqno(50)$$
The function E(x) represents the uncertainty of classifying the point x, with higher [lower] values
of E(x) corresponding to higher [lower] uncertainty, see Appendix B. We call E(x) the classification
uncertainty function, abbreviated CUF, at x.
The CUF of the data set $X = \{x_i : i \in 1{:}N\}$ is defined as
$$E(X) := \frac{1}{N} \sum_{i=1}^{N} E(x_i). \eqno(51)$$
$E(X)$ is a monotone decreasing function of $K$, the number of clusters, decreasing from $E(X) = 1$ (for $K = 1$) to $E(X) = 0$ (for $K = N$, the trivial case where every data point is a separate cluster, with no uncertainty). Indeed, increasing the number of centers from $K$ to $K + 1$ allows more customers access to closer centers, with increased probabilities for these centers.
The “right” number $K$ of facilities to serve the given customers is, in general, determined by economic considerations (operating costs, etc.). An intrinsic criterion for determining the optimal $K$ is provided by the rate of decrease of $E(X)$, see e.g. Figure 6(d).
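A sketch of this criterion on made-up data: compute E(X) from (49) and (51) for K = 1, 2, . . . and look for the K past which the decrease flattens. (For brevity, the centers here come from a crude nearest-center/means iteration standing in for Algorithm 1.)

```python
import numpy as np

def cuf(X, C, eps=1e-12):
    # E(X) of (51): mean over the data of E(x) = K (prod_j p_j(x))^(1/K),
    # the normalized geometric mean of the memberships (49).
    d = np.maximum(np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2), eps)
    p = (1.0 / d) / (1.0 / d).sum(axis=1, keepdims=True)
    E = len(C) * np.exp(np.log(p).mean(axis=1))   # stable geometric mean
    return E.mean()

# Three well-separated blobs (illustrative data):
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.5, (30, 2)) for m in ([0, 0], [6, 0], [3, 5])])
for K in range(1, 7):
    C = X[rng.choice(len(X), K, replace=False)]
    for _ in range(50):
        labels = np.linalg.norm(X[:, None] - C[None], axis=2).argmin(axis=1)
        C = np.array([X[labels == k].mean(axis=0) if (labels == k).any() else C[k]
                      for k in range(K)])
    print(K, round(cuf(X, C), 3))   # expect a sharp drop up to K = 3
```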
Example 6. Figure 6(a) shows the dataset of Example 3. This problem was solved for K = 3, and
Figure 6(b) shows the centers, and some level sets of the JDF (11), illustrating the contour approximation
property, namely that lower contours of the JDF capture most of the data.
Figure 6(c) shows some level sets of the CUF (49), with darker shades indicating higher uncertainty.
Uncertainty is minimal (E(x) = 0) if x is one of the centers. Note the patch of maximal uncertainty
(E(x) ≥ 0.99) in the middle, where one is indifferent between the three centers9.
The “right” number of clusters (if not known a priori) can be determined by solving the problem (P.K)
for values of K = 1, 2, · · · , calculating the CUF of the whole dataset (51), and stopping when the marginal
decrease of uncertainty is negligible. The results are plotted in Figure 6(d), showing that most of the uncertainty is reduced by $K = 3$, with a marginal reduction of uncertainty for higher values of $K$. If the data is amorphous, with no clear number of clusters, then the graph of $E(X)$ does not give a clue.
The “right” number of clusters can be determined similarly from the JDF $D(X)$, (47), computed for $K = 1, 2, \cdots$, see [24].

⁹In analogy with Lagrange points in astronomy, where the gravity pulls from several stars cancel.
Figure 6. (a) The dataset of Example 3; (b) the centers and some level sets of the JDF; (c) level sets of the CUF; (d) rate of decrease of E(X) as a function of K.
14. Numerical results

Tables 2–5 below present results for the 31 test problems considered in [6]. For each of these problems, Table 2 gives the source (references in the bibliography), the number $N$ of customers, the number $K$ of facilities, and the coordinates of the $N$ customers.

Table 2 (problems 8–31). Source, $N$, $K$, and customer coordinates.

 8   [40]         15   2
 9   [11], [40]   15   3    (5, 9) (5, 25) (5, 48) (13, 4) (12, 19)
10   [40]         15   4    (13, 39) (28, 37) (21, 45) (25, 50) (31, 9)
11   [40]         15   5    (39, 2) (39, 16) (45, 22) (41, 30) (49, 31)
12   [40]         15   6
13   [40]         20   2
14   [40]         20   3    same as problem 8, plus
15   [40]         20   4    (53, 8) (1, 34) (33, 8) (3, 26) (17, 9)
16   [40]         20   5
17   [40]         20   6
18   [40]         25   2
19   [40]         25   3    same as problem 13, plus
20   [40]         25   4    (53, 20) (24, 17) (40, 22) (22, 41) (7, 13)
21   [40]         25   5
22   [40]         25   6
23   [40]         30   2
24   [40]         30   3    same as problem 18, plus
25   [40]         30   4    (5, 17) (39, 3) (50, 50) (16, 40) (22, 45)
26   [40]         30   5
27   [19]         50   2    (1.33, 8.89) (1.89, 0.77) (9.27, 1.49) (9.46, 9.36) (9.20, 8.69)
28   [19]         50   3    (7.43, 1.61) (6.08, 1.34) (5.57, 4.60) (6.70, 2.77) (8.99, 2.45)
29   [19]         50   4    (8.93, 7.00) (8.60, 0.53) (4.01, 0.31) (3.34, 4.01) (6.75, 5.57)
30   [19]         50   5    (7.36, 4.03) (1.24, 6.69) (3.13, 1.92) (8.86, 8.74) (4.18, 3.74)
31   [38]         50   10   (2.22, 4.35) (0.88, 7.02) (8.53, 7.04) (6.49, 6.22) (4.53, 7.87)
                            (4.46, 7.91) (2.83, 9.88) (3.39, 5.65) (0.75, 4.98) (7.55, 5.79)
                            (8.45, 0.69) (3.33, 5.78) (6.27, 3.66) (7.31, 1.61) (6.37, 7.02)
                            (7.23, 7.05) (1.68, 6.45) (3.54, 7.06) (7.67, 4.17) (2.20, 1.12)
                            (3.57, 1.99) (7.34, 1.38) (6.58, 4.49) (5.00, 9.00) (6.63, 5.23)
                            (5.89, 8.06) (1.13, 5.25) (1.90, 8.35) (1.74, 1.37) (9.37, 6.44)
Each problem was solved 500 times, using random initializations. The results, for $\nu_0 = 5$, $\Delta = 0.5$ and $\epsilon = 0.001$, are reported in Table 3.
Column 2 (“Best known f ”) gives the best values of the objective in the literature ([6], [8]), which are assumed optimal.
Column 3 (“Best f ”) gives the best values obtained with Algorithm 1. For all 31 problems, Algorithm 1 obtained the optimal values. In problems 27–31, marked by “†” in Column 3, Algorithm 1 gave somewhat better results than the best known values, perhaps because of more precise cluster centers. In problems 19,
20 and 25, marked by a “*” in Column 3, getting the correct third decimal required a tolerance $\epsilon = 0.0001$.
Column 4 (“Frequency (%)”) gives the percentage of the runs in which Algorithm 1 gave the optimal solution. In problem 3 the best value was found in all runs.
Columns 5 and 6 give, respectively, the worst values found (“Max. f Value”), and their frequency (“Frequency (%)”).
Columns 7 and 8 give, respectively, the average values (“Average f Value”) of all 500 solutions obtained, and the average number of iterations.
(Algorithm 1 was tested using a program written in MATLAB R2010a; all experiments were conducted on a machine with a 2.53 GHz Intel Core 2 Duo processor.)
Table 4 reports the same information as Table 3, for $\Delta = 0.1$ and $\epsilon = 0.0001$. The 2nd column of Table 3 is not repeated.
Remark 6. A comparison of Table 3 (with $\Delta = 0.5$, $\epsilon = 0.001$) and Table 4 ($\Delta = 0.1$, $\epsilon = 0.0001$) shows that the smaller increment $\Delta$ requires more iterations, but also increases the frequency of optimal values. To prevent early termination, a small tolerance $\epsilon$ is suggested in conjunction with the small $\Delta$. This calls for further research, including other selections of the exponent $\nu$, in particular an adaptive control where the exponent $\nu$ is updated according to the change $\sum_{k=1}^{K} d(c_k^{+}, c_k)$ in Step 4 of Algorithm 1.
Example 7. This example uses data from Bongartz et al. [6, p. 308], consisting of 287 points, with different weights ranging from $w = 1$ to $w = 698$. The data points, shown in Fig. 7(a), are distinguished according to their weights. Customers with weight $< 20$ (called “minor customers”) are represented by dots. The squares show the “major customers”: the white squares correspond to customers with weights $20 \leq w < 100$, and the gray squares to those with weights $\geq 100$.
The problem was solved for K = 2, . . . , 5, 10 using 200 random initializations for each K. The results
are shown in Table 5. The optimal values in the 2nd column are in agreement with [8]. The 3rd and 5th
columns give, respectively, the best and average values found in 200 random initializations. The errors, in
columns 4 and 6, are relative to the optimal value. The last column gives the average number of iterations.
Example 8. To illustrate the robustness of Algorithm 1, see Remark 2(c), we solved Example 7 again just
for the 61 major customers. The same center locations were found in both cases. The solution for K = 3
is shown in Fig. 7(b), together with the major customers.
References
[1] M. Arav, Contour approximation of data and the harmonic mean, J. Math. Inequalities 2(2008), 161–167
[2] F. Aurenhammer, Voronoi diagrams - a survey of a fundamental geometric data structure, ACM Computing Surveys
23(1991), 345–405
[3] E. Balas and C-S. Yu, A note on the Weiszfeld–Kuhn algorithm for the general Fermat problem, Tech. Report, Carnegie–Mellon University, September 1982
[4] A. Ben–Israel and C. Iyigun, Probabilistic distance clustering, J. Classification 25(2008), 5–26
[5] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1981
[6] I. Bongartz, P.H. Calamai and A.R. Conn, A projection method for ℓp norm location–allocation problems, Math. Programming 66(1994), 283–312
[7] J. Brimberg, Further notes on convergence of the Weiszfeld algorithm, Yugoslav J. Operations Research 13(2003), 199–206
[8] J. Brimberg, P. Hansen, N. Mladenović and E.D. Taillard, Improvements and comparison of heuristics for solving the
uncapacitated multisource Weber problem, Operations Research 48(2000), 444–460
[9] R. Chandrasekaran and A. Tamir, Open questions concerning Weiszfeld’s algorithm for the Fermat–Weber location
problem, Math. Programming 44(1989), 293–295
[10] L. Cooper, Location–allocation problems, Operations Research 11(1963), 331–343.
[11] L. Cooper, Heuristic methods for location–allocation problems, SIAM Review 6(1964), 37–53
[12] T. Drezner, Z. Drezner, The gravity p–median model, European J. Oper. Res. 179(2007), 1239–1251.
[13] Z. Drezner, The planar two–center and two–median problems, Transportation Science 18(1984), 351–361.
[14] Z. Drezner (editor), Location Decisions, Annals Oper. Res. 40(1992)
[15] Z. Drezner, A note on the Weber location problem, pp. 153–161 in [14]
[16] Z. Drezner and H.W. Hamacher (editors), Facility Location: Applications and Theory, Springer, 2002
Figure 7. (a) The 287 data points of Example 7, distinguished by weight; (b) the solution for K = 3, shown with the major customers.
[17] Z. Drezner, K. Klamroth, A. Schöbel and G.O. Wesolowsky, The Weber problem, Chapter 1 in [16]
[18] U. Eckhardt, Weber's problem and Weiszfeld's algorithm in general spaces, Math. Program. 18(1980), 186–196
[19] S. Eilon, C.D.T. Watson–Gandy and N. Christofides, Distribution Management: Mathematical Modelling and Practical
Analysis, Hafner, New York, 1971
[20] S.L. Hakimi, Optimum distribution of switching centers in a communication network and some related graph theoretic problems, Operations Research 13(1965), 462–475
[21] D.L. Huff, Defining and estimating a trading area, J. Marketing 28(1964), 34–38
[22] D.L. Huff, A programmed solution for approximating an optimum retail location, Land Economics 42(1966), 293–303.
[23] C. Iyigun and A. Ben–Israel, Probabilistic distance clustering adjusted for cluster size, Probability Engrg. Info. Sci.
22(2008), 1–19
[24] C. Iyigun and A. Ben–Israel, Probabilistic Distance Clustering, Algorithm and Applications, Chapter 2, Clustering
Challenges in Biological Networks, S. Butenko, W.A. Chaovalitwongse and P.M. Pardalos (Editors), World Scientific,
Singapore, 2009, ISBN-13 978-981-277-165-0
[25] C. Iyigun and A. Ben–Israel, Contour approximation of data: a duality theory, Lin. Algeb. and Appl. 430(2009), 2771–
2780
[26] C. Iyigun and A. Ben–Israel, A generalized Weiszfeld method for the multi–facility location problem, Operations Research
Letters 38(2010), 207–214
[27] W. Kaplan and W.H. Yang, Duality theorem for a generalized Fermat–Weber problem, Math. Programming 76(1997),
285–297
[28] I.N. Katz, Local convergence in Fermat’s problem, Math. Program. 6(1974), 89–104
[29] H.W. Kuhn, On a pair of dual nonlinear programs, in J. Abadie (ed.), Methods of Nonlinear Programming, Amsterdam,
North-Holland, (1967), 38–54
[30] H.W. Kuhn, A note on Fermat’s problem, Math. Programming 4(1973), 98–107
[31] S. Kullback. Information Theory and Statistics. J. Wiley, New York, 1959.
[32] S. Kullback and R.A. Leibler. On information and sufficiency. Annals Math. Statist., 22(1951), 79–86
[33] Y. Levin and A. Ben–Israel, A heuristic method for large-scale multi-facility location problems, Computers and Oper.
Res. 31(2004), 257–272
[34] R.F. Love and H. Juel, Properties and solution methods for large location–allocation problems, J. Operational Research
Society 33(1982), 443–452
[35] R.F. Love, J.G. Morris and G.O. Wesolowsky, Facilities Location: Models and Methods, North-Holland, 1988
[36] R.D. Luce, Individual Choice Behavior, Wiley, New York, 1959
[37] N. Megiddo and K.J. Supowit, On the complexity of some common geometric location problems, SIAM Journal on
Computing 13(1984), 182–196
[38] B.A. Murtagh and S.R. Niwattisyawong, An efficient method for the multi–depot location–allocation problem, Journal
of Operational Research Society 33 (1982) 629–634.
[39] C. ReVelle and R. Swain, Central facilities location, Geographical Analysis 2(1970), 30–42
[40] K.E. Rosing, An optimal method for solving the multi-Weber problems, European Journal of Operational Research 58
(1992) 414–426.
[41] M. Teboulle, A unified continuous optimization framework for center–based clustering methods, J. Machine Learning
8(2007), 65–102
[42] H. Üster and R.F. Love, Duality in constrained multi–facility location models, Naval Research Logistics Quart. 49(2002),
410–421
[43] Y. Vardi and C-H Zhang, A modified Weiszfeld algorithm for the Fermat–Weber location problem, Math. Programming
A90(2001), 559–566
[44] E. Weiszfeld, Sur le point par lequel la somme des distances de n points donnés est minimum, Tohoku Math. J. 43 (1937),
355–386
[45] G.O. Wesolowsky, The Weber problem: its history and perspectives, Location Science 1(1993) 5–23
[46] J.I. Yellott, Jr., Luce’s Choice Axiom, pp. 9094–9097 in International Encyclopedia of the Social & Behavioral Sciences
(N.J. Smelser and P.B. Baltes, editors), ISBN 0-08-043076-7, 2001.
Appendix A. Proof of Theorem 2

Finally, from (A4) and (A5) it follows that (A2) and (A3) are satisfied as equalities, proving (44).
Case 2. A center coincides with one of the data points, say
$$c_k = x_j, \quad \text{for some } k \in 1{:}K,\; j \in 1{:}N, \eqno{(A6)}$$
in which case (15) holds. Define
$$u_{ki} := \frac{w_i\, q_{ki}}{\|x_i - x_j\|}\,(x_i - x_j), \quad \text{for } i \neq j, \eqno{(A7a)}$$
$$u_{kj} := -\sum_{i \neq j} u_{ki}. \eqno{(A7b)}$$
Then $\sum_i u_{ki} = 0$ by definition, and $\|u_{ki}\| = w_i\, q_{ki}$ for all $i \neq j$, as in (A5). Next,
reproducing (A7a)–(A7b), and equality in (44) follows as in the proof of Theorem 2(b), Case 2.
Appendix C. Luce's choice axiom

The meaning of (C2b) is that the probabilities $p_k(x)$ do not depend on the scale of measurement, i.e., the function $f$ is homogeneous of degree 0. It follows that the probabilities $p_k(x)$ depend only on the ratios of the distances $\{d_k(x) : k \in 1{:}K\}$, as is assumed in what follows.
The symmetry of f , expressed by (C2c), guarantees that the probabilities {pk (x)} do not depend on the numbering
of the clusters.
For any nonempty subset $S \subset 1{:}K$, let
$$p_S(x) = \sum_{s \in S} p_s(x),$$
the probability that $x$ belongs to one of the clusters $\{X_s : s \in S\}$, and let $p_k(x|S)$ denote the conditional probability that $x$ belongs to the cluster $X_k$, given that it belongs to one of the clusters $\{X_s : s \in S\}$.
Since the probabilities $p_k(x)$ depend only on the ratios of the distances $\{d_k(x) : k \in 1{:}K\}$, and these ratios are unchanged in subsets $S$ of the index set $1{:}K$, it follows that for all $k$ and $S \subset 1{:}K$ with $k \in S$,
$$p_k(x) = p_k(x|S)\, p_S(x), \eqno{(C3)}$$
which is the choice axiom of Luce, [36, Axiom 1], and therefore, [46],
$$p_k(x|S) = \frac{v_k(x)}{\sum_{s \in S} v_s(x)}\,, \eqno{(C4)}$$
where $v_k(x)$ is a scale function; in particular,
$$p_k(x) = \frac{v_k(x)}{\sum_{s \in 1:K} v_s(x)}\,. \eqno{(C5)}$$
Assuming $v_k(x) \neq 0$ for all $k$, it follows that
$$p_k(x)\, v_k(x)^{-1} = \frac{1}{\sum_{s \in 1:K} v_s(x)}\,, \eqno{(C6)}$$
where the right hand side is a function of $x$, and does not depend on $k$.
Property (C2a) implies that the function $v_k(x)$ decreases as $d_k(x)$ increases. A simple choice is
$$v_k(x) = \frac{1}{w\, d_k(x)}\,, \eqno{(C7)}$$
for which (C6) gives
$$w\, p_k(x)\, d_k(x) = \frac{1}{\sum_{s \in 1:K} \dfrac{1}{d_s(x)}}\,, \eqno{(C8)}$$
in agreement with (9).
Appendix D. Proof of Theorem 1

The proof in [26, Appendix A] is adapted below for the case of probabilistic assignments (18a).
Proof of part (a). If $c_k$ is not one of the data points, then $-R_k(c_k)$ is the gradient (23) at $c_k$, and (36) is both necessary and sufficient for a minimum, by the convexity of $f_k$.
If $c_k$ coincides with a data point $x_j$, consider the change from $x_j$ to $x_j + t z$, where $\|z\| = 1$. Then
$$\frac{d}{dt} f_k(x_j + t z)\Big|_{t=0} = w_j\, q_k(x_j) - R_{jk} \cdot z. \eqno{(D1)}$$
The greatest decrease of $f_k$ is along $R_{jk}$, i.e., when
$$z = \frac{R_{jk}}{\|R_{jk}\|}\,, \quad \text{(proving part (f))}.$$
Proof of part (d). For a given $c$, define the weights
$$\theta_{ki} := \frac{w_i\, q_k(x_i)}{\|x_i - c\|}\,, \quad i \in 1{:}N, \eqno{(D2)}$$
and the function $g(c') := \sum_{i=1}^{N} \theta_{ki}\, \|x_i - c'\|^2$, whose minimizer is $T_k(c)$. Since $c \neq T_k(c)$,
$$g(T_k(c)) = \sum_{i=1}^{N} \theta_{ki}\, \|x_i - T_k(c)\|^2 < g(c) = \sum_{i=1}^{N} \theta_{ki}\, \|x_i - c\|^2 = f_k(c).$$
On the other hand,
$$g(T_k(c)) = \sum_{i=1}^{N} \theta_{ki} \left[ \|x_i - c\| + \left( \|x_i - T_k(c)\| - \|x_i - c\| \right) \right]^2 = f_k(c) + 2\left( f_k(T_k(c)) - f_k(c) \right) + \sum_{i=1}^{N} \theta_{ki} \left( \|x_i - T_k(c)\| - \|x_i - c\| \right)^2.$$
Combining these results,
$$2 f_k(T_k(c)) + \sum_{i=1}^{N} \theta_{ki} \left( \|x_i - T_k(c)\| - \|x_i - c\| \right)^2 < 2 f_k(c),$$
proving that $f_k(T_k(c)) < f_k(c)$.
Proof of part (e). Follows from part (a), the definition (27b),
$$R_k(x_j) := \max\{\|R_{jk}\| - w_j\, q_k(x_j),\, 0\}\, \frac{R_{jk}}{\|R_{jk}\|}\,,$$
and $q_k(x_j) = 1$ since $x_j = c_k$.
Proof of part (f). This was shown in part (a).
Proof of part (g).
$$T_k(c) - x_i = c + h_k(c)\, R_k(c) - x_i = h_k(c) \sum_{j \neq i} \theta_{kj}\,(x_j - c) + \left( h_k(c)\, \theta_{ki} - 1 \right)(x_i - c),$$
with $h_k(c)$ as in (35) and $\theta_{ki}$ as in (D2). Since $x_i$ is not optimal, we have, from (27c),
$$\left\| \sum_{j \neq i} \theta_{kj}\,(x_j - x_i) \right\| > w_i\, q_k(x_i),$$
from which it follows, for $c$ sufficiently close to $x_i$, that there is an $\epsilon > 0$ with
$$\|x_i - T_k(c)\| \geq (1 + \epsilon)\, \|x_i - c\|.$$
If $\|x_i - T_k(c)\| \leq \delta$, then
$$\|x_i - T_k^2(c)\| > (1 + \epsilon)^2\, \|x_i - c\|, \text{ etc.}$$
Since $\|x_i - c\| > 0$, $(1 + \epsilon)^t\, \|x_i - c\| > \delta$ for some positive integer $t$, and hence
$$\|x_i - T_k^s(c)\| > \delta \text{ for some positive integer } s, \quad \text{and} \quad \|x_i - T_k^{s-1}(c)\| \leq \delta.$$
Proof of part (h). Using (25a),
$$\frac{\lambda_{ki}}{\|x_j - c\|} = \frac{\dfrac{w_i\, q_k(x_i)}{\|x_i - c\|}}{\|x_j - c\| \displaystyle\sum_{m=1}^{N} \frac{w_m\, q_k(x_m)}{\|x_m - c\|}} = \frac{\dfrac{w_i\, q_k(x_i)}{\|x_i - c\|}}{w_j\, q_k(x_j) + \|x_j - c\| \displaystyle\sum_{m \neq j} \frac{w_m\, q_k(x_m)}{\|x_m - c\|}}\,,$$
$$\therefore\quad \lim_{c \to x_j} \frac{\lambda_{ki}}{\|x_j - c\|} = \frac{w_i\, q_k(x_i)}{w_j\, q_k(x_j)\, \|x_i - x_j\|}\,, \quad i \neq j.$$
Finally, by (D3),
$$\lim_{c \to x_j} \frac{\|T_k(c) - x_j\|}{\|x_j - c\|} = \frac{1}{w_j\, q_k(x_j)} \left\| \sum_{i \neq j} w_i\, q_k(x_i)\, \frac{x_i - x_j}{\|x_i - x_j\|} \right\| = \frac{\|R_{jk}\|}{w_j\, q_k(x_j)}\,, \quad \text{by (27c)}.$$
Proof of part (i). With the possible exception of $c_k^0$, the sequence $\{c_k^r\}$ lies in the convex hull of the data points, a compact set. By the Bolzano–Weierstrass theorem, there exists a convergent subsequence $\{c_k^\ell : \ell = 1, 2, \cdots\}$, such that $\lim_{\ell \to \infty} c_k^\ell = c_k^\infty$. We prove that $c_k^\infty = c_k^*$.
If $c_k^{\ell+1} = T_k(c_k^\ell) = c_k^\ell$ for some $\ell$, then the sequence repeats from that point and $c_k^\infty = c_k^\ell$. Since $c_k^\ell$ is not a data point, $c_k^\infty = c_k^*$ by Theorem 1(c).
Otherwise, by Theorem 1(d),
$$f_k(c_k^0) > f_k(c_k^1) > \cdots > f_k(c_k^r) > \cdots \geq f_k(c_k^*). \eqno{(D4)}$$
Hence
$$\lim_{\ell \to \infty} \left[ f_k(c_k^\ell) - f_k(T_k(c_k^\ell)) \right] = 0. \eqno{(D5)}$$
By continuity, we then have
$$f_k(c_k^\infty) - f_k(T_k(c_k^\infty)) = 0. \eqno{(D7)}$$
Therefore, by Theorem 1(d), $c_k^\infty = T_k(c_k^\infty)$. If $c_k^\infty$ is not a data point, then $c_k^\infty = c_k^*$ by Theorem 1(c). In any event, $c_k^\infty$ lies in the finite set of points $\{x_1, \dots, x_N, c_k^*\}$, where $c_k^*$ may be a data point.
The only case that remains is $c_k^\infty = x_j$ for some $j$. If $x_j \neq c_k^*$, we first isolate $x_j$ from the other data points (and from $c_k^*$ if it is not a data point) by a $\delta$–neighborhood that satisfies Theorem 1(g). Then it is clear that we can choose our subsequence $c_k^\ell \to x_j$ such that $\|x_j - T_k(c_k^\ell)\| > \delta$ for all $\ell$. This means that the ratio $\dfrac{\|x_j - T_k(c_k^\ell)\|}{\|x_j - c_k^\ell\|}$ is unbounded. However, this contradicts Theorem 1(h). Hence $x_j = c_k^*$, and the theorem is proved. $\square$
Cem Iyigun, Department of Industrial Engineering, Middle East Technical University, Ankara, Turkey
E-mail address: iyigun@ie.metu.edu.tr
Adi Ben-Israel, RUTCOR–Rutgers Center for Operations Research, Rutgers University, 640 Bartholomew
Rd., Piscataway, NJ 08854-8003, USA
E-mail address: adi.benisrael@gmail.com