The Capacitated K-Center Problem: Yoram Cs .Umd
The Capacitated K-Center Problem: Yoram Cs .Umd
(Extended Abstract)
1 Introduction
The basic K-center problem is a fundamental facility location problem [17] and
is defined as follows: given an edge-weighted graph G = (V, E) find a subset
S C V of size at most K such that each vertex in V is "close" to some vertex in
S. More formally, the objective function is defined as follows:
rain m a x m i n
S C V u E V v6S
d(u, v)
where d is the distance function. For example, one may wish to install K fire
stations and minimize the m a x i m u m distance (response time) from a location to
its closest fire station. T h e problem is known to be NP-hard [8].
An approximation algorithm with a factor of p, for a minimization problem,
is a polynomial time algorithm that guarantees a solution with cost at most p
times the optimal solution. Approximation algorithms for the basic K-center
problem have been very well studied and are known to be optimal [7, 9, 10, 11].
These schemes present natural methods for obtaining an approximation factor of
2. Several approximation algorithms are known for interesting generalizations of
* Research supported by NSF Research Initiation Award CCR-9307462, and NSF CA-
REEK Award CCR-9501355. Email addr: -~alair@cs.umd.edu
** Research supported by NSF CAREER Award CCR-9501355. Email addr:
yoram~cs .umd.edu
153
the basic K-center problem as well [3, 10, 16]. The generalizations include cases
when each node has an associated "cost" for placing a center on it, and rather
than limiting the number of centers, we have a limited budget [10, 16]. Other
generalizations include cases where the vertices have weights and we consider
the weighted distance from a node to its closest center [3, 16].
Recently, a very interesting generalization that we call the capacitated K-
center problem was studied by Bar-Ilan, Kortsarz and Peleg [1]. The input spec-
ifies an upper bound on the number of centers K, as well as a m a x i m u m load
L. We have to output a set of at most K centers, as well as an assignment of
vertices to centers. No more than L vertices m a y be assigned to a single center.
Under these constraints we wish to minimize the m a x i m u m distance between a
vertex u and its assigned center r
min maxd(u, r
SC_VuEV
such that
I{u Ir = v}l < t W S
where
r
Bar-Ilan, Kortsarz and Peleg [1] gave the first polynomial time approxima-
tion algorithm for this problem with an approximation factor of 10. Various
applications for capacitated centers were first mentioned in [14, 15]. A slightly
different problem, where the radius is fixed, and one has to minimize the number
of centers, shows up in the Sloan digital sky survey project [13].
The total demand assigned to any center should not exceed L. Using the method
in [1], we can obtain an approximation factor of 13 for the version with costs.
(Each vertex has a cost for placing a center on it, and we are working with a
fixed budget.) In Section 4 we study some other variants of this problem.
We first give a high-level description of the algorithm. We may assume for sim-
plicity that G is a complete graph, where the edge weights satisfy the triangle
inequality. (We can always replace any edge by the shortest path between the
corresponding pair of vertices.)
High-Level Description
The algorithm uses the threshold method introduced by Edmonds and Fulk-
erson in [4] and used for the K-center problem by Hochbaum and Shmoys [9, 10].
Sort all edge weights in non-decreasing order. Let the (sorted) list of edges be
el, e2,...em. For each i, let the threshold graph Gi be the subgraph obtained
from G by including edges of weight at most w(ei). Run the algorithm below for
each i from 1 to m, until a solution is obtained. (Hochbaum and Shmoys [10]
suggest using binary search to speed up the computation. If running time is not
a factor, however, it does appear that to get the best solution (in practice) we
should run the algorithm for all i, and take the best solution.) In each iteration,
we work with the subgraph Gi and view it as an unweighted graph. Since Gi is
an unweighted graph, when we refer to the distance between two nodes, we refer
to the number of edges on a shortest path between them. In iteration i, we find
a solution using some number of centers. If the number of centers exceeds K, we
prove that there is no solution with cost at most w(e~). If the number of centers
is at most K, we show that the maximum distance in Gi between a vertex and
its assigned center is at most five. This gives an approximation factor of 5.
First find a maximal independent set I in G~. (G~ is the graph obtained by
adding edges to Gi between nodes that have a common neighbor.) This technique
was introduced by Hochbaum and Shmoys [9, 10] and has been used extensively
to solve K-center problems. We refer to a node in the maximal independent set
as a monarch. The algorithm also constructs a "tree" of monarchs which will
be used to assign vertices to centers. There are two key differences between our
algorithm and the one presented in [1]:
Each monarch has an empire that consists of a subset of vertices within the
immediate neighborhood of the monarch in G~. When a monarch is added to
the maximal independent set, all such vertices that do not currently belong to
an empire are added to this monarch's empire. The algorithm also constructs
a tree of monarchs as the monarchs are selected. This tree has the property
that an edge in the tree corresponds to two monarchs whose distance in Ci
is exactly three. Each monarch then tries to collect a domain of size L - - a
subset of vertices that are close to the monarch and assigned to it. In doing so,
a monarch may grab vertices from other empires if none are available in its own
empire. After this process is complete there may still be unassigned vertices. We
then use the tree of monarchs to assign new centers to handle the unassigned
vertices. Nodes that are left unassigned in a particular empire may be assigned
to the parent monarch. Eventually, we will put additional centers at the monarch
vertices (recall that more than one center may be located at a single vertex).
ASSIONCENTERS (Gi).
1 SUCCESSFUL---- t r u e
2 Let n i be the number of vertices in connected component G i.
9 r
3 xf ()-]r [ n_i_z ]) > K t h e n SUCCESSFUL ---- false
else
4 for each connected component G~ of Gi d o
5 ASSIGNMONARCHS(G~).
6 ASSIGNDOMAINS(G~).
7 REASSmN(G~).
8 i f total centers used > K t h e n SUCCESSFUL ---- false
9 r e t u r n (SUCCESSFUL)
lO e n d - p r o c
nodes within distance two in Gi. To pick a new vertex to add to the independent
set, we pick an unmarked vertex that is adjacent to a marked vertex. Rather than
describing the algorithm in terms of (G~) ~ we work with G~, to separate the
nodes in a monarch's empire into level-1 and level-2 nodes. The level-1 nodes
are adjacent to the monarch, and the level-2 nodes are at distance two from
the monarch. We define El(v) and E2(v) to be the level-1 and level-2 nodes,
respectively, in v's empire. Thus the empire of v is El(v) U E2(v).
ASSIQNMONARCHS(G~).
1 Pick an arbitrary vertex v E G~ and set Q = {v}.
2 p(v) = n i l .
3 w h i l e Q has unmarked nodes do
4 Remove an unmarked node v from Q.
5 Make v a monarch and mark it.
6 f o r all u E Adj(v) do
7 if u is unmarked t h e n
8 Add u to El(v) and mark u
9 for all u E Adj(v) do
10 for all w E Adj(u) do
11 if w is unmarked t h e n
12 Add w to E2(v) and mark w
13 f o r all u E E:(v) do
14 for all w E Adj(u) do
15 if w is unmarked a n d w ~ Q t h e n
16 Set p(w) = v and add w to Q
17 e n d - p r o c
3. The distance between a monarch and any vertex in its empire is at most
two.
4. Each monarch (except for the root) has at least one edge to a level two
vertex in its parent monarch's empire. Moreover, each such level two vertex
has only one such neighbor that is a monarch. More generally, any vertex
can have at most one neighbor that is a monarch (Corollary of property 1).
One way to implement this procedure is by finding a min cost maximum flow
in an appropriate bipartite graph. The flow problem has a very simple structure
and there are only two types of costs on edges (0 and 1).
ASSIGNDOMAINS(G$).
1 Let M be the set of monarchs in G~.
2 Let E ' = {(m, v) I m E M, v E V, distance from m to v is at most two }.
3 Construct a bipartite graph G ~ -- (M, V, E').
4 Add vertices s and t. Add edges {(s, m) I m E M } and {(v, t) [ v E Y}.
5 For m E M, v E V, set capacities u(s, m) = L, u(m, v) = 1 and u(v,t) -- 1.
6 Cost of edge e(m, v) -- 1 if v is not in m's empire.
Cost of all other edges is 0.
7 Compute a min-cost m a x i m u m integral flow in G ~.
8 For each monarch rn, set
domain(m) = {v I v receives one unit of flow from m in G'}.
9 For each v, if v E domain(m) then define r = m.
10 e n d - p r o c
Proof. Assume that there is a vertex u in m's domain that is not in m's empire.
Let x be a vertex in m's empire that is not assigned to any domain. We can
change the flow function in G ~ and send one unit of flow from m to x instead of
m to u. This produces a m a x flow in G ~ of lower cost, a contradiction, rq
158
If there is no heavy monarch, all vertices belong to a domain and the algo-
rithm halts successfully.
Let KL be the number of light monarchs. Let nL be the number of vertices
belonging to the domains of light monarchs, and let n be the total number of
vertices.
T h e proof is simpler than the proof given in [1]. The following lemmas were
established in [1]. We repeat them for completeness.
Let s be the set of monarchs as defined in [1]. We repeat the definition here.
Let G0 be the set of light monarchs. Iteratively, add to G0 any monarch t h a t
contains a vertex in its domain that could have been assigned to a monarch in
G0.
Let G be the largest set Gj obtained in this process. Let ~v be the set of
remaining monarchs.
Proof. Suppose heavy monarch 0 was added .at iteration j. We can transfer a
node v to a center 01 in Gj-1. By a sequence of such transfers, we eventually reach
a center in G0 which has at most L - 1 nodes in its domain, and can absorb the
extra node. This corresponds to a higher flow, since the heavy monarch cab
absorb an unassigned node, a contradiction. []
Proof. Assume for contradiction that 8 is a center in the optimal solution that
covers b o t h e E E and u. If u does not belong to any domain, then since the
distance from u to e in Gi is at most two, we can perform a sequence of transfers,
eventually reaching a center in s which has at most L - 1 nodes in its domain,
and can absorb the extra node, resulting in a higher flow, a contradiction. If u
belongs to the domain of monarch f , then since the distance from e to u in Gi
is at most two, and the distance from u to f in Gi is at most two, it must be
the case that f E E as required by the lemma. El
159
We will prove that this is also an upper bound on the number of centers we
USe,
REAssmN(G~).
1 Let M be the set of monarchs in G~.
2 For each monarch m E M, set
unassigned(m) = ({m} U El(m) U E2(m)) \ (UueMdomain(u)).
3 Let T be the tree of monarchs in G~.
4 f o r all nodes m in T, set passed(m) = 0.
5 w h i l e T is not empty d o
6 Remove a leaf node m from T.
7 Let lunassigned(m)l + ]passed(m)[ = k'L + e
8 Allocate k' new centers at monarch m and assign k'L of the nodes
to them. For each such node v we define r =- m.
9 Assign the e remaining nodes to monarch m and for each such
node v define r = m, freeing UP to e nodes in domain(m).
10 Add the freed nodes to passed(p(m)).
11 Delete m from T.
12 e n d - p r o e
Proof. All centers except possibly light monarchs cover L nodes by construction.
The size of the domain of a light monarch does not decrease. Therefore the total
number of centers used is at most KL + [(n - nL)/n].
A node is either covered by the node from which it receives flow, or by the
parent of its original monarch. In the former case, it is at distance at most two
from the center that covers it. In the latter case the passed nodes are always
covered by their monarch's parent, i.e., they are only passed once. Thus the
distance from a node to the center that covers it is at most five (at most two to
its original monarch, and three more to the parent monarch). U
3 Algorithm for/C-Centers
We now consider the version where we are required to pick K distinct vertices as
centers. We use the same high level approach as in the previous case, but need
160
to pick the centers carefully. We are able to show that the algorithm obtains an
approximation factor of 6. (Obtaining a factor of 7 using the previous approach
is easy.)
The main difficulty lies in allocating centers to cover the vertices left unas-
signed by ASSmNDOMAINS. We first introduce some new notation.
Nodes in a monarch's empire are called its subjects. In ASSIGNMONARCHS,
each level-2 subject w of a monarch is brought in by a node u at distance 1
from the monarch. We define link(w) = u. Each monarch m (except the root)
was placed into Q by a unique level-2 subject of its parent monarch. This node
is called the spouse s(m) of monarch m (Fig. 1). Note that each monarch has
a unique spouse, and a node can be the spouse of at most one monarch (by
property 4).
We need to be careful when allocating new centers to cover unassigned nodes.
We require that: (1) a node can only be allocated as a center once; (2) monarchs
have sufficient available nodes to allocate centers for the nodes passed to them.
To ensure this we enforce the following rule. A monarch m a y allocate centers of
the following types only:
itself, but covers L other vertices. A vertex allocated as a center which is not
assigned to a center covers itself as well as L - 1 other vertices.
High-Level Description
Monarch m first assigns the nodes that are passed to it from children m o n a r -
chs, using T(m) to place full centers. Any nodes that are left over (at m~st L - 1),
t h a t were not assigned, are assigned to monarch m, displacing vertices that were
in m ' s domain, which become unassigned. At this stage there m a y be m a n y free
nodes in m ' s empire - nodes that were never part of a domain, as well as the
nodes t h a t were recently displaced from m ' s domain. Note that the nodes t h a t
were never p a r t of a domain do not have a center on them, while the ones t h a t
were displaced from m ' s domain could have centers placed on them (since they
m a y belong to other trees and m a y have been chosen as centers). However, there
are at most L - 1 of these, so any which do not get assigned within m ' s empire
can be passed to m ' s parent monarch. We now choose centers from the set of
nodes t h a t never belonged to any domain. In doing so, we m a y assign some of
the displaced nodes as well. T h e remaining unassigned vertices, including the
displaced nodes, are passed to m ' s parent monarch.
We now describe in detail how the passed vertices are handled. Group the
leaves of T(m), placing leaves u and v in the same group iff link(u) = link(v).
162
We process the groups in turn, processing the group whose c o m m o n link is s(m)
last, if such a group exists.
We assign passed nodes to centers by processing the groups in order. We
process a leaf node in a group as follows: We start by adding the vertices passed
to the leaf node to a list called pending. Whenever pending has at least L
vertices, we create a center and assign vertices to it. There are two things we
have to be careful about - if we create a center at a vertex that is free, we have
to assign the vertex to its own center. If a center is going to be assigned vertices
f r o m different groups we move the center up one level, from a leaf node to the
link node. Centers chosen in the last group are not assigned any vertices from
other groups, and so we never assign a center at s(m), but only at leaf nodes in
this group.
To ensure t h a t nodes in T(m) do not have centers placed on them, we process
the monarchs in T in the reverse of the order in which they were placed in T.
Note t h a t if a node v in T(m) belongs to the empire of another monarch m ~,
then m ~ m u s t have been placed in T before m, otherwise m would have placed v
in its own empire. We thus process m before m ~. If a center is placed at v by m
then v is assigned to itself in case it was free. When we eventually process m ~,
we are guaranteed that if v is free, it does not have a center placed on it.
For the last group, we proceed as above, except t h a t any nodes carried over
f r o m the last group are picked up by monarch m, possibly replacing some nodes
already assigned to m. These replaced nodes are either passed or allocated a
center in El(m)U E2(m) by monarch m. (Note t h a t if monarch m is light, then
the nodes are passed, if not then it does not grab nodes from other empires, so
it is safe to allocate centers a t / f o r them).
T h e proof is too long to include here but will be provided in the full paper.
Example
Before describing the pseudo-code, we discuss the example given in Fig. 3
in detail. We process the leaves from left to right. Each leaf is labeled with the
n u m b e r of vertices that are passed to it from the corresponding child monarch.
Assume t h a t L is 10. After we process ul, pending(m) has size 4, and no centers
are allocated. When we process u~, pending has size 12. Since we can allocate
a full center at u2, we do so. Since u2 is free, we assign us to itself and assign
9 (-- L - 1) vertices from pending(m) to us. T h e size of pending(m) is now 3.
In processing u3, we add 2 more vertices to pending(m). Before we process the
leaves in v2's group, we set X = vl. Observe that vertices passed to nodes in v l ' s
group are going to share a center with vertices passed to nodes in v2's group,
hence we "promote" the center one level up. When we process u4 and us, we
add 3 more vertices to pending(m) t h a t now has size 8. We then process u6,
adding 8 vertices to pending(m). Since we can allocate a full center, we allocate
a center at vl (current value of X ) . Since vl is currently unassigned, we assign
vl to itself and assign 9 ( = L - 1) vertices from pending(m) to it. T h e size of
163
REAsslCN(G~).
1 Let M be the set of monarchs in G~.
2 For each monarch m E M , set
unassigned(m) = ({m} U El(m) U E~(m)) \ (U~Mdomain(u)).
3 For each monarch m E M, let pending(m) be an ordered list, initially 0.
4 Let T be the tree of monarchs in G~.
5 f o r all nodes m in T d o
6 Define T(m) such that the group containing s(m) comes last.
7 f o r all leaf nodes v in T(m)) d o set passed(v) = 0.
8 f o r all nodes v in G~, let X(v) = {v} iff v is unassigned and 1~ otherwise.
9 f o r all nodes m in T, process them in reverse order of their insertion into T:
10 Set X = null.
11 f o r all children v of m in T(m) d o
12 f o r all children u of v in T(rn) d o
13 Append passed(u) to pending(m).
14 i f X = n u l l t h e n set X = u.
15 i f Ix(X)I + ]pending(m)] >_L then
16 Allocate a center at X and assign x(X) to it.
17 Assign the first L - Ix(S)l nodes from pending(m) to X and
remove them from pending(m).
18 Set X = null.
19 else i f X = u t h e n set X = null.
20 i f v ~ s(m) a n d X = n u l l t h e n
21 i f IX(v)[ + Ipending(m)[ = Z t h e n
22 Allocate a center at v and assign X(V) to it.
23 Remove all nodes from pending(m) and assign them s v.
24 else set X = v.
25 Let displaced(m) = Idomain(m)l + Ipending(m)l _ L.
26 Assign all nodes in pending(m) to m,
possibly displacing nodes in domain(m).
27 Let lunassigned(m)] + displaced(m) = k'L + e.
28 Allocate k I new centers at nodes in unassigned(m).
29 Assign k'L of the nodes to them, assigning nodes in unassigned(m) first.
30 Add the e remaining nodes to passed(s(m)).
31 Delete m from T.
32 e n d - p r o c
3.1 Capacltated C e n t e r s w i t h C o s t s
objective is to pick a set of centers whose total cost is at most K, such that the
radius is minimized. (Note that this is equivalent to the weighted capacitated
K-center problem in [1]. We use cosl here to distinguish from weights as defined
in, for example, [3, 16].) More formally, we are given a cost function c : V --* 7r +,
and we add the constraint ~ s c(v) < K to the statement of the capacitated
K-center problem.
Bar-Ilan, Kortsarz and Peleg gave the first polynomial time approximation
algorithm for this problem with an approximation factor of 21. Their technique,
which involves finding a minimum-cost perfect matching in a bipartite graph,
generalizes to finding a 2p + 1 solution given a p-approximation algorithm for
the capacitated K-centers problem. It therefore yields an approximation algo-
rithm with an approximation factor of 13 when combined with our algorithm for
capacitated K-centers.
4 Remarks
References
1. 3. Bar-Ilan, G. Kortsarz and D. Peleg, "How to allocate network centers", 1. Al-
gorithms, 15:385-415, (1993).
2. T. Cormen, C. Leiserson, and R. Rivest, Introduction to Algorithms, The MIT
Press, 1989.
3. M. Dyer and A. M. Frieze, "A simple heuristic for the p-center problem", Opera-
tions Research Letters, Vol 3:285-288, (1985).
4. J. Edmonds and D. R. Fulkerson, ~Bottleneck extrema', Journal of Combinatorial
Theory, Vol 8:299-306, (1970).
5. T. Feder and D. Greene, "Optimal algorithms for approximate clustering", Proc.
of the 20 th ACM Symposium on the Theory o] Computing, pages 434-444, (1988).
165
6. U. Feige, "A threshold of In n for approximating set cover", Proc. o]the 28 th ACM
Symposium on the Theory of Computing, pages 314-318, (1996).
7. T. Gonzalez, "Clustering to minimize the maximum inter-cluster distance", The-
oretical Computer Science, Vol 38:293-306, (1985).
8. M. R. Gaxey and D. S. Johnson, "Computers and Intractability: A guide to the
theory of NP-completeness", Freeman, San Francisco (1978).
9. D. Hochbaum and D. B. Shmoys, "A best possible heuristic for the k-center prob-
lem", Mathematics of Operations Research, Vol 10:180-184, (1985).
10. D. Hochbaum and D. B. Shmoys, "A unified approach to approximation algo-
rithms for bottleneck problems", Journal of the ACM, Vol 33(3):533-550, (1986).
11. W. L. Hsu and G. L. Nemhauser, "Easy and hard bottleneck location problems",
Discrete Applied Mathematics, Vol 1:209-216, (1979).
12. O. Lund and M. Yannakakis, "On the hardness of approximating minimization
problems", Journal o] the ACM, Vol 41(5):960-981, (1994).
13. R. Lupton, F. M. Maley, and N. E. Young, "Data collection for the Sloan digi-
tal sky survey - a network-flow heuristic", Proc. of the 7 th Annual ACM-SIAM
Symposium on Discrete Algorithms, pages 296-303, (1996).
14. H. L. Morgan and K. D. Levin, "Optimal program and data locations in computer
networks", Communications of the ACM, Vol 20:315-322, (1977).
15. K. Murthy and g. Kam, "An approximation algorithm to the file allocation prob-
lem in computer networks", Proc. of 2 "d ACM Symposium on Principles of
Database Systems, pages 258-266, (1983).
16. 3. Plesnik, "A heuristic for the p-center problem in graphs", Discrete Applied
Mathematics, Vol 17:263-268, (t987)
17. C. Toregas, R. Swain, C. Revelle and L. Bergman, =The location of emergency
service facilities", Operations Research, Vol 19:1363-1373, (1971).
r" . . . . . . . . . . .
, 1 ]
I I
I I
/1;
tink(13) = I113' ' link(s)
IN I ,~14; '(14)----9
F . . . . . . . . . . -I F . . . . . . . . . . .
I I
I I
I I
I I
link(s(mq~ i t
_ _ _" " _ "_'_ _ _ J L ~ __-~ ~_(2)~
~ .......... n . . . . . 7 Monarch rn's empire
l I x,',. /
I I I I
,
......~ ,(~') z I j l_}_,,
I L L ~~_
_.~ :
. . . .
, . , , . , . ,
I t I I I l I I
i~/~ n I I I I I I
I I I I I I I
I I I I I I I
I I I I I I I
I I I I I I I
I I I I I I I
I I I I I I I
L---.J L----J L---2 L--.J L---J L__2 L___I L___I L__-I L__-I
L=10
(E) = unassigned vertex
~-] =centers
~1 U2 ~3 U4 ?/,5 ~6 ?t,7
4 8 2 1 2 8 5