Rozprawa
Rozprawa
(in-)approximability.
A doctoral dissertation
by
Jan Marcinkowski
Wydział Matematyki
i Informatyki
Wrocław-Tylicz, ad
ii
Streszczenie
W niniejszej rozprawie doktorskiej prezentujemy wyniki dotyczące tematu aproksymowalno-
ści wybranych optymalizacyjnych problemów kombinatorycznych:
Abstract
In this thesis we present the results on approximability of selected combinatorial optimisation
problems:
. In Chapter we are showing that a trivial 2-approximation algorithm for the Minimum
Maximal Matching problem is the limit of what can be expected (also for bipartite
graphs). Existence of a (2 − γ)-approximation algorithm (for any constant γ) would
contradict the Unique Games Conjecture (or its stronger variants, in the bipartite case).
. In Chapter we present an approximation algorithm for deciding winners in the Pro-
portional Approval Voting system (and an infinite family of related electoral sys-
tems). We additionally show, that a better polynomial-time approximation would im-
ply P = NP, and a better approximation running in time parametrised by the committee
size would contradict Gap-ETH.
. In Chapter we construct a constant-factor approximation algorithm for the Capaci-
tated k-Median problem, working in time parametrised by k. The problem in ques-
tion cannot be solved exactly within such a complexity (unless FPT = W[2]), and a
polynomial-time approximation—although not strictly out of question—remains out
of reach.
Contents
Notations v
Introduction
Computational tasks and their complexity . . . . . . . . . . . . . . . .
Approximation algorithms . . . . . . . . . . . . . . . . . . . . . . . . .
This dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Gap problems
Label Cover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Unique Games and variations . . . . . . . . . . . . . . . . . . . . . . .
Bibliography
Notations
A small change in the description of the task may however put it in a very different
place on the Computational Complexity map.
Introduction
One can verify that if G has a vertex cover (subset of V hitting every edge) of
V
size k, then removing k edges from EH suffices to make digraph H acyclic (this
statement is called the completeness of the reduction). Moreover, if H can be made
acyclic by removing l arcs, then there is a vertex cover of l vertices in graph G
(this states the soundness of our reduction).
. Approximation algorithms
Having built a web of reductions like the one above, Computer Scientists began to
look for a way around the common difficulty (NP-hardness) of problems like MAS or VC.
One way around is approximation algorithms. Such an algorithm has an approximation
ratio of ρ if it returns solutions whose value is within ρ of the optimal value for the
input instance.
ratio’ for a particular problem, or ‘whether such a constant approximation ratio can
be even guaranteed’ by a polynomial-time algorithm. Many general techniques have
been developed for constructing approximation algorithms, but in the end, the exact
answer must be given individually for each computational task.
Giving negative answers to approximability questions has been enabled by a chain
of results (starting with the PCP theorem [Aro+]) and hypotheses (most notably
the Unique Games Conjecture [Kho]). They allow us to construct reductions like
above, but proving inapproximability, i.e. impossibility of approximation within a
certain ratio. For example, an inapproximability result is known for the Maximum
Acyclic Subgraph.
. This dissertation
In this thesis we tackle the approximability question for three combinatorial problems,
each of very different nature. In Chapter , we describe our hardness results for a
problem called Minimum Maximal Matching. It is a somewhat similar story to the
Maximum Acyclic Subgraph—there is a trivial approximation algorithm with ratio
equal to 2, and a considerable effort had been invested into finding better approxi-
mation algorithms (i.e., with smaller ratio). In our papers [DLM] (joint work with
Szymon Dudycz and Mateusz Lewandowski; published at IPCO ) and [DMM]
(joint work with Szymon Dudycz and Pasin Manurangsi; pending publication) we
prove such an algorithm impossible, conditionally on the Unique Games Conjecture.
Our results effectively close a long line of research on the approximability of Minimum
Maximal Matching. While proving the inapproximability, we need to look deeply
into previous results on hardness of approximation of VC and Balanced Biclique,
as well as the relationship between Small Set Expansion Hypothesis and the Unique
Games Conjecture. Our write-up may be helpful to those willing to understand them
but struggling throughout the (sometimes obscure) writing of the original papers.
In Chapter we revisit the task of counting votes in the Proportional Approval
Voting and related electoral systems. This problem—although dressed in the robes
of political sciences—is a computational task similar to Maximum Coverage, only
with a generalised objective function . In our paper [Dud+] (joint work with
Szymon Dudycz, Pasin Manurangsi and Krzysztof Sornat; published ad IJCAI ),
we give a new approximation algorithm (one for entire family of voting systems), and
Introduction
. Acknowledgements
I am very grateful to my PhD supervisor Jarosław Byrka, who encouraged me to
follow the path of science. He gave me freedom seldom experienced by graduate
students. I thank my teachers, most notably Katarzyna Paluch, who attracted me to
the world of Combinatorial Optimisation and also suggested that we work on MMM. I
thank my collaborators for the time and the joy, among them Szymon Dudycz, with
whom we have been learning maths together for the last twenty-four years.
I am indebted to my family. My parents have taught me to climb mountains—both
real and intellectual—and we still enjoy hiking together. My wife Patrycja not only is
my greatest companion but also a professional touching people’s lives with her daily
work, which makes me immensely proud. Our daughter Jadwiga makes sure I get up
from bed every day.
I would also like to thank the hard-working Polish taxpayer who mines coal,
strikes iron, and sows the land so that we can travel to conferences. I tried to pay this
debt back by teaching with dedication and enthusiasm.
Some of the results in this thesis are inapproximability proofs. When working on
hardness of approximation, we introduce and reduce between gap problems, in which
the algorithm is promised that the incoming instance will come from one of two sets
separated by a large gap in a specific measure. A good approximation algorithm with
regard to this measure would need to distinguish between instances coming from
these sets. In this chapter we present a selection of important gap problems, known
results and conjectures. They will serve as starting points in our reductions in the
next chapters.
. Label Cover
One problem that has turned out to be useful in the hardness of approximation is
Label Cover, in which we colour (label) the vertices of a bipartite graph, attempting
to maximise the number of satisfied edge constraints. The constraints are functions
projecting the colour of the left vertex onto the right vertex. Hence this problem is
sometimes called projection games .
Problem (Label Cover, LC(δ, t, R))
Given a bipartite, bi-regular graph G = (U ∪· V , E) with right degree t, two sets of
labels [L], [R], and for each edge e ∈ E a function constraint πe : [L] → [R], distinguish
between two cases:
• (YES) There exists a labelling l : U → [L], such that every right-side vertex v
is satisfied—i.e. for every two neighbours u1 , u2 of v we have π(u1 ,v) (l(u1 )) =
π(u2 ,v) (l(u2 )).
The game here is played between two provers and one verifier: The verifier samples a right-vertex
and asks the provers to colour its random two neighbours. The provers are not allowed to communicate
and do not know, which right-vertex had been selected, so if their colours agree, the verifier is satisfied
they likely can colour the graph to satisfy large fraction of the constraints.
Gap problems
• (NO) For every labelling l : U → [L] there are at most δ|V | weakly satisfied
right-side vertices v—i.e. there exist two distinct neighbours u1 , u2 for which
π(u1 ,v) (l(u1 )) = π(u2 ,v) (l(u2 )).
u u
u
u
u u
Figure : Unique Label Cover can be translated from one variant to another with a
marginal loss in .
Conjecture (Strong ugc [BK]). For every > 0 and δ > 0 exists R ∈ N such that
ULCE(, R, δ) is NP-hard.
Another conjecture we base our results on is the Small Set Expansion Hypothesis
(sseh), which concerns the hardness of finding well-separated parts of a graph.
Formally, in a d-regular graph G = (V , E), an edge expansion of a subset S of vertices
is defined as
|E(S, V \ S)|
Φ(S) = ,
d · min {|S|, |V \ S|}
where E(S, V \ S) is a set of edges crossing the cut (S, V \ S). The small set expansion
problem parametrised by δ, η ∈ (0, 1) is defined as follows.
The sseh, introduced by Raghavendra and Steurer [RS], hypothesises about hard-
ness of this problem.
Conjecture (sseh). For every η > 0 exists δ = δ(η) ∈ (0, 21 ] such that SSE(δ, η) is
NP-hard.
It turns out, surprisingly, that sseh is also stronger than ugc [RS]. We will see
the intuition behind this fact in the next chapter. No direct relationship is however
known between Strong-ugc and sseh.
here. In this thesis we follow the nomenclature of [Bha+], which refers to Conjecture as Strong
ugc
The Minimum Maximal Matching
problem
An edge dominating set of an undirected graph is a subset D ⊂ E of edges, such that
every edge must either be in D or adjacent to some edge in D. A Minimum Edge
Dominating Set (EDS) problem asks to find such a set of smallest cardinality. An
independent edge dominating set, in which no two edges have a common vertex, is
also called a maximal matching—the domination criterion means that no edge may be
added to it while maintaining the independence. In a natural algorithmic question of
Minimum Maximal Matching (MMM) we are asked to find such an independent edge
dominating set of smallest cardinality.
Both these problems also have counterparts in graphs with weights on edges, wEDS
and wMMM, where we aim to find the lightest (independent) edge dominating set rather
than smallest.
In this chapter we are going to present four inapproximability results which
appeared in two papers [DLM; DMM], however for the first of this results we
will present a different (simpler) proof then the published one.
. Related work
The first traceable discussion of our problems goes back to , when it was shown
that every inclusion-wise minimal edge dominating set can be transformed into a
maximal matching of the same cardinality (see Fig. (a)). The EDS and MMM problems
(in unweighted variants) are therefore equivalent [Gup].
v v
u u
s s
w w
(a) Every Edge Dominating Set can be transformed (b) The equivalence between MMM and
into a Maximal Matching of the same size: Let (u, v), EDS does not hold in the weighted
(u, w) be two adjacent EDS-edges. We find a neigh- graphs. The lightest maximum match-
bour s of w with no incident EDS-edge and replace ing weighs , while the Minimum
(u, w) with (w, s). If no such s exists, the edge (u, w) Edge Dominating Set weighs only .
can be dropped from the EDS.
Weighted variants The equivalence between EDS and MMM does not hold when
edges carry weights (see example in Fig. (b)). All of the aforementioned approxi-
mation algorithms (with the exception of Baker’s method, but including the trivial
-approximation) do not apply for wEDS and wMMM. A series of papers by Fujito, Parekh
and others concluded in a -approximation algorithm for wEDS [Car+; FN; Par].
wEDS has been shown to be as hard to approximate as weighted Vertex Cover [FN;
. Related work
Theorem
Assuming Unique Games Conjecture (Conjecture .), there is no polynomial-time
algorithm that approximates MMM (or equivalently EDS) on unweighted graphs within
factor 2 − ε for any constant ε > 0, unless P = NP.
A special case of MMM, particularly natural for matching experts is, when the input
graph is bipartite. We were not able to carry our negative bound directly to the
bipartite graphs, but we managed to obtain three ‘weaker’ results.
Theorem
Assuming Unique Games Conjecture (Conjecture .), there is no polynomial-time
algorithm that approximates MMM (or equivalently EDS) on unweighted, bipartite graphs
within factor 43 − ε for any constant ε > 0, unless P = NP.
Theorem
Assuming Strong Unique Games Conjecture (Conjecture .), there is no polynomial-
time algorithm that approximates MMM (or equivalently EDS) on unweighted, bipartite
graphs within factor 2 − ε for any constant ε > 0, unless P = NP.
Theorem
Assuming Small Set Expansion Hypothesis (Conjecture .), there is no polynomial-time
algorithm that approximates MMM (or equivalently EDS) on unweighted, bipartite graphs
within factor 2 − ε for any constant ε > 0, unless P = NP.
The rest of this chapter will be devoted to presenting the proofs of these theorems.
A fact we were unaware of when working on our results. But it makes the unweighted MMM all the
more intriguing.
The Minimum Maximal Matching problem
. Our method
Our proofs take advantage of a close relationship between the Independent Set
problem and MMM. In short, the set of vertices left unmatched by a maximal matching
must form an independent set. In [KR] it was proven that it is NP-hard (under
ugc) to distinguish (YES) graphs with independent set of size ( 12 − )|V | from (NO)
those with no independent set of size |V |. In the proof of Theorem presented in our
paper [DLM] this reduction was modified by only adding edges, so that in the (YES)
case there was a perfect matching between vertices not included in the independent
set. This strategy aimed for simplicity, as typically, when proving correctness of
gap reductions, analysing the (NO) case is significantly more difficult than the (YES)
case. When we only add edges, no independent set can be created, so there is no
need to look at the (NO) part at all. In the proof of Theorem in this thesis we do
the same trick, just starting from a different construction for Vertex Cover [BK],
which will have an advantage of producing unweighted graphs right away (in the
paper [DLM] we had to exploit the structure of the resulting weighted graph to get
rid of the weights).
In bipartite graphs, we are aiming to employ the same strategy. However, while
maximum independent sets in bipartite graphs can be found in polynomial time,
we notice that vertices left out by any maximal matching must form a balanced bi-
independent set—an independent set with the same number of vertices on both sides
of bipartition. We must hence look at existing hardness of approximation proofs for
balanced bi-independent set [Bha+; Man] and carefully add edges to them.
. General graphs
At the heart of many ugc-based reductions lies a dictatorship test. A dictatorship
is a function f : {0, 1}R → {0, 1} that depends only on one coordinate, so f (x) = xi or
R
f (x) = xi . Dictatorships are an error-correcting code—there are 22 functions, but
only 2R dictatorships. Importantly, there are ways to only query f on some inputs
and decide whether it is (similar to) a dictatorship, and decode it if it is. In an
inapproximability reduction we are given t functions f (1) , . . . , f (t) and must test if
they all are dictatorships of the same coordinate.
The test F,t coined by Bansal and Khot [BK] works as follows:
. Accept if all f (l) -s are equal to some b on all inputs in Cx,S and b on all inputs in
Cx,S .
Completeness If all the f (l) -s are the dictatorship of the same coordinate i, this
coordinate will be included in the set S with probability . When i < S, the test will
pass.
Soundness For any function f : {0, 1}R → {0, 1} and any integer s 6 R one can deter-
mine the set Ls [f ] ⊂ [R] of s most low-degree influential coordinates . Bansal and Khot
prove the following
Theorem ([BK, Theorem ., simplified])
For every , δ > 0 there are t, s ∈ N such, that if the test F,t accepts a collection f (1) , . . . , f (t)
of functions {0, 1}R → {0, 1} (for any R) with probability larger than δ, then there are two
of the tested functions f (l1 ) , f (l2 ) and a coordinate i for which
i ∈ Ls [f (l1 ) ] ∩ Ls [f (l2 ) ].
Reduction
Input: A ULC(, t, R) instance G = (U ∪· V , E, {πe }e∈E ).
Output: An undirected graph H = (VH , EH ) with VH = V × 2(1−)R R R
× {0, 1}.
For any left-side vertex u ∈ U and its two (not necessarily distinct) neighbours
We will not delve into how this set is defined, as that would require us to explore the topic of
Fourier analysis of boolean functions, which we managed to avoid in our proofs. Luckily, we do not
need that to understand how dictatorship test translates into reductions.
The Minimum Maximal Matching problem
D E D E
v1 , v2 ∈ V we add an edge in H between v1 , Cx1 ,S1 , b1 and v2 , Cx2 ,S2 , b2 if
• b1 , b2 and exists x ∈ {0, 1}R such that x◦π(u,v1 ) ∈ Cx1 ,S1 and x◦π(u,v2 ) ∈ Cx2 ,S2 ;
or
• b1 = b2 and exists x ∈ {0, 1}R such that x ◦ π(u,v1 ) ∈ Cx1 ,S1 and x ◦ π(u,v2 ) ∈ Cx2 ,S2 .
n o
To see that this is indeed an independent set, define functions fu : x 7→ xl(u) .
u∈U
Every vertex in I S corresponds to an accepting run of F,t on t of these functions.
One can see, that when defining I S we could just as well have written xl(v) , b instead
n o
of xl(v) = b, which would have corresponded to fu : x 7→ xl(u) . There are in fact
u∈U
two independent sets of size (1 − 2)|VH | in the (YES) case. The freedom of picking
one of the two sets is what the ‘free bit’ from the title of [BK] refers to.
Lemma (Soundness [BK, Section .]). If G was a (NO) instance of ULC, no inde-
pendent set of size |H| exists in H.
Proof (sketch). Assume contrapositive, that there is an independent set S in H of size
2δ|H|. We will use this assumption to colour the vertices of ULC instance G: Imagine
that for every u ∈ U there is a boolean function f (u) : {0, 1}R → {0, 1}. For each such
function we can identify the set Ls [f (u) ] and pick one of the s colours at random.
For v ∈ V , let
VH [v] = v 0 , Cx,S , b ∈ VH | v 0 = v .
n o
By Markov’s inequality, the set V 0 of right vertices v for which |VH [v] ∩ S| > δ|VH [v]|
has size at least 12 |V |. Take any v ∈ V 0 and its t neighbours, u1 , . . . , u t . Functions
f (u1 ) ◦ π(u1 ,v) , . . . , f (ut ) ◦ π(ut ,v) pass the dictatorship test F,t with probability exceeding
x ◦ π is the vector x with coordinates permuted by permutation π(u,v) . Since the coordinates
(u,v)
correspond to colours, we can think that this permutation allows us to look at x from the perspective
of vertex v.
. General graphs
0
δ, so by Theorem there are two neighbours w, w0 such that Ls [f (w) ] ∩ Ls [f (w ) ] , .
Consequently, our random colouring weakly satisfies at least s12 |V 0 | > s22 |V | right
vertices of G.
Since Independent Set and Vertex Cover are complements in any graph, the
reduction gives us that it is NP-hard to distinguish between graphs with Minimum
Vertex Cover of size ( 12 + )|VH | and (1 − )|VH | for any .
Reduction
Input: A ULC(, t, R) instance G = (U ∪· V , E, {πe }e∈E ).
Output: An undirected graph H 0 = (VH 0 , EH 0 ) with VH 0 = V × 2(1−)R R R
× {0, 1}.
For any left-side vertex u ∈ U and its two (not necessarily distinct) neighbours
v1 , v2 ∈ V we add an edge in H 0 between v1 , Cx1 ,S1 , b1 and v2 , Cx2 ,S2 , b2 if exist
D E D E
Lemma . If G was a (YES) instance of ULC, H 0 has an independent set of size ( 12 −
2)|VH 0 |.
We would not have been able to prove the above lemma for the alternative definition
of I S in Lemma , because 1 = 0 6 1. In fact, the symmetry is broken—there are now
many internal edges in VH 0 \ I S.
(G) v ∈ V 0 , l(v) < S, xl(v) , b: We pair such a vertex with v, Cx0 ,S , b , where
xi = b if i = l(v),
0
xi =
xi
otherwise.
This concludes the analysis of the (YES) case and together with Lemma hands us
that Gap-MMM is NP-hard (assuming ugc), which proves Theorem .
Lemma . Assuming the Unique Games Conjecture, for any > 0 it is NP-hard to
distinguish between balanced bipartite graphs of 2n vertices:
Lemma . For any maximal matching M in H, exists a vertex cover C in G of size
|C| 6 23 |M|.
Proof. For every p ∈ P (M) we add V (p) to C. When p is a cycle, we add |p| vertices.
For a path we add at least 32 |p| (the longer the path, the higher the ratio). V (M) is a
vertex cover of H, so C = V (P (M)) is a vertex cover in G.
In the (NO) case graph G had no vertex cover smaller than (1 − )|V (G)|, so there is
no maximal matching of size smaller then |V (G)|(1 − ) 32 .
Figure : The gadget Tq for a prime number q > 2 (blue) is an undirected -regular
graph (counting edges, not endpoints). The modified gadget Tq0 is obtained by adding
directed arcs from i to q − i for i < Loq (red).
Definition (Tq ). The vertex set is Zq = {0, 1, . . . , q − 1}. There are three kinds of
(undirected) edges.
Let us additionally define the sets Hiq = {bq/2c + 1, . . . , q − 1} and Loq = {1, . . . , bq/2c}.
0 belongs to neither. As one can see, there are no edges in Tq between Hiq and Loq , so
they form a balanced bi-independent set (even though Tq is not bipartite). The graph
T = Tq⊗R has the vertex set ZRq (so every vertex is an R-element vector of numbers
from Zq , every coordinate corresponds to one label colour in Unique Label Cover)
and two vertices x and y are connected, if xi and yi are connected in Tq for every
coordinate i ∈ R. The reduction from [Bha+] is now defined as follows.
Reduction
Input: An ULCE(, R, δ) instance G = UG ∪· VG , EG , {πe }e∈E .
Output: A bipartite graph H = hUH ∪· VH , EH i.
Let UH = VH = VG × ZR q , so every vertex in H corresponds to one vertex in
VG and R numbers in Zq . For any u ∈ UG and its two neighbours v1 , v2 in G, we
−1
add an edge between (v1 , x) ∈ UH and (v2 , y) ∈ VH in EH whenever x ◦ π(u,v ) is 1
−1
connected to y ◦ π(u,v in T (x and y’s coordinates are permuted to the point of
2)
view of vertex u).
. Bipartite graphs II: From Strong ugc to MMM
It is easy to see that if the input graph came from the (YES) case, then there is a
bi-independent set of size ( 12 − − 1q )|UH ∪· VH | in H. To find it pick the labelling l
and the set V 0 ⊂ VG of vertices fully satisfied by the labelling l. For any v ∈ V 0 and x
such that x(l(v)) ∈ Loq pick (v, x) into the left side of the bi-independent set. Similarly,
pick (v, y) ∈ VH if y(l(v)) ∈ Hiq . The reader can see, that in fact two symmetrical
bi-independent sets exist in H. The (NO) case is analysed in the original paper.
Now it is time to introduce our modification. Let us define a new gadget Tq0 by
adding directed arcs {(i, q − i)}i<Loq (see Fig. ). The undirected edge between two
vertices is replaced by a pair of two directed arcs, the self-loop is replaced by a single
arc. Consequently, the graph T 0 = Tq0 ⊗R is also directed. Our reduction is similar to
the one above.
Reduction
Input: An ULCE(, R, δ) instance DG = UG ∪· VGE, EG , {πe }e∈E .
0
Output: A bipartite graph H 0 = UH ∪· VH , EH .
R
Let UH = VH = VG × Zq , like before. For any u ∈ UG and its two neighbours
v1 , v2 in G, we add an edge between (v1 , x) ∈ UH and (v2 , y) ∈ VH in EH whenever
−1 −1 0
there is an arc from x ◦ π(u,v ) to y ◦ π(u,v ) in T .
1 2
Lemma (completeness). If the graph G comes from the (YES) case of ULCE, there is a
bi-independent set BI S ⊂ UH ∪· VH of size |BI S| > ( 12 − − 1q )|UH ∪· VH | in H 0 . Moreover,
there is a perfect matching in H 0 between UH \ BI S and VH \ BI S.
Proof. Take l and V 0 as in Problem .. For every v ∈ V 0 and x ∈ ZR q such that
x(l(v)) ∈ Loq , we pick the vertex (v, x) ∈ UH into the set BI S. Similarly, for every
v ∈ V 0 and y with y(l(v)) ∈ Hiq , we pick (v, y) ∈ VH into BI S. To see that BI S is
indeed a bi-independent set pick any (v1 , x) ∈ UH ∩ BI S and (v2 , y) ∈ VH ∩ BI S. Since
v1 , v2 ∈ V 0 , for every u in their common neighbourhood in G, π(u,v1 ) (l(u)) = l(v1 )
and π(u,v2 ) (l(u)) = l(v2 ). There is no arc from x(l(v1 )) ∈ Loq to y(l(v2 )) ∈ Hiq in Tq0 , so
there is no edge between (v1 , x) ∈ UH and (v2 , y) ∈ VH . The size of BI S is |BI S| >
bq/2c R
(1 − )|VG | q q · 2 > 12 − − 1q |UH ∪· VH |.
Now we define a perfect matching M between the vertices not in BI S. First, for
every v ∈ V 0 we will match the vertex (v, x) ∈ UH where x(l(v)) < Loq with (v, x0 ) ∈ VH
The Minimum Maximal Matching problem
where
0 x(i)
when i , l(v),
x (i) =
q − x(i) when i = l(v).
For any vertex v ∈ VG \ V 0 let M ((v, x)) = (v, x). This edge exists since there was a loop
(i, i) in Tq0 for every i ∈ Zq .
Corollary. Assuming Strong Unique Games Conjecture, for any > 0, it is NP-hard given
a bipartite graph G = (U ∪· V , E) (with |U | = |V |) to distinguish between two cases.
• (YES) There is a bisection (VH0 , VH1 ) of VH s.t. |EH (VH0 )|, |EH (VH1 )| > ( 21 − )|EH |.
|VH |
• (NO) For every set T ⊂ VH of size |T | 6 2 , |EH (T )| 6 |EH |.
We are only modifying the second step, but we need to describe the first one as well,
as we are relying on the structure of the solution in the (YES) case. Before we are able
to do that, let us give an overview of the reduction, to explain the origin of its various
components.
(A, B) |X|
is a bisection of X if |A| = |B| = 2 and A ∪ B = X.
. Bipartite graphs III: Hardness based on sseh
The idea behind this reduction is that if there is a well-separated set S ⊂ V of size δ|V |
in G, then a vector of 1δ vertices should in expectation have exactly one vertex in S. We
call a vector v ∈ V R S-singular if |{v1 , . . . , vR }∩S| = 1. It is clear that if u ∈ U1 and v ∈ V1
are both S-singular and (u, v) ∈ E1 then the constraint π(u,v) is satisfied (see Fig. ).
Unfortunately, only 1e -fraction of all the vectors will be S-singular, so Reduction is
clearly a failed attempt, as the easy part—completeness—is not working.
There is a simple way to overcome this problem. If we pick a large constant k and
sample k distinct R-element vectors (blocks) of vertices, it is very likely that at least
one of these vectors will be S-singular. This idea stands behind the second attempt at
reducing sseh to ugc.
Equivalently we may write that π1,...,k (u) and v are neighbours in G⊗(R×k) .
The Minimum Maximal Matching problem
u v u v
(a) (b)
The reader may verify that the completeness of this reduction is satisfied. However,
it turns out that the reduction presented above does not yet provide the desired
soundness, and certain ‘noises’ have to be added.
For simplicity we describe them as a modification of the Reduction . Combin-
ing such a modified version of the reduction into blocks gives a correct instance of
ULC.
Every edge in E1 would now correspond to an edge in G⊗R , if not for the sets T in the
reduction. Alternatively, sets T could be generated by picking every coordinate with
probability . That would result in an edge-weighted instance of ULC—the weight of
an edge equal to the probability of sampling the set T associated with it.
. Bipartite graphs III: Hardness based on sseh
Reduction (combinatorial)
Let R = Θ( 1δ ), k, l be constants. Let also Ω = {0, 1, ⊥}. The hypergraph is built as
follows:
We will denote as πv,e the permutation that is used to show that v ∈ e. The
weight wH (e) of an edge is defined by a probabilistic process described below.
To understand how this construction is obtained, imagine first, how we might try to
reduce ugc to MUCHB. The same dictatorship test as in Section will be used, only in
a slightly different way. Every hyperedge now corresponds to an accepting run of the
dictatorship test. It contains all consistent inputs to the test.
Clearly, this reduction has the desired completeness. The soundness is not satisfied,
as our ULC instance could potentially be composed of two unconnected copies of a
(NO) instance. It would still be a (NO) instance, but a resulting hypergraph would
have a perfect bisection.
Instead our hypergraph H from Reductions and is constructed by running a
similar operation on an instance created in the previous subsection. To describe the
probabilistic construction we need to introduce some notation.
0 | x0
n o
• For a set D ⊆ [R] × [k] and x ∈ ΩR×k
β let C D (x) = x ([R]×[k])\D = x([R]×[k])\D .
Reduction (probabilistic)
. Sample A z V R×k and Ã1 , . . . , Ãl z TV (A).
. Sample B1 z G⊗(R×k) (Ã1 ), . . . , Bl z G⊗(R×k) (Ãl ) and B̃1 z TV (B1 ), . . . , B̃l z TV (Bl ).
. Sample x z ΩR×k β and D z SεT (Rk).
B0 ∈ Mx0 (B̃p ) .
o
Completeness—The bisection
We present the construction of sets VH0 and VH1 from Theorem (in the (YES) case).
They will be important in our further proceedings. The following additional notation
will be used:
• For a vector A ∈ V R×k let A1 , . . . , Ak ∈ V R be its blocks. Let W (A, x, j) denote the
j
set of all coordinates i in j-th block such that xi ,⊥ and Ai ∈ S, i.e., W (A, x, j) =
j
n o
i ∈ [R] | Ai ∈ S ∧ xi ,⊥ .
• Let j ∗ (A, x) ∈ [k] denote the first x-filtered S-singular block, i.e., smallest j with
|W (A, x, j)| = 1. Note that if such block does not exist, we set j ∗ (A, x) = −1.
• Let i ∗ (A, x) ∈ [R] × [k] be the only element in W (A, x, j ∗ (A, x)). If j ∗ (A, x) = −1, let
i ∗ (A, x) = −1. For a hypervertex v = (Av , xv ) we will sometimes write i ∗ (v) instead
of i ∗ (Av , xv ).
We will split all hypervertices into three sets: T0 , T1 , T⊥ . Let T⊥ be the set of hyper-
vertices (A, x), such that j ∗ (A, x) = −1. Then, for every hypervertex (A, x) not in T⊥ we
look at the value b = xi ∗ (A,x) and include it in T0 if b = 0 and in T1 if b = 1. It is shown
that |EH (T0 )|, |EH (T1 )| > ( 12 − )|EH |.
hypergraph are not disjoint if there exist two permutations π1 , π2 ∈ (SR )k satisfying the
following conditions
T and T1 are called T00 and T10 respectively in [Man].
0
The Minimum Maximal Matching problem
() x1 (π1 (i)) = x2 (π2 (i)) for i such that π1 (i) < D1 ∧ π2 (i) < D2 ;
p p
() ∃p1 , p2 ∈ [l] such that B11 (π1 (i)) = B22 (π2 (i)) for all i satisfying
() x1 (π1 (i)) 4 x2 (π2 (i)) for i such that π1 (i) < D1 ∧ π2 (i) < D2 ;
p p
() ∃p1 , p2 ∈ [l] such that B11 (π1 (i)) = B22 (π2 (i)) for all i satisfying
• VL , VR = EH .
• el ∈ VL and er ∈ VR are connected if they are not disjoint (i.e. (el , er ) ∈ Λ).
• VL , VR = EH .
• el ∈ VL and er ∈ VR are connected in E MMM if (el , er ) ∈ ∆.
. Bipartite graphs III: Hardness based on sseh
Completeness
In the (YES) case we need to find a maximal matching that leaves roughly half of the
vertices of G MMM unmatched. On the next few pages we will be constructing the
matching, as promised in the following lemma.
Lemma (Completeness). If G came from the (YES) case of sseh, then in G MMM there
is a maximal matching of size at most ( 12 + )|VL |.
Our matching M will be composed of two parts. In M1 , VL ∩ EH (T0 ) will be matched
with VR ∩ EH (T1 ) while VL ∩ EH (T1 ) and VR ∩ EH (T0 ) will form a bi-independent set
(this is why we needed to build the graph on an asymmetric relation). This will be
the larger part, of size ( 12 − )|VL |. In M2 , the vertices in VL \ (EH (T0 ) ∪ EH (T1 )) and
VR \ (EH (T0 ) ∪ EH (T1 )) will be paired.
Constructing M1 . The first part of our matching will be based on a bijection be-
1
D E
l
tween EH (T0 ) and EH (T1 ). For a hyperedge t = B , . . . , B , x, D we define its witness
indexes W (t) = i ∈ [R] × [k] | ∃p ∈ [l], x0 ∈ CD (x), i = i ∗ (Bp , x0 ) . Now we construct a
n o
Proof. Assume ∃i ∈ D0 ∩ W (t0 ). Let v = (A, x) ∈ t0 be such that πv,t0 (i ∗ (v)) = i. Define
x0 (i) = 1 and x0 (j) = x(j) for all j , i. The vertex v 0 = (A, x0 ) is still in t0 , because i ∈ D0 .
v 0 is however also in T1 because x0 (i ∗ (v 0 )) = 1, which contradicts t0 ∈ EH (T0 ).
Proof. For any p ∈ [l] and x0 ∈ CD (x), a vertex v = (Bp , x0 ) is in t, and, by consequence,
in T0 . Thus, x0 (i ∗ (Bp , x0 )) = 0. As i ∗ (v) < D, x(i ∗ (Bp , x0 )) = 0 as well.
Lemma . Let t0 be any hyperedge in EH (T0 ) and t1 = φ(t0 ). Let v = (A, x) be any
hypervertex in t1 . Then v ∈ T1 .
Proof. πv,t1 (i ∗ (A, x)) ∈ W (t1 ), so by () we have x1 (πv,t1 (i ∗ (A, x))) = 1. πv,t1 (i ∗ (A, x)) <
D0 , so x(i ∗ (A, x)) = x1 (πv,t1 (i ∗ (A, x))) = 1. Thus, v ∈ T1 .
Proof. Observe that for any hyperedge t0 ∈ EH (T0 ) we have W (t0 ) = W (φ(t0 )). This
follows from the fact that i ∗ ( · , x) does not distinguish between zeros and ones in x.
Now, it is easy to construct the inverse function to φ. φ−1 will be almost the same
as φ, except that in the vector x we will set x(i) to be 0 instead of 1. Thanks to the
observation, φ−1 changes the same coordinates in x as φ, so it is an inverse of φ.
The
n matching between VL ∩ o EH (T0 ) and VR ∩ EH (T1 ) uses φ in a natural way: M1 =
(t0 , φ(t0 )) | t0 ∈ VL ∩ EH (T0 ) . We need to show that all the edges of this matching are
in graph G MMM .
Lemma . Let t0 , t1 be any pair such that φ(t0 ) = t1 . Then (t0 , t1 ) ∈ ∆.
() is satisfied, as for all i either x0 (i) = x1 (i) or x0 (i) = 0 and x1 (i) = 1, which is
exactly the definition of x0 (i) 4 x1 (i).
() is satisfied, because B10 = B11 .
. Bipartite graphs III: Hardness based on sseh
x
=
x0
x =
I D D
Figure : Vectors from the proof of Lemma . x1 and x0 are identical everywhere
apart from positions in D0 ∪ D1 ∪ I. x00 is constructed from x0 by replacing ones with
noughts at indices of I. The vector xv 0 of v 0 ∈ t1 ∩ t00 must agree with x1 everywhere
except D1 , and with x00 everywhere except D0 .
Proof. Assume that there is t1 = B11 , . . . , Bl1 , x1 , D1 ∈ EH (T1 ) and t0 = B10 , . . . , Bl0 , x0 , D0
D E D E
such that (t1 , t0 ) ∈ ∆. We will show that there exists a hypervertex v ∈ t0 ∩ T1 , and
consequently t0 < EH (T0 ).
The relations ∆ and Λ are very similar. If (t1 , t0 ) ∈ Λ (t1 and t0 are not disjoint),
we have that t0 ∩ T1 , right away. Otherwise, from the definition of ∆ we have that
there are two permutations π1 , π0 such that ∀i < D1 ∪ D0 . x1 (π1 (i)) 4 x0 (π0 (i)) and the
inequality is strict for at least one i (since (t1 , t0 ) < Λ), we shall denote the set of these
Infact, we did not need to define M2 to prove the completeness: Since M1 matches almost half
of vertices from each side, and (as we are proving in Lemma ) another almost half forms a bi-
independent set, extending M1 greedily could extend it by at most |EH | edges. That would give us the
Lemma with the matching size (1 + 2)|VL |.
The Minimum Maximal Matching problem
indices with strict inequality as I (see Fig. ). By permuting the indices in t1 and t0
we will assume that π1 Dand π0 are bothEthe identity permutations. Let us consider
another hyperedge t00 = B10 , . . . , Bl0 , x00 , D0 , where x00 is defined as follows:
0 if i ∈ I,
x00 (i) = ()
x0 (i) otherwise.
Clearly (t1 , t00 ) ∈ Λ (by Property ), which means that t00 < EH (T0 ). Let v 0 = (Av 0 , xv 0 )
be such that v 0 ∈ t00 ∩ T1 (again, we will assume that πv 0 ,t00 is the identity permutation).
We will change v 0 slightly, to ‘reverse’ changes made in (). xv will be defined as
follows:
1 if i ∈ I,
xv (i) = ()
xv 0 (i) otherwise.
Soundness
We recall Lemma from [Man] about graph G BI S in the (NO) case of MUCHB.
Lemma ([Man, Appendix A]). If for every T ⊂ VH of size at most |V2H | we have
|EH (T )| 6 |EH |, then in G BI S there is no bi-independent set of size 2 · (|EH | + 1).
As the set of edges of graph G MMM is a superset of the set of edges of G BI S , analogous
statement is also true for G MMM .
Lemma (Soundness). If H came from the (NO) case of Problem , then in G MMM
there is no bi-independent set of size 2 · (|EH | + 1).
Altogether Lemmas and give us the following, from which Theorem immedi-
ately follows.
Corollary. Assuming Small Set Expansion Hypothesis, for any > 0 it is NP-hard given a
bipartite graph G = (U ∪· V , E) (with |U | = |V |) to distinguish between two cases.
. Notes
. Notes
The dictatorship test described in Section is based on the ‘It ain’t over till it’s over’
theorem conjectured by Friedgut and Galai and proved by Mossel, O’Donnell and
Oleszkiewicz [MOO]. The reduction from Unique Label Cover to Vertex Cover,
described in Section comes from a paper by Bansal and Khot [BK]. Its modifica-
tion proving Theorem , described in Section . is original, albeit based on the same
idea, as one in our paper with Szymon Dudycz and Mateusz Lewandowski [DLM]
published at IPCO —only in that paper we were modifying a different reduction
of Khot and Regev [KR]. The proof of Theorem described in Section comes
from our aforementioned paper [DLM].
The original reduction from ULCE to Balanced Bi-Clique, revisited in Section ,
was presented by Bhangale, Gandhi, Hajiaghayi, Khadenkar and Kortsarz [Bha+].
Our modification proving Theorem comes from our paper with Szymon Dudycz
and Pasin Manurangsi [DMM].
The reduction from SSE to ULC described in Section was presented by Raghaven-
dra and Steurer [RS] and expanded on in their paper with Tulsiani [RST]. An
accessible write-up of this reduction and other results on Small Set Expansion is
Steurer’s PhD thesis [Ste].
The reduction from SSE to Balanced Bi-Clique was given by Manurangsi [Man].
We have modified it with Dudycz and Manurangsi [DMM] to prove Theorem .
The Minimum Maximal Matching problem was (on separate occasions) suggested
to me and Szymon Dudycz by Katarzyna Paluch with expectation that we improve
upon the trivial -approximation algorithm.
Proportional Approval Voting
. Algorithmic question
Finding a winning committee in a w-Thiele electoral system becomes a computational
task. It turns out only to be solvable in polynomial time when w = (1, 1, 1, . . . ) .
As opposed to the rank-based systems like Single Transferrable Vote (used in Australia), where
voters order the candidates from the most to the least preferred.
. Algorithmic question
All previously studied w-Thiele electoral systems are defined by geometrically domi-
nant sequences w.
Theorem
Let w be any geometrically dominant sequence. There exists a polynomial-time αw -
approximation algorithm for w-Thiele rule where
x ∞ x
X X 1 X
αw = E wi = wi .
xzPoi(1) ex!
i=1 x=1 i=1
Forsimplicity, throughout this chapter we are assuming w1 = 1. Every vector can be scaled so
without any effect on the election.
α is the expected score of a voter, if the number of elected candidates from his ballot is sampled
w
from the Poisson distribution with mean one.
Proportional Approval Voting
Table : Our approximation ratios for w-Thiele rules. They are tight unless P=NP. All
listed w-Thiele rules were known to have (1 − 1/e)-approximation algorithm [NWF;
SFL]. To the best of our knowledge, none better than (1 − 1/e)-approximation
algorithm was known for any of these rules.
Theorem
Let w be any non-increasing sequence such that limi→∞ wi = 0. For any > 0 it is NP-
hard to compute an (αw + )-approximate solution to w-Thiele. Furthermore, assuming
Gap-ETH, such an approximation cannot be computed even in time f (k) · (|V | + |C|)o(k) .
specifying for each candidate fractionally, how much the candidate is selected.
|Av |
XX
scrw (x) = wl · min {1, max {0, xAv − l + 1}},
v∈V l=1
where xAv = c∈Av xc . Intuitively, for each voter v this function adds the first xAv
P
elements of the sequence w. The last element of the sequence may be counted in only
fractionally. With this definition, we now compute the optimum fractional solution
x∗ of the following linear program.
maximize scr w (x)
X
subject to xc = k ()
c∈C
x ∈ [0, 1]C
Remark (Solving the LP). The program () is not really an LP, as its objective function
is not linear. To solve it we need to rewrite it into an equivalent form:
|Av |
XX
maximize wl · yvl
v∈V l=1
X
subject to xc = k
c∈C
|Av |
X X
yvl 6 xc ∀v ∈ V
l=1 c∈Av
This can be done with a simple dynamic programming procedure: Let dpv [k][i]
denote the probability that exactly i candidates out of c1 , . . . , ck ∈ Av have been selected.
Clearly it can be computed using the identity
Lemma ([Bar+] ). Let x ∈ [0, 1]m be such that x1 + · · · + xm = τ. Then, for P any non-
negative sequence (a1 , a2 , . . . ), there exist q ∈ (0, 1) and x ∈ {0, q, 1} such that i xi0 = τ
0 m
and m m
X X X X
a P
l x̂∼x x̂i = l > a P
l x̂∼x0 x̂i = l .
l i=1 l i=1
. The approximation algorithm
Now, notice ∗
P that∗ the denominator scrw (x , v) in our ratio ρv only ∗depends on the
sum τ = c∈Av xc and not the way values are distributed among xc ’s. Hence, from
Lemma , ρv is minimised by x∗ that only takes three values—0, q and 1, although
we do not yet know what the value of q is .
Lemma . Suppose xc̃∗ = 1 for a candidate c̃ ∈ Av . Then the approximation ratio for the
voter v decreases after removing c̃ from the fractional solution x∗ . Formally, if xc̃∗ = 1, then
P∞ P∞
l=0 (wl Px̂∼x∗ [ c x̂c > l]) l=0 (wl Px̂∼x∗ [ c,c̃ x̂c > l])
P P
ρv = > .
w1 + · · · + wbτc + {τ } · wbτc+1 w1 + · · · + wbτ−1c + {τ } · wbτc
∆int
In fact we will just show that c̃
> 1. Starting with the numerator.
∆frac
c̃
∞
X X X
∆int
= w · P −
x̂ l x̂ l
> P >
c̃ l x̂∼x∗ c x̂∼x∗
c
l=0 c c,c̃
∞
X
X X
= wl · P x̂c > l − P ∗ x̂c > l + 1
x̂∼x∗ x̂∼x
l=0 c c
In [Bar+, Lemma .], τ is required to be an integer. It is however obvious from the proof that
this is not necessary.
Note that this reduction cannot be performed globally. It only makes sense for a single voter v and
his candidates Av , as applying Lemma to different sets {xc∗ }c∈Av may result in different q values for
the same candidate.
Proportional Approval Voting
∞
X
X X X
= wl · P ∗ x̂c = l = E ∗ w x̂c = w(τ) = ∆frac
x̂c > w E ∗ c̃
x̂∼x x̂∼x x̂∼x
l=0 c c c
E[w1 + · · · + wPoi(τ) ]
ρv > . ()
w1 + · · · + wbτc + {τ } · wbτc+1
bτc
X
E[w1 + · · · + wPoi(τ) ] = E [wy+1 + · · · + wy+Poi(1) ]
yzPoi(l−1)
()
l=1
+ E [wy+1 + · · · + wy+Poi({τ}) ].
yzPoi(bτc)
Before we attempt to prove it, it will be helpful to have the following lemma at hand.
EkzPoi(δ) [w1 +w2 +···+wk ]
Lemma . Let f (δ) = δ . For any δ ∈ (0, 1] f (δ) > f (1).
Proof. We start the proof with a simple calculus:
d EkzPoi(x) [g(k)]
Observation. For any g : N → R, = E [g(k + 1) − g(k)].
dx kzPoi(x)
E [w +w +···+w ]
Using the above we analyse the function f (δ) = kzPoi(δ) 1δ 2 k
. Its first derivative
is
df EkzPoi(δ) [δwk+1 ] − EkzPoi(δ) [w1 + · · · + wk ]
f 0 (δ) = = .
dδ δ2
We are going to show that f 0 (δ) 6 0 for δ ∈ (0, 1] by focusing on the numerator
0
fnum (δ) (the denominator δ2 is always going to be positive). Obviously fnum 0 (0) = 0.
0
dfnum
Moreover = δ E [wk+2 − wk+1 ] is never greater than 0 since the sequence w
dδ kzPoi(δ)
is non-increasing.
Proportional Approval Voting
Putting together (), (), Lemmas and allows us to conclude that the ratio ρv is at
least
w1 + · · · + wbτc + {τ } · wbτc+1
E[w1 + · · · + wPoi(1) ] ·
w1 w1 + · · · + wbτc + {τ } · wbτc+1
. Hardness of Approximation
The hardness of approximation proof is based on a reduction, much like those in
Chapter . We do however want this reduction, when applied to Theorem ., to
prove the second part of our Theorem . Our reduction must therefore not only be
complete and sound, but also must be aware of the size of the instance and parameter
k. We will prove the following:
Theorem
Let w be such that limi→∞ wi = 0. For any > 0, there exist δ > 0, t ∈ N and a reduction
that takes in an instance L = (A, B, E, [L], [R], {πe }e∈E ) of LC(δ, t, R) and produces an election
E = (V , C, {Av }v∈V ) in poly(|C|, |V |) time such that:
• (Completeness) If val(L) = 1, then there exists W ⊆ C such that scrw (W ) = |V |.
• (Soundness) If weak-val(L) < δ, then for all W ⊆ C, we have scrw (W ) = (αw + ) · |V |.
• (Size bound) |C| = |A| · L and |V | = |B| · t R .
• (Parameter) k = |A|.
. Hardness of Approximation
Plugging in Theorem to Theorems ., . immediately yields the desired hardness
of approximation in Theorem .
The rest of this section is devoted to the proof of Theorem . Our reduction is in
fact the same as Feige’s [Fei], which can be interpreted as the inapproximability for
w-Thiele electoral system with Chamberlin-Courant rules. Hence, on the hardness
front, our main contribution is in extending the analysis to work with more general
sequences of weights. To this end, we provide several helpful auxiliary lemmas in
Section .. Then, in Section ., we describe Feige’s reduction together with our
generalized analysis, and prove Theorem .
Lemma . For any sequence w such that limi→∞ wi = 0, and any γ, λ ∈ R+ exists a
boundary p0 = p0 (w, γ, λ) such that, for any positive p < p0 and any positive integer
N < λ/p we have
h i h i
E w1 + · · · + wBin(N ,p) < E w1 + · · · + wPoi(N p) + γ.
To this end we rely on a classic result that gives an upper bound on the total vari-
ation distance between the Binomial distribution and the corresponding Poisson
distribution.
Theorem ([LeC])
For any n ∈ N and p ∈ R>0 , ∞k=0 |P[Bin(n, p) = k] − P[Poi(np) = k]| < 4p.
P
0.01γ 2
Let i ∗ be the smallest positive integer such that wi ∗ 6 0.1γ/λ. We let p0 = i ∗ λ . We
∗
will separate the summands in () into two parts, based on whether k 6 k0 B b i λ·10 γ c.
Proportional Approval Voting
The distribution that will turn up in our reduction is in fact a bit different from
the Binomial distribution. It can be described as a grouped Binomial Distribution,
where the outcome of N coins is decided in l coin-flips.
Lemma . Let w be such that limi→∞ wi = 0. For any p ∈ R+ , and l non-negative
integers m1 , . . . , ml , we have
We will prove a much stronger statement, similar to that of Barman et al. (which we
recalled before as Theorem ).
. Hardness of Approximation
P, u v
∗ ∗ ∗ ∗ ∗ ∗
Figure : A Hypercube Set System. The set P5,1 contains all vectors with at fifth
position, so u < P5,1 but v ∈ P5,1 .
Proposition . For any convex function f , any parameter p ∈ [0, 1], and l non-negative
integers m1 , . . . , ml , we have Ez1 ,...,zl zBer(p) [f ( li=1 zi mi )] > E[f (Bin( li=1 mi , p))].
P P
Proof. Similarly to [SS, Theorem .A.], let the independent random variables
m
Z1 , . . . , Zl z Ber(p) and for each i ∈ [l], Xi1 = · · · = Xi i = Zi are copies of Zi . We then
have
m m
[f ( li=1 zi mi )] = E f X11 + · · · + X1 1 + · · · + Xl1 + · · · + Xl l Z1 , . . . , Zl
P h i
E
z1 ,...,zl zBer(p)
m m
> E f E[X11 |Z1 ] + · · · + E[X1 1 |Z1 ] + · · · + E[Xl l |Zl ]
h i
(Jensen’s Inequality)
= E[f (Bin( li=1 mi , p))].
P
Reduction
Input: An LC(δ, t, R) instance L = A ∪· B, E, [L], [R], {πe }e∈E .
Output: An election E = V , C, {Av }v∈V , k with V = B × [t]R , C = A × [L], and
k = |A|.
For each vertex b ∈ B we fix an arbitrary order on its t neighbours a1 , . . . , at . A voter
(b, u) ∈ V approves of a candidate (ai , σ ) ∈ C if ai is (i-th) neighbour of b, and
u(πai ,b (σ )) = i (u belongs to the set Pπa ,b (σ ),i in the (t, R)-hypercube set system).
i
This reduction is tuned by parameters t and δ which we can feed into Theorems .
and .. We pick them based on expected in Theorem as follows:
∗
• Let i ∗ denote the smallest positive integer such that wi ∗ < 0.1, and let ϑ = 10 i .
• Let γ = 0.7, p0 = p0 (w, γ, 10ϑ) be as in Lemma .
0.0001
• We select t = d2/p0 e and δ = ϑ3 t2
.
The reduction obviously runs in poly(|C|, |V |) time and produces an instance of size
as stated in Theorem .
Completeness
Suppose n that there is o an assignment φ : A → [L] that satisfies every b ∈ B. Pick
W = (a, φ(a)) | a ∈ A . We claim that scrw (W ) > |V |: Specifically, we will observe
that every voter had voted for at least one candidate in W . Consider a voter (b, u) ∈
B × [t]R and let a1 , . . . , at be b’s (ordered) neighbours. Since φ satisfies b, we have
π(a1 ,b) (φ(a1 )) = · · · = π(at ,b) (φ(at )) = r ∈ [R]. The value i B u(r) tells us, which of the t
candidates was supported by (b, u)—it was (ai , φ(ai )).
Soundness
Assume by contraposition that there exists a subset W ⊆ C of k candidates, such that
scrw (W ) > (αw + ) · |V |. We will show that weak-val(L) > δ.
For every b ∈ B, let V [b] = {b} × [t]R . Let also degW (b) B |W ∩ (Γ (b) × [L])|, where
Γn (b) denotes the set o of neighbours of b in L. For every a ∈ A, let Wa ⊆ [L] denote
σ ∈ [L] | (a, σ ) ∈ W . Finally, let D = 10ϑt.
We divide the set B into three sets:
• Let B2 denote the set of b ∈ B such that degW (b) 6 D and, for all distinct neigh-
bours a1 , a2 of b and all σ1 ∈ Wa1 , σ2 ∈ Wa2 , we have π(a1 ,b) (σ1 ) , π(a2 ,b) (σ2 ).
• Let B3 = B \ (B1 ∪ B2 ).
We will next argue that |B3 | > Ωt, (|B|), which will then allow us to ‘decode’ back
the assignment to the Label Cover instance L . To do so, we start by bounding the
contribution of B1 to scrw (W , B1 )—this may be seen as an argument that it does not
make sense for the committee W to satisfy the voters very unevenly. The main idea is
that when degW (b) is large, the average score per candidate becomes small, which gives
the following upper bound on the desired quantity:
Proof. Let us fix a vertex b ∈ B1 . Notice that each candidate (a, σ ) ∈ Γ (b) × [L] is
approved by exactly |V t[b]| voters in V [b] (and voters in V [b] do not approve P∞ any
candidate outside Γ (b) × [L]). As a result, scrw (W , V [b]) is of the form i=1 ri wi for
|V [b]|
some sequence (ri )i∈N of non-negative integers that satisfies ∞ i=1 ri = degW (b) · t
P
and |V [b]| > r1 > r2 > · · · (r1 voters having at least one candidate P∞ in W , r2 having at
least two, etc.). It is simple to see that among such sequences i=1 ri wi is maximized
when r1 = · · · = rm = |V [b]| for m = degW (b)/t , rm+1 = |V [b]| · {degW (b)/t } and rm+2 =
rm+3 = · · · = 0. In other words, scrw (W , V [b]) is upper bounded by
m ! X m
X deg W (b) · |V [b]|
|V [b]| · w + |V [b]| · {deg (b)/t } · w · ·w
i W m+1 6 i
tm
i=1 i=1
degW (b) · |V [b]|
!
∗
(From definition of i ) 6 · (i ∗ + (m − i ∗ )0.1)
tm
deg (b) · |V [b]|
!
(From our choice of D, ϑ; i ∗ 6 0.1m) 6 W
(0.2m)
tm
degW (b) · |V [b]|
!
= (0.2) .
t
As
we will see later, the bound degW (b) 6 D for b ∈ B3 will play the same role as number s in the
decoding in Lemma ..
Proportional Approval Voting
|V | X |V | X
= 0.2 · · degW (b) 6 0.2 · · degW (b)
|B| · t |B| · t
b∈B1 b∈B
|V | X |V |
= 0.2 · · |Γ (a)| 6 0.2 · · |B| · t = 0.2 · |V |,
|B| · t |B| · t
(a,σ )∈W
with the last inequality following from |W | = |A| and bi-regularity of the LC instance.
Next, we bound the score contribution from B2 , using the auxiliary lemmas shown
in the previous subsection.
Proposition . b∈B2 scrw (W , V [b]) 6 (αw + 0.7) · |V |.
P
Proof. Let us fix b ∈ B2 . Let a1 , . . . , at be its (ordered) neighbours. For each κ ∈ [R], let
τ κ = |{(a, σ ) ∈ W | a ∈ N (b), π(a,b) (σ ) = κ}|. Moreover, if τ κ , 0, let i κ denote the index
i ∈ [t] of the neighbour s.t. π(ai ,b) (σi ) = κ for some σi ∈ Wi . (Due to our definition of
B2 , there is a unique such i.) If τ κ = 0, let i κ =⊥.
Recall that a voter (b, u) approves a candidate (ai , σ ) if uπ(a ,b) (σ ) = i. Hence, the
i
number of candidates in W approved by (b, u) ∈ Vb is
X X X
1[uπ(a ,b) (σ ) = i] = τ κ · 1[uκ = i κ ].
i
i∈[t] σ ∈Wai κ∈[R]
Observe that for a fixed vertex b, τ κ are non-negative integer constants such that
κ 1 R
κ∈[R] τ = degW (b). 1[u1 = i ], . . . , 1[uR = i ] are independent random variables,
P
identically distributed by Ber(1/t). We may apply Lemmas and to get
scrw (W , V [b]) 6 |[t]R | · E w1 + · · · + wPoi(degW (b)/t) + γ.
h i
Decoding the assignment to Label Cover From the two P propositions above and our
assumption that scrw (W ) > (αw + ) · |V |, we must have b∈B3 scrw (W , V [b]) > 0.1 · |V |.
Obviously, for b ∈ B3 we have scrw (W , V [b]) 6 degW (b) · |Vtb | , which is at most D·|Vt·|B|
|
for
b ∈ B3 . Thus,
0.1 · |V | 0.1t|B| 0.01|B|
|B3 | > = = . ()
D · |V |/(t · |B|) D ϑ
Now, consider the following (random) assignment: for every a ∈ A such that Wa , ,
let φ(a) be a random element of Wa . By definition, each b ∈ B3 has distinct neighbours
a1 , a2 and σ1 ∈ Wa1 , σ2 ∈ Wa2 such that π(a1 ,b) (σ1 ) = π(a2 ,b) (σ2 ). Hence, the probability
1
that φ weakly satisfies such b is at least |W |·|W |
> D12 , where the inequality follows
a1 a2
from degW (b) 6 D. Thus,
|B3 | () 0.01 0.0001
E[weak-val(φ)] > > = = δ.
φ |B| · D 2 ϑ · D2 ϑ3t2
This completes our proof of Theorem .
S
S A A A
S
Figure : Tight instance for the greedy algorithm. The universe of m · k voters is
covered by 2k candidates. The optimal solution is to pick S1 , . . . , Sk and give every
voter exactly one supported candidate in the committee. The greedy algorithm will
pick A1 , . . . , Ak instead. Compared to the tight example one would build for Maximum
Coverage, here the sets Ai must be larger, because even after picking A1 , . . . Al the
sets Si still gain some score from the already “covered” voters.
2
Clearly, when there is an element i of w with wi · wi+2 > wi+1 , the inequality in
Proposition becomes strict, as v < w. Let us now analyse the greedy algorithm for
these w-Thiele rules.
Theorem
For any Thiele sequence w = (1,
p, w3 .. . ), the approximation ratio of the greedy algorithm
1 1
for w-Thiele is at most 1−p · 1 − e1−p .
We will prove this theorem by building a specific instance. The instance is defined
below (see also Figure ), and its construction is parametrised by numbers m, k (the
committee size), and p (the second element of w).
scrw ({S1 , . . . , Sk }) = m · k.
|Al | m 1 − p l−1
|Si ∩ Al | = = 1− .
k k k
No voter is supporting both Ai and Al for i , l (Ai ∩ Al = ).
. Notes
Proposition . For any l ∈ {0, . . . , k − 1} if the candidates A1 , . . . , Al have been picked into
the committee, the greedy algorithm will select Al+1 .
Proof. To verify this we need to compare scrw (Al+1 | A1 , . . . , Al )—a score gain from
picking Al+1 once A1 , . . . , Al have been selected—to scrw (Si | A1 , . . . , Al ) (for any i ∈ [k]).
Since the electorates of Ai candidates are pairwise disjoint, the former is very simple:
1−p l
scrw (Al+1 | A1 , . . . , Al ) = |Al+1 | = m · 1 − .
k
To calculate the latter we notice that some of Si ’s voters are already satisfied by the
already picked candidates. Those voters will only earn an additional score of p. The
other voters supporting Si will gain 1 when he is picked.
scrw (Si | A1 , . . . , Al ) = p · |Si ∩ (A1 ∪ · · · ∪ Al )| + |Si \ (A1 ∪ · · · ∪ Al )|
|A | + · · · + |Al | |A | + · · · + |Al | 1−p
=p· 1 +m− 1 = m− (|A1 | + · · · + |Al |)
k k k
l
1−p X 1 − p j−1 1−p l
= m−m 1− = m· 1− .
k k k
j=1
We have thus established that the greedy algorithm will indeed select A1 , . . . , Ak . We
also need to compute the score of the greedy solution,
1−p k k
!
scrw ({A1 , . . . , Ak }) = |A1 | + · · · + |Ak | = m · 1 − 1 − ,
k 1−p
which allows us to bound the approximation ratio of the greedy algorithm:
1−p k
m·k
1− 1− k
scrw ({A1 , . . . , Ak }) 1−p 1 1
= −−−−−→ · 1 − 1−p .
scrw ({S1 , . . . , Sk }) m·k k→∞ 1−p e
. Notes
The pipage rounding method has been introduced by Ageev and Sviridenko [AS]
with an application of optimising a function over bipartite matching constraints. It
Onemay break ties by adding k additional voters u1 , . . . , uk in the instance. ui would only approve
the candidate Ai . This increases the score of the greedy solution only by k, with no impact on the
approximation ratio.
Proportional Approval Voting
was later extended by Călinescu, Chekuri, Pál and Vondrák [Căl+] to any matroid
constraints. In our algorithm in Section we are applying this powerful method to
a simple cardinality constraint, but in fact it could be replaced with any matroid
constraint without effect on the approximation ratio.
The algorithm of Barman, Fawzi, Ghosal and Gurpinar [Bar+] for the Multi `-
Coverage problem is also using the pipage rounding framework. Their hardness is
based on Unique Game Conjecture—an assumption we are able to get rid of with our
reduction.
The results in this chapter are based on our paper with Szymon Dudycz, Pasin
Manurangsi and Krzysztof Sornat, which appeared at IJCAI [Dud+]. We left
unanswered the question of approximation of w-Thiele rules with w sequence not
geometrically dominant. This has been subsequently resolved by Barman, Fawzi and
Fermé [BFF].
k-Medians clustering with
capacities
In this chapter we will focus on a topic of metric clustering. In the clustering
problems we are given a discrete metric (defined by a distance function d obeying
the triangle inequality) and a number k. Our task is to select k points (centres) in the
metric and assign every other point to one of the k clusters identified by the selected
representatives. The objective may be to minimise a maximum distance from a point
to its representative (this problem is called k-Center), the sum of distances from
the points to their respective centres (k-Median) or the sum of squared distances
(k-Means).
A common application is categorising data which is usually represented by long
vectors within the metric defined by the Hamming distance. For k-Means, Data
Scientists would typically use a library function implementing some local-search
heuristic, like Lloyd’s algorithm [Llo] initialised with a seed given by a proce-
dure called k-means++ [AV]. This algorithm—although practical—only gives an
approximation guarantee of O(log k).
Our Capacitated k-Median (CkM) problem adds another difficulty to the k-Median:
Every point in the metric has an associated capacity and at most that many members
may belong to its cluster. We describe it in the language of Facility Location.
Problem (Capacitated k-Median)
Given a metric (F ∪ C, d), and for every facility f ∈ F a capacity uf , find a subset
S ⊂ F of size k and an assignment of clients φ : C → S such that:
• For every facility f ∈ S, its capacity is respected, i.e. |φ−1 (f )| 6 uf ,
• The total distance from clients to their facilities, i.e.
X
d(φ) B d(c, φ(c)),
c∈C
An instance of CkM is uniform if the capacities are equal for all the facilities.
Remark. The problem becomes marginally harder in a client-weighted variant, when
each client location i ∈ C has cardinality ci of clients, which can be directed to
different facilities. The additional complexity comes from the fact, that ci may be
large compared to the size of the metric.
. Related work
Problems without capacities There is an extensive body of work on approxima-
bility of the clustering problems: For the k-Center problem a simple local search
procedure gives 2-approximation [Gon], which cannot be improved unless P =
NP [HN]. Also for k-Means and k-Median, the first constant-factor approxima-
tion algorithms were using this method—giving the ratios of (9 + ε) [Kan+] and
(3 + ε) [Cha+; Ary+] respectively. Stronger results based on linear programming,
and taking advantage of the similarity between clustering and Facility Location,
have been attained later. Currently the best ratio for k-Means is 6.357 [Ahm+],
and for k-Median–(2.675 + ε) [LS; Byr+].
Our results We cheat in a different way. Rather than violating the cardinality
constraint or the capacities, we abuse the time complexity, by constructing a constant-
factor approximation algorithm for CkM which runs in FPT(k) time.
Theorem
For any > 0 there is a (7 + )-approximation algorithm for Capacitated k-Median
(Problem ) running in time 2O(k log k) · poly(|F | + |C|).
This result makes sense from the point of view of parametrised complexity, as the
k-Median problem (even without capacities) is W[2]-hard, which makes it unlikely
to admit exact FPT(k) algorithms.
Proof. In the Dominating Set problem we are given a graph G and a parameter k,
and must decide, whether there is a S subset of k vertices dominating entire graph
(every vertex must either belong or have a neighbour in S). The Dominating Set
problem is W[2]-hard with respect to the parameter k (see e.g. [Cyg+]).
We construct k-Median instance as follows: let F = C = V (G). The metric is
induced by shortest paths in G. A dominating set of size k in G is equivalent to a
k-Means solution of total cost equal to |V (G)| − k.
Our algorithm from Theorem is composed of two ingredients. The first is a re-
duction that allows us to solve the problem on simpler metrics called l-centred rather
than general metrics. We describe this reduction in Section . In Section we will
familiarise ourselves with the l-centred instances by using them to construct a simple
O(log k)-approximation algorithm for CkM. The second ingredient is a parametrised
algorithm for CkM on the l-centred instances. In Section . we will present its simpler
form—an exact algorithm for uniform instances. Finally in Section . we extend it
to work on general instances (and pay a marginal price in the approximation ratio).
. l-Centred instances
The l-centred metric is an alternative distance function defined on the same universe
F ∪C. We start off by selecting a set S ⊂ F of l facilities—a solution to l-Median (for a
number l > k) on the universe F ∪ C (ignoring the capacities). This solution naturally
determines a mapping φS : C ∪ F → S, with every point mapped to the closest facility
of S (see Fig. (a)).
The l-centred distance function d S will have two layers. For any f , g ∈ S it is
identical to the original metric, i.e. d S (f , g) = d(f , g). To travel anywhere from a point
k-Medians clustering with capacities
(a) A set of S ⊂ F determines a mapping (b) To go from a to b in the l-centred metric, one
φS corresponding to the Voronoi diagram must first go to a’s centre (to the top layer), then
on F ∪ C. follow the original metric to the b’s centre and
from there to b.
a < S, we first route to a’s centre φS (a). Generally the new metric is defined as follows
(see Figure (b)):
d S (a, b) B d(a, φS (a)) + d(φS (a), φS (b)) + d(φS (b), b). ()
Proof. The left inequality is a trivial consequence of the triangle inequality. We focus
on the right inequality:
Fortunately, since the solution S ignores the capacities, we can ensure that its cost is
a constant approximation of the optimum solution to the Capacitated k-Median on
F ∪ C. Additionally we can count on a better approximation by picking l significantly
larger than k.
Altogether, we reduce the Capacitated k-Median on general metric d to two
different problems: The k-Median (without capacities) with the cardinality constraint
relaxed from k to l, and the CkM on the l-centred metric.
Lemma . For any metric d over the universe F ∪ C and capacities {uf }f ∈F let φOPT
denote the optimum solution to CkM on that instance. Let S ⊂ F with φS picking the closest
centre in S for every point (according to the metric d), and let l = |S|. Let also φOPT(S)
denote the optimum solution to CkM on metric d S .
If a solution φ respects the capacities, and satisfies
d S (φ) 6 α · d S (φOPT(S) ) (Asm. )
(φ is an α-approximation for CkM on the l-centred metric d S ), while φS satisfies
d(φ(S)) 6 β · d(φOPT ) (Asm. )
(which holds when it is a β-approximation for k-Median on the metric d), then
d(φ) 6 α · (3 + 4β) d(φOPT )
(φ is an α · (3 + 4β)-approximation for CkM on metric d).
Proof. Starting from the left side:
() (Asm. ) (p)
d(φ) 6 d S (φ) 6 α · d S (φOPT(S) ) 6 α · d S (φOPT )
() (Asm. )
6 α · 3d(φOPT ) + 4d(φS ) 6 α · 3d(φOPT ) + 4β · d(φOP T ) ,
where the inequality (p) follows from optimality of the solution φOPT(S) for CkM on
metric d S .
k-Medians clustering with capacities
• d T is a shortest-path metric on a weighted tree with vertex set T ; X are the leaves in
that tree,
• For any a, b ∈ X,
d(a, b) 6 d T (a, b),
(so d T is indeed a metric embedding of d),
• For any a, b ∈ X, the expected distortion—with respect to the random choice of the
metric T by the algorithm—is bounded:
We will apply Theorem to the metric d over the ground set S—the upper layer
of our l-centred metric d S . This new metric d T can replace d at the upper layer of d S .
To that end we define
for any a, b ∈ F ∪C. The metric d T defined over the set F ∪C∪T , is induced by shortest
paths on an edge-weighted tree. Let now φOPT(T ) denote the optimum solution co
CkM on that metric. It turns out, that φOPT(T ) is an O(log k)-approximation for CkM on
the original metric d.
. Polynomial-time O(log k)-approximation for CkM
Proposition . The expected cost of φOPT(T ) —with respect to the random choice of T in
Theorem —measured on metric d is bounded by
where φOPT(S) is the optimum solution to CkM on the l-centred metric d S , and ρ = O(log k)
comes from Theorem .
Proof. The first inequality is given by the fact that d T is a metric embedding of d.
Since φOPT(T ) is optimal on the metric d T , we have
d T (φOPT(T ) ) 6 d T (φOPT(S) ) .
h i h i
E E
T zFRT(S,d) T zFRT(S,d)
For a leaf c ∈ C, dp[c, 0, −1] = d T (ec ). We set dp[c, k 0 , b] = ∞ for every other value of
k0 and b, as they are infeasible. Similarly, for a leaf f ∈ F we set
if k 0 = 0,
0
dp[f , k 0 , b] = b · d T (ef ) if k 0 = 1 and b ∈ [0, uf ],
otherwise.
∞
Finally, for t with two subtrees t1 , t2 , computing dp[t, k 0 , b] amounts to finding k10 and
b1 which minimize
dp[t1 , k10 , b1 ] + dp[t2 , k 0 − k10 , b − b1 ].
This can be trivially done by iterating over all O(k · |C|) choices. Once such a pair is
found, we set
At the root we can read the optimum solution to CkM on entire tree T , as it is equal to
. Constant-factor approximation
In this section we will present the algorithm for Capacitated k-Median on l-centred
metrics. The algorithm is a composition of ideas which we are going to gradually
introduce—first, in Section ., we describe the exact algorithm for the Uniform
CkM, and only then we will put together the (1 + )-approximation algorithm for the
non-uniform variant. Both algorithms will guess the k facilities to open, and rely
on a subroutine which (in polynomial time) optimally maps clients to the opened
facilities. We start by describing this subroutine.
(In our integer program, xc,f = 1 represents that φ(c) = f .) Let us consider the linear-
programming relaxation of (), where the last constraint is replaced with xc,f > 0.
The constraint matrix of this LP is an incidence matrix of a complete bipartite graph
on C ∪· F open and is known to be totally unimodular, which means that the relaxed LP
has an optimal solution with integral values of xc,f .
Remark. The solution is integral regardless of the right side of the inequalities in ()
(as long as they are integers), so LP-solving (which can be done in polynomial time)
gives us the optimal mapping even in the client-weighted variant of CkM.
(a) In the algorithm for Uniform CkM we guess (b) In the general (non-uniform) case, every
the number of open facilities in every cell cell is divided into rings of geometrically in-
( in figure) and pick the closest facilities to creasing radii. We guess the number of open
the centre. This minimises the distance (as facilities in every ring, and pick the ones with
every client is routed through the centre in largest capacity (facilities in the same ring
l-centred metric). have almost equal distance to the centre).
|C|
n & '
−i
o
f ∈ F [s] | d(s, f ) ∈ [0, (1 + ) · D] for i = log1+
,
i
F [s] B
|C|
& '
−(i+1) −i
n o
f ∈ F [s] | d(s, f ) ∈ [(1 + ) · D, (1 + ) · D] for i < log1+
.
The clients further than D from their centre may be disregarded, since they will not
appear in the optimum solution.
Once we know the number ksi of facilities to open in a particular ring, we may
again do it greedily—this time by opening ones with largest capacity.
. Constant-factor approximation
D E
Proposition . Let F OPT(S) , φOPT(S) be the optimum solution to CkM on l-centred metric
D E
d S . For every ring F i [s] let ksi = |F i [s] ∩ F OPT(S) |. Let F SOL , φSOL be determined by
picking ksi largest facilities in each ring F i [s]. Then the following inequality holds:
d S (φSOL ) 6 (1 + ) · d S (φOPT ) + · D 6 (1 + 2) · d S (φOPT ). ()
Proof. By definition,
X
d S (φOPT(S) ) = d S (c, φOPT(S) (c))
c∈C
−1
X X
d S c, φS (φOPT(S) (c)) + φOPT(S) (f ) · d S φS (f ), f
=
c∈C f ∈F OPT(S)
X
d S c, φS (φOPT(S) (c))
=
c∈C B
−1
XX X
φOPT(S) (f ) · d S φS (f ), f .
+
s∈S i=0 f ∈F OPT(S) ∩F i [s]
Let us analyse the additional cost incurred by replacing facilities from F OPT(S) with
those from F SOL . Define a bijection σ : F OPT(S) → F SOL which preserves every ring
F i [s]. We set φ(c) B σ (φOPT(S) (c)). The solution F SOL has picked the largest-capacity
facilities in every ring, so φ does not violate any capacities. Now, we have
X
d S (φ) = d S c, φS (φOPT(S) (c))
c∈C B
−1
XX X
φOPT(S) (f ) · d S σ (φS (f )), f .
+
s∈S i=0 f ∈F OPT(S) ∩F i [s]
since the rings have geometrically increasing radii. The central ring, with i = B, has
inner radius equal to 0, so that kind of proportion on distances does not hold. Its
radius is however small, so we have
·D
d S σ (φS (f )), f 6
,
|C|
for every facility f ∈ F B [s]. Even if all |C| clients were connected to the facilities in the
central rings of the Voronoi cells, this cost would sum up to · D. Finally we observe,
that d S (φSOL ) 6 d S (φ), as φSOL is the optimal mapping to F SOL .
The right inequality of () is given by D 6 d S (φOPT(S) ).
k-Medians clustering with capacities
Proposition lays out the algorithm for us. P AfterP guessing the upper-bound D,
we scan all configurations (ksi )s∈Si∈{0,...,B} with s∈S Bi=0 ksi = k, which determine the
number of open facilities in every ring. For each such configuration we greedily
open ksi largest-capacity facilities in every ring F i [s] and find the optimal mapping
of clients to the facilities using our subroutine from Section .. There are at most
k
O l · 1 · ln |C|
possible configurations and |F | · |C| choices of D.
• |U | 6 (1 + 1ε )k(ln(|C ∪ F |) + 1),
• c∈C minf ∈U {d(c, f )} 6 (1 + ε) c∈C minf ∈F OPT {d(c, f )},
P P
By plugging Theorem to construct the l-centred metric and running the al-
gorithm from Section . on that metric, we obtain a (7 + O())-approximation for
Capacitated k-Median with running time
!k
1 1 |C|
O 1 + · k · (ln(|C ∪ F |) + 1) · · ln · poly(|F ∪ C|)
()
k 1
2
= O k · ln (|C ∪ F |) · poly k · poly(|F ∪ C|).
We will use a standard trick in parametrised complexity to bound this running time.
ln n
Proof. We analyse two cases. First, if 2 ln ln n 6 k, we have ln n = O(k ln k) and thus
2k O(k) ln n 2k ln n
(ln n) = k . If 2 ln ln n > k, then (ln n) = exp(2k · ln ln n) < exp(2 2 ln ln n · ln ln n) =
n.
By applying Proposition with n = |F ∪ C| to () we obtain the bound on complexity
!O(k)
k
· poly(|F ∪ C|),
. Notes
The topic of approximation in parametrised time—a crossover of approximation
algorithms and parametrised complexity—has gained a significant popularity in
recent years. It has allowed to break some known (or conjectured) bounds on approx-
imability by abusing the running time, like (2 − ε) ratio for k-Cut [GLL]. At the
same time, some problems have turned out to be resistant to such improvements.
One such example is presented in Chapter of this thesis.
The results in this chapter are based on our paper with Marek Adamczyk, Jarosław
Byrka, Syed Mohammad Meesum and Michał Włodarczyk, which was presented
at ESA [Ada+]. After the preprint of our paper was announced, Xu et al.
worked out a similar algorithm for Capacitated k-Means on euclidean metrics. Both
our and their ratios have soon been improved by Cohen-Addad and Li [CL], who
obtained (3 + ε)-approximation for Capacitated k-Median and (9 + ε)-approximation
for Capacitated k-Means. At the same conference, they also showed—together
with Gupta, Kumar and Lee [Coh+]—that an approximation ratio of (1 + 2e − ε) is
impossible for k-Median in time parametrised by k, even without the capacities, if
Gap-ETH holds. The same lower-bound had been known for algorithms running in
polynomial time [Fei].
In a subsequent paper, written together with Jarosław Byrka, Szymon Dudycz,
Pasin Manurangsi and Michał Włodarczyk [Byr+]—in which my involvement was
lesser, and which is therefore not included in this dissertation—we analysed the
approximability of k-Median problem parametrised by |F | − k rather then k. It turns
out, that although (1 + ε)-approximation is possible for k-Median without capacities,
the bound of (1 + 2e ) still cannot be overcome for CkM.
Bibliography
[Thi] Thorvald Thiele. ‘Om Flerfoldsvalg’. In: Oversigt over det Kongelige Dan-
ske Videnskabernes Selskabs Forhandlinger (in Danish). .
[LeC] Lucien LeCam. ‘On the Distribution of Sums of Independent Random
Variables’. In: Bernoulli , Bayes , Laplace . . doi: 10.
1007/978-3-642-99884-3_10.
[Gup] Ram Prakash Gupta. ‘Independence and covering numbers of line graphs
and total graphs’. In: Proof Techniques in Graph Theory. .
[Kar] Richard M. Karp. ‘Reducibility Among Combinatorial Problems’. In:
Proceedings of a symposium on the Complexity of Computer Computations.
. doi: 10.1007/978-1-4684-2001-2\_9.
[MH] Sandra Lee Mitchell and Stephen Hedetniemi. ‘Edge Domination in
Trees’. In: Proceedings of the Eighth Southeastern Conf. on Combinatorics,
Graph Theory and Computing. .
[NWF] George L. Nemhauser, Laurence A. Wolsey and Marshall L. Fisher. ‘An
analysis of approximations for maximizing submodular set functions -
I’. In: Math. Program. (). doi: 10.1007/BF01588971.
[HN] Wen-Lian Hsu and George L. Nemhauser. ‘Easy and hard bottleneck
location problems’. In: Discret. Appl. Math. (). doi: 10.1016/0166-
218X(79)90044-1.
[YG] Mihalis Yannakakis and Fanica Gavril. ‘Edge Dominating Sets in Graphs’.
In: SIAM Journal on Applied Mathematics (). doi: 10.1137/0138030.
[Llo] Stuart P. Lloyd. ‘Least squares quantization in PCM’. In: IEEE Trans. Inf.
Theory (). doi: 10.1109/TIT.1982.1056489.
[Bak] Brenda S. Baker. ‘Approximation Algorithms for NP-Complete Problems
on Planar Graphs (Preliminary Version)’. In: th Annual Symposium on
Foundations of Computer Science, FOCS. . doi: 10.1109/SFCS.1983.7.
Bibliography
[MKI] Yusuke Matsumoto, Naoyuki Kamiyama and Keiko Imai. ‘An approx-
imation algorithm dependent on edge-coloring number for minimum
maximal matching problem’. In: Inf. Process. Lett. (). doi: 10.1016/j.
ipl.2011.02.006.
[RST] Prasad Raghavendra, David Steurer and Madhur Tulsiani. ‘Reductions
between Expansion Problems’. In: Proceedings of the th Conference on
Computational Complexity, CCC. . doi: 10.1109/CCC.2012.43.
[SV] Richard Schmied and Claus Viehmann. ‘Approximating edge dominat-
ing set in dense graphs’. In: Theor. Comput. Sci. (). doi: 10.1016/j.
tcs.2011.10.001.
[LS] Shi Li and Ola Svensson. ‘Approximating k-median via pseudo-approxi-
mation’. In: Symposium on Theory of Computing Conference, STOC. .
doi: 10.1145/2488608.2488723.
[Byr+] Jaroslaw Byrka, Thomas W. Pensyl, Bartosz Rybicki, Aravind Srinivasan
and Khoa Trinh. ‘An Improved Approximation for k-median, and Positive
Correlation in Budgeted Optimization’. In: Proceedings of the Twenty-
Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA. .
doi: 10.1137/1.9781611973730.50.
[Cyg+] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov,
Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk and Saket Saurabh.
Parameterized Algorithms. . isbn: ----. doi: 10.1007/
978-3-319-21275-3.
[Esc+] Bruno Escoffier, Jérôme Monnot, Vangelis Th. Paschos and Mingyu Xiao.
‘New Results on Polynomial Inapproximabilityand Fixed Parameter Ap-
proximability of Edge Dominating Set’. In: Theory Comput. Syst. ().
doi: 10.1007/s00224-014-9549-5.
[Li] Shi Li. ‘On Uniform Capacitated k-Median Beyond the Natural LP Relax-
ation’. In: Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium
on Discrete Algorithms, SODA. . doi: 10.1137/1.9781611973730.47.
[Bha+] Amey Bhangale, Rajiv Gandhi, Mohammad Taghi Hajiaghayi, Rohit
Khandekar and Guy Kortsarz. ‘Bicovering: Covering Edges With Two
Small Subsets of Vertices’. In: rd International Colloquium on Automata,
Languages, and Programming, ICALP. . doi: 10.4230/LIPIcs.ICALP.
2016.6.
Bibliography
[BRU] Jaroslaw Byrka, Bartosz Rybicki and Sumedha Uniyal. ‘An Approxima-
tion Algorithm for Uniform Capacitated k-Median Problem with 1 + ε
Capacity Violation’. In: Integer Programming and Combinatorial Optimiza-
tion - th International Conference, IPCO. . doi: 10.1007/978-3-319-
33461-5\_22.
[DL] H. Gökalp Demirci and Shi Li. ‘Constant Approximation for Capaci-
tated k-Median with (1 + ε)-Capacity Violation’. In: rd International
Colloquium on Automata, Languages, and Programming, ICALP. . doi:
10.4230/LIPIcs.ICALP.2016.73.
[Din] Irit Dinur. ‘Mildly exponential reduction from gap SAT to polynomial-
gap label-cover’. In: Electron. Colloquium Comput. Complex. ().
[Li] Shi Li. ‘Approximating capacitated k-median with (1+ε)k open facilities’.
In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA. . doi: 10.1137/1.9781611974331.ch56.
[SFL] Piotr Skowron, Piotr Faliszewski and Jérôme Lang. ‘Finding a Collective
Set of Items: From Proportional Multirepresentation to Group Recom-
mendation’. In: Artif. Intell. (). doi: 10.1016/j.artint.2016.09.003.
[Ahm+] Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson and Justin Ward.
‘Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual
Algorithms’. In: th IEEE Annual Symposium on Foundations of Computer
Science, FOCS. . doi: 10.1109/FOCS.2017.15.
[Cha+] Parinya Chalermsook, Marek Cygan, Guy Kortsarz, Bundit Laekhanukit,
Pasin Manurangsi, Danupon Nanongkai and Luca Trevisan. ‘From Gap-
ETH to FPT-Inapproximability: Clique, Dominating Set, and More’. In:
th IEEE Annual Symposium on Foundations of Computer Science, FOCS.
. doi: 10.1109/FOCS.2017.74.
[Man] Pasin Manurangsi. ‘Inapproximability of Maximum Edge Biclique, Max-
imum Balanced Biclique and Minimum k-Cut from the Small Set Ex-
pansion Hypothesis’. In: th International Colloquium on Automata,
Languages, and Programming, ICALP. . doi: 10.4230/LIPIcs.ICALP.
2017.79.
[MR] Pasin Manurangsi and Prasad Raghavendra. ‘A Birthday Repetition The-
orem and Complexity of Approximating Dense CSPs’. In: th Inter-
national Colloquium on Automata, Languages, and Programming, ICALP.
. doi: 10.4230/LIPIcs.ICALP.2017.78.
Bibliography
[GLL] Anupam Gupta, Euiwoong Lee and Jason Li. ‘An FPT Algorithm Beating
-Approximation for k-Cut’. In: Proceedings of the Twenty-Ninth Annual
ACM-SIAM Symposium on Discrete Algorithms, SODA. . doi: 10.1137/
1.9781611975031.179.
[Ada+] Marek Adamczyk, Jaroslaw Byrka, Jan Marcinkowski, Syed Mohammad
Meesum and Michal Wlodarczyk. ‘Constant-Factor FPT Approximation
for Capacitated k-Median’. In: th Annual European Symposium on Algo-
rithms, ESA. . doi: 10.4230/LIPIcs.ESA.2019.1.
[Coh+] Vincent Cohen-Addad, Anupam Gupta, Amit Kumar, Euiwoong Lee
and Jason Li. ‘Tight FPT Approximations for k-Median and k-Means’. In:
th International Colloquium on Automata, Languages, and Programming,
ICALP. . doi: 10.4230/LIPIcs.ICALP.2019.42.
[CL] Vincent Cohen-Addad and Jason Li. ‘On the Fixed-Parameter Tractability
of Capacitated Clustering’. In: th International Colloquium on Automata,
Languages, and Programming, ICALP. . doi: 10.4230/LIPIcs.ICALP.
2019.41.
[DLM] Szymon Dudycz, Mateusz Lewandowski and Jan Marcinkowski. ‘Tight
Approximation Ratio for Minimum Maximal Matching’. In: Integer Pro-
gramming and Combinatorial Optimization - th International Conference,
IPCO. .
[Bar+] Siddharth Barman, Omar Fawzi, Suprovat Ghoshal and Emirhan Gür-
pinar. ‘Tight Approximation Bounds for Maximum Multi-coverage’. In:
Integer Programming and Combinatorial Optimization - st International
Conference, IPCO. . doi: 10.1007/978-3-030-45771-6\_6.
[Bat] Wouter Cames van Batenburg. ‘Minimum maximal matchings in cubic
graphs’. In: CoRR (). arXiv: 2008.01863.
[Byr+] Jaroslaw Byrka, Szymon Dudycz, Pasin Manurangsi, Jan Marcinkowski
and Michal Wlodarczyk. ‘To Close Is Easier Than To Open: Dual Param-
eterization To k-Median’. In: Approximation and Online Algorithms - th
International Workshop, WAOA. . arXiv: 2011.08083.
[DMM] Szymon Dudycz, Pasin Manurangsi and Jan Marcinkowski. ‘Tight Inap-
proximability of Minimum Maximal Matching on Bipartite Graphs and
Related Problems’. to appear. .
Bibliography