0% found this document useful (0 votes)

19 views77 pages

Rozprawa

This doctoral dissertation by Jan Marcinkowski presents findings on the approximability of various combinatorial optimization problems. It demonstrates that a trivial 2-approximation algorithm for the Minimum Maximal Matching problem is optimal, introduces an approximation algorithm for Proportional Approval Voting, and constructs a constant-factor approximation algorithm for the Capacitated k-Median problem. The results highlight significant implications for existing conjectures in computational complexity, such as the Unique Games Conjecture and Gap-ETH.

Uploaded by

homocreatore

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

19 views77 pages

Rozprawa

Uploaded by

homocreatore

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 77

Three small discoveries in the field of

(in-)approximability.

A doctoral dissertation
by
Jan Marcinkowski

Wydział Matematyki
i Informatyki

Wrocław-Tylicz, ad 
ii

Streszczenie
W niniejszej rozprawie doktorskiej prezentujemy wyniki dotyczące tematu aproksymowalno-
ści wybranych optymalizacyjnych problemów kombinatorycznych:

. W rozdziale  pokazujemy, że trywialny algorytm 2-aproksymacyjny dla problemu Mi-

nimum Maximal Matching jest najlepszym, jaki można uzyskać (także w szczególnym
przypadku grafów dwudzielnych). Istnienie algorytmu (2 − γ)-aproksymacyjnego (dla
dowolnej stałej γ) przeczyłoby hipotezie Unique Games Conjecture (lub jej mocniejszym
wariantom, w przypadku dwudzielnym).
. W rozdziale  prezentujemy algorytm aproksymacyjny dla problemu rozstrzygania wy-
borów w ordynacji zwanej Proportional Approval Voting (a także w całej nieskończo-
nej rodzinie powiązanych ordynacji wyborczych). Pokazujemy też, że algorytm ten
jest najlepszą możliwą aproksymacją, jaką może uzyskać algorytm działający w cza-
sie wielomianowym (o ile P , NP), a nawet w czasie parametryzowanym rozmiarem
wybieranego zgromadzenia (o ile prawdziwa jest hipoteza Gap-ETH).
. W rozdziale  konstruujemy algorytm o stałym współczynniku aproksymacji dla pro-
blemu Capacitated k-Median, który działa w czasie parametryzowanym liczbą k. Pro-
blem ten nie może być rozwiązany dokładnie w takiej złożoności (o ile FPT , W[2]),
a algorytm aproksymacyjny o stałym współczynniku działający w czasie wielomiano-
wym — chociaż nie jest wykluczony — pozostaje niedosięgniony.

Abstract
In this thesis we present the results on approximability of selected combinatorial optimisation
problems:

. In Chapter  we are showing that a trivial 2-approximation algorithm for the Minimum
Maximal Matching problem is the limit of what can be expected (also for bipartite
graphs). Existence of a (2 − γ)-approximation algorithm (for any constant γ) would
contradict the Unique Games Conjecture (or its stronger variants, in the bipartite case).
. In Chapter  we present an approximation algorithm for deciding winners in the Pro-
portional Approval Voting system (and an infinite family of related electoral sys-
tems). We additionally show, that a better polynomial-time approximation would im-
ply P = NP, and a better approximation running in time parametrised by the committee
size would contradict Gap-ETH.
. In Chapter  we construct a constant-factor approximation algorithm for the Capaci-
tated k-Median problem, working in time parametrised by k. The problem in ques-
tion cannot be solved exactly within such a complexity (unless FPT = W[2]), and a
polynomial-time approximation—although not strictly out of question—remains out
of reach.
Contents

Notations v

Introduction 
 Computational tasks and their complexity . . . . . . . . . . . . . . . . 
 Approximation algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 
 This dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 

 Gap problems 
 Label Cover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 Unique Games and variations . . . . . . . . . . . . . . . . . . . . . . . 

 The Minimum Maximal Matching problem 

 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 Our method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 General graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. The vertex cover instance . . . . . . . . . . . . . . . . . . . . . 
. Supplementing H with edges . . . . . . . . . . . . . . . . . . . 
 Bipartite graphs I: Bipartisation and path covers . . . . . . . . . . . . 
. Covering with paths . . . . . . . . . . . . . . . . . . . . . . . . 
 Bipartite graphs II: From Strong ugc to MMM . . . . . . . . . . . . . . . 
 Bipartite graphs III: Hardness based on sseh . . . . . . . . . . . . . . 
. Reducing from sseh to ugc . . . . . . . . . . . . . . . . . . . . 
. Max Uncut Hypergraph Bisection . . . . . . . . . . . . . . . . . 
. From MUCHB to Bipartite MMM . . . . . . . . . . . . . . . . . . . . 
 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 

 Proportional Approval Voting 

 Algorithmic question . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 The approximation algorithm . . . . . . . . . . . . . . . . . . . . . . . 
iv Contents

. Step : Three-valued x∗ . . . . . . . . . . . . . . . . . . . . . . . 

. Step : Getting rid of ones . . . . . . . . . . . . . . . . . . . . . 
. Step : Comparing Binomial and Poisson . . . . . . . . . . . . 
. Step : Changing the mean of Poisson . . . . . . . . . . . . . . 
 Hardness of Approximation . . . . . . . . . . . . . . . . . . . . . . . . 
. Comparing Binomial and Poisson . . . . . . . . . . . . . . . . . 
. The Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 Determining performance of the Greedy Algorithm . . . . . . . . . . . 
 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 

 k-Medians clustering with capacities 

 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 l-Centred instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
 Polynomial-time O(log k)-approximation for CkM . . . . . . . . . . . . 
. Solving CkM on trees . . . . . . . . . . . . . . . . . . . . . . . . . 
 Constant-factor approximation . . . . . . . . . . . . . . . . . . . . . . 
. Mapping clients to open facilities . . . . . . . . . . . . . . . . . 
. Solving Uniform CkM on l-centred metrics . . . . . . . . . . . . 
. Approximating non-uniform CkM . . . . . . . . . . . . . . . . . 
. Choosing the right l . . . . . . . . . . . . . . . . . . . . . . . . . 
 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 

Bibliography 
Notations

[m] The set of (positive) natural numbers up to m: {1, . . . , m}.

x Negation. For a boolean value, b = ¬b. For a boolean vector, xi = xi = ¬xi .
◦ Denotes function composition. If π is a permutation and x is a vector, x ◦ π is
the vector with permuted coordinates: (x ◦ π)i = xπ(i) .
⊕ XOR. We slightly abuse the notation by writing b ⊕ x for a boolean value
b and a boolean vector x. The operation applies to every element of x, so
(b ⊕ x)i = b ⊕ xi .
{x } Fractional part of a real number x. {x} = x − bxc.
∪· Disjoint union. X ∪· Y = X ∪ Y and additionally asserts that X ∩ Y = .
⊗ Tensor product. For two vectors v and w, v ⊗ w is the same as their outer
product vw| —a matrix where every element of v is combined with every
element of w.
For two graphs G and H, G⊗H is a graph with the vertex set V (G)×V (H) and an
edge connecting hu1 , v1 i with hu2 , v2 i only if (u1 , u2 ) ∈ E(G) and (v1 , v2 ) ∈ E(H).
G⊗k = G ⊗ G ⊗ · · · ⊗ G (k times) is a graph in which each vertex corresponds
to a vector of k vertices of G and two such vectors are connected if there is a
connection in G on each coordinate.
Sm A set of permutations of [m]. In some other sources called Sm (a symmetric
group of degree m).
z Sample from a distribution. When S is a set, s z S means sampling uniformly
from S. When x∗ is a vector of real numbers between 0 and 1, x z x∗ means
sampling each coordinate xi independently from Ber(xi∗ ).
vi Notations
Introduction

We are Computer Scientists. Our role is to investigate—with mathematical scrutiny—

which problems can be solved with computers, and how efficiently it can be done.
Computer Science was started by a realisation (now known as Church-Turing Thesis)
that if any intelligent life in the universe has computing machines, they would have
the same computing power as ours, and most likely—however their machines were
built—it should be possible to emulate their computer with ours with only a polyno-
mial overhead in number of operations (anything they compute in n steps, we can
compute in O(nc ) for some constant c). This has led us to a (rarely contested) agree-
ment that every problem which can be solved in time polynomial in the description
of problem instance is ‘practical’ (we say it is in class P), even though some problems
may be more suited for computers on Earth and some other for those belonging to
aliens.

. Computational tasks and their complexity

The problems at the focus of Computer Science are combinatorial in nature, and
usually their application to practical use of computers is indirect. Some important
tasks turn out to be very simple.

EXAMPLE (Maximum Acyclic Undirected Subgraph).

In the MAUS problem we are given and undirected graph, and must find the largest
possible set of edges which form an acyclic subgraph. This can be solved very
easily, in time almost proportional to the time needed to read the input graph, as
the maximum acyclic subgraph is the same as a spanning forest.

A small change in the description of the task may however put it in a very different
place on the Computational Complexity map.
 Introduction

EXAMPLE (Maximum Acyclic Subgraph).

In the MAS problem we have the same task as in MAUS, but the input graph is
directed. We will show that it is unlikely, that MAS ∈ P, by reducing another classic
problem called Vertex Cover (VC) to it. Thus by solving MAS we would also solve
VC in polynomial time.

Reduction  (Karp [Kar])

Input: A VC instance—an undirected graph G = hV , Ei.
V ·
Output: A MAS instance
n H with vertex oset VH = V ×{0,
n 1} and arc set EH = EoH ∪
E V E
EH , where EH = (hu, 0i , hu, 1i) | u ∈ V , and EH = (hu, 1i , hv, 0i) | (u, v) ∈ E .

One can verify that if G has a vertex cover (subset of V hitting every edge) of
V
size k, then removing k edges from EH suffices to make digraph H acyclic (this
statement is called the completeness of the reduction). Moreover, if H can be made
acyclic by removing l arcs, then there is a vertex cover of l vertices in graph G
(this states the soundness of our reduction).

. Approximation algorithms
Having built a web of reductions like the one above, Computer Scientists began to
look for a way around the common difficulty (NP-hardness) of problems like MAS or VC.
One way around is approximation algorithms. Such an algorithm has an approximation
ratio of ρ if it returns solutions whose value is within ρ of the optimal value for the
input instance.

EXAMPLE (1/2-approximation for MAS).

We can number the vertices of the graph arbitrarily. This ordering determines two
sets: of forward arcs (ones that point from lower to higher number), and backward
arcs. Let us pick the larger of these sets and call it A. The graph hV , Ai is clearly
acyclic, since all arcs agree with the same linear ordering of vertices. Moreover
|A| > |E| |P |
2 , so for any acyclic set P of edges (including the optimal one), |A| > 2 .

In the field of approximation algorithms—which has been central to Computer

Science since s—we try to construct solutions with ρ as close to 1 as possible.
More generally, we can ask the question ‘what is the best possible approximation
. This dissertation 

ratio’ for a particular problem, or ‘whether such a constant approximation ratio can
be even guaranteed’ by a polynomial-time algorithm. Many general techniques have
been developed for constructing approximation algorithms, but in the end, the exact
answer must be given individually for each computational task.
Giving negative answers to approximability questions has been enabled by a chain
of results (starting with the PCP theorem [Aro+]) and hypotheses (most notably
the Unique Games Conjecture [Kho]). They allow us to construct reductions like
above, but proving inapproximability, i.e. impossibility of approximation within a
certain ratio. For example, an inapproximability result is known for the Maximum
Acyclic Subgraph.

Theorem  (Guruswami, Manokaran, Raghavendra [GMR])

Assuming Unique Games Conjecture, for any γ > 0, no polynomial-time algorithm can
( 12 + γ)-approximate MAS.

. This dissertation
In this thesis we tackle the approximability question for three combinatorial problems,
each of very different nature. In Chapter , we describe our hardness results for a
problem called Minimum Maximal Matching. It is a somewhat similar story to the
Maximum Acyclic Subgraph—there is a trivial approximation algorithm with ratio
equal to 2, and a considerable effort had been invested into finding better approxi-
mation algorithms (i.e., with smaller ratio). In our papers [DLM] (joint work with
Szymon Dudycz and Mateusz Lewandowski; published at IPCO ) and [DMM]
(joint work with Szymon Dudycz and Pasin Manurangsi; pending publication) we
prove such an algorithm impossible, conditionally on the Unique Games Conjecture.
Our results effectively close a long line of research on the approximability of Minimum
Maximal Matching. While proving the inapproximability, we need to look deeply
into previous results on hardness of approximation of VC and Balanced Biclique,
as well as the relationship between Small Set Expansion Hypothesis and the Unique
Games Conjecture. Our write-up may be helpful to those willing to understand them
but struggling throughout the (sometimes obscure) writing of the original papers.
In Chapter  we revisit the task of counting votes in the Proportional Approval
Voting and related electoral systems. This problem—although dressed in the robes
of political sciences—is a computational task similar to Maximum Coverage, only
with a generalised objective function . In our paper [Dud+] (joint work with
Szymon Dudycz, Pasin Manurangsi and Krzysztof Sornat; published ad IJCAI ),
we give a new approximation algorithm (one for entire family of voting systems), and
 Introduction

show that it cannot be improved upon, even by an algorithm with super-polynomial

(parametrised) complexity.
In Chapter  we take on a metric clustering problem called Capacitated k-Median.
Although the problem has been around for at least three decades, the question of its
approximability is still wide open: no constant-factor approximation algorithm is
known, yet there are no good reasons to believe such an algorithm to be impossible.
While we are not resolving this question, in our paper [Ada+] (joint work with
Marek Adamczyk, Jarosław Byrka, Syed Mohammad Meesum and Michał Włodarczyk;
published at ESA ) we follow a recent fashion and allow our algorithm to run in
super-polynomial time (parametrised by k). We show, that this relaxation of the rules
gives way to a rather simple constant-factor approximation for the problem.
The Chapters , , and  are independent and can be read in any order. They are
based on respective published articles, but offer a much wider (and—hopefully—more
accessible) writing, not constrained by the space limitation of conference proceed-
ings. Chapter  introduces the reader to the land of inapproximability results and
conjectures, which become the starting point for our reductions in Chapters  and .

. Acknowledgements
I am very grateful to my PhD supervisor Jarosław Byrka, who encouraged me to
follow the path of science. He gave me freedom seldom experienced by graduate
students. I thank my teachers, most notably Katarzyna Paluch, who attracted me to
the world of Combinatorial Optimisation and also suggested that we work on MMM. I
thank my collaborators for the time and the joy, among them Szymon Dudycz, with
whom we have been learning maths together for the last twenty-four years.
I am indebted to my family. My parents have taught me to climb mountains—both
real and intellectual—and we still enjoy hiking together. My wife Patrycja not only is
my greatest companion but also a professional touching people’s lives with her daily
work, which makes me immensely proud. Our daughter Jadwiga makes sure I get up
from bed every day.
I would also like to thank the hard-working Polish taxpayer who mines coal,
strikes iron, and sows the land so that we can travel to conferences. I tried to pay this
debt back by teaching with dedication and enthusiasm.

 Although a similar thing could be said about many optimisation problems.

 Gap problems

Some of the results in this thesis are inapproximability proofs. When working on
hardness of approximation, we introduce and reduce between gap problems, in which
the algorithm is promised that the incoming instance will come from one of two sets
separated by a large gap in a specific measure. A good approximation algorithm with
regard to this measure would need to distinguish between instances coming from
these sets. In this chapter we present a selection of important gap problems, known
results and conjectures. They will serve as starting points in our reductions in the
next chapters.

. Label Cover
One problem that has turned out to be useful in the hardness of approximation is
Label Cover, in which we colour (label) the vertices of a bipartite graph, attempting
to maximise the number of satisfied edge constraints. The constraints are functions
projecting the colour of the left vertex onto the right vertex. Hence this problem is
sometimes called projection games .
Problem  (Label Cover, LC(δ, t, R))
Given a bipartite, bi-regular graph G = (U ∪· V , E) with right degree t, two sets of
labels [L], [R], and for each edge e ∈ E a function constraint πe : [L] → [R], distinguish
between two cases:
• (YES) There exists a labelling l : U → [L], such that every right-side vertex v
is satisfied—i.e. for every two neighbours u1 , u2 of v we have π(u1 ,v) (l(u1 )) =
π(u2 ,v) (l(u2 )).
 The game here is played between two provers and one verifier: The verifier samples a right-vertex
and asks the provers to colour its random two neighbours. The provers are not allowed to communicate
and do not know, which right-vertex had been selected, so if their colours agree, the verifier is satisfied
they likely can colour the graph to satisfy large fraction of the constraints.
 Gap problems

• (NO) For every labelling l : U → [L] there are at most δ|V | weakly satisfied
right-side vertices v—i.e. there exist two distinct neighbours u1 , u2 for which
π(u1 ,v) (l(u1 )) = π(u2 ,v) (l(u2 )).

If we denote by val(L) the maximum fraction of right-side vertices satisfied by

any labelling in a Label Cover instance L, and by weak-val(L) the maximum fraction
of weakly satisfied right-side vertices, we see that the task in LC(δ, t, R) is to decide if
val(L) = 1 or weak-val(L) < δ. The instances coming from the (YES) and (NO) sets are
thus separated by a large gap in the val (and even weak-val) measure.
Two results concerning hardness of Label Cover will be relevant to our work. The
first is now a classical theorem:

Theorem  (Feige [Fei])

For every δ > 0, every t ∈ N, t > 2 exists sufficiently large R ∈ N such, that LC(δ, t, R) is
NP-hard.

The second theorem is conditional—it relies on a conjecture called Gap-ETH :

Theorem  (Manurangsi [Man])

Assuming Gap-ETH, for every δ > 0, t ∈ N (t > 2), exists sufficiently large R ∈ N, such
that no f (|U |) · |V |o(|U |) -time algorithm can solve LC(δ, t, R).

. Unique Games and variations

When reducing from Label Cover instances, it would sometimes be useful to con-
straint the input instance somehow without losing the hardness. For example, con-
trolling the (at least one sided) degree of the bipartite graph may play a role, and we
can see from the aforementioned theorems that one indeed may do that. Another
idea, introduced and used by Khot, was to assume that the constraint functions are
bijections (permutations). That has led to Unique Label Cover.

Problem  (Unique Label Cover, ULC(, t, R))

Given a bipartite graph G = (U ∪· V , E) with right-degree t, a set of labels [R], and for
each e ∈ E a permutation πe : [R] → [R] distinguish between two cases:
 Recently,
significant effort has been directed at the question of approximability within paramet-
rised complexity (rather than in polynomial time). Most of the results in that area are conditional on
various assumptions. Gap-ETH [Din; MR] is one such conjecture, which has given a lot of results
(e.g. [Cha+]). Other assumptions from FPT world, like ETH or FPT , W[1] are also used, but tend
to be much harder to reduce from.
. Unique Games and variations 

u u

u
u

u u

Figure : Unique Label Cover can be translated from one variant to another with a
marginal loss in .

• (YES) There exists a subset V 0 ⊂ V of size |V 0 | > (1 − )|V | and a labelling l : U →

[R] such that for every vertex v ∈ V 0 and its two neighbours u1 , u2 , π(u1 ,v) (l(u1 )) =
π(u2 ,v) (l(u2 )).
• (NO) For every labelling l and any subset V 0 ⊂ V of size |V 0 | > |V |, there is
a vertex v ∈ V 0 such that for any two neighbours u1 , u2 of v, π(u1 ,v) (l(u1 )) ,
π(u2 ,v) (l(u2 )).
There is a slight difference here compared to the definition of Label Cover—in the
(YES) case we do not have val(L) = 1, since checking if every edge-constraint can be
satisfied in Unique Label Cover is very easy.
Complexity of the Unique Label Cover is still unresolved, but assuming it to be
NP-hard has brought a ton of interesting consequences. This assumption is called
Unique Games Conjecture [Kho].
Conjecture  (ugc [Kho]). For every > 0 and t > 2 exists R such that ULC(, t, R) is
NP-hard.
It must be noted that these conjectures and theorems are quite robust. In ULC we
could formulate the gap in terms of a fraction of satisfied constraints rather than
(weakly) satisfied right-vertices and the conjecture remains equivalent. We could
even allow the graph to be non-bipartite (Fig ). However, some assumptions on a
structure of the input graph have led to conjectures that are not yet known to be
equivalent to ugc. One such extension of ugc, which we will be relying on, is the
so-called Strong Unique Games Conjecture . It is a strengthening of ugc, where in
the (NO) case the graph G is also promised to be a ‘small set expander’; a precise
definition is given below.
 Originally, the name Strong ugc was used to refer to Conjecture  itself [KR], as it is described
 Gap problems

Problem  (ULC with weak expansion, ULCE(, R, δ))

Given a bipartite graph G = (U ∪· V , E), a set of labels [R], and for each e ∈ E a
permutation πe : [R] → [R] distinguish between two cases:

• (YES) There exists a subset V 0 ⊂ V of size |V 0 | > (1 − )|V | and a labelling l : U →

[R] that satisfies every vertex v ∈ V 0 .
• (NO) For every labelling l the number of weakly-satisfied right edges is smaller
than |V |. Moreover, for every S ⊂ V of size |S| = δ|V |, |Γ (S)| > (1 − δ)|V |, where
Γ (S) is a neighbourhood of the set S in the graph G.

Conjecture  (Strong ugc [BK]). For every > 0 and δ > 0 exists R ∈ N such that
ULCE(, R, δ) is NP-hard.

Another conjecture we base our results on is the Small Set Expansion Hypothesis
(sseh), which concerns the hardness of finding well-separated parts of a graph.
Formally, in a d-regular graph G = (V , E), an edge expansion of a subset S of vertices
is defined as
|E(S, V \ S)|
Φ(S) = ,
d · min {|S|, |V \ S|}
where E(S, V \ S) is a set of edges crossing the cut (S, V \ S). The small set expansion
problem parametrised by δ, η ∈ (0, 1) is defined as follows.

Problem  (SSE(δ, η))

Given a regular graph G = (V , E) distinguish between two cases:

• (YES) There exists S ⊂ V of size |S| = δ|V | such that Φ(S) 6 η.

• (NO) For every S ⊂ V of size |S| = δ|V |, Φ(S) > 1 − η.

The sseh, introduced by Raghavendra and Steurer [RS], hypothesises about hard-
ness of this problem.

Conjecture  (sseh). For every η > 0 exists δ = δ(η) ∈ (0, 21 ] such that SSE(δ, η) is
NP-hard.

It turns out, surprisingly, that sseh is also stronger than ugc [RS]. We will see
the intuition behind this fact in the next chapter. No direct relationship is however
known between Strong-ugc and sseh.

here. In this thesis we follow the nomenclature of [Bha+], which refers to Conjecture  as Strong
ugc
 The Minimum Maximal Matching
problem
An edge dominating set of an undirected graph is a subset D ⊂ E of edges, such that
every edge must either be in D or adjacent to some edge in D. A Minimum Edge
Dominating Set (EDS) problem asks to find such a set of smallest cardinality. An
independent edge dominating set, in which no two edges have a common vertex, is
also called a maximal matching—the domination criterion means that no edge may be
added to it while maintaining the independence. In a natural algorithmic question of
Minimum Maximal Matching (MMM) we are asked to find such an independent edge
dominating set of smallest cardinality.
Both these problems also have counterparts in graphs with weights on edges, wEDS
and wMMM, where we aim to find the lightest (independent) edge dominating set rather
than smallest.
In this chapter we are going to present four inapproximability results which
appeared in two papers [DLM; DMM], however for the first of this results we
will present a different (simpler) proof then the published one.

. Related work
The first traceable discussion of our problems goes back to , when it was shown
that every inclusion-wise minimal edge dominating set can be transformed into a
maximal matching of the same cardinality (see Fig.  (a)). The EDS and MMM problems
(in unweighted variants) are therefore equivalent [Gup].

Algorithms and Complexity A series of polynomial-time algorithms for MMM on

selected families of graphs have been developed: for trees [MH; YG], claw-free
chordal graphs, locally connected claw-free graphs, the line graphs of total graphs, the
line graphs of chordal graphs [HK], bipartite permutation graphs, co-triangulated
graphs [Sri+]. For another families of graphs the problem has been shown to be
 The Minimum Maximal Matching problem

v v





u u
s s

w w


(a) Every Edge Dominating Set can be transformed (b) The equivalence between MMM and
into a Maximal Matching of the same size: Let (u, v), EDS does not hold in the weighted
(u, w) be two adjacent EDS-edges. We find a neigh- graphs. The lightest maximum match-
bour s of w with no incident EDS-edge and replace ing weighs , while the Minimum
(u, w) with (w, s). If no such s exists, the edge (u, w) Edge Dominating Set weighs only .
can be dropped from the EDS.

Figure : The equivalence of MMM and EDS.

NP-hard: first by Yannakakis and Gavril on planar or bipartite graphs of maximum

degree  [YG]; then planar bipartite graphs, line and total graphs, perfect claw-free
graphs, and planar cubic graphs [HK], and many other.

Approximation algorithms By endpoint-counting, no matching can be more than

twice as large as any other maximal matching. This gives an immediate -approx-
imation algorithm for MMM: find any maximal matching. Attempts to beat this ap-
log |V |
proximation ratio resulted in 2 − c |V | [GLR] (where c is an arbitrary constant),
and 2 − χ01(G) (χ0 (G) is an edge-colouring number of G and is bounded by maximum
degree plus one) [MKI].
EDS was one of problems exemplifying Baker’s framework for constructing PTAS
on planar graphs [Bak]. Better-than-two constant approximation ratios has been
achieved for dense graphs [SV], as well as for cubic graphs [Bat].

Weighted variants The equivalence between EDS and MMM does not hold when
edges carry weights (see example in Fig.  (b)). All of the aforementioned approxi-
mation algorithms (with the exception of Baker’s method, but including the trivial
-approximation) do not apply for wEDS and wMMM. A series of papers by Fujito, Parekh
and others concluded in a -approximation algorithm for wEDS [Car+; FN; Par].
wEDS has been shown to be as hard to approximate as weighted Vertex Cover [FN;
. Related work 

Par], hence, realistically, no further progress should be expected. Surprisingly, wMMM

is impossible to approximate within any polynomially computable factor [Fuj] .

Negative results Chlebík and Chlebíkova proved that it is impossible to approx-

imate the unweighted variant of EDS (MMM) with any constant ratio better then 76
unless P = NP [CC]. That bound was later improved to . [Esc+]. Addition-
ally, a simple 32 -hardness conditional on Unique Games Conjecture were known by a
straightforward reduction from Vertex Cover.
We enter the story with our results, by showing that the Unique Games Conjecture
implies that the MMM cannot be approximated within a constant factor better then .

Theorem 
Assuming Unique Games Conjecture (Conjecture .), there is no polynomial-time
algorithm that approximates MMM (or equivalently EDS) on unweighted graphs within
factor 2 − ε for any constant ε > 0, unless P = NP.

A special case of MMM, particularly natural for matching experts is, when the input
graph is bipartite. We were not able to carry our negative bound directly to the
bipartite graphs, but we managed to obtain three ‘weaker’ results.

Theorem 
Assuming Unique Games Conjecture (Conjecture .), there is no polynomial-time
algorithm that approximates MMM (or equivalently EDS) on unweighted, bipartite graphs
within factor 43 − ε for any constant ε > 0, unless P = NP.

Theorem 
Assuming Strong Unique Games Conjecture (Conjecture .), there is no polynomial-
time algorithm that approximates MMM (or equivalently EDS) on unweighted, bipartite
graphs within factor 2 − ε for any constant ε > 0, unless P = NP.

Theorem 
Assuming Small Set Expansion Hypothesis (Conjecture .), there is no polynomial-time
algorithm that approximates MMM (or equivalently EDS) on unweighted, bipartite graphs
within factor 2 − ε for any constant ε > 0, unless P = NP.

The rest of this chapter will be devoted to presenting the proofs of these theorems.
A fact we were unaware of when working on our results. But it makes the unweighted MMM all the
more intriguing.
 The Minimum Maximal Matching problem

. Our method
Our proofs take advantage of a close relationship between the Independent Set
problem and MMM. In short, the set of vertices left unmatched by a maximal matching
must form an independent set. In [KR] it was proven that it is NP-hard (under
ugc) to distinguish (YES) graphs with independent set of size ( 12 − )|V | from (NO)
those with no independent set of size |V |. In the proof of Theorem  presented in our
paper [DLM] this reduction was modified by only adding edges, so that in the (YES)
case there was a perfect matching between vertices not included in the independent
set. This strategy aimed for simplicity, as typically, when proving correctness of
gap reductions, analysing the (NO) case is significantly more difficult than the (YES)
case. When we only add edges, no independent set can be created, so there is no
need to look at the (NO) part at all. In the proof of Theorem  in this thesis we do
the same trick, just starting from a different construction for Vertex Cover [BK],
which will have an advantage of producing unweighted graphs right away (in the
paper [DLM] we had to exploit the structure of the resulting weighted graph to get
rid of the weights).
In bipartite graphs, we are aiming to employ the same strategy. However, while
maximum independent sets in bipartite graphs can be found in polynomial time,
we notice that vertices left out by any maximal matching must form a balanced bi-
independent set—an independent set with the same number of vertices on both sides
of bipartition. We must hence look at existing hardness of approximation proofs for
balanced bi-independent set [Bha+; Man] and carefully add edges to them.

. General graphs
At the heart of many ugc-based reductions lies a dictatorship test. A dictatorship
is a function f : {0, 1}R → {0, 1} that depends only on one coordinate, so f (x) = xi or
R
f (x) = xi . Dictatorships are an error-correcting code—there are 22 functions, but
only 2R dictatorships. Importantly, there are ways to only query f on some inputs
and decide whether it is (similar to) a dictatorship, and decode it if it is. In an
inapproximability reduction we are given t functions f (1) , . . . , f (t) and must test if
they all are dictatorships of the same coordinate.
The test F,t coined by Bansal and Khot [BK] works as follows:

. Pick a random input x ∈ {0, 1}R ;

 These
results talk about Balanced Biclique which is more natural, but since bi-independent set fits
our needs more, we will silently take graph complement when recalling their reductions.
. General graphs 

. Pick a random S ⊂ [R] of size R. Define the sub-cubes:

Cx,S = z ∈ {0, 1}R | zj = xj ∀j < S ,
n o

Cx,S = z ∈ {0, 1}R | zj = xj ∀j < S ;

n o

. Accept if all f (l) -s are equal to some b on all inputs in Cx,S and b on all inputs in
Cx,S .

Completeness If all the f (l) -s are the dictatorship of the same coordinate i, this
coordinate will be included in the set S with probability . When i < S, the test will
pass.

Soundness For any function f : {0, 1}R → {0, 1} and any integer s 6 R one can deter-
mine the set Ls [f ] ⊂ [R] of s most low-degree influential coordinates . Bansal and Khot
prove the following
Theorem  ([BK, Theorem ., simplified])
For every , δ > 0 there are t, s ∈ N such, that if the test F,t accepts a collection f (1) , . . . , f (t)
of functions {0, 1}R → {0, 1} (for any R) with probability larger than δ, then there are two
of the tested functions f (l1 ) , f (l2 ) and a coordinate i for which
i ∈ Ls [f (l1 ) ] ∩ Ls [f (l2 ) ].

.. The vertex cover instance

We now use the test F,t to translate a Unique Label Cover instance G = (U ∪·
V , E, {πe }e∈E ) into a graph H. Intuitively, every vertex of H will correspond to the
V -vertex performing the dictatorship test on the colouring of its t neighbours. For
every such vertex there are 2(1−)R R R
possible sub-cubes Cx,S , and 2 possible values
of the bit b. Every edge will correspond to inconsistent runs of the test.

Reduction 
Input: A ULC(, t, R) instance G = (U ∪· V , E, {πe }e∈E ).
Output: An undirected graph H = (VH , EH ) with VH = V × 2(1−)R R R
× {0, 1}.
For any left-side vertex u ∈ U and its two (not necessarily distinct) neighbours

 We will not delve into how this set is defined, as that would require us to explore the topic of
Fourier analysis of boolean functions, which we managed to avoid in our proofs. Luckily, we do not
need that to understand how dictatorship test translates into reductions.
 The Minimum Maximal Matching problem

D E D E
v1 , v2 ∈ V we add an edge in H between v1 , Cx1 ,S1 , b1 and v2 , Cx2 ,S2 , b2 if

• b1 , b2 and exists x ∈ {0, 1}R such that x◦π(u,v1 ) ∈ Cx1 ,S1  and x◦π(u,v2 ) ∈ Cx2 ,S2 ;
or
• b1 = b2 and exists x ∈ {0, 1}R such that x ◦ π(u,v1 ) ∈ Cx1 ,S1 and x ◦ π(u,v2 ) ∈ Cx2 ,S2 .

Two properties are proved:

Lemma  (Completeness [BK, Section .]). If G was a (YES) instance of ULC, there
is an independent set of size ( 12 − 2)|VH | in H.
Proof. Take a labelling l : U → R, as in ULC. l naturally extends to the set V 0 of
satisfied right vertices. Define

I S = v, Cx,S , b | v ∈ V 0 , l(v) < S, xl(v) = b .

n o

n o
To see that this is indeed an independent set, define functions fu : x 7→ xl(u) .
u∈U
Every vertex in I S corresponds to an accepting run of F,t on t of these functions.
One can see, that when defining I S we could just as well have written xl(v) , b instead
n o
of xl(v) = b, which would have corresponded to fu : x 7→ xl(u) . There are in fact
u∈U
two independent sets of size (1 − 2)|VH | in the (YES) case. The freedom of picking
one of the two sets is what the ‘free bit’ from the title of [BK] refers to.
Lemma  (Soundness [BK, Section .]). If G was a (NO) instance of ULC, no inde-
pendent set of size |H| exists in H.
Proof (sketch). Assume contrapositive, that there is an independent set S in H of size
2δ|H|. We will use this assumption to colour the vertices of ULC instance G: Imagine
that for every u ∈ U there is a boolean function f (u) : {0, 1}R → {0, 1}. For each such
function we can identify the set Ls [f (u) ] and pick one of the s colours at random.
For v ∈ V , let
VH [v] = v 0 , Cx,S , b ∈ VH | v 0 = v .
n o

By Markov’s inequality, the set V 0 of right vertices v for which |VH [v] ∩ S| > δ|VH [v]|
has size at least 12 |V |. Take any v ∈ V 0 and its t neighbours, u1 , . . . , u t . Functions
f (u1 ) ◦ π(u1 ,v) , . . . , f (ut ) ◦ π(ut ,v) pass the dictatorship test F,t with probability exceeding
x ◦ π is the vector x with coordinates permuted by permutation π(u,v) . Since the coordinates
(u,v)
correspond to colours, we can think that this permutation allows us to look at x from the perspective
of vertex v.
. General graphs 

0
δ, so by Theorem  there are two neighbours w, w0 such that Ls [f (w) ] ∩ Ls [f (w ) ] , .
Consequently, our random colouring weakly satisfies at least s12 |V 0 | > s22 |V | right
vertices of G.
Since Independent Set and Vertex Cover are complements in any graph, the
reduction gives us that it is NP-hard to distinguish between graphs with Minimum
Vertex Cover of size ( 12 + )|VH | and (1 − )|VH | for any .

.. Supplementing H with edges

We wish to exhibit a similar gap for MMM:
Problem  (Gap-MMM)
Given an undirected graph H = (VH , EH ) distinguish between two cases:
• (YES) There is a maximal matching of size 21 ( 12 − )|VH |.
• (NO) There is no independent set of size |VH |.
In the (NO) case, when there are no independent sets of size |VH |, every maximal
matching must match almost all of the vertices.
To show that H is an instance of Gap-MMM, in the (YES) case we would need to find
a matching between vertices of VH \ I S. This is however impossible, because—as we
have said—inside VH \ I S there is another independent set of size |I S|. We thus need
to amend our reduction and break this symmetry.

Reduction 
Input: A ULC(, t, R) instance G = (U ∪· V , E, {πe }e∈E ).
Output: An undirected graph H 0 = (VH 0 , EH 0 ) with VH 0 = V × 2(1−)R R R
× {0, 1}.
For any left-side vertex u ∈ U and its two (not necessarily distinct) neighbours
v1 , v2 ∈ V we add an edge in H 0 between v1 , Cx1 ,S1 , b1 and v2 , Cx2 ,S2 , b2 if exist
D E D E

y1 , y2 ∈ {0, 1}R such that

• b1 ⊕ y1 (i) 6 b2 ⊕ y2 (i) for all i < S1 ∪ S2  ; and

• b1 ⊕ (y1 ◦ π(u,v1 ) ) ∈ Cx1 ,S1 and b2 ⊕ (y2 ◦ π(u,v2 ) ) ∈ Cx2 ,S2 .

We only added edges to the graph, compared to Reduction , so Lemma  is (even

more) true for H 0 . We need to prove that in the (YES) case a large independent set
still exists.
 This is a slight modification of Reduction , where this condition could be written as b1 ⊕ y1 (i) =
b2 ⊕ y2 (i) for all i < S1 ∪ S2 . In our reduction we replace = with 6.
 The Minimum Maximal Matching problem

Lemma . If G was a (YES) instance of ULC, H 0 has an independent set of size ( 12 −
2)|VH 0 |.

Proof. Take I S as defined in Lemma . For any v, Cx,S , b ∈ I S and (u, v) ∈ E, if

we have a vector y ∈ {0, 1}R such that b ⊕ (y ◦ π(u,v) ) ∈ Cx,S then y(l(u)) = 0 (since
x(π(u,v) (l(u))) = b). As 0 = 1 0, no edge exists in EH 0 between two vertices in I S.

We would not have been able to prove the above lemma for the alternative definition
of I S in Lemma , because 1 = 0 6 1. In fact, the symmetry is broken—there are now
many internal edges in VH 0 \ I S.

Lemma . There is a perfect matching in VH 0 \ I S.

Proof. We consider three groups of vertices v, Cx,S , b ∈ VH 0 \ I S.

(G) v ∈ V 0 , l(v) < S, xl(v) , b: We pair such a vertex with v, Cx0 ,S , b , where

xi = b if i = l(v),

0
xi = 

xi
 otherwise.

This edge exists in EH 0 because 1 = 0 6 1.

(G) v ∈ V 0 , l(v) ∈ S: We pair such a vertex with v, Cx,S , b .
(G) v < V 0 : We also pair such a vertex with v, Cx,S , b .

This concludes the analysis of the (YES) case and together with Lemma  hands us
that Gap-MMM is NP-hard (assuming ugc), which proves Theorem .

. Bipartite graphs I: Bipartisation and path covers

In this section we will show that:

Lemma . Assuming the Unique Games Conjecture, for any > 0 it is NP-hard to
distinguish between balanced bipartite graphs of 2n vertices:

• (YES) with a maximal matching of size n( 12 − ).

• (NO) with no maximal matching of size smaller then n( 23 − ).
. Bipartite graphs II: From Strong ugc to MMM 

Theorem  is a direct consequence of this lemma.

Our reduction is going to be very predictable—we start with graph G-an instance
of Gap-MMM and produce two copies of each vertex v: v (l) and v (r) . The vertices u (l)
and v (r) will be connected in the resulting graph H whenever (u, v) ∈ E(G). It is easy
to see that for every maximal matching M ⊂ E(G) there is a maximal matching in H
of size 2|M|—we take two edges corresponding to each e ∈ M.

.. Covering with paths

In order to analyse the (NO) case we need an alternative interpretation of graph
H and its matchings from the perspective of G. Let us think that the graph G is
bi-directed and an edge between u (l) and v (r) corresponds to an arc (u, v) in G. For
any matching M in H, the corresponding set of arcs in G forms directed paths and
cycles (no vertex has more than one outgoing and one incoming arc). For a matching
M ⊂ E(H), P (M) will be the set of these paths and cycles in G.
Observation. If M is a maximal matching in H then every p ∈ P (M) has length |p| > 2.
Proof. Take any p = (u, v) ∈ P (M) of length 1. Vertices v (l) and u (r) are unmatched in
M and are connected by an edge in H (corresponding to the reverse arc (v, u)) which
contradicts maximality of M.
This observation will help us tie maximal matchings in H and vertex covers in G.

Lemma . For any maximal matching M in H, exists a vertex cover C in G of size
|C| 6 23 |M|.

Proof. For every p ∈ P (M) we add V (p) to C. When p is a cycle, we add |p| vertices.
For a path we add at least 32 |p| (the longer the path, the higher the ratio). V (M) is a
vertex cover of H, so C = V (P (M)) is a vertex cover in G.
In the (NO) case graph G had no vertex cover smaller than (1 − )|V (G)|, so there is
no maximal matching of size smaller then |V (G)|(1 − ) 32 .

. Bipartite graphs II: From Strong ugc to MMM

In this section we will review the reduction from Unique Label Cover with Weak
Expansion to Bi-Independent Set problem, that was presented in [Bha+]. Then
we will present our modification and prove its correctness. The main building block
of their reduction is the graph T that we are going to use as a gadget. It will be
composed of identical layers and parametrised by a prime number q > 2 (see Fig ).
 The Minimum Maximal Matching problem

bq/c bq/c −     q− q− bq/c +  bq/c + 

Figure : The gadget Tq for a prime number q > 2 (blue) is an undirected -regular
graph (counting edges, not endpoints). The modified gadget Tq0 is obtained by adding
directed arcs from i to q − i for i < Loq (red).

Definition  (Tq ). The vertex set is Zq = {0, 1, . . . , q − 1}. There are three kinds of
(undirected) edges.

• {i, i + 1 (mod q)} for i , bq/2c.

• A self-loop {i, i} for i , 0 and an additional one for bq/2c + 1.
• An edge {0, bq/2c}.

The resulting graph is -regular and connected.

Let us additionally define the sets Hiq = {bq/2c + 1, . . . , q − 1} and Loq = {1, . . . , bq/2c}.
0 belongs to neither. As one can see, there are no edges in Tq between Hiq and Loq , so
they form a balanced bi-independent set (even though Tq is not bipartite). The graph
T = Tq⊗R has the vertex set ZRq (so every vertex is an R-element vector of numbers
from Zq , every coordinate corresponds to one label colour in Unique Label Cover)
and two vertices x and y are connected, if xi and yi are connected in Tq for every
coordinate i ∈ R. The reduction from [Bha+] is now defined as follows.

Reduction 
Input: An ULCE(, R, δ) instance G = UG ∪· VG , EG , {πe }e∈E .
Output: A bipartite graph H = hUH ∪· VH , EH i.
Let UH = VH = VG × ZR q , so every vertex in H corresponds to one vertex in
VG and R numbers in Zq . For any u ∈ UG and its two neighbours v1 , v2 in G, we
−1
add an edge between (v1 , x) ∈ UH and (v2 , y) ∈ VH in EH whenever x ◦ π(u,v ) is 1
−1
connected to y ◦ π(u,v in T (x and y’s coordinates are permuted to the point of
2)
view of vertex u).
. Bipartite graphs II: From Strong ugc to MMM 

It is easy to see that if the input graph came from the (YES) case, then there is a
bi-independent set of size ( 12 − − 1q )|UH ∪· VH | in H. To find it pick the labelling l
and the set V 0 ⊂ VG of vertices fully satisfied by the labelling l. For any v ∈ V 0 and x
such that x(l(v)) ∈ Loq pick (v, x) into the left side of the bi-independent set. Similarly,
pick (v, y) ∈ VH if y(l(v)) ∈ Hiq . The reader can see, that in fact two symmetrical
bi-independent sets exist in H. The (NO) case is analysed in the original paper.

Lemma  (soundness, [Bha+, Lemma  reinterpreted]). For every constant γ,

there exist constants , δ such that if G came from the NO case of ULCE(, R, δ) then H has
no balanced bi-independent set larger than |UH ∪· VH |.

Now it is time to introduce our modification. Let us define a new gadget Tq0 by
adding directed arcs {(i, q − i)}i<Loq (see Fig. ). The undirected edge between two
vertices is replaced by a pair of two directed arcs, the self-loop is replaced by a single
arc. Consequently, the graph T 0 = Tq0 ⊗R is also directed. Our reduction is similar to
the one above.

Reduction 
Input: An ULCE(, R, δ) instance DG = UG ∪· VGE, EG , {πe }e∈E .
0
Output: A bipartite graph H 0 = UH ∪· VH , EH .
R
Let UH = VH = VG × Zq , like before. For any u ∈ UG and its two neighbours
v1 , v2 in G, we add an edge between (v1 , x) ∈ UH and (v2 , y) ∈ VH in EH whenever
−1 −1 0
there is an arc from x ◦ π(u,v ) to y ◦ π(u,v ) in T .
1 2

Lemma  (completeness). If the graph G comes from the (YES) case of ULCE, there is a
bi-independent set BI S ⊂ UH ∪· VH of size |BI S| > ( 12 − − 1q )|UH ∪· VH | in H 0 . Moreover,
there is a perfect matching in H 0 between UH \ BI S and VH \ BI S.

Proof. Take l and V 0 as in Problem .. For every v ∈ V 0 and x ∈ ZR q such that
x(l(v)) ∈ Loq , we pick the vertex (v, x) ∈ UH into the set BI S. Similarly, for every
v ∈ V 0 and y with y(l(v)) ∈ Hiq , we pick (v, y) ∈ VH into BI S. To see that BI S is
indeed a bi-independent set pick any (v1 , x) ∈ UH ∩ BI S and (v2 , y) ∈ VH ∩ BI S. Since
v1 , v2 ∈ V 0 , for every u in their common neighbourhood in G, π(u,v1 ) (l(u)) = l(v1 )
and π(u,v2 ) (l(u)) = l(v2 ). There is no arc from x(l(v1 )) ∈ Loq to y(l(v2 )) ∈ Hiq in Tq0 , so
there is no edge between (v1 , x) ∈ UH and (v2 , y) ∈ VH . The size of BI S is |BI S| >
bq/2c R
(1 − )|VG | q q · 2 > 12 − − 1q |UH ∪· VH |.
Now we define a perfect matching M between the vertices not in BI S. First, for
every v ∈ V 0 we will match the vertex (v, x) ∈ UH where x(l(v)) < Loq with (v, x0 ) ∈ VH
 The Minimum Maximal Matching problem

where 
0 x(i)
 when i , l(v),
x (i) = 

q − x(i) when i = l(v).


For any vertex v ∈ VG \ V 0 let M ((v, x)) = (v, x). This edge exists since there was a loop
(i, i) in Tq0 for every i ∈ Zq .

With Lemmas  and  we can conclude.

Corollary. Assuming Strong Unique Games Conjecture, for any > 0, it is NP-hard given
a bipartite graph G = (U ∪· V , E) (with |U | = |V |) to distinguish between two cases.

• (YES) There is a maximal matching in G of size ( 12 − )|U |.

• (NO) There is no bi-independent set of size |U ∪· V |, so every maximal matching
must have at least (1 − )|U | edges.

which gives us Theorem , as the (2 − )-approximation algorithm faced with a (YES)

instance would produce a solution of size at most (1 − 2)|U | unequivocally qualifying
it as a (YES) case.

. Bipartite graphs III: Hardness based on sseh

Our third reduction starts from the Small Set Expansion problem. A modification of
the hardness of approximation proof of Maximum Bi-Independent Set from [Man],
it is more complicated than the one presented in Section . The reduction is two-step,
with an intermediate problem called Max UnCut Hypergraph Bisection (MUCHB)
defined below.

Problem  (Max UnCut Hypergraph Bisection)

Given a hypergraph H = (VH , EH ) decide between cases:

• (YES) There is a bisection (VH0 , VH1 ) of VH s.t. |EH (VH0 )|, |EH (VH1 )| > ( 21 − )|EH |.
|VH |
• (NO) For every set T ⊂ VH of size |T | 6 2 , |EH (T )| 6 |EH |.

We are only modifying the second step, but we need to describe the first one as well,
as we are relying on the structure of the solution in the (YES) case. Before we are able
to do that, let us give an overview of the reduction, to explain the origin of its various
components.
 (A, B) |X|
is a bisection of X if |A| = |B| = 2 and A ∪ B = X.
. Bipartite graphs III: Hardness based on sseh 

.. Reducing from sseh to ugc

In order to understand the reduction it makes sense to first see an outline of the proof
that sseh is strictly stronger than ugc, which was presented in [RS]. This will help
us understand the reduction of Manurangsi in the next subsection.
We begin with a graph G = (V , E) — an instance of the Small Set Expansion
problem. We will produce a bipartite graph H1 = (U1 ∪· V1 , E1 , {πe }e∈E1 ) which shall
constitute an instance of Unique Label Cover problem.

Reduction  (sseh to ugc: First attempt)

Set R = 1δ (δ is a parameter from SSE). Let U1 = V1 = V R , so every vertex in H1
corresponds toa vector of
R vertices from G.
u1 v1
For u = .. and v = .. we add (u, v) to E1 if there is a permutation π such
. .
uR vR
that π(u) and v are neighbours in G⊗R . We set π(u,v) = π.

The idea behind this reduction is that if there is a well-separated set S ⊂ V of size δ|V |
in G, then a vector of 1δ vertices should in expectation have exactly one vertex in S. We
call a vector v ∈ V R S-singular if |{v1 , . . . , vR }∩S| = 1. It is clear that if u ∈ U1 and v ∈ V1
are both S-singular and (u, v) ∈ E1 then the constraint π(u,v) is satisfied (see Fig. ).
Unfortunately, only 1e -fraction of all the vectors will be S-singular, so Reduction  is
clearly a failed attempt, as the easy part—completeness—is not working.
There is a simple way to overcome this problem. If we pick a large constant k and
sample k distinct R-element vectors (blocks) of vertices, it is very likely that at least
one of these vectors will be S-singular. This idea stands behind the second attempt at
reducing sseh to ugc.

Reduction  (sseh to ugc: Second attempt)

Let R = 1δ and k be a constant depending on we wish to achieve in our ugc. Let
H2 = (U2 ∪· V2 , E2 , {πe }e∈E2 ) where U2 = V2 = V R×k (every vertex in H2 corresponds
to k length-R
" vectors # of vertices
" of G).#
u11 ··· u1k v11 ··· v1k
For u = 1 ··· k and v = 1 ··· k we add (u, v) to E2 if there are k permuta-
uR ··· uR vR ··· vR
1
tions π , . . . , πk such that π (u1 ) and
1 v1 are neighbours in G⊗R , and π2 (u2 ) and
v2 are neighbours in and so G⊗R on . We will set π(u,v) = π1,...,k = π1 |π2 | . . . |πk
(concatenation of k permutations).

 Equivalently we may write that π1,...,k (u) and v are neighbours in G⊗(R×k) .
 The Minimum Maximal Matching problem

u v u v

(a) (b)

Figure : In Reduction  (a), a vertex of H1 corresponds to a block (vector) of R

vertices of G. Two blocks are connected if one can be permuted so that the blocks are
point-wise connected in G. Here both are S-singular and neither of the connecting
edges is cut by S, so the colouring lS with lS (u) = 4 and lS (v) = 6 satisfies the constraint.
In a refined Reduction , R pairs connected by the permutation are not connected
in G (dotted line). In Reduction  (b), a vertex of H2 corresponds to k blocks of R
vertices of G each. In u second and fourth block are S-singular, in v first and fourth.
Colouring lS with lS (u) = (4, 1), lS (v) = (4, 5) satisfies the constraint.

The reader may verify that the completeness of this reduction is satisfied. However,
it turns out that the reduction presented above does not yet provide the desired
soundness, and certain ‘noises’ have to be added.
For simplicity we describe them as a modification of the Reduction . Combin-
ing such a modified version of the reduction into blocks gives a correct instance of
ULC.

Reduction  (sseh to ugc: First attempt + random noises, [Ste, Reduc-

tion .])
Let R = 1δ . Let U1 = V1 = V R , so every vertex in H1 corresponds to a vector of R
vertices from u1G.
v1
For u = .. and v = ... we add (u, v) to E1 if there is set T ⊂ [R] of size R,
.
uR vR
and a permutation π such that (ui , vπ(i) ) ∈ E(G) whenever i < T . We set π(u,v) = π.

Every edge in E1 would now correspond to an edge in G⊗R , if not for the sets T in the
reduction. Alternatively, sets T could be generated by picking every coordinate with
probability . That would result in an edge-weighted instance of ULC—the weight of
an edge equal to the probability of sampling the set T associated with it.
. Bipartite graphs III: Hardness based on sseh 

.. Max Uncut Hypergraph Bisection

The first phase of the reduction was described in [Man]. It begins with a graph
G = (V , E) and produces a weighted hypergraph H = (VH , EH , wH ), an instance of
MUCHB. We will present two equivalent formulations of this reduction—one will
be combinatorial (easier to work with), the other (original) is probabilistic and we
present it for the sake of completeness (as it defines the weight function and may be
more intuitive).

Reduction  (combinatorial)
Let R = Θ( 1δ ), k, l be constants. Let also Ω = {0, 1, ⊥}. The hypergraph is built as
follows:

• The set of vertices VH = V R×k × ΩR×k .

• Every e ∈ EH is associated with a tuple B1 , . . . , Bl , x, D where B1 , . . . , Bl ∈ V R×k ,
D E

x ∈ ΩR×k , D ⊆ [R] × [k].

• A hypervertex v = (Av , xv ) belongs to the hyperedge e = B1e , . . . , Ble , xe , De if
D E

there is a permutation π ∈ (SR )k such that:

p
(i) ∃p s.t. Av (i) = Be (π(i)) for all i where xv (i) ,⊥.
(ii) xv (i) = xe (π(i)) for all i such that π(i) < De .

We will denote as πv,e the permutation that is used to show that v ∈ e.  The
weight wH (e) of an edge is defined by a probabilistic process described below.

To understand how this construction is obtained, imagine first, how we might try to
reduce ugc to MUCHB. The same dictatorship test as in Section  will be used, only in
a slightly different way. Every hyperedge now corresponds to an accepting run of the
dictatorship test. It contains all consistent inputs to the test.

Reduction  (An incorrect reduction from ULC to MUCHB)

Input: A ULC(, t, R) instance G = (U ∪· V , E, {πe }e∈E ).
Output: A hypergraph H = (VH , EH ) with VH = U × {0, 1}R . For v ∈ V , its l
neighbours u 1 , . . . , u l , a vector x ∈ {0, 1}R and S ⊂ [R] oof size R we compose a
hyperedge eu 1 ,...,u l ,v,x,S = (u p , x0 ) | p ∈ [l], x ◦ π(u,v) ∈ Cx,S .
n

 This may be not unique. The choice of a permutation is arbitrary.

 The Minimum Maximal Matching problem

Clearly, this reduction has the desired completeness. The soundness is not satisfied,
as our ULC instance could potentially be composed of two unconnected copies of a
(NO) instance. It would still be a (NO) instance, but a resulting hypergraph would
have a perfect bisection.
Instead our hypergraph H from Reductions  and  is constructed by running a
similar operation on an instance created in the previous subsection. To describe the
probabilistic construction we need to introduce some notation.

• Let εV be a small constant. For each A ∈ V R×k , TV (A) denotes a distribution

on V R×k where (i, j)-th coordinate is set to Ai,j with probability 1 − εV and a
uniformly random vertex of V otherwise.
• For A ∈ V R×k , G⊗(R×k) (A) denotes a distribution on V R×k where (i, j)-th coordinate
is a random neighbour of Ai,j in G.
• Let β be a small constant. Ωβ = {0, 1, ⊥}β denotes a distribution over Ω = {0, 1, ⊥}
where 0 and 1 are sampled with probability β/2 and ⊥ is sampled with probabil-
ity 1 − β.
• Let εT be a small constant. SεT (R) is a distribution over subsets of [R] of size εT R.
• For x ∈ ΩR×k and A ∈ V R×k let Mx (A) = A0 ∈ V R×k | ∀i ∈ [R × k].xi =⊥ ∨Ai = A0i .
n o

0 | x0
n o
• For a set D ⊆ [R] × [k] and x ∈ ΩR×k
β let C D (x) = x ([R]×[k])\D = x([R]×[k])\D .

Reduction  (probabilistic)
. Sample A z V R×k and Ã1 , . . . , Ãl z TV (A).
. Sample B1 z G⊗(R×k) (Ã1 ), . . . , Bl z G⊗(R×k) (Ãl ) and B̃1 z TV (B1 ), . . . , B̃l z TV (Bl ).
. Sample x z ΩR×k β and D z SεT (Rk).

. Output a hyperedge (π(B0 ), π(x0 )) | ∃p ∈ [l], π = π1 | . . . |πk ∈ (SR )k , x0 ∈ CD (x),

B0 ∈ Mx0 (B̃p ) .
o

Remark. Reductions  and  produce a weighted hypergraph (with weights assigned

to hyperedges). By copying the hyperedges with quantities proportional to their
weights we can make the instance unweighted as defined in Problem .
Manurangsi [Man] shows that for every ε > 0 there exist constants R, k, l, β, εV ,
εT such that the following theorems hold:
. Bipartite graphs III: Hardness based on sseh 

Theorem  ([Man, Lemma ])

If G came from a (YES) case of sseh, then there is a bisection (VH0 , VH1 ) of VH such that
|EH (VH0 )|, |EH (VH1 )| > (1/2 − )|EH |.
Theorem  ([Man, Lemma ])
|VH |
If G came from a (NO) case of sseh, then every subset T ⊂ VH with |T | 6 2 has
|EH (T )| 6 |EH |.
Henceforth, we will assume that we use the same parameters R, k, l, β, εV , εT as
in [Man].

Completeness—The bisection
We present the construction of sets VH0 and VH1 from Theorem  (in the (YES) case).
They will be important in our further proceedings. The following additional notation
will be used:
• For a vector A ∈ V R×k let A1 , . . . , Ak ∈ V R be its blocks. Let W (A, x, j) denote the
j
set of all coordinates i in j-th block such that xi ,⊥ and Ai ∈ S, i.e., W (A, x, j) =
j
n o
i ∈ [R] | Ai ∈ S ∧ xi ,⊥ .
• Let j ∗ (A, x) ∈ [k] denote the first x-filtered S-singular block, i.e., smallest j with
|W (A, x, j)| = 1. Note that if such block does not exist, we set j ∗ (A, x) = −1.
• Let i ∗ (A, x) ∈ [R] × [k] be the only element in W (A, x, j ∗ (A, x)). If j ∗ (A, x) = −1, let
i ∗ (A, x) = −1. For a hypervertex v = (Av , xv ) we will sometimes write i ∗ (v) instead
of i ∗ (Av , xv ).
We will split all hypervertices into three sets: T0 , T1 , T⊥  . Let T⊥ be the set of hyper-
vertices (A, x), such that j ∗ (A, x) = −1. Then, for every hypervertex (A, x) not in T⊥ we
look at the value b = xi ∗ (A,x) and include it in T0 if b = 0 and in T1 if b = 1. It is shown
that |EH (T0 )|, |EH (T1 )| > ( 12 − )|EH |.

Not-disjointness relation between hypergraph edges

Following from the definition of the hypergraph (as in Reduction ), we have the
following property.
Property . Two hyperedges e1 = B11 , . . . , Bl1 , x1 , D1 and e2 = B12 , . . . , Bl2 , x2 , D2 of the
D E D E

hypergraph are not disjoint if there exist two permutations π1 , π2 ∈ (SR )k satisfying the
following conditions
T and T1 are called T00 and T10 respectively in [Man].
0
 The Minimum Maximal Matching problem

() x1 (π1 (i)) = x2 (π2 (i)) for i such that π1 (i) < D1 ∧ π2 (i) < D2 ;
p p
() ∃p1 , p2 ∈ [l] such that B11 (π1 (i)) = B22 (π2 (i)) for all i satisfying

(i) xj (πj (i)) =⊥⇒ πj (i) ∈ Dj for j ∈ {1, 2}, and

(ii) π1 (i) < D1 ∨ π2 (i) < D2 .

We will denote the not-disjointness relation by Λ. So (e1 , e2 ) ∈ Λ iff exists a hypergraph

vertex v ∈ e1 ∩ e2 . Similarly, we define our relation ∆ ⊂ EH × EH . Its definition is
similar to the conditions of not-disjointness, but it is not symmetrical. It is also a
superset of Λ.

Definition . Two hyperedges e1 = B11 , . . . , Bl1 , x1 , D1 and e2 = B12 , . . . , Bl2 , x2 , D2

D E D E

of the hypergraph are in relation ∆ if there exist two permutations π1 , π2 ∈ (SR )k

satisfying the following conditions

() x1 (π1 (i)) 4 x2 (π2 (i)) for i such that π1 (i) < D1 ∧ π2 (i) < D2 ;
p p
() ∃p1 , p2 ∈ [l] such that B11 (π1 (i)) = B22 (π2 (i)) for all i satisfying

(i) xj (πj (i)) =⊥⇒ πj (i) ∈ Dj for j ∈ {1, 2}, and

(ii) π1 (i) < D1 ∨ π2 (i) < D2 .

x1 4 x2 is a point-wise inequality on vectors of Ω, where a 4 a for all a ∈ Ω and 0 4 1.

.. From MUCHB to Bipartite MMM

In [Man] the bipartite graph G BI S = (VL ∪ VR , E BI S ) is created the following way.

• VL , VR = EH .
• el ∈ VL and er ∈ VR are connected if they are not disjoint (i.e. (el , er ) ∈ Λ).

We are applying a modification to this construction. Instead of Λ we are basing the

graph on the relation ∆ which will result in a superset of edges of E BI S .

Reduction  (MUCHB to MMM)

The bipartite graph G MMM = (VL ∪ VR , E MMM ) is built as follows.

• VL , VR = EH .
• el ∈ VL and er ∈ VR are connected in E MMM if (el , er ) ∈ ∆.
. Bipartite graphs III: Hardness based on sseh 

Completeness
In the (YES) case we need to find a maximal matching that leaves roughly half of the
vertices of G MMM unmatched. On the next few pages we will be constructing the
matching, as promised in the following lemma.
Lemma  (Completeness). If G came from the (YES) case of sseh, then in G MMM there
is a maximal matching of size at most ( 12 + )|VL |.
Our matching M will be composed of two parts. In M1 , VL ∩ EH (T0 ) will be matched
with VR ∩ EH (T1 ) while VL ∩ EH (T1 ) and VR ∩ EH (T0 ) will form a bi-independent set
(this is why we needed to build the graph on an asymmetric relation). This will be
the larger part, of size ( 12 − )|VL |. In M2 , the vertices in VL \ (EH (T0 ) ∪ EH (T1 )) and
VR \ (EH (T0 ) ∪ EH (T1 )) will be paired.

Constructing M1 . The first part of our matching will be based on a bijection be-
1
D E
l
tween EH (T0 ) and EH (T1 ). For a hyperedge t = B , . . . , B , x, D we define its witness
indexes W (t) = i ∈ [R] × [k] | ∃p ∈ [l], x0 ∈ CD (x), i = i ∗ (Bp , x0 ) . Now we construct a
n o

bijection φ : EH (T0 ) → EH (T1 ). Let t0 = B1 , . . . , Bl , x0 , D be any hyperedge in EH (T0 ).

D E

Then let a vector x1 be as follows.


1 if i ∈ W (t0 ),

x1 (i) =  ()

x0 (i) otherwise.


We set φ(t0 ) B t1 , where t1 = B1 , . . . , Bl , x1 , D . We verify the validity of this definition

D E

in the next lemmas. In Lemma  we prove that φ is a properly defined function,

meaning that t1 ∈ EH (T1 ). Then in Lemma  we prove that φ is a bijection. To prove
these Lemmas, we first need the following facts, explaining why W (t) are the witness
indexes.
Lemma . Let W 0 (t) be constructed in the following way. Consider any hypervertex v in
t. For every such v we include πv,t (i ∗ (v)) in W 0 (t). Then W (t) = W 0 (t).
Proof. We will first show that W (t) ⊆ W 0 (t). Consider any i ∈ W (t). Let p ∈ [l], x0 ∈
CD (x) be such that i = i ∗ (Bp , x0 ). Then (Bp , x0 ) ∈ t, so i ∈ W 0 (t).
Next, we will show that W 0 (t) ⊆ W (t). Let v = (Av , xv ) be any vertex in t. Let p
p
be such that Av (i) = Be (πv,t (i)) for all i where xv (i) ,⊥. We consider another vertex
v 0 = (πv,t (Av ), πv,t (xv )), which also is in t. π permutes each block in Av separately, so
it preserves value i ∗ , that is i ∗ (v 0 ) = πv,t (i ∗ (v)). Now we claim that i ∗ (v 0 ) = i ∗ (Bp , x). It
is so, because if Av (πv,t (j)) , Bp (j) then x(πv,t (j)) =⊥.
 The Minimum Maximal Matching problem

Lemma . Let t0 be any hyperedge in EH (T0 ). Then D0 ∩ W (t0 ) = .

Proof. Assume ∃i ∈ D0 ∩ W (t0 ). Let v = (A, x) ∈ t0 be such that πv,t0 (i ∗ (v)) = i. Define
x0 (i) = 1 and x0 (j) = x(j) for all j , i. The vertex v 0 = (A, x0 ) is still in t0 , because i ∈ D0 .
v 0 is however also in T1 because x0 (i ∗ (v 0 )) = 1, which contradicts t0 ∈ EH (T0 ).

Lemma . For any t = B1 , . . . , Bl , x, D ∈ EH (T0 ), for any i ∈ W (t), x(i) = 0.

D E

Proof. For any p ∈ [l] and x0 ∈ CD (x), a vertex v = (Bp , x0 ) is in t, and, by consequence,
in T0 . Thus, x0 (i ∗ (Bp , x0 )) = 0. As i ∗ (v) < D, x(i ∗ (Bp , x0 )) = 0 as well.

Lemma . Let t0 be any hyperedge in EH (T0 ) and t1 = φ(t0 ). Let v = (A, x) be any
hypervertex in t1 . Then v ∈ T1 .

Proof. πv,t1 (i ∗ (A, x)) ∈ W (t1 ), so by () we have x1 (πv,t1 (i ∗ (A, x))) = 1. πv,t1 (i ∗ (A, x)) <
D0 , so x(i ∗ (A, x)) = x1 (πv,t1 (i ∗ (A, x))) = 1. Thus, v ∈ T1 .

Lemma . Function φ is a bijection.

Proof. Observe that for any hyperedge t0 ∈ EH (T0 ) we have W (t0 ) = W (φ(t0 )). This
follows from the fact that i ∗ ( · , x) does not distinguish between zeros and ones in x.
Now, it is easy to construct the inverse function to φ. φ−1 will be almost the same
as φ, except that in the vector x we will set x(i) to be 0 instead of 1. Thanks to the
observation, φ−1 changes the same coordinates in x as φ, so it is an inverse of φ.

The
n matching between VL ∩ o EH (T0 ) and VR ∩ EH (T1 ) uses φ in a natural way: M1 =
(t0 , φ(t0 )) | t0 ∈ VL ∩ EH (T0 ) . We need to show that all the edges of this matching are
in graph G MMM .

Lemma . Let t0 , t1 be any pair such that φ(t0 ) = t1 . Then (t0 , t1 ) ∈ ∆.

Proof. We choose both π1 , π2 to be the identity permutation. Both conditions of

Definition  are satisfied:

() is satisfied, as for all i either x0 (i) = x1 (i) or x0 (i) = 0 and x1 (i) = 1, which is
exactly the definition of x0 (i) 4 x1 (i).
() is satisfied, because B10 = B11 .
. Bipartite graphs III: Hardness based on sseh 

x     

=
x0     

x =     

I D D
Figure : Vectors from the proof of Lemma . x1 and x0 are identical everywhere
apart from positions in D0 ∪ D1 ∪ I. x00 is constructed from x0 by replacing ones with
noughts at indices of I. The vector xv 0 of v 0 ∈ t1 ∩ t00 must agree with x1 everywhere
except D1 , and with x00 everywhere except D0 .

Constructing M2 . As we said earlier, the vertices in VL \ (EH (T0 ) ∪ EH (T1 )) and

VR \ (EH (T0 ) ∪ EH (T1 )) need ton be taken care of. We are going to match them according
to identity function: M2 = (el , er ) | el ∈ VL \ (EH (T0 ) ∪ EH (T1 )), eR ∈ VR \ (EH (T0 ) ∪
o
EH (T1 )) st. el = er (every vertex is matched with its own copy). These pairs are in
E MMM as the relation ∆ is reflexive . We define our matching M = M1 ∪ M2 .

Maximality of M. Now, all vertices left unmatched by M are either in VL ∩ EH (T1 )

or VR ∩ EH (T0 ). In the following lemma we prove that they form a bi-independent
set, and therefore our matching is maximal.

Lemma . For every t1 ∈ EH (T1 ) and t0 ∈ EH (T0 ), (t1 , t0 ) < ∆.

Proof. Assume that there is t1 = B11 , . . . , Bl1 , x1 , D1 ∈ EH (T1 ) and t0 = B10 , . . . , Bl0 , x0 , D0
D E D E

such that (t1 , t0 ) ∈ ∆. We will show that there exists a hypervertex v ∈ t0 ∩ T1 , and
consequently t0 < EH (T0 ).
The relations ∆ and Λ are very similar. If (t1 , t0 ) ∈ Λ (t1 and t0 are not disjoint),
we have that t0 ∩ T1 , right away. Otherwise, from the definition of ∆ we have that
there are two permutations π1 , π0 such that ∀i < D1 ∪ D0 . x1 (π1 (i)) 4 x0 (π0 (i)) and the
inequality is strict for at least one i (since (t1 , t0 ) < Λ), we shall denote the set of these
 Infact, we did not need to define M2 to prove the completeness: Since M1 matches almost half
of vertices from each side, and (as we are proving in Lemma ) another almost half forms a bi-
independent set, extending M1 greedily could extend it by at most |EH | edges. That would give us the
Lemma  with the matching size (1 + 2)|VL |.
 The Minimum Maximal Matching problem

indices with strict inequality as I (see Fig. ). By permuting the indices in t1 and t0
we will assume that π1 Dand π0 are bothEthe identity permutations. Let us consider
another hyperedge t00 = B10 , . . . , Bl0 , x00 , D0 , where x00 is defined as follows:

0 if i ∈ I,

x00 (i) =  ()

x0 (i) otherwise.


Clearly (t1 , t00 ) ∈ Λ (by Property ), which means that t00 < EH (T0 ). Let v 0 = (Av 0 , xv 0 )
be such that v 0 ∈ t00 ∩ T1 (again, we will assume that πv 0 ,t00 is the identity permutation).
We will change v 0 slightly, to ‘reverse’ changes made in (). xv will be defined as
follows: 
1 if i ∈ I,

xv (i) =  ()

xv 0 (i) otherwise.


Now, a hypervertex v = (Av 0 , xv ) belongs to t0 (with the witness permutation πv,t0 =

π0 ◦ πv 0 ,t00 ). To finish the proof we need to show that v < T0 . Since v 0 ∈ T1 , we have that
xv 0 (i ∗ (Av , xv )) = 1. Then, from (), also xv0 (i ∗ (Av , xv0 )) = xv (i ∗ (Av , xv )) as I ∩ W (t1 ) = ,
so v is also in T1 .
To finish the proof of Lemma  we bound the size of matching M.
1

|M| = |M1 | + |M2 | = |EH (T0 )| + |VL \ (EH (T0 ) ∪ EH (T1 ))| = |VL | − |EH (T1 )| 6 + |VL |.
2

Soundness

We recall Lemma from [Man] about graph G BI S in the (NO) case of MUCHB.

Lemma  ([Man, Appendix A]). If for every T ⊂ VH of size at most |V2H | we have
|EH (T )| 6 |EH |, then in G BI S there is no bi-independent set of size 2 · (|EH | + 1).
As the set of edges of graph G MMM is a superset of the set of edges of G BI S , analogous
statement is also true for G MMM .
Lemma  (Soundness). If H came from the (NO) case of Problem , then in G MMM
there is no bi-independent set of size 2 · (|EH | + 1).
Altogether Lemmas  and  give us the following, from which Theorem  immedi-
ately follows.
Corollary. Assuming Small Set Expansion Hypothesis, for any > 0 it is NP-hard given a
bipartite graph G = (U ∪· V , E) (with |U | = |V |) to distinguish between two cases.
. Notes 

• (YES) There is a maximal matching in G of size ( 12 + )|U |.

• (NO) There is no bi-independent set of size |U ∪· V |, so every maximal matching
must have at least (1 − )|U | edges.

. Notes
The dictatorship test described in Section  is based on the ‘It ain’t over till it’s over’
theorem conjectured by Friedgut and Galai and proved by Mossel, O’Donnell and
Oleszkiewicz [MOO]. The reduction from Unique Label Cover to Vertex Cover,
described in Section  comes from a paper by Bansal and Khot [BK]. Its modifica-
tion proving Theorem , described in Section . is original, albeit based on the same
idea, as one in our paper with Szymon Dudycz and Mateusz Lewandowski [DLM]
published at IPCO —only in that paper we were modifying a different reduction
of Khot and Regev [KR]. The proof of Theorem  described in Section  comes
from our aforementioned paper [DLM].
The original reduction from ULCE to Balanced Bi-Clique, revisited in Section ,
was presented by Bhangale, Gandhi, Hajiaghayi, Khadenkar and Kortsarz [Bha+].
Our modification proving Theorem  comes from our paper with Szymon Dudycz
and Pasin Manurangsi [DMM].
The reduction from SSE to ULC described in Section  was presented by Raghaven-
dra and Steurer [RS] and expanded on in their paper with Tulsiani [RST]. An
accessible write-up of this reduction and other results on Small Set Expansion is
Steurer’s PhD thesis [Ste].
The reduction from SSE to Balanced Bi-Clique was given by Manurangsi [Man].
We have modified it with Dudycz and Manurangsi [DMM] to prove Theorem .
The Minimum Maximal Matching problem was (on separate occasions) suggested
to me and Szymon Dudycz by Katarzyna Paluch with expectation that we improve
upon the trivial -approximation algorithm.
 Proportional Approval Voting

In this chapter we are going to study a family of multi-winner approval-based electoral

systems. In these systems the voters are electing a committee of k members (e.g.
a parliament) by checking boxes next to the candidates they approve of , without
limitation on the number of approved candidates. An electoral system is a rule
determining the result of elections based on the votes cast.
In late th century, a Danish mathematician and astronomer Thorvald N. Thiele
proposed [Thi] two such rules with a goal of making approval-voting results
proportional. The first, Sequential proprotional approval voting was practical:
The result is given by a well-defined algorithm. It was used in Sweden in early th
century. The second rule, called Proportional approval voting just says that the
P|W ∩A |
winning committee must be one that maximises scrw (W ) = v∈V i=1 v wi , out of
P
all possible committees W ⊂ C of size k (V is the set of voters; C denotes the set of
candidates; Av ⊆ C is a set of candidates supported by the voter v). The vector w
was set by Thiele to (1, 1/2, 1/3, . . . ), but one can see that this choice is arbitrary and
in fact there is a different, valid w-Thiele electoral system for every vector w as
long as it is non-increasing (the law of diminishing returns on the voter’s happiness)
and non-negative (the voter’s happiness should not decrease when an additional
candidate supported by him is voted in).

. Algorithmic question
Finding a winning committee in a w-Thiele electoral system becomes a computational
task. It turns out only to be solvable in polynomial time when w = (1, 1, 1, . . . ) .
 As opposed to the rank-based systems like Single Transferrable Vote (used in Australia), where
voters order the candidates from the most to the least preferred.
. Algorithmic question 

Every other w-Thiele electoral system is NP-hard to decide [SFL].The researchers

have thus put forward the question of approximability of these voting systems,
with justification that, while it obviously would not have an application to political
elections, such an algorithm may still be relevant for selection process less prone to
controversy like participatory budgets [SFL].

Connection to classical problems When w = (1, 0, 0, . . . ), deciding the election be-

comes equivalent n to the Maximum Coverage o problem, where we are given a collection
of sets C = T1 , . . . , T|C| | v ∈ Tc ⇔ c ∈ Av over the same universe V and must select
k of them to cover the most elements. Famously, the greedy algorithm gives an
(1 − 1/e)-approximation [NWF] and this is tight [Fei] unless P = NP. Moreover,
assuming Gap-ETH, even an algorithm running in time f (k) · (|C| + |V |)o(k) (for any
function f ) cannot achieve a ratio smaller than (1 − 1/e) [Man]. Recently, Bar-
man et al. [Bar+] analysed the Multi `-Coverage—with an intermediate between
(1, 1, 1, . . . ) and (1, 0, 0, . . . )—namely w` = (1, . . . , 1, 0, . . . ) (` ones followed by noughts).
`
They gave an algorithm with an approximation ratio of 1 − e`` `! ≈ 1 − √ 1 . They also
2π`
showed, this ratio cannot be beaten assuming ugc.

Our results We are presenting an approximation algorithm for a natural class of

w-Thiele rules.
Definition . A non-increasing vector x = (x1 , x2 , . . . ) is geometrically dominant if for
2
every i ∈ N+ , xi · xi+2 > xi+1 . Equivalently, x is geometrically dominant if either
xi xi+1
x = (1, 0, 0, . . . ) or xi+1 > xi+2 (x decreases slower than a geometric sequence).

All previously studied w-Thiele electoral systems are defined by geometrically domi-
nant sequences w.
Theorem 
Let w be any geometrically dominant sequence. There exists a polynomial-time αw -
approximation algorithm for w-Thiele rule where
 x  ∞  x

X  X  1 X 
αw = E  wi  =  wi  .
xzPoi(1) ex!

i=1 x=1 i=1

 Forsimplicity, throughout this chapter we are assuming w1 = 1. Every vector can be scaled so
without any effect on the election.
 α is the expected score of a voter, if the number of elected candidates from his ballot is sampled
w
from the Poisson distribution with mean one.
 Proportional Approval Voting

Problem Name Sequence w Our Ratio

Approval Chamberlin-Courant (aka. (1, 0, 0, . . . ) 1 − 1/e > 0.6321
Maximum Coverage)
Proportional Approval Voting wi = 1/i 0.7965 . . .
Saint-Laguë method wi = 1/(2i − 1) 0.7394 . . .
Penrose apportionment method wi = 1/i 2 0.7084 . . .
P∞ 1 Px 1
p-Harmonic wi = 1/i p x=1e·x! · ip
i=1
1 1
p-Geometric wi = p i 1−p · 1 − e1−p

Table : Our approximation ratios for w-Thiele rules. They are tight unless P=NP. All
listed w-Thiele rules were known to have (1 − 1/e)-approximation algorithm [NWF;
SFL]. To the best of our knowledge, none better than (1 − 1/e)-approximation
algorithm was known for any of these rules.

The algorithm is a simple application of a well known linear programming rounding

technique called pipage rounding due to Ageev and Sviridenko [AS]. Our contri-
bution is an analysis of this algorithm for w-Thiele rules. The exact value of this
approximation ratio for specific rules is presented in Table . Additionally we provide
a family of hard instances for the greedy algorithm to show that our algorithm is
indeed a stronger approximation, not just better analysed. We are also showing that
our approximation ratio is tight.

Theorem 
Let w be any non-increasing sequence such that limi→∞ wi = 0. For any > 0 it is NP-
hard to compute an (αw + )-approximate solution to w-Thiele. Furthermore, assuming
Gap-ETH, such an approximation cannot be computed even in time f (k) · (|V | + |C|)o(k) .

Our inapproximability construction is a reduction from Label Cover (Problem .),

in principle the same as the reduction for Maximum Coverage of Feige. We do
however need a more involved analysis to serve an arbitrary vector w.

. The approximation algorithm

Our algorithm is based on an LP rounding technique, which requires us to first write
a linear programming relaxation of the problem. Since scrw only makes sense on
subsets of candidates, we first extend scrw also to a fractional solution x ∈ [0, 1]C
. The approximation algorithm 

specifying for each candidate fractionally, how much the candidate is selected.
|Av |
XX
scrw (x) = wl · min {1, max {0, xAv − l + 1}},
v∈V l=1

where xAv = c∈Av xc . Intuitively, for each voter v this function adds the first xAv
P
elements of the sequence w. The last element of the sequence may be counted in only
fractionally. With this definition, we now compute the optimum fractional solution
x∗ of the following linear program.
maximize scr w (x)
X
subject to xc = k ()
c∈C
x ∈ [0, 1]C
Remark (Solving the LP). The program () is not really an LP, as its objective function
is not linear. To solve it we need to rewrite it into an equivalent form:
|Av |
XX
maximize wl · yvl
v∈V l=1
X
subject to xc = k
c∈C
|Av |
X X
yvl 6 xc ∀v ∈ V
l=1 c∈Av

x ∈ [0, 1] ; y ∈ [0, 1]V ×C .

This equivalence holds because w is a non-increasing sequence.

To round x∗ , we will employ a framework called pipage rounding [AS]. Specifi-
cally, we resort to the following result of Călinescu et al.
Theorem  ([Căl+, Lemma .])
Let f be a monotone, submodular function and a polytope o B(M) be described by matroid
∗
n
constraints. Given vector y = arg max f (x) | x ∈ B(M) the procedure Pipage-Rounding
will return a solution S ∈ M of value E[f (S)] > Eŷzy ∗ [f (ŷ)].
Remark. Applying the theorem directly results in a randomized algorithm. There is
however a deterministic version of this theorem [Căl+]. In order to use it, we have
to be able—given a vector x∗ —to efficiently compute Ex̂zx∗ [scrw (x̂)].
 Proportional Approval Voting

This can be done with a simple dynamic programming procedure: Let dpv [k][i]
denote the probability that exactly i candidates out of c1 , . . . , ck ∈ Av have been selected.
Clearly it can be computed using the identity

dpv [k][i] = dpv [k − 1][i − 1] · xc∗k + dpv [k − 1][i] · (1 − xc∗k ).

The function scrw is submodular (because w is non-increasing), and monotone

(because w is non-negative). We can thus use Theorem  to round the optimum LP
solution x∗ . The resulting approximation ratio is then equal to

Ex̂∼x∗ [scr w (x̂)]

ρ= .
scrw (x∗ )

The remainder of this section is dedicated to bounding ρ.

E ∗ [scr (x̂,v)]
Let the approximation ratio of a voter v be ρv B x̂∼xscr (xw∗ ,v) where scrw (x, v) is
w
P|Av |
the score of v, i.e., l=1 wk · min {1, max {0, xAv − l + 1}}. Henceforth, we will focus on a
single voter v and show that ρv > αw . This straightforwardly implies that ρ > αw as
desired.
We next characterize x∗ that minimizes the ratio ρv , which in the end will lead us
to the claimed bound.

.. Step : Three-valued x∗

∗ ∗ ∗
Let τ = xA = c∈Av xc be a sum of x values for candidates approved by v. The score
P
v
of the fractional solution x∗ is now scrw (x∗ , v) = w1 + · · · + wbτc + {τ } · wbτc+1 .
The numerator Ex̂∼x∗ [scr w (x̂, v)] of ρv can be written as

∞ 
 
X  X 
(w1 + · · · + wl ) P  x̂c = l  .
 
 x̂∼x ∗ 
 
l=0 c∈Av

We recall the following lemma due to Barman et al.

Lemma  ([Bar+] ). Let x ∈ [0, 1]m be such that x1 + · · · + xm = τ. Then, for P any non-
negative sequence (a1 , a2 , . . . ), there exist q ∈ (0, 1) and x ∈ {0, q, 1} such that i xi0 = τ
0 m

and   m    m 
X  X  X  X 
a P 
 l x̂∼x  x̂i = l  > a P 
 l x̂∼x0  x̂i = l  .
l i=1 l i=1
. The approximation algorithm 

Now, notice ∗
P that∗ the denominator scrw (x , v) in our ratio ρv only ∗depends on the
sum τ = c∈Av xc and not the way values are distributed among xc ’s. Hence, from
Lemma , ρv is minimised by x∗ that only takes three values—0, q and 1, although
we do not yet know what the value of q is .

.. Step : Getting rid of ones

Let w(x) be a function w : R>0 → R>0 extending w, defined by w(x) B wbxc · (1 −
{x}) + wbxc+1 · {x}. Clearly, when x is integral w(x) = wx . For fractional x, the func-
tion just takes the weighted average of wbxc , wbxc+1 . The following observation is
straightforward.
Observation. The function w is convex on R>1 .
Using this observation, we can prove the following lemma, which allows us to
only consider x∗ that does not contain 1.

Lemma . Suppose xc̃∗ = 1 for a candidate c̃ ∈ Av . Then the approximation ratio for the
voter v decreases after removing c̃ from the fractional solution x∗ . Formally, if xc̃∗ = 1, then
P∞ P∞
l=0 (wl Px̂∼x∗ [ c x̂c > l]) l=0 (wl Px̂∼x∗ [ c,c̃ x̂c > l])
P P
ρv = > .
w1 + · · · + wbτc + {τ } · wbτc+1 w1 + · · · + wbτ−1c + {τ } · wbτc

Proof. It suffices to show that

P∞ P∞
∆int l=0 (wl Px̂∼x∗ [ c x̂c > l]) − l=0 (wl Px̂∼x∗ [ c,c̃ x̂c > l])
P P
c̃
B > ρv .
∆frac

c̃ w1 + · · · + wbτc + {τ } · wbτc+1 − w1 + · · · + wbτ−1c + {τ } · wbτc

∆int
In fact we will just show that c̃
> 1. Starting with the numerator.
∆frac
c̃

∞ 
     
X X X
∆int
  
= w ·  P  −
 
x̂ l x̂ l
 
> P >
  
c̃  l x̂∼x∗  c  x̂∼x∗ 
 c 
l=0 c c,c̃
∞ 
X
    
X X
= wl ·  P  x̂c > l  − P ∗  x̂c > l + 1
   
 x̂∼x∗  x̂∼x
l=0 c c

 In [Bar+, Lemma .], τ is required to be an integer. It is however obvious from the proof that
this is not necessary.
 Note that this reduction cannot be performed globally. It only makes sense for a single voter v and

his candidates Av , as applying Lemma  to different sets {xc∗ }c∈Av may result in different q values for
the same candidate.
 Proportional Approval Voting

∞ 
X
       
X  X  X 
= wl · P ∗  x̂c = l  = E ∗ w  x̂c  = w(τ) = ∆frac
  
x̂c  > w  E ∗  c̃
x̂∼x x̂∼x x̂∼x
l=0 c c c

Where the inequality follows from convexity of w.

Hence, the fractional solution x∗ that minimizes the ratio ρv of a single voter only has
values 0 and q where q ∈ (0, 1).

.. Step : Comparing Binomial and Poisson

The values {xc∗ }c∈Av are now all either 0 or q and sum up to τ. It means that, when we
sample x̂ z x∗ , we now are performing τq coin tosses with an identical coin with bias
q. The value of our Pintegral solution which is equal to l (wl · Px̂∼x∗ [ c x̂c > l]) can be
P P
now rewritten as l (wl · P[Bin(τ/q, q) > l]) = E[w1 + · · · + wBin(τ/q,q) ].
Let sw (n) = w1 + · · · + wn . The function sw is monotone (as w is non-negative) and
concave (as w is non-increasing). We recall another lemma by Barman et al.
Lemma  ([Bar+, Lemma .]). For any convex function f , any integer N > 1 and
parameter p ∈ [0, 1] we have E[f (Bin(N , p))] 6 E[f (Poi(N p))].
By applying the above lemma with the function s−w (x) = −sw (x) which is convex, we
get that E[w1 + · · · + wBin(τ/q,q) ] > E[w1 + · · · + wPoi(τ) ] and hence

E[w1 + · · · + wPoi(τ) ]
ρv > . ()
w1 + · · · + wbτc + {τ } · wbτc+1

.. Step : Changing the mean of Poisson

A property of Poisson distributions is that sampling Xi z Poi(λi ) for i = 1, . . . , m and
adding them is equivalent to sampling Y z Poi( i λi ). Hence,
P

 bτc 
X 
E[w1 + · · · + wPoi(τ) ] =  E [wy+1 + · · · + wy+Poi(1) ]

 
yzPoi(l−1) 
()
l=1
+ E [wy+1 + · · · + wy+Poi({τ}) ].
yzPoi(bτc)

We can bound each term in () as follows.

Lemma . For any l ∈ {1, . . . , bτc}, we have
wl
E [wy+1 + · · · + wy+Poi(1) ] > E[w1 + · · · + wPoi(1) ] .
yzPoi(l−1) w1
. The approximation algorithm 

Proof. We start with the left side of the inequality

∞
X
E [wy+1 + · · · + wy+Poi(1) ] = E[wy+1 + · · · + wy+Poi(1) ] · P[Poi(l − 1) = y]
yzPoi(l−1)
y=0

By w being geometrically dominant we get

∞
X
E[wy+1 + · · · + wy+Poi(1) ] · P[Poi(l − 1) = y]
y=0
∞
X wy+1
> E[w1 + · · · + wPoi(1) ] · P[Poi(l − 1) = y]
w1
y=0
h i
E wPoi(l−1)+1
= E[w1 + · · · + wPoi(1) ] ·
w1
wE[Poi(l−1)+1] wl
> E[w1 + · · · + wPoi(1) ] · = E[w1 + · · · + wPoi(1) ] · .
w1 w1
The last inequality coming from convexity of w.
Next, we will apply a similar reasoning to the last term of (). Specifically, we will
show the following:
wbτc+1
Lemma . EyzPoi(bτc) [wy+1 + · · · + wy+Poi({τ}) ] > E[w1 + · · · + wPoi(1) ] · w1 · τ .
{ }

Before we attempt to prove it, it will be helpful to have the following lemma at hand.
EkzPoi(δ) [w1 +w2 +···+wk ]
Lemma . Let f (δ) = δ . For any δ ∈ (0, 1] f (δ) > f (1).
Proof. We start the proof with a simple calculus:
d EkzPoi(x) [g(k)]
Observation. For any g : N → R, = E [g(k + 1) − g(k)].
dx kzPoi(x)
E [w +w +···+w ]
Using the above we analyse the function f (δ) = kzPoi(δ) 1δ 2 k
. Its first derivative
is
df EkzPoi(δ) [δwk+1 ] − EkzPoi(δ) [w1 + · · · + wk ]
f 0 (δ) = = .
dδ δ2
We are going to show that f 0 (δ) 6 0 for δ ∈ (0, 1] by focusing on the numerator
0
fnum (δ) (the denominator δ2 is always going to be positive). Obviously fnum 0 (0) = 0.
0
dfnum
Moreover = δ E [wk+2 − wk+1 ] is never greater than 0 since the sequence w
dδ kzPoi(δ)
is non-increasing.
 Proportional Approval Voting

Now we can come back to Lemma .

Proof of Lemma . Again we transform the left side of desired inequality and take
advantage of geometrical dominance and convexity of sequence w.
wbτc+1
E [wy+1 + · · · + wy+Poi({τ}) ] > E[w1 + · · · + wPoi({τ}) ] ·
yzPoi(bτc) w1

Now, by applying Lemma  with δ = {τ },

wbτc+1 wbτc+1
E[w1 + · · · + wPoi({τ}) ] · > E[w1 + · · · + wPoi(1) ] · {τ } · .
w1 w1

Putting together (), (), Lemmas  and  allows us to conclude that the ratio ρv is at
least
w1 + · · · + wbτc + {τ } · wbτc+1
E[w1 + · · · + wPoi(1) ] ·
w1 w1 + · · · + wbτc + {τ } · wbτc+1

which is equal to E[w1 + · · · + wPoi(1) ] = αw since w1 = 1. Hence, our algorithm yields

an αw -approximate solution.

. Hardness of Approximation
The hardness of approximation proof is based on a reduction, much like those in
Chapter . We do however want this reduction, when applied to Theorem ., to
prove the second part of our Theorem . Our reduction must therefore not only be
complete and sound, but also must be aware of the size of the instance and parameter
k. We will prove the following:
Theorem 
Let w be such that limi→∞ wi = 0. For any > 0, there exist δ > 0, t ∈ N and a reduction
that takes in an instance L = (A, B, E, [L], [R], {πe }e∈E ) of LC(δ, t, R) and produces an election
E = (V , C, {Av }v∈V ) in poly(|C|, |V |) time such that:
• (Completeness) If val(L) = 1, then there exists W ⊆ C such that scrw (W ) = |V |.
• (Soundness) If weak-val(L) < δ, then for all W ⊆ C, we have scrw (W ) = (αw + ) · |V |.
• (Size bound) |C| = |A| · L and |V | = |B| · t R .
• (Parameter) k = |A|.
. Hardness of Approximation 

Plugging in Theorem  to Theorems ., . immediately yields the desired hardness
of approximation in Theorem .
The rest of this section is devoted to the proof of Theorem . Our reduction is in
fact the same as Feige’s [Fei], which can be interpreted as the inapproximability for
w-Thiele electoral system with Chamberlin-Courant rules. Hence, on the hardness
front, our main contribution is in extending the analysis to work with more general
sequences of weights. To this end, we provide several helpful auxiliary lemmas in
Section .. Then, in Section ., we describe Feige’s reduction together with our
generalized analysis, and prove Theorem .

.. Comparing Binomial and Poisson

Similar to the analysis of our algorithm, we will need to move from (generalizations of)
Binomial random variables to Poisson random variables. The difference between here
and the algorithmic counterpart (Section .) is that the sign is flipped. Specifically,
in Section ., we need E[w1 + · · · + wBin(N ,p) ] > E[w1 + · · · + wPoi(N p) ]. In this section, we
show that in fact the left-hand side is not much more than the right-hand side.

Lemma . For any sequence w such that limi→∞ wi = 0, and any γ, λ ∈ R+ exists a
boundary p0 = p0 (w, γ, λ) such that, for any positive p < p0 and any positive integer
N < λ/p we have
h i h i
E w1 + · · · + wBin(N ,p) < E w1 + · · · + wPoi(N p) + γ.

Proof. We will bound the following quantity:

h i h i
E w1 + · · · + wBin(N ,p) − E w1 + · · · + wPoi(N p)
∞
X ()
= P [Bin(N , k) = k] − P [Poi(N k) = k] · (w1 + · · · + wk ) .
k=1

To this end we rely on a classic result that gives an upper bound on the total vari-
ation distance between the Binomial distribution and the corresponding Poisson
distribution.

Theorem  ([LeC])
For any n ∈ N and p ∈ R>0 , ∞k=0 |P[Bin(n, p) = k] − P[Poi(np) = k]| < 4p.
P

0.01γ 2
Let i ∗ be the smallest positive integer such that wi ∗ 6 0.1γ/λ. We let p0 = i ∗ λ . We
∗
will separate the summands in () into two parts, based on whether k 6 k0 B b i λ·10 γ c.
 Proportional Approval Voting

We bound the first part:

k0
X
P [Bin(N , p) = k] − P [Poi(N p) = k] · (w1 + · · · + wk )
k=1
k0
X
6 k0 |P [Bin(N , k) = k] − P [Poi(N k) = k]|
k=1 ()
(From Theorem ) 6 k0 · 4p
(From our choice of k0 and p0 ) 6 0.4γ.
The second part can be bounded as follows:
∞
X
P[Bin(N , p) = k] − P[Poi(N p) = k] · (w1 + · · · + wk )
k=k0 +1
∞
X
6 P[Bin(N , p) = k] · (w1 + · · · + wk )
k=k0 +1
∞
X γ
(From our choice of i ∗ ) 6 P[Bin(N , p) = k] · (i ∗ + (k − i ∗ ) )
10λ
k=k0 +1
∞ ()
γ X γ
(From choice of k0 , k > i ∗) 6 P[Bin(N , p) = k] · (k )
10λ 5λ
k=k0 +1
γ γ
6 E[Bin(N , p)] = · Np
5λ 5λ
(From assumption that N p 6 λ) 6 0.2γ.

The distribution that will turn up in our reduction is in fact a bit different from
the Binomial distribution. It can be described as a grouped Binomial Distribution,
where the outcome of N coins is decided in l coin-flips.
Lemma . Let w be such that limi→∞ wi = 0. For any p ∈ R+ , and l non-negative
integers m1 , . . . , ml , we have

E [w1 + · · · + wPl ] 6 E[w1 + · · · + wBin(Pl ].

z1 ,...,zl zBer(p) i=1 zi mi i=1 mi ,p)

We will prove a much stronger statement, similar to that of Barman et al. (which we
recalled before as Theorem ).
. Hardness of Approximation 

P, u v
∗ ∗ ∗ ∗  ∗ ∗              

Figure : A Hypercube Set System. The set P5,1 contains all vectors with  at fifth
position, so u < P5,1 but v ∈ P5,1 .

Proposition . For any convex function f , any parameter p ∈ [0, 1], and l non-negative
integers m1 , . . . , ml , we have Ez1 ,...,zl zBer(p) [f ( li=1 zi mi )] > E[f (Bin( li=1 mi , p))].
P P

Proof. Similarly to [SS, Theorem .A.], let the independent random variables
m
Z1 , . . . , Zl z Ber(p) and for each i ∈ [l], Xi1 = · · · = Xi i = Zi are copies of Zi . We then
have
m m
[f ( li=1 zi mi )] = E f X11 + · · · + X1 1 + · · · + Xl1 + · · · + Xl l Z1 , . . . , Zl
P h i
E
z1 ,...,zl zBer(p)
m m
> E f E[X11 |Z1 ] + · · · + E[X1 1 |Z1 ] + · · · + E[Xl l |Zl ]
h i
(Jensen’s Inequality)
= E[f (Bin( li=1 mi , p))].
P

Finally we will need a simple property of a much-used function in our analysis.

Lemma . The function f (z) = E[w1 + · · · + wPoi(z) ] is concave.
Proof. Similarly to the proof of Lemma , we calculate the second derivative of f
and get:
d2 f (z)
= E [wk+2 − wk+1 ] 6 0.
dz2 k∼Poi(z)

.. The Reduction

We can now describe the reduction and prove Theorem . As we had stated earlier,
our reduction is the same as that of Feige [Fei], which employs the following gadget
(see Fig ):
Definition  (Hypercube Set System [Fei]). For t, R ∈ N, the (t, R)-hypercube set
system is defined as H = (U, P1 , . . . , PR ) where the universe U is [t]R and each Pj =
n o
(Pj,1 , . . . , Pj,t ) is a partition of U into t parts with Pj,i = u = (u1 , . . . , uR ) ∈ [t]R | uj = i .
With this gadget we can describe the reduction.
 Proportional Approval Voting

Reduction 
Input: An LC(δ, t, R) instance L = A ∪· B, E, [L], [R], {πe }e∈E .
Output: An election E = V , C, {Av }v∈V , k with V = B × [t]R , C = A × [L], and
k = |A|.
For each vertex b ∈ B we fix an arbitrary order on its t neighbours a1 , . . . , at . A voter
(b, u) ∈ V approves of a candidate (ai , σ ) ∈ C if ai is (i-th) neighbour of b, and
u(πai ,b (σ )) = i (u belongs to the set Pπa ,b (σ ),i in the (t, R)-hypercube set system).
i

This reduction is tuned by parameters t and δ which we can feed into Theorems .
and .. We pick them based on expected in Theorem  as follows:
∗
• Let i ∗ denote the smallest positive integer such that wi ∗ < 0.1, and let ϑ = 10 i .
• Let γ = 0.7, p0 = p0 (w, γ, 10ϑ) be as in Lemma .
0.0001
• We select t = d2/p0 e and δ = ϑ3 t2
.

The reduction obviously runs in poly(|C|, |V |) time and produces an instance of size
as stated in Theorem .

Completeness
Suppose n that there is o an assignment φ : A → [L] that satisfies every b ∈ B. Pick
W = (a, φ(a)) | a ∈ A . We claim that scrw (W ) > |V |: Specifically, we will observe
that every voter had voted for at least one candidate in W . Consider a voter (b, u) ∈
B × [t]R and let a1 , . . . , at be b’s (ordered) neighbours. Since φ satisfies b, we have
π(a1 ,b) (φ(a1 )) = · · · = π(at ,b) (φ(at )) = r ∈ [R]. The value i B u(r) tells us, which of the t
candidates was supported by (b, u)—it was (ai , φ(ai )).

Soundness
Assume by contraposition that there exists a subset W ⊆ C of k candidates, such that
scrw (W ) > (αw + ) · |V |. We will show that weak-val(L) > δ.
For every b ∈ B, let V [b] = {b} × [t]R . Let also degW (b) B |W ∩ (Γ (b) × [L])|, where
Γn (b) denotes the set o of neighbours of b in L. For every a ∈ A, let Wa ⊆ [L] denote
σ ∈ [L] | (a, σ ) ∈ W . Finally, let D = 10ϑt.
We divide the set B into three sets:

• Let B1 denote the set of b ∈ B such that degW (b) > D.

. Hardness of Approximation 

• Let B2 denote the set of b ∈ B such that degW (b) 6 D and, for all distinct neigh-
bours a1 , a2 of b and all σ1 ∈ Wa1 , σ2 ∈ Wa2 , we have π(a1 ,b) (σ1 ) , π(a2 ,b) (σ2 ).
• Let B3 = B \ (B1 ∪ B2 ).

We will next argue that |B3 | > Ωt, (|B|), which will then allow us to ‘decode’ back
the assignment to the Label Cover instance L . To do so, we start by bounding the
contribution of B1 to scrw (W , B1 )—this may be seen as an argument that it does not
make sense for the committee W to satisfy the voters very unevenly. The main idea is
that when degW (b) is large, the average score per candidate becomes small, which gives
the following upper bound on the desired quantity:

Proposition . b∈B1 v∈V [b] scrw (W , v) 6 0.2 · |V |.

P P

Proof. Let us fix a vertex b ∈ B1 . Notice that each candidate (a, σ ) ∈ Γ (b) × [L] is
approved by exactly |V t[b]| voters in V [b] (and voters in V [b] do not approve P∞ any
candidate outside Γ (b) × [L]). As a result, scrw (W , V [b]) is of the form i=1 ri wi for
|V [b]|
some sequence (ri )i∈N of non-negative integers that satisfies ∞ i=1 ri = degW (b) · t
P
and |V [b]| > r1 > r2 > · · · (r1 voters having at least one candidate P∞ in W , r2 having at
least two, etc.). It is simple to see that among such sequences i=1 ri wi is maximized
when r1 = · · · = rm = |V [b]| for m = degW (b)/t , rm+1 = |V [b]| · {degW (b)/t } and rm+2 =

rm+3 = · · · = 0. In other words, scrw (W , V [b]) is upper bounded by
 m  ! X m

X  deg W (b) · |V [b]|  
|V [b]| · w  + |V [b]| · {deg (b)/t } · w · ·w

 i W m+1 6  i

tm
 
 
 
i=1 i=1
degW (b) · |V [b]|
!
∗
(From definition of i ) 6 · (i ∗ + (m − i ∗ )0.1)
tm
deg (b) · |V [b]|
!
(From our choice of D, ϑ; i ∗ 6 0.1m) 6 W
(0.2m)
tm
degW (b) · |V [b]|
!
= (0.2) .
t

Summing the above inequality over all b ∈ B1 , we have

 
X  X deg (b) · |V [b]| 
W
scrw (W , V [b]) 6 0.2 
 
t

 
b∈B1 b∈B1

 As
we will see later, the bound degW (b) 6 D for b ∈ B3 will play the same role as number s in the
decoding in Lemma ..
 Proportional Approval Voting

|V | X |V | X
= 0.2 · · degW (b) 6 0.2 · · degW (b)
|B| · t |B| · t
b∈B1 b∈B
|V | X |V |
= 0.2 · · |Γ (a)| 6 0.2 · · |B| · t = 0.2 · |V |,
|B| · t |B| · t
(a,σ )∈W

with the last inequality following from |W | = |A| and bi-regularity of the LC instance.

Next, we bound the score contribution from B2 , using the auxiliary lemmas shown
in the previous subsection.
Proposition . b∈B2 scrw (W , V [b]) 6 (αw + 0.7) · |V |.
P

Proof. Let us fix b ∈ B2 . Let a1 , . . . , at be its (ordered) neighbours. For each κ ∈ [R], let
τ κ = |{(a, σ ) ∈ W | a ∈ N (b), π(a,b) (σ ) = κ}|. Moreover, if τ κ , 0, let i κ denote the index
i ∈ [t] of the neighbour s.t. π(ai ,b) (σi ) = κ for some σi ∈ Wi . (Due to our definition of
B2 , there is a unique such i.) If τ κ = 0, let i κ =⊥.
Recall that a voter (b, u) approves a candidate (ai , σ ) if uπ(a ,b) (σ ) = i. Hence, the
i
number of candidates in W approved by (b, u) ∈ Vb is
X X X
1[uπ(a ,b) (σ ) = i] = τ κ · 1[uκ = i κ ].
i
i∈[t] σ ∈Wai κ∈[R]

Hence, we may rearrange scrw (W , V [b]) as

|[t]R | ·
h i
E w1 + · · · + wPκ∈[R] τ κ ·1[uκ =i κ ] .
u1 ,...,uR ∼[t]

Observe that for a fixed vertex b, τ κ are non-negative integer constants such that
κ 1 R
κ∈[R] τ = degW (b). 1[u1 = i ], . . . , 1[uR = i ] are independent random variables,
P
identically distributed by Ber(1/t). We may apply Lemmas  and  to get
scrw (W , V [b]) 6 |[t]R | · E w1 + · · · + wPoi(degW (b)/t) + γ.
h i

By summing this over b ∈ B2 , b∈B2 scrw (W , V [b]) is at most

P
X h
|[t]R | ·
i
E w1 + · · · + wPoi(degW (b)/t) + γ
b∈B2
" # !
R
(From Lemma ) 6 |[t] | · |B| · E w1 + · · · + w b∈B degW (b)
P +γ
Poi |B|t

b∈B degW (b) = |B| · t) = (αw + 0.7) · |V |.

P
(By
. Determining performance of the Greedy Algorithm 

Decoding the assignment to Label Cover From the two P propositions above and our
assumption that scrw (W ) > (αw + ) · |V |, we must have b∈B3 scrw (W , V [b]) > 0.1 · |V |.
Obviously, for b ∈ B3 we have scrw (W , V [b]) 6 degW (b) · |Vtb | , which is at most D·|Vt·|B|
|
for
b ∈ B3 . Thus,
0.1 · |V | 0.1t|B| 0.01|B|
|B3 | > = = . ()
D · |V |/(t · |B|) D ϑ
Now, consider the following (random) assignment: for every a ∈ A such that Wa , ,
let φ(a) be a random element of Wa . By definition, each b ∈ B3 has distinct neighbours
a1 , a2 and σ1 ∈ Wa1 , σ2 ∈ Wa2 such that π(a1 ,b) (σ1 ) = π(a2 ,b) (σ2 ). Hence, the probability
1
that φ weakly satisfies such b is at least |W |·|W |
> D12 , where the inequality follows
a1 a2
from degW (b) 6 D. Thus,
|B3 | () 0.01 0.0001
E[weak-val(φ)] > > = = δ.
φ |B| · D 2 ϑ · D2 ϑ3t2
This completes our proof of Theorem .

. Determining performance of the Greedy Algorithm

As touched upon earlier, an important and natural algorithm for all covering problems
is the greedy algorithm, in which at each of k steps we select the set (candidate)
maximising the utility increase. As we discussed, this algorithm not only is very
efficient, but also provides tight approximation ratio for basic variants of the problem.
It is hence tempting to question, if an LP-based algorithm like the one we presented
in Section  does give any real improvement over the greedy algorithm, or is the
gain in ratio only thanks to more thorough analysis? We answer this question by
giving—for any geometrically dominant sequence w other than geometric—an instance
for which the ratio of our algorithm is strictly larger than that of the greedy.
Proposition . For any geometrically dominant w = (w1 = 1, w2 = p, w3 , w4 , . . . ),
1 1

αw > · 1 − 1−p .
1−p e
Proof. αw is lower-bounded by αv for v = (1, p, p2 , . . . ).
 k 
1 − pk 1 1
" # !
X
i−1 k
1 − ep−1 ,

αv = E   p  = E
 = 1 − E [p ] =
k∼Poi(1) k∼Poi(1) 1 − p 1−p k∼Poi(1) 1−p
i=1
where the last equality follows from the well-known fact that the probability-generat-
ing function (PGF) of Poi(λ) is G(z) = eλ(z−1) .
 Proportional Approval Voting

S
S A A A
S
Figure : Tight instance for the greedy algorithm. The universe of m · k voters is
covered by 2k candidates. The optimal solution is to pick S1 , . . . , Sk and give every
voter exactly one supported candidate in the committee. The greedy algorithm will
pick A1 , . . . , Ak instead. Compared to the tight example one would build for Maximum
Coverage, here the sets Ai must be larger, because even after picking A1 , . . . Al the
sets Si still gain some score from the already “covered” voters.

2
Clearly, when there is an element i of w with wi · wi+2 > wi+1 , the inequality in
Proposition  becomes strict, as v < w. Let us now analyse the greedy algorithm for
these w-Thiele rules.

Theorem 
For any Thiele sequence w = (1,
p, w3 .. . ), the approximation ratio of the greedy algorithm
1 1
for w-Thiele is at most 1−p · 1 − e1−p .

We will prove this theorem by building a specific instance. The instance is defined
below (see also Figure ), and its construction is parametrised by numbers m, k (the
committee size), and p (the second element of w).

• The set of voters V = {vi,j }i∈[k],j∈[m] .

• One collection of k candidates is {Si }i∈[k] . The candidate Si is supported by the
voters {vi,j }j∈[m] . Every voter voting for exactly one Si gives

scrw ({S1 , . . . , Sk }) = m · k.

• An alternative collection of candidates is {Ai }i∈[k] . Ai is supported by |Ai | =

1−p i−1

m· 1− k voters. Moreover the overlap between Si and Al (the number of
voters supporting both candidates) is equal to

|Al | m 1 − p l−1

|Si ∩ Al | = = 1− .
k k k
No voter is supporting both Ai and Al for i , l (Ai ∩ Al = ).
. Notes 

Proposition . For any l ∈ {0, . . . , k − 1} if the candidates A1 , . . . , Al have been picked into
the committee, the greedy algorithm will select Al+1 .
Proof. To verify this we need to compare scrw (Al+1 | A1 , . . . , Al )—a score gain from
picking Al+1 once A1 , . . . , Al have been selected—to scrw (Si | A1 , . . . , Al ) (for any i ∈ [k]).
Since the electorates of Ai candidates are pairwise disjoint, the former is very simple:

1−p l

scrw (Al+1 | A1 , . . . , Al ) = |Al+1 | = m · 1 − .
k
To calculate the latter we notice that some of Si ’s voters are already satisfied by the
already picked candidates. Those voters will only earn an additional score of p. The
other voters supporting Si will gain 1 when he is picked.
scrw (Si | A1 , . . . , Al ) = p · |Si ∩ (A1 ∪ · · · ∪ Al )| + |Si \ (A1 ∪ · · · ∪ Al )|
|A | + · · · + |Al | |A | + · · · + |Al | 1−p
=p· 1 +m− 1 = m− (|A1 | + · · · + |Al |)
k k k
l
1−p X 1 − p j−1 1−p l

= m−m 1− = m· 1− .
k k k
j=1

We have thus established that the greedy algorithm will indeed select A1 , . . . , Ak  . We
also need to compute the score of the greedy solution,

1−p k k
!
scrw ({A1 , . . . , Ak }) = |A1 | + · · · + |Ak | = m · 1 − 1 − ,
k 1−p
which allows us to bound the approximation ratio of the greedy algorithm:
1−p k

m·k
1− 1− k
scrw ({A1 , . . . , Ak }) 1−p 1 1

= −−−−−→ · 1 − 1−p .
scrw ({S1 , . . . , Sk }) m·k k→∞ 1−p e

. Notes
The pipage rounding method has been introduced by Ageev and Sviridenko [AS]
with an application of optimising a function over bipartite matching constraints. It
 Onemay break ties by adding k additional voters u1 , . . . , uk in the instance. ui would only approve
the candidate Ai . This increases the score of the greedy solution only by k, with no impact on the
approximation ratio.
 Proportional Approval Voting

was later extended by Călinescu, Chekuri, Pál and Vondrák [Căl+] to any matroid
constraints. In our algorithm in Section  we are applying this powerful method to
a simple cardinality constraint, but in fact it could be replaced with any matroid
constraint without effect on the approximation ratio.
The algorithm of Barman, Fawzi, Ghosal and Gurpinar [Bar+] for the Multi `-
Coverage problem is also using the pipage rounding framework. Their hardness is
based on Unique Game Conjecture—an assumption we are able to get rid of with our
reduction.
The results in this chapter are based on our paper with Szymon Dudycz, Pasin
Manurangsi and Krzysztof Sornat, which appeared at IJCAI  [Dud+]. We left
unanswered the question of approximation of w-Thiele rules with w sequence not
geometrically dominant. This has been subsequently resolved by Barman, Fawzi and
Fermé [BFF].
 k-Medians clustering with
capacities
In this chapter we will focus on a topic of metric clustering. In the clustering
problems we are given a discrete metric (defined by a distance function d obeying
the triangle inequality) and a number k. Our task is to select k points (centres) in the
metric and assign every other point to one of the k clusters identified by the selected
representatives. The objective may be to minimise a maximum distance from a point
to its representative (this problem is called k-Center), the sum of distances from
the points to their respective centres (k-Median) or the sum of squared distances
(k-Means).
A common application is categorising data which is usually represented by long
vectors within the metric defined by the Hamming distance. For k-Means, Data
Scientists would typically use a library function implementing some local-search
heuristic, like Lloyd’s algorithm [Llo] initialised with a seed given by a proce-
dure called k-means++ [AV]. This algorithm—although practical—only gives an
approximation guarantee of O(log k).
Our Capacitated k-Median (CkM) problem adds another difficulty to the k-Median:
Every point in the metric has an associated capacity and at most that many members
may belong to its cluster. We describe it in the language of Facility Location.
Problem  (Capacitated k-Median)
Given a metric (F ∪ C, d), and for every facility f ∈ F a capacity uf , find a subset
S ⊂ F of size k and an assignment of clients φ : C → S such that:
• For every facility f ∈ S, its capacity is respected, i.e. |φ−1 (f )| 6 uf ,
• The total distance from clients to their facilities, i.e.
X
d(φ) B d(c, φ(c)),
c∈C

is minimised over all possible choices of S and φ.

 k-Medians clustering with capacities

An instance of CkM is uniform if the capacities are equal for all the facilities.
Remark. The problem becomes marginally harder in a client-weighted variant, when
each client location i ∈ C has cardinality ci of clients, which can be directed to
different facilities. The additional complexity comes from the fact, that ci may be
large compared to the size of the metric.

. Related work
Problems without capacities There is an extensive body of work on approxima-
bility of the clustering problems: For the k-Center problem a simple local search
procedure gives 2-approximation [Gon], which cannot be improved unless P =
NP [HN]. Also for k-Means and k-Median, the first constant-factor approxima-
tion algorithms were using this method—giving the ratios of (9 + ε) [Kan+] and
(3 + ε) [Cha+; Ary+] respectively. Stronger results based on linear programming,
and taking advantage of the similarity between clustering and Facility Location,
have been attained later. Currently the best ratio for k-Means is 6.357 [Ahm+],
and for k-Median–(2.675 + ε) [LS; Byr+].

Approximability of CkM Capacitated k-Median is among few remaining natural

optimisation problems for which it is not yet settled, whether a polynomial time algo-
rithm may achieve a constant approximation ratio. The only proper approximation
algorithm, which we describe in Section , has the ratio of O(log k) . The difficulty
lies with the natural linear programming formulation, which has an unbounded
integrality gap (the ratio between the optimum fractional and integral solutions). All
the known approximation algorithms overcome this problems by ‘cheating’: they
either open more than k facilities, or violate the capacities (but the comparison is
made with an optimum solution which opens exactly k facilities and does not violate
the capacities). There is a long series of results fitting this pattern: Starting with
-approximation algorithm for the uniform instances [Cha+] that may violate
the capacities by the factor of . Then, for general instances Chuzhoy and Rabani
gave a -approximation algorithm violating the capacities  times [CR]. Li
used stronger LP-relaxations to obtain a constant-factor approximation algorithm
which opens (1 + ε) facilities [Li; Li]. Similar ideas have been used to con-
struct a constant-factor approximation which violates the capacities by a factor of
(1 + ε) [BRU; DL].
 To
the best of our knowledge, it had never been published before our paper came out, but it
has been a known application of metric tree-embeddings, similar to the O(log k)-approximation for
k-Median by Charikar et al. [Cha+]. We are framing it in the language of our main tool.
. l-Centred instances 

Our results We cheat in a different way. Rather than violating the cardinality
constraint or the capacities, we abuse the time complexity, by constructing a constant-
factor approximation algorithm for CkM which runs in FPT(k) time.

Theorem 
For any > 0 there is a (7 + )-approximation algorithm for Capacitated k-Median
(Problem ) running in time 2O(k log k) · poly(|F | + |C|).

This result makes sense from the point of view of parametrised complexity, as the
k-Median problem (even without capacities) is W[2]-hard, which makes it unlikely
to admit exact FPT(k) algorithms.

Proposition . The k-Median problem is W[2]-hard when parametrised by k.

Proof. In the Dominating Set problem we are given a graph G and a parameter k,
and must decide, whether there is a S subset of k vertices dominating entire graph
(every vertex must either belong or have a neighbour in S). The Dominating Set
problem is W[2]-hard with respect to the parameter k (see e.g. [Cyg+]).
We construct k-Median instance as follows: let F = C = V (G). The metric is
induced by shortest paths in G. A dominating set of size k in G is equivalent to a
k-Means solution of total cost equal to |V (G)| − k.

Our algorithm from Theorem  is composed of two ingredients. The first is a re-
duction that allows us to solve the problem on simpler metrics called l-centred rather
than general metrics. We describe this reduction in Section . In Section  we will
familiarise ourselves with the l-centred instances by using them to construct a simple
O(log k)-approximation algorithm for CkM. The second ingredient is a parametrised
algorithm for CkM on the l-centred instances. In Section . we will present its simpler
form—an exact algorithm for uniform instances. Finally in Section . we extend it
to work on general instances (and pay a marginal price in the approximation ratio).

. l-Centred instances
The l-centred metric is an alternative distance function defined on the same universe
F ∪C. We start off by selecting a set S ⊂ F of l facilities—a solution to l-Median (for a
number l > k) on the universe F ∪ C (ignoring the capacities). This solution naturally
determines a mapping φS : C ∪ F → S, with every point mapped to the closest facility
of S (see Fig.  (a)).
The l-centred distance function d S will have two layers. For any f , g ∈ S it is
identical to the original metric, i.e. d S (f , g) = d(f , g). To travel anywhere from a point
 k-Medians clustering with capacities

(a) A set of S ⊂ F determines a mapping (b) To go from a to b in the l-centred metric, one
φS corresponding to the Voronoi diagram must first go to a’s centre (to the top layer), then
on F ∪ C. follow the original metric to the b’s centre and
from there to b.

Figure : l-Centred metric defined by a set S of Wrocław’s hospitals.

a < S, we first route to a’s centre φS (a). Generally the new metric is defined as follows
(see Figure  (b)):

d S (a, b) B d(a, φS (a)) + d(φS (a), φS (b)) + d(φS (b), b). ()

The usefulness of d S follows from the fact, that it is a metric embedding of d.

Proposition . For any a, b ∈ F ∪ C, the inequalities hold

d(a, b) 6 d S (a, b) 6 3d(a, b) + 4d(a, φS (a)).

Proof. The left inequality is a trivial consequence of the triangle inequality. We focus
on the right inequality:

d S (a, b) = d(a, φS (a)) + d(φS (a), φS (b)) + d(φS (b), b)

(By triangle inequality) 6 d(a, φS (a)) + d(φS (a), a) + d(a, b) + d(b, φS (b))
+ d(φS (b), b)
(Since d(b, φS (b)) 6 d(b, φS (a))) 6 2d(a, φS (a)) + d(a, b) + 2d(b, φS (a))
6 2d(a, φS (a)) + d(a, b) + 2 d(b, a) + d(a, φS (a)) .

(By triangle inequality)
. l-Centred instances 

The cost on any solution φ for S

D E CkM can be measured on the metric d instead of d.
By summing over pairs c, φS (c) for all clients c ∈ C, Proposition  shows us, that the
loss incurred by this swapping of the underlying metric depends on the quality of
the solution S, which we used to define that metric.
Corollary. For any mapping φ : C → F we have
d(φ) 6 d S (φ) 6 3d(φ) + 4d(φS ), ()
where d(ϕ) = c∈C d(c, ϕ(c)) is the total cost of mapping ϕ on metric d.
P

Fortunately, since the solution S ignores the capacities, we can ensure that its cost is
a constant approximation of the optimum solution to the Capacitated k-Median on
F ∪ C. Additionally we can count on a better approximation by picking l significantly
larger than k.
Altogether, we reduce the Capacitated k-Median on general metric d to two
different problems: The k-Median (without capacities) with the cardinality constraint
relaxed from k to l, and the CkM on the l-centred metric.
Lemma . For any metric d over the universe F ∪ C and capacities {uf }f ∈F let φOPT
denote the optimum solution to CkM on that instance. Let S ⊂ F with φS picking the closest
centre in S for every point (according to the metric d), and let l = |S|. Let also φOPT(S)
denote the optimum solution to CkM on metric d S .
If a solution φ respects the capacities, and satisfies
d S (φ) 6 α · d S (φOPT(S) ) (Asm. )
(φ is an α-approximation for CkM on the l-centred metric d S ), while φS satisfies
d(φ(S)) 6 β · d(φOPT ) (Asm. )
(which holds when it is a β-approximation for k-Median on the metric d), then
d(φ) 6 α · (3 + 4β) d(φOPT )
(φ is an α · (3 + 4β)-approximation for CkM on metric d).
Proof. Starting from the left side:
() (Asm. ) (p)
d(φ) 6 d S (φ) 6 α · d S (φOPT(S) ) 6 α · d S (φOPT )
() (Asm. )
6 α · 3d(φOPT ) + 4d(φS ) 6 α · 3d(φOPT ) + 4β · d(φOP T ) ,

where the inequality (p) follows from optimality of the solution φOPT(S) for CkM on
metric d S .
 k-Medians clustering with capacities

. Polynomial-time O(log k)-approximation for CkM

In this section we will show, how l-centred metrics can be used to obtain a O(log k)-
approximation for CkM. As we have already said, constant-factor approximation
algorithms do exist for k-Median (without capacities), with currently best ratio of
β = 2.675 + ε. By plugging this β-approximation into Lemma , it is sufficient to have
a O(log k)-approximation for CkM on l-centred metrics (with l = k).
A standard tool used to provide such an approximation in a metric setting is
a randomised algorithm of Fakcharoenphol, Rao and Talwar (FRT) [FRT] which
embeds an arbitrary metric onto a tree.

Theorem  (FRT [FRT])

For any metric d over the ground set X, a randomised, polynomial-time (in |X|) procedure
finds a metric d T over the ground set T ⊃ X, such that:

• d T is a shortest-path metric on a weighted tree with vertex set T ; X are the leaves in
that tree,
• For any a, b ∈ X,
d(a, b) 6 d T (a, b),
(so d T is indeed a metric embedding of d),
• For any a, b ∈ X, the expected distortion—with respect to the random choice of the
metric T by the algorithm—is bounded:

d T (a, b) 6 ρ · d(a, b),

h i
E
T zFRT(X,d)

where ρ = O(log |X|).

We will apply Theorem  to the metric d over the ground set S—the upper layer
of our l-centred metric d S . This new metric d T can replace d at the upper layer of d S .
To that end we define

d T (a, b) B d(a, φS (a)) + d T (φS (a), φS (b)) + d(φS (b), b),

for any a, b ∈ F ∪C. The metric d T defined over the set F ∪C∪T , is induced by shortest
paths on an edge-weighted tree. Let now φOPT(T ) denote the optimum solution co
CkM on that metric. It turns out, that φOPT(T ) is an O(log k)-approximation for CkM on
the original metric d.
. Polynomial-time O(log k)-approximation for CkM 

Proposition . The expected cost of φOPT(T ) —with respect to the random choice of T in
Theorem —measured on metric d is bounded by

d(φOPT(T ) ) 6 d S,T (φOPT(T ) ) 6 ρ · d S (φOPT(S) ),

h i h i
E E
T zFRT(S,d) T zFRT(S,d)

where φOPT(S) is the optimum solution to CkM on the l-centred metric d S , and ρ = O(log k)
comes from Theorem .

Proof. The first inequality is given by the fact that d T is a metric embedding of d.
Since φOPT(T ) is optimal on the metric d T , we have

d T (φOPT(T ) ) 6 d T (φOPT(S) ) .
h i h i
E E
T zFRT(S,d) T zFRT(S,d)

Furthermore, by summing over all clients c ∈ C,

i X i X
d T (φOPT(S) ) = d T (c, φOPT(S) (c)) 6 ρ · d S (c, φOPT(S) (c)).
h h
E E
T zFRT(S,d) T zFRT(S,d)
c∈C c∈C

.. Solving CkM on trees

We are now left with the task of finding the optimum solution to CkM on an edge
weighted tree T (with a weight function d T ). We can additionally assume that the
tree is binary and all the facilities and clients lie in the leaves (which—if it were not
true already—could be ensured by at most trebling the number of nodes).
Consider a subtree t dangling on an edge et , and imagine we have decided, which
facilities shall be opened inside of that subtree (let this set Pbe denoted as R[t] ⊂ F ∩ t).
The total capacity of that subtree is equal to u(R[t]) = f ∈R[t] uf . If the number of
clients inside t is larger than the total capacity, t has a negative balance of |C∩t|−u(R[t])
which must be routed up through the edge et . Each such client must pay d T (et ) for
passing this edge. If the subtree t has a sufficient capacity to serve all its clients, it
has a positive balance, and up to u(R[t]) − |C ∩ t| clients may be rooted down through
the edge et .
This reasoning lays out the dynamic programming for us. For any subtree t,
number k 0 ∈ {0, . . . , k} and the balance b ∈ {−|C|, . . . , |C|}, we define dp[t, k 0 , b] to be the
minimum cost of opening exactly k 0 facilities inside t and routing exactly b clients
down through et (b < 0 meaning, we are routing exactly −b clients up). The cost of
routing is counted to the top of the edge et , so it includes this edge.
 k-Medians clustering with capacities

For a leaf c ∈ C, dp[c, 0, −1] = d T (ec ). We set dp[c, k 0 , b] = ∞ for every other value of
k0 and b, as they are infeasible. Similarly, for a leaf f ∈ F we set

if k 0 = 0,



0
dp[f , k 0 , b] = b · d T (ef ) if k 0 = 1 and b ∈ [0, uf ],



otherwise.

∞


Finally, for t with two subtrees t1 , t2 , computing dp[t, k 0 , b] amounts to finding k10 and
b1 which minimize
dp[t1 , k10 , b1 ] + dp[t2 , k 0 − k10 , b − b1 ].
This can be trivially done by iterating over all O(k · |C|) choices. Once such a pair is
found, we set

dp[t, k 0 , b] = dp[t1 , k10 , b1 ] + dp[t2 , k 0 − k10 , b − b1 ] + d T (et ) · |b|.

At the root we can read the optimum solution to CkM on entire tree T , as it is equal to

min dp[T , k 0 0] | k 0 ∈ [k] .

n o

. Constant-factor approximation
In this section we will present the algorithm for Capacitated k-Median on l-centred
metrics. The algorithm is a composition of ideas which we are going to gradually
introduce—first, in Section ., we describe the exact algorithm for the Uniform
CkM, and only then we will put together the (1 + )-approximation algorithm for the
non-uniform variant. Both algorithms will guess the k facilities to open, and rely
on a subroutine which (in polynomial time) optimally maps clients to the opened
facilities. We start by describing this subroutine.

.. Mapping clients to open facilities

We are given a metric d over the set C ∪ F open , in which we need to find the optimal
mapping φ : C → F open . Were there no capacities associated with the facilities, the
task would be very simple—we would just map every client to the closest facility. It
turns out however, that we can accomplish that task in polynomial time, even with
the capacities present.
Intuitively, mapping clients C to F open is very similar to minimum-cost b-match-
ing in bipartite graphs. Every client has b = 1 (or ci in client-weighted variant of the
. Constant-factor approximation 

problem), while every facility can be matched up to uf times. b-matching can be

found with LP-solvers, which is how we will find our mapping. We formulate our
problem as an integer program:
X X
minimize d(c, f ) · xc,f
c∈C f ∈F open
X
subject to xc,f = 1 ∀c ∈ C,
f ∈F open ()
X
xc,f 6 uf ∀f ∈ F open ,
c∈C
xc,f ∈ {0, 1}.

(In our integer program, xc,f = 1 represents that φ(c) = f .) Let us consider the linear-
programming relaxation of (), where the last constraint is replaced with xc,f > 0.
The constraint matrix of this LP is an incidence matrix of a complete bipartite graph
on C ∪· F open and is known to be totally unimodular, which means that the relaxed LP
has an optimal solution with integral values of xc,f .

Remark. The solution is integral regardless of the right side of the inequalities in ()
(as long as they are integers), so LP-solving (which can be done in polynomial time)
gives us the optimal mapping even in the client-weighted variant of CkM.

.. Solving Uniform CkM on l-centred metrics

Our parametrised algorithm for Uniform CkM is based on a simple observation, that
once we know the number ks of facilities to open in a particular Voronoi cell s ∈ S of
the l-centred metric, then we can choose them greedily to be the closest ks facilities
to the centre of the cell (see Figure  (a)). The reason is, that every client (even ones
in the same cell) must connect to the facilities through the centre. It thus suffices
to scan all configurations (ks )s∈S with s∈S ks = k, for each one open the facilities
P
greedily and use our subroutine from Section . to find the mapping of clients to
the facilities. There are at most l k possible configurations—some of them may be
infeasible, when ks is larger then the number of facilities available in the cell s, but
we may just disregard them when scanning over the configurations.

Corollary. Uniform capacitated k-Median on l-centred instances can be solved exactly

in time l k · poly(|F ∪ C|).
 k-Medians clustering with capacities




(a) In the algorithm for Uniform CkM we guess (b) In the general (non-uniform) case, every
the number of open facilities in every cell cell is divided into rings of geometrically in-
( in figure) and pick the closest facilities to creasing radii. We guess the number of open
the centre. This minimises the distance (as facilities in every ring, and pick the ones with
every client is routed through the centre in largest capacity (facilities in the same ring
l-centred metric). have almost equal distance to the centre).

Figure : Determining the facilities to open.

.. Approximating non-uniform CkM

In general (non-uniform) case, our greedy facility-opening strategy will not suffice.
Once we know, how many facilities need to be open in the cell s ∈ S, it might
make sense to open a facility further from the centre, if it has higher capacity. Our
configurations will therefore need to be more granular.
We begin by guessing the largest distance d S (c, φOPT(S) (c)) between the client and
its facility in the optimum solution (for the l-centred metric d S ). This quantity will be
denoted as D. There are at most |C|·|F | choices for D. Consider the setlF [s] of facilities
in a single Voronoi cell s ∈ S. We will partition it into B + 1 (with B = log1+ |C|
m
) rings
F 0 [s], F 1 [s], . . . , F B [s] of geometrically growing diameters:

The clients further than D from their centre may be disregarded, since they will not
appear in the optimum solution.
Once we know the number ksi of facilities to open in a particular ring, we may
again do it greedily—this time by opening ones with largest capacity.
. Constant-factor approximation 

D E
Proposition . Let F OPT(S) , φOPT(S) be the optimum solution to CkM on l-centred metric
D E
d S . For every ring F i [s] let ksi = |F i [s] ∩ F OPT(S) |. Let F SOL , φSOL be determined by
picking ksi largest facilities in each ring F i [s]. Then the following inequality holds:
d S (φSOL ) 6 (1 + ) · d S (φOPT ) + · D 6 (1 + 2) · d S (φOPT ). ()
Proof. By definition,
X
d S (φOPT(S) ) = d S (c, φOPT(S) (c))
c∈C
−1
X X
d S c, φS (φOPT(S) (c)) + φOPT(S) (f ) · d S φS (f ), f

=
c∈C f ∈F OPT(S)
X
d S c, φS (φOPT(S) (c))

=
c∈C B
−1
XX X
φOPT(S) (f ) · d S φS (f ), f .

+
s∈S i=0 f ∈F OPT(S) ∩F i [s]

Let us analyse the additional cost incurred by replacing facilities from F OPT(S) with
those from F SOL . Define a bijection σ : F OPT(S) → F SOL which preserves every ring
F i [s]. We set φ(c) B σ (φOPT(S) (c)). The solution F SOL has picked the largest-capacity
facilities in every ring, so φ does not violate any capacities. Now, we have
X
d S (φ) = d S c, φS (φOPT(S) (c))

c∈C B
−1
XX X
φOPT(S) (f ) · d S σ (φS (f )), f .

+
s∈S i=0 f ∈F OPT(S) ∩F i [s]

For every f ∈ F i [s] with i < B,

d S σ (φS (f )), f 6 (1 + ) · d S φS (f ), f ,

since the rings have geometrically increasing radii. The central ring, with i = B, has
inner radius equal to 0, so that kind of proportion on distances does not hold. Its
radius is however small, so we have
·D
d S σ (φS (f )), f 6

,
|C|
for every facility f ∈ F B [s]. Even if all |C| clients were connected to the facilities in the
central rings of the Voronoi cells, this cost would sum up to · D. Finally we observe,
that d S (φSOL ) 6 d S (φ), as φSOL is the optimal mapping to F SOL .
The right inequality of () is given by D 6 d S (φOPT(S) ).
 k-Medians clustering with capacities

Proposition  lays out the algorithm for us. P AfterP guessing the upper-bound D,
we scan all configurations (ksi )s∈Si∈{0,...,B} with s∈S Bi=0 ksi = k, which determine the
number of open facilities in every ring. For each such configuration we greedily
open ksi largest-capacity facilities in every ring F i [s] and find the optimal mapping
of clients to the facilities using our subroutine from Section .. There are at most
k
O l · 1 · ln |C|

possible configurations and |F | · |C| choices of D.

Corollary. Capacitated k-Median on l-centred instances can be (1 + )-approximated in

k
time O l · 1 · ln |C|

· poly(|F ∪ C|).

.. Choosing the right l

We now have the algorithms for l-centred instances, and only need to come up with an
appropriate l-centred metric. One can see, that using a constant-factor approximation
for k-Median with l = k would be enough to achieve a (much larger) constant-factor
for CkM. We can however optimise by taking l > k. Specifically, we will use a result of
Lin and Vitter [LV].

Theorem  (Lin, Vitter [LV, Theorem ])

For any k-Median instance hF ∪ C, d, ki, let F OPT be the optimum solution to k-Median
(without capacities). A set U that satisfies:

• |U | 6 (1 + 1ε )k(ln(|C ∪ F |) + 1),
• c∈C minf ∈U {d(c, f )} 6 (1 + ε) c∈C minf ∈F OPT {d(c, f )},
P P

can be found deterministically in time polynomial in |C ∪ F | and 1ε .

By plugging Theorem  to construct the l-centred metric and running the al-
gorithm from Section . on that metric, we obtain a (7 + O())-approximation for
Capacitated k-Median with running time
!k
1 1 |C|

O 1 + · k · (ln(|C ∪ F |) + 1) · · ln · poly(|F ∪ C|)
()
k 1

2

= O k · ln (|C ∪ F |) · poly k · poly(|F ∪ C|).

We will use a standard trick in parametrised complexity to bound this running time.

Proposition . For any n, k, (ln n)2k 6 max{n, k O(k) }.

. Notes 

ln n
Proof. We analyse two cases. First, if 2 ln ln n 6 k, we have ln n = O(k ln k) and thus
2k O(k) ln n 2k ln n
(ln n) = k . If 2 ln ln n > k, then (ln n) = exp(2k · ln ln n) < exp(2 2 ln ln n · ln ln n) =
n.
By applying Proposition  with n = |F ∪ C| to () we obtain the bound on complexity
!O(k)
k
· poly(|F ∪ C|),

thus proving Theorem .

. Notes
The topic of approximation in parametrised time—a crossover of approximation
algorithms and parametrised complexity—has gained a significant popularity in
recent years. It has allowed to break some known (or conjectured) bounds on approx-
imability by abusing the running time, like (2 − ε) ratio for k-Cut [GLL]. At the
same time, some problems have turned out to be resistant to such improvements.
One such example is presented in Chapter  of this thesis.
The results in this chapter are based on our paper with Marek Adamczyk, Jarosław
Byrka, Syed Mohammad Meesum and Michał Włodarczyk, which was presented
at ESA  [Ada+]. After the preprint of our paper was announced, Xu et al.
worked out a similar algorithm for Capacitated k-Means on euclidean metrics. Both
our and their ratios have soon been improved by Cohen-Addad and Li [CL], who
obtained (3 + ε)-approximation for Capacitated k-Median and (9 + ε)-approximation
for Capacitated k-Means. At the same conference, they also showed—together
with Gupta, Kumar and Lee [Coh+]—that an approximation ratio of (1 + 2e − ε) is
impossible for k-Median in time parametrised by k, even without the capacities, if
Gap-ETH holds. The same lower-bound had been known for algorithms running in
polynomial time [Fei].
In a subsequent paper, written together with Jarosław Byrka, Szymon Dudycz,
Pasin Manurangsi and Michał Włodarczyk [Byr+]—in which my involvement was
lesser, and which is therefore not included in this dissertation—we analysed the
approximability of k-Median problem parametrised by |F | − k rather then k. It turns
out, that although (1 + ε)-approximation is possible for k-Median without capacities,
the bound of (1 + 2e ) still cannot be overcome for CkM.
Bibliography

[Thi] Thorvald Thiele. ‘Om Flerfoldsvalg’. In: Oversigt over det Kongelige Dan-
ske Videnskabernes Selskabs Forhandlinger (in Danish). .
[LeC] Lucien LeCam. ‘On the Distribution of Sums of Independent Random
Variables’. In: Bernoulli , Bayes , Laplace . . doi: 10.
1007/978-3-642-99884-3_10.
[Gup] Ram Prakash Gupta. ‘Independence and covering numbers of line graphs
and total graphs’. In: Proof Techniques in Graph Theory. .
[Kar] Richard M. Karp. ‘Reducibility Among Combinatorial Problems’. In:
Proceedings of a symposium on the Complexity of Computer Computations.
. doi: 10.1007/978-1-4684-2001-2\_9.
[MH] Sandra Lee Mitchell and Stephen Hedetniemi. ‘Edge Domination in
Trees’. In: Proceedings of the Eighth Southeastern Conf. on Combinatorics,
Graph Theory and Computing. .
[NWF] George L. Nemhauser, Laurence A. Wolsey and Marshall L. Fisher. ‘An
analysis of approximations for maximizing submodular set functions -
I’. In: Math. Program. (). doi: 10.1007/BF01588971.
[HN] Wen-Lian Hsu and George L. Nemhauser. ‘Easy and hard bottleneck
location problems’. In: Discret. Appl. Math. (). doi: 10.1016/0166-
218X(79)90044-1.
[YG] Mihalis Yannakakis and Fanica Gavril. ‘Edge Dominating Sets in Graphs’.
In: SIAM Journal on Applied Mathematics (). doi: 10.1137/0138030.
[Llo] Stuart P. Lloyd. ‘Least squares quantization in PCM’. In: IEEE Trans. Inf.
Theory (). doi: 10.1109/TIT.1982.1056489.
[Bak] Brenda S. Baker. ‘Approximation Algorithms for NP-Complete Problems
on Planar Graphs (Preliminary Version)’. In: th Annual Symposium on
Foundations of Computer Science, FOCS. . doi: 10.1109/SFCS.1983.7.
Bibliography 

[Gon] Teofilo F. Gonzalez. ‘Clustering to Minimize the Maximum Intercluster

Distance’. In: Theor. Comput. Sci. (). doi: 10.1016/0304- 3975(85)
90224-5.
[LV] Jyh-Han Lin and Jeffrey Scott Vitter. ‘ε-Approximations with Minimum
Packing Constraint Violation (Extended Abstract)’. In: Proceedings of the
th Annual ACM Symposium on Theory of Computing, STOC. . doi:
10.1145/129712.129787.
[HK] Joseph Douglas Horton and Kyriakos Kilakos. ‘Minimum Edge Dominat-
ing Sets’. In: SIAM J. Discret. Math. (). doi: 10.1137/0406030.
[Sri+] Anand Srinivasan, K. Madhukar, P. Nagavamsi, C. Pandu Rangan and
Maw-Shang Chang. ‘Edge Domination on Bipartite Permutation Graphs
and Cotriangulated Graphs’. In: Inf. Process. Lett. (). doi: 10.1016/
0020-0190(95)94093-8.
[Fei] Uriel Feige. ‘A Threshold of ln n for Approximating Set Cover (Prelim-
inary Version)’. In: Proceedings of the Twenty-Eighth Annual ACM Sym-
posium on the Theory of Computing, STOC. . doi: 10.1145/237814.
237977.
[Aro+] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan and Mario
Szegedy. ‘Proof Verification and the Hardness of Approximation Prob-
lems’. In: J. ACM (). doi: 10.1145/278298.278306.
[Cha+] Moses Charikar, Chandra Chekuri, Ashish Goel and Sudipto Guha.
‘Rounding via Trees: Deterministic Approximation Algorithms for Group
Steiner Trees and k-Median’. In: Proceedings of the Thirtieth Annual ACM
Symposium on the Theory of Computing (STOC). . doi: 10 . 1145 /
276698.276719.
[Cha+] Moses Charikar, Sudipto Guha, Éva Tardos and David B. Shmoys. ‘A
Constant-Factor Approximation Algorithm for the k-Median Problem
(Extended Abstract)’. In: Proceedings of the Thirty-First Annual ACM
Symposium on Theory of Computing (STOC). . doi: 10.1145/301250.
301257.
[Car+] Robert D. Carr, Toshihiro Fujito, Goran Konjevod and Ojas Parekh. ‘A
1
2 10 -Approximation Algorithm for a Generalization of the Weighted
Edge-Dominating Set Problem’. In: Algorithms - ESA , th Annual
European Symposium. . doi: 10.1007/3-540-45253-2_13.
 Bibliography

[Fuj] Toshihiro Fujito. ‘On approximability of the independent/connected

edge dominating set problems’. In: Inf. Process. Lett. (). doi: 10.1016/
S0020-0190(01)00138-7.
[FN] Toshihiro Fujito and Hiroshi Nagamochi. ‘A -approximation algorithm
for the minimum weight edge dominating set problem’. In: Discrete
Applied Mathematics (). doi: 10.1016/S0166-218X(00)00383-8.
[Kan+] Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D.
Piatko, Ruth Silverman and Angela Y. Wu. ‘A local search approximation
algorithm for k-means clustering’. In: Proceedings of the th Annual Sym-
posium on Computational Geometry. . doi: 10.1145/513400.513402.
[Kho] Subhash Khot. ‘On the power of unique -prover -round games’. In:
Proceedings on th Annual ACM Symposium on Theory of Computing,
STOC. . doi: 10.1145/509907.510017.
[Par] Ojas Parekh. ‘Edge dominating and hypomatchable sets’. In: Proceedings
of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms,
SODA. .
[FRT] Jittat Fakcharoenphol, Satish Rao and Kunal Talwar. ‘A tight bound on
approximating arbitrary metrics by tree metrics’. In: Proceedings of the
th Annual ACM Symposium on Theory of Computing, STOC. . doi:
10.1145/780542.780608.
[KR] Subhash Khot and Oded Regev. ‘Vertex Cover Might be Hard to Approx-
imate to within 2 − ε’. In: th Annual IEEE Conference on Computational
Complexity CCC. . doi: 10.1109/CCC.2003.1214437.
[AS] Alexander A. Ageev and Maxim Sviridenko. ‘Pipage Rounding: A New
Method of Constructing Algorithms with Proven Performance Guaran-
tee’. In: J. Comb. Optim. (). doi: 10.1023/B:JOCO.0000038913.96607.
c2.
[Ary+] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh
Munagala and Vinayaka Pandit. ‘Local Search Heuristics for k-Median
and Facility Location Problems’. In: SIAM J. Comput. (). doi: 10 .
1137/S0097539702416402.
[CR] Julia Chuzhoy and Yuval Rabani. ‘Approximating k-median with non-
uniform capacities’. In: Proceedings of the Sixteenth Annual ACM-SIAM
Symposium on Discrete Algorithms, SODA. .
Bibliography 

[MOO] Elchanan Mossel, Ryan O’Donnell and Krzysztof Oleszkiewicz. ‘Noise

stability of functions with low in.uences invariance and optimality’. In:
th Annual IEEE Symposium on Foundations of Computer Science, FOCS.
. doi: 10.1109/SFCS.2005.53.
[CC] Miroslav Chlebík and Janka Chlebíková. ‘Approximation hardness of
edge dominating set problems’. In: J. Comb. Optim. (). doi: 10.1007/
s10878-006-7908-0.
[AV] David Arthur and Sergei Vassilvitskii. ‘k-means++: the advantages of
careful seeding’. In: Proceedings of the Eighteenth Annual ACM-SIAM
Symposium on Discrete Algorithms, SODA. .
[Căl+] Gruia Călinescu, Chandra Chekuri, Martin Pál and Jan Vondrák. ‘Max-
imizing a Submodular Set Function Subject to a Matroid Constraint
(Extended Abstract)’. In: Integer Programming and Combinatorial Opti-
mization, th International IPCO Conference. . doi: 10.1007/978-3-
540-72792-7\_15.
[SS] M. Shaked and J.G. Shanthikumar. Stochastic Orders. . isbn: --
--. doi: 10.1007/978-0-387-34675-5.
log n
[GLR] Zvi Gotthilf, Moshe Lewenstein and Elad Rainshmidt. ‘A 2 − c n Ap-
proximation Algorithm for the Minimum Maximal Matching Problem’.
In: Approximation and Online Algorithms, th International Workshop,
WAOA. . doi: 10.1007/978-3-540-93980-1_21.
[GMR] Venkatesan Guruswami, Rajsekar Manokaran and Prasad Raghavendra.
‘Beating the Random Ordering is Hard: Inapproximability of Maximum
Acyclic Subgraph’. In: th Annual IEEE Symposium on Foundations of
Computer Science, FOCS. . doi: 10.1109/FOCS.2008.51.
[BK] Nikhil Bansal and Subhash Khot. ‘Optimal Long Code Test with One
Free Bit’. In: th Annual IEEE Symposium on Foundations of Computer
Science, FOCS. . doi: 10.1109/FOCS.2009.23.
[RS] Prasad Raghavendra and David Steurer. ‘Graph expansion and the
unique games conjecture’. In: Proceedings of the nd ACM Symposium on
Theory of Computing, STOC. . doi: 10.1145/1806689.1806792.
[Ste] David Steurer. ‘On the Complexity of Unique Games and Graph Expan-
sion’. PhD thesis. Princeton University, .
[Căl+] Gruia Călinescu, Chandra Chekuri, Martin Pál and Jan Vondrák. ‘Max-
imizing a Monotone Submodular Function Subject to a Matroid Con-
straint’. In: SIAM J. Comput. (). doi: 10.1137/080733991.
 Bibliography

[MKI] Yusuke Matsumoto, Naoyuki Kamiyama and Keiko Imai. ‘An approx-
imation algorithm dependent on edge-coloring number for minimum
maximal matching problem’. In: Inf. Process. Lett. (). doi: 10.1016/j.
ipl.2011.02.006.
[RST] Prasad Raghavendra, David Steurer and Madhur Tulsiani. ‘Reductions
between Expansion Problems’. In: Proceedings of the th Conference on
Computational Complexity, CCC. . doi: 10.1109/CCC.2012.43.
[SV] Richard Schmied and Claus Viehmann. ‘Approximating edge dominat-
ing set in dense graphs’. In: Theor. Comput. Sci. (). doi: 10.1016/j.
tcs.2011.10.001.
[LS] Shi Li and Ola Svensson. ‘Approximating k-median via pseudo-approxi-
mation’. In: Symposium on Theory of Computing Conference, STOC. .
doi: 10.1145/2488608.2488723.
[Byr+] Jaroslaw Byrka, Thomas W. Pensyl, Bartosz Rybicki, Aravind Srinivasan
and Khoa Trinh. ‘An Improved Approximation for k-median, and Positive
Correlation in Budgeted Optimization’. In: Proceedings of the Twenty-
Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA. .
doi: 10.1137/1.9781611973730.50.
[Cyg+] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov,
Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk and Saket Saurabh.
Parameterized Algorithms. . isbn: ----. doi: 10.1007/
978-3-319-21275-3.
[Esc+] Bruno Escoffier, Jérôme Monnot, Vangelis Th. Paschos and Mingyu Xiao.
‘New Results on Polynomial Inapproximabilityand Fixed Parameter Ap-
proximability of Edge Dominating Set’. In: Theory Comput. Syst. ().
doi: 10.1007/s00224-014-9549-5.
[Li] Shi Li. ‘On Uniform Capacitated k-Median Beyond the Natural LP Relax-
ation’. In: Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium
on Discrete Algorithms, SODA. . doi: 10.1137/1.9781611973730.47.
[Bha+] Amey Bhangale, Rajiv Gandhi, Mohammad Taghi Hajiaghayi, Rohit
Khandekar and Guy Kortsarz. ‘Bicovering: Covering Edges With Two
Small Subsets of Vertices’. In: rd International Colloquium on Automata,
Languages, and Programming, ICALP. . doi: 10.4230/LIPIcs.ICALP.
2016.6.
Bibliography 

[BRU] Jaroslaw Byrka, Bartosz Rybicki and Sumedha Uniyal. ‘An Approxima-
tion Algorithm for Uniform Capacitated k-Median Problem with 1 + ε
Capacity Violation’. In: Integer Programming and Combinatorial Optimiza-
tion - th International Conference, IPCO. . doi: 10.1007/978-3-319-
33461-5\_22.
[DL] H. Gökalp Demirci and Shi Li. ‘Constant Approximation for Capaci-
tated k-Median with (1 + ε)-Capacity Violation’. In: rd International
Colloquium on Automata, Languages, and Programming, ICALP. . doi:
10.4230/LIPIcs.ICALP.2016.73.
[Din] Irit Dinur. ‘Mildly exponential reduction from gap SAT to polynomial-
gap label-cover’. In: Electron. Colloquium Comput. Complex. ().
[Li] Shi Li. ‘Approximating capacitated k-median with (1+ε)k open facilities’.
In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA. . doi: 10.1137/1.9781611974331.ch56.
[SFL] Piotr Skowron, Piotr Faliszewski and Jérôme Lang. ‘Finding a Collective
Set of Items: From Proportional Multirepresentation to Group Recom-
mendation’. In: Artif. Intell. (). doi: 10.1016/j.artint.2016.09.003.
[Ahm+] Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson and Justin Ward.
‘Better Guarantees for k-Means and Euclidean k-Median by Primal-Dual
Algorithms’. In: th IEEE Annual Symposium on Foundations of Computer
Science, FOCS. . doi: 10.1109/FOCS.2017.15.
[Cha+] Parinya Chalermsook, Marek Cygan, Guy Kortsarz, Bundit Laekhanukit,
Pasin Manurangsi, Danupon Nanongkai and Luca Trevisan. ‘From Gap-
ETH to FPT-Inapproximability: Clique, Dominating Set, and More’. In:
th IEEE Annual Symposium on Foundations of Computer Science, FOCS.
. doi: 10.1109/FOCS.2017.74.
[Man] Pasin Manurangsi. ‘Inapproximability of Maximum Edge Biclique, Max-
imum Balanced Biclique and Minimum k-Cut from the Small Set Ex-
pansion Hypothesis’. In: th International Colloquium on Automata,
Languages, and Programming, ICALP. . doi: 10.4230/LIPIcs.ICALP.
2017.79.
[MR] Pasin Manurangsi and Prasad Raghavendra. ‘A Birthday Repetition The-
orem and Complexity of Approximating Dense CSPs’. In: th Inter-
national Colloquium on Automata, Languages, and Programming, ICALP.
. doi: 10.4230/LIPIcs.ICALP.2017.78.
 Bibliography

[GLL] Anupam Gupta, Euiwoong Lee and Jason Li. ‘An FPT Algorithm Beating
-Approximation for k-Cut’. In: Proceedings of the Twenty-Ninth Annual
ACM-SIAM Symposium on Discrete Algorithms, SODA. . doi: 10.1137/
1.9781611975031.179.
[Ada+] Marek Adamczyk, Jaroslaw Byrka, Jan Marcinkowski, Syed Mohammad
Meesum and Michal Wlodarczyk. ‘Constant-Factor FPT Approximation
for Capacitated k-Median’. In: th Annual European Symposium on Algo-
rithms, ESA. . doi: 10.4230/LIPIcs.ESA.2019.1.
[Coh+] Vincent Cohen-Addad, Anupam Gupta, Amit Kumar, Euiwoong Lee
and Jason Li. ‘Tight FPT Approximations for k-Median and k-Means’. In:
th International Colloquium on Automata, Languages, and Programming,
ICALP. . doi: 10.4230/LIPIcs.ICALP.2019.42.
[CL] Vincent Cohen-Addad and Jason Li. ‘On the Fixed-Parameter Tractability
of Capacitated Clustering’. In: th International Colloquium on Automata,
Languages, and Programming, ICALP. . doi: 10.4230/LIPIcs.ICALP.
2019.41.
[DLM] Szymon Dudycz, Mateusz Lewandowski and Jan Marcinkowski. ‘Tight
Approximation Ratio for Minimum Maximal Matching’. In: Integer Pro-
gramming and Combinatorial Optimization - th International Conference,
IPCO. .
[Bar+] Siddharth Barman, Omar Fawzi, Suprovat Ghoshal and Emirhan Gür-
pinar. ‘Tight Approximation Bounds for Maximum Multi-coverage’. In:
Integer Programming and Combinatorial Optimization - st International
Conference, IPCO. . doi: 10.1007/978-3-030-45771-6\_6.
[Bat] Wouter Cames van Batenburg. ‘Minimum maximal matchings in cubic
graphs’. In: CoRR (). arXiv: 2008.01863.
[Byr+] Jaroslaw Byrka, Szymon Dudycz, Pasin Manurangsi, Jan Marcinkowski
and Michal Wlodarczyk. ‘To Close Is Easier Than To Open: Dual Param-
eterization To k-Median’. In: Approximation and Online Algorithms - th
International Workshop, WAOA. . arXiv: 2011.08083.
[DMM] Szymon Dudycz, Pasin Manurangsi and Jan Marcinkowski. ‘Tight Inap-
proximability of Minimum Maximal Matching on Bipartite Graphs and
Related Problems’. to appear. .
Bibliography 

[Dud+] Szymon Dudycz, Pasin Manurangsi, Jan Marcinkowski and Krzysztof

Sornat. ‘Tight Approximation for Proportional Approval Voting’. In:
Proceedings of the Twenty-Ninth International Joint Conference on Artificial
Intelligence, IJCAI. . doi: 10.24963/ijcai.2020/39.
[Man] Pasin Manurangsi. ‘Tight Running Time Lower Bounds for Strong Inap-
proximability of Maximum k-Coverage, Unique Set Cover and Related
Problems (via t-Wise Agreement Testing Theorem)’. In: Proceedings of
the  ACM-SIAM Symposium on Discrete Algorithms, SODA. . doi:
10.1137/1.9781611975994.5.
[BFF] Siddharth Barman, Omar Fawzi and Paul Fermé. ‘Tight Approxima-
tion Guarantees for Concave Coverage Problems’. In: th International
Symposium on Theoretical Aspects of Computer Science, STACS. . doi:
10.4230/LIPIcs.STACS.2021.9.

Algorithms
No ratings yet
Algorithms
501 pages
PB Algo2 2013 PDF
No ratings yet
PB Algo2 2013 PDF
87 pages
Cmu850 f20
No ratings yet
Cmu850 f20
309 pages
Cmu850 f20
No ratings yet
Cmu850 f20
285 pages
Graphs and Beyond - Faster Algorithms For High Dimensional Convex Optimization - Jakub Pachocki
No ratings yet
Graphs and Beyond - Faster Algorithms For High Dimensional Convex Optimization - Jakub Pachocki
202 pages
Introduction To Random Graphs
100% (1)
Introduction To Random Graphs
583 pages
BOOK
No ratings yet
BOOK
583 pages
Book
No ratings yet
Book
583 pages
Exbook
No ratings yet
Exbook
39 pages
Advanced Algorithms
No ratings yet
Advanced Algorithms
218 pages
Comput Complexity of Count and Sampl
No ratings yet
Comput Complexity of Count and Sampl
409 pages
Algorithm Design for Students
No ratings yet
Algorithm Design for Students
117 pages
Algoritmos Distribuidos
No ratings yet
Algoritmos Distribuidos
221 pages
Ucpc Finals
No ratings yet
Ucpc Finals
24 pages
Approx Algorithms Lecture Notes
No ratings yet
Approx Algorithms Lecture Notes
227 pages
Klein Berg Book
No ratings yet
Klein Berg Book
459 pages
6 Spanning Subgraphs 89: II Basic Model Extensions 149
No ratings yet
6 Spanning Subgraphs 89: II Basic Model Extensions 149
4 pages
Algorithms: 2006 S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani July 18, 2006
No ratings yet
Algorithms: 2006 S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani July 18, 2006
8 pages
Algorithms: 2006 S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani July 18, 2006
100% (2)
Algorithms: 2006 S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani July 18, 2006
336 pages
Cs Theorists Toolkit
No ratings yet
Cs Theorists Toolkit
95 pages
Sariel PDF
No ratings yet
Sariel PDF
265 pages
Design and Analysis of Algorithms
No ratings yet
Design and Analysis of Algorithms
124 pages
Graph Theory Concepts & Applications
100% (2)
Graph Theory Concepts & Applications
722 pages
16 Mappings 347: IV Further Topics 441
No ratings yet
16 Mappings 347: IV Further Topics 441
6 pages
Icpc Seoul Qual
No ratings yet
Icpc Seoul Qual
25 pages
Combinatorial Optimization - Chekuri (2022)
No ratings yet
Combinatorial Optimization - Chekuri (2022)
255 pages
Book
No ratings yet
Book
585 pages
Algorithms and Complexity II
100% (2)
Algorithms and Complexity II
102 pages
Introduction To Random Graphs
No ratings yet
Introduction To Random Graphs
682 pages
E Maxx
No ratings yet
E Maxx
643 pages
Lecture Notes On Computational Complexity: Luca Trevisan Notes Written in Fall 2002, Revised May 2004
No ratings yet
Lecture Notes On Computational Complexity: Luca Trevisan Notes Written in Fall 2002, Revised May 2004
171 pages
Book
No ratings yet
Book
579 pages
Ubc 2006 Fall Furrow Bartholomew
No ratings yet
Ubc 2006 Fall Furrow Bartholomew
69 pages
Book
No ratings yet
Book
652 pages
Complexity Theory
0% (1)
Complexity Theory
91 pages
AAscript
No ratings yet
AAscript
158 pages
Data Stream Algorithms Notes
No ratings yet
Data Stream Algorithms Notes
61 pages
Lecture Notes For Algorithm Analysis and Design: JNTU World
No ratings yet
Lecture Notes For Algorithm Analysis and Design: JNTU World
128 pages
Notebook
No ratings yet
Notebook
160 pages
MIT Probabilistic Method Lectures
No ratings yet
MIT Probabilistic Method Lectures
124 pages
Notebook
No ratings yet
Notebook
40 pages
Computational Geomatory
No ratings yet
Computational Geomatory
212 pages
Content Discrete
No ratings yet
Content Discrete
4 pages
CS170: Efficient Algorithms and Intractable Problems Fall 2001
No ratings yet
CS170: Efficient Algorithms and Intractable Problems Fall 2001
113 pages
COmputing and Combinatoria
No ratings yet
COmputing and Combinatoria
490 pages
Mpnotes 2018
No ratings yet
Mpnotes 2018
186 pages
Graphs
No ratings yet
Graphs
409 pages
Competitive Programmer Notebook
No ratings yet
Competitive Programmer Notebook
42 pages
Calculus 4
No ratings yet
Calculus 4
10 pages
Linear Algebra in Maple
No ratings yet
Linear Algebra in Maple
22 pages
Banach Spaces and Hilbert Spaces in Machine Learning Theory
No ratings yet
Banach Spaces and Hilbert Spaces in Machine Learning Theory
33 pages
Math 252 - Exam 2 Review - FA20
No ratings yet
Math 252 - Exam 2 Review - FA20
2 pages
MTH 211
No ratings yet
MTH 211
2 pages
4B Ch.7 Exponential Functions Supplmentary Exercise Solutions
No ratings yet
4B Ch.7 Exponential Functions Supplmentary Exercise Solutions
9 pages
Notes MA1102 2018
No ratings yet
Notes MA1102 2018
81 pages
MATH 1231 Spring 24 Final Exam
No ratings yet
MATH 1231 Spring 24 Final Exam
15 pages
MA201 Linear Algebra and Complex Analysis PDF
No ratings yet
MA201 Linear Algebra and Complex Analysis PDF
3 pages
Matrix Types and Operations Guide
No ratings yet
Matrix Types and Operations Guide
6 pages
(Ebook PDF) Analysis and Design of Algorithms 3rd Ed. Edition by Amrinder Arora PDF Download
100% (4)
(Ebook PDF) Analysis and Design of Algorithms 3rd Ed. Edition by Amrinder Arora PDF Download
61 pages
3D Surface Reconstruction Method
No ratings yet
3D Surface Reconstruction Method
10 pages
1.6 - Maximum and Minimum Points of Quadratic Functions
No ratings yet
1.6 - Maximum and Minimum Points of Quadratic Functions
9 pages
3 - BC 2017-18 5.-6.3 FTC To Sketch and Find Total Change
No ratings yet
3 - BC 2017-18 5.-6.3 FTC To Sketch and Find Total Change
6 pages
Probabilistic Method in Combinatorics
No ratings yet
Probabilistic Method in Combinatorics
9 pages
Question Bank of Discrete Mathematics PDF
100% (1)
Question Bank of Discrete Mathematics PDF
7 pages
1 - Network Flow Models
No ratings yet
1 - Network Flow Models
43 pages
One-Sided Limits
No ratings yet
One-Sided Limits
3 pages
Class 12 Relations & Functions Test
No ratings yet
Class 12 Relations & Functions Test
5 pages
7 Network Flows: Objectives
No ratings yet
7 Network Flows: Objectives
14 pages
Discrete Mathematics: R. Johnsonbaugh
No ratings yet
Discrete Mathematics: R. Johnsonbaugh
70 pages
Summer 2020 MAT 216 Problem Sheet: Solve The Following Problems
No ratings yet
Summer 2020 MAT 216 Problem Sheet: Solve The Following Problems
3 pages
Daa Project Report 2
No ratings yet
Daa Project Report 2
14 pages
15-396 hw1
No ratings yet
15-396 hw1
3 pages
Parabolas: Mhel Brian G. Dayon (Em Bi) 0998-1591-389
100% (1)
Parabolas: Mhel Brian G. Dayon (Em Bi) 0998-1591-389
11 pages
Calculus Differentiation Rules
No ratings yet
Calculus Differentiation Rules
2 pages
Quiz Functions I
No ratings yet
Quiz Functions I
30 pages
BCA-234 Data Structures Laboratory
No ratings yet
BCA-234 Data Structures Laboratory
29 pages
LU Factorization
No ratings yet
LU Factorization
15 pages
Advanced Math for Engineering Students
No ratings yet
Advanced Math for Engineering Students
17 pages

Rozprawa

Uploaded by

Rozprawa

Uploaded by

Three small discoveries in the field of

. W rozdziale  pokazujemy, że trywialny algorytm 2-aproksymacyjny dla problemu Mi-

 The Minimum Maximal Matching problem 

 Proportional Approval Voting 

. Step : Three-valued x∗ . . . . . . . . . . . . . . . . . . . . . . . 

 k-Medians clustering with capacities 

[m] The set of (positive) natural numbers up to m: {1, . . . , m}.

We are Computer Scientists. Our role is to investigate—with mathematical scrutiny—

. Computational tasks and their complexity

EXAMPLE (Maximum Acyclic Undirected Subgraph).

EXAMPLE (Maximum Acyclic Subgraph).

Reduction  (Karp [Kar])

EXAMPLE (1/2-approximation for MAS).

In the field of approximation algorithms—which has been central to Computer

Theorem  (Guruswami, Manokaran, Raghavendra [GMR])

show that it cannot be improved upon, even by an algorithm with super-polynomial

 Although a similar thing could be said about many optimisation problems.

If we denote by val(L) the maximum fraction of right-side vertices satisfied by

Theorem  (Feige [Fei])

The second theorem is conditional—it relies on a conjecture called Gap-ETH :

Theorem  (Manurangsi [Man])

. Unique Games and variations

Problem  (Unique Label Cover, ULC(, t, R))

• (YES) There exists a subset V 0 ⊂ V of size |V 0 | > (1 − )|V | and a labelling l : U →

Problem  (ULC with weak expansion, ULCE(, R, δ))

• (YES) There exists a subset V 0 ⊂ V of size |V 0 | > (1 − )|V | and a labelling l : U →

Problem  (SSE(δ, η))

• (YES) There exists S ⊂ V of size |S| = δ|V | such that Φ(S) 6 η.

Algorithms and Complexity A series of polynomial-time algorithms for MMM on

Figure : The equivalence of MMM and EDS.

NP-hard: first by Yannakakis and Gavril on planar or bipartite graphs of maximum

Approximation algorithms By endpoint-counting, no matching can be more than

Par], hence, realistically, no further progress should be expected. Surprisingly, wMMM

Negative results Chlebík and Chlebíkova proved that it is impossible to approx-

. Pick a random input x ∈ {0, 1}R ;

. Pick a random S ⊂ [R] of size R. Define the sub-cubes:

Cx,S = z ∈ {0, 1}R | zj = xj ∀j < S ;

.. The vertex cover instance

Two properties are proved:

I S = v, Cx,S , b | v ∈ V 0 , l(v) < S, xl(v) = b .

.. Supplementing H with edges

y1 , y2 ∈ {0, 1}R such that

• b1 ⊕ y1 (i) 6 b2 ⊕ y2 (i) for all i < S1 ∪ S2  ; and

We only added edges to the graph, compared to Reduction , so Lemma  is (even

Proof. Take I S as defined in Lemma . For any v, Cx,S , b ∈ I S and (u, v) ∈ E, if

Lemma . There is a perfect matching in VH 0 \ I S.

Proof. We consider three groups of vertices v, Cx,S , b ∈ VH 0 \ I S.

This edge exists in EH 0 because 1 = 0 6 1.

. Bipartite graphs I: Bipartisation and path covers

• (YES) with a maximal matching of size n( 12 − ).

Theorem  is a direct consequence of this lemma.

.. Covering with paths

. Bipartite graphs II: From Strong ugc to MMM

bq/c bq/c −     q− q− bq/c +  bq/c + 

• {i, i + 1 (mod q)} for i , bq/2c.

The resulting graph is -regular and connected.

Lemma  (soundness, [Bha+, Lemma  reinterpreted]). For every constant γ,

With Lemmas  and  we can conclude.

• (YES) There is a maximal matching in G of size ( 12 − )|U |.

which gives us Theorem , as the (2 − )-approximation algorithm faced with a (YES)

. Bipartite graphs III: Hardness based on sseh

Problem  (Max UnCut Hypergraph Bisection)

.. Reducing from sseh to ugc

Reduction  (sseh to ugc: First attempt)

Reduction  (sseh to ugc: Second attempt)

Figure : In Reduction  (a), a vertex of H1 corresponds to a block (vector) of R

Reduction  (sseh to ugc: First attempt + random noises, [Ste, Reduc-

.. Max Uncut Hypergraph Bisection

• The set of vertices VH = V R×k × ΩR×k .

x ∈ ΩR×k , D ⊆ [R] × [k].

there is a permutation π ∈ (SR )k such that:

Reduction  (An incorrect reduction from ULC to MUCHB)

 This may be not unique. The choice of a permutation is arbitrary.

• Let εV be a small constant. For each A ∈ V R×k , TV (A) denotes a distribution

. Output a hyperedge (π(B0 ), π(x0 )) | ∃p ∈ [l], π = π1 | . . . |πk ∈ (SR )k , x0 ∈ CD (x),

Remark. Reductions  and  produce a weighted hypergraph (with weights assigned

Problem  (Unique Label Cover, ULC(, t, R))

• (YES) There exists a subset V 0 ⊂ V of size |V 0 | > (1 − )|V | and a labelling l : U →

Problem  (ULC with weak expansion, ULCE(, R, δ))

• (YES) There exists a subset V 0 ⊂ V of size |V 0 | > (1 − )|V | and a labelling l : U →

. Pick a random S ⊂ [R] of size R. Define the sub-cubes:

• (YES) with a maximal matching of size n( 12 − ).

• (YES) There is a maximal matching in G of size ( 12 − )|U |.

which gives us Theorem , as the (2 − )-approximation algorithm faced with a (YES)

Lemma . Let t0 be any hyperedge in EH (T0 ). Then D0 ∩ W (t0 ) = .

• (YES) There is a maximal matching in G of size ( 12 + )|U |.

Proposition . b∈B1 v∈V [b] scrw (W , v) 6 0.2 · |V |.

b∈B degW (b) = |B| · t) = (αw + 0.7) · |V |.