Linian SIAM1992
Linian SIAM1992
Received by the editors July 17, 1989; accepted for publication (in revised form) December 4,
1990.
Computer Science Department, Hebrew University, Jerusalem 91904, Israel. Part of this work
was done while the author was visiting the IBM Research Center, Almaden, California, and the De-
partment of Computer Science, Stanford University, Stanford, California 94305. This work was sup-
ported in part by grants from the Israel-US Binational Science Foundation and the Israeli Academy
of Sciences.
193
194 NATHAN LINIAL
In most cases ID is a bijection onto 1,..., V I- It is assumed that at time zero the
processor occupying a node in G knows the ID of that node. Incidentally, all our
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
lower bounds hold even if every processor knows in advance what the graph G is, and
only the labeling function ID is unknown.
It is clear that the present model allows us to compute every function of G in
time O(diameter(G)). After this amount of time every processor obtains complete
knowledge of both G and ID. The problem is thus solved if, in the memory of each
processor a solution for the entire problem is prestored, for every possible labeling.
Our concern is therefore only with time complexities below diam(G). The major
question we raise is which functions may be computed by a nontrivial algorithm in
this model, in the sense that they can be computed faster than diam(G).
The model proposed here is, of course, of purely theoretical interest. Of the many
difficulties arising in distributed processing, it focuses only on transforming local data
into a global solution. Further research into this model may help classify problems as
either locally computable (solvable in time shorter than diam(G)) or not. One may
also look for bounds on run times which depend on graph parameters other than the
diameter. In accordance with the theory of the complexity class NC, it seems natural
to investigate graph problems whose time complexity in this model is polylogarithmic
in the number of vertices.
Here are our main results:
(1) Finding a maximal independent set distributively in a labeled n-cycle, requires
time (log* n). This bound is tight in view of the O(log* n) algorithm by Cole and
Vishkin [CV]. (Technically, their result was stated in the PRAM model, but it extends
without change to the distributed model as well.) Our proof relies on the interesting
construct of neighborhood graphs. An alternative proof based on the Ramsey theorem
was found by a number of other investigators in the area [A].
(2) Coloring trees: Let T be the d-ary tree ofheight r. In time 2r/3, it is
impossible to color T in fewer than colors. Note in contrast the algorithm by
Goldberg and Plotkin [GP], which shows that if every node in T "knows its parent
in the tree," i.e., a consistent orientation from the root outwards is given, then a
3-coloring can be found in time O(log* n). Though stated for the PRAM model, it is
easy to see that this result of [GP] applies to the distributed model as well.
(3) In a labeled graph of order n with maximal degree A, it is possible to find an
O(A2)-coloring in time (1 + o(1))log* n. This was previously shown in [GP] only in
the case of constant A.
Our terminology is standard: a k-labeling of a graph G (V, E) is 1:1 mapping
f V --. {1,..-,k}. In case k -[ V a k-labeling is called a labeling. Given a k-
labeling, G is said to be k-labeled, etc. Logarithms are to base 2. The k times iterated
logarithm is denoted by log (k) x, i.e., log(l) x :- log x and log(k) x log(log k-1 x).
The least integer k for which log (k) x < 1 is denoted by log* x.
2. Lower bound on finding a maximal independent set in a cycle. In
[CV] a very nice algorithm was presented to find a maximal independent set of vertices
(-- MIS) in the n-cycle Cn in time log* n. In this section we show that this is optimal
even in the present model, where computation takes no time. The algorithm presented
in 4 also achieves this time bound.
A basic observation is that in the present model there is no loss of generality
in assuming that processing proceeds by first collecting all data and then deciding.
That is, at time t each processor knows the labeling of all nodes at distance t or less
away. Also known are all edges between these nodes, except for edges both endpoints
LOCALITY IN DISTRIBUTED GRAPH ALGORITHMS 195
of which are at distance exactly t. Note that no further information can reach a
processor by time t. This allows us to view the problem in purely combinatorial
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
terms.
Let us state our theorem.
THEOREM 2.1. A synchronous distributed algorithm which finds a maximal in-
dependent set in a labeled n-cycle must take at least 1/2(log* n- 1) units of time. An
algorithm of the same class which colors the n-cycle with three colors requires time at
least 1/2 (log* n- 3). The same bounds hold also .for randomized algorithms.
Proof. The proof holds even under the assumption that there is a consistent
notion of clockwise orientation common to all processors. Given an algorithm which
finds a maximal independent set in the n-cycle endowed with a clockwise orientation,
it is easily seen that in one more timestep, the cycle may be 3-colored. The lower
bound is established for 3-coloring.
Coming back to the previous observation, at time t the data known to a processor
P is an ordered list of 2t / 1 labels, starting t places before it, through its own and on
to the next t labels. Let V be the set of all vectors (Xl,... ,x2t+l) where the xi are
mutually distinct integers from (1,..., n}. The algorithm is nothing but a mapping
c from V into { 1, 2, 3}.
Let us denote by Bt,n the graph whose set of vertices is V. All edges of Bt,n are
given by:
(xl,"’,x2t+l) and (y, xl,’.’,x2t)
are neighbors for all y
of degree 2(n- 2t- 1). Note that the mapping c: V
3-coloring of Bt,n. For suppose that c assigns
(Xl,’",x2t+l) and (y, xl,’",x2t)
-
# x2t+l. So St,, has n(n- 1)... (n-2t) vertices and is regular
{1, 2, 3} is, in fact, a proper
the same color. Then the 3-coloring algorithm for the n-cycle fails in case the labeling
happens to contain the segment:
y Xl X2 X2t+l.
The proof follows now by standard graph-theoretic arguments which show that the
chromatic number x(B,n) of B,n satisfies
Proof.The statement concerning Dl,n is just the definition. For the other claim,
identify the edge connecting (x,...,xs) and (x2,...,xs,y) in D,n with the vertex
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
__
This is easily seen to be a 2 k vertex-coloring of G. Indeed if u (x, y) E E(G), then
(u) e c(x) but (u) c(y), or else is improper. Therefore c(x) c(y).
The main claim can be derived now. If an MIS can be found in time t- 1, then
necessarily
3 >_ x(B,).
But
x(Bt,n) x(D2t+l,n) log (2t) n.
So
2t _> log*n- 1, t _> (log 2*n- 1),
as claimed.
The claim on randomized algorithms is proved in Corollary 2.1 below. [:]
In contrast with the low time complexity of 3-coloring, we can show that for an
even n, finding a 2-coloring of Cn requires time f(n).
THEOREM 2.2. A synchronous distributed algorithm which 2-colors a labeled 2n
cycle with labels from {1,..., 2n} must take at least n- 1 units of time.
Proof. Now the lowest t has to be found such that Bt,2n is bipartite. But even
for t n- 2, the graph Bt,2n contains an odd cycle:
and
x(Bt,n) X(Nt(Cn)).
function which computes a color for a vertex from its t-neighborhood and the random
a. Fix a graph G, an integer t, and two adjacent vertices x and y in G. Consider
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
,
neighborhoods of x, y, and a random string a. The color is chosen based on seeing
the labeled neighborhood and the random a must differ from the one given for r/, a.
In other words, the coloring function constitutes a valid coloring for a graph which is
the disjoint union of copies of Nt (G), one copy per each a. This graph has, of course,
the same chromatic number as Nt(G), and the claim follows. D
1
2
>
and so Td,r cannot be colored with fewer than
follows now on observing that for Td,r,
log n
>
1/2
>
colors in time t. The conclusion
> r.
log(d- 1)
Corollary 2.1 establishes the bound for randomized algorithms as well. D
Two remarks are in order now: It is probably possible to improve the lower
bound of 1/2vf to (d/logd) by using an appropriate random graph rather than
Ramanujan graphs (e.g., [B, 11.4]). It is shown in the next section that for a d-
regular graph, an O(d2)-coloring can be found in time O(log* n). The gap between
(d/log d) and d2 is quite intriguing and is closely related to the complexity of finding
an MIS distributively, as we explain below. Also, it is not clear how the number
of colors sufficient to color the tree goes down as t grows from (2r/3) to 2r, where
already two colors suffice.
4. An O(log* n) algorithm for O(A2)-coloring. This section contains some
positive results on what can be achieved distributively in this model. It is shown how
to find an O(A2)-coloring in time O(log* n). The first proof is based on the existence of
certain hypergraphs for which no explicit construction is known. However, as Karloff
pointed out [K], a slightly weaker, though constructive, set system suffices to achieve
this goal.
THEOREM 4.1. Let G be a graph of order n and largest degree A. It is possible
to color G with 5A 2 log n colors in one unit of time distributively. Equivalently,
LEMMA 4.1. For integers n > A, there is a family J o.f n subsets o.f (1,...,
5 A2 log n] } such that if Fo,..., FA E J, then
_
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Proof. The existence of such families was considered in the literature [KSS],
[EFF], and follows, e.g., from Theorem 3.1 in JEFF]. This lemma is simple enough,
though, for a proof to be reproduced here. To prove the lemma, let m- 5A21ogn]
and consider a random collection J of n subsets of { 1,--., m} which is constructed as
follows" For 1 _< x _< m and 1 _< i n, let Pr(x Si) (l/A). All the decisions on
whether x Si are made independently.
We claim that there is a selection of such a family for which the lemma holds.
The probability that for a given 1 _< x <_ m and given F0,..., FA J (i.e., Fi S,
for appropriately chosen indices), there holds
i8
1( i-
->4A"
1
then there is a selection of a family J which satisfies the lemma. Standard estimates
show that this holds when
m > 5A 2 log n.
To prove the theorem, fix a family J {F1,..., Fn} as in the lemma. Consider
vertex with label i whose neighbors’ labels are j,..., jd where d _< A. Since
d
there is a 1 _< x _< m with x e F\(U= Fj.). The color of this vertex is x. It is
easily verified that this is a proper m-coloring of G.
The same proof yields, more generally, the following corollary.
COROLLARY 4.1. Let G be a graph whose vertices are properly colored with k
colors and whose largest degree is A. It is possible to color G with 5A 2 log k colors in
one unit of time distributively.
By iterating the coloring given by the corollary log* n times, a 10A 2 log A-coloring
is obtained. To eliminate the log A term we need a result of the same type as Lemma
4.1, only in that range.
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
REFERENCES
Downloaded 11/22/13 to 132.64.42.56. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php