
Towards a Common API for Structured Peer-to-Peer Overlays 

Frank Dabek(1)  Ben Zhao(2)  Peter Druschel(3)  John Kubiatowicz(2)  Ion Stoica(2)

(1) MIT Laboratory for Computer Science, Cambridge, MA.
(2) University of California, Berkeley, CA.
(3) Rice University, Houston, TX.

Abstract

In this paper, we describe an ongoing effort to define common APIs for structured peer-to-peer overlays and the key abstractions that can be built on them. In doing so, we hope to facilitate independent innovation in overlay protocols, services, and applications, to allow direct experimental comparisons, and to encourage application development by third parties. We provide a snapshot of our efforts and discuss open problems in an effort to solicit feedback from the research community.

1 Introduction

Structured peer-to-peer overlay networks have recently gained popularity as a platform for the construction of resilient, large-scale distributed systems [6, 7, 8, 10, 11]. Structured overlays conform to a specific graph structure that allows them to locate objects by exchanging O(log N) messages, where N is the number of nodes in the overlay.

Structured overlays can be used to construct services such as distributed hash tables [4], scalable group multicast/anycast [3, 12], and decentralized object location [5]. These services in turn promise to support novel classes of highly scalable, resilient, distributed applications, including cooperative archival storage, cooperative content distribution and messaging.

Currently, each structured overlay protocol exports a different API and provides services with subtly different semantics. Thus, application designers must understand the intricacies of each protocol and the services they provide to decide which system best meets their needs. Subsequently, applications are locked into one system and unable to leverage innovations in other protocols. Moreover, the semantic differences make a comparative evaluation of different protocol designs difficult.

This work attempts to identify the fundamental abstractions provided by structured overlays and to define APIs for the common services they provide. As the first step, we have identified and defined a key-based routing API (KBR), which represents basic (tier 0) capabilities that are common to all structured overlays. We show that the KBR can be easily implemented by existing overlay protocols and that it allows the efficient implementation of higher level services and a wide range of applications. Thus, the KBR forms the common denominator of services provided by existing structured overlays.

In addition, we have identified a number of higher level (tier 1) abstractions and sketch how they can be built upon the basic KBR. These abstractions include distributed hash tables (DHT), group anycast and multicast (CAST), and decentralized object location and routing (DOLR). Efforts to define common APIs for these services are currently underway.

We believe that defining common abstractions and APIs will accelerate the adoption of structured overlays, facilitate independent innovation in overlay protocols, services, and applications, and permit direct experimental comparisons between systems.

Our APIs will not be universal. Certain applications will wish to use protocol-specific APIs that allow them to exploit particular characteristics of a protocol. This is necessary and desirable to facilitate innovation. However, we expect that such non-standard APIs, once properly understood and abstracted, can be added to the common APIs over time.

The rest of this paper is organized as follows. Section 2 provides an overview of structured overlays and the key services they provide. Next, Section 3 defines and differentiates current tier 1 services. Section 4 describes our KBR API and Section 5 evaluates our proposed API by demonstrating how it can be used to implement a variety of services and how existing overlay protocols can efficiently implement the API. Section 6 discusses future work: developing common APIs for higher level tier 1 services like distributed hash tables. We conclude in Section 6.

* This research was conducted as part of the IRIS project (http://project-iris.net/), supported by the National Science Foundation under Cooperative Agreement No. ANI-0225660.

2 Background

In this section, we define application-visible concepts common to all structured overlay protocols.

A node represents an instance of a participant in the overlay (one or more nodes may be hosted by a single physical IP host).
Participating nodes are assigned uniform random nodeIds from a large identifier space. Application-specific objects are assigned unique identifiers called keys, selected from the same id space. Tapestry [11, 5], Pastry [8] and Chord [10] use a circular identifier space of n-bit integers modulo 2^n (n = 160 for Chord and Tapestry, n = 128 for Pastry). CAN [7] uses a d-dimensional cartesian identifier space, with 128-bit nodeIds that define a point in the space.

Each key is dynamically mapped by the overlay to a unique live node, called the key's root. To deliver messages efficiently to the root, each node maintains a routing table consisting of the nodeIds and IP addresses of the nodes to which the local node maintains overlay links. Messages are forwarded across overlay links to nodes whose nodeIds are progressively closer to the key in the identifier space.

Each system defines a function that maps keys to nodes. In Chord, keys are mapped to the live node with the closest nodeId clockwise from the key. In Pastry, keys are mapped to the live node with the closest nodeId. Tapestry maps a key to the live node whose nodeId has the longest prefix match, where the node with the next higher nodeId value is chosen for each digit that cannot be matched exactly. In CAN, neighboring nodes in the identifier space agree on a partitioning of the space surrounding their nodeIds; keys are mapped to the node responsible for the space that contains the key.
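To make the key-to-node mapping concrete, the following is a minimal, illustrative Python sketch of a Chord-style mapping (not code from any of the cited systems): a key is mapped to the first live nodeId encountered clockwise from the key on the identifier circle. The function name and the use of a sorted list are our own assumptions for illustration.

from bisect import bisect_left

ID_BITS = 160
ID_SPACE = 2 ** ID_BITS

def chord_style_root(key, node_ids):
    """Return the nodeId responsible for key: the closest live nodeId
    clockwise from the key on the circular identifier space (illustrative)."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key % ID_SPACE)
    return ring[i % len(ring)]          # wrap around past the largest nodeId

# Example: three nodes on a toy ring
nodes = [10, 500, 10**40]
print(chord_style_root(42, nodes))      # -> 500 (first nodeId clockwise from 42)
print(chord_style_root(10**45, nodes))  # -> 10  (wraps around the circle)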
3 Abstractions

All existing systems provide higher level abstractions built upon the basic structured overlays. Examples are Distributed Hash Tables (DHT), Decentralized Object Location and Routing (DOLR), and group anycast/multicast (CAST).

[Figure 1: Basic abstractions and APIs, including Tier 1 interfaces: distributed hash tables (DHT), decentralized object location and routing (DOLR), and group anycast and multicast (CAST). The diagram shows tier 2 systems (CFS, PAST, i3, Scribe, SplitStream, Bayeux, OceanStore) layered over the tier 1 abstractions (DHT, CAST, DOLR), which in turn sit on the tier 0 key-based routing layer (KBR).]

Figure 1 illustrates how these abstractions are related. Key-based routing is the common service provided by all systems at tier 0. At tier 1, we have higher level abstractions provided by some of the existing systems. Most applications and higher-level (tier 2) services use one or more of these abstractions. Some tier 2 systems, like i3 [9], use the KBR directly.

The KBR API at tier 0 will be defined in detail in the following section. Here, we briefly explain the tier 1 abstractions and their semantic differences. The key operations of each of these abstractions are sketched in Table 1.

The DHT abstraction provides the same functionality as a traditional hashtable, by storing the mapping between a key and a value. This interface implements a simple store and retrieve functionality, where the value is always stored at the live overlay node(s) to which the key is mapped by the KBR layer. Values can be objects of any type. For example, the DHT implemented as part of the DHash interface in CFS [4] stores and retrieves single disk blocks by their content-hashed keys.

The DOLR abstraction provides a decentralized directory service. Each object replica (or endpoint) has an objectID and may be placed anywhere within the system. Applications announce the presence of endpoints by publishing their locations. A client message addressed with a particular objectID will be delivered to a nearby endpoint with this name. Note that the underlying distributed directory can be implemented by annotating trees associated with each objectID; other implementations are possible. One might ask why DOLR is not implemented on top of a DHT, with data pointers stored as values; this is not possible because a DOLR routes messages to the nearest available endpoint, providing a locality property not supported by DHTs. An integral part of this process is the maintenance of the distributed directory during changes to the underlying nodes or links.

The CAST abstraction provides scalable group communication and coordination. Overlay nodes may join and leave a group, multicast messages to the group, or anycast a message to a member of the group. Because the group is represented as a tree, membership management is decentralized. Thus, CAST can support large and highly dynamic groups. Moreover, if the overlay that provides the KBR service is proximity-aware, then multicast is efficient and anycast messages are delivered to a group member near the anycast originator.

The DOLR and CAST abstractions are closely related. Both maintain sets of endpoints in a decentralized manner and by their proximity in the network, using a tree consisting of the routes from the endpoints to a common root associated with the set. However, the DOLR abstraction is more tailored towards object location, while the CAST abstraction targets group communication. Thus, their implementations combine different policies with the same basic mechanism. The DHT abstraction, on the other hand, provides a largely orthogonal service, namely a scalable repository for (key, value) pairs.
DHT                     DOLR                            CAST
put(key, data)          publish(objectId)               join(groupId)
remove(key)             unpublish(objectId)             leave(groupId)
value = get(key)        sendToObj(msg, objectId, [n])   multicast(msg, groupId)
                                                        anycast(msg, groupId)

Table 1: Tier 1 Interfaces
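The operations in Table 1 can be read as abstract interfaces. Below is a minimal, non-normative Python rendering; the class names and Python typing are our own choices, not part of the proposed APIs.

from abc import ABC, abstractmethod

class DHT(ABC):
    @abstractmethod
    def put(self, key: bytes, data: bytes) -> None: ...
    @abstractmethod
    def remove(self, key: bytes) -> None: ...
    @abstractmethod
    def get(self, key: bytes) -> bytes: ...

class DOLR(ABC):
    @abstractmethod
    def publish(self, object_id: bytes) -> None: ...
    @abstractmethod
    def unpublish(self, object_id: bytes) -> None: ...
    @abstractmethod
    def send_to_obj(self, msg: bytes, object_id: bytes, n: int = 1) -> None: ...

class CAST(ABC):
    @abstractmethod
    def join(self, group_id: bytes) -> None: ...
    @abstractmethod
    def leave(self, group_id: bytes) -> None: ...
    @abstractmethod
    def multicast(self, msg: bytes, group_id: bytes) -> None: ...
    @abstractmethod
    def anycast(self, msg: bytes, group_id: bytes) -> None: ...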

Defining APIs for the DHT, DOLR and CAST interfaces is the subject of ongoing work. By defining an API for key-based routing and identifying the key tier 1 abstractions, we have taken a major first step.

4 Key-based routing API

In this section we describe the proposed key-based routing API. We begin by defining the notation and data types we will use to describe the API. Section 5.1 will show how we can use these calls to implement the DHT, DOLR and CAST higher level abstractions.

4.1 Data types

A key is a 160-bit string. A nodehandle encapsulates the transport address and nodeId of a node in the system. The nodeId is of type key; the transport address might be, for example, an IP address and port. Messages (type msg) contain application data of arbitrary length.

We adopt a language-neutral notation for describing the API that distinguishes read-only parameters from read-write parameters. We denote an ordered set p of objects of type T as T[] p.
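As an illustration only, the data types above might be rendered as follows in Python; the field names and the fixed-length check are our own assumptions.

from dataclasses import dataclass

KEY_BYTES = 20  # a key is a 160-bit string

@dataclass(frozen=True)
class Key:
    value: bytes                      # exactly 160 bits
    def __post_init__(self):
        assert len(self.value) == KEY_BYTES

@dataclass(frozen=True)
class NodeHandle:
    node_id: Key                      # the nodeId is itself of type key
    address: str                      # transport address, e.g. an IP address
    port: int                         # ... and port

@dataclass(frozen=True)
class Message:
    payload: bytes                    # application data of arbitrary length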

4.2 Routing messages

void route(key K, msg M, nodehandle hint) This operation forwards a message, M, towards the root of key K. The optional hint argument specifies a node that should be used as a first hop in routing the message. A good hint, e.g. one that refers to the key's current root, can result in the message being delivered in one hop; a bad hint adds at most one extra hop to the route. Either K or hint may be NULL, but not both. The operation provides a best-effort service: the message may be lost, duplicated, corrupted, or delayed indefinitely.

The route operation delivers a message to the key's root. Applications process messages by executing code in upcalls which are invoked by the KBR routing system at nodes along a message's path and at its root. To permit event-driven implementations, upcall handlers must not block and should not perform long-running computations.

void forward(key K, msg M, nodehandle nextHopNode) This upcall is invoked at each node that forwards message M, including the source node, and the key's root node (before deliver is invoked). The upcall informs the application that message M with key K is about to be forwarded to nextHopNode. The application may modify the M, K, or nextHopNode parameters, or terminate the message by setting nextHopNode to NULL. By modifying the nextHopNode argument the application can effectively override the default routing behavior. We will demonstrate examples of the use of this flexibility in Section 5.1.

void deliver(key K, msg M) This function is invoked on the node that is the root for key K upon the arrival of message M. The deliver upcall is provided as a convenience for applications.
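The relationship between route and the two upcalls can be sketched as a pair of Python interfaces. This is a non-normative illustration; the class names and the idea of registering the application with a KBR node are our own assumptions, not part of the proposal.

from abc import ABC, abstractmethod
from typing import Optional

class KBRApplication(ABC):
    """Upcalls invoked by the KBR layer on behalf of the application."""

    @abstractmethod
    def forward(self, key: bytes, msg: bytes,
                next_hop: Optional["NodeHandle"]) -> tuple:
        """Called at every node on the message's path (including source and
        root). May return modified (key, msg, next_hop); returning a None
        next_hop terminates the message."""

    @abstractmethod
    def deliver(self, key: bytes, msg: bytes) -> None:
        """Called only at the key's root when the message arrives."""

class KBRNode(ABC):
    """Tier 0 routing interface exposed to the application."""

    @abstractmethod
    def route(self, key: Optional[bytes], msg: bytes,
              hint: Optional["NodeHandle"] = None) -> None:
        """Best-effort forwarding of msg towards the root of key; hint, if
        given, is used as the first hop. Either key or hint may be None,
        but not both."""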
4.3 Routing state access

The API allows applications to access a node's routing state via the following calls. All of these operations are strictly local and involve no communication with other nodes. Applications may query the routing state to, for instance, obtain nodes that may be used by the forward upcall above as a next hop destination.

Some of the operations return information about a key's r-roots. The r-root is a generalization of a key's root. A node is an r-root for a key if that node becomes the root for the key if all of the i-roots fail for i < r. The node may be the r-root for keys in one or more contiguous regions of the ID space.

nodehandle[] local_lookup(key K, int num, boolean safe) This call produces a list of nodes that can be used as next hops on a route towards key K, such that the resulting route satisfies the overlay protocol's bounds on the number of hops taken.

If safe is true, the expected fraction of faulty nodes in the list is guaranteed to be no higher than the fraction of faulty nodes in the overlay; if false, the set may be chosen to optimize performance at the expense of a potentially higher fraction of faulty nodes. This option allows applications to implement routing in overlays with Byzantine node failures. Implementations that assume fail-stop behavior may ignore the safe argument. The fraction of faulty nodes in the returned list may be higher if the safe parameter is not true because, for instance, malicious nodes have caused the local node to build a routing table that is biased towards malicious nodes [1].

nodehandle[] neighborSet(int num) This operation produces an unordered list of nodehandles that are neighbors of the local node in the ID space. Up to num node handles are returned.
nodehandle[] replicaSet(key k, int max_rank) This operation returns an ordered set of nodehandles on which replicas of the object with key k can be stored. The call returns nodes with a rank up to and including max_rank. If max_rank exceeds the implementation's maximum replica set size, then its maximum replica set is returned. Some protocols ([11], [7]) only support a max_rank value of one. With protocols that support a rank value greater than one, the returned nodes may be used for replicating data, since they are precisely the nodes which become roots for the key k when the local node fails.

update(nodehandle n, bool joined) This upcall is invoked to inform the application that node n has either joined or left the neighbor set of the local node, as that set would be returned by the neighborSet call.

boolean range(nodehandle N, rank r, key lkey, key rkey) This operation provides information about the ranges of keys for which the node N is currently an r-root. The operation returns false if the range could not be determined, true otherwise. It is an error to query the range of a node not present in the neighbor set as returned by the update upcall or the neighborSet call. Certain implementations may return an error if r is greater than zero. [lkey, rkey] denotes an inclusive range of key values.

Some protocols may have multiple, disjoint ranges of keys for which a given node is responsible. The parameter lkey allows the caller to specify which region should be returned. If the node referenced by N is responsible for key lkey, then the resulting range includes lkey. Otherwise, the result is the nearest range clockwise from lkey for which N is responsible.
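Putting the routing state access calls together, a KBR node's local interface might look like the following Python sketch. It is illustrative only; method and type names are our assumptions, and real implementations would back these calls with protocol-specific routing tables, leaf sets or successor lists.

from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

class KBRRoutingState(ABC):
    """Strictly local queries over a node's routing state (no communication)."""

    @abstractmethod
    def local_lookup(self, key: bytes, num: int, safe: bool) -> List["NodeHandle"]:
        """Candidate next hops towards key; safe trades performance for a
        bounded fraction of faulty nodes."""

    @abstractmethod
    def neighbor_set(self, num: int) -> List["NodeHandle"]:
        """Up to num neighbors of the local node in the ID space (unordered)."""

    @abstractmethod
    def replica_set(self, key: bytes, max_rank: int) -> List["NodeHandle"]:
        """Nodes that would become roots for key if the local node failed,
        ordered by rank up to max_rank."""

    @abstractmethod
    def range(self, node: "NodeHandle", rank: int,
              lkey: bytes) -> Optional[Tuple[bytes, bytes]]:
        """Inclusive range of keys for which node is an r-root, or None if the
        range cannot be determined. (The paper's API returns the range through
        read-write lkey/rkey parameters; a tuple is used here instead.)"""

    @abstractmethod
    def update(self, node: "NodeHandle", joined: bool) -> None:
        """Upcall delivered to the application when node joins (joined=True)
        or leaves the local node's neighbor set."""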
5 Validating the API

To evaluate our proposed API, we show how it can be used to implement the tier 1 abstractions, and give examples of other common usages. We believe that the API is expressive enough to implement all the applications known to the authors that have to date been built on top of CAN, Chord, Pastry and Tapestry. We also discuss how the API can be supported on top of several representative structured overlay protocols.

5.1 Use of the API

Here we briefly sketch how the tier 1 abstractions (DHT, DOLR, CAST) can be implemented on top of the routing API. We also show how to implement a tier 2 application, the Internet Indirection Infrastructure [9], and other mechanisms and protocols such as caching and replication.

DHT. A distributed hash table (DHT) provides two operations: (1) put(key, value), and (2) value = get(key). A simple implementation of put routes a PUT message containing the value and the local node's nodehandle, S, using route(key, [PUT,value,S], NULL). The key's root, upon receiving the message, stores the (key, value) pair in its local storage. If the value is large, the insertion can be optimized by returning only the nodehandle R of the key's root in response to the initial PUT message, and then sending the value in a single hop using route(key, [PUT,value], R).

The get operation routes a GET message using route(key, [GET,S], NULL). The key's root returns the value and its own nodehandle in a single hop using route(NULL, [value,R], S). If the local node remembers R from a previous access to key, it can provide R as a hint.
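The DHT-over-KBR scheme described above can be sketched as follows in Python. This is a simplified, single-process illustration: the kbr object stands in for the routing layer, message tagging uses tuples, and the large-value optimization and hint reuse are omitted.

class SimpleDHT:
    """Store/retrieve (key, value) pairs at the key's root via the KBR."""

    def __init__(self, kbr, local_handle):
        self.kbr = kbr                    # provides route(key, msg, hint)
        self.me = local_handle            # this node's nodehandle
        self.store = {}                   # local storage at the root
        self.pending = {}                 # get requests awaiting a reply

    # --- client-side operations -------------------------------------------
    def put(self, key, value):
        self.kbr.route(key, ("PUT", key, value, self.me), hint=None)

    def get(self, key, callback):
        self.pending[key] = callback
        self.kbr.route(key, ("GET", key, self.me), hint=None)

    # --- deliver upcall at the key's root (and at the requester) -----------
    def deliver(self, key, msg):
        tag = msg[0]
        if tag == "PUT":
            _, k, value, sender = msg
            self.store[k] = value
        elif tag == "GET":
            _, k, requester = msg
            # reply directly to the requester in a single hop (hint = requester)
            self.kbr.route(None, ("VALUE", k, self.store.get(k), self.me),
                           hint=requester)
        elif tag == "VALUE":
            _, k, value, root = msg
            self.pending.pop(k, lambda v: None)(value)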
CAST. Group communication is a powerful building block in many distributed applications. We describe one approach to implementing the CAST abstraction described in Section 3. A key is associated with a group, and the key's root becomes the root of the group's multicast tree. Nodes join the group by routing a SUBSCRIBE message containing their nodehandle to the group's key. When the forward upcall is invoked at a node, the node checks if it is a member of the group. If so, it terminates the SUBSCRIBE message; otherwise, it inserts its own nodehandle into the message and forwards the message towards the group key's root, thus implicitly subscribing to the group. In either case, it adds the nodehandle of the joining node to its list of children in the group multicast tree.

Any overlay node may multicast a message to the group by routing an MCAST message using the group key. The group key's root, upon receiving this message, forwards the message to its children in the group's tree, and so on recursively. To send an anycast message, a node routes an ACAST message using the group key. The first node on the path that is a member of the group forwards the message to one of its children and does not forward it towards the root (it returns NULL for the next hop). The message is forwarded down the tree until it reaches a leaf, where it is delivered to the application. If the underlying KBR supports proximity, then the anycast receiver is a group member near the anycast originator.
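The join/multicast logic above relies on the forward upcall; a condensed Python sketch follows. Group state, the message layout and recursive fan-out are our own simplifications, and anycast leaf delivery and failure handling are omitted.

class SimpleCAST:
    """Group trees rooted at the group key's root, built via forward upcalls."""

    def __init__(self, kbr, local_handle):
        self.kbr = kbr
        self.me = local_handle
        self.children = {}                # group key -> set of child nodehandles
        self.member_of = set()            # groups this node has joined

    def join(self, group_key):
        self.member_of.add(group_key)
        self.kbr.route(group_key, ("SUBSCRIBE", self.me), hint=None)

    def multicast(self, group_key, payload):
        self.kbr.route(group_key, ("MCAST", group_key, payload), hint=None)

    # forward upcall: may rewrite the message or stop it (next_hop = None)
    def forward(self, key, msg, next_hop):
        if msg[0] == "SUBSCRIBE":
            already_on_tree = key in self.member_of or key in self.children
            self.children.setdefault(key, set()).add(msg[1])
            if already_on_tree:
                return key, msg, None                    # tree join stops here
            # not yet on the tree: forward our own subscription upwards
            return key, ("SUBSCRIBE", self.me), next_hop
        return key, msg, next_hop

    # deliver upcall: fan an MCAST message down the group's tree
    def deliver(self, key, msg):
        if msg[0] == "MCAST":
            group_key = msg[1]
            for child in self.children.get(group_key, ()):
                self.kbr.route(None, msg, hint=child)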
DOLR. A decentralized object location and routing (DOLR) layer allows applications to control the placement of objects in the overlay. The DOLR layer provides three operations: publish(objectId), unpublish(objectId), and sendToObj(msg, objectId, [n]).

The publish operation announces the availability of an object (at the physical node that issues this operation) under the name objectId. The simplest form of publish calls route(objectId, [PUBLISH, objectId, S], NULL), where S is the name of the originating node. At each hop, an application upcall handler stores a local mapping from objectId to S. More sophisticated versions of publish may deposit pointers along secondary paths to the root. The unpublish operation walks through the same path and removes the mappings.

The sendToObj operation delivers a message to n nearby replicas of a named object. It begins by routing the message towards the object's root using route(objectId, [n, msg], NULL). At each hop, the upcall handler searches for local object references matching objectId and sends a copy of the message directly to the closest n locations. If fewer than n pointers are found, the handler decrements n by the number of pointers found and forwards the original message towards objectId by again calling route(objectId, [n, msg], NULL).
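A compressed Python sketch of publish and sendToObj follows. The pointer table, message layout and the direct send helper are our own simplifications, and proximity ordering of pointers is not modelled.

class SimpleDOLR:
    """Location pointers deposited along the publish path; sendToObj follows them."""

    def __init__(self, kbr, local_handle, send):
        self.kbr = kbr
        self.me = local_handle
        self.send = send                  # send(msg, nodehandle): direct delivery
        self.pointers = {}                # objectId -> set of publisher nodehandles

    def publish(self, object_id):
        self.kbr.route(object_id, ("PUBLISH", object_id, self.me), hint=None)

    def send_to_obj(self, msg, object_id, n=1):
        self.kbr.route(object_id, ("TO_OBJ", object_id, n, msg), hint=None)

    # forward upcall at each node on the route towards the objectId's root
    def forward(self, key, msg, next_hop):
        if msg[0] == "PUBLISH":
            _, object_id, publisher = msg
            self.pointers.setdefault(object_id, set()).add(publisher)
        elif msg[0] == "TO_OBJ":
            _, object_id, n, payload = msg
            here = list(self.pointers.get(object_id, ()))[:n]
            for endpoint in here:
                self.send(payload, endpoint)          # deliver to nearby replicas
            n -= len(here)
            if n <= 0:
                return key, msg, None                 # satisfied: stop routing
            msg = ("TO_OBJ", object_id, n, payload)   # keep looking for the rest
        return key, msg, next_hop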
Internet Indirection Infrastructure (i3). i3 is a communication infrastructure that provides indirection, that is, it decouples the act of sending a packet from the act of receiving it [9]. This allows i3 to provide support for mobility, multicast, anycast and service composition.

There are two basic operations in i3: sources send packets to a logical identifier, and receivers express interest in packets by inserting a trigger into the network. In their simplest form, packets are of the form (id, data) and triggers are of the form (id, addr), where addr is either an identifier or an IP address.¹ Given a packet (id, data), i3 will search for a trigger (id, addr) and forward data to addr. IDs in packets are matched with those in triggers using longest prefix matching. i3 IDs are 256 bits long, and the matching prefix is at least 128 bits long.

To insert a trigger (id, addr), the receiver calls route(h(id), [id, addr], NULL), where h() is a hash function that converts the 128-bit prefix of id into a unique 160-bit key (for example, by padding it with zeros). This message is routed to the node responsible for h(id), where the trigger is stored. Note that all triggers whose IDs have the same prefix are stored at the same node; thus the longest prefix matching is done locally. Similarly, a host sending a packet (id, data) invokes route(h(id), [id, data], NULL). When the packet arrives at the node responsible for h(id), the packet's id is matched with the trigger's id and the data is forwarded to the corresponding destination. To improve efficiency, a host may cache the address of the server where a particular id is stored, and use it as a hint when invoking the route primitive for that id.

¹ To support service composition and scalable multicast, i3 generalizes the packet and trigger formats by replacing the id of a packet and the addr field of a trigger with a stack of identifiers. However, since these generalizations do not affect our discussion, we ignore them here.
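As an illustrative sketch only (not i3's actual implementation), trigger insertion and packet forwarding on top of the KBR might look like this in Python; the routing-key helper is our own stand-in for the hash function described above, and the send callable is a hypothetical direct-delivery hook.

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bits shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x == y:
            n += 8
            continue
        n += 8 - (x ^ y).bit_length()
        break
    return n

def i3_routing_key(i3_id: bytes) -> bytes:
    """Map a 256-bit i3 identifier to a 160-bit KBR key by taking its
    128-bit prefix and padding with zeros (stand-in for the hash above)."""
    assert len(i3_id) == 32               # 256 bits
    return i3_id[:16] + b"\x00" * 4       # 128-bit prefix padded to 160 bits

class I3OverKBR:
    def __init__(self, kbr, send):
        self.kbr = kbr
        self.send = send                  # send(data, addr): deliver to a receiver
        self.triggers = {}                # stored at the responsible node

    def insert_trigger(self, i3_id, addr):
        self.kbr.route(i3_routing_key(i3_id), ("TRIGGER", i3_id, addr), hint=None)

    def send_packet(self, i3_id, data):
        self.kbr.route(i3_routing_key(i3_id), ("PACKET", i3_id, data), hint=None)

    # deliver upcall at the node responsible for the routing key
    def deliver(self, key, msg):
        if msg[0] == "TRIGGER":
            _, i3_id, addr = msg
            self.triggers.setdefault(i3_id[:16], []).append((i3_id, addr))
        elif msg[0] == "PACKET":
            _, i3_id, data = msg
            # longest-prefix match among triggers sharing the 128-bit prefix
            candidates = self.triggers.get(i3_id[:16], [])
            if candidates:
                best = max(candidates,
                           key=lambda t: common_prefix_len(t[0], i3_id))
                self.send(data, best[1])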
Replication. Applications like DHTs use replication to ensure that stored data survives node failure. To replicate a newly received key k r times, the application calls replicaSet(k, r) and sends a copy of the key to each returned node. If the implementation is not able to return r suitable neighbors, then the application itself is responsible for determining replica locations.

Data Maintenance. When a node's identifier neighborhood changes, the node will be required to move keys to preserve the mapping of keys to nodes, or to maintain a desired replication level. When the update upcall indicates that a node n has joined the identifier neighborhood, the application calls range(n, i) for i = 0, ..., r and transfers to n any keys which fall in the returned ranges. This has the effect of both transferring data to a node which has taken over the local node's key space (i = 0) and maintaining replicas (i > 0). This description assumes that a node is using r replicas as returned by replicaSet.
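The replication and data-maintenance recipes above might be combined as in the following sketch; the storage layout, the transfer helper and the use of the node's own id as the lkey argument are our own assumptions, and error handling for failed range queries is omitted.

class ReplicatedStore:
    """Keep r replicas of each key using replicaSet, range and the update upcall."""

    def __init__(self, kbr, r, transfer):
        self.kbr = kbr                    # assumed to expose replica_set() and range()
        self.r = r
        self.transfer = transfer          # transfer(key, value, nodehandle)
        self.store = {}

    def on_new_key(self, key, value):
        self.store[key] = value
        for node in self.kbr.replica_set(key, self.r):
            self.transfer(key, value, node)           # push replica copies

    # update upcall: a node joined (or left) the identifier neighborhood
    def update(self, node, joined):
        if not joined:
            return                                    # departures handled elsewhere
        for i in range(self.r + 1):                   # i = 0 (root) .. r (replicas)
            key_range = self.kbr.range(node, i, node.node_id)  # illustrative lkey
            if key_range is None:
                continue
            lo, hi = key_range
            for k, v in list(self.store.items()):
                if lo <= k <= hi:                     # inclusive range [lo, hi]
                    self.transfer(k, v, node)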
Caching. Applications like DHTs use dynamic caching to create transient copies of frequently requested data in order to balance query load. It is desirable to cache data on the nodes that appear on the route which request messages take towards a key's root, because such nodes are likely to receive future request messages. A simple scheme places a cached copy of a data item on the last node of the route prior to the node that provides the data. Caching can be implemented as follows. A field is added to the request message to store the nodehandle of the previous node on the path. When the forward upcall is invoked, each node along the message's path checks whether it stores the requested data. If not, it inserts its nodehandle into the message and allows the lookup to proceed. If the node does store the data, it sends the data to the requester and sends a copy of the data to the previous node on the request path. The node then terminates the request message by setting nextHopNode to NULL.
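A sketch of this caching scheme as a forward upcall follows; the message tuple layout and the send helper are our own assumptions.

class PathCachingLookup:
    """Cache a copy of requested data on the previous node of the lookup path."""

    def __init__(self, kbr, local_handle, send):
        self.kbr = kbr
        self.me = local_handle
        self.send = send                  # send(msg, nodehandle): direct delivery
        self.data = {}                    # locally stored or cached items

    def lookup(self, key, requester):
        # the request carries the requester and the previous hop on the path
        self.kbr.route(key, ("REQUEST", key, requester, None), hint=None)

    # forward upcall at each node on the path towards the key's root
    def forward(self, key, msg, next_hop):
        if msg[0] != "REQUEST":
            return key, msg, next_hop
        _, k, requester, prev_hop = msg
        if k in self.data:
            self.send(("DATA", k, self.data[k]), requester)      # answer the query
            if prev_hop is not None:
                self.send(("CACHE", k, self.data[k]), prev_hop)  # cache one hop back
            return key, msg, None                                # stop the request
        # not stored here: record ourselves as the previous hop and keep routing
        return key, ("REQUEST", k, requester, self.me), next_hop

    def on_cache(self, k, value):
        self.data[k] = value              # invoked when a CACHE message arrives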
5.2 Implementation

Here we sketch how existing structured overlay protocols can implement the proposed API. While the chosen example systems (CAN, Chord, Pastry, Tapestry) do not constitute an exhaustive list of structured overlays, they represent a cross-section of existing systems and support our claim that the API can be implemented widely and easily.

5.2.1 CAN

The route operation is supported by existing operations, and the hint functionality can be easily added. The range call returns the range associated with the local node, which in CAN can be represented by a binary prefix. local_lookup is a local routing table lookup and currently ignores the value of safe. The update operation is triggered every time a node splits its namespace range, or joins its range with that of a neighbor.

5.2.2 Chord

Route is implemented in an iterative fashion in Chord. At each hop, the local node invokes an RPC at the next node in the lookup path; this RPC invokes the appropriate upcall (route or deliver) and returns a next hop node. If a hint is given, it is used as the first hop in the search instead of a node taken from the local routing table. The local_lookup call returns the closest num successors of K in the node's location table. Calls to neighborSet and replicaSet return the node's successor list; neighborSet calls additionally return the node's predecessor. The range call can be implemented by querying the successor list: given the ith node in the list, its range runs from the nodeId of the (i-1)th node (exclusive) to its own nodeId (inclusive). The exception to this rule is the predecessor; the range of the predecessor cannot be determined.
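To illustrate the Chord mapping of the range call, under the assumption that a node's range runs from the previous nodeId (exclusive) to its own nodeId (inclusive), a toy computation over a successor list might look like this:

ID_SPACE = 2 ** 160

def chord_ranges(local_id, successor_list):
    """Return one (lkey, rkey) pair per successor: the keys it is responsible
    for, i.e. the interval from the previous nodeId (exclusive) to its own
    nodeId (inclusive). Illustrative sketch only."""
    ranges = []
    prev = local_id
    for succ in successor_list:
        lkey = (prev + 1) % ID_SPACE      # exclusive of the previous nodeId
        ranges.append((lkey, succ))       # inclusive of the successor itself
        prev = succ
    return ranges

# Example on a toy ring (small ids for readability)
print(chord_ranges(100, [200, 350, 500]))
# -> [(101, 200), (201, 350), (351, 500)]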
5.2.3 Pastry

The route operation can be trivially implemented on top of Pastry's route operation. The hint argument, if present, supersedes the routing table lookup. The range operation is implemented based on nodeId comparisons among the members of Pastry's leafset. local_lookup translates into a simple lookup of Pastry's routing table; if safe is true, the lookup is performed in Pastry's constrained routing table [1]. The update operation is triggered by a change in Pastry's leafset, and the neighbor set (returned by neighborSet) consists of the leafset.

5.2.4 Tapestry

The route operation is identical to the Tapestry API call TapestryRouteMsg, forwarded to the hint argument if present. Tapestry routing tables optimize performance and maintain a small set (generally three) of nodes which are the closest nodes maintaining the next-hop prefix matching property. The local_lookup call retrieves the optimized next hop nodes. The safe routing mode is not used by the current Tapestry implementation, but may be used in future implementations. The range operation returns a set of ranges, one for each combination of levels at which the node can be surrogate routed to. The update operation is triggered when a node receives an acknowledged multicast for a new inserting node, or when it receives an object movement request during node deletion [5].

6 Discussion and future work

Settling on a particular key-based routing API was complicated by the tight coupling between applications and the lookup systems on which they were developed. Current block replication schemes, especially the neighbor set replication used by Chord and Pastry, are closely tied to the manner in which keys are mapped to nodes. Supporting efficient data replication independent of the lookup system necessitates the range and replicaSet calls, which allow a node to determine where to replicate keys. The common practice of caching blocks along probable lookup paths also requires additional flexibility in the API, namely the upcall mechanism, which allows application procedures to execute during the lookup.

The KBR API described here is intended to be language neutral to allow the greatest possible flexibility for implementors of lookup systems. Without specifying a precise binding of the API in a language, application developers will not be able to trivially change which system they use. Instead, the API directs developers to structure their applications in such a way that they can be translated from one system to another with a minimum of effort. One possibility for true portability among structured P2P systems would be to implement the API as an RPC program.

In the future, we will better articulate APIs for tier 1 services such as DHT, DOLR and CAST, including clear definitions of functional and performance expectations. We made a stab at this in Section 3, but more work must be done. In particular, the similarities between DOLR and CAST are striking and demand further exploration. It is at the level of tier 1 abstractions that structured peer-to-peer overlays take on their greatest power and utility. We hope that the effort detailed in this paper is the beginning of a convergence of functionality toward common services available to all peer-to-peer application writers.

References

[1] Castro, M., Druschel, P., Ganesh, A., Rowstron, A., and Wallach, D. S. Secure routing for structured peer-to-peer overlay networks. In Proceedings of OSDI (December 2002).

[2] Castro, M., Druschel, P., Kermarrec, A.-M., Nandi, A., Rowstron, A., and Singh, A. SplitStream: High-bandwidth content distribution in a cooperative environment. In Proceedings of IPTPS'03 (February 2003).

[3] Castro, M., Druschel, P., Kermarrec, A.-M., and Rowstron, A. SCRIBE: A large-scale and decentralized application-level multicast infrastructure. IEEE JSAC 20, 8 (Oct. 2002).

[4] Dabek, F., Kaashoek, M. F., Karger, D., Morris, R., and Stoica, I. Wide-area cooperative storage with CFS. In Proceedings of SOSP (Oct. 2001).

[5] Hildrum, K., Kubiatowicz, J. D., Rao, S., and Zhao, B. Y. Distributed object location in a dynamic network. In Proceedings of SPAA (Winnipeg, Canada, August 2002), ACM.

[6] Maymounkov, P., and Mazieres, D. Kademlia: A peer-to-peer information system based on the XOR metric. In Proceedings of IPTPS (2002).

[7] Ratnasamy, S., Francis, P., Handley, M., Karp, R., and Shenker, S. A scalable content-addressable network. In Proc. ACM SIGCOMM (San Diego, 2001).

[8] Rowstron, A., and Druschel, P. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proceedings of IFIP/ACM Middleware (Nov. 2001).

[9] Stoica, I., Adkins, D., Zhuang, S., Shenker, S., and Surana, S. Internet indirection infrastructure. In Proceedings of SIGCOMM (August 2002), ACM.

[10] Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., and Balakrishnan, H. Chord: A scalable peer-to-peer lookup service for internet applications. In Proc. ACM SIGCOMM (San Diego, 2001).

[11] Zhao, B., Kubiatowicz, J., and Joseph, A. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Tech. Rep. UCB/CSD-01-1141, Computer Science Division, U. C. Berkeley, Apr. 2001.

[12] Zhuang, S. Q., Zhao, B. Y., Joseph, A. D., Katz, R. H., and Kubiatowicz, J. D. Bayeux: An architecture for scalable and fault-tolerant wide-area data dissemination. In Proceedings of NOSSDAV (June 2001).
