al8n/memberlist

Memberlist

Batteries-included, runtime-agnostic, WASM/WASI-friendly SWIM gossip membership and failure detection for Rust — a Sans-I/O protocol core with pluggable async drivers.

A Rust port of HashiCorp's memberlist, with improvements.

Introduction

memberlist manages cluster membership and member failure detection using a gossip-based protocol — the foundation any distributed system needs. It is eventually consistent but converges quickly (and the convergence rate is tunable through the protocol's knobs), and it tolerates network partitions by attempting to reach potentially-dead nodes through multiple routes.

Its protocol logic is a runtime-agnostic Sans-I/O state machine (memberlist-proto), modeled on quinn-proto; thin async drivers adapt it to tokio / smol, compio, and bare-metal no_std targets, so the same SWIM core runs on a server or a microcontroller. The memberlist crate is the high-level entry point: it wires that core to a ready-to-use async driver (memberlist-reactor) so you can join a cluster in a few lines, with tokio out of the box.

This is a Rust port of HashiCorp's memberlist, extended with a Sans-I/O architecture and no_std / bare-metal support. Every crate is WASM/WASI friendly and can compile to wasm32-unknown-unknown and wasm32-wasip1 with the appropriate features.

Highlights

  • Sans-I/O core. All protocol logic lives in memberlist-proto as pure state machines — no sockets, threads, or clocks — making it deterministic and exhaustively unit-tested. The drivers only shuttle bytes and time in and out.
  • Runtime-agnostic. Drive it from tokio, smol, or compio (thread-per-core) with no change to protocol behavior.
  • no_std and bare-metal. The core runs on alloc, and memberlist-smoltcp / memberlist-embassy bring full SWIM membership to embedded targets.
  • SWIM + Lifeguard. A faithful port of HashiCorp's memberlist: suspicion / refutation, indirect probes, push/pull anti-entropy, and the Lifeguard awareness extensions that keep detection robust under CPU starvation and network loss.
  • Pluggable transports. Plain TCP, TLS-over-TCP (rustls), or QUIC (quinn-proto) reliable planes — each with a UDP / datagram gossip plane carrying opt-in checksum, compression, and AEAD encryption.
  • Customizable. Bring your own Id, Address, AddressResolver, and delegates (alive / conflict / merge / event / node / ping).
  • Observable, à la carte. Opt into tracing — compiled out when unused.
  • Config-file & CLI friendly. Every *Options type optionally derives serde and clap, so configuration loads from a file or maps straight onto CLI flags + env (std-only).
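
To make the Sans-I/O idea concrete, here is a minimal sketch of a state machine that never opens a socket or reads a clock: the caller feeds in events together with the current time, then drains the actions it should perform. `ProbeMachine`, `Input`, and `Output` are hypothetical names chosen for illustration; they are not the memberlist-proto API.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Events the driver feeds in (hypothetical, not the memberlist-proto API).
#[derive(Debug, PartialEq)]
pub enum Input {
    /// A probe round should start against `target`.
    StartProbe { target: u32 },
    /// An ack from `target` arrived from the network.
    Ack { target: u32 },
}

/// Actions the driver must carry out on behalf of the state machine.
#[derive(Debug, PartialEq)]
pub enum Output {
    SendPing { target: u32 },
    MarkSuspect { target: u32 },
}

pub struct ProbeMachine {
    timeout: Duration,
    pending: Option<(u32, Instant)>,
    outbox: VecDeque<Output>,
}

impl ProbeMachine {
    pub fn new(timeout: Duration) -> Self {
        Self { timeout, pending: None, outbox: VecDeque::new() }
    }

    /// Feed one event plus the caller's notion of "now"; no I/O happens here.
    pub fn handle_input(&mut self, now: Instant, input: Input) {
        match input {
            Input::StartProbe { target } => {
                self.pending = Some((target, now + self.timeout));
                self.outbox.push_back(Output::SendPing { target });
            }
            Input::Ack { target } => {
                if matches!(self.pending, Some((t, _)) if t == target) {
                    self.pending = None;
                }
            }
        }
    }

    /// Advance time: an unanswered probe past its deadline becomes a suspicion.
    pub fn handle_timeout(&mut self, now: Instant) {
        if let Some((target, deadline)) = self.pending {
            if now >= deadline {
                self.pending = None;
                self.outbox.push_back(Output::MarkSuspect { target });
            }
        }
    }

    /// Drain the next action for the driver to execute.
    pub fn poll_output(&mut self) -> Option<Output> {
        self.outbox.pop_front()
    }
}
```

Because time is passed in as an argument, a test can step through a probe timeout without sleeping, which is what makes this style of core deterministic.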

The family

The crates split protocol logic from I/O, mirroring the quinn layering:

| Crate | Role |
| --- | --- |
| memberlist | this crate: batteries-included facade (core + default tokio driver) |
| memberlist-proto | Sans-I/O protocol state machines (no_std-capable) |
| memberlist-reactor | runtime-agnostic async driver (tokio & smol) |
| memberlist-compio | compio (thread-per-core, io_uring / IOCP) async driver |
| memberlist-embedded | shared no_std driving core for the embedded drivers |
| memberlist-smoltcp | executor-free no_std driver over smoltcp (caller-poll) |
| memberlist-embassy | embassy-net async no_std driver, built on memberlist-embedded |

Installation

Build requirement: memberlist-proto's build script invokes protoc to generate the wire codec, so the Protocol Buffers compiler must be on PATH when building (e.g. apt install protobuf-compiler, brew install protobuf). CI installs it via arduino/setup-protoc.

[dependencies]
memberlist = "0.9" # tokio runtime + tcp transport by default

For smol instead of tokio:

[dependencies]
memberlist = { version = "0.9", default-features = false, features = ["smol", "tcp"] }

For the compio (completion-based, thread-per-core) runtime:

[dependencies]
memberlist = { version = "0.9", default-features = false, features = ["compio", "tcp"] }

For bare-metal (no_std) targets, enable smoltcp (the executor-free engine) or embassy (the embassy-net async driver) — neither pulls in std:

[dependencies]
memberlist = { version = "0.9", default-features = false, features = ["embassy", "tcp"] }

The minimum supported Rust version (MSRV) is 1.96.0 (edition 2024).

Example

Common types (Options, MaybeResolved, delegates, …) are re-exported from the per-runtime module; the runtime-pinned constructors (tcp / tls / quic) live there too (memberlist::tokio, memberlist::smol, memberlist::compio).

use core::net::SocketAddr;
use memberlist::tokio::{tcp, MaybeResolved, Options, SocketAddrResolver, VoidDelegate};
use smol_str::SmolStr;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let advertise: SocketAddr = "127.0.0.1:7946".parse()?;

    // Start a node pinned to the tokio runtime (TCP reliable plane + UDP gossip).
    let node = tcp(
        &SocketAddrResolver,
        SmolStr::new("node-a"),
        MaybeResolved::Resolved(advertise),
        Options::new(),
        VoidDelegate::<SmolStr, SocketAddr>::new(),
    )
    .await?;

    // Join an existing cluster through one or more seed addresses.
    // join returns Ok(reached_set) on success, Err((reached_so_far, error)) on failure.
    let seed: SocketAddr = "127.0.0.1:7947".parse()?;
    let reached = node
        .join(&SocketAddrResolver, &[MaybeResolved::Resolved(seed)])
        .await
        .map_err(|(_, e)| e)?;

    println!("{} members online (reached {})", node.num_online_members(), reached.len());

    // Gracefully leave: broadcast a leave, then drain.
    node.leave().await?;
    Ok(())
}
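
The example above discards the partially reached set with `map_err(|(_, e)| e)`. If a partial join is acceptable for your deployment, the same `Result` shape lets you keep it. The sketch below uses a stand-in `try_join` function and a `JoinError` type, both hypothetical and std-only, that only mimic the `Ok(reached)` / `Err((reached_so_far, error))` shape described in the comment; they are not the crate's API, and the rule "all seeds must answer for `Ok`" is an assumption of the stand-in.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical error type standing in for the crate's join error.
#[derive(Debug)]
pub struct JoinError(pub String);

/// Stand-in with the Result shape described above: Ok(reached) on success,
/// Err((reached_so_far, error)) on failure. Not the memberlist API.
pub fn try_join(
    seeds: &[SocketAddr],
    reachable: &HashSet<SocketAddr>,
) -> Result<HashSet<SocketAddr>, (HashSet<SocketAddr>, JoinError)> {
    let reached: HashSet<SocketAddr> = seeds
        .iter()
        .copied()
        .filter(|seed| reachable.contains(seed))
        .collect();
    if reached.len() == seeds.len() {
        Ok(reached)
    } else {
        let missed = seeds.len() - reached.len();
        Err((reached, JoinError(format!("{missed} seed(s) unreachable"))))
    }
}

/// Treat a partial join as usable when at least one seed answered.
pub fn reached_or_fail(
    result: Result<HashSet<SocketAddr>, (HashSet<SocketAddr>, JoinError)>,
) -> Result<HashSet<SocketAddr>, JoinError> {
    match result {
        Ok(reached) => Ok(reached),
        Err((partial, _err)) if !partial.is_empty() => Ok(partial),
        Err((_, err)) => Err(err),
    }
}
```

Whether a partial join should count as success is a policy decision; a stricter service might log the partial set and still return the error.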

Feature flags

Pick one runtime, one or more transports, and any transforms you need.

  • Runtimes: tokio (default), smol, compio (thread-per-core), reactor (generic over an agnostic runtime), smoltcp / embassy / embedded (no_std).
  • Transports: tcp (default); tls + a backend (tls-rustls-ring, tls-rustls-aws-lc-rs); quic + a backend (quic-rustls-ring, quic-rustls-aws-lc-rs).
  • Compression (both planes): lz4, snappy, zstd, brotli.
  • Encryption (gossip + plain-TCP reliable, AEAD): aes-gcm, chacha20-poly1305.
  • Checksum (gossip plane): crc32, xxhash64, xxhash32, xxhash3, murmur3.
  • Config: serde (config-file round-trips) and clap (CLI flags + env) on the *Options types; std-only.
  • Other: cidr (IP allow-list admission), dns (DNS address resolver), getifs (auto-detect the advertise address from local interfaces), tracing.

Compression applies on both planes; AEAD encryption applies on the gossip plane and on plain-TCP reliable streams (QUIC and TLS reliable streams are already secure, so it is skipped there); checksum applies on the gossip plane only.
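
The plane rules in the paragraph above can be written down as a small lookup. This is only a sketch of the stated rules: `Plane`, `Transport`, `Transforms`, and `transforms_for` are hypothetical names, not types from the crate, and the function reports which transforms are eligible on a plane, assuming the matching features are enabled.

```rust
/// Reliable-plane transport (hypothetical names for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Transport {
    Tcp,
    Tls,
    Quic,
}

/// Which plane a message travels on.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Plane {
    Gossip,
    Reliable(Transport),
}

/// Transforms eligible on a plane, assuming the matching features are enabled.
#[derive(Debug, PartialEq)]
pub struct Transforms {
    pub compression: bool,
    pub encryption: bool,
    pub checksum: bool,
}

pub fn transforms_for(plane: Plane) -> Transforms {
    match plane {
        // The gossip plane carries all three opt-in transforms.
        Plane::Gossip => Transforms { compression: true, encryption: true, checksum: true },
        // Plain TCP streams get compression and AEAD encryption, but no checksum.
        Plane::Reliable(Transport::Tcp) => {
            Transforms { compression: true, encryption: true, checksum: false }
        }
        // TLS and QUIC streams are already secure, so AEAD encryption is skipped.
        Plane::Reliable(Transport::Tls | Transport::Quic) => {
            Transforms { compression: true, encryption: false, checksum: false }
        }
    }
}
```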

Observability

Enable features = ["tracing"] and install a subscriber in main:

fn main() {
    tracing_subscriber::fmt().init();
    // … start your node …
}

The driver task and memberlist-proto emit tracing events during probing, suspicion, gossip dissemination, push/pull anti-entropy, and join / leave.

Protocol

memberlist is based on "SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol". HashiCorp's developers extended the protocol in a number of ways: several extensions increase propagation speed and convergence rate, and a further set, which they call Lifeguard, makes memberlist more robust in the presence of slow message processing (due to factors such as CPU starvation and network delay or loss). For details, read HashiCorp's paper "Lifeguard: SWIM-ing with Situational Awareness" alongside the memberlist source.
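
As a rough illustration of how suspicion and refutation interact, the sketch below encodes the message-override rules from the SWIM paper using incarnation numbers. It is a simplified model, not this crate's implementation: `State` and `overrides` are hypothetical names, and rejoin handling after a confirmed death is not modeled.

```rust
/// Member states carried in gossip, each tagged with an incarnation number.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum State {
    Alive(u32),
    Suspect(u32),
    Dead(u32),
}

/// Returns true when `incoming` should replace `current` for the same member,
/// following the override rules in the SWIM paper (simplified).
pub fn overrides(incoming: State, current: State) -> bool {
    use State::{Alive, Dead, Suspect};
    match (incoming, current) {
        // A confirmed death is final in this sketch.
        (_, Dead(_)) => false,
        // Dead overrides Alive and Suspect at any incarnation.
        (Dead(_), _) => true,
        // Alive only wins with a strictly newer incarnation; this is how a
        // suspected node refutes the suspicion about itself.
        (Alive(i), Alive(j)) | (Alive(i), Suspect(j)) => i > j,
        // Suspect beats Alive at the same or an older incarnation.
        (Suspect(i), Alive(j)) => i >= j,
        // A newer Suspect replaces an older one.
        (Suspect(i), Suspect(j)) => i > j,
    }
}
```

A node that learns it is suspected at incarnation 3 bumps its own incarnation and gossips `Alive(4)`, which overrides `Suspect(3)` everywhere it spreads.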

Design

Unlike the original Go implementation, the Rust memberlist uses a highly generic, layered architecture: you can implement a component yourself and plug it in, and you can even bring your own Id and Address. The layers are:

  • Transport drivers

    The protocol logic is a runtime-agnostic Sans-I/O state machine (memberlist-proto, modeled on quinn-proto). Each driver pairs that core with one async runtime; protocol behavior is identical across all of them — select the one matching your runtime from the family above, or depend on a driver crate directly. Every driver carries three transports — plain TCP, TLS-over-TCP (rustls), and QUIC (quinn-proto) — each a reliable stream plane plus a UDP / datagram gossip plane. The runtime layer is provided by agnostic's Runtime (for the reactor driver), and each driver provides its own AddressResolver trait (with built-in resolvers such as SocketAddrResolver). You can bring your own Id, Address, and AddressResolver.

  • Delegate layer

    This layer lets your application react to the different kinds of messages the protocol handles.

    • Delegate is the trait clients implement to hook into the gossip layer. All methods must be thread-safe, as they can and generally will be called concurrently. It is split into focused sub-traits:

      • AliveDelegate — involve a client in processing a node "alive" message. When a node joins (through packet gossip or promised push/pull), its state is updated via an alive message; this hook can filter a node out using application-specific logic.
      • ConflictDelegate — inform a client that a joining node would cause a name conflict (two clients configured with the same name but different addresses).
      • EventDelegate — a simpler delegate that only receives notifications about members joining and leaving. Its methods may be called by multiple threads, but never concurrently, so you can reason about ordering.
      • MergeDelegate — involve a client in a potential cluster merge: on every promised push/pull — both the initial join and ongoing anti-entropy — the delegate is consulted and may veto the exchange. (This deliberately tightens HashiCorp's memberlist, which consults the merge delegate only on join.)
      • NodeDelegate — manage node-related events, e.g. metadata.
      • PingDelegate — notify an observer how long a ping round trip took, and write arbitrary bytes into ack messages. To stay meaningful for RTT estimates, it does not apply to indirect pings or to fallback pings sent over a promised connection.
    • CompositeDelegate splits the Delegate into multiple small delegates, so you don't have to implement the full Delegate when you only want to customize a few methods.
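
The composition idea can be sketched with plain traits: each hook has a no-op default, and a generic composite swaps in only the pieces you customize. Everything below (`EventHook`, `AliveHook`, `Noop`, `Composite`, `DenyPrefix`, `JoinLog`) is hypothetical and heavily simplified; the crate's real delegate traits and CompositeDelegate have their own signatures.

```rust
/// Hypothetical, simplified stand-in for an event hook.
pub trait EventHook {
    fn on_join(&mut self, _node: &str) {}
    fn on_leave(&mut self, _node: &str) {}
}

/// Hypothetical, simplified stand-in for an alive hook.
pub trait AliveHook {
    /// Return false to filter the node out.
    fn accept_alive(&self, _node: &str) -> bool {
        true
    }
}

/// No-op implementation used for every hook you do not customize.
pub struct Noop;
impl EventHook for Noop {}
impl AliveHook for Noop {}

/// Composite that starts with no-op hooks and swaps them in one at a time.
pub struct Composite<E = Noop, A = Noop> {
    pub events: E,
    pub alive: A,
}

impl Composite {
    pub fn new() -> Self {
        Composite { events: Noop, alive: Noop }
    }
}

impl<E, A> Composite<E, A> {
    pub fn with_events<E2: EventHook>(self, events: E2) -> Composite<E2, A> {
        Composite { events, alive: self.alive }
    }

    pub fn with_alive<A2: AliveHook>(self, alive: A2) -> Composite<E, A2> {
        Composite { events: self.events, alive }
    }
}

impl<E: EventHook, A: AliveHook> Composite<E, A> {
    /// Consult the alive hook first; notify the event hook only if accepted.
    pub fn handle_alive(&mut self, node: &str) -> bool {
        if self.alive.accept_alive(node) {
            self.events.on_join(node);
            true
        } else {
            false
        }
    }
}

/// Example custom hook: reject nodes whose name starts with a prefix.
pub struct DenyPrefix(pub &'static str);
impl AliveHook for DenyPrefix {
    fn accept_alive(&self, node: &str) -> bool {
        !node.starts_with(self.0)
    }
}

/// Example custom hook: record the names of accepted nodes.
#[derive(Default)]
pub struct JoinLog(pub Vec<String>);
impl EventHook for JoinLog {
    fn on_join(&mut self, node: &str) {
        self.0.push(node.to_string());
    }
}
```

With this shape, `Composite::new().with_alive(DenyPrefix("bad-"))` customizes admission while every other hook stays a no-op.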

Projects using memberlist

  • serf: a decentralized solution for service discovery and orchestration that is lightweight, highly available, and fault tolerant.
  • examples/toydb: a toy eventually-consistent distributed key-value database.

Q & A

  • Is the Rust memberlist implementation compatible with Go's memberlist?

    No, but yes! The Rust implementation uses a protobuf-like forward- and backward-compatible encoding, whereas Go's uses MessagePack. Interop is possible in theory (you would need to implement your own transport layer), but it is not recommended because the overhead would be significant.

  • If Go's memberlist adds more functionality, will this project support it too?

    Yes! And this project may also add functionality that Go's memberlist does not have, e.g. WASM support and bindings to other languages.
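
To illustrate why a protobuf-like encoding (mentioned in the first answer) can be forward- and backward-compatible, here is a minimal sketch of tag + length-delimited fields in which a decoder skips tags it does not know. This is an assumption-laden simplification: real Protocol Buffers tags also carry a wire type, and the actual memberlist-proto wire format may differ. All function names are hypothetical.

```rust
/// Append `value` as a LEB128 varint (the integer encoding Protocol Buffers uses).
pub fn put_varint(buf: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        buf.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    buf.push(value as u8);
}

/// Read a varint starting at `*pos`, advancing `*pos` past it.
pub fn get_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // too many continuation bytes
        }
        value |= u64::from(byte & 0x7f) << shift;
        if (byte & 0x80) == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Encode one field as tag, payload length, payload.
/// Simplification: every field in this sketch is length-delimited.
pub fn put_field(buf: &mut Vec<u8>, tag: u64, payload: &[u8]) {
    put_varint(buf, tag);
    put_varint(buf, payload.len() as u64);
    buf.extend_from_slice(payload);
}

/// Decode a message, keeping only the tags this version knows about and
/// skipping the rest. Returns None if the buffer is truncated or malformed.
pub fn decode_known<'a>(buf: &'a [u8], known: &[u64]) -> Option<Vec<(u64, &'a [u8])>> {
    let mut pos = 0;
    let mut out = Vec::new();
    while pos < buf.len() {
        let tag = get_varint(buf, &mut pos)?;
        let len = usize::try_from(get_varint(buf, &mut pos)?).ok()?;
        let end = pos.checked_add(len)?;
        let payload = buf.get(pos..end)?;
        pos = end;
        if known.contains(&tag) {
            out.push((tag, payload));
        }
    }
    Some(out)
}
```

Because every field announces its own length, an older node can step over a field added by a newer one without understanding it.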

Related Projects

  • agnostic: helps you develop runtime-agnostic crates.

License

memberlist is under the terms of the MPL-2.0 license. See LICENSE for details.

Copyright (c) 2025 Al Liu.

Copyright (c) 2013 HashiCorp, Inc.
