Oplog entries referencing unreachable blocks replicate forever — need entry expiry or compaction

## Problem

OrbitDB's append-only oplog has no mechanism to expire or compact entries whose referenced blocks are permanently unavailable. Once an orphaned entry enters the oplog (e.g. from a corrupt write, a blockstore wipe, or the Helia v6 streaming blockstore bug), it replicates to every peer forever.

Every peer that receives the entry attempts to load the referenced identity block via bitswap, fails (because no peer has it), and retries on every sync cycle — indefinitely. This fills logs with `LoadBlockFailedError` / `Want was aborted` errors and wastes network resources.

## How orphaned entries get created

1. **Helia v6 streaming blockstore incompatibility** — helia v6 changed `blockstore.get()` to return `AsyncGenerator<Uint8Array>` instead of `Promise<Uint8Array>`. OrbitDB v3.0.2 expects the old API. When the streaming response is consumed incorrectly, identity blocks get written with garbled bytes. The CID is valid but the content doesn't match. These entries then replicate to all peers.

2. **Disk-full or I/O error during write** — if the blockstore write fails partway through (disk full, I/O error), the block may be partially written. A subsequent integrity check or restart detects the corruption and removes the block locally, but the oplog entry referencing it has already been replicated to peers.

3. **Blockstore wipe after integrity check** — the application detects corrupt blocks on startup and wipes the blockstore to recover. The oplog entries that referenced those blocks now reference CIDs that no longer exist anywhere on the network.

In all cases, the oplog entry is valid CBOR with a valid structure; it simply references a block (typically an identity block) that doesn't exist on any peer. Since the oplog is append-only with no expiry, these orphaned entries persist forever.
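
The first cause above comes down to a return-type mismatch, so a small adapter on the read path can tolerate both shapes. The following is a minimal sketch, assuming only what is described above: that `blockstore.get()` may return either a promise of bytes or an async iterable of byte chunks. `readBlockBytes` is a hypothetical helper name, not an OrbitDB or Helia API.

```javascript
// Hypothetical helper: normalize blockstore.get() to a single Uint8Array,
// whether it returns a promise of bytes or an async iterable of chunks.
async function readBlockBytes(blockstore, cid) {
  const result = blockstore.get(cid)

  // Streaming shape: collect every chunk, then concatenate in order.
  if (result != null && typeof result[Symbol.asyncIterator] === 'function') {
    const chunks = []
    let total = 0
    for await (const chunk of result) {
      chunks.push(chunk)
      total += chunk.byteLength
    }
    const bytes = new Uint8Array(total)
    let offset = 0
    for (const chunk of chunks) {
      bytes.set(chunk, offset)
      offset += chunk.byteLength
    }
    return bytes
  }

  // Promise (or plain value) shape: await resolves either one.
  return await result
}
```

This only addresses how bytes are read; it does not repair blocks that were already written with the wrong content.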

## Current impact

We run OrbitDB as part of a distributed environmental sensor network. We are currently testing with 3-4 peers but expect the network to grow to thousands of peers or more. The current testing databases are small (node registry, trust list; ~10 entries total across 3 databases). Despite this, we see dozens of `LoadBlockFailedError` messages on every peer after each restart, and they continue on a 15-minute retry cycle indefinitely.

We've implemented application-level workarounds:
- A permanent block blacklist (after N failed fetches, stop retrying that CID forever, persist to disk)
- A `canAppend` patch that accepts entries with unverifiable identities (since we use `write: ["*"]`)
- Write-ahead verification on `put()` to catch partial writes before they create new orphaned references
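
For illustration, the blacklist workaround can be sketched roughly as follows. This is a simplified stand-in rather than our exact code: the class name `BlockBlacklist`, its methods, and the default threshold of 3 are all hypothetical, and CIDs are treated as plain strings.

```javascript
// Hypothetical sketch of the blacklist workaround: stop fetching a CID after
// maxFailures failed attempts, and keep that decision across restarts.
class BlockBlacklist {
  constructor(maxFailures = 3, saved = {}) {
    this.maxFailures = maxFailures
    // Map of CID string -> number of failed fetches seen so far.
    this.failures = new Map(Object.entries(saved))
  }

  // Record one failed fetch; returns true once the CID is blacklisted.
  recordFailure(cid) {
    const count = (this.failures.get(cid) ?? 0) + 1
    this.failures.set(cid, count)
    return count >= this.maxFailures
  }

  isBlacklisted(cid) {
    return (this.failures.get(cid) ?? 0) >= this.maxFailures
  }

  // Plain object, so JSON.stringify(blacklist) can be written to disk.
  toJSON() {
    return Object.fromEntries(this.failures)
  }
}
```

On startup the saved JSON is parsed and passed as the second constructor argument, so a CID that was blacklisted before the restart is skipped immediately.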

These suppress the symptoms but don't fix the root cause — the entries still replicate between peers, consuming bandwidth and triggering the fetch-fail-blacklist cycle on every new peer that joins. At scale, every new peer joining the network will have to discover and blacklist every orphaned entry independently.

## Proposed solutions

Any of these would help:

1. **Oplog entry TTL / expiry** — allow entries older than a configurable age to be dropped during sync. For many use cases (node registries, state tracking), only recent entries matter.

2. **Oplog compaction** — a mechanism to compact the oplog by removing entries whose referenced blocks are known to be unavailable (e.g. after N failed fetch attempts across all peers).

3. **Head-only sync mode** — for databases that only care about current state (key-value stores), sync only the current heads and their direct dependencies rather than the full oplog history.

4. **Entry validation during sync receive** — before accepting an entry from a peer, verify that its referenced blocks (identity, payload) are either available locally or fetchable. Reject entries that reference unreachable blocks rather than accepting them into the local oplog.
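
To make proposals 1 and 2 more concrete, here is a minimal sketch of a filter that could run before entries are stored or forwarded. Everything in it is an assumption for illustration: the entry shape `{ hash, identity, timestampMs }`, the wall-clock `timestampMs` field, and the function name `selectEntriesToKeep` are hypothetical and do not describe OrbitDB's actual entry format or API.

```javascript
// Hypothetical entry shape: { hash, identity, timestampMs }.
// timestampMs is an assumed wall-clock field used only for illustration.
function selectEntriesToKeep(entries, { maxAgeMs, nowMs, unavailableCids }) {
  return entries.filter((entry) => {
    // Proposal 1: drop entries older than the configured age.
    if (nowMs - entry.timestampMs > maxAgeMs) return false
    // Proposal 2: drop entries whose identity block is known to be unavailable.
    if (unavailableCids.has(entry.identity)) return false
    return true
  })
}

// Example call with a one-hour TTL and one known-unavailable identity CID.
const keptEntries = selectEntriesToKeep(
  [{ hash: 'entry-1', identity: 'identity-ok', timestampMs: Date.now() }],
  { maxAgeMs: 60 * 60 * 1000, nowMs: Date.now(), unavailableCids: new Set(['identity-gone']) }
)
```

A real implementation would also have to decide what happens to entries that link to a dropped entry, which this sketch does not address.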

## Environment

- OrbitDB: v3.0.2
- Helia: v6
- Node.js: 22
- Databases: keyvalue (3 databases, ~10 entries total)
- Peers: 3-4 nodes (testing, expected to scale to thousands+)
- Access control: `write: ["*"]` (permissive)

## Related issues

- #1166 — "Want for bafyrei... aborted" (similar symptom, different root cause)
- #1244 — Helia v6 streaming blockstore incompatibility (the source of the corrupt blocks)
