bedrock-leveldb

Pure Rust LevelDB-style storage backend for Minecraft Bedrock worlds

1 unstable release: 0.2.1 (May 7, 2026) · Uses new Rust 2024
#70 in Database implementations · Used in bedrock-world
MIT/Apache · 295KB · 7K SLoC

bedrock-leveldb

English | 简体中文

bedrock-leveldb is a pure Rust raw key/value storage library for Minecraft Bedrock world databases. Its performance targets are benchmark-backed: zero-copy reads where possible, lock-free read hot paths after a short state snapshot, and explicit owned allocation only when callers request it. The crate covers the storage layer only: chunk, actor, player, and NBT semantics are intentionally out of scope and belong in application code or domain-specific layers.

The crate can read native Bedrock/LevelDB manifests, WAL records, and table files. v0.2 writes standard LevelDB WAL batches, flushes native .ldb tables, and persists manifest version edits. Older BWLDB... files remain readable for migration/backward compatibility only.
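WAL replay follows the public LevelDB log format: records live in 32 KiB blocks, and each fragment starts with a 7-byte header. As a rough standalone sketch of what that replay has to parse (the struct and function names here are illustrative, not this crate's API):

```rust
// Sketch of a LevelDB WAL record header per the public log format.
// Names are illustrative; this is not bedrock-leveldb's internal type.
struct WalRecordHeader {
    crc: u32, // CRC32C over type + payload (masked in real LevelDB files)
    len: u16, // payload length of this fragment
    kind: u8, // 1 = full, 2 = first, 3 = middle, 4 = last
}

fn parse_wal_header(buf: &[u8]) -> Option<WalRecordHeader> {
    if buf.len() < 7 {
        return None;
    }
    Some(WalRecordHeader {
        crc: u32::from_le_bytes(buf[0..4].try_into().ok()?),
        len: u16::from_le_bytes(buf[4..6].try_into().ok()?),
        kind: buf[6],
    })
}
```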

Maintainers and contributors should also read the development guide.

Quick Start

use bedrock_leveldb::{
    Db, OpenOptions, ReadOptions, ScanMode, ScanPipelineOptions, VisitorControl, WriteOptions,
};

fn main() -> bedrock_leveldb::Result<()> {
    let db = Db::open("path/to/world/db", OpenOptions::default())?;

    if let Some(value) = db.get(b"player_1")? {
        println!("player_1 has {} raw bytes", value.len());
    }

    let outcome = db.for_each_prefix(
        b"player_",
        ReadOptions {
            scan_mode: ScanMode::ParallelTables,
            pipeline: ScanPipelineOptions {
                queue_depth: 64,
                ..ScanPipelineOptions::default()
            },
            ..ReadOptions::default()
        },
        |key, value| {
            println!("{} -> {} bytes", String::from_utf8_lossy(key), value.len());
            Ok(VisitorControl::Continue)
        },
    )?;

    println!(
        "visited {} entries across {} tables on {} workers",
        outcome.visited, outcome.tables_scanned, outcome.worker_threads
    );

    db.put(b"tool_key".as_slice(), b"tool_value".as_slice(), WriteOptions::default())?;
    Ok(())
}

For read-only analysis of real Bedrock worlds, set OpenOptions::read_only = true and create_if_missing = false. Read-only handles never initialize, repair, flush, or write to the database directory.

Supported Surface

Area | Status
Native LevelDB manifest replay | Implemented for the metadata needed to find tables
Native LevelDB WAL replay | Implemented for write batches
Native LevelDB table reads | Footer, index block, data blocks, restart arrays, internal key trailer
Compression reads | Snappy, zlib, and Bedrock raw deflate when features are enabled
Lazy point lookup | Implemented with manifest range filtering and seeked block reads
Visitor scans | Key, entry, prefix, sequential, and table-parallel modes
Native block cache | Bounded decoded block cache
Bedrock chunk key helpers | Parse and encode documented LevelDB chunk keys
Legacy LegacyTerrain values | Validate and expose the 83,200-byte early LevelDB terrain layout, including [biome_id, red, green, blue] biome samples
Legacy subchunk values | Classify paletted subchunks and expose pre-paletted block ID/metadata arrays
Batch exact reads | Db::get_many_owned preserves input order for legacy and modern render keys
Native writes by this crate | WAL batch append, native .ldb flush, manifest edit persistence
Production LevelDB compaction | Correctness-first native range compaction
Arbitrary corrupt database repair | Partial; writes native recovered output from readable data
Pre-LevelDB worlds | Not supported; chunks.dat and entities.dat are outside this crate
mmap read path | Feature-gated callback scans can borrow uncompressed custom/native table values
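The table-read row above mentions the internal key trailer: in LevelDB-format tables, every stored key carries an 8-byte suffix packing the sequence number and value type. A standalone sketch of that decoding, per the public LevelDB layout (the function name is illustrative, not this crate's API):

```rust
// Sketch of LevelDB's internal-key trailer per the public format: the last
// 8 bytes of each stored key pack (sequence << 8) | value_type as a
// little-endian u64. value_type 1 is a live value, 0 a deletion tombstone.
fn split_internal_key(ikey: &[u8]) -> Option<(&[u8], u64, u8)> {
    let split = ikey.len().checked_sub(8)?;
    let (user_key, trailer) = ikey.split_at(split);
    let packed = u64::from_le_bytes(trailer.try_into().ok()?);
    Some((user_key, packed >> 8, (packed & 0xff) as u8))
}
```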

API Notes

  • Db::open(path, OpenOptions) loads CURRENT, manifest metadata, and the WAL overlay. It does not eagerly materialize every native table value.
  • Db::get(key) is the compatibility owned/shared read path. Db::get_ref and Db::get_with_ref return ValueRef, which can represent borrowed, shared, or explicitly owned values. Cross-function point lookups stay shared or owned so they cannot return dangling table slices.
  • Db::for_each_entry_ref and Db::for_each_prefix_ref are the true borrowed-first APIs. With ReadStrategy::Borrowed and sequential scan mode, uncompressed native LevelDB blocks return ValueRef::Borrowed inside the visitor callback. Compressed blocks, WAL/overlay values, and non-callback point reads return Shared/Owned. Enabling the mmap feature maps table files read-only so those borrowed slices are backed by the mapping for the duration of the callback.
  • Db::get_many_owned(keys, ReadOptions) is the preferred renderer path for exact chunk records such as LegacyTerrain (0x30), Data2D, subchunks, and block entities. It preserves input order and avoids prefix scans during tile rendering. It returns raw values byte-for-byte; X/Y/Z coordinate interpretation and legacy biome priority are intentionally tested and implemented in bedrock-world/bedrock-render.
  • With the default async feature, Arc<Db> now provides owned async read helpers: get_async, get_with_async, collect_keys_owned_async, collect_prefix_keys_owned_async, and collect_prefix_owned_async. They use Tokio spawn_blocking and are intended for GUI or server runtimes that must keep foreground tasks responsive.
  • Db::collect_keys_owned, Db::collect_prefix_keys_owned, and Db::collect_prefix_owned return owned data without forcing callers to write visitor glue for common indexing paths.
  • Db::write_batch_native, Db::flush_memtable, Db::compact_range_native, and Db::recover_native are the explicit v0.2 native write/recovery entry points. Db::write, Db::flush, Db::compact_range, and Db::repair delegate to the same native paths.
  • ReadOptions::cache_policy defaults to Bypass, so normal reads do not contend on the shared block cache. Set it to Use only when cross-request block reuse is worth the lock cost.
  • ReadOptions::pipeline configures local Rayon scan scheduling. queue_depth, table_batch_size, and progress_interval use automatic defaults when set to zero. ScanOutcome reports tables_scanned, worker_threads, queue_wait_ms, and cancel_checks so renderers can tune without fixed machine-specific timing thresholds.
  • Old LevelDB worlds are still LevelDB databases. This crate reads native zlib compression tag 2, Bedrock raw deflate tag 4, WAL + .ldb overlays, and exact LegacyTerrain keys; pre-LevelDB chunks.dat parsing intentionally lives in bedrock-world.
  • Db::for_each_key, Db::for_each_entry, and Db::for_each_prefix stream borrowed keys and Bytes values to visitors.
  • Db::for_each_prefix_key is the preferred render-index path when callers only need keys. It avoids value callbacks and lets native table scans seek directly into the requested prefix range.
  • Visitors return VisitorControl::Continue or VisitorControl::Stop; normal early termination is reported in ScanOutcome, not as an error.
  • stats_fast() is metadata/overlay-only. stats_full(), snapshots, materialized iterators, repair, and compaction are explicit expensive paths.
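Point lookups and scans ultimately decode LevelDB-format blocks, whose entries are length-prefixed with unsigned varints. A standalone sketch of varint32 decoding, per the public LevelDB coding format (the function is illustrative, not part of this crate's API):

```rust
// Sketch of LevelDB's unsigned varint32 decoding: 7 data bits per byte,
// high bit set on every byte except the last, at most 5 bytes total.
fn decode_varint32(buf: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in buf.iter().enumerate().take(5) {
        value |= u32::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            // High bit clear marks the final byte; report bytes consumed.
            return Some((value, i + 1));
        }
    }
    None // truncated or over-long input
}
```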

Migration: full prefix values to key-only scans

Old render index code often read every chunk value just to discover whether a chunk had renderable records:

let mut keys = Vec::new();
db.for_each_prefix(b"chunk-prefix", ReadOptions::default(), |key, _value| {
    keys.push(bytes::Bytes::copy_from_slice(key));
    Ok(bedrock_leveldb::VisitorControl::Continue)
})?;

Prefer the key-only API for viewport and region indexes:

let mut keys = Vec::new();
db.for_each_prefix_key(b"chunk-prefix", ReadOptions::default(), |key| {
    keys.push(bytes::Bytes::copy_from_slice(key));
    Ok(bedrock_leveldb::VisitorControl::Continue)
})?;

Async callers should share the database handle instead of reopening it for each request:

let db = std::sync::Arc::new(Db::open("path/to/world/db", OpenOptions::default())?);
let keys = db
    .clone()
    .collect_prefix_keys_owned_async(
        bytes::Bytes::from_static(b"chunk-prefix"),
        ReadOptions::default(),
    )
    .await?;

Bedrock Record Helpers

The database APIs stay raw key/value APIs. For old Bedrock LevelDB worlds, the crate also provides storage-level helpers for documented record families:

use bedrock_leveldb::{
    BedrockKey, ChunkRecordTag, Db, LegacyTerrain, OpenOptions, VisitorControl,
};

fn scan_legacy_terrain() -> bedrock_leveldb::Result<()> {
    let db = Db::open("path/to/world/db", OpenOptions::default())?;

    db.for_each_entry(Default::default(), |key, value| {
        if let BedrockKey::Chunk(chunk_key) = BedrockKey::parse(key) {
            if chunk_key.tag == ChunkRecordTag::LegacyTerrain {
                let terrain = LegacyTerrain::parse(value)?;
                let _block_id = terrain.block_id(0, 64, 0);
            }
        }
        Ok(VisitorControl::Continue)
    })?;
    Ok(())
}

The helpers cover the LevelDB-era legacy layouts described by the Bedrock format history, including LegacyTerrain and old SubChunkPrefix payload families. They intentionally do not parse pre-LevelDB chunks.dat / entities.dat worlds, NBT payloads, actor records, or gameplay-level chunk semantics.
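The documented chunk-key layout those helpers parse is small enough to sketch by hand: little-endian chunk x and z, an optional little-endian dimension id (omitted for the Overworld), then a one-byte record tag such as 0x30 for LegacyTerrain; SubChunkPrefix keys append one more subchunk-index byte. A standalone illustration with a hypothetical encoder name, not the crate's BedrockKey API:

```rust
// Sketch of the documented Bedrock LevelDB chunk-key layout. The function
// name is hypothetical; use the crate's BedrockKey helpers in real code.
fn encode_chunk_key(x: i32, z: i32, dimension: Option<i32>, tag: u8) -> Vec<u8> {
    let mut key = Vec::with_capacity(13);
    key.extend_from_slice(&x.to_le_bytes());
    key.extend_from_slice(&z.to_le_bytes());
    if let Some(dim) = dimension {
        // Overworld keys omit the dimension field entirely.
        key.extend_from_slice(&dim.to_le_bytes());
    }
    key.push(tag); // e.g. 0x30 = LegacyTerrain
    key
}
```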

Logging

This is a library crate, so it only emits diagnostics through the standard log facade. It does not initialize a global logger and never calls println! or eprintln!. Applications can connect any compatible backend:

fn main() -> bedrock_leveldb::Result<()> {
    // Example only: choose env_logger, log4rs, tracing-log, or your own logger
    // at the application boundary.
    env_logger::init();

    let db = bedrock_leveldb::Db::open("path/to/world/db", Default::default())?;
    let _ = db.get(b"player_1")?;
    Ok(())
}

Log events are intentionally low-noise and avoid raw values. Useful events are emitted around database open, manifest/WAL replay, table scans, custom flushes, repair paths that discard unreadable files, parallel table workers, cancellation, and key-only prefix scans. Applications using tracing can bridge these events with tracing_log::LogTracer.

Errors

All fallible APIs return bedrock_leveldb::Result<T>, an alias for Result<T, LevelDbError>. LevelDbError is structured; prefer matching ErrorKind and using path() instead of parsing display strings:

use bedrock_leveldb::{Db, ErrorKind, OpenOptions};

let err = Db::open(
    "missing-db",
    OpenOptions {
        read_only: true,
        create_if_missing: false,
        ..OpenOptions::default()
    },
)
.expect_err("missing database should fail");

assert_eq!(err.kind(), ErrorKind::NotFound);
assert!(err.path().is_some());

Cooperative scan cancellation returns ErrorKind::Cancelled. Read-only handles return ErrorKind::ReadOnly for writes, flushes, repair, and compaction.

Features

Feature | Default | Meaning
zlib | yes | Enables zlib and Bedrock raw-deflate decompression/compression
snappy | yes | Enables Snappy table decompression/compression
async | yes | Adds Db::open_async and the owned async read helpers through Tokio spawn_blocking
mmap | no | Maps table files read-only so feature-gated callback scans can borrow table-backed values
repair-tools | no | Reserved for expanded repair tooling
bench | no | Reserved for benchmark-only code paths

docs.rs builds with all features enabled, so the hosted API reference includes async helpers, compression backends, mapped scan types, and repair-tool entry points. The crates.io package includes the English and Chinese READMEs, the guide documents under docs/, the changelog, licenses, source, tests, and benchmarks.

MSRV is Rust 1.87.

Testing And Benchmarks

Release checks used before the first public commit:

cargo fmt --check
cargo clippy --all-features --all-targets -- -D warnings
cargo rustdoc --all-features -- -D missing_docs
cargo test --all-features
cargo test --no-default-features
cargo test --no-default-features --features zlib
cargo test --no-default-features --features snappy
cargo test --no-default-features --features async
cargo test --no-default-features --features mmap
cargo doc --all-features --no-deps
cargo package --allow-dirty
cargo bench --all-features

The Criterion suite is synthetic. It separates overlay hot reads, flushed native table reads, native table point/prefix reads, WAL recovery, and sequential versus table-parallel scans. Large-world behavior should still be validated with real Bedrock fixtures in higher-level crates because this crate does not interpret world keys or NBT payloads. Latest local numbers are tracked in docs/BENCHMARKS.md.

License

Licensed under either of:

  • Apache License, Version 2.0
  • MIT license

Dependencies

~1.3–2.8MB
~51K SLoC