Skip to content

triblespace/jerky

 
 

Repository files navigation

Succinct on-disk data structures in Rust

Documentation Crates.io

Jerky provides some succinct data structures written in Rust.

Documentation

https://docs.rs/jerky/

Limitation

This library is designed to run on 64-bit machines.

Build docs

The document can be compiled with the following command:

RUSTDOCFLAGS="--html-in-header katex.html" cargo doc --no-deps

Zero-copy bit vectors

BitVectorBuilder can build a bit vector whose underlying BitVectorData is backed by anybytes::View. Metadata describing a stored sequence includes SectionHandles so the raw Bytes returned by ByteArea::freeze can be handed to BitVectorData::from_bytes with its BitVectorDataMeta for zero‑copy reconstruction.

Types following this pattern implement the Serializable trait, which exposes a metadata accessor and a from_bytes constructor.

DacsByte sequences expose a metadata method returning a descriptor with a handle to a slice of per-level handles. Each entry stores the flag bitvector handle (if any), its bit length, and the payload byte handle. from_bytes rebuilds the sequence using that metadata.

Bytes layout for a `DacsByte` sequence (current builders place sections
contiguously, though layout is fully described by the stored handles):

| flag[0] words | flag[1] words | ... | flag[n-2] words | level[0] data | level[1] data | ... | level[n-1] data |

The flag vectors come first and store native-endian `usize` words. The level
data immediately follows without any padding.

CompactVector and WaveletMatrix provide the same pattern: call metadata to obtain a descriptor with the required SectionHandles, then hand both the metadata and the full Bytes region to from_bytes.

For a wavelet matrix the metadata stores a handle to a slice of per-layer handles. Each handle in that slice points to the native-endian usize words forming a single layer. Layers may reside anywhere in the arena and no longer need to be contiguous.

use anybytes::ByteArea;
use jerky::int_vectors::{CompactVector, CompactVectorBuilder};

let mut area = ByteArea::new()?;
let mut sections = area.sections();
let mut builder = CompactVectorBuilder::with_capacity(3, 3, &mut sections)?;
builder.set_ints(0..3, [7, 2, 5])?;
let cv = builder.freeze();
let meta = cv.metadata();
let bytes = area.freeze()?;
let view = CompactVector::from_bytes(meta, bytes.clone())?;
assert_eq!(view.get_int(1), Some(2));

Examples

See the examples directory for runnable usage demos, including bit_vector, compact_vector, dacs_byte, and wavelet_matrix.

Licensing

Licensed under either of

at your option.

About

Collection of succinct data structures in Rust

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Rust 98.6%
  • Other 1.4%