1 unstable release
Uses new Rust 2024
| new 0.1.0 | Apr 7, 2026 |
|---|---|
#227 in Biology
68KB, 1.5K SLoC
pbzarr
A Rust library for PBZ (Per-Base Zarr) — a Zarr v3 convention for storing per-base resolution genomic data such as read depths, methylation levels, and boolean masks.
PBZ is a modern alternative to D4 and bigWig. This crate handles store layout, metadata, region parsing, and chunk I/O, delegating array storage and compression to zarrs.
Installation
[dependencies]
pbzarr = "0.1"
Quick Start
use pbzarr::{PbzStore, TrackConfig};
use ndarray::Array2;

// Create a store
let contigs = vec!["chr1".into(), "chr2".into()];
let lengths = vec![248_956_422, 242_193_529];
let store = PbzStore::create("sample.pbz.zarr", &contigs, &lengths)?;

// Create a track
let config = TrackConfig::new("uint32")
    .columns(vec!["sample_A".into(), "sample_B".into()]);
let track = store.create_track("depths", config)?;

// Write a chunk
let data = Array2::<u32>::zeros((1_000_000, 2));
track.write_chunk::<u32>("chr1", 0, data)?;

// Read it back
let chunk: Array2<u32> = track.read_chunk("chr1", 0)?;
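
Because I/O happens one chunk at a time, callers usually need to translate a genomic position into a chunk index plus an offset inside that chunk. The sketch below shows that arithmetic in plain Rust. It assumes that the integer argument to write_chunk and read_chunk is a chunk index along the position axis, and that chunks hold 1,000,000 positions (the row count of the array in the Quick Start). Neither assumption is confirmed here, and `chunk_coords` and `chunk_count` are hypothetical helpers, not part of pbzarr.

```rust
/// Hypothetical helper (not a pbzarr API): map a 0-based position to
/// (chunk index, offset within that chunk) for a fixed chunk length.
fn chunk_coords(pos: u64, chunk_len: u64) -> (u64, u64) {
    assert!(chunk_len > 0, "chunk length must be non-zero");
    (pos / chunk_len, pos % chunk_len)
}

/// Hypothetical helper: number of chunks needed to cover a contig,
/// computed with ceiling division so a partial last chunk is counted.
fn chunk_count(contig_len: u64, chunk_len: u64) -> u64 {
    assert!(chunk_len > 0, "chunk length must be non-zero");
    contig_len.div_ceil(chunk_len)
}

fn main() {
    // Assumed chunk length; taken from the Quick Start array shape.
    const CHUNK_LEN: u64 = 1_000_000;

    let (index, offset) = chunk_coords(2_500_123, CHUNK_LEN);
    println!("position 2500123 -> chunk {index}, offset {offset}");

    // chr1 length as used in the Quick Start.
    println!("chr1 spans {} chunks", chunk_count(248_956_422, CHUNK_LEN));
}
```

Under these assumptions, the chunk index would be the value passed to read_chunk, and the offset would be the row to look up in the returned Array2.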
Features
- Zarr v3 only with Blosc/Zstd compression and PackBits for booleans
- Chunk-level I/O with ndarray::Array2<T>
- Region parsing: "chr1:1000-2000" with comma-stripped numbers
- Escape hatches to raw zarrs::Array objects
- All standard dtypes: u8, u16, u32, i8, i16, i32, f32, f64, bool
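
To illustrate the region-parsing feature, here is a minimal standalone sketch of how a string such as "chr1:1,000-2,000" could be split into a contig name and a numeric range after stripping commas. It is not pbzarr's parser: the `Region` struct, `parse_number`, and `parse_region` are hypothetical names, and the sketch does not decide whether coordinates are 0-based or 1-based or whether the end is inclusive.

```rust
/// Hypothetical region type; pbzarr's own representation may differ.
#[derive(Debug, PartialEq)]
struct Region {
    contig: String,
    /// `None` means the whole contig was requested.
    range: Option<(u64, u64)>,
}

/// Parse a coordinate, ignoring thousands separators such as "1,000".
fn parse_number(s: &str) -> Result<u64, String> {
    let cleaned: String = s.chars().filter(|c| *c != ',').collect();
    cleaned
        .parse::<u64>()
        .map_err(|e| format!("invalid coordinate {s:?}: {e}"))
}

/// Parse "contig" or "contig:start-end".
fn parse_region(s: &str) -> Result<Region, String> {
    // Split on the last ':' so the range is always the final component.
    let (contig, range) = match s.rsplit_once(':') {
        None => (s, None),
        Some((contig, range)) => {
            let (start, end) = range
                .split_once('-')
                .ok_or_else(|| format!("expected start-end in {s:?}"))?;
            let (start, end) = (parse_number(start)?, parse_number(end)?);
            if start > end {
                return Err(format!("start is greater than end in {s:?}"));
            }
            (contig, Some((start, end)))
        }
    };
    if contig.is_empty() {
        return Err(format!("missing contig name in {s:?}"));
    }
    Ok(Region { contig: contig.to_string(), range })
}

fn main() {
    println!("{:?}", parse_region("chr1:1,000-2,000"));
    println!("{:?}", parse_region("chr2"));
    println!("{:?}", parse_region("chr1:2000-1000"));
}
```

Splitting on the last colon is a design choice of this sketch: a bare contig name that itself contains a colon would need extra handling.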
Dependencies: ~15–20MB, ~356K SLoC