#bioinformatics #dna

seq-hash

A SIMD-accelerated library to compute hashes of DNA sequences

4 releases

Uses new Rust 2024

0.1.2 Jan 20, 2026
0.1.1 Oct 15, 2025
0.1.0 Oct 1, 2025
0.0.1 Sep 26, 2025

#640 in Biology

241 downloads per month
Used in 10 crates (3 directly)

MIT license

38KB
817 lines



A SIMD-accelerated library for iterating over k-mer hashes of DNA sequences, building on packed_seq. Building block for simd-minimizers.

Paper: If you use this crate, please cite the simd-minimizers paper, for which it was developed.

Requirements

This library requires the AVX2 or NEON instruction set. On x86-64, this means compiling with either target-cpu=native or target-cpu=x86-64-v3. See this README for details and this blog for background. The same requirement applies when seq-hash is used as a dependency of a larger project.

RUSTFLAGS="-C target-cpu=native" cargo run --release
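
To check that the flags above actually took effect, a binary can inspect its own compile-time target features. The following is a minimal sketch using only the standard `cfg!` macro; the function names `required_feature` and `simd_enabled_at_compile_time` are hypothetical helpers for illustration and are not part of seq-hash.

```rust
/// Name of the SIMD feature relevant on the current architecture,
/// or `None` on architectures other than x86-64 and aarch64.
/// (Hypothetical helper, not part of seq-hash.)
fn required_feature() -> Option<&'static str> {
    if cfg!(target_arch = "x86_64") {
        Some("avx2")
    } else if cfg!(target_arch = "aarch64") {
        Some("neon")
    } else {
        None
    }
}

/// True if the binary was compiled with AVX2 (x86-64) or NEON (aarch64)
/// enabled, e.g. via `-C target-cpu=native` or `-C target-cpu=x86-64-v3`.
/// This is a compile-time check: it reflects the RUSTFLAGS used for the
/// build, not the CPU the binary happens to run on.
fn simd_enabled_at_compile_time() -> bool {
    cfg!(any(
        all(target_arch = "x86_64", target_feature = "avx2"),
        all(target_arch = "aarch64", target_feature = "neon")
    ))
}
```

Calling `simd_enabled_at_compile_time()` early in `main` and printing a hint to rebuild with the RUSTFLAGS line above is one way to surface a misconfigured build.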

Usage example

Full documentation can be found on docs.rs.

use packed_seq::{AsciiSeqVec, PackedSeqVec, SeqVec};
use seq_hash::{KmerHasher, NtHasher};

let seq = b"ACGGCAGCGCATATGTAGT";
let packed_seq = PackedSeqVec::from_ascii(seq);

let k = 3;
// Default `NtHasher` is canonical.
let hasher = <NtHasher>::new(k);

// Consider a 'context' of a single kmer.
let hashes: Vec<_> = hasher.hash_kmers_simd(packed_seq.as_slice(), 1).collect();
// A sequence of length n contains n - (k - 1) k-mers.
assert_eq!(hashes.len(), seq.len() - (k - 1));
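
For intuition about what the example above computes, here is a scalar sketch of canonical k-mer hashing written against the standard library only. It is not the seq-hash implementation: it uses FNV-1a as a stand-in hash rather than ntHash, makes hashes canonical by taking the minimum of the forward and reverse-complement hashes (one common convention, assumed here), and involves no SIMD. The names `complement`, `fnv1a`, and `canonical_kmer_hashes` are hypothetical.

```rust
/// Complement of a single ASCII nucleotide (A<->T, C<->G).
/// Any other byte is returned unchanged.
fn complement(b: u8) -> u8 {
    match b {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        other => other,
    }
}

/// FNV-1a over a byte stream. Used only as a stand-in; this is NOT ntHash.
fn fnv1a(bytes: impl Iterator<Item = u8>) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hash every k-mer of `seq`. Each hash is the minimum of the hash of the
/// k-mer and the hash of its reverse complement, so a k-mer and its
/// reverse complement always receive the same value.
fn canonical_kmer_hashes(seq: &[u8], k: usize) -> Vec<u64> {
    if k == 0 || seq.len() < k {
        return Vec::new();
    }
    seq.windows(k)
        .map(|kmer| {
            let fwd = fnv1a(kmer.iter().copied());
            let rev = fnv1a(kmer.iter().rev().map(|&b| complement(b)));
            fwd.min(rev)
        })
        .collect()
}
```

With this definition, hashing a sequence and hashing its reverse complement yield the same values in reverse order, which is the property that makes a hasher "canonical".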

Dependencies

~2MB
~45K SLoC