15 releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.0.22 (new) | Apr 16, 2026 |
| 0.0.21 | Apr 14, 2026 |
| 0.0.3 | Nov 12, 2025 |
nj.rs
Neighbor-Joining phylogenetic tree inference in Rust, with Python and WASM bindings.
Takes a multiple sequence alignment and returns a Newick string. The alphabet (DNA or protein) is auto-detected. Bootstrap support values on internal nodes are optional. The distance matrix or the average distance can also be computed on its own, without building a tree. The wrappers use a plugin system for progress and error logging.
Substitution models: PDiff (DNA + protein), JukesCantor (DNA), Kimura2P (DNA), Poisson (protein)
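For readers unfamiliar with these models, the sketch below shows the textbook distance formulas they usually refer to. It is an illustration only, not the crate's implementation: the function names (`p_distance`, `jukes_cantor`, `kimura2p`, `poisson`) are hypothetical, and skipping gap sites is an assumption of this sketch.

```rust
/// Proportion of differing sites between two aligned sequences (p-distance).
/// Sites where either sequence has a gap ('-') are skipped; this gap policy
/// is an assumption of the sketch, not necessarily what the crate does.
fn p_distance(a: &str, b: &str) -> Option<f64> {
    let mut compared = 0usize;
    let mut differing = 0usize;
    for (x, y) in a.bytes().zip(b.bytes()) {
        if x == b'-' || y == b'-' {
            continue;
        }
        compared += 1;
        if !x.eq_ignore_ascii_case(&y) {
            differing += 1;
        }
    }
    if compared == 0 {
        None
    } else {
        Some(differing as f64 / compared as f64)
    }
}

/// Jukes-Cantor correction for DNA: d = -(3/4) ln(1 - (4/3) p).
/// Returns None when the logarithm's argument is not positive (p >= 0.75).
fn jukes_cantor(p: f64) -> Option<f64> {
    let inner = 1.0 - 4.0 * p / 3.0;
    if inner <= 0.0 { None } else { Some(-0.75 * inner.ln()) }
}

/// Kimura two-parameter correction for DNA, where `ts` is the proportion of
/// transitions and `tv` the proportion of transversions:
/// d = -(1/2) ln(1 - 2 ts - tv) - (1/4) ln(1 - 2 tv).
fn kimura2p(ts: f64, tv: f64) -> Option<f64> {
    let a = 1.0 - 2.0 * ts - tv;
    let b = 1.0 - 2.0 * tv;
    if a <= 0.0 || b <= 0.0 {
        None
    } else {
        Some(-0.5 * a.ln() - 0.25 * b.ln())
    }
}

/// Poisson correction for protein: d = -ln(1 - p).
fn poisson(p: f64) -> Option<f64> {
    if p >= 1.0 { None } else { Some(-(1.0 - p).ln()) }
}
```

Each correction returns `None` where the formula is undefined (sequences too divergent for the model), which is one possible way to surface saturation to the caller.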
CLI
cargo install nj --features cli
nj sequences.fasta
nj --substitution-model kimura2-p --n-bootstrap-samples 100 sequences.fasta > tree.nwk
A progress bar is shown on stderr when bootstrapping.
Rust
[dependencies]
nj = "0.0.18"
use nj::{NJConfig, NJEvent, SequenceObject, nj, parse_fasta};
use nj::models::SubstitutionModel;
// Parse from a FASTA string
let sequences = parse_fasta(">A\nACGTACGT\n>B\nACCTACGT\n>C\nTCGTACGT\n")?;
// Run Neighbor-Joining
let newick = nj(
    NJConfig {
        msa: sequences,
        n_bootstrap_samples: 100,
        substitution_model: SubstitutionModel::JukesCantor,
        alphabet: None,
        num_threads: None,
    },
    Some(Box::new(|event| {
        if let NJEvent::BootstrapProgress { completed, total } = event {
            eprintln!("{completed}/{total}");
        }
    })),
)?;
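Bootstrap support in phylogenetics is conventionally obtained by resampling alignment columns with replacement and re-inferring the tree on each replicate. The sketch below shows one such resampling step. It assumes this is what `n_bootstrap_samples` controls, which the text above does not spell out; `Lcg` and `resample_columns` are hypothetical names, not part of the crate.

```rust
/// Tiny linear congruential generator so the sketch needs no external crates.
/// Not suitable for real analyses; a proper RNG would be used in practice.
struct Lcg(u64);

impl Lcg {
    /// Advance the generator and return an index in `0..bound`.
    fn next_index(&mut self, bound: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % bound
    }
}

/// Build one bootstrap replicate by drawing alignment columns with replacement.
/// All sequences are assumed to be ASCII and of equal length.
fn resample_columns(msa: &[String], rng: &mut Lcg) -> Vec<String> {
    let n_cols = msa.first().map_or(0, |s| s.len());
    if n_cols == 0 {
        return msa.to_vec();
    }
    // Pick the column indices once so every sequence gets the same columns.
    let picks: Vec<usize> = (0..n_cols).map(|_| rng.next_index(n_cols)).collect();
    msa.iter()
        .map(|seq| {
            let bytes = seq.as_bytes();
            picks.iter().map(|&i| bytes[i] as char).collect::<String>()
        })
        .collect()
}
```

The support value of an internal node is then typically the fraction of replicate trees that contain the same split.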
Distance-only computation (no tree, no bootstrap):
use nj::{DistConfig, distance_matrix};
use nj::models::SubstitutionModel;
let result = distance_matrix(DistConfig {
    msa: sequences,
    substitution_model: SubstitutionModel::JukesCantor,
    alphabet: None,
    num_threads: None,
})?;
// result.names — Vec<String> of taxon names
// result.matrix — n×n Vec<Vec<f64>>, symmetric, diagonal zero
average_distance takes the same DistConfig and returns the mean of all pairwise distances as an f64.
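To make "mean of all pairwise distances" concrete, here is a minimal sketch that computes it from an n×n matrix like the one described above. It assumes each unordered pair is counted once (upper triangle, diagonal excluded); whether the crate averages this way is not stated here, and `mean_pairwise_distance` is a hypothetical helper, not a crate function.

```rust
/// Mean of the pairwise distances in a symmetric n×n matrix, using each
/// unordered pair once (upper triangle, diagonal excluded).
/// Returns None when there are fewer than two taxa.
fn mean_pairwise_distance(matrix: &[Vec<f64>]) -> Option<f64> {
    let n = matrix.len();
    if n < 2 {
        return None;
    }
    let mut sum = 0.0;
    for i in 0..n {
        for j in (i + 1)..n {
            sum += matrix[i][j];
        }
    }
    // Number of unordered pairs among n taxa.
    let pairs = (n * (n - 1) / 2) as f64;
    Some(sum / pairs)
}
```

Because the matrix is symmetric with a zero diagonal, averaging over the upper triangle gives the same value as averaging over all off-diagonal entries.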
Python
pip install nj_py
from nj_py import nj, distance_matrix
msa = [
    {"identifier": "A", "sequence": "ACGTACGT"},
    {"identifier": "B", "sequence": "ACCTACGT"},
    {"identifier": "C", "sequence": "TCGTACGT"},
]
def on_event(event):
    if event["type"] == "BootstrapProgress":
        print(f"{event['completed']}/{event['total']}")
newick = nj(msa, substitution_model="JukesCantor", n_bootstrap_samples=100, on_event=on_event)
Distance-only computation:
result = distance_matrix(msa, substitution_model="JukesCantor")
# result["names"] — list of taxon names
# result["matrix"] — n×n list of lists, symmetric, diagonal zero
JavaScript / WASM
npm install @holmrenser/nj
import { nj, distance_matrix } from '@holmrenser/nj';
const msa = [
  { identifier: 'A', sequence: 'ACGTACGT' },
  { identifier: 'B', sequence: 'ACCTACGT' },
  { identifier: 'C', sequence: 'TCGTACGT' },
];
const newick = nj(
  { msa, n_bootstrap_samples: 100, substitution_model: 'JukesCantor' },
  (event) => {
    if (event.type === 'BootstrapProgress') {
      progressBar.value = event.completed / event.total * 100;
    }
  }
);
const { names, matrix } = distance_matrix({ msa, substitution_model: 'JukesCantor' });