#bioinformatics #newick #neighbor-joining

bin+lib nj

Neighbor-Joining phylogenetic tree inference. Auto-detects DNA/protein, supports multiple substitution models and bootstrap.

15 releases

Uses new Rust 2024

new 0.0.22 Apr 16, 2026
0.0.21 Apr 14, 2026
0.0.3 Nov 12, 2025

#209 in Biology

MIT license

145KB
3K SLoC

nj.rs

Neighbor-Joining phylogenetic tree inference in Rust, with Python and WASM bindings.

Release Tests

Crates.io Version PyPI - Version NPM Version

Takes a mutiple sequence alignment and returns a Newick string. Alphabet (DNA/protein) is auto-detected. Supports optional bootstrap support values on internal nodes. Optionally, only the distance matrix or average distance can be computed. Wrappers use a plugin system for implementing progress/error logging.

Substitution models: PDiff (DNA + protein), JukesCantor (DNA), Kimura2P (DNA), Poisson (protein)

CLI

cargo install nj --features cli

nj sequences.fasta
nj --substitution-model kimura2-p --n-bootstrap-samples 100 sequences.fasta > tree.nwk

A progress bar is shown on stderr when bootstrapping.

Rust

[dependencies]
nj = "0.0.18"
use nj::{NJConfig, NJEvent, SequenceObject, nj, parse_fasta};
use nj::models::SubstitutionModel;

// Parse from a FASTA string
let sequences = parse_fasta(">A\nACGTACGT\n>B\nACCTACGT\n>C\nTCGTACGT\n")?;

// Run Neighbor-Joining
let newick = nj(
    NJConfig {
        msa: sequences,
        n_bootstrap_samples: 100,
        substitution_model: SubstitutionModel::JukesCantor,
        alphabet: None,
        num_threads: None,
    },
    Some(Box::new(|event| {
        if let NJEvent::BootstrapProgress { completed, total } = event {
            eprintln!("{completed}/{total}");
        }
    })),
)?;

Distance-only computation (no tree, no bootstrap):

use nj::{DistConfig, distance_matrix};
use nj::models::SubstitutionModel;

let result = distance_matrix(DistConfig {
    msa: sequences,
    substitution_model: SubstitutionModel::JukesCantor,
    alphabet: None,
    num_threads: None,
})?;
// result.names — Vec<String> of taxon names
// result.matrix — n×n Vec<Vec<f64>>, symmetric, diagonal zero

average_distance takes the same DistConfig and returns the mean of all pairwise distances as an f64.

Python

pip install nj_py
from nj_py import nj, distance_matrix

msa = [
    {"identifier": "A", "sequence": "ACGTACGT"},
    {"identifier": "B", "sequence": "ACCTACGT"},
    {"identifier": "C", "sequence": "TCGTACGT"},
]

def on_event(event):
    if event["type"] == "BootstrapProgress":
        print(f"{event['completed']}/{event['total']}")

newick = nj(msa, substitution_model="JukesCantor", n_bootstrap_samples=100, on_event=on_event)

Distance-only computation:

result = distance_matrix(msa, substitution_model="JukesCantor")
# result["names"] — list of taxon names
# result["matrix"] — n×n list of lists, symmetric, diagonal zero

JavaScript / WASM

npm install @holmrenser/nj
import { nj, distance_matrix } from '@holmrenser/nj';

const msa = [
    { identifier: 'A', sequence: 'ACGTACGT' },
    { identifier: 'B', sequence: 'ACCTACGT' },
    { identifier: 'C', sequence: 'TCGTACGT' },
];

const newick = nj(
    { msa, n_bootstrap_samples: 100, substitution_model: 'JukesCantor' },
    (event) => {
        if (event.type === 'BootstrapProgress') {
            progressBar.value = event.completed / event.total * 100;
        }
    }
);

const { names, matrix } = distance_matrix({ msa, substitution_model: 'JukesCantor' });

Dependencies

~1–13MB
~114K SLoC