
sexmachine

A library for predicting sex based on a given name

1 unstable release

Uses the Rust 2024 edition

0.1.0 Feb 27, 2026


MIT license

300KB
177 lines


A fast, lightweight Bayesian classifier for predicting sex from a given name.

Usage

Using the pre-trained model

use sexmachine::{Classifier, Sex};

// Load the bundled English model; unwrap() panics if loading fails.
let classifier = Classifier::from_model("english").unwrap();

match classifier.classify("jessica") {
    Sex::Female => println!("Female"),
    Sex::Male => println!("Male"),
    Sex::Ambiguous => println!("Ambiguous"),
}

// Get a confidence score (0.0 = ambiguous, 1.0 = certain)
let confidence = classifier.confidence("jessica");
println!("Confidence: {:.1}%", confidence * 100.0);
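To make the 0.0 to 1.0 scale concrete, here is a minimal sketch of one way such a score could be derived from per-sex counts for a single name. The function `confidence_from_counts` and the formula |2p - 1| are assumptions chosen for illustration; they are not taken from the sexmachine source and may differ from what `Classifier::confidence` actually computes.

```rust
/// Hypothetical helper, not part of the sexmachine API.
/// Maps per-sex counts for one name to a score in [0.0, 1.0]:
/// 0.0 when the counts are evenly split, 1.0 when only one sex was observed.
fn confidence_from_counts(female: u64, male: u64) -> f64 {
    let total = female + male;
    if total == 0 {
        // No observations at all: treat the name as fully ambiguous.
        return 0.0;
    }
    // Share of observations labeled female.
    let p_female = female as f64 / total as f64;
    // Distance from an even split, rescaled so that 0.5 -> 0.0 and 0.0 or 1.0 -> 1.0.
    (2.0 * p_female - 1.0).abs()
}

fn main() {
    // Illustrative counts only, not real data.
    let samples = [(50_u64, 50_u64), (75, 25), (100, 0)];
    for (female, male) in samples {
        let score = confidence_from_counts(female, male);
        println!("female={female} male={male} confidence={:.1}%", score * 100.0);
    }
}
```

Under this assumed formula, an even split scores 0.0 and a name seen with only one sex scores 1.0, which matches the endpoints described in the comment above.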

Available models

Model       Description
"english"   Trained on ssa.gov baby name data

Training your own model

You can train a classifier from scratch and serialize it for later use. Training data files should contain one row per entry in the format name,sex,count, where sex is M or F.

use sexmachine::{Classifier, Sex};

let mut classifier = Classifier::new();

// Feed labeled examples to the classifier.
classifier.train("james", Sex::Male);
classifier.train("jessica", Sex::Female);

// Persist the trained model to disk.
classifier.serialize_to_file("my_model.bin").unwrap();

// ... later, deserialize it back
let classifier = Classifier::from_file("my_model.bin").unwrap();

Training data

The included English model was trained on U.S. Social Security Administration baby name data, which can be downloaded here:

https://www.ssa.gov/oact/babynames/limits.html

Each file in the archive contains rows in the format:

Mary,F,7065
John,M,5614
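Rows in this format can be read with plain string splitting. The sketch below is only an illustration of the row layout: `RecordSex`, `NameRecord`, and `parse_row` are hypothetical names defined locally, not part of the sexmachine API, and the bundled train example may parse its input differently.

```rust
/// Hypothetical local types for illustration; not part of the sexmachine API.
#[derive(Debug, PartialEq, Clone, Copy)]
enum RecordSex {
    Female,
    Male,
}

#[derive(Debug, PartialEq)]
struct NameRecord {
    name: String,
    sex: RecordSex,
    count: u64,
}

/// Parse one `name,sex,count` row. Returns None for malformed rows:
/// missing fields, extra fields, an empty name, a sex other than M or F,
/// or a count that is not a non-negative integer.
fn parse_row(line: &str) -> Option<NameRecord> {
    let mut fields = line.trim().split(',');
    let name = fields.next()?.trim();
    let sex = match fields.next()?.trim() {
        "F" => RecordSex::Female,
        "M" => RecordSex::Male,
        _ => return None,
    };
    let count = fields.next()?.trim().parse::<u64>().ok()?;
    if name.is_empty() || fields.next().is_some() {
        return None;
    }
    Some(NameRecord { name: name.to_string(), sex, count })
}

fn main() {
    for line in ["Mary,F,7065", "John,M,5614", "not a row"] {
        match parse_row(line) {
            Some(record) => {
                println!("{} {:?} {}", record.name, record.sex, record.count)
            }
            None => println!("skipped malformed row: {line:?}"),
        }
    }
}
```

Returning `Option` lets a caller skip malformed rows instead of aborting a whole training run; a stricter loader could return a `Result` with the offending line number instead.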

To retrain the model using the provided train example, pass the folder of name files and an output path:

cargo run --example train -- /path/to/names/folder output_model.bin
