1 unstable release
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.1.0 | Feb 27, 2026 |
sexmachine
A fast, lightweight Bayesian classifier for predicting sex from a given name.
Usage
Using the pre-trained model
use sexmachine::{Classifier, Sex};
let classifier = Classifier::from_model("english").unwrap();
match classifier.classify("jessica") {
Sex::Female => println!("Female"),
Sex::Male => println!("Male"),
Sex::Ambiguous => println!("Ambiguous"),
}
// Get a confidence score (0.0 = ambiguous, 1.0 = certain)
let confidence = classifier.confidence("jessica");
println!("Confidence: {:.1}%", confidence * 100.0);
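For intuition about a score with these endpoints, here is a minimal sketch of one way such a value could be derived from per-name counts. It is an assumption for illustration, not the crate's actual method: the `confidence` function below is a hypothetical free function (unrelated to `Classifier::confidence`), and it assumes a uniform prior with add-one smoothing.

```rust
// Hypothetical helper, NOT part of the sexmachine API.
// Estimates P(female | name) from per-name counts using add-one (Laplace)
// smoothing and a uniform prior, then rescales the result so that
// 0.0 means an even split and values near 1.0 mean a one-sided split.
fn confidence(female_count: u64, male_count: u64) -> f64 {
    let f = female_count as f64 + 1.0;
    let m = male_count as f64 + 1.0;
    let p_female = f / (f + m);
    // Distance from 0.5, stretched to the range [0, 1).
    (2.0 * p_female - 1.0).abs()
}

fn main() {
    // A name seen almost only with one sex scores close to 1.0.
    println!("one-sided: {:.3}", confidence(990, 10));
    // A name seen equally often with both scores 0.0.
    println!("even split: {:.3}", confidence(50, 50));
}
```

Because of the smoothing, this sketch never returns exactly 1.0, and a name with no observations scores 0.0.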
Available models
| Model | Description |
|---|---|
| "english" | Trained on ssa.gov baby name data |
Training your own model
You can train a classifier from scratch and serialize it for later use. Training data should be in the format name,sex,count, one row per line, where sex is M or F.
use sexmachine::{Classifier, Sex};
let mut classifier = Classifier::new();
classifier.train("james", Sex::Male);
classifier.train("jessica", Sex::Female);
classifier.serialize_to_file("my_model.bin").unwrap();
// ... deserialize it later
let classifier = Classifier::from_file("my_model.bin").unwrap();
Training data
The included English model was trained on the U.S. Social Security Administration baby name data archive, which can be downloaded from ssa.gov.
Each file in the archive contains rows in the format:
Mary,F,7065
John,M,5614
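Rows in this format can be parsed with the standard library alone. The sketch below is illustrative only: `Row`, `parse_row`, and the local `Sex` enum are hypothetical names defined here so the example compiles on its own, and lowercasing the name is an assumption based on the lowercase names in the usage examples.

```rust
// Hypothetical stand-in for the crate's `Sex` type, so this sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sex {
    Male,
    Female,
}

// One parsed row of training data: name, sex, and the count from the third column.
#[derive(Debug, PartialEq, Eq)]
struct Row {
    name: String,
    sex: Sex,
    count: u64,
}

// Parse a single `name,sex,count` line. Returns `None` for malformed input.
fn parse_row(line: &str) -> Option<Row> {
    let mut fields = line.trim().split(',');
    let name = fields.next()?.trim();
    let sex = match fields.next()?.trim() {
        "M" => Sex::Male,
        "F" => Sex::Female,
        _ => return None,
    };
    let count = fields.next()?.trim().parse::<u64>().ok()?;
    // Reject empty names and rows with extra fields.
    if name.is_empty() || fields.next().is_some() {
        return None;
    }
    // Assumption: names are normalized to lowercase before training.
    Some(Row { name: name.to_lowercase(), sex, count })
}

fn main() {
    for line in ["Mary,F,7065", "John,M,5614", "not a row"] {
        match parse_row(line) {
            Some(row) => println!("{:?}", row),
            None => println!("skipped: {line}"),
        }
    }
}
```

Returning `Option` lets a caller skip malformed lines with `filter_map` instead of aborting on the first bad row.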
To retrain using the provided train example:
cargo run --example train -- /path/to/names/folder output_model.bin