Sandokan

Train with reasoning, optimize with precision.
A CPU-first C++ neural network training engine built for on-device learning — no Python, no PyTorch, no GPU required.

Explore the docs » · Report Bug · Request Feature

Table of Contents

About

Most neural network training assumes a GPU and a Python runtime. That assumption breaks in the places where learning matters most: a microcontroller updating a sensor model in the field, a robot adapting its controller between tasks, an embedded vision system that must improve on the device it runs on. Sandokan is built for those environments.

Sandokan is a CPU-only, on-device training engine. There is no CUDA dependency, no Python interpreter, and no ~1 GB LibTorch runtime to drag in. The target is small-scale models (fully connected networks with tens of thousands to a few million parameters) trained directly on the hardware where they will be used.

Drop a single header into any C++ project and get a complete training pipeline: classification, regression, custom datasets, optimizers, learning rate schedules, and model persistence. The engine is backed by a custom slab allocator (PMAD) and Apple AMX acceleration via Eigen; in the benchmarks reported under Performance, the batched path sustains roughly 1.6 to 1.7 million samples per second.

Why CPU-only and small-scale?

The trend toward giant GPU-trained models obscures a different class of problem: systems that must keep learning after deployment, with local data, on hardware that has no network connection or power budget for a GPU. Sandokan's design constraints are deliberate — tight memory control via PMAD, mmap-backed datasets with bounded RSS, and a header-only footprint make it practical to embed a full training loop into firmware, a game engine, or a latency-critical trading system.
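To make the "tight memory control" point concrete, here is a minimal sketch of the general slab idea: one up-front allocation carved into fixed-size blocks, handed out and returned in constant time. The class name `SlabSketch` and its interface are hypothetical and are not PMAD's actual API; this only illustrates why a slab keeps memory use bounded and predictable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size slab (NOT PMAD's interface): all memory is reserved
// once in the constructor, so the footprint never grows during training.
class SlabSketch {
public:
    SlabSketch(std::size_t block_floats, std::size_t block_count)
        : block_floats_(block_floats), storage_(block_floats * block_count) {
        assert(block_floats > 0 && block_count > 0);
        free_.reserve(block_count);
        // Push indices in reverse so block 0 is handed out first.
        for (std::size_t i = block_count; i-- > 0;) free_.push_back(i);
    }

    // O(1): pop a free block index. Returns nullptr when the slab is exhausted.
    float* allocate() {
        if (free_.empty()) return nullptr;
        const std::size_t idx = free_.back();
        free_.pop_back();
        return storage_.data() + idx * block_floats_;
    }

    // O(1): return a block previously obtained from allocate().
    void release(float* p) {
        assert(p >= storage_.data() && p < storage_.data() + storage_.size());
        const std::size_t off = static_cast<std::size_t>(p - storage_.data());
        assert(off % block_floats_ == 0);
        free_.push_back(off / block_floats_);
    }

    std::size_t available() const { return free_.size(); }

private:
    std::size_t block_floats_;        // floats per block
    std::vector<float> storage_;      // single backing allocation
    std::vector<std::size_t> free_;   // stack of free block indices
};
```

A gradient buffer for a layer would request one block per tensor and release it when the layer is destroyed; because nothing is allocated after construction, peak memory is known in advance.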

Built for:

- Embedded systems and edge devices that need on-device model adaptation
- Robotics and control systems that update parameters between episodes
- Game engines doing real-time AI personalization without a cloud round-trip
- Trading systems and other latency-sensitive C++ codebases with on-device inference and retraining
- Any environment where GPU access is unavailable and Python is not an option

Built With

Optimizers and Learning Rate Schedulers

| Optimizer | Notes |
| --- | --- |
| `SGD` | Stochastic gradient descent with fixed learning rate |
| `Adam` | Adaptive moments with bias correction |
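The two update rules in the table can be sketched in a few lines of standalone C++. This is the textbook form of SGD and Adam, not Sandokan's implementation; the names `sgd_step`, `AdamState`, and `adam_step`, and the default hyperparameters, are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Plain SGD with a fixed learning rate: w <- w - lr * g.
inline void sgd_step(std::vector<float>& w, const std::vector<float>& g, float lr) {
    assert(w.size() == g.size());
    for (std::size_t i = 0; i < w.size(); ++i) w[i] -= lr * g[i];
}

// Hypothetical per-tensor Adam state (illustrative, not Sandokan's type).
struct AdamState {
    std::vector<float> m, v;   // first and second moment estimates
    int t = 0;                 // step counter used for bias correction
    float lr = 1e-3f, beta1 = 0.9f, beta2 = 0.999f, eps = 1e-8f;
    explicit AdamState(std::size_t n) : m(n, 0.0f), v(n, 0.0f) {}
};

// One Adam step with bias-corrected moments.
inline void adam_step(AdamState& s, std::vector<float>& w, const std::vector<float>& g) {
    assert(w.size() == g.size() && w.size() == s.m.size());
    ++s.t;
    const float c1 = 1.0f - std::pow(s.beta1, static_cast<float>(s.t));
    const float c2 = 1.0f - std::pow(s.beta2, static_cast<float>(s.t));
    for (std::size_t i = 0; i < w.size(); ++i) {
        s.m[i] = s.beta1 * s.m[i] + (1.0f - s.beta1) * g[i];
        s.v[i] = s.beta2 * s.v[i] + (1.0f - s.beta2) * g[i] * g[i];
        const float m_hat = s.m[i] / c1;   // undo the bias toward zero
        const float v_hat = s.v[i] / c2;
        w[i] -= s.lr * m_hat / (std::sqrt(v_hat) + s.eps);
    }
}
```

Bias correction matters most in the first few steps: without it, the zero-initialised moments would shrink the early updates far below the configured learning rate.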

Loss Functions

| Loss | Output activation | Use case |
| --- | --- | --- |
| `CrossEntropyLoss` | `Softmax` | Multi-class classification |
| `BCELoss` | `Sigmoid` | Binary / multi-label classification |
| `MSELoss` | Linear (none) | Regression |
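The reason cross-entropy is paired with a softmax output is that the combined gradient with respect to the logits collapses to `p - onehot(label)`. The sketch below shows that pairing for a single sample; the function names are hypothetical and the code is independent of Sandokan's own classes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtracting the max logit avoids overflow in exp.
inline std::vector<float> softmax(const std::vector<float>& z) {
    assert(!z.empty());
    const float zmax = *std::max_element(z.begin(), z.end());
    std::vector<float> p(z.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < z.size(); ++i) {
        p[i] = std::exp(z[i] - zmax);
        sum += p[i];
    }
    for (float& x : p) x /= sum;
    return p;
}

// Cross-entropy for one sample with an integer class label.
// The clamp guards against log(0).
inline float cross_entropy(const std::vector<float>& p, std::size_t label) {
    assert(label < p.size());
    return -std::log(std::max(p[label], 1e-12f));
}

// Gradient of cross_entropy(softmax(z)) with respect to the logits z.
inline std::vector<float> softmax_ce_grad(const std::vector<float>& p, std::size_t label) {
    assert(label < p.size());
    std::vector<float> g = p;
    g[label] -= 1.0f;
    return g;
}
```

Because the loss already produces the gradient at the logits, the softmax layer's backward pass has nothing left to do except pass that gradient through.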

Layers

| Layer | Description |
| --- | --- |
| `Linear` | Fully connected; He-initialised weights, PMAD-backed gradient buffers |
| `ReLU` | Element-wise rectifier; stores the pre-activation for the backward pass |
| `Softmax` | Numerically stable column-wise softmax; backward is a passthrough because the CE loss folds in the Jacobian |
| `Sigmoid` | Element-wise sigmoid |
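Two of the behaviours named in the table, He initialisation and a ReLU that caches its pre-activation, can be sketched as follows. The source does not say whether Sandokan uses the normal or uniform He variant; this sketch assumes the normal variant with standard deviation `sqrt(2 / fan_in)`. `he_init` and `ReluSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// He (Kaiming) normal initialisation: zero mean, std = sqrt(2 / fan_in).
// Assumption: normal variant; the library may use a different one.
inline std::vector<float> he_init(std::size_t fan_in, std::size_t fan_out, unsigned seed) {
    assert(fan_in > 0 && fan_out > 0);
    std::mt19937 rng(seed);
    std::normal_distribution<float> dist(0.0f, std::sqrt(2.0f / static_cast<float>(fan_in)));
    std::vector<float> w(fan_in * fan_out);
    for (float& x : w) x = dist(rng);
    return w;
}

// ReLU that stores its input so backward can mask the incoming gradient.
struct ReluSketch {
    std::vector<float> pre;  // cached pre-activation from the last forward call

    std::vector<float> forward(const std::vector<float>& x) {
        pre = x;
        std::vector<float> y(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] > 0.0f ? x[i] : 0.0f;
        return y;
    }

    // Gradient flows only where the cached input was positive.
    std::vector<float> backward(const std::vector<float>& grad_out) const {
        assert(grad_out.size() == pre.size());
        std::vector<float> g(grad_out.size());
        for (std::size_t i = 0; i < g.size(); ++i) g[i] = pre[i] > 0.0f ? grad_out[i] : 0.0f;
        return g;
    }
};
```

For the 784-input first layer used in the benchmarks, this gives weights with a variance of about 2 / 784 ≈ 0.00255.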

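Performance

EMNIST Letters (124 800 training samples)

| Backend | Total (ms) | ms / epoch | ms / sample | samples / sec |
| --- | --- | --- | --- | --- |
| Sandokan single-sample | 7 540 | 1 508 | 0.0121 | 82 757 |
| Eigen single-sample | 9 257 | 1 851 | 0.0148 | 67 408 |
| Sandokan batched + parallel | 386 | 77 | 0.0006 | 1 615 666 |
| Eigen batched | 614 | 123 | 0.0010 | 1 015 951 |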

Fashion MNIST (60 000 training samples)

| Backend | ms / epoch | samples / sec |
| --- | --- | --- |
| Sandokan batched + parallel | 34.4 | 1 742 000 |
| Eigen batched | 40.9 | 1 464 000 |
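The throughput columns follow directly from the per-epoch timings and the dataset sizes given in the Performance headings (124 800 samples for EMNIST Letters, 60 000 for Fashion MNIST). The helper names below are hypothetical; the sketch only reproduces the arithmetic so the figures can be cross-checked. Small differences from the table are expected because the published timings are rounded.

```cpp
#include <cassert>
#include <cstddef>

// samples / sec = samples per epoch / (ms per epoch / 1000).
inline double samples_per_sec(std::size_t samples_per_epoch, double ms_per_epoch) {
    assert(ms_per_epoch > 0.0);
    return static_cast<double>(samples_per_epoch) * 1000.0 / ms_per_epoch;
}

// How many times faster the candidate is than the baseline.
inline double speedup(double baseline_ms, double candidate_ms) {
    assert(candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}
```

For example, `samples_per_sec(60000, 34.4)` gives about 1.74 million samples per second, matching the Fashion MNIST row, and `speedup(40.9, 34.4)` gives about 1.19, so the batched Sandokan path is roughly 19% faster than batched Eigen on that dataset. On EMNIST Letters the same ratio from the total times, 614 / 386, is about 1.59.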

Accuracy

| Dataset | Architecture | Optimizer | Result |
| --- | --- | --- | --- |
| EMNIST Letters | 784 → 64 → ResBlock(64) → 26 | Adam + LinearLR | 88.25% test accuracy |
| Fashion MNIST | 784 → 64 → 64 → 10 | SGD | converges to ~85% |

Examples

| Example | Task | Dataset |
| --- | --- | --- |
| `examples/emnist_letters` | 26-class letter recognition | EMNIST Letters |
| `examples/tabular_demo` | Generic CSV classification | any numeric CSV |
| `examples/benchmark` | Full timing sweep (single / batched / Module) | EMNIST Letters |
| `examples/emnist_bench` | Sandokan vs Eigen, per-epoch timing | EMNIST Letters |
| `examples/fashion_mnist_bench` | Sandokan vs Eigen, per-epoch timing | Fashion MNIST |

Core Features

Module System

PMAD Slab Allocator

Dataset Abstractions

ImageDataset — mmap-backed IDX loader

TabularDataset — in-memory column-major store

Optimizers and Learning Rate Schedulers

Loss Functions

Training Loops

Model Persistence

Inference

Layers

Performance

EMNIST Letters — 124 800 training samples

Fashion MNIST — 60 000 training samples

Accuracy

Build

Examples

Contributing

License
