Train your own ChatGPT on Apple Silicon — MLX port of nanochat
Updated Apr 29, 2026 · Python
Train Llama 3 models from scratch. Any scale, any personality. By Arianna Method.
A minimal, hackable Vision-Language Model built on Karpathy’s nanochat — add image understanding and multimodal chat for under $200 in compute.
Ascend NPU fork of nanochat for LLM training with torch_npu/HCCL (experimental)
nanochat's inference engine re-vibed in C++ with GGML.
Production-honest small language model training factory: data import, pretraining, SFT, eval gates, contamination checks, and GPU runbooks.
The best ChatGPT-style model that $100 of TPU time can buy.
The best GPT that $100-$125 worth of pre-training and finetuning can buy.
Run nanochat training efficiently on Huawei Ascend NPUs with minimal code changes, supporting tokenizer, pretraining, and evaluation workflows.
The best ChatGPT that $100 can buy, ported to an Nvidia RTX 5090, which is NOT $100.
I built this repo to prove to my granny that I can implement GPT.
Controlled benchmark of memory mechanisms for transformers, built on nanochat
The official implementation of Ringmaster LMO, an asynchronous distributed optimizer for neural network training under heterogeneous compute environments.