Fast & efficient BPE tokenizer written in C & Python for LLM training
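For context, the core of BPE training is a loop that counts adjacent symbol pairs and merges the most frequent pair into a new symbol. The sketch below shows one such merge step on a toy corpus; the symbol IDs and merge scheme are illustrative and not taken from any repo listed here.

```cpp
// One BPE training step: count adjacent pairs, merge the most frequent.
// A minimal sketch on toy data, not the API of any project on this page.
#include <algorithm>
#include <cstdio>
#include <map>
#include <utility>
#include <vector>

int main() {
    // Toy "corpus": symbol IDs after an initial byte-level split.
    std::vector<int> syms = {1, 2, 1, 2, 3, 1, 2};

    // Count adjacent symbol pairs.
    std::map<std::pair<int, int>, int> pairs;
    for (size_t i = 0; i + 1 < syms.size(); ++i)
        ++pairs[{syms[i], syms[i + 1]}];

    // Pick the most frequent pair to merge into a new symbol.
    auto best = std::max_element(
        pairs.begin(), pairs.end(),
        [](auto& a, auto& b) { return a.second < b.second; });

    int new_id = 4;  // next unused symbol ID
    std::vector<int> merged;
    for (size_t i = 0; i < syms.size(); ++i) {
        if (i + 1 < syms.size() &&
            std::make_pair(syms[i], syms[i + 1]) == best->first) {
            merged.push_back(new_id);
            ++i;  // skip the second symbol of the merged pair
        } else {
            merged.push_back(syms[i]);
        }
    }
    for (int s : merged) std::printf("%d ", s);  // prints: 4 4 3 4
    std::printf("\n");
}
```

A full tokenizer repeats this step until the vocabulary reaches its target size, recording each merge so encoding can replay them in order.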
Carbon is a pure C++ Transformer framework inspired by GPT, featuring SIMD-optimized tensor math, multi-head attention, feedforward layers, and BPE tokenization. It’s a fully self-contained system for training and running language models without external modules or libraries.
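Since the Carbon description centers on multi-head attention, here is a minimal single-head scaled dot-product attention, softmax(Q Kᵀ / √d) V, in plain C++. This is a sketch of the operation such a framework implements, not Carbon's actual code, and it omits the SIMD optimization and multi-head splitting the project describes.

```cpp
// Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
// Illustrative sketch only; shapes are (n tokens) x (d dims).
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

using Mat = std::vector<std::vector<float>>;

Mat attention(const Mat& Q, const Mat& K, const Mat& V) {
    size_t n = Q.size(), d = Q[0].size();
    float inv_sqrt_d = 1.0f / std::sqrt(static_cast<float>(d));
    Mat out(n, std::vector<float>(V[0].size(), 0.0f));
    for (size_t i = 0; i < n; ++i) {
        // Scaled scores for query i against every key.
        std::vector<float> s(n);
        float mx = -1e30f;
        for (size_t j = 0; j < n; ++j) {
            s[j] = 0.0f;
            for (size_t k = 0; k < d; ++k) s[j] += Q[i][k] * K[j][k];
            s[j] *= inv_sqrt_d;
            mx = std::max(mx, s[j]);
        }
        // Numerically stable softmax over the scores.
        float sum = 0.0f;
        for (float& v : s) { v = std::exp(v - mx); sum += v; }
        for (float& v : s) v /= sum;
        // Output row i is the attention-weighted sum of value rows.
        for (size_t j = 0; j < n; ++j)
            for (size_t k = 0; k < V[0].size(); ++k)
                out[i][k] += s[j] * V[j][k];
    }
    return out;
}

int main() {
    Mat Q = {{1, 0}, {0, 1}};
    Mat K = {{1, 0}, {0, 1}};
    Mat V = {{1, 2}, {3, 4}};
    Mat O = attention(Q, K, V);
    std::printf("%.3f %.3f\n%.3f %.3f\n", O[0][0], O[0][1], O[1][0], O[1][1]);
}
```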
A C++ framework for efficient training & fine-tuning of LLMs
MobileFineTuner: Native C++ framework for fine-tuning LLMs directly on mobile devices. Features LoRA and full fine-tuning, ZeRO-inspired parameter sharding, energy-aware throttling, and a custom autograd engine. Keeps your data on-device.
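The LoRA technique MobileFineTuner lists works by freezing the pretrained weight W and training only a low-rank update, y = W x + (α/r) · B A x with A ∈ ℝ^(r×d) and B ∈ ℝ^(d_out×r). The sketch below shows that forward pass under assumed toy shapes; the function and variable names are hypothetical, not MobileFineTuner's API.

```cpp
// Hedged sketch of a LoRA forward pass: y = W x + (alpha/r) * B (A x).
// W is frozen; only the low-rank factors A (r x d) and B (d_out x r) train.
// Names and shapes are illustrative, not taken from MobileFineTuner.
#include <cstdio>
#include <vector>

using Mat = std::vector<std::vector<float>>;
using Vec = std::vector<float>;

Vec matvec(const Mat& m, const Vec& x) {
    Vec y(m.size(), 0.0f);
    for (size_t i = 0; i < m.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j)
            y[i] += m[i][j] * x[j];
    return y;
}

Vec lora_forward(const Mat& W, const Mat& A, const Mat& B,
                 const Vec& x, float alpha) {
    float r = static_cast<float>(A.size());   // LoRA rank
    Vec base = matvec(W, x);                  // frozen pretrained path
    Vec low  = matvec(B, matvec(A, x));       // trainable low-rank path
    for (size_t i = 0; i < base.size(); ++i)
        base[i] += (alpha / r) * low[i];
    return base;
}

int main() {
    Mat W = {{1, 0}, {0, 1}};        // frozen 2x2 weight
    Mat A = {{0.1f, 0.2f}};          // rank r=1, input dim d=2
    Mat B = {{0.5f}, {0.5f}};        // output dim 2, rank 1
    Vec y = lora_forward(W, A, B, {1.0f, 2.0f}, /*alpha=*/1.0f);
    std::printf("%f %f\n", y[0], y[1]);  // 1.25 2.25
}
```

Because only A and B receive gradients, the optimizer state shrinks from O(d·d_out) to O(r·(d + d_out)), which is what makes on-device fine-tuning plausible.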
Real-time, low-overhead visualization of LLM internals during training.
Quantized LLM training in pure CUDA/C++.
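A common building block behind quantized training is symmetric int8 fake-quantization: weights are rounded onto an int8 grid, then dequantized for the forward pass so the model trains against quantization error. The sketch below illustrates that rounding step in plain C++; it is a generic example, not code from the repo above.

```cpp
// Symmetric int8 fake-quantization of a small weight vector.
// Generic illustration of the rounding step in quantized training.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    std::vector<float> w = {-0.8f, 0.1f, 0.5f, -0.2f};

    // Scale maps the max-magnitude weight onto the int8 range [-127, 127].
    float max_abs = 0.0f;
    for (float v : w) max_abs = std::max(max_abs, std::fabs(v));
    float scale = max_abs / 127.0f;

    // Quantize to int8, then dequantize ("fake quant") for the forward pass.
    for (float v : w) {
        int8_t q = static_cast<int8_t>(std::lround(v / scale));
        float deq = q * scale;
        std::printf("w=%+.3f  q=%+4d  deq=%+.4f\n", v, q, deq);
    }
}
```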