- Overview
- Features
- Quick Start
- Installation
- Usage
- Development
- Troubleshooting
- Contributing
- License
- Roadmap
- Changelog
- Contact
- Acknowledgements
## Overview

llm is a modular, extensible PyTorch framework for training and experimenting with Large Language Models (LLMs). It provides robust infrastructure for building, training, and evaluating LLMs, with a focus on scalability and ease of experimentation.
## Features

- Flexible & Pluggable Architecture:
  - Component registry: switch between MHA/MoE blocks, or future implementations such as FlashAttention, via config (see the registry sketch after this list).
  - Grouped Query Attention (GQA) and SwiGLU support (sketched after this list).
  - Unified QKV projection and flexible norms (RMSNorm, LayerNorm).
- Robust Training Framework:
  - Distributed Data Parallel (DDP) and Automatic Mixed Precision (AMP).
  - Type-safe configuration via Pydantic and a Typer CLI.
  - `torch.compile` optimization integration (an AMP + `torch.compile` sketch follows this list).
- Data & Tokenization Abstraction:
  - HuggingFace integration: direct support for HF tokenizers (GPT-2, Llama, etc.).
  - Modular `DataModule` design for text datasets.
  - Legacy character-level tokenizer for simple experiments.
- Example Training Script: the `src/llm/training/train.py` script demonstrates end-to-end training of decoder-only models, showcasing the framework's capabilities.
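To illustrate the pluggable design, here is a minimal sketch of a config-driven component registry paired with a Pydantic config. All names here (`ATTENTION_REGISTRY`, `AttentionConfig`, `build_attention`) are hypothetical illustrations of the pattern, not the framework's actual API:

```python
# Sketch of a config-driven component registry; names are hypothetical,
# not the framework's actual API.
from typing import Callable

import torch.nn as nn
from pydantic import BaseModel

ATTENTION_REGISTRY: dict[str, Callable[..., nn.Module]] = {}

def register_attention(name: str):
    """Decorator that records an attention implementation under a config key."""
    def decorator(cls):
        ATTENTION_REGISTRY[name] = cls
        return cls
    return decorator

class AttentionConfig(BaseModel):
    """Type-safe config: Pydantic validates fields before the model is built."""
    kind: str = "mha"   # registry key, e.g. "mha" or a future "flash"
    d_model: int = 512
    n_heads: int = 8

@register_attention("mha")
class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return out

def build_attention(cfg: AttentionConfig) -> nn.Module:
    """Look up the implementation named in the config and instantiate it."""
    return ATTENTION_REGISTRY[cfg.kind](d_model=cfg.d_model, n_heads=cfg.n_heads)
```

Under this pattern, adding a new implementation (say, a FlashAttention block) only requires registering it under a new key and selecting that key in the config.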
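GQA and SwiGLU are standard techniques; a self-contained sketch of each follows (module and function names are illustrative, not the framework's own). SwiGLU gates the feed-forward expansion, SwiGLU(x) = W_down(SiLU(x W_gate) ⊙ (x W_up)), while GQA lets several query heads share one key/value head to shrink the KV cache:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Gated feed-forward block: W_down(SiLU(x W_gate) * (x W_up))."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

def grouped_query_attention(q, k, v, n_groups: int):
    """GQA: n_groups query heads share each key/value head.

    q: (B, H_q, T, D); k, v: (B, H_kv, T, D) with H_q == H_kv * n_groups.
    """
    k = k.repeat_interleave(n_groups, dim=1)  # broadcast KV heads across groups
    v = v.repeat_interleave(n_groups, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```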
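AMP and `torch.compile` compose in the usual PyTorch way; the following sketch of a single training step uses only stock PyTorch APIs (the model and batch are stand-ins, and a CUDA device is assumed):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in model and synthetic batch; the real script builds these from config.
# Assumes a CUDA device; on CPU, drop the AMP pieces and call loss.backward().
model = torch.compile(nn.Linear(64, 10).cuda())  # graph-compile the forward pass
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scaler = torch.amp.GradScaler("cuda")            # scales loss to avoid fp16 underflow

inputs = torch.randn(32, 64, device="cuda")
targets = torch.randint(0, 10, (32,), device="cuda")

optimizer.zero_grad(set_to_none=True)
with torch.amp.autocast("cuda", dtype=torch.float16):
    loss = F.cross_entropy(model(inputs), targets)
scaler.scale(loss).backward()                    # backward on the scaled loss
scaler.step(optimizer)                           # unscales grads, then steps
scaler.update()
```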
## Quick Start

To get started training a model with the framework, follow these steps:

1. Initialize the project: make sure your environment is set up by running the initialization command (see Installation for details).

   ```bash
   make init
   ```

2. Run a training example: execute the main training script with a language modeling task.

   ```bash
   python src/llm/training/train.py --task lm --epochs 1 --batch-size 32
   ```

   This command trains a language model for 1 epoch with a batch size of 32; training progress logs appear in your console.
## Installation

### Prerequisites

- Python 3.13+
- uv: a fast Python package installer and resolver, written in Rust.
- make: a build automation tool (typically pre-installed on Linux/macOS; available via Chocolatey/Scoop on Windows).

This project uses uv for dependency management and a Makefile for common development tasks.

1. Install uv: if you don't have uv installed, follow the official installation instructions.

2. Initialize the project: navigate to the project root directory and run the `make init` command. This sets up the virtual environment, installs all necessary dependencies, and installs pre-commit hooks.

   ```bash
   make init
   ```

3. Synchronize dependencies (if needed): if `pyproject.toml` or `uv.lock` changes, you can re-synchronize dependencies:

   ```bash
   make sync
   ```

If this project were distributed as a package, users would typically install it with pip:

```bash
pip install llm  # assuming 'llm' is the package name on PyPI
```

For development, however, `make init` is the recommended way to set up the environment.
## Usage

For comprehensive documentation, including detailed usage examples, development guides, and troubleshooting, please refer to our dedicated documentation section:
- Architecture Guide
- Development Guide
- CPU LLM Tutorial
- Project Troubleshooting
- Training Framework Documentation
## Development

For detailed information on setting up the development environment, running tests, maintaining code quality, and other development workflows, please refer to our comprehensive Development Guide.
## Troubleshooting

If you encounter any issues while using llm, please check our Troubleshooting Guide for common problems and their solutions. If you can't find a solution to your problem, please open an issue on our GitHub repository.
## Contributing

We welcome contributions! Please see our Contributing Guide for details on how to submit pull requests, report issues, or suggest improvements.
## License

This project is licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
## Roadmap

For our detailed development roadmap and future plans, including upcoming features like an inference API, Flash Attention integration, and RLHF support, please see ROADMAP.md.
## Changelog

For a detailed history of changes to this project, please refer to our CHANGELOG.md.
## Contact

For questions, suggestions, or support, please open an issue on our GitHub repository.
## Acknowledgements

We acknowledge all contributors, open-source projects, and resources that have inspired and supported the development of this project.