llm

Table of Contents

  • Overview
  • Features
  • Quick Start
  • Installation
  • Usage
  • Development
  • Troubleshooting
  • Contributing
  • License
  • Roadmap
  • Changelog
  • Contact
  • Acknowledgements

Overview

llm is a modular and extensible PyTorch framework for training and experimenting with Large Language Models (LLMs). It provides a robust infrastructure for building, training, and evaluating LLMs, focusing on modularity, scalability, and ease of experimentation.

Features

  • Modern Transformer Architecture:
    • Grouped Query Attention (GQA) for balanced performance and memory efficiency (see the sketch after this list)
    • SwiGLU activation function for enhanced MLP performance
    • Unified QKV projection for optimized memory layout and throughput
    • Mixture of Experts (MoE) support for scaling model capacity
    • Flexible normalization techniques (Pre-LN/Post-LN, RMSNorm)
  • Robust Training Framework:
    • Distributed Data Parallel (DDP) training for scalability.
    • Automatic Mixed Precision (AMP) for memory efficiency and faster training.
    • torch.compile integration for performance optimization.
    • Flexible configuration system using YAML and Python dataclasses (see the configuration sketch after this list).
    • Extensible callback system for custom training logic (logging, checkpointing, early stopping, LR scheduling).
    • Comprehensive checkpoint management and performance monitoring.
  • Efficient Development Workflow:
    • Utilizes uv for fast and reliable dependency management.
    • Enforces code quality with ruff (linting and formatting) and mypy (static type checking).
    • Comprehensive testing with pytest, including coverage reports and Allure test results.
  • Data and Tokenization Abstraction:
    • Modular DataModule design for flexible data loading and preprocessing.
    • Character-level tokenizer for basic experimentation, with clear extensibility for advanced tokenizers.
  • Example Training Script: The src/llm/training/train.py script demonstrates end-to-end training of decoder-only models, showcasing the framework's capabilities.
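
The architecture features listed above can be made concrete with a short sketch. The snippet below is a minimal, self-contained illustration of Grouped Query Attention combined with a unified QKV projection, written against plain PyTorch; the class name, constructor arguments, and tensor layout are assumptions chosen for illustration and do not correspond to this project's actual modules.

    # Minimal GQA sketch (illustrative only; not this framework's implementation).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GroupedQueryAttention(nn.Module):
        """Attention in which several query heads share each key/value head."""

        def __init__(self, d_model: int, n_heads: int, n_kv_heads: int) -> None:
            super().__init__()
            assert d_model % n_heads == 0 and n_heads % n_kv_heads == 0
            self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
            self.head_dim = d_model // n_heads
            # Unified QKV projection: one matmul yields queries, keys, and values.
            self.qkv_proj = nn.Linear(
                d_model, (n_heads + 2 * n_kv_heads) * self.head_dim, bias=False
            )
            self.out_proj = nn.Linear(d_model, d_model, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, t, _ = x.shape
            q, k, v = self.qkv_proj(x).split(
                [
                    self.n_heads * self.head_dim,
                    self.n_kv_heads * self.head_dim,
                    self.n_kv_heads * self.head_dim,
                ],
                dim=-1,
            )
            q = q.view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
            k = k.view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
            v = v.view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
            # Expand shared KV heads so every query head has a matching KV head.
            repeat = self.n_heads // self.n_kv_heads
            k = k.repeat_interleave(repeat, dim=1)
            v = v.repeat_interleave(repeat, dim=1)
            out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
            return self.out_proj(out.transpose(1, 2).reshape(b, t, -1))

Using fewer KV heads than query heads (for example, 8 query heads sharing 2 KV heads) shrinks the key/value projections and the KV cache while retaining most of the quality of full multi-head attention, which is the balance the GQA bullet refers to.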

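As a companion to the configuration bullet above, the following is a hypothetical sketch of how a YAML file can be mapped onto a Python dataclass. The TrainingConfig fields, the load_config helper, and the PyYAML dependency are assumptions made for illustration and are not the framework's actual configuration schema.

    # Hypothetical YAML-to-dataclass configuration loading (illustrative only).
    from dataclasses import dataclass, fields

    import yaml  # PyYAML, assumed to be installed

    @dataclass
    class TrainingConfig:
        task: str = "lm"
        epochs: int = 1
        batch_size: int = 32
        learning_rate: float = 3e-4
        use_amp: bool = True  # toggle Automatic Mixed Precision

    def load_config(path: str) -> TrainingConfig:
        """Read a YAML file, keeping only the keys the dataclass declares."""
        with open(path) as f:
            raw = yaml.safe_load(f) or {}
        known = {f.name for f in fields(TrainingConfig)}
        return TrainingConfig(**{k: v for k, v in raw.items() if k in known})

A YAML file containing, say, epochs: 3 and batch_size: 16 would override only those fields, with the dataclass defaults filling in the rest; this keeps experiment configs short and easy to version-control.
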
Quick Start

To quickly get started with training a model using the framework, follow these steps:

  1. Initialize Project: Ensure your environment is set up by running the initialization command (see Installation for details).
    make init
  2. Run a Training Example: Execute the main training script with a language modeling task.
    python src/llm/training/train.py --task lm --epochs 1 --batch-size 32
    This command will train a language model for 1 epoch with a batch size of 32. You will see training progress logs in your console.

Installation

Requirements

  • Python 3.13+
  • uv: A fast Python package installer and resolver, written in Rust.
  • make: A build automation tool (typically pre-installed on Linux/macOS, available via Chocolatey/Scoop on Windows).

Setting up the Environment

This project uses uv for dependency management and a Makefile for common development tasks.

  1. Install uv: If you don't have uv installed, follow the official installation instructions in the uv documentation.
  2. Initialize Project: Navigate to the project root directory and run the make init command. This will set up the virtual environment, install all necessary dependencies, and install pre-commit hooks.
    make init
  3. Synchronize Dependencies (if needed): If pyproject.toml or uv.lock changes, you can re-synchronize dependencies:
    make sync

User Installation (Distribution)

If this project were to be distributed as a package, users would typically install it using pip:

pip install llm # Assuming 'llm' is the package name on PyPI

However, for development, make init is the recommended way to set up the environment.

Usage

For comprehensive documentation, including detailed usage examples, development guides, and troubleshooting, please refer to the guides referenced in the sections below.

Development

For detailed information on setting up the development environment, running tests, maintaining code quality, and other development workflows, please refer to our comprehensive Development Guide.

Troubleshooting

If you encounter any issues while using llm, please check our Troubleshooting Guide for common problems and their solutions. If you can't find a solution to your problem, please open an issue on our GitHub repository.

Contributing

We welcome contributions! Please see our Contributing Guide for details on how to submit pull requests, report issues, or suggest improvements.

License

This project is licensed under either of:

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT License (LICENSE-MIT)

at your option.

Roadmap

For our detailed development roadmap and future plans, including upcoming features like inference API, Flash Attention integration, and RLHF support, please see ROADMAP.md.

Changelog

For a detailed history of changes to this project, please refer to our CHANGELOG.md.

Contact

For questions, suggestions, or support, please open an issue on our GitHub repository.

Acknowledgements

We acknowledge all contributors, open-source projects, and resources that have inspired and supported the development of this project.
