Utilities for efficient fine-tuning, inference and evaluation of code generation models
Python package for rematerialization-aware gradient checkpointing
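The rematerialization idea above (store only sparse checkpoints in the forward pass, recompute intermediate activations during backward) can be sketched in pure Python on a toy chain of layers. Everything here is hypothetical illustration, not the package's API; the layer `f(x) = 2x + 1` is chosen only so the gradient is easy to verify by hand.

```python
# Toy rematerialization-aware checkpointing sketch (hypothetical, not a real API).
# Chain: y = f(f(...f(x))); each layer is f(x) = 2x + 1, so f'(x) = 2.
def layer(x):
    return 2.0 * x + 1.0

def layer_grad(x):
    return 2.0

def forward_checkpointed(x, n_layers, segment):
    """Run the chain, storing only every `segment`-th activation (the checkpoints).

    Plain backprop would store all n_layers activations; here we keep
    n_layers / segment of them, trading memory for recompute.
    """
    checkpoints = {0: x}
    a = x
    for i in range(n_layers):
        a = layer(a)
        if (i + 1) % segment == 0:
            checkpoints[i + 1] = a
    return a, checkpoints

def backward_checkpointed(checkpoints, n_layers, segment, grad_out=1.0):
    """Backward pass: rematerialize each segment's activations from its checkpoint."""
    grad = grad_out
    for seg_end in range(n_layers, 0, -segment):
        seg_start = seg_end - segment
        # Recompute the activations inside this segment from the stored checkpoint.
        acts = [checkpoints[seg_start]]
        for _ in range(segment):
            acts.append(layer(acts[-1]))
        # Chain rule, walking the segment in reverse.
        for i in range(segment - 1, -1, -1):
            grad *= layer_grad(acts[i])
    return grad
```

With 8 layers and `segment=4`, only 3 activations are stored instead of 9, and the recovered gradient is still the exact `2**8`.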
Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
Triton implementation of FlashAttention2 that adds Custom Masks.
Toy Flash Attention implementation in torch
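The core trick such a toy implementation demonstrates is the online softmax: stream over key/value blocks, keeping a running max and running denominator so the full attention matrix is never materialized. A minimal NumPy sketch (my own illustration, not code from the repo above):

```python
import numpy as np

def flash_attention_toy(q, k, v, block=2):
    """Attention computed one query row at a time, streaming over K/V blocks
    with an online softmax -- the memory-saving idea behind FlashAttention."""
    n, d = q.shape
    out = np.zeros_like(v, dtype=np.float64)
    scale = 1.0 / np.sqrt(d)
    for i in range(n):
        m = -np.inf                    # running max of the logits seen so far
        l = 0.0                        # running softmax denominator
        acc = np.zeros(v.shape[1])     # running weighted sum of values
        for j0 in range(0, n, block):
            s = q[i] @ k[j0:j0 + block].T * scale   # logits for this K/V block
            m_new = max(m, s.max())
            correction = np.exp(m - m_new)          # rescale the old accumulator
            p = np.exp(s - m_new)
            l = l * correction + p.sum()
            acc = acc * correction + p @ v[j0:j0 + block]
            m = m_new
        out[i] = acc / l
    return out
```

Because each block only updates the running statistics, peak memory per query is O(block) instead of O(n), while the result matches standard softmax attention exactly.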
Long-term project about a custom AI architecture. Consists of cutting-edge machine learning techniques such as Flash-Attention, Grouped-Query-Attention, ZeRO-Infinity, BitNet, etc.
Fast and memory efficient PyTorch implementation of the Perceiver with FlashAttention.
Decoder-only LLM trained on the Harry Potter books.
Training GPT-2 on FineWeb-Edu in JAX/Flax
[CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.
Building Native Sparse Attention
A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/Pallas/JAX).
MoBA: Mixture of Block Attention for Long-Context LLMs
Grouped-Tied Attention by Zadouri, Strauss, Dao (2025).
A from-scratch implementation of a T5 model modified with Rotary Position Embeddings (RoPE). This project includes the code for pre-training on the C4 dataset in streaming mode with Flash Attention 2.
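Rotary Position Embeddings, as used in the project above, rotate each consecutive pair of query/key features by a position-dependent angle, so that dot products depend only on the relative offset between positions. A small NumPy sketch of the standard formulation (an illustration under my own naming, not the repo's code):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply Rotary Position Embeddings: rotate each consecutive pair of
    features of x by an angle pos * inv_freq for that pair's frequency."""
    d = x.shape[-1]
    inv_freq = base ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    theta = pos * inv_freq
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

The defining property is relative-position invariance: `rope(q, m) @ rope(k, n)` is unchanged when both positions are shifted by the same amount.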
Ring sliding window attention implementation with flash attention
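In sliding-window attention, each query attends only to the most recent `window` keys (causally), which is what makes ring/blockwise implementations tractable for long sequences. The mask itself is simple to state; a NumPy sketch of this standard mask (my own helper name, not the repo's):

```python
import numpy as np

def sliding_window_mask(n, window):
    """Boolean attention mask: query i may attend to keys j with
    i - window < j <= i (causal, limited to the last `window` positions)."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (j > i - window)
```

With `n=5, window=2`, row 3 permits only keys 2 and 3; a flash-attention kernel would skip key blocks that fall entirely outside the band.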
Implementing modern DL systems from scratch — Transformers, Diffusion, Multimodal LLMs, FlashAttention, RLHF.
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.