Starred repositories
CaptionQA: Is Your Caption as Useful as the Image Itself?
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.
Official implementation of "Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs".
[NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone.
Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, along with strong general video understanding capability.
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
A Next-Generation Training Engine Built for Ultra-Large MoE Models
ScalarLM - a unified training and inference stack
PyTorch building blocks for the OLMo ecosystem
Repository containing code and data for the paper "ArgCMV: An Argument Summarization Benchmark for the LLM-era", accepted at EMNLP 2025 Main Conference.
Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model
Awesome LLM pre-training resources, including data, frameworks, and methods.
Analyze the inference of Large Language Models (LLMs) — covering computation, storage, transmission, and the hardware roofline model — in a user-friendly interface.
Latency and Memory Analysis of Transformer Models for Training and Inference
LLM theoretical performance analysis tools, supporting parameter, FLOPs, memory, and latency analysis.
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
Transformer related optimization, including BERT, GPT
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
[TMLR 2024] Efficient Large Language Models: A Survey
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models