Stars
A unified inference and post-training framework for accelerated video generation.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
A curated list of resources on diffusion models in RL (continually updated)
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
OpenClaw-RL: Train any agent simply by talking
A curated list of works in world modeling: a one-stop resource for researchers, practitioners, and enthusiasts.
RLLaVA is a user-friendly framework for multi-modal RL research and optimized for resource-constrained teams.
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
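The "fine-grained scaling" in DeepGEMM refers to assigning one scale factor per small group of values rather than one per tensor, so outliers in one group do not destroy the precision of the others. A minimal CPU-side sketch of the scaling step (names and group size are illustrative, not DeepGEMM's API; the actual 8-bit rounding and GEMM run in its CUDA kernels):

```python
FP8_E4M3_MAX = 448.0  # largest finite magnitude in the FP8 E4M3 format

def scale_per_group(x, group_size=128):
    """Compute one scale per contiguous group of `group_size` values so
    that each scaled group fits inside the FP8 E4M3 range. Returns the
    scales and the scaled values (rounding to 8 bits is omitted here)."""
    scales, scaled = [], []
    for i in range(0, len(x), group_size):
        block = x[i:i + group_size]
        amax = max(abs(v) for v in block) or 1.0  # avoid divide-by-zero
        s = amax / FP8_E4M3_MAX
        scales.append(s)
        scaled.append([v / s for v in block])
    return scales, scaled

def dequantize(scales, scaled):
    """Undo the scaling: multiply each group by its own scale factor."""
    out = []
    for s, block in zip(scales, scaled):
        out.extend(v * s for v in block)
    return out
```

Because each group carries its own scale, a single large value (e.g. 1000.0) only coarsens the quantization grid of its own group of 128 elements, not the whole tensor.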
Curated systems, benchmarks, and papers on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.
A lightweight inference framework for image and video generation.
A framework for efficient model inference with omni-modality models
A collection of diffusion model papers categorized by subarea.
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
verl: Volcano Engine Reinforcement Learning for LLMs
Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc.
Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation
[ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification".
PyTorch implementations of `BatchSampler` that under- or over-sample according to a chosen parameter alpha, in order to create a balanced training distribution.
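The core idea behind such a sampler is to interpolate between the dataset's natural class distribution (alpha = 0) and a uniform class distribution (alpha = 1), then draw batch indices with those weights. A minimal sketch in plain Python (function names and the exact weighting formula are assumptions, not the repo's API; the real version wraps `torch.utils.data.BatchSampler`):

```python
import random
from collections import Counter

def balanced_weights(labels, alpha):
    """Per-sample sampling weights that interpolate between the natural
    class distribution (alpha=0) and a uniform one over classes (alpha=1)."""
    counts = Counter(labels)
    n = len(labels)
    num_classes = len(counts)
    weights = []
    for y in labels:
        natural = counts[y] / n       # class mass under the empirical distribution
        uniform = 1.0 / num_classes   # class mass under a flat distribution
        target = (1 - alpha) * natural + alpha * uniform
        # spread the target class mass evenly over the class's members
        weights.append(target / counts[y])
    return weights

def sample_batch(labels, alpha, batch_size, rng=random):
    """Draw batch indices (with replacement) under the blended distribution."""
    w = balanced_weights(labels, alpha)
    return rng.choices(range(len(labels)), weights=w, k=batch_size)
```

With alpha = 1, rare classes are over-sampled until every class contributes equal probability mass to a batch; with alpha = 0, sampling reduces to the usual uniform draw over examples.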
AIInfra (AI infrastructure) covers the AI systems stack from underlying hardware such as chips up through the software layers that support large-model training and inference.
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
A tensor-aware point-to-point communication primitive for machine learning
Build multimodal data processing pipelines with Azure AI Services + LLMs
A lightweight design for computation-communication overlap.
A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.