Stars
cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source kernels.
Evaluate and improve models and agents using environments
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
A framework for efficient model inference with omni-modality models
NumPy and SciPy on Multi-Node Multi-GPU systems
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
NexRL is an ultra-loosely-coupled LLM post-training framework.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Perplexity open source garden for inference technology
ZSL98 / verl
Forked from verl-project/verl
veRL: Volcano Engine Reinforcement Learning for LLM
The official implementation of OSDI'25 paper BlitzScale
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
Tile primitives for speedy kernels
SkyRL: A Modular Full-stack RL Library for LLMs
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
DelinQu / SimplerEnv-OpenVLA
Forked from simpler-env/SimplerEnv
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo, and OpenVLA) in simulation under common setups (e.g., Google Robot, WidowX+Bridge)
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
C++ examples for the Vulkan graphics API
GPU Sharing Scheduler for Kubernetes Cluster
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
This repository is established to store personal notes and annotated papers during daily research.
Implementation of "PaLM-E: An Embodied Multimodal Language Model"