Stars
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
An open-source AI agent that brings the power of Gemini directly into your terminal.
[CVPR 2025] Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo…
Question and Answer based on Anything.
Collects the best open-source table recognition models, improves their pre-processing and post-processing, and converts the models to ONNX.
A TTS model capable of generating ultra-realistic dialogue in one pass.
A live-streamed development of RL tuning for LLM agents
Run MCP stdio servers over SSE and SSE over stdio. AI gateway.
Model Context Protocol Servers
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
A Datacenter Scale Distributed Inference Serving Framework
Wan: Open and Advanced Large-Scale Video Generative Models
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
FlashMLA: Efficient Multi-head Latent Attention Kernels
A curated list of resources on diffusion models in RL (continually updated)
wonderisland / cutlass
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
NVIDIA Linux open GPU kernel module source
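Cube Studio: an open-source, cloud-native, one-stop AI platform for machine learning, deep learning, and large models, covering the full algorithm pipeline: compute leasing platform, online notebook development, drag-and-drop pipeline orchestration of task flows, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, annotation platform, automated annotation, SFT fine-tuning / reward model / reinforcement learning training for large models such as DeepSeek, multi-node large-model inference with vLLM/Ollama/MindIE, private knowledge base, AI model marketplace, supports…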
Efficient Triton Kernels for LLM Training
Whisper realtime streaming for long speech-to-text transcription and translation
✨✨Latest Advances on Multimodal Large Language Models
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding