Stars
Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python.
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
TokenSpeed is a speed-of-light LLM inference engine.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Practice reproducing classic recommender-system models in PyTorch, such as MF, FM, DeepConn, MMOE, PLE, DeepFM, NFM, DCN, AFM, AutoInt, ONN, FiBiNET, DCN-v2, AFN, DCAP, etc.
whutbd / cuda-learn-note
Forked from xlite-dev/LeetCUDA
🎉 CUDA notes / frequently asked interview questions / C++ notes; personal notes, updated irregularly: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Use Garry Tan's exact Claude Code setup: 23 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA
Lumina Robotics Talent Call | Lumina Community Embodied AI Talent Board | A list for Embodied AI / Robotics Jobs (PhD, RA, intern, etc.)
Twinkle✨: Training workbench to make your model glow.
A lightweight alternative to OpenClaw that runs in containers for security. Connects to WhatsApp, Telegram, Slack, Discord, Gmail and other messaging apps, has memory, scheduled jobs, and runs dir…
Production-grade client-side tracing, profiling, and analysis for complex software systems.
"AI-Trader: 100% Fully-Automated Agent-Native Trading"
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Composable and Embeddable Communication Runtime for Distributed AI Services
Open-source LLM knowledge platform: turn raw documents into a queryable RAG, an autonomous reasoning agent, and a self-maintaining Wiki.
Whisper command line client compatible with original OpenAI client based on CTranslate2.
A PyTorch native platform for training generative AI models
A Next-Generation Training Engine Built for Ultra-Large MoE Models
slime is an LLM post-training framework for RL Scaling.
Simultaneous speech-to-text models
A distributed deep learning training and prediction framework based on the PS-Lite parameter server. 1) It includes a complete set of processes such as sample generation, feature extraction, model…
Chinese translation of the LLMs-from-scratch project