Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A high-throughput and memory-efficient inference and serving engine for LLMs
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Fully open reproduction of DeepSeek-R1
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
verl: Volcano Engine Reinforcement Learning for LLMs
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Agent harness built with LangChain and LangGraph. Equipped with a planning tool, a filesystem backend, and the ability to spawn subagents - well-equipped to handle complex agentic tasks.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).
Object detection, 3D detection, and pose estimation using center point detection
Google AI 2018 BERT PyTorch implementation
Demo of a customer service use case implemented with the OpenAI Agents SDK
slime is an LLM post-training framework for RL Scaling.
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
A live-stream development of RL tuning for LLM agents
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
Official Repository of Absolute Zero Reasoner
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Code for the paper "Planning with Diffusion for Flexible Behavior Synthesis"
The official implementation of Self-Play Fine-Tuning (SPIN)
AIDE: AI-Driven Exploration in the Space of Code. The machine learning engineering agent that automates AI R&D.