Stars
A project implementing various agentic RL based on the Slime post-training framework
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Deep dive into the Claude Code source: architecture, agent loop, context engineering, tool system, and more.
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
Lightweight coding agent that runs in your terminal
A construction kit for reinforcement learning environment management.
Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
Bash is all you need: a nano Claude Code-like "agent harness", built from 0 to 1
MCore-Bridge: Providing Megatron-Core model definitions for state-of-the-art large models and making Megatron training as simple as Transformers.
Harbor is a framework for running agent evaluations and creating and using RL environments.
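😼 Elegantly use a clash/mihomo-based proxy environment
Latest 2026 ChatGPT subscription top-up tutorial (117 yuan/month): this article focuses on five ways to activate a ChatGPT Plus membership: buying a standalone ChatGPT Plus account, having someone top up your ChatGPT account on your behalf, sharing a ChatGPT Plus account with others, paying for ChatGPT membership with an Apple gift card, and subscribing to ChatGPT Plus with a foreign virtual credit card.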
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
[KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for Kernel Generations
OpenClaw-RL: Train any agent simply by talking
Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
A curated skill collection for academic writing and research
Research Writing Assistant
RedSearcher's framework for deep search agent trajectory synthesis, QA filtering, and model evaluation, supporting ReACT and DeepSeek-style agent loops.
Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"
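Sharing AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI software and hardware 🔧, and more
Qwen3.5 is the large language model series developed by the Qwen team, Alibaba Cloud.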
REDSearch: A scalable, cost-efficient framework for long-horizon search agents. Features complex task synthesis, optimized mid-training, and post-training (SFT and Agentic RL)