Stars
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
A resource repository for machine unlearning in large language models
The official implementation of “ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought”
Alpha Screening with LLM Reasoning via Reinforcement Learning
Holmes is an interactive, text-based crime investigation game powered by a large language model (LLM). With each replay, the game offers a fresh narrative, ensuring a unique experience for players …
A Chinese-language financial trading framework based on multi-agent LLMs - TradingAgents Chinese enhanced edition
SGLang is a high-performance serving framework for large language models and multimodal models.
Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
yuanzhoulvpi2017 / nano_rl
Forked from verl-project/verl
Custom reward development on top of verl
A high-throughput and memory-efficient inference and serving engine for LLMs
Minimal reproduction of DeepSeek R1-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
[ICCV 2019] Monocular depth estimation from a single image
The official code for ICRA 2021 Paper: "Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation"
Self-supervised monocular depth estimation with a vision transformer