Stars
High-performance linear attention kernel library built on TileLang
Secure, Fast, and Extensible Sandbox runtime for AI agents.
An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale
Minimal and readable coding agent harness implementation in Python to explain the core components of coding agents.
A project implementing various agentic RL approaches based on the Slime post-training framework
CL-bench: A Benchmark for Context Learning
Claw-Eval is an evaluation harness for evaluating LLMs as agents. All tasks are verified by humans.
PinchBench is a benchmarking system for evaluating LLMs as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
Harbor is a framework for running agent evaluations and creating and using RL environments.
A benchmark for LLMs on complicated tasks in the terminal
OpenClaw-RL: Train any agent simply by talking
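The awesome collection of OpenClaw skills. 5,400+ skills filtered and categorized from the official OpenClaw Skills Registry. 🦞
Master OpenClaw from scratch: the most comprehensive Chinese-language tutorial, covering installation, configuration, hands-on case studies, and a guide to avoiding common pitfalls (GitHub edition)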
Qwen3.6 is the large language model series developed by Qwen team, Alibaba Group.
Lightweight and portable LLM sandbox runtime (code interpreter) Python library.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
NexRL is an ultra-loosely-coupled LLM post-training framework.
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
A PyTorch-native inference engine with caching, parallelism, and quantization for Diffusion Transformers.
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
SGLang model provider for Strands Agents for on-policy agentic RL training.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.