Stars
CORAL is a robust, lightweight infrastructure for multi-agent autonomous self-evolution, built for autoresearch.
AI handles execution, humans own the direction, and every run becomes an inspectable research artifact on disk.
🔥[SIGMOD'26] Official repository for the paper "DeepEye-SQL: A Software-Engineering-Inspired Text-to-SQL Framework"
A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.
"ClawTeam: Agent Swarm Intelligence" (One Command → Full Automation)
Toolathlon-Gym for testing AI agents' real-world tool-use capabilities across diverse MCP servers.
Google Workspace CLI — one command-line tool for Drive, Gmail, Calendar, Sheets, Docs, Chat, Admin, and more. Dynamically built from Google Discovery Service. Includes AI agent skills.
Ralph is an autonomous AI agent loop that runs repeatedly until all PRD items are complete.
🔥[ICLR'26] Official repository for the paper "Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs"
"🐈 nanobot: The Ultra-Lightweight Personal AI Agent"
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Automating Sub-Agent Creation for Agentic Orchestration
slime is an LLM post-training framework for RL Scaling.
RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings
This is the official GitHub repo for the CSGOTrading project.
My learning notes for ML SYS.
Eigent: The Open Source Cowork Desktop to Unlock Your Exceptional Productivity. Local and Free Alternative to Claude Cowork.
General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.
Public repository for the Remote Labor Index (RLI)
This repository allows reproduction of Poetiq's record-breaking submission to the ARC-AGI-1 and ARC-AGI-2 benchmarks.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains