- East China Normal University
- Shanghai, China
Stars
AssetOpsBench - Industry 4.0: A unified benchmark and framework for building, orchestrating, and evaluating domain-specific AI agents for Industry 4.0 asset operations and maintenance, with 460+ sc…
A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at https://datawhalechina.github.io/easy-rl/
Aligning pretrained language models with instruction data generated by themselves.
Agentic RL on Any Harness at Scale
Optimize prompts, code, and more with AI-powered Reflective Text Evolution
Official repository of the paper: Continual Harness: Online Adaptation for Self-Improving Foundation Agents and PokeAgent Speedrun Track 2
The First Unified Agent Data Synthesis Framework for Custom Agentic Tasks with an all-in-one environment
AgentTuning: Enabling Generalized Agent Abilities for LLMs
DSPy: The framework for programming—not prompting—language models
[ICLR 2026] A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories.
Official AHE code — Agentic Harness Engineering: observability-driven automatic evolution of coding-agent harnesses (concurrent w/ meta-harness). NexAU-AHE reaches 84.7% ± 2.1 pass@1 on Terminal-Be…
AgentTrace is an open-source, local-first step debugger for AI agents. It provides a Python SDK for tracing your agent runs and a web UI to inspect spans, tool calls, prompts, and responses as an i…
AgentTrace is a lightweight observability library to trace and evaluate agentic systems.
Reference code for the Meta-Harness paper.
AI agents running research on single-GPU nanochat training automatically
Automatic Environment Generation with Evolving Coding Agent for Embodied Agent Learning
"OpenSpace: Make Your Agents: Smarter, Low-Cost, Self-Evolving" -- Community: https://open-space.cloud/
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
Heuristic Learning Blog Post
Codes for papers on Large Language Models Personalization (LaMP)
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
Learn OpenClaw from 0 to 1: sections to build a claw-AI agent from scratch
Bash is all you need - A nano Claude Code–like 「agent harness」, built from 0 to 1
The Language Virtual Machine for Agent Skills
The agent that grows with you