- Fudan University
- Shanghai
- https://abbey4799.github.io/
Stars
The first Chinese metaphor corpus for metaphor identification and generation (a Chinese metaphor dataset). Presented at COLING 2022.
🦞 ClawMark: A Living-World Benchmark for Multi-Day, Multimodal Coworker Agents
A persistent, unified memory layer for all your AI agents (e.g. Claude Code, Codex), backed by Markdown and Milvus.
An in-the-wild benchmark for AI agents in the OpenClaw Environment.
An in-depth study of the Claude Code source code, comprising three major chapters (Foundations, Execution, Infrastructure) and an architectural breakdown of 23 subsystems.
Persistent Context Across Sessions for Every Agent – Captures everything your agent does during sessions, compresses it with AI, and injects relevant context back into future sessions. Works with C…
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.
Mini CLI search engine for your docs, knowledge bases, meeting notes, whatever. Tracks current SOTA approaches while running entirely locally.
Evaluate your agent memory on real-world dialogues, not LLM-simulated dialogues.
[ICML 26] An evaluation framework assessing long-context retention and long-horizon memory performance for agentic applications (AMA-bench).
Open source code for ICLR 2026 Paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)
Reverse-engineered TypeScript client for QClaw's WeChat Access API.
CL-bench: A Benchmark for Context Learning
Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Official Implementation of Knowledge Flow Prompting
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
Official Code for "Coser: Coordinating LLM-Based Persona Simulation of Established Roles"
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Sotopia-RL: Reward Design for Social Intelligence
The absolute trainer to light up AI agents.