PolyU
- Hong Kong, China
- https://hemingkx.github.io/
- @hemingkx
Stars
A MemAgent framework that extrapolates to 3.5M-token contexts, along with a framework for RL training of any agent workflow.
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
Ralph is an autonomous AI agent loop that runs repeatedly until all PRD items are complete.
Awesome list for AI agent harness engineering: tools, patterns, evals, memory, MCP, permissions, observability, and orchestration.
The agent that grows with you
🛠️ Awesome tools & guides for harness engineering.
SkillsBench evaluates how well skills work and how effective agents are at using them
Memento-Skills: Let Agents Design Agents
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
Specification and documentation for Agent Skills
AutoSkill: Experience-Driven Lifelong Learning via Skill Self-Evolution
A list of papers on agent skills.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
General technology for enabling AI capabilities w/ LLMs and MLLMs
Your private AI assistant on your phone: simple, safe, and ready anytime.
"Parallel Test-Time Scaling for Latent Reasoning Models"
AI agents running research on single-GPU nanochat training automatically
DFlash: Block Diffusion for Flash Speculative Decoding
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
OpenClaw-RL: Train any agent simply by talking
🐈 nanobot: The Ultra-Lightweight Personal AI Agent
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
Reinforcement Learning via Self-Distillation (SDPO)
CL-bench: A Benchmark for Context Learning
[SIGIR 2026] "One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment"
"Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space"