University of Washington - Seattle
https://zhangchenxu.com - @zhangchen_xu
Stars
Daytona is a Secure and Elastic Infrastructure for Running AI-Generated Code
Label Studio is a multi-type data labeling and annotation tool with standardized output format
A minimal yet professional single agent demo project that showcases the core execution pipeline and production-grade features of agents.
All parts of Claude Code's system prompt, 20 built-in tool descriptions, sub-agent prompts (Plan/Explore/Task), utility prompts (CLAUDE.md, compact, statusline, magic docs, WebFetch, Bash cmd, secur…
https://astro-multiplepage-portfolio.edgeone.app/
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
GenAI Agent Framework, the Pydantic way
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
Add your HDD, SSD and NVMe drives to your Synology's compatible drive database and a lot more
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data
Klavis AI (YC X25): MCP integration platforms that let AI agents use tools reliably at any scale
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
A Python tool that automatically cleans, completes, and standardizes BibTeX entries using LLMs and web search.
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
MCPToolBench++: a Model Context Protocol (MCP) tool-use benchmark evaluating the tool-use ability of AI agents and models
Synthetic data curation for post-training and structured data extraction
MCPMark is a comprehensive, stress-testing MCP benchmark designed to evaluate model and agent capabilities in real-world MCP use.
MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents
Public Evaluation Result Archive for BFCL
MCP-based Agent Deep Evaluation System
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!