Test your MCP servers by having AI agents complete real tasks.
You've built an MCP server with tools. It works. But can an AI agent actually discover and use your tools correctly? Are your descriptions clear enough? Does your server handle edge cases?
mcpchecker answers these questions automatically. It runs real AI agents against your MCP server, records every tool call, and verifies that tasks complete successfully. Think of it as integration testing for AI tool use.
brew tap mcpchecker/mcpchecker
brew install mcpchecker
For other platforms (Linux, manual download), see Getting Started.
mcpchecker check eval.yaml
This runs an evaluation that:
- Starts your MCP server and sets up an MCP proxy to record tool calls
- Gives an AI agent a task prompt
- Verifies the task succeeded (via scripts or an LLM judge)
- Checks assertions against the recorded behavior
Results are saved to mcpchecker-<name>-out.json with a pass/fail summary printed to the terminal.
For hands-on tutorials, see Quickstarts.
mcpchecker places a recording proxy between the agent and your MCP server:
AI Agent --> MCP Proxy (recording) --> Your MCP Server
If agents can discover and use your tools to complete tasks, your server is well-designed. If they can't, the recorded call history helps you figure out why.
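To make the recording idea concrete, here is a minimal conceptual sketch in Go. It is not mcpchecker's implementation: `ToolServer`, `RecordingProxy`, `ToolCall`, `CalledInOrder`, and `echoServer` are hypothetical names invented for illustration, and a real MCP proxy would speak the MCP protocol rather than call a Go interface. The sketch only shows the pattern of forwarding each call upstream while keeping a history that an assertion (here, a call-order check) can inspect afterwards.

```go
package main

import "fmt"

// ToolCall is one recorded invocation (hypothetical shape, for illustration only).
type ToolCall struct {
	Tool string
	Args map[string]any
}

// ToolServer is a hypothetical stand-in for the tool-call surface of an MCP server.
type ToolServer interface {
	CallTool(tool string, args map[string]any) (string, error)
}

// RecordingProxy records every call, then forwards it to the upstream server.
type RecordingProxy struct {
	Upstream ToolServer
	Calls    []ToolCall
}

// CallTool appends the call to the history before delegating upstream.
func (p *RecordingProxy) CallTool(tool string, args map[string]any) (string, error) {
	p.Calls = append(p.Calls, ToolCall{Tool: tool, Args: args})
	return p.Upstream.CallTool(tool, args)
}

// CalledInOrder reports whether the wanted tools appear in the recorded
// history in the given relative order (other calls may sit in between).
func CalledInOrder(calls []ToolCall, want ...string) bool {
	next := 0
	for _, c := range calls {
		if next < len(want) && c.Tool == want[next] {
			next++
		}
	}
	return next == len(want)
}

// echoServer is a fake upstream that acknowledges every tool call.
type echoServer struct{}

func (echoServer) CallTool(tool string, _ map[string]any) (string, error) {
	return "ok:" + tool, nil
}

func main() {
	proxy := &RecordingProxy{Upstream: echoServer{}}

	// Simulate an agent that lists files and then reads one.
	for _, tool := range []string{"list_files", "read_file"} {
		if _, err := proxy.CallTool(tool, nil); err != nil {
			fmt.Println("call failed:", err)
			return
		}
	}

	fmt.Println("recorded calls:", len(proxy.Calls))
	fmt.Println("order ok:", CalledInOrder(proxy.Calls, "list_files", "read_file"))
}
```

Because the proxy satisfies the same interface as the server it wraps, the agent side does not need to know it is being recorded. That is the property the diagram above relies on.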
Read more in How It Works.
Getting started:
How-to guides:
- Configure agents -- Claude Code, LLM agents, custom agents, ACP mode
- Write tasks -- task structure, labels, filtering, extensions
- Use assertions -- validate tool usage, call order, resource access
- LLM judge verification -- semantic evaluation of agent responses
- Parallel execution and multi-run -- speed up evals and test consistency
Reference:
Understanding:
- How it works -- architecture and evaluation flow
go build -o mcpchecker ./cmd/mcpchecker
See LICENSE.