
Tags: lastmile-ai/mcp-eval


v0.1.10

Bump version from 0.1.9 to 0.1.10

v0.1.9

Bump pyproject.toml version

v0.1.8

Bump pyproject ver

v0.1.7

Lint, format and bump pyproject

v0.1.6

bump pypi

v0.1.5

Make docs clearer, ship with usage_example for `mcp-eval init` (#11)

## Summary

`init` no longer starts from an empty state.
Docs are improved.

## Checklist

- [ ] Tests added/updated
- [ ] Docs updated (README, docs/*.mdx)
- [x] Lint passes locally
- [ ] Linked issue (if any)

## Screenshots / Logs

N/A

## Breaking Changes

- [ ] Yes
- [x] No

v0.1.3

mcp-eval final body join (#8)

## CLI (working):

MCP-Eval CLI Commands - Complete Reference

  Overview

  MCP-Eval provides a comprehensive CLI for testing MCP (Model Context Protocol) servers with AI agents. The commands follow a logical workflow from setup → validation → generation → execution → debugging.

  Command Summary

  🚀 Setup & Configuration

  mcp-eval init

  Purpose: Initialize a new MCP-Eval project

  What it does:
  - Creates mcpeval.yaml and mcpeval.secrets.yaml
  - Prompts for LLM provider (Anthropic/OpenAI) and API key
  - Auto-detects and prompts to import servers from .cursor/mcp.json or .vscode/mcp.json
  - Configures a default agent with instructions
  - Sets up judge configuration for test evaluation

  Example:
  mcp-eval init
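
If you want to confirm what `init` wrote without guessing at its schema, a small sketch like the one below just loads the files and lists their top-level sections. PyYAML is an assumed dependency; the file names come from the description above, and no config keys are hard-coded.

```python
# Minimal sketch: inspect what `mcp-eval init` created without assuming its schema.
# Requires PyYAML (`pip install pyyaml`); file names come from the docs above.
import yaml

for path in ("mcpeval.yaml", "mcpeval.secrets.yaml"):
    with open(path) as f:
        data = yaml.safe_load(f) or {}
    print(path, "->", sorted(data))  # top-level sections written by init
```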

  mcp-eval add server

  Purpose: Add MCP servers to your configuration

  Options:
  - Interactive mode: Prompts for transport, command/URL, args, env, headers
  - Import from file: --from-mcp-json or --from-dxt
  - Validates and saves to mcpeval.yaml

  Example:
  mcp-eval add server
  mcp-eval add server --from-mcp-json .cursor/mcp.json

  mcp-eval add agent

  Purpose: Add test agents that use MCP servers

  What it does:
  - Prompts for agent name, instruction, and server assignments
  - Validates that referenced servers exist
  - Offers to add missing servers inline
  - Optionally sets as default agent

  Example:
  mcp-eval add agent
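
MCP-Eval builds on mcp-agent, so an agent here boils down to the three fields the prompt asks for: a name, an instruction, and the servers it may use. The sketch below shows that shape in code; the import path and the `use_agent` registration hook are assumptions to check against the docs, and mcpeval.yaml remains the canonical place the CLI writes to.

```python
# Illustrative only: the fields `mcp-eval add agent` prompts for, expressed with
# mcp-agent's Agent class. Import paths and use_agent() are assumptions.
import mcp_eval
from mcp_agent.agents.agent import Agent

fetch_tester = Agent(
    name="fetch_tester",                 # hypothetical agent name
    instruction="Use the fetch server to retrieve URLs and report the results.",
    server_names=["fetch"],              # must reference servers configured in mcpeval.yaml
)

mcp_eval.use_agent(fetch_tester)         # assumed hook for selecting this agent in tests
```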

  🔍 Inspection & Validation

  mcp-eval list

  Purpose: View configured resources

  Subcommands:
  - servers: List all configured servers (with -v for full details including env/headers)
  - agents: List all agents (with -v for full instructions, --name X for a specific agent)
  - all: List everything

  Example:
  mcp-eval list servers -v
  mcp-eval list agents --name test_agent
  mcp-eval list all

  mcp-eval validate

  Purpose: Validate your configuration is correct

  What it checks:
  - API keys are configured
  - Judge model is set
  - Servers can be connected to (with --no-quick)
  - Agents reference valid servers
  - LLM connections work

  Options:
  - --quick: Skip connection tests
  - --no-servers: Skip server validation
  - --no-agents: Skip agent validation

  Example:
  mcp-eval validate
  mcp-eval validate --quick

  mcp-eval doctor

  Purpose: Comprehensive diagnostics for troubleshooting

  What it checks:
  - Python version and package installations
  - Configuration files existence
  - Environment variables
  - System information
  - Recent test errors
  - Runs validation checks
  - Provides actionable fix suggestions

  Options:
  - --full: Include connection tests

  Example:
  mcp-eval doctor
  mcp-eval doctor --full

  🧪 Test Generation

  mcp-eval generate

  Purpose: Generate test files for MCP servers

  What it does:
  - Connects to server and discovers available tools
  - Uses LLM to generate comprehensive test scenarios
  - Creates test files with assertions (pytest/decorators/dataset style)
  - Includes edge cases, error handling, performance checks

  Options:
  - --style: Test format (pytest, decorators, dataset)
  - --n-examples: Number of test scenarios to generate
  - --provider: LLM provider
  - --model: Specific model to use

  Example:
  mcp-eval generate
  mcp-eval generate --style pytest --n-examples 10
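
For a sense of what the generated files look like, here is a rough sketch of a decorator-style test of the kind `generate` produces for a fetch server. The identifiers (`task`, `Expect`, `TestAgent`, `TestSession`, `assert_that`, `generate_str`) follow MCP-Eval's documented style but are written from memory and should be treated as assumptions; the generated file itself is the source of truth.

```python
# Rough sketch of a generated decorator-style test; identifiers are assumptions.
from mcp_eval import Expect, task
from mcp_eval.session import TestAgent, TestSession


@task("Fetch example.com and summarize it")
async def test_fetch_example_domain(agent: TestAgent, session: TestSession):
    # Ask the agent to use the fetch server, then assert on tool usage and content.
    response = await agent.generate_str(
        "Fetch https://example.com and summarize it in one sentence."
    )

    await session.assert_that(Expect.tools.was_called("fetch"))
    await session.assert_that(Expect.content.contains("Example Domain"), response=response)
```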

  mcp-eval update

  Purpose: Append new tests to existing test files

  What it does:
  - Generates additional test scenarios
  - Appends to existing test file without overwriting
  - Maintains consistent style with existing tests

  Options:
  - --target-file: Path to existing test file
  - --server-name: Server to generate tests for
  - --style: Test style for new tests
  - --n-examples: Number of new tests

  Example:
  mcp-eval update --target-file tests/test_fetch.py --n-examples 5

  🏃 Test Execution

  mcp-eval run

  Purpose: Execute test files

  What it does:
  - Runs pytest-style or decorator-style tests
  - Executes tests against configured MCP servers
  - Generates reports in JSON/Markdown format
  - Saves traces for debugging

  Options:
  - Test selection flags (inherited from pytest)
  - --verbose: Detailed output
  - --pattern: File pattern matching

  Example:
  mcp-eval run
  mcp-eval run tests/test_fetch.py
  mcp-eval run --pattern "test_*.py"

  🐛 Debugging & Support

  mcp-eval issue

  Purpose: Create GitHub issues with diagnostic information

  What it does:
  - Gathers system and environment information
  - Includes recent test results and errors
  - Captures configuration status
  - Pre-fills GitHub issue with all details
  - Can open browser directly to create issue

  Options:
  - --title: Issue title
  - --no-include-outputs: Skip test outputs
  - --no-open-browser: Just show content without opening browser

  Example:
  mcp-eval issue
  mcp-eval issue --title "Server connection timeout"

  mcp-eval version

  Purpose: Show MCP-Eval version information

  Example:
  mcp-eval version

  Typical Workflow

  1. Initial Setup

  # Initialize project
  mcp-eval init

  # Add additional servers if needed
  mcp-eval add server

  # Configure agents
  mcp-eval add agent

  2. Validation

  # Check everything is configured correctly
  mcp-eval validate

  # List what's configured
  mcp-eval list all

  # Run diagnostics if issues
  mcp-eval doctor

  3. Test Generation

  # Generate initial tests
  mcp-eval generate --style pytest

  # Add more tests later
  mcp-eval update --target-file tests/test_server.py

  4. Test Execution

  # Run all tests
  mcp-eval run

  # Run specific tests
  mcp-eval run tests/test_fetch.py

  5. Debugging

  # If tests fail, diagnose
  mcp-eval doctor --full

  # Create issue if needed
  mcp-eval issue

## GitHub Actions

Added a `run` GitHub Action to run tests and generate/upload reports.

### Test badge

Test coverage and test success badges, also integrated into the GitHub Action.

## Docs

Detailed docs for the entire framework