A lightweight, general-purpose framework for evaluating GPU kernel correctness and performance.
- Three Evaluation Modes: Analyze, Compare, Benchmark
- Heterogeneous Hardware: AMD (HIP) and NVIDIA (CUDA) GPUs
- Execution Environments: Local, Sandbox Container, and Remote Ray Cluster
- Hardware Control: Hardware-aware evaluation under controlled GPU power and frequency settings
- Trace Analysis: TraceLens integration for performance profiling analysis
- MCP Server: Model Context Protocol integration for AI agents
- Structured Reports: JSON output for pipeline integration
- Python 3.10+
- AMD ROCm (HIP) or NVIDIA CUDA toolchain (for kernel compilation/profiling)
- `rocprof-compute` (AMD) or `ncu` (NVIDIA) if you enable performance profiling
- Docker (required for Benchmark mode)
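A quick, optional way to sanity-check these prerequisites from a shell (the commands below only assume the standard vendor tools are on your PATH):

```bash
python3 --version     # expect Python 3.10 or newer
docker --version      # only needed for Benchmark mode

# One of the two GPU stacks, depending on your hardware:
rocminfo | head -n 5  # AMD ROCm
nvidia-smi            # NVIDIA CUDA
```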
```bash
# Basic installation
pip install git+https://github.com/AMD-AGI/Magpie.git
```

```bash
git clone https://github.com/AMD-AGI/Magpie.git
cd Magpie

# Editable install (recommended for development)
pip install -e .

# Or use make
make install
```

```bash
# Analyze a kernel using a config file
magpie analyze --kernel-config Magpie/kernel_config.yaml.example

# Compare kernels directly
magpie compare kernel_v1.hip kernel_v2.hip

# Benchmark vLLM with torch profiling
magpie benchmark --benchmark-config examples/benchmark_vllm.yaml

# Run MCP server
python -m Magpie.mcp
```

Note: You can also use `python -m Magpie` instead of the `magpie` command.
| Mode | Description | Status |
|---|---|---|
| Analyze | Single kernel evaluation with test cases | ✅ |
| Compare | Multi-kernel comparison and ranking | ✅ |
| Benchmark | Framework-level benchmarking (vLLM/SGLang) with trace analysis | ✅ |
See the [Benchmark Mode Documentation](docs/benchmark.md) for detailed usage.
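To give a feel for what a benchmark run configures, here is a minimal sketch of a benchmark config; every key below is an illustrative assumption rather than Magpie's verified schema, so use examples/benchmark_vllm.yaml and docs/benchmark.md as the source of truth:

```yaml
# Illustrative sketch only -- keys are assumptions, not the verified schema;
# see examples/benchmark_vllm.yaml for a real config.
benchmark:
  framework: vllm                          # or sglang
  model: meta-llama/Llama-3.1-8B-Instruct  # example model id
  profiling:
    torch_profiler: true                   # capture a torch trace during the run
  tracelens:
    enabled: true                          # post-process the trace with TraceLens
```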
Key categories:
- `gpu`: force device selection and hardware control (power/frequency).
- `scheduler`: local/container/remote execution and scheduling behavior.
- `performance`: profiling and profiler configuration.
- `logging`: log levels and output formatting.
See Magpie/kernel_config.yaml.example for full examples.
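As a minimal sketch of how those categories might appear in a kernel config (the category names come from this README, but the individual fields below are illustrative assumptions; Magpie/kernel_config.yaml.example is the authoritative reference):

```yaml
# Illustrative sketch only -- field names are assumptions;
# see Magpie/kernel_config.yaml.example for the real schema.
gpu:
  device: 0                   # pin evaluation to a specific GPU
  power_cap_watts: 500        # hardware control: cap board power
scheduler:
  backend: local              # local / container / remote execution
performance:
  profiler: rocprof-compute   # or ncu on NVIDIA
logging:
  level: INFO
```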
Example configs live in examples/:
| Mode | Config File | Description |
|---|---|---|
| Analyze | `ck_gemm_add.yaml` | Single kernel evaluation |
| Compare | `ck_grouped_gemm_compare.yaml` | Multi-kernel comparison |
| Benchmark | `benchmark_vllm.yaml` | vLLM benchmark with profiling |
| Benchmark | `benchmark_vllm_tracelens.yaml` | vLLM + TraceLens analysis |
| Benchmark | `benchmark_sglang.yaml` | SGLang benchmark |
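These configs plug into the `magpie` commands shown earlier; assuming you run from the repository root, invocations look like:

```bash
# Analyze the single-kernel example
magpie analyze --kernel-config examples/ck_gemm_add.yaml

# Benchmark vLLM and post-process the trace with TraceLens
magpie benchmark --benchmark-config examples/benchmark_vllm_tracelens.yaml
```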
MCP configuration example: Magpie/mcp/config.json
Available tools:
- `analyze` - Analyze kernel correctness and performance
- `compare` - Compare multiple kernel implementations
- `hardware_spec` - Query GPU hardware specifications
- `configure_gpu` - Configure GPU power and frequency
- `discover_kernels` - Scan a project and suggest analyzable kernels/configs
- `suggest_optimizations` - Suggest performance optimizations from analyze output
- `create_kernel_config` - Generate a kernel config YAML for analyze
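For MCP clients that use the common `mcpServers` registration format, an entry along the following lines launches the server via `python -m Magpie.mcp`; this is a hedged sketch, and the shipped Magpie/mcp/config.json is the reference:

```json
{
  "mcpServers": {
    "magpie": {
      "command": "python",
      "args": ["-m", "Magpie.mcp"]
    }
  }
}
```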
```bash
make install-dev
make lint
make format
```

```
├── README.md
├── LICENSE
├── .gitignore
├── pyproject.toml               # Package configuration (pip install)
├── requirements.txt
├── Makefile
├── examples/                    # Example configurations
├── docs/                        # Documentation
│   └── benchmark.md             # Benchmark mode documentation
└── Magpie/
    ├── __init__.py              # Package initialization
    ├── __main__.py              # Entry point for python -m Magpie
    ├── main.py                  # CLI implementation
    ├── config.yaml              # Framework configuration
    ├── kernel_config.yaml.example
    ├── config/                  # Configuration classes
    ├── core/                    # Core engine components
    ├── eval/                    # Evaluation pipeline
    ├── modes/                   # Evaluation modes
    │   ├── analyze_eval/        # Single kernel analysis
    │   ├── compare_eval/        # Multi-kernel comparison
    │   └── benchmark/           # Framework-level benchmarking
    │       ├── benchmarker.py   # Benchmark orchestration
    │       ├── config.py        # Benchmark configuration
    │       ├── tracelens.py     # TraceLens integration
    │       └── result.py        # Result data structures
    ├── mcp/                     # MCP Server
    │   ├── __init__.py
    │   ├── __main__.py          # Entry point for python -m Magpie.mcp
    │   ├── server.py            # MCP server implementation
    │   └── config.json          # MCP client configuration
    └── utils/                   # Utility functions
```
MIT License. See LICENSE.