Autonomous A/B testing agent for small teams and indie builders.
Suggest tests. Implement variants. Simulate attention. Ship winners.
Traditional A/B testing tools are built for enterprises with dedicated growth teams and high traffic. Mutatr brings that capability to everyone else.
Mutatr is an Electron desktop app that uses the Claude Agent SDK to suggest, implement, and evaluate website experiments autonomously, and Playwright to render variants and produce attention heatmaps. When real traffic is low, it simulates visitors using synthetic customer personas so you can get a directional signal before going live.
This repository currently packages and distributes macOS desktop builds only.
1. Point mutatr at your project folder
2. It discovers your pages and generates synthetic personas
3. Pick a page → mutatr suggests high-impact tests
4. Approve tests → mutatr implements the code changes autonomously
5. Select personas → mutatr simulates multi-visitor attention per variant
6. Compare control vs. variant heatmaps with an interactive slider
Claude analyzes your page source and personas to propose conversion-focused A/B tests with hypotheses, expected impact, and risk levels.
Approved tests are implemented by Claude in an isolated temp copy of your project. Variants are rendered from that copy. Changes are written back only when you explicitly push a PR, and only as sanitized repo-relative files.
AI-generated customer personas (demographics, motivations, pain points, tone) that drive realistic attention simulations when real traffic isn't available.
Configurable visitor count per variant/persona pair. Multiple LLM calls run in parallel, and results are merged into Clarity-style heatmaps (blue → green → yellow → red) overlaid on full-page screenshots.
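A minimal sketch of how per-visitor results could be merged into one heatmap, assuming each simulated visitor returns a list of weighted attention points. The `AttentionPoint` shape, the function names, and the linear falloff are illustrative assumptions, not Mutatr's actual implementation.

```typescript
// Hypothetical shape for one attention point reported by a simulated visitor.
type AttentionPoint = { x: number; y: number; weight: number };

// Additively blend a linear radial falloff around every point from every
// visitor into a single width * height intensity grid.
function accumulateIntensity(
  visits: AttentionPoint[][],
  width: number,
  height: number,
  radius: number
): Float32Array {
  const grid = new Float32Array(width * height);
  for (const points of visits) {
    for (const p of points) {
      const x0 = Math.max(0, Math.floor(p.x - radius));
      const x1 = Math.min(width - 1, Math.ceil(p.x + radius));
      const y0 = Math.max(0, Math.floor(p.y - radius));
      const y1 = Math.min(height - 1, Math.ceil(p.y + radius));
      for (let y = y0; y <= y1; y++) {
        for (let x = x0; x <= x1; x++) {
          const d = Math.hypot(x - p.x, y - p.y);
          if (d <= radius) grid[y * width + x] += p.weight * (1 - d / radius);
        }
      }
    }
  }
  return grid;
}

// Scale the grid so the hottest cell becomes 1 (an all-zero grid stays zero).
function normalize(grid: Float32Array): Float32Array {
  let max = 0;
  for (let i = 0; i < grid.length; i++) if (grid[i] > max) max = grid[i];
  const out = new Float32Array(grid.length);
  if (max === 0) return out;
  for (let i = 0; i < grid.length; i++) out[i] = grid[i] / max;
  return out;
}

// Colour stops in the blue -> green -> yellow -> red order described above.
const STOPS: [number, number, number][] = [
  [0, 0, 255],
  [0, 255, 0],
  [255, 255, 0],
  [255, 0, 0],
];

// Map a normalized intensity in [0, 1] to an RGB triple by linear interpolation.
function colorFor(t: number): [number, number, number] {
  const clamped = Math.min(1, Math.max(0, t));
  const scaled = clamped * (STOPS.length - 1);
  const i = Math.min(STOPS.length - 2, Math.floor(scaled));
  const f = scaled - i;
  const a = STOPS[i];
  const b = STOPS[i + 1];
  return [
    Math.round(a[0] + (b[0] - a[0]) * f),
    Math.round(a[1] + (b[1] - a[1]) * f),
    Math.round(a[2] + (b[2] - a[2]) * f),
  ];
}
```

In a sketch like this, the coloured pixels would then be drawn over the full-page screenshot with partial opacity.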
Interactive slider UI: control heatmap on the left, variant on the right. Switch between variants and personas to compare attention patterns.
Choose different Claude models (Sonnet, Opus, Haiku) for each stage — persona generation, test suggestions, implementation, and attention analysis.
| Workflow | Heatmap Comparison |
|---|---|
| Choose page → Treatments → Renders → Personas → Results | Control vs. variant slider with persona sidebar |
- Node.js 18+
- One of:
  - A valid Claude Agent SDK local auth context
  - An Anthropic API key (entered in Settings)
git clone https://github.com/novynlabs/mutatr.git
cd mutatr
npm install

Imported projects should already have their own dependencies installed. Mutatr will not run `npm install` for them automatically.

npm run dev

This starts Vite (renderer) and Electron (main process) concurrently. The app window opens automatically.

npm run build

This builds the renderer only.

npm run package

Create distributable macOS artifacts:

npm run dist

`npm run package` and `npm run dist` are macOS-only release commands.
1. Add a project: Click "New project" and select your web project folder. Mutatr discovers pages, starts a local dev server if it can, and renders thumbnails.
2. Create an experiment: Give it a name, then choose a target page.
3. Generate tests: Click "Suggest tests" to get AI-generated A/B test ideas with hypotheses and risk levels.
4. Implement variants: Select tests and click "Implement selected". Claude writes the code changes in an isolated temp copy, and Playwright then screenshots the rendered variant.
5. Run attention analysis: Select renders and personas, set the visitor count, and click "Run test". Mutatr produces heatmaps for every (variant, persona) pair plus controls.
6. Compare results: Use the slider to compare control vs. variant attention, then review per-persona scorecards, issue detection, diff explanations, and aggregate variant scores.
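As a rough sketch of how an attention run might fan out, the snippet below enumerates one job per (render, persona, visitor) combination, including a control render, and awaits them together with `Promise.all`. `buildJobs`, `runAll`, the `"control"` id, and the `simulate` callback are illustrative names, not Mutatr's actual API.

```typescript
// Hypothetical job description for one simulated visitor.
type Job = { renderId: string; personaId: string; visitor: number };

// Enumerate every (render, persona, visitor) combination. The control render
// is included alongside the variants so it can serve as the baseline.
function buildJobs(variantIds: string[], personaIds: string[], visitorsPerPair: number): Job[] {
  const renderIds = ["control", ...variantIds];
  const jobs: Job[] = [];
  for (const renderId of renderIds) {
    for (const personaId of personaIds) {
      for (let visitor = 0; visitor < visitorsPerPair; visitor++) {
        jobs.push({ renderId, personaId, visitor });
      }
    }
  }
  return jobs;
}

// Start all visitor simulations at once, then group the results by
// "renderId::personaId" so each pair can be merged into one heatmap.
async function runAll<T>(
  jobs: Job[],
  simulate: (job: Job) => Promise<T>
): Promise<Map<string, T[]>> {
  const results = await Promise.all(jobs.map((job) => simulate(job)));
  const grouped = new Map<string, T[]>();
  jobs.forEach((job, i) => {
    const key = `${job.renderId}::${job.personaId}`;
    const bucket = grouped.get(key) ?? [];
    bucket.push(results[i]);
    grouped.set(key, bucket);
  });
  return grouped;
}
```

Note that `Promise.all` rejects as soon as any single call fails. A run that should tolerate partial failures could use `Promise.allSettled` instead.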
mutatr/
├── electron/ # Main process
│ ├── main.mjs # IPC handlers, app lifecycle
│ ├── preload.cjs # Context bridge
│ └── services/
│ ├── claudeService.mjs # Claude Agent SDK calls
│ ├── playwrightService.mjs # Page rendering, heatmap generation
│ ├── projectService.mjs # Project discovery, dev server detection
│ └── store.mjs # JSON persistence
├── src/ # Renderer (React)
│ ├── App.tsx # Main UI — workflow, comparison slider, settings
│ ├── main.tsx # Entry point with browser-dev mock API
│ ├── components/ui/ # Radix UI primitives (dialog, tabs, button, etc.)
│ ├── types/contracts.ts # Shared TypeScript contracts
│ └── styles/app.css # Custom styles (Tailwind + OKLCH theme)
├── e2e/ # End-to-end tests
│ ├── run-e2e.mjs # Mock Claude E2E
│ └── run-live-e2e.mjs # Real Claude API E2E
└── public/ # Static assets
- Isolated variant rendering: Each test variant is implemented and rendered from its own temp copy. The original repo is not mutated during screenshot capture.
- Strict file-path boundary: Model-reported changed files are normalized against the project root, and unsafe paths are rejected before rendering or PR creation.
- Parallel visitor queries: All visitor LLM calls within an attention run fire concurrently via `Promise.all`.
- No automatic dependency installs: Imported projects must already be runnable; if a dev server cannot boot, mutatr renders a clear fallback tile instead of modifying the project.
- macOS distribution path: Electron Builder is configured for signed DMG + ZIP artifacts with hardened runtime entitlements and notarization-ready environment variables.
- Canvas-based heatmaps: Intensity map with additive blending → colormap, producing smooth full-coverage Clarity-style heatmaps.
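The file-path boundary above could look roughly like the following. This is a sketch under stated assumptions: `normalizeChangedFile` is a hypothetical helper, paths are handled as plain strings, and symlinks are not resolved, which a real implementation would probably also need to consider.

```typescript
// Turn a model-reported path into a normalized repo-relative POSIX path,
// or return null when the path is unsafe. Hypothetical helper.
function normalizeChangedFile(projectRoot: string, reported: string): string | null {
  const root = projectRoot.replace(/\\/g, "/").replace(/\/+$/, "");
  let candidate = reported.replace(/\\/g, "/");
  if (candidate.includes("\0")) return null;

  if (candidate.startsWith(root + "/")) {
    // Absolute path inside the project: strip the root prefix.
    candidate = candidate.slice(root.length + 1);
  } else if (candidate.startsWith("/") || /^[A-Za-z]:/.test(candidate)) {
    // Any other absolute path (POSIX or Windows drive letter) is rejected.
    return null;
  }

  const parts: string[] = [];
  for (const segment of candidate.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Climbing above the project root is never allowed.
      if (parts.length === 0) return null;
      parts.pop();
      continue;
    }
    parts.push(segment);
  }
  return parts.length > 0 ? parts.join("/") : null;
}
```

Rejecting with `null` rather than silently repairing a path keeps the decision visible to the caller, which can then skip rendering or PR creation for that change set.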
npm run test:e2e

Runs a full Electron E2E test with mocked Claude responses. Validates:
- Project add/open/remove
- Full experiment workflow (choose page → goal → treatments → renders → personas → results)
- Aggregate scorecard, issue detector, and diff explainer UI
- Persona generation and custom persona creation
- Settings save/clear and key-storage mode handling
E2E_CLAUDE_API_KEY=sk-ant-... npm run test:e2e:live

Uses a real Claude API key to validate non-mock outputs against a multi-route fixture app.
Open Settings from the sidebar to configure:
| Setting | Description |
|---|---|
| Claude API Key | Anthropic API key (or leave empty for SDK local auth) |
| Personas model | Model for synthetic persona generation |
| Suggestions model | Model for test idea generation |
| Implementation model | Model for autonomous code changes |
| Attention model | Model for attention simulation |
Each can be set to Default (inherit), Sonnet, Opus, or Haiku.
`npm run package` and `npm run dist` now choose between two signing modes automatically:
- If a `Developer ID Application` identity or `CSC_*` signing environment is available, Electron Builder uses it.
- Otherwise the build falls back to ad-hoc signing so the generated `.app` still launches correctly on the local machine instead of picking an arbitrary non-Apple certificate.
For public distribution, configure notarization and Developer ID signing.
Create an `electron-builder.env` file from `electron-builder.env.example` and set:
- `APPLE_API_KEY_ID`
- `APPLE_API_ISSUER`
- `APPLE_API_KEY`
- `APPLE_TEAM_ID` (recommended)

For CI or machines without the signing cert in Keychain, also set:
- `CSC_LINK`
- `CSC_KEY_PASSWORD`

Then run:

npm run dist

The release output lands in `release/` and includes a `SHA256SUMS.txt` manifest.
The repo includes a mac-only workflow at .github/workflows/release-mac.yml. On v* tags it:
- validates the required signing/notarization secrets
- builds signed DMG + ZIP artifacts
- uploads them as workflow artifacts
- attaches them to the GitHub release
| Script | Description |
|---|---|
| `npm run dev` | Start Vite + Electron in development mode |
| `npm run dev:renderer` | Start Vite dev server only (for browser-only dev) |
| `npm run build` | Build the renderer for production |
| `npm run package` | Build the renderer and create an unpacked macOS app bundle with automatic Developer ID vs ad-hoc signing selection |
| `npm run dist` | Build macOS release artifacts and checksums with automatic Developer ID vs ad-hoc signing selection |
| `npm run typecheck` | Run TypeScript type checking |
| `npm run test:e2e` | Run E2E tests with mock Claude |
| `npm run test:e2e:live` | Run E2E tests with real Claude API |
| Layer | Technology |
|---|---|
| Desktop shell | Electron 37 (macOS distribution) |
| Frontend | React 19, TypeScript 5.9, Tailwind CSS 3 |
| Build | Vite 7 |
| AI | Claude Agent SDK (@anthropic-ai/claude-agent-sdk) |
| Browser automation | Playwright 1.53 |
| UI primitives | Radix UI (Dialog, Tabs, Checkbox) |
| Icons | Lucide React |
All state is stored locally in Electron's user data directory under mutatr-app/:
- `state.json`: projects, experiments, personas, settings metadata
- `claudeApiKeyEncrypted` inside `state.json` when secure OS storage is available; otherwise the key stays in memory for the current session only
- `images/`: rendered screenshots and heatmap PNGs
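The key-storage rule above can be sketched as follows. The `SecureStorage` interface is a hypothetical stand-in loosely modeled on Electron's `safeStorage` (which provides `isEncryptionAvailable`, `encryptString`, and `decryptString`); `ApiKeyVault`, `PersistedState`, and the hex encoding are assumptions made for illustration, not Mutatr's actual code.

```typescript
// Hypothetical abstraction over OS-backed encryption.
interface SecureStorage {
  isEncryptionAvailable(): boolean;
  encryptString(plainText: string): Uint8Array;
  decryptString(encrypted: Uint8Array): string;
}

// Only the field relevant to this sketch; the real state.json holds more.
type PersistedState = { claudeApiKeyEncrypted?: string };

function toHex(bytes: Uint8Array): string {
  let hex = "";
  for (let i = 0; i < bytes.length; i++) hex += ("0" + bytes[i].toString(16)).slice(-2);
  return hex;
}

function fromHex(hex: string): Uint8Array {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  return bytes;
}

class ApiKeyVault {
  private storage: SecureStorage;
  private state: PersistedState;
  private sessionKey: string | null = null;

  constructor(storage: SecureStorage, state: PersistedState) {
    this.storage = storage;
    this.state = state;
  }

  // Persist the key encrypted when possible; otherwise hold it in memory only.
  save(apiKey: string): "persisted" | "session-only" {
    if (this.storage.isEncryptionAvailable()) {
      this.state.claudeApiKeyEncrypted = toHex(this.storage.encryptString(apiKey));
      this.sessionKey = null;
      return "persisted";
    }
    delete this.state.claudeApiKeyEncrypted;
    this.sessionKey = apiKey;
    return "session-only";
  }

  // Prefer the persisted key; fall back to the in-memory session key.
  load(): string | null {
    const stored = this.state.claudeApiKeyEncrypted;
    if (stored && this.storage.isEncryptionAvailable()) {
      return this.storage.decryptString(fromHex(stored));
    }
    return this.sessionKey;
  }
}
```

The point of the design is that a plaintext key never reaches disk: when encryption is unavailable, nothing is written and the key is lost when the app quits.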
Contributions are welcome! Please open an issue first to discuss what you'd like to change.
- Fork the repo
- Create your feature branch (`git checkout -b feat/my-feature`)
- Commit your changes
- Push to the branch
- Open a pull request
Built by Novyn Labs