#ollama #coding-assistant #pair-programming #tui #ai-coding

bin+lib mermaid-cli

Open-source AI pair programmer with agentic capabilities. Local-first with Ollama, native tool calling, and beautiful TUI.

19 releases (5 breaking)

Uses new Rust 2024

0.7.1 Apr 26, 2026
0.5.1 Apr 12, 2026
0.5.0 Mar 30, 2026
0.2.2 Dec 19, 2025
0.2.0 Nov 17, 2025

#69 in Artificial intelligence

MIT/Apache

1.5MB
26K SLoC

Mermaid

An open-source AI coding assistant with computer use for the terminal. Multi-provider — Ollama (local), Anthropic, Gemini, OpenAI, Groq, OpenRouter, and any OpenAI-compatible endpoint — with native tool calling, subagents, desktop control, and a clean TUI.

Features

  • Multi-Provider — Ollama (local/cloud), Anthropic Claude, Google Gemini, OpenAI, Groq, OpenRouter, Cerebras, DeepInfra, Together, plus fully custom OpenAI-compatible endpoints
  • Native Tool Calling — read, write, edit, execute commands, search the web, manage MCP servers
  • Computer Use — screenshot, click, type, scroll — full desktop control via vision models
  • Subagents — spawn parallel autonomous agents for independent tasks
  • Agent Loop — the model calls tools autonomously, sees the results, and continues until done (see the sketch after this list)
  • Image Paste — Ctrl+V to attach images for vision models (X11/Wayland/macOS/Windows)
  • Reasoning Levels — seven tiers (none/minimal/low/medium/high/xhigh/max); cycle with Alt+T or set via /reasoning; persisted per-model
  • MERMAID.md — auto-loaded project-level instructions; edits take effect on the next turn
  • MCP Servers — stdio JSON-RPC client with a built-in registry of 16 popular servers (mermaid add <name>)
  • Session Persistence — conversations auto-save and resume with --continue
  • Message Queuing — type while the model generates, messages send in order
  • Non-Interactive Mode — script with mermaid run "prompt" for CI/automation
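
For a concrete sense of the agent loop, here is a minimal, hedged sketch. The Message and ModelReply types and the call_model/run_tool stubs are illustrative stand-ins, not Mermaid's actual API.

// Sketch of an agentic tool loop. All names are illustrative.
struct Message { role: &'static str, content: String }

struct ToolCall { name: String, args: String }

enum ModelReply {
    Text(String),             // final answer: the turn is done
    ToolCalls(Vec<ToolCall>), // the model wants tool results first
}

// Stub: a real implementation would call the provider's API.
fn call_model(history: &[Message]) -> ModelReply {
    if history.iter().any(|m| m.role == "tool") {
        ModelReply::Text("done".into())
    } else {
        ModelReply::ToolCalls(vec![ToolCall {
            name: "read_file".into(),
            args: r#"{"path": "main.rs"}"#.into(),
        }])
    }
}

// Stub: a real implementation would dispatch to the tool table.
fn run_tool(name: &str, args: &str) -> String {
    format!("<output of {name}({args})>")
}

// The loop itself: call the model, run any requested tools, feed the
// results back, and repeat until the model produces plain text.
fn agent_turn(mut history: Vec<Message>) -> String {
    loop {
        match call_model(&history) {
            ModelReply::Text(answer) => return answer,
            ModelReply::ToolCalls(calls) => {
                for call in calls {
                    let result = run_tool(&call.name, &call.args);
                    history.push(Message { role: "tool", content: result });
                }
            }
        }
    }
}

fn main() {
    let history = vec![Message { role: "user", content: "fix the tests".into() }];
    println!("{}", agent_turn(history));
}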

Architecture

Mermaid's runtime follows the Elm/MVU pattern: one pure reducer (fn update(State, Msg) -> (State, Vec<Cmd>)), effects expressed as data, and structured concurrency per turn. Whole classes of bugs that the old architecture let slip — duplicate error displays, Ctrl+C taking 20 presses to register during tool execution, stale stream events corrupting a new turn — are statically impossible under the new types.
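
In code, that shape looks roughly like this; the State, Msg, and Cmd variants below are illustrative, not Mermaid's real definitions.

// Sketch of the MVU reducer shape. Variants are illustrative.
#[derive(Default)]
struct State {
    generating: bool,
    transcript: Vec<String>,
}

enum Msg {
    UserSubmitted(String),
    StreamChunk(String),
    StreamDone,
    CtrlC,
}

// Effects are data: the runtime interprets Cmds; the reducer never does IO.
enum Cmd {
    StartStream(String),
    CancelStream,
    Quit,
}

fn update(mut state: State, msg: Msg) -> (State, Vec<Cmd>) {
    match msg {
        Msg::UserSubmitted(prompt) => {
            state.generating = true;
            state.transcript.push(format!("you: {prompt}"));
            (state, vec![Cmd::StartStream(prompt)])
        }
        Msg::StreamChunk(chunk) if state.generating => {
            state.transcript.push(chunk);
            (state, vec![])
        }
        // A chunk arriving after the stream ended is dropped, not rendered.
        Msg::StreamChunk(_) => (state, vec![]),
        Msg::StreamDone => {
            state.generating = false;
            (state, vec![])
        }
        // One Ctrl+C cancels a running turn; a Ctrl+C at rest quits.
        Msg::CtrlC if state.generating => (state, vec![Cmd::CancelStream]),
        Msg::CtrlC => (state, vec![Cmd::Quit]),
    }
}

fn main() {
    let (state, cmds) = update(State::default(), Msg::UserSubmitted("hi".into()));
    assert!(state.generating);
    assert!(matches!(cmds[0], Cmd::StartStream(_)));
}

Because every effect leaves the reducer as a Cmd value, tests can assert on intended effects without performing any IO.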

See docs/architecture.md for the full tour. The "adding a tool" and "adding a provider" recipes each fit in a single file; docs/replay_debugging.md covers record/replay for reproducing bugs.

Quick Start

# Install from crates.io
cargo install mermaid-cli

# Or from source
git clone https://github.com/noahsabaj/mermaid-cli.git
cd mermaid-cli
cargo install --path .

Local inference requires Ollama (models auto-pull if not found locally). Cloud providers are optional — see Remote Providers below.

Computer Use Dependencies (optional)

For desktop control via screenshot/click/type tools:

# Linux / X11
sudo apt install scrot xdotool

# Linux / Wayland
sudo apt install grim ydotool wtype

# Screenshot downscaling (optional, for high-res displays)
sudo apt install imagemagick

macOS and Windows are supported through screencapture/pngpaste and PowerShell, respectively. See src/providers/tool/computer_use/ for the full platform matrix.

Usage

mermaid                                         # Start fresh session
mermaid --continue                              # Resume last session
mermaid --sessions                              # Pick a previous session to resume
mermaid --model ollama/qwen3-coder:30b          # Ollama local
mermaid --model anthropic/claude-opus-4-7       # Anthropic (requires ANTHROPIC_API_KEY)
mermaid --model gemini/gemini-3.1-pro-preview   # Gemini (requires GOOGLE_API_KEY)
mermaid --model openai/gpt-5                    # OpenAI (requires OPENAI_API_KEY)
mermaid --model groq/qwen-qwq-32b               # Groq (requires GROQ_API_KEY)
mermaid --reasoning high                        # Override default reasoning depth
mermaid list                                    # List available models across providers
mermaid status                                  # Check Ollama, MCP, and provider config
mermaid init                                    # Create default config file
mermaid run "fix the tests"                     # Non-interactive mode
mermaid run "explain main.rs" -f json           # JSON output
mermaid add <name>                              # Add an MCP server (e.g., context7, git)
mermaid remove <name>                           # Remove a configured MCP server
mermaid mcp                                     # List configured MCP servers

mermaid add <name> resolves the name through a built-in registry of 16 popular MCP servers (context7, playwright, memory, git, fetch, time, filesystem, notion, slack, postgres, brave-search, supabase, perplexity, docker, sequential-thinking, everything). It then prompts for any required env vars, validates the entry by spawning the server, and saves it to ~/.config/mermaid/config.toml.
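
A hedged sketch of that flow. The RegistryEntry type and the validate step are assumptions (a real check would also complete the MCP initialize handshake over stdio); only the context7 command line is taken from the config example in the Configuration section.

// Illustrative sketch of the `mermaid add` flow, not the real code.
use std::process::{Command, Stdio};

struct RegistryEntry {
    command: &'static str,
    args: &'static [&'static str],
    required_env: &'static [&'static str],
}

// Resolve a name through the built-in registry (one entry shown).
fn lookup(name: &str) -> Option<RegistryEntry> {
    match name {
        "context7" => Some(RegistryEntry {
            command: "npx",
            args: &["-y", "@upstash/context7-mcp"],
            required_env: &[],
        }),
        _ => None,
    }
}

// Validation here is just "the server starts"; the real flow does more.
fn validate(entry: &RegistryEntry) -> std::io::Result<()> {
    let mut child = Command::new(entry.command)
        .args(entry.args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    child.kill()
}

fn main() -> std::io::Result<()> {
    let entry = lookup("context7").expect("not in registry");
    for var in entry.required_env {
        // A real flow would prompt the user; here we just check presence.
        assert!(std::env::var(var).is_ok(), "missing env var {var}");
    }
    validate(&entry)?;
    // On success, `mermaid add` writes the entry to
    // ~/.config/mermaid/config.toml (see the Configuration section).
    Ok(())
}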

Keyboard Shortcuts

Key           Action
Enter         Send message (or queue while the model is generating)
Esc           Stop generation / clear input / dismiss command palette
Ctrl+C        Quit (auto-saves the session)
Alt+T         Cycle reasoning level: None → Low → Medium → High → Max → None
Ctrl+V        Paste image or text from clipboard
Ctrl+Click    Open image from chat history
/             Open slash-command palette (filter-as-you-type)
Tab           In palette: complete the highlighted command name
Up/Down       Navigate input history; palette navigation; scroll chat
Page Up/Down  Scroll chat
Mouse Wheel   Scroll chat

Slash Commands

Type / to open the command palette (all commands, with a live filter); type /<name> to invoke one directly.

Command             Description
/model <name>       Switch model; auto-pulls Ollama models if needed
/reasoning <level>  Set reasoning: none, minimal, low, medium, high, xhigh, max
/clear              Clear chat history and model context for this session
/save [name]        Save the current conversation
/load [id]          Load a saved conversation by id
/list               List saved conversations
/cloud-setup        Show Ollama Cloud API-key setup instructions
/help (/h)          Show all commands
/quit (/q)          Exit

Reasoning choices persist per-model: setting /reasoning high on Claude Opus 4.7 and /reasoning low on Ollama is remembered across sessions.

Tools

The model uses these autonomously via native tool calling:

Tool              Description
read_file         Read files (text, PDF, images)
write_file        Create or overwrite files (timestamped backup if the file exists)
edit_file         Targeted text replacement with diff
delete_file       Delete files (timestamped backup)
create_directory  Create directories
execute_command   Run shell commands; background mode registers PID/log/URL metadata for GUI apps and dev servers
web_search        Search the web (Ollama Cloud)
web_fetch         Fetch URL content as markdown (Ollama Cloud)
agent             Spawn an autonomous sub-agent for parallel tasks
screenshot        Capture the screen (fullscreen, focused window, monitor, region, or window by title)
list_windows      List visible window titles (discovery for window-mode screenshots)
click             Click at screen coordinates (auto-screenshot after)
type_text         Type text at the cursor position (auto-screenshot after)
press_key         Press key combos (ctrl+s, alt+tab, etc.)
scroll            Scroll up or down
mouse_move        Move the mouse cursor without clicking

MCP servers contribute additional tools under the mcp__<server>__<tool> prefix when configured.
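
The convention is mechanical enough to sketch; the two helpers and the example tool name below are hypothetical, not Mermaid's code.

// Hypothetical helpers for the mcp__<server>__<tool> convention.
fn qualify(server: &str, tool: &str) -> String {
    format!("mcp__{server}__{tool}")
}

fn split(qualified: &str) -> Option<(&str, &str)> {
    // "mcp__context7__search" -> ("context7", "search")
    qualified.strip_prefix("mcp__")?.split_once("__")
}

fn main() {
    assert_eq!(qualify("context7", "search"), "mcp__context7__search");
    assert_eq!(split("mcp__context7__search"), Some(("context7", "search")));
}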

Project Instructions (MERMAID.md)

Create a MERMAID.md at your project root with conventions, tool versions, naming patterns, and run commands — Mermaid loads it automatically at session start and reloads it when the file changes (one stat per turn, no filesystem watcher). The upward search for the file stops at the .git root or at $HOME; a sketch of the lookup follows the example below.

# Project: foo-service

## Conventions
- snake_case for functions, PascalCase for types
- No `unwrap()` outside of tests
- Run `cargo nextest run` for tests (not `cargo test`)

## Build
- `just dev` — dev server on :8080

File size is capped at ~10k tokens; oversized content is truncated with a marker so the model knows context was elided.
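
A hedged sketch of that discovery and reload behavior, assuming a Unix-style $HOME; the function names are illustrative, not Mermaid's implementation.

// Sketch: find MERMAID.md by walking upward, reload on mtime change.
use std::path::{Path, PathBuf};
use std::time::SystemTime;

// Walk up from the working directory; stop at a .git root or at $HOME.
fn find_mermaid_md(start: &Path) -> Option<PathBuf> {
    let home = std::env::var_os("HOME").map(PathBuf::from);
    let mut dir = start.to_path_buf();
    loop {
        let candidate = dir.join("MERMAID.md");
        if candidate.is_file() {
            return Some(candidate);
        }
        let at_boundary = dir.join(".git").exists() || home.as_deref() == Some(dir.as_path());
        if at_boundary || !dir.pop() {
            return None;
        }
    }
}

// One stat per turn: re-read only when the mtime moved. No watcher thread.
fn reload_if_changed(path: &Path, last_mtime: &mut Option<SystemTime>) -> Option<String> {
    let mtime = std::fs::metadata(path).ok()?.modified().ok()?;
    if Some(mtime) != *last_mtime {
        *last_mtime = Some(mtime);
        return std::fs::read_to_string(path).ok(); // the ~10k-token cap is applied elsewhere
    }
    None
}

fn main() {
    let cwd = std::env::current_dir().expect("cwd");
    let mut last = None;
    if let Some(path) = find_mermaid_md(&cwd) {
        if let Some(text) = reload_if_changed(&path, &mut last) {
            println!("loaded {} bytes from {}", text.len(), path.display());
        }
    }
}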

Configuration

Config file: ~/.config/mermaid/config.toml (Linux) or platform equivalent via directories crate.

Run mermaid init to create a default config. Full surface:

# Last model picked via `--model` — used by bare `mermaid` on next start
last_used_model = "ollama/qwen3-coder:30b"

[default_model]
provider = "ollama"
name = "qwen3-coder:30b"
temperature = 0.7
max_tokens = 4096
reasoning = "medium"  # none | minimal | low | medium | high | xhigh | max

[ollama]
host = "localhost"
port = 11434
# cloud_api_key = "your-key"  # for :cloud models + web_search/web_fetch
# num_gpu = 10
# num_ctx = 8192

[non_interactive]
output_format = "text"
max_tokens = 4096
no_execute = false

# Per-model reasoning preferences (remembered across sessions)
[reasoning_per_model]
"anthropic/claude-opus-4-7" = "high"
"ollama/qwen3-coder:30b" = "low"

# Remote providers — override env-var name, base URL, or extra headers
[providers.anthropic]
# api_key_env = "MY_ANTHROPIC_KEY"  # default: ANTHROPIC_API_KEY

[providers.gemini]
# api_key_env = "MY_GOOGLE_KEY"  # default: GOOGLE_API_KEY; GEMINI_API_KEY is accepted as a legacy fallback

[providers.groq]
# api_key_env = "MY_GROQ_KEY"    # default: GROQ_API_KEY

# Custom OpenAI-compatible provider (e.g., self-hosted vLLM)
[providers.my-vllm]
base_url = "http://192.168.1.42:8000/v1"
api_key_env = "VLLM_KEY"
compat = "openai-effort"   # openai | openai-effort | openrouter
# default_model = "Qwen/Qwen2.5-Coder-32B-Instruct"

# MCP servers — usually managed via `mermaid add <name>`
[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]

Remote Providers

Set the appropriate environment variable (or override via [providers.<name>].api_key_env in config):

Provider       Env var             Example model
Anthropic      ANTHROPIC_API_KEY   anthropic/claude-opus-4-7
Google Gemini  GOOGLE_API_KEY *    gemini/gemini-3.1-pro-preview
OpenAI         OPENAI_API_KEY      openai/gpt-5
Groq           GROQ_API_KEY        groq/qwen-qwq-32b
OpenRouter     OPENROUTER_API_KEY  openrouter/anthropic/claude-3.7-sonnet
Cerebras       CEREBRAS_API_KEY    cerebras/gpt-oss-120b
DeepInfra      DEEPINFRA_API_KEY   deepinfra/deepseek-ai/DeepSeek-R1
Together       TOGETHER_API_KEY    together/deepseek-ai/DeepSeek-R1
Ollama Cloud   OLLAMA_API_KEY      ollama/kimi-k2-thinking:cloud

* GEMINI_API_KEY is accepted as a legacy fallback for Gemini.

Web search and web fetch tools require an Ollama Cloud API key — set OLLAMA_API_KEY or cloud_api_key under [ollama]. Use /cloud-setup in the TUI for the full instructions.
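
The resolution order, as a hedged sketch: config override first, then the provider's default env var, with Gemini's legacy fallback. The function is illustrative and the remaining providers are elided.

// Illustrative key resolution, not the real config code.
use std::collections::HashMap;

fn resolve_api_key(provider: &str, overrides: &HashMap<String, String>) -> Option<String> {
    // [providers.<name>].api_key_env replaces the variable name entirely.
    if let Some(var) = overrides.get(provider) {
        return std::env::var(var).ok();
    }
    let default = match provider {
        "anthropic" => "ANTHROPIC_API_KEY",
        "gemini" => "GOOGLE_API_KEY",
        "openai" => "OPENAI_API_KEY",
        "groq" => "GROQ_API_KEY",
        _ => return None, // remaining providers elided
    };
    std::env::var(default).ok().or_else(|| {
        // GEMINI_API_KEY is accepted as a legacy fallback.
        (provider == "gemini")
            .then(|| std::env::var("GEMINI_API_KEY").ok())
            .flatten()
    })
}

fn main() {
    let overrides = HashMap::new(); // no [providers.*] overrides configured
    println!("{:?}", resolve_api_key("anthropic", &overrides).is_some());
}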

License

MIT OR Apache-2.0

Built with Ratatui and Ollama. Inspired by Aider and Claude Code.

Dependencies

~50–74MB
~1.5M SLoC