Multi-LLM orchestration with agentic execution, memory, and training data generation.
Install • Quick Start • Features • Config
Beyond CLI wrappers. PuzldAI is a complete AI orchestration framework — route tasks, explore codebases, execute file edits, build memory, and generate training data.
PuzldAI is a terminal-native framework for orchestrating multiple AI agents. Route tasks to the best agent, compare responses, chain agents in pipelines, or let them collaborate. Agentic Mode gives LLMs tools to explore your codebase (view, glob, grep, bash) and propose file edits with permission prompts — like Claude Code, but for any LLM. Memory/RAG stores decisions and code for future context. Observation Layer logs everything for DPO fine-tuning. One framework that grows with your AI workflow.
| Problem | Solution |
|---|---|
| Claude is great at code, Gemini at research | Auto-routing picks the best agent |
| Need specific model versions | Model selection — pick sonnet, opus, haiku, etc. |
| Want multiple opinions | Compare mode runs all agents in parallel |
| Complex tasks need multiple steps | Pipelines chain agents together |
| Repetitive workflows | Workflows save and reuse pipelines |
| Need agents to review each other | Collaboration — correct, debate, consensus |
| Want LLM to explore & edit files safely | Agentic mode — tools, permission prompts, apply |
| Context gets lost between sessions | Memory/RAG — semantic retrieval of past decisions |
| Need data to fine-tune models | Observations — export DPO training pairs |
| Need AI to understand your codebase | Indexing — AST parsing, semantic search, AGENTS.md |
- Auto-routing — Ask anything. The right agent answers.
- Model Selection — Pick specific models per agent (sonnet, opus, haiku, etc.)
- Compare — Same question, multiple agents, side-by-side.
- Pipelines — Chain agents on-the-fly: `gemini:analyze → claude:code` (CLI)
- Workflows — Save pipelines as templates, run anywhere (TUI & CLI)
- Autopilot — Describe the goal. AI builds the plan.
- Multi-Agent Collaboration — Correct, debate, and build consensus across agents.
- Agentic Mode — LLMs explore your codebase, propose edits, you approve with permission prompts.
- Codebase Indexing — AST parsing, semantic search, AGENTS.md support.
- Memory/RAG — Semantic retrieval injects relevant context into prompts.
- Observation Layer — Logs all interactions for training data generation.
- Sessions — Persist chat history, resume conversations.
- TUI — Full terminal UI with autocomplete, history, keyboard nav.
| Agent | Source | Requirement | Agentic Mode |
|---|---|---|---|
| Claude | Anthropic | Claude CLI | ✅ Full support |
| Gemini | Google | Gemini CLI | Partial (see note) |
| Codex | OpenAI | Codex CLI | Partial (see note) |
| Ollama | Local | Ollama running | ✅ Full support |
| Mistral | Mistral AI | Vibe CLI | |
Note: Some CLIs (Gemini, Codex) have built-in file reading that bypasses permission prompts. Claude and Ollama respect the permission system fully.
```
npm install -g puzldai
```

Or try without installing:

```
npx puzldai
```

Update:

```
npm update -g puzldai
```

```
# Interactive TUI
puzldai
# Single task
puzldai run "explain recursion"
# Compare agents
puzldai compare claude,gemini "best error handling practices"
# Pipeline: analyze → code → review
puzldai run "build a logger" -P "gemini:analyze,claude:code,gemini:review"
# Multi-agent collaboration
puzldai correct "write a sort function" --producer claude --reviewer gemini
puzldai debate "microservices vs monolith" -a claude,gemini
puzldai consensus "best database choice" -a claude,gemini,ollama
# Check what's available
puzldai check
```

`puzld` also works as a shorter alias.
| Mode | Pattern | Use Case | Category |
|---|---|---|---|
| Single | One agent processes task | Quick questions, simple tasks | Basic |
| Compare | Same task → multiple agents in parallel | See different perspectives | Parallel |
| Pipeline | Agent A → Agent B → Agent C | Multi-step processing | Sequencing |
| Workflow | Saved pipeline, reusable | Repeatable workflows | Sequencing |
| Autopilot | LLM generates plan → executes | Complex tasks, unknown steps | AI Planning |
| Correct | Producer → Reviewer → Fix | Quality assurance, code review | Collaboration |
| Debate | Agents argue in rounds, optional moderator | Find flaws in reasoning | Collaboration |
| Consensus | Propose → Vote → Synthesize | High-confidence answers | Collaboration |
| Agentic | LLM explores → Tools → Permission prompts → Apply | Codebase exploration + file edits | Execution |
| Plan | LLM analyzes task → Describes approach | Planning before implementation | Execution |
| Build | LLM explores + edits with full tool access | Direct implementation with tools | Execution |
| Mode | Option | Type | Default | Description |
|---|---|---|---|---|
| Single | `agent` | AgentName | `auto` | Which agent to use |
| | `model` | string | — | Override model (e.g., sonnet, opus) |
| Compare | `agents` | AgentName[] | — | Agents to compare (min 2) |
| | `sequential` | boolean | `false` | Run one-at-a-time vs parallel |
| | `pick` | boolean | `false` | LLM selects best response |
| Pipeline | `steps` | PipelineStep[] | — | Sequence of agent:action |
| | `interactive` | boolean | `false` | Confirm between steps |
| Workflow | `name` | string | — | Workflow to load |
| | `interactive` | boolean | `false` | Confirm between steps |
| Autopilot | `planner` | AgentName | `ollama` | Agent that generates plan |
| | `execute` | boolean | `false` | Auto-run generated plan |
| Correct | `producer` | AgentName | `auto` | Agent that creates output |
| | `reviewer` | AgentName | `auto` | Agent that critiques |
| | `fixAfterReview` | boolean | `false` | Producer fixes based on review |
| Debate | `agents` | AgentName[] | — | Debating agents (min 2) |
| | `rounds` | number | `2` | Number of debate rounds |
| | `moderator` | AgentName | `none` | Synthesizes final conclusion |
| Consensus | `agents` | AgentName[] | — | Participating agents (min 2) |
| | `maxRounds` | number | `2` | Voting rounds |
| | `synthesizer` | AgentName | `auto` | Creates final output |
| Agentic | `agent` | AgentName | `claude` | Agent to use for exploration |
| | `tools` | string[] | all | Available tools (view, glob, grep, bash, write, edit) |
| Plan | `agent` | AgentName | `claude` | Agent to analyze task |
| Build | `agent` | AgentName | `claude` | Agent to implement |
Pick specific models for each agent. Aliases like `sonnet`, `opus`, and `haiku` always point to the latest version. Specific versions like `claude-sonnet-4-20250514` are pinned.

```
# TUI
/model # Open model selection panel
# CLI
puzldai model show # Show current models for all agents
puzldai model list # List all available models
puzldai model list claude # List models for specific agent
puzldai model set claude opus # Set model for an agent
puzldai model clear claude # Reset to CLI default
# Per-task override
puzldai run "task" -m opus # Override model for this run
puzldai agent -a claude -m haiku   # Interactive mode with specific model
```

Run the same prompt on multiple agents and compare results side-by-side.
Three views: side-by-side, expanded, or stacked.
```
# TUI
/compare claude,gemini "explain async/await"
/sequential # Toggle: run one-at-a-time
/pick # Toggle: select best response
# CLI
puzldai compare "task" # Default: claude,gemini
puzldai compare "task" -a claude,gemini,codex # Specify agents
puzldai compare "task" -s # Sequential mode
puzldai compare "task" -p # Pick best responseChain multiple agents together for complex tasks. Each agent handles a specific step.
puzldai run "build a REST API" -P "gemini:analyze,claude:code,gemini:review"
puzldai run "task" -P "claude:plan,codex:code" -i # Interactive: pause between stepsSave pipelines as reusable templates. Run them anywhere with a single command.
Three views: side-by-side, expanded, or stacked.
```
# TUI
/workflow code-review "my code here"
/workflows # Manage templates (interactive)
/interactive # Toggle: pause between steps
# CLI
puzldai run "task" -T code-review
puzldai run "task" -T code-review -i # Interactive mode
puzldai template list # List all templates
puzldai template show my-flow # Show template details
puzldai template create my-flow -P "claude:plan,codex:code"
puzldai template delete my-flow                 # Delete template
```

Describe the goal. AI analyzes the task, builds a multi-step plan, and executes it automatically using the best agents for each step.
With /execute enabled, results display in 3 view modes: side-by-side, expanded, or stacked.
```
# TUI
/autopilot "build a todo app with authentication"
/planner claude # Set planner agent
/execute # Toggle auto-execution on/off
# CLI
puzldai autopilot "task" # Generate plan only
puzldai autopilot "task" -x # Generate and execute
puzldai autopilot "task" -p claude # Use specific agent as plannerGet multiple agents to work together through correction, debate, or consensus.
One agent produces, another reviews. Optionally fix based on feedback.
```
# TUI
/correct claude gemini "write a sorting algorithm"
# CLI
puzldai correct "task" --producer claude --reviewer gemini
puzldai correct "task" --producer claude --reviewer gemini --fixAgents debate a topic across multiple rounds. Optional moderator summarizes.
```
# TUI
/debate claude,gemini "Is functional programming better than OOP?"
# CLI
puzldai debate "topic" -a claude,gemini
puzldai debate "topic" -a claude,gemini -r 3 -m ollama # 3 rounds + moderatorAgents propose solutions, vote on them, and synthesize a final answer.
```
# TUI
/consensus claude,gemini,ollama "best database for this use case"
# CLI
puzldai consensus "task" -a claude,gemini,ollama
puzldai consensus "task" -a claude,gemini -r 3 -s claude # 3 rounds + synthesizerAll collaboration modes support 3 view modes: side-by-side, expanded, and stacked.
Configure rounds, moderator, and synthesizer in `/settings`.
LLMs explore your codebase using tools, then propose file edits with permission prompts (like Claude Code). PuzldAI acts as the execution layer — the LLM explores and proposes, you approve what gets executed.
```
# TUI - Use @agent syntax to trigger agentic mode
@claude fix the bug in src/utils.ts
@gemini add error handling to api/routes.ts
@ollama create a hello world script
# Or use commands
/plan @claude refactor the auth system # Plan only (no execution)
/build @claude implement the login form    # Full implementation with tools
```

Tools available to LLM:
| Tool | Description |
|---|---|
| `view` | Read file contents with line numbers |
| `glob` | Find files by pattern (e.g., `**/*.ts`) |
| `grep` | Search file contents with regex |
| `bash` | Execute shell commands |
| `write` | Create or overwrite files |
| `edit` | Search and replace in files |
Permission prompts (like Claude Code):
- `Allow` — Execute this tool call
- `Allow from directory` — Auto-approve reads from this directory
- `Allow all reads` — Auto-approve all file reads
- `Deny` — Skip this tool call
- `Esc` — Cancel entire operation
Live tool activity:
- Colored status dots: ● green (done), yellow (running), red (error), gray (pending)
- Tree-style result display with truncation
- `Ctrl+S` to expand/collapse full output
How it works (see the sketch after this list):
- You describe the task with `@agent`
- The LLM explores the codebase using tools (`view`, `glob`, `grep`)
- Each tool call shows a permission prompt
- The LLM proposes file edits (`write`, `edit`)
- You approve or deny each change
- PuzldAI applies approved changes
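The loop can be summarized in a short sketch. This is a conceptual illustration only — the type names, helper functions, and control flow below are hypothetical stand-ins, not PuzldAI's actual internals.

```ts
// Conceptual sketch of the agentic loop: explore with permission-gated tools,
// then apply approved edits. All names here are hypothetical.
type ToolName = 'view' | 'glob' | 'grep' | 'bash' | 'write' | 'edit';

interface ToolCall { tool: ToolName; args: Record<string, string> }
interface FileEdit { path: string; content: string }
interface AgentTurn { toolCalls: ToolCall[]; proposedEdits: FileEdit[]; done: boolean }

// Placeholder stubs standing in for the LLM adapter, permission UI, and tool runner.
async function callAgent(task: string, history: string[]): Promise<AgentTurn> {
  return { toolCalls: [], proposedEdits: [], done: true };
}
async function askPermission(call: ToolCall): Promise<'allow' | 'deny'> {
  return 'allow';
}
async function runTool(call: ToolCall): Promise<string> {
  return `(result of ${call.tool})`;
}
async function applyEdit(edit: FileEdit): Promise<void> {
  console.log(`applied ${edit.path}`);
}

async function agenticRun(task: string): Promise<void> {
  const history: string[] = [];
  let turn = await callAgent(task, history);

  // Exploration phase: every tool call is gated by a permission prompt.
  while (!turn.done) {
    for (const call of turn.toolCalls) {
      if ((await askPermission(call)) === 'allow') {
        history.push(await runTool(call)); // result is fed back to the LLM
      } else {
        history.push(`(${call.tool} denied by user)`);
      }
    }
    turn = await callAgent(task, history);
  }

  // Apply phase: proposed edits still require explicit approval before writing.
  for (const edit of turn.proposedEdits) {
    if ((await askPermission({ tool: 'write', args: { path: edit.path } })) === 'allow') {
      await applyEdit(edit);
    }
  }
}

agenticRun('fix the bug in src/utils.ts');
```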
Consensus → Agentic workflow: Run consensus first, then continue with an agent. The consensus result is automatically injected as context:
```
/consensus claude,gemini "best approach for auth"
# Choose "Continue"
@claude implement this    # Has consensus context
```

PuzldAI includes a memory system that stores conversations, decisions, and code patterns for future retrieval.
Memory types:
- `conversation` — Q&A pairs from sessions
- `decision` — Accepted file edits and explanations
- `code` — Code snippets and patterns
- `pattern` — Reusable solutions
How it works:
- Observations from `/agentic` are automatically saved to memory
- When you accept file edits, the decision is stored for future context
- Semantic search retrieves relevant memories for new prompts
- Uses SQLite FTS5 (zero dependencies) or Ollama embeddings when available (see the sketch below)
Embedding models (auto-detected):
- `nomic-embed-text` (recommended)
- `mxbai-embed-large`
- `all-minilm`
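To make the FTS5 fallback concrete, here is a minimal sketch of keyword-based retrieval over stored memories. The schema and the `better-sqlite3` dependency are assumptions for illustration — this is not PuzldAI's actual storage layer.

```ts
// Minimal illustration of FTS5-backed memory retrieval. Hypothetical schema;
// PuzldAI's real tables and ranking may differ.
import Database from 'better-sqlite3';

const db = new Database(':memory:');
db.exec(`CREATE VIRTUAL TABLE memories USING fts5(type, content)`);

const insert = db.prepare(`INSERT INTO memories (type, content) VALUES (?, ?)`);
insert.run('decision', 'Accepted edit: switched auth middleware to JWT in src/auth.ts');
insert.run('code', 'Retry helper with exponential backoff for fetch calls');

// For a new prompt, matching memories are retrieved and injected as context.
const hits = db
  .prepare(`SELECT type, content FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT 3`)
  .all('auth');
console.log(hits); // -> the JWT decision ranks first
```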
All /agentic interactions are logged for training data generation:
- Inputs: Prompts, injected context, agent/model used
- Outputs: LLM responses, proposed files, explanations
- Decisions: Which files were accepted/rejected
- Edits: User modifications to proposed content
Export for fine-tuning:
```
import { exportObservations, exportPreferencePairs } from 'puzldai/observation';
// Export all observations as JSONL
exportObservations({ outputPath: 'observations.jsonl', format: 'jsonl' });
// Export DPO training pairs (chosen vs rejected)
exportPreferencePairs({ outputPath: 'preferences.jsonl', format: 'jsonl' });
```

DPO pair types:
- `accept_reject` — User accepted some files, rejected others
- `user_edit` — User modified the LLM's proposed content
- `full_reject` — User rejected all proposed files
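As a rough picture of what one exported pair might contain, the shape below pairs the accepted (chosen) content against the rejected content for a given prompt. The field names are illustrative assumptions, not the exact schema written by `exportPreferencePairs()`.

```ts
// Illustrative shape of a single DPO preference pair. Field names are assumptions;
// inspect preferences.jsonl for the actual schema.
interface PreferencePair {
  pairType: 'accept_reject' | 'user_edit' | 'full_reject';
  prompt: string;   // original task plus any injected context
  chosen: string;   // content the user accepted (or their edited version)
  rejected: string; // content the user rejected (or the original proposal)
}

const pair: PreferencePair = {
  pairType: 'user_edit',
  prompt: 'add error handling to api/routes.ts',
  chosen: '// user-edited version of the proposed file',
  rejected: "// LLM's original proposal",
};

console.log(JSON.stringify(pair)); // one JSONL line
```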
Index your codebase for semantic search and automatic context injection.
```
# TUI
/index # Open indexing panel
/index search "auth" # Search indexed code
# CLI
puzld index # Index current directory
puzld index --quick # Skip embeddings (faster)
puzld index --search "handleLogin"
puzld index --context "fix auth bug"
puzld index --config # Show detected config files
puzld index --graph              # Show dependency graph
```

What gets indexed:
- Functions, classes, interfaces, types
- Import/export relationships
- File dependencies with tsconfig path alias support
Project instructions (auto-injected into prompts):
- `AGENTS.md` — Project-wide instructions
- `CLAUDE.md`, `CODEX.md` — Agent-specific instructions
- `.cursorrules`, `copilot-instructions.md` — IDE rules
- `.puzldai/agents/*.md` — Per-agent instructions
When you run `/agentic`, project instructions are automatically injected into the prompt.
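The same context lookup is available outside the TUI. The snippet below shells out to the documented `puzld index --context` command and prepends the result to a prompt; treating the output as plain text is an assumption, so check the format locally.

```ts
// Fetch indexed context for a task via the CLI and prepend it to a prompt.
// Assumes `puzld index --context` prints plain text to stdout.
import { execFileSync } from 'node:child_process';

const task = 'fix auth bug';
const context = execFileSync('puzld', ['index', '--context', task], { encoding: 'utf8' });

const prompt = `${context}\n\nTask: ${task}`;
console.log(prompt);
```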
| Command | Description |
|---|---|
| `/compare claude,gemini "task"` | Compare agents side-by-side |
| `/autopilot "task"` | AI-planned workflow |
| `/workflow code-review "code"` | Run saved workflow |
| `/workflows` | Manage templates |
| `/correct claude gemini "task"` | Cross-agent correction |
| `/debate claude,gemini "topic"` | Multi-agent debate |
| `/consensus claude,gemini "task"` | Build consensus |
| `@claude "task"` | Agentic mode with Claude |
| `@gemini "task"` | Agentic mode with Gemini |
| `/plan @claude "task"` | Plan mode (analyze, no execution) |
| `/build @claude "task"` | Build mode (full tool access) |
| `/index` | Codebase indexing options |
| `/index search "query"` | Search indexed code |
| `/session` | Start new session |
| `/resume` | Resume previous session |
| `/settings` | Open settings panel |
| `/changelog` | Show version history |
| `/agent claude` | Switch agent |
| `/model` | Model selection panel |
| `/router ollama` | Set routing agent |
| `/planner claude` | Set autopilot planner |
| `/sequential` | Toggle: compare one-at-a-time |
| `/pick` | Toggle: select best from compare |
| `/execute` | Toggle: auto-run autopilot plans |
| `/interactive` | Toggle: pause between steps |
| `/help` | All commands |
```
puzldai                          # Launch TUI
puzldai run "task" # Single task
puzldai run "task" -a claude # Force agent
puzldai run "task" -m opus # Override model
puzldai run "task" -P "..." # Pipeline
puzldai run "task" -T template # Use template
puzldai run "task" -i # Interactive: pause between steps
puzldai compare "task" # Compare (default: claude,gemini)
puzldai compare "task" -a a,b,c # Specify agents
puzldai compare "task" -s # Sequential mode
puzldai compare "task" -p # Pick best response
puzldai autopilot "task" # Generate plan
puzldai autopilot "task" -x # Plan + execute
puzldai autopilot "task" -p claude # Use specific planner
puzldai correct "task" --producer claude --reviewer gemini
puzldai correct "task" --producer claude --reviewer gemini --fix
puzldai debate "topic" -a claude,gemini -r 3 -m ollama
puzldai consensus "task" -a claude,gemini -r 3 -s claude
puzldai session list # List sessions
puzldai session new # Create new session
puzldai check # Agent status
puzldai agent # Interactive agent mode
puzldai agent -a claude # Force specific agent
puzldai agent -m sonnet # With specific model
puzldai model show # Show current models
puzldai model list # List available models
puzldai model set claude opus # Set model for agent
puzldai model clear claude # Reset to CLI default
puzldai serve # API server
puzldai serve -p 8080 # Custom port
puzldai serve -w # With web terminal
puzldai template list # List templates
puzldai template show <name> # Show template details
puzldai template create <name> -P "..." -d "desc"
puzldai template edit <name> # Edit template
puzldai template delete <name> # Delete template
puzldai index # Index codebase
puzldai index --quick # Skip embeddings
puzldai index --search "query" # Search indexed code
puzldai index --context "task" # Get relevant context
puzldai index --config           # Show project config
```

`~/.puzldai/config.json`:

```
{
"defaultAgent": "auto",
"fallbackAgent": "claude",
"routerModel": "llama3.2",
"adapters": {
"claude": { "enabled": true, "path": "claude", "model": "sonnet" },
"gemini": { "enabled": true, "path": "gemini", "model": "gemini-2.5-pro" },
"codex": { "enabled": false, "path": "codex", "model": "gpt-5.1-codex" },
"ollama": { "enabled": true, "model": "llama3.2" },
"mistral": { "enabled": true, "path": "vibe" }
}
}
```

```
User Input (@claude "fix bug")
│
▼
┌─────────┐ ┌────────────┐ ┌──────────┐
│ CLI/TUI │────▶│ Orchestrator│────▶│ Adapters │
└─────────┘ └────────────┘ └──────────┘
│ │
┌─────────────┼─────────────┐ │
▼ ▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌──────────────┐
│ Router │ │ Memory │ │ Agents │
│ (Ollama) │ │ (RAG) │ │ Claude │
└───────────┘ └───────────┘ │ Gemini │
│ │ │ Codex │
▼ ▼ │ Ollama │
┌───────────┐ ┌───────────┐ │ Mistral │
│ Indexing │ │Observation│ └──────────────┘
│ (AST/FTS) │ │ Logger │
└───────────┘ └───────────┘
│ │
▼ ▼
┌───────────────────────────────┐
│ Agent Loop │
│ LLM ──▶ Tool Call ──▶ Result │
│ ▲ │ │ │
│ │ ┌─────▼─────┐ │ │
│ │ │ Permission│ │ │
│ │ │ Prompts │ │ │
│ │ └───────────┘ │ │
│ └──────────────────────┘ │
└───────────────────────────────┘
│ │
▼ ▼
┌───────────┐ ┌───────────┐
│ Diff │ │ Export │
│ Review │ │ (DPO) │
└───────────┘ └───────────┘
```
PuzldAI doesn't handle your AI credentials directly. Instead, it orchestrates the official CLI tools you already have installed:
| What PuzldAI Does | What PuzldAI Doesn't Do |
|---|---|
| Calls `claude`, `gemini`, `codex` binaries | Store your API keys |
| Passes prompts, receives responses | Handle OAuth flows |
| Respects each CLI's auth state | Piggyback on private OAuth clients |
Why this matters:
- No credential exposure — Your tokens stay with the official CLIs
- No piggybacking — We don't borrow OAuth client IDs or reverse-engineer auth endpoints
- No terms violations — We use CLIs exactly as their creators intended
- Always up-to-date — When CLIs update their auth, you get it automatically
- Your auth, your control — Log in once per CLI, PuzldAI just orchestrates
Some tools bypass official CLIs to call APIs directly using piggybacked credentials or unofficial OAuth flows. PuzldAI takes a different approach: we wrap the tools you trust, nothing more.
```
git clone https://github.com/MedChaouch/Puzld.ai.git
cd Puzld.ai
bun install
bun run build
npm link
puzldai
```

Pull requests welcome! Please ensure your changes pass the build before submitting.
AGPL-3.0-only — See LICENSE
Built by Med Chaouch