Of course.
Below are the four documents you requested for the LLM Coding Platform
(LLCP), extracted and synthesized from our conversations.
1. architecture.md: The complete system design, from edge to data.
2. roadmap.md: A phased plan from POC to production-ready scale.
3. tech-stack.md: A list of all open-source projects and their licenses.
4. task-template.yaml: An example input spec for a new micro-service.
This plan is designed to be lean and iterative, avoiding the major upfront costs and
operational headaches we discussed while still providing a clear path to owning
the full stack when the time is right.
________________
1. architecture.md
LLM Coding Platform (LLCP) - System Architecture
This document outlines the architecture for the LLCP, a multi-agent system designed
to generate, test, refine, and deploy production-ready micro-services with minimal
Human-in-the-Loop (HITL).
1. Architectural Principles
* Domain-Driven Design (DDD): Each micro-service is a bounded context with clear
ownership and a well-defined API.
* Event-Driven & Asynchronous: Services communicate via an event bus (Redpanda) and
a job queue (Oban), keeping them decoupled and resilient.
* Infrastructure as Code (IaC): All infrastructure (VMs, firewall, databases) is
defined in Terraform and deployed with Helm, ensuring reproducibility and easy
teardowns.
* Security by Design: UIs and APIs are protected by default using Identity-Aware
Proxy (IAP), and services run with least-privilege IAM roles.
* Progressive Enhancement: The stack starts with hosted SaaS and CPU-based models,
migrating to self-hosted and GPU-backed infrastructure only when unit economics
justify it.
2. High-Level Component Diagram
flowchart TD
    subgraph "Planning & Orchestration"
        A(Developer/PM) -->|writes spec.md| B(Claude-Task-Master)
        B -- RAG --> C{RAGFlow + MCPs}
        B -- YAML --> D[tasks/inbox/]
        D --> E[Oban Job Queue]
    end
    subgraph "Code Generation & Testing (LangGraph)"
        E -- Triggers --> F(LangGraph Pipeline)
        F -- RAG --> C
        F -- Code --> G[Git Repo]
        G -- PR --> H[CI/CD Pipeline]
    end
    subgraph "Dev & Admin UI"
        I(Focalboard UI) <--> E
        J(Flowise / OpenWebUI) <--> C
        G <--> J
    end
3. Component Breakdown (Bounded Contexts)
| Service | Technology | Description |
| --- | --- | --- |
| Planning Engine | Claude-Task-Master (CTM) | Decomposes high-level specs (epic_features.md) into atomic, machine-readable .task.yaml files. Queries RAG to ground its plans. |
| Job Queue | Oban (Elixir) | Watches tasks/inbox/, enqueues a LangGraph run for each task, and handles retries, scheduling, and concurrency. |
| Code Pipeline | LangGraph (Python) | A multi-agent state machine (Draft → Test → Refine → Static → Polish) that consumes a .task.yaml and outputs PRs with production-ready code. |
| Retrieval Layer | RAGFlow + MCP Servers | Provides grounded context. RAGFlow holds static docs (books, guides); each MCP server exposes live examples and mocks for a specific service (Telnyx, ClickHouse, etc.). |
| Model Serving | Ollama (on ragbox VM) | Serves open-source models (Mistral, Qwen, DeepSeek) for core pipeline stages. Premium models (Claude, GPT-4o) are called via API for escalation. |
| Task UI | Focalboard (self-hosted) | Kanban board that provides a visual "single pane of glass" for the entire pipeline, from Backlog to Deployed. |
| Dev Playground | Flowise + OpenWebUI | UI for ad-hoc RAG queries, prompt tuning, and model comparison. For internal use by prompt engineers. |
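To make the Job Queue row concrete, here is a minimal sketch of the inbox sweep, assuming a hypothetical `MyApp.Workers.InboxScanner` module and the `tasks/inbox/` layout above; the cron wiring shown is one plausible option, not a settled design.

```elixir
# Hypothetical inbox sweeper: a sketch, not the final worker.
defmodule MyApp.Workers.InboxScanner do
  use Oban.Worker, queue: :default, max_attempts: 1

  @impl true
  def perform(%Oban.Job{}) do
    # Find every task spec CTM has dropped into the inbox.
    "tasks/inbox/*.task.yaml"
    |> Path.wildcard()
    |> Enum.each(fn path ->
      # Enqueue one pipeline run per task file. In the full flow the
      # Focalboard webhook supplies the card_id as well.
      %{"task_path" => path}
      |> MyApp.Workers.TaskRunner.new()
      |> Oban.insert()
    end)

    :ok
  end
end

# Scheduled via Oban's cron plugin, e.g. every five minutes:
#
# config :my_app, Oban,
#   plugins: [{Oban.Plugins.Cron, crontab: [{"*/5 * * * *", MyApp.Workers.InboxScanner}]}]
```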
4. Data & Event Flow
1. A developer or PM writes or updates architecture.md, tech-stack.md, and a high-
level epic_features.md.
2. ctm plan is executed (manually or via CI). It reads the source documents,
queries RAGFlow and Context7 for grounding, and outputs a series of .task.yaml
files.
3. A script moves these files to tasks/inbox/ and creates a corresponding card in
the Focalboard Backlog via REST API.
4. A developer drags a card to "In Progress". This triggers a webhook that enqueues an Oban job with the task YAML path (a handler sketch follows this list).
5. The Oban worker calls the LangGraph pipeline.
* DraftAgent reads the task YAML, pulls context from the MergedRetriever (RAG +
all MCPs), and generates initial code.
* TestWriterAgent generates ExUnit tests based on acceptance criteria.
* UnitTest Node runs mix test inside a disposable Docker container, hitting MCP
mocks.
* If tests fail, RefineAgent gets the error logs and loops up to 3 times.
* If tests pass, StaticAnalysis Node runs Credo, Dialyzer, and Sobelow. Failures
loop to a StaticRefine Agent.
* If all checks are green, a PR is opened in the target Git repository.
6. The Oban worker patches the Focalboard card with status updates (UnitTest:Pass,
Static:Fail, etc.).
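Step 4's webhook hand-off is small enough to sketch. This assumes a Phoenix endpoint at a hypothetical /webhooks/focalboard route; Focalboard's actual payload shape must be verified against its API.

```elixir
# Hypothetical Phoenix controller for the Focalboard webhook; a sketch only.
defmodule MyAppWeb.FocalboardWebhookController do
  use MyAppWeb, :controller

  # Focalboard calls this when a card changes column. The payload keys
  # ("card_id", "task_path", "status") are assumptions, not Focalboard's
  # documented schema; verify against the running instance.
  def create(conn, %{"card_id" => card_id, "task_path" => path, "status" => "In Progress"}) do
    {:ok, _job} =
      %{"task_path" => path, "card_id" => card_id}
      |> MyApp.Workers.TaskRunner.new()
      |> Oban.insert()

    send_resp(conn, 202, "queued")
  end

  # Ignore moves to any other column.
  def create(conn, _params), do: send_resp(conn, 200, "ignored")
end
```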
5. Infrastructure & Security
* VMs: One ragbox (e2-standard-8, 32GB RAM) for RAG/Ollama/UIs and one jobsbox (e2-
micro) for Oban. Both are provisioned with Terraform.
* Auto Start/Stop: ragbox runs an idle-check.sh service to shut down after 15
minutes of no LLM activity. A Cloud Scheduler job starts it on weekdays.
* Security: All UIs (Focalboard, Flowise, OpenWebUI) are placed behind Google Cloud
IAP, requiring Google account authentication. Internal VM-to-VM traffic is on a
private VPC network.
________________
2. roadmap.md
LLM Coding Platform (LLCP) - Product Roadmap
This roadmap outlines the phased development of the LLCP, from a minimal proof-of-
concept to a production-grade, multi-service generation pipeline.
Phase 0: POC - The Core Loop
Goal: Prove that the Draft → Test → Refine loop can generate a single, simple, CI-
passing Elixir micro-service.
* Features:
* Deploy ragbox VM with Docker-Compose stack: RAGFlow, Ollama, Flowise,
OpenWebUI.
* Set up LangGraph Cloud account and basic elixir_service_flow graph.
* Write and test the three core prompt templates: draft.j2, test_writer.j2,
refine.j2.
* Ingest Elixir/OTP books into RAGFlow and verify retrieval in Flowise.
* Manually write one template-registry.task.yaml spec.
* POC Success Criteria: Execute the pipeline on the template-registry task; it
must produce a PR with code that compiles and passes all generated unit tests.
Phase 1: MVP - The Automated Pipeline
Goal: Automate the entire pipeline from spec to PR, managed through a visual UI,
with static analysis included.
* Features:
* Integrate Claude-Task-Master (CTM) to auto-generate task YAML from high-level
specs.
* Deploy jobsbox VM with Oban to create a persistent job queue.
* Deploy Focalboard and write the ctm_to_focal.exs script to auto-populate the
backlog.
* Add the StaticAnalysis and StaticRefine nodes to the LangGraph pipeline.
* Implement MCP servers for Telnyx, ClickHouse, and Postgres, and add them to
the MergedRetriever.
* MVP Success Criteria: The pipeline can generate the first 3-4 micro-services
from the architecture map with ≥80% of runs requiring no HITL.
Phase 2: Productionizing & Scaling
Goal: Harden the pipeline for performance, security, and observability, making it
reliable for generating production services.
* Features:
* Add premium model escalation (Claude 3.5 / GPT-4o) for high-difficulty tasks
or persistent failures.
* Implement Contract Testing (Pact) and Telnyx Replay nodes in the LangGraph
pipeline.
* Integrate Static & Dynamic Security Scanning (Semgrep, OWASP Zap).
* Add Performance Regression node (k6/Gatling) to the pipeline.
* Deploy PromEx and wire Telemetry hooks into all generated services and
pipeline components.
* Build Grafana dashboards for pipeline latency, cost per service, and success
rate.
* Success Criteria: The pipeline can batch-generate 10+ micro-services overnight
with a >70% success rate and full observability.
Phase 3: Enterprise & Optimization (Year 2)
Goal: Expand the pipeline to handle multi-team collaboration, different languages,
and auto-documentation.
* Features:
* Add new roles/prompt templates for Python services, React/LiveView frontend
components, and GraphQL connectors.
* Integrate Swimm and Redocly to auto-generate internal and external
documentation from the pipeline outputs.
* Implement a multi-tenant model for teams, with separate Focalboard boards and
RAG collections.
* Add a human review queue and feedback loop that automatically uses approved
fixes to fine-tune RAG or LoRA adapters.
* Success Criteria: A non-expert developer can specify, generate, and deploy a
new, fully-documented micro-service in under a day.
________________
3. tech-stack.md
LLM Coding Platform (LLCP) - Technology Stack
This document lists the core open-source projects used in the LLCP.
| Project | GitHub Repo / Link | License |
| --- | --- | --- |
| RAGFlow | infiniflow/ragflow | Apache-2.0 |
| Ollama | ollama/ollama | MIT |
| Claude-Task-Master | eyaltoledano/claude-task-master | MIT |
| LangChain/LangGraph | langchain-ai/langchain, langchain-ai/langgraph | MIT |
| Oban | sorentwo/oban | Apache-2.0 |
| Focalboard | mattermost-community/focalboard | MIT |
| Flowise AI | FlowiseAI/Flowise | Apache-2.0 |
| Open WebUI | open-webui/open-webui | MIT |
| PostgreSQL | postgres/postgres | PostgreSQL License |
| Elixir | elixir-lang/elixir | Apache-2.0 |
| Phoenix Framework | phoenixframework/phoenix | MIT |
| PromEx | akoutmos/prom_ex | MIT |
| Credo | rrrene/credo | MIT |
| Dialyxir | jeremyjh/dialyxir | Apache-2.0 |
| Sobelow | nccgroup/sobelow | Apache-2.0 |
| Terraform | hashicorp/terraform | BUSL-1.1 |
________________
4. task-template.yaml
This file serves as the contract between the CTM planning stage and the LangGraph
execution stage.
# tasks/template-registry.task.yaml
id: POC-001
title: "template-registry: CRUD for conversation templates"
difficulty: 2
tags: [backend, elixir, phoenix, postgres]

# CTM will auto-populate this from architecture.md and its own planning
files:
  - path: "services/template_registry/lib/template_registry.ex"
    purpose: "The main context module for CRUD operations."
  - path: "services/template_registry/lib/template_registry/template.ex"
    purpose: "The Ecto schema for the 'templates' table."
  - path: "services/template_registry/priv/repo/migrations/..."
    purpose: "The Ecto migration file to create the 'templates' table."
  - path: "services/template_registry/lib/template_registry_web/schema.ex"
    purpose: "The Absinthe GraphQL schema with queries and mutations."
  - path: "services/template_registry/test/template_registry_test.exs"
    purpose: "ExUnit tests for the context module."
  - path: "services/template_registry/test/template_registry_web/schema_test.exs"
    purpose: "ExUnit tests for the GraphQL endpoints."

# Acceptance criteria drive the TestWriterAgent and validation nodes
acceptance:
  unit_test_coverage: 0.90
  credo_preset: "strict"
  dialyzer_plt_apps: ["ecto", "phoenix"]
  graphql_queries:
    - "query listTemplates(agencyId: UUID!)"
    - "mutation createTemplate(name: String!, manifest: JSON!)"

# This informs the LangGraph runner which prompt templates to use
# CTM v0.4 can auto-suggest this based on file paths and dependencies
templateVariants:
  - "draft"
  - "db_arch"
  - "graphql_api"

# Tells the pipeline which external systems this service interacts with
requiredResources:
  - postgres
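Because this YAML is the contract between planning and execution, the consumer should fail fast on malformed specs. A minimal guard, sketched in Elixir, with the required-key list assumed from the template above:

```elixir
# Hypothetical spec loader: validates the CTM/LangGraph contract before enqueueing.
defmodule LLCP.TaskSpec do
  @required_keys ~w(id title difficulty files acceptance)

  @doc "Reads a .task.yaml file and checks the keys the pipeline relies on."
  def load(path) do
    with {:ok, spec} <- YamlElixir.read_from_file(path),
         [] <- Enum.reject(@required_keys, &Map.has_key?(spec, &1)) do
      {:ok, spec}
    else
      {:error, reason} -> {:error, {:invalid_yaml, reason}}
      missing when is_list(missing) -> {:error, {:missing_keys, missing}}
    end
  end
end
```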
Labeled Code Snippets for Future Reference
* [SCRIPT-01: Docker Compose Stack]
* [SCRIPT-02: ragbox_startup.sh]
* [SCRIPT-03: idle_check.sh]
* [SCRIPT-04: model_switch.sh]
* [SCRIPT-05: ctm_to_focal.exs]
* [SCRIPT-06: Oban TaskRunner Worker]
* [TERRAFORM-01: Full GCP Stack]
* [PROMPT-01: DraftAgent Template]
* [PROMPT-02: RefineAgent Template]
* [PROMPT-03: TestWriterAgent Template]
* [PROMPT-04: StaticRefineAgent Template]
________________
How the Snippets Connect in the Pipeline
| Snippet Label | Its Role in the Pipeline | How It Connects to Others |
| --- | --- | --- |
| [TERRAFORM-01: Full GCP Stack] | Provisioning. Creates the ragbox and jobsbox VMs, the firewall, and the auto-start scheduler. | It calls [SCRIPT-02: ragbox_startup.sh] to configure the VM after it boots. It creates the hardware that the other scripts run on. |
| [SCRIPT-02: ragbox_startup.sh] | Configuration Management. Installs Docker, pulls all necessary containers, and sets up the auto-stop service. | It contains the definitions for [SCRIPT-01: Docker Compose Stack] and [SCRIPT-03: idle_check.sh]. It's the "master installer" for the ragbox VM. |
| [SCRIPT-01: Docker Compose Stack] | Service Definition. Defines the containers (RAGFlow, Ollama, MCPs, Flowise, OpenWebUI) that run on ragbox. | It is embedded within and executed by [SCRIPT-02: ragbox_startup.sh]. It doesn't run on its own. |
| [SCRIPT-03: idle_check.sh] | Cost Control. A simple utility that checks for active LLM containers and shuts down the VM if idle. | It is created and enabled by [SCRIPT-02: ragbox_startup.sh] as a systemd service. |
| [SCRIPT-04: model_switch.sh] | On-Demand Utility. A helper script placed on ragbox so a user (or UI button) can switch the active Ollama model. | It interacts with the ollama container defined in the Docker Compose stack. It's for post-deployment use. |
| [SCRIPT-05: ctm_to_focal.exs] | Pipeline Glue (Part 1). Takes the YAML files generated by Claude-Task-Master and creates corresponding cards in Focalboard. | It runs after CTM completes its planning phase and before any coding work begins. It populates the UI backlog. |
| [SCRIPT-06: Oban TaskRunner Worker] | Pipeline Glue (Part 2). The job queue worker that is triggered when a card is moved in Focalboard. | It consumes the task YAML that CTM created and initiates the LangGraph pipeline, calling the prompt templates. |
| [PROMPT-01: DraftAgent Template] | Agent Brain (Role 1). The initial instruction set for generating the first pass of code. | It is called by the LangGraph pipeline, which is started by the Oban TaskRunner. It consumes the CTM task YAML. |
| [PROMPT-03: TestWriterAgent Template] | Agent Brain (Role 2). The instruction set for generating unit tests for the code created by the DraftAgent. | It is called by LangGraph after the DraftAgent succeeds. |
| [PROMPT-02: RefineAgent Template] | Agent Brain (Role 3). The instruction set for fixing code when unit tests fail. | It is called by LangGraph only if the unit tests from the TestWriterAgent fail. |
| [PROMPT-04: StaticRefineAgent Template] | Agent Brain (Role 4). An additional refinement step for fixing static analysis errors (Credo, Dialyzer). | It is called by LangGraph after unit tests have passed, as a final quality gate. |
________________
Visual Flow of Snippets
flowchart TD
    subgraph "DevOps Setup"
        T1([TERRAFORM-01]) -->|provisions| VM
        subgraph VM [ragbox VM]
            S2([SCRIPT-02]) -->|creates| S1([SCRIPT-01])
            S2 -->|creates & runs| S3([SCRIPT-03])
            S2 -->|installs| S4([SCRIPT-04])
        end
    end
    subgraph "Pipeline Execution"
        CTM[CTM Plan] --> S5([SCRIPT-05])
        S5 --> Focalboard[Focalboard UI]
        Focalboard -->|webhook| S6([SCRIPT-06])
        S6 --> LangGraph[LangGraph]
        subgraph "LangGraph Roles"
            P1([PROMPT-01])
            P3([PROMPT-03])
            P2([PROMPT-02])
            P4([PROMPT-04])
        end
        LangGraph --> P1 --> P3 --> P2 --> P4
    end
Bottom Line
The snippets are designed to be a complete, non-conflicting set. You can
confidently hand them over to your DevOps engineer. The architecture moves
logically from:
1. Infrastructure Provisioning (Terraform)
2. Software Configuration (Startup Script + Docker Compose)
3. Pipeline Triggering & Management (CTM → Focalboard → Oban)
4. Intelligent Execution (LangGraph using the various prompt templates)
Each script has a distinct purpose and is called at the appropriate stage of the
setup or execution process.
________________
[SCRIPT-01: Docker Compose Stack]
* Purpose: Defines and orchestrates all the necessary services that run on the
ragbox VM. This is the heart of the development and testing environment.
* Filename: docker-compose.yml
# docker-compose.yml
#
# This file defines the multi-container stack for the LLCP.
# It includes the RAG engine, local LLM server, UI playgrounds,
# and databases needed for the POC.
#
# To run: `docker compose up -d`
# To stop: `docker compose down`
version: "3.9"

services:
  # RAGFlow: The core Retrieval-Augmented Generation engine.
  # Handles document uploading, chunking, embedding, and vector search.
  ragflow:
    image: infiniflow/ragflow:latest
    ports: ["8080:8080"]
    volumes: ["./data/ragflow:/data"] # Persists uploaded documents and indexes
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Ollama: Serves open-source LLMs (Mistral, Qwen, DeepSeek) locally.
  # The LangGraph pipeline will make API calls to this container for code generation.
  ollama:
    image: ollama/ollama:latest
    ports: ["11434:11434"]
    volumes: ["./data/ollama:/root/.ollama"] # Persists downloaded model weights
    environment:
      - OLLAMA_MODELS=/models # Optional: specify a directory for models
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
      interval: 30s
      timeout: 10s
      retries: 5

  # Context7 MCP: Serves as a retriever for the existing codebase.
  # This helps the LLMs stay consistent with internal patterns.
  mcp_context7:
    image: context7/mcp:latest
    ports: ["9090:9090"]

  # Telnyx MCP: Provides live docs and mock endpoints for Telnyx APIs.
  # Critical for generating and testing any service that interacts with Telnyx.
  mcp_telnyx:
    image: ghcr.io/team-telnyx/telnyx-mcp-server:latest
    environment:
      # This key should be stored securely, e.g., in a .env file and passed in.
      - TELNYX_API_KEY=${TELNYX_API_KEY}
    ports: ["7111:7111"]

  # Flowise AI: A no-code UI for building and testing RAG + LLM chains.
  # Use this for prompt engineering and ad-hoc RAG queries.
  flowise:
    image: flowiseai/flowise:latest
    ports: ["3001:3000"]
    environment:
      - PORT=3000
      - DATABASE_TYPE=sqlite
      - SECRETKEY_OVERWRITE=${FLOWISE_SECRET_KEY} # Use a stable secret for persistence

  # Open WebUI: A ChatGPT-style UI for all models running in Ollama.
  # Perfect for side-by-side model comparison and quick queries.
  openwebui:
    image: ghcr.io/open-webui/open-webui:latest
    ports: ["3002:3000"]
    volumes: ["./data/webui:/app/backend/data", "./data/ollama:/root/.ollama"]
    environment:
      - "OLLAMA_BASE_URL=http://ollama:11434"

  # PostgreSQL: The database for the first micro-service POC (template-registry);
  # it will also be used by Oban and Focalboard.
  postgres:
    image: postgres:16
    environment:
      - POSTGRES_USER=llcp_user
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD} # Load from .env
      - POSTGRES_DB=llcp_dev
    ports: ["5432:5432"]
    volumes: ["./data/postgres:/var/lib/postgresql/data"] # Persists database
________________
[SCRIPT-02: ragbox_startup.sh]
* Purpose: A master script to provision the ragbox VM on its first boot. It
installs dependencies, creates the Docker Compose file, and sets up the auto-
shutdown service.
* Filename: ragbox_startup.sh
#!/bin/bash
#
# ragbox_startup.sh
# This script is executed once by GCP metadata-from-file on the first boot
# of the 'ragbox' instance.
#
# It performs the following actions:
#   1. Installs Docker and Docker Compose.
#   2. Creates the directory structure for the AI stack.
#   3. Writes the docker-compose.yml file.
#   4. Pulls and starts all service containers.
#   5. Preloads the initial set of LLM models into Ollama.
#   6. Sets up and enables a systemd service for auto-shutdown on idle.

# --- 1. Install Dependencies ---
echo "⚙️ Installing Docker and dependencies..."
apt-get update -y
apt-get install -y docker.io docker-compose-plugin curl git
# Add the default user to the docker group to run docker commands without sudo
usermod -aG docker "$(logname)"

# --- 2. Create Directory Structure ---
echo "📁 Creating directory structure at /opt/ai-stack..."
mkdir -p /opt/ai-stack/data/{ragflow,ollama,webui,postgres}
cd /opt/ai-stack

# --- 3. Create Docker Compose File ---
echo "📄 Writing docker-compose.yml..."
# This uses a HEREDOC to write the multi-line YAML file.
# The content is the same as [SCRIPT-01].
cat > docker-compose.yml <<'YAML'
version: "3.9"
services:
  ragflow:
    image: infiniflow/ragflow:latest
    ports: ["8080:8080"]
    volumes: ["./data/ragflow:/data"]
  ollama:
    image: ollama/ollama:latest
    ports: ["11434:11434"]
    volumes: ["./data/ollama:/root/.ollama"]
  mcp_context7:
    image: context7/mcp:latest
    ports: ["9090:9090"]
  mcp_telnyx:
    image: ghcr.io/team-telnyx/telnyx-mcp-server:latest
    environment: ["TELNYX_API_KEY=${TELNYX_API_KEY}"]
    ports: ["7111:7111"]
  flowise:
    image: flowiseai/flowise:latest
    ports: ["3001:3000"]
    environment: ["PORT=3000", "DATABASE_TYPE=sqlite", "SECRETKEY_OVERWRITE=${FLOWISE_SECRET_KEY}"]
  openwebui:
    image: ghcr.io/open-webui/open-webui:latest
    ports: ["3002:3000"]
    volumes: ["./data/webui:/app/backend/data", "./data/ollama:/root/.ollama"]
    environment: ["OLLAMA_BASE_URL=http://ollama:11434"]
  postgres:
    image: postgres:16
    environment: ["POSTGRES_USER=llcp_user", "POSTGRES_PASSWORD=${POSTGRES_PASSWORD}", "POSTGRES_DB=llcp_dev"]
    ports: ["5432:5432"]
    volumes: ["./data/postgres:/var/lib/postgresql/data"]
YAML

# --- 4. Start Services ---
echo "🚀 Starting all services via Docker Compose..."
# We pass --env-file to load secrets securely from a .env file.
# This .env file should be created manually or via a secure mechanism.
touch .env # Create if it doesn't exist
docker compose --env-file .env up -d

# --- 5. Preload Models into Ollama ---
echo "🧠 Preloading initial LLM models into Ollama (this may take a few minutes)..."
# Use `docker compose exec` rather than `docker exec ollama ...`, because
# Compose names the container <project>-ollama-1, not plain "ollama".
docker compose exec ollama ollama pull mistral:7b-instruct-q4_K_M
docker compose exec ollama ollama pull qwen3-embedding:0.6b-q4_K_M
docker compose exec ollama ollama pull deepseek-r1:6.7b-q4_0
echo "✅ Models preloaded."

# --- 6. Setup Auto-Stop Service ---
echo "⏳ Setting up idle-check service for cost savings..."
# This service re-runs the idle-check script every 5 minutes (RestartSec=300).
cat > /etc/systemd/system/idle-check.service <<'SERVICE'
[Unit]
Description=Stop VM if Ollama has been idle for over 15 minutes

[Service]
Type=simple
ExecStart=/usr/local/bin/idle-check.sh
Restart=always
RestartSec=300

[Install]
WantedBy=multi-user.target
SERVICE

# The script itself, which contains the logic for checking idleness.
# This script is the same as [SCRIPT-03].
cat > /usr/local/bin/idle-check.sh <<'SCRIPT'
#!/bin/bash
# If no container with 'ollama' in its name is running, and the VM
# has been up for more than 900 seconds (15 minutes), shut it down.
# This is a simple proxy for "no LLM tasks are active".
if ! docker ps --format '{{.Names}}' | grep -q 'ollama'; then
  UPTIME_SECONDS=$(awk '{print int($1)}' /proc/uptime)
  if [ "$UPTIME_SECONDS" -gt 900 ]; then
    echo "VM has been idle for over 15 minutes. Shutting down."
    # The service account attached to the VM needs 'compute.instances.stop' permission.
    gcloud compute instances stop "$(hostname)" \
      --zone="$(gcloud compute instances list --filter="name=$(hostname)" --format="value(zone)")" \
      --quiet
  fi
fi
SCRIPT

# Make the script executable and enable the service to run on boot.
chmod +x /usr/local/bin/idle-check.sh
systemctl daemon-reload
systemctl enable --now idle-check.service
echo "✅ 'ragbox' setup complete."
________________
[SCRIPT-03: idle_check.sh]
* Purpose: A simple utility to automatically shut down the ragbox VM to save costs
when it's not being used.
* Filename: idle-check.sh (to be placed at /usr/local/bin/)
#!/bin/bash
#
# idle_check.sh
#
# This script is re-run every 5 minutes by the idle-check systemd service
# (Restart=always with RestartSec=300).
# Its purpose is to stop the GCP VM if it has been idle for a while.
# "Idle" is defined as: no Ollama container is actively running a model.
#
# As a simple proxy for the POC, we check if the main 'ollama' container
# is running. A more advanced version could check `ollama ps` for active models.

# Check if any container with 'ollama' in its name is running.
# The `|| echo "false"` branch keeps the variable well-defined when grep matches nothing.
OLLAMA_IS_RUNNING=$(docker ps --format '{{.Names}}' | grep -q 'ollama' && echo "true" || echo "false")

if [ "$OLLAMA_IS_RUNNING" = "false" ]; then
  # Get the system uptime in seconds.
  UPTIME_SECONDS=$(awk '{print int($1)}' /proc/uptime)

  # If uptime is greater than 900 seconds (15 minutes), it's safe to shut down.
  # This prevents shutdown during the initial boot and setup process.
  if [ "$UPTIME_SECONDS" -gt 900 ]; then
    echo "Ollama container not found and VM has been up for over 15 minutes. Shutting down to save costs."
    # The VM needs the 'compute.instances.stop' permission on its service account.
    # It dynamically finds its own name and zone to issue the stop command.
    gcloud compute instances stop "$(hostname)" \
      --zone="$(gcloud compute instances describe "$(hostname)" --format='get(zone)' | awk -F/ '{print $NF}')" \
      --quiet
  fi
fi
________________
[SCRIPT-04: model_switch.sh]
* Purpose: An on-demand utility to switch the default model used by UIs like
OpenWebUI. It pulls a new model and tags it as current.
* Filename: model_switch.sh (to be placed at /usr/local/bin/)
#!/bin/bash
#
# model_switch.sh
#
# A utility script to easily switch the default model for testing.
# It pulls the specified model from Ollama's library if it doesn't exist,
# and then copies it to the 'current' alias for any tool configured to use that tag.
#
# Usage:
#   ./model_switch.sh mistral:7b-instruct-q4_K_M
#   ./model_switch.sh deepseek-r1:6.7b-q4_0

set -e # Exit immediately if a command exits with a non-zero status.

# Go through Docker Compose so we don't depend on the generated container
# name (Compose names it <project>-ollama-1, not plain "ollama").
COMPOSE="docker compose -f /opt/ai-stack/docker-compose.yml"

# --- Input Validation ---
MODEL_TO_SET=$1
if [[ -z "$MODEL_TO_SET" ]]; then
  echo "❌ Error: No model specified."
  echo "Usage: $0 <model-name-in-ollama>"
  echo "Example: $0 mistral:7b-instruct"
  exit 1
fi

# --- Main Logic ---
echo "🔄 Pulling model '$MODEL_TO_SET' (if not already present)..."
# This command is idempotent. If the model exists, it does nothing.
$COMPOSE exec ollama ollama pull "$MODEL_TO_SET" >/dev/null

echo "🔄 Tagging '$MODEL_TO_SET' as 'current'..."
# `ollama cp` creates the alias (the CLI has no `tag` subcommand).
# `ollama run current` will now run this model.
$COMPOSE exec ollama ollama cp "$MODEL_TO_SET" current

echo "✅ Success! Model '$MODEL_TO_SET' is now tagged as 'current'."
________________
[SCRIPT-05: ctm_to_focal.exs]
* Purpose: A bridge script to translate the .task.yaml files from CTM into visual
cards on a Focalboard Kanban board.
* Filename: scripts/ctm_to_focal.exs
# scripts/ctm_to_focal.exs
#
# An Elixir script to parse task YAML files generated by Claude-Task-Master
# and create corresponding cards on a Focalboard via its REST API.
#
# To run:
#   mix run scripts/ctm_to_focal.exs <focalboard_url> <focal_token> <board_id> <column_id>

# Use Mix.install to fetch dependencies on the fly without a full Mix project.
Mix.install([
  {:httpoison, "~> 2.2"},
  {:yaml_elixir, "~> 2.9"},
  {:jason, "~> 1.4"}
])

defmodule CtmToFocal do
  def main(args) do
    # --- 1. Parse CLI Arguments ---
    [focal_url, token, board_id, column_id] = args
    IO.puts("🚀 Populating Focalboard backlog...")

    # --- 2. Find All Task Files ---
    task_files = Path.wildcard("tasks/**/*.task.yaml")

    # --- 3. Iterate and Create Cards ---
    for path <- task_files do
      # Read the YAML content from the task file.
      spec = YamlElixir.read_from_file!(path)

      # Construct the JSON payload for the Focalboard API.
      # The card's description will contain the full YAML for reference.
      card_payload = %{
        "boardId" => board_id,
        "title" => get_in(spec, ["title"]),
        "content" => [
          %{
            "type" => "text",
            "data" => %{"text" => "```yaml\n#{File.read!(path)}\n```"}
          }
        ],
        "fields" => %{
          "properties" => %{
            "status" => column_id # This places the card in the 'Backlog' column
          }
        }
      }

      # Define the HTTP headers, including the auth token.
      headers = [
        {"Authorization", "Bearer #{token}"},
        {"Content-Type", "application/json"}
      ]

      # --- 4. Make the API Call ---
      case HTTPoison.post("#{focal_url}/api/v2/cards", Jason.encode!(card_payload), headers) do
        {:ok, %{status_code: 200}} ->
          IO.puts("✅ Created card for: #{get_in(spec, ["title"])}")

        {:ok, %{status_code: code, body: body}} ->
          IO.puts("❌ Focalboard returned HTTP #{code} for #{path}: #{body}")

        {:error, reason} ->
          IO.inspect(reason, label: "❌ Failed to create card for #{path}")
      end
    end

    IO.puts("✅ Done.")
  end
end

# Entry point for the script
CtmToFocal.main(System.argv())
________________
[SCRIPT-06: Oban TaskRunner Worker]
* Purpose: The Elixir Oban worker that consumes a task from the queue, determines
the right model, and triggers the LangGraph pipeline.
* Filename: lib/my_app/workers/task_runner.ex
# lib/my_app/workers/task_runner.ex
#
# This Oban worker is the central "glue" between the task queue and the
# LangGraph code generation pipeline. It's triggered when a task is ready
# to be worked on (e.g., moved to 'In Progress' in Focalboard).
defmodule MyApp.Workers.TaskRunner do
  use Oban.Worker, queue: :langgraph_tasks, max_attempts: 3

  # A mock/stub for the LangGraph service. In reality, this would be
  # a proper HTTP client (e.g., using Tesla or Req).
  alias MyApp.Services.LangGraph

  @impl true
  def perform(%Oban.Job{args: %{"task_path" => path, "card_id" => card_id}}) do
    # --- 1. Read and Parse the Task Spec ---
    spec = YamlElixir.read_from_file!(path)

    # --- 2. Route to the Correct Model Based on Difficulty ---
    # This logic allows us to use cheap, open-source models for simple tasks
    # and escalate to more powerful (and expensive) models like GPT-4o or Claude
    # only when the task is flagged as difficult.
    model =
      case get_in(spec, ["difficulty"]) do
        3 -> "gpt-4o"
        2 -> "deepseek-r1:6.7b-q4_0"
        _ -> "mistral:7b-instruct-q4_K_M" # Default for difficulty 1 or unset
      end

    # --- 3. Trigger the LangGraph Pipeline ---
    # The LangGraph service is responsible for the multi-agent
    # Draft -> Test -> Refine loop. We pass the full task spec and the
    # chosen model. (Map.get/3 provides the default; get_in/2 takes none.)
    case LangGraph.run(%{
           template: Map.get(spec, "promptTemplate", "draft.j2"),
           spec_yaml: spec,
           model: model
         }) do
      # --- 4. Update Focalboard with the Result ---
      {:ok, result} ->
        # If the pipeline succeeds, update the card status to 'Done'.
        patch_focalboard_card(card_id, "Done", result.pr_url)
        :ok

      {:error, reason} ->
        # If it fails after all retries, mark it for human review.
        patch_focalboard_card(card_id, "HITL_Required", reason)
        {:error, reason}
    end
  end

  # Helper function to update the card in Focalboard.
  defp patch_focalboard_card(card_id, status, details) do
    # This would contain the HTTPoison call to the Focalboard API,
    # similar to the logic in [SCRIPT-05]. A fleshed-out sketch follows below.
    IO.puts("PATCHING card #{card_id} with status: #{status}, details: #{inspect(details)}")
  end
end
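The `patch_focalboard_card/3` stub can be fleshed out the same way [SCRIPT-05] talks to Focalboard. A sketch, assuming a hypothetical `/api/v2/cards/:id` PATCH route and an "updatedFields" payload shape, both of which must be verified against the Focalboard API:

```elixir
# Hypothetical Focalboard client; MyApp.Workers.TaskRunner.patch_focalboard_card/3
# could delegate here. The route and property names are assumptions.
defmodule MyApp.Focalboard do
  def patch_card(card_id, status, details) do
    url = "#{Application.fetch_env!(:my_app, :focal_url)}/api/v2/cards/#{card_id}"

    payload = %{
      "updatedFields" => %{
        # inspect/1 keeps arbitrary error terms printable in the card.
        "properties" => %{"status" => status, "details" => inspect(details)}
      }
    }

    headers = [
      {"Authorization", "Bearer #{Application.fetch_env!(:my_app, :focal_token)}"},
      {"Content-Type", "application/json"}
    ]

    case HTTPoison.patch(url, Jason.encode!(payload), headers) do
      {:ok, %HTTPoison.Response{status_code: code}} when code in 200..299 -> :ok
      other -> {:error, other}
    end
  end
end
```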
Of course. Based on our extensive discussions, I have a clear understanding of the
architecture and the interplay between Claude-Task-Master (CTM), RAGFlow, and the
LangGraph pipeline. The new CTM features you referenced further clarify how it can
serve as an intelligent "front door" to the entire system.
Below are the five artifacts you requested, written to be production-ready for your
POC.
________________
[TERRAFORM-01: Full GCP Stack]
This stack provisions the two necessary VMs, a firewall rule for Google's Identity-
Aware Proxy (IAP), and a Cloud Scheduler job to auto-start the ragbox on weekdays.
variables.tf
# variables.tf
#
# Defines the input variables for the Terraform stack, making it reusable
# across different environments (dev, staging, prod).

variable "gcp_project_id" {
  type        = string
  description = "The GCP project ID where resources will be deployed."
}

variable "gcp_region" {
  type        = string
  description = "The GCP region for deployment."
  default     = "us-central1"
}

variable "gcp_zone" {
  type        = string
  description = "The GCP zone for deployment."
  default     = "us-central1-a"
}

variable "service_account_email" {
  type        = string
  description = "The email of the service account for VM and scheduler permissions."
}

variable "ssh_public_key_file" {
  type        = string
  description = "Path to the SSH public key file for VM access."
  default     = "~/.ssh/id_rsa.pub"
}
main.tf
# main.tf
#
# Provisions the core infrastructure for the LLCP Proof-of-Concept.
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.21"
    }
  }
}

provider "google" {
  project = var.gcp_project_id
  region  = var.gcp_region
  zone    = var.gcp_zone
}

# --- VM 1: The main worker box for RAG, Ollama, and UIs ---
resource "google_compute_instance" "ragbox" {
  name         = "ragbox"
  machine_type = "e2-standard-8" # 8 vCPU, 32 GB RAM for RAG + one LLM
  zone         = var.gcp_zone
  tags         = ["iap-allowed", "ragbox"]

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
      size  = 100 # GB, for Docker images and model weights
    }
  }

  # The startup script provisions the entire Docker stack.
  metadata_startup_script = file("ragbox_startup.sh")

  service_account {
    email  = var.service_account_email
    scopes = ["cloud-platform"] # Full access for self-management (e.g., stop)
  }

  network_interface {
    network = "default"
    access_config {} # Assigns an ephemeral public IP
  }

  metadata = {
    ssh-keys = "devops:${file(var.ssh_public_key_file)}"
  }
}

# --- VM 2: A lightweight, always-on box for the Oban job queue ---
resource "google_compute_instance" "jobsbox" {
  name         = "jobsbox"
  machine_type = "e2-micro" # Always-free tier
  zone         = var.gcp_zone
  tags         = ["internal"]

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
    }
  }

  service_account {
    email  = var.service_account_email
    scopes = ["cloud-platform"]
  }

  network_interface {
    network = "default"
  }
}

# --- Firewall Rule for Google's Identity-Aware Proxy (IAP) ---
# This locks down access to the UIs to only authenticated Google users.
resource "google_compute_firewall" "allow_iap" {
  name      = "allow-iap-to-ragbox-uis"
  network   = "default"
  direction = "INGRESS"

  allow {
    protocol = "tcp"
    # Ports for RAGFlow, MCPs, UIs, and Postgres
    ports = ["8080", "11434", "9090", "7111", "3001", "3002", "5432"]
  }

  # This is the official, fixed IP range for all IAP traffic.
  source_ranges = ["35.235.240.0/20"]
  target_tags   = ["iap-allowed"]
}

# --- Cloud Scheduler: Auto-start ragbox on weekdays ---
# This job ensures the VM is running at the start of the workday.
# The idle-check script on the VM itself handles shutdown.
resource "google_cloud_scheduler_job" "start_ragbox_weekdays" {
  name      = "start-ragbox-weekdays"
  schedule  = "50 7 * * 1-5" # 7:50 AM every Monday-Friday
  time_zone = "America/Chicago"

  http_target {
    http_method = "POST"
    uri         = "https://compute.googleapis.com/compute/v1/projects/${var.gcp_project_id}/zones/${var.gcp_zone}/instances/ragbox/start"

    # Calls to *.googleapis.com are authenticated with an OAuth access token;
    # OIDC tokens are for Cloud Run/Functions and custom endpoints.
    oauth_token {
      service_account_email = var.service_account_email
    }
  }
}
________________
[PROMPT-01: DraftAgent Template]
* Filename: prompts/draft.j2
#################### SYSTEM ####################
You are a world-class, 15-year senior Elixir engineer specializing in fault-tolerant, scalable systems using OTP {{ spec.otp_version }}.
Your task is to generate the complete, production-ready implementation for the micro-service described below.

# Coding Contract:
- Target Elixir version: {{ spec.elixir_version }}
- Style: Follow the official Elixir style guide and the Credo "strict" preset. Use a pipe-first, functional approach.
- Types: Every public function MUST include a `@spec` definition for static analysis.
- Documentation: Every public function MUST include a `@doc` block with an example.
- Concurrency: Use OTP primitives (GenServer, Supervisor, Task) where appropriate for the required features.

# Output Contract:
Return **ONLY** a single, valid JSON object with the following schema. Do not include any introductory text, closing remarks, or narration.
{
  "files": [
    {
      "path": "lib/my_app/my_module.ex",
      "content": "UTF-8 encoded file content as a single string"
    }
  ],
  "notes": "A brief, <120-word rationale for your design choices."
}

#################### SERVICE SPECIFICATION (from CTM) ####################
{{ spec_yaml }}

#################### CONTEXT (from RAGFlow & MCPs) ####################
# The following snippets provide relevant examples from documentation, cookbooks,
# internal codebases, and API definitions. Prefer snippets with higher authoritativeness.
{{ context }}

#################### YOUR TASK ####################
Draft the complete and robust implementation for the service defined in the specification. Ensure all required files, including application, supervisor, contexts, schemas, migrations, and basic configuration, are generated.
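For a concrete picture of what the Coding Contract demands, here is an illustrative fragment of the expected output shape. The module and schema names come from the template-registry task; the function itself is hypothetical.

```elixir
# Illustrative only: every public function carries @doc (with example) and @spec,
# as the contract above requires.
defmodule TemplateRegistry do
  alias TemplateRegistry.{Repo, Template}

  @doc """
  Fetches a template by id.

  ## Example

      iex> TemplateRegistry.get_template("4f1c0a52-...")
      {:ok, %Template{}}
  """
  @spec get_template(Ecto.UUID.t()) :: {:ok, Template.t()} | {:error, :not_found}
  def get_template(id) do
    case Repo.get(Template, id) do
      nil -> {:error, :not_found}
      template -> {:ok, template}
    end
  end
end
```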
________________
[PROMPT-02: RefineAgent Template]
* Filename: prompts/refine.j2 (For Unit Test Failures)
#################### SYSTEM ####################
You are a senior Elixir developer acting as the RefineAgent. Your sole purpose is to fix failing unit tests by patching the provided code. You must be precise and change only what is necessary to make the tests pass.

# Output Contract:
Return **ONLY** a single, valid JSON object with the "files" key, containing the patched file(s). Do not include files that are unchanged.
{
  "files": [
    {
      "path": "lib/my_app/my_module.ex",
      "content": "The full, corrected content of the file"
    }
  ]
}

#################### CONTEXT & ERRORS ####################
# The following unit tests are failing. Your goal is to make them green.
====== FAILING TEST OUTPUT ======
{{ test_output }}

====== RELEVANT CODE (from previous step) ======
{{ previous_files }}

#################### YOUR TASK ####################
Analyze the test failures and the provided code. Identify the root cause of the error(s) and provide the complete, corrected content for the affected file(s).
________________
[PROMPT-03: TestWriterAgent Template]
* Filename: prompts/test_writer.j2
#################### SYSTEM ####################
You are a QA engineer specializing in writing comprehensive and idiomatic tests for Elixir/Phoenix applications using ExUnit.

# Testing Contract:
- Achieve a minimum test coverage of {{ (spec.acceptance.unit_test_coverage | default(0.90)) * 100 }}%.
- Test all public functions and GraphQL/API endpoints defined in the spec.
- Use `Mox` to mock external dependencies where necessary.
- For services that interact with external APIs (like Telnyx or ClickHouse), write tests that hit the corresponding MCP mock endpoints.
- Use the test template provided in the spec if available.

# Output Contract:
Return **ONLY** a single, valid JSON object containing the test files.
{
  "files": [
    {
      "path": "test/my_app/my_module_test.exs",
      "content": "The ExUnit test code"
    }
  ]
}

#################### SERVICE SPECIFICATION ####################
# Write tests that validate the behavior and contracts defined in this spec.
{{ spec_yaml }}

#################### CONTEXT (from RAGFlow & MCPs) ####################
# These snippets show examples of well-written tests and mock usage.
{{ context }}

#################### YOUR TASK ####################
Generate the complete ExUnit test suite for the service.
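The kind of suite this contract asks for, sketched with ExUnit and Mox. The context functions and `TelnyxClientMock` are hypothetical stand-ins, not names the pipeline guarantees.

```elixir
# Illustrative ExUnit suite in the shape the Testing Contract requests.
defmodule TemplateRegistryTest do
  use ExUnit.Case, async: true

  import Mox

  # Verify mock expectations when each test process exits.
  setup :verify_on_exit!

  test "create_template/1 persists and returns the template" do
    attrs = %{"name" => "welcome", "manifest" => %{"steps" => []}}

    assert {:ok, template} = TemplateRegistry.create_template(attrs)
    assert template.name == "welcome"
  end

  test "notifications go through the Telnyx MCP mock, not the live API" do
    # TelnyxClientMock is a hypothetical Mox mock of the Telnyx client behaviour.
    expect(TelnyxClientMock, :send_message, fn %{to: _} -> {:ok, %{"id" => "msg_1"}} end)

    assert {:ok, _} = TemplateRegistry.notify(%{to: "+15555550100"})
  end
end
```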
________________
[PROMPT-04: StaticRefineAgent Template]
* Filename: prompts/refine_static.j2 (For Static Analysis Failures)
#################### SYSTEM ####################
You are a senior Elixir developer acting as the StaticRefineAgent. Your task is to refactor the provided code to resolve all static analysis warnings from Credo, Dialyzer, and Sobelow.

# Output Contract:
Return **ONLY** a single, valid JSON object with the "files" key, containing the patched file(s).
{
  "files": [
    {
      "path": "lib/my_app/my_module.ex",
      "content": "The full, refactored content of the file"
    }
  ]
}

#################### CONTEXT & ERRORS ####################
# The following static analysis tools reported errors. Your goal is to make them pass.
====== STATIC ANALYSIS ERRORS ======
{{ static_errors }}

====== RELEVANT CODE (from previous step) ======
{{ previous_files }}

#################### YOUR TASK ####################
Analyze the static analysis report and the provided code. Refactor the code to resolve all reported issues while preserving the original functionality.
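To make the refactoring step concrete, a hypothetical before/after in the pipe-first style the DraftAgent contract calls for and this agent enforces; same behavior, restructured flow.

```elixir
# Hypothetical example of a pipe-first refactor.

# Before: nested calls that read inside-out.
defmodule Example.Before do
  def normalize(names), do: Enum.map(Enum.reject(names, &is_nil/1), &String.trim/1)
end

# After: the same behavior, expressed as a pipeline.
defmodule Example.After do
  def normalize(names) do
    names
    |> Enum.reject(&is_nil/1)
    |> Enum.map(&String.trim/1)
  end
end
```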