kirocc

A local proxy server that relays Anthropic Messages API-compatible requests to the Kiro (Amazon Q) backend using Kiro CLI credentials.

Just set ANTHROPIC_BASE_URL from any Anthropic API client (e.g., Claude Code) to use Claude models via Kiro.

Features

Anthropic Messages API compatible — Supports /v1/messages (streaming / non-streaming), /v1/messages/count_tokens, and /v1/models
Request conversion — Automatically converts Anthropic API requests to Kiro API (AWS Event Stream) format
Response conversion — Converts Kiro event streams back to Anthropic SSE format
Automatic auth management — Reads credentials from Kiro CLI's SQLite DB with automatic token refresh (Social / OIDC)
Model mapping — Maps Anthropic model names (e.g., claude-sonnet-4-6) to Kiro model names. Customizable via environment variable
Extended Thinking — Enable via the [1m] suffix, the thinking field, or output_config.effort. Reasoning depth travels natively as additionalModelRequestFields.output_config.effort (validated against each model's enum; defaults to medium for effort-capable models when thinking is on without an explicit effort)
Tool Search — Proxy-side implementation of Anthropic's Tool Search Tool. Supports tool_search_tool_regex_20251119 and tool_search_tool_bm25_20251119 with defer_loading for on-demand tool discovery
Prompt Caching — Converts Anthropic tool-level cache_control to Kiro cachePoint
Truncation detection — Automatically injects a notice into the next request when a response is truncated
Retry — Exponential backoff retry for 403 (token expiry), 429, and 5xx errors. Also retries thinking-only (empty visible) responses
API key auth — Optional access restriction for the proxy itself
CORS — Allows requests from localhost origins
File logging — Write structured logs (OTel JSON Lines) to a rotating file via lumberjack. Defaults optimized for coding agent consumption (10 MB, uncompressed)
OpenTelemetry tracing — Opt-in distributed tracing via --otel with OTLP HTTP exporter. Captures request/response headers and body as span events across the full proxy chain

Prerequisites

Go 1.26+
Kiro CLI installed and logged in

Installation

Homebrew

brew install d-kuro/tap/kirocc

go install

go install github.com/d-kuro/kirocc/cmd/kirocc@latest

Usage

Start the server

kirocc

Listens on http://127.0.0.1:3456 by default.

Use with Claude Code

export ANTHROPIC_BASE_URL=http://127.0.0.1:3456
export ANTHROPIC_AUTH_TOKEN=dummy
claude

ANTHROPIC_AUTH_TOKEN is required by Claude Code but not used for authentication by kirocc (credentials are read from Kiro CLI's DB). Any non-empty value works unless -api-key is set.

Command-line options

Flag	Default	Description
`-port`	`3456`	Listen port
`-host`	`127.0.0.1`	Bind host
`-db`	(OS-dependent, see below)	Kiro CLI SQLite DB path
`-api-key`	(none)	API key required to access the proxy
`-debug`	`false`	Enable debug logging
`-log-file`	(none)	Write logs to file with rotation (file-only by default)
`-log-max-size`	`10`	Max log file size in MB before rotation
`-log-max-backups`	`5`	Max number of old log files to retain
`-log-max-age`	`7`	Max days to retain old log files
`-log-compress`	`false`	Compress rotated log files with gzip
`-log-console`	`false`	Also write logs to console when `-log-file` is set
`-otel`	`false`	Enable OpenTelemetry tracing (OTLP HTTP exporter)
`-otel-body-limit`	`32768`	Max bytes of request body to capture in OTel spans (0 = unlimited)

Default DB path

OS	Path
macOS	`~/Library/Application Support/kiro-cli/data.sqlite3`
Linux	`~/.local/share/kiro-cli/data.sqlite3`

Environment variables

Command-line options can be overridden with environment variables.

Variable	Corresponding option
`KIROCC_PORT`	`-port`
`KIROCC_HOST`	`-host`
`KIROCC_DB_PATH`	`-db`
`KIROCC_API_KEY`	`-api-key`
`KIROCC_DEBUG`	`-debug`
`KIROCC_LOG_FILE`	`-log-file`
`KIROCC_LOG_MAX_SIZE`	`-log-max-size`
`KIROCC_LOG_MAX_BACKUPS`	`-log-max-backups`
`KIROCC_LOG_MAX_AGE`	`-log-max-age`
`KIROCC_LOG_COMPRESS`	`-log-compress`
`KIROCC_LOG_CONSOLE`	`-log-console`
`KIROCC_OTEL`	`-otel`
`KIROCC_OTEL_BODY_LIMIT`	`-otel-body-limit`

OpenTelemetry tracing

Enable distributed tracing to visualize the full request chain in Jaeger, Grafana Tempo, or any OTLP-compatible backend.

# Start a local collector (e.g., Grafana LGTM stack)
docker run -d --name lgtm -p 3000:3000 -p 4317:4317 -p 4318:4318 grafana/otel-lgtm

# Start kirocc with tracing enabled
kirocc -otel

The OTLP endpoint defaults to http://localhost:4318 and can be configured via the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable.

Custom model mappings

Use the KIROCC_MODEL_MAPPINGS environment variable to override model name mappings.

export KIROCC_MODEL_MAPPINGS='[{"anthropic":"my-model","kiro":"claude-sonnet-4.5","context_window_size":200000}]'

Endpoints

Path	Description
`GET /health`	Health check
`GET /v1/models`	List available models
`POST /v1/messages`	Messages API (streaming / non-streaming)
`POST /v1/messages/count_tokens`	Token count (approximate *)

* count_tokens uses the cl100k_base encoding from tiktoken-go, which differs from Claude's actual tokenizer. The returned value is an approximation.

Architecture

flowchart TB
    subgraph Client
        CC["Claude Code / Anthropic API Client"]
    end

    subgraph kirocc ["kirocc (localhost:3456)"]
        direction TB
        MW["Middleware<br/>(OTel Tracing, Trace ID, CORS, API Key Auth)"]
        Handler["Messages Handler"]
        Auth["Auth<br/>(SQLite + Token Refresh)"]

        subgraph reqconv ["Request Conversion"]
            direction LR
            ModelResolve["Model Resolution<br/>claude-sonnet-4-6 → claude-sonnet-4.6"]
            MsgNorm["Message Normalization"]
            ToolConv["Tool & Schema Conversion"]
            ToolSearch["Tool Search<br/>(regex / BM25)"]
            EffortResolve["Effort Resolution<br/>(native output_config.effort)"]
            EnvState["Env State<br/>(&lt;env&gt; block → operatingSystem/cwd, current message only)"]
            CacheConv["Cache Point Conversion<br/>(tool-level only)"]
        end

        subgraph respconv ["Response Conversion"]
            direction LR
            EventParse["AWS Event Stream Parser"]
            ThinkingParse["Thinking Tag Parser"]
            SSEWrite["SSE Writer"]
            TruncDetect["Truncation Detection"]
            GateWrite["Gate Writer<br/>(buffered retry)"]
        end
    end

    subgraph Kiro ["Kiro API"]
        KiroAPI["runtime.{region}.kiro.dev"]
    end

    CC -- "Anthropic Messages API<br/>(JSON / SSE)" --> MW
    MW --> Handler
    Handler --> Auth
    Handler --> reqconv
    reqconv -- "Kiro Payload<br/>(JSON)" --> KiroAPI
    KiroAPI -- "AWS Event Stream<br/>(binary frames)" --> respconv
    respconv -- "Anthropic SSE / JSON" --> CC

Request flow

Client sends an Anthropic Messages API request to kirocc
Middleware assigns a trace ID, handles CORS, and validates the API key
Auth reads/refreshes credentials from Kiro CLI's SQLite DB
Handler resolves the model name and determines thinking mode
Request conversion pipeline:
- Normalizes messages (merges consecutive same-role messages, extracts text/images/tool_use/tool_result from multi-block content)
- Converts tools and sanitizes JSON Schema (removes unsupported keywords, flattens anyOf/oneOf/allOf)
- If tool search tools are present, partitions tools into active/deferred and injects a proxy-side ToolSearch tool
- Extracts system prompt and places it as a history entry pair
- Parses the <env> block from the system prompt into envState (operatingSystem, currentWorkingDirectory) and attaches it to the current message only
- Reorders tool results to match the preceding assistant's tool_use order
- Forwards reasoning effort natively as additionalModelRequestFields.output_config.effort at the request root (sibling of conversationState); the resolved effort is validated/clamped per model
- Converts Anthropic tool-level cache_control to Kiro cachePoint
Kiro API returns an AWS Event Stream (binary frames)
Response conversion pipeline:
- Parses binary event stream frames
- Converts cumulative text to incremental deltas
- Intercepts ToolSearch tool_use calls, executes search, emits server_tool_use/tool_search_tool_result SSE events, and re-requests Kiro with discovered tools (up to 3 rounds)
- Parses <thinking> tags from assistantResponseEvent or uses reasoningContentEvent (with deduplication)
- Enforces stop_sequences and max_tokens adapter-side
- Detects truncated responses and stores them; a notice is injected into the next request
- Gate Writer buffers output until visible content arrives, enabling transparent retry of thinking-only responses

Extended Thinking

kiro-cli 2.5.1 expresses reasoning depth natively through output_config.effort. kirocc forwards it as additionalModelRequestFields.output_config.effort at the request root (sibling of conversationState):

{
  "conversationState": { "...": "..." },
  "additionalModelRequestFields": {
    "output_config": { "effort": "medium" }
  }
}

Thinking is enabled by any of:

Model name with [1m] suffix (e.g., claude-sonnet-4-6[1m])
Anthropic-Beta header containing context-1m (e.g., context-1m-2025-01-01)
thinking.type set to "enabled" or "adaptive" in the request

The reasoning effort sent to the backend is resolved as follows:

An explicit, recognized output_config.effort wins, validated/clamped to the model's allowed enum (xhigh on a 4-value model clamps to max; unrecognized strings are dropped).
Otherwise, if reasoning is enabled (via thinking.type, the [1m] suffix, or the context-1m header) without an explicit effort, a default effort of medium is sent so the intent reaches the backend.
Otherwise the field is omitted.

Per-model allowed effort levels:

claude-opus-4.8, claude-opus-4.7: low, medium, high, xhigh, max
claude-opus-4.6, claude-sonnet-4.6 (and their -1m variants): low, medium, high, max (no xhigh; clamps to max)
All other models omit additionalModelRequestFields entirely

thinking.budget_tokens is accepted in the request but no longer affects behavior; reasoning depth is conveyed entirely through effort.

Tool Search

The Kiro backend does not support Anthropic's Tool Search Tool. kirocc implements it proxy-side with an inner loop:

Client sends tool_search_tool_regex_20251119 (or bm25) + tools with defer_loading: true
Proxy partitions tools into active (sent to Kiro) and deferred (held for search)
Proxy injects a ToolSearch tool definition that Kiro can understand
When the model calls ToolSearch, the proxy intercepts the tool_use:
- Executes regex or BM25 search against deferred tools
- Emits server_tool_use + tool_search_tool_result SSE events to the client
- Promotes discovered tools to active and rebuilds the Kiro request
- Calls Kiro again with the updated tool list (up to 3 rounds)
When the model calls a regular tool or produces text, the response is forwarded to the client

Supported query forms:

select:Read,Edit,Grep — exact tool selection by name
read file — keyword search (regex with word-level OR fallback, or BM25 scoring)

Model mappings

Input model	Kiro model	Context window
`claude-sonnet-4-6`	`claude-sonnet-4.6`	200k
`claude-sonnet-4-6[1m]`	`claude-sonnet-4.6-1m`	1M
`claude-sonnet-4.5`	`claude-sonnet-4.5`	200k
`claude-sonnet-4.5[1m]`	`claude-sonnet-4.5-1m`	1M
`claude-opus-4-8`	`claude-opus-4.8`	1M
`claude-opus-4-8[1m]`	`claude-opus-4.8`	1M
`claude-opus-4-7`	`claude-opus-4.7`	1M
`claude-opus-4-7[1m]`	`claude-opus-4.7`	1M
`claude-opus-4-6`	`claude-opus-4.6`	1M
`claude-opus-4-6[1m]`	`claude-opus-4.6`	1M
`claude-opus-4.5`	`claude-opus-4.5`	200k
`claude-haiku-4.5`	`claude-haiku-4.5`	200k

Opus 4.6, 4.7, and 4.8 always use 1M context (no 200k SKU exists upstream). The explicit [1m]-suffixed aliases (claude-opus-4-8[1m] / claude-opus-4-7[1m] / claude-opus-4-6[1m]) are first-class entries that preserve the suffix verbatim in the response model field — this matches Claude Code's default Max-plan state (lG() emits claude-opus-4-8[1m]) and keeps its mR() 1M-context check happy without spuriously enabling extended thinking. Thinking is still opt-in via Sonnet [1m] suffix, Anthropic-Beta: context-1m header, or thinking field.

Unmatched claude-* models are passed through as-is. Non-claude models fall back to claude-sonnet-4.6.

Response model ID

The model field in /v1/messages responses (streaming message_start, non-streaming body, and tool-search path) is returned as the Anthropic-form ID (e.g. claude-opus-4-7), not the Kiro SKU (claude-opus-4.7).

When the proxy routes to a 1M context window (always-1M SKU such as claude-opus-4.8 / claude-opus-4.7 / claude-opus-4.6, or a model invoked with the [1m] suffix or Anthropic-Beta: context-1m header), a trailing [1m] is appended to the response model ID (e.g. claude-opus-4-8[1m]). Claude Code's client-side context-window logic matches /\[1m\]/i on the response model to pick the 1M window — without the suffix it defaults to 200k and auto-compacts at ~160k even when upstream actually has 1M of context.

Note: [1m] has different meanings on request vs. response. On the request model it is a client-supplied thinking-opt-in signal (and is stripped before upstream routing). On the response model it is purely a context-window advertisement for Claude Code and does not imply that extended thinking was enabled.

License

Apache License 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 64 Commits
.claude/skills		.claude/skills
.github/workflows		.github/workflows
cmd/kirocc		cmd/kirocc
docs		docs
internal		internal
.gitignore		.gitignore
.goreleaser.yaml		.goreleaser.yaml
.mise.toml		.mise.toml
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
go.mod		go.mod
go.sum		go.sum
lefthook.yml		lefthook.yml
renovate.json		renovate.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

kirocc

Features

Prerequisites

Installation

Homebrew

go install

Usage

Start the server

Use with Claude Code

Command-line options

Default DB path

Environment variables

OpenTelemetry tracing

Custom model mappings

Endpoints

Architecture

Request flow

Extended Thinking

Tool Search

Model mappings

Response model ID

License

About

Uh oh!

Releases 16

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

kirocc

Features

Prerequisites

Installation

Homebrew

go install

Usage

Start the server

Use with Claude Code

Command-line options

Default DB path

Environment variables

OpenTelemetry tracing

Custom model mappings

Endpoints

Architecture

Request flow

Extended Thinking

Tool Search

Model mappings

Response model ID

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 16

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages