Skip to content

server: Anthropic-compatible Messages API (/v1/messages) (#325)#326

Open
jamesburton wants to merge 1 commit into
kkokosa:mainfrom
jamesburton:anthropic-messages-api
Open

server: Anthropic-compatible Messages API (/v1/messages) (#325)#326
jamesburton wants to merge 1 commit into
kkokosa:mainfrom
jamesburton:anthropic-messages-api

Conversation

@jamesburton

Copy link
Copy Markdown

Summary

Adds an Anthropic-compatible Messages API layer to DotLLM.Server, served alongside the existing OpenAI-compatible endpoints (Step 34). Clients and SDKs written for the Anthropic Messages API (the anthropic Python/TS SDKs, anything targeting POST /v1/messages) can now point at a dotLLM server unchanged.

This is purely additive — the engine, tokenizer, chat-template, sampler and tool-calling pipeline are reused verbatim. Only a new HTTP DTO + translation surface is introduced; no existing endpoint, engine, or sampling behaviour changes.

Closes #325.

Endpoints

  • POST /v1/messages — non-streaming (JSON) and streaming (event-based SSE: message_start, content_block_start/_delta/_stop, message_delta, message_stop, ping).
  • POST /v1/messages/count_tokens — returns { "input_tokens": N }.

Translation (AnthropicConverter)

  • Top-level system (string or text-block array) → leading system message.
  • Message content as a string or a content-block array; text / tool_use / tool_result blocks mapped to engine ChatMessage / ToolCall and tool-role messages keyed by tool_use_id.
  • tools[].input_schemaToolDefinition; tool_choice auto/any/none/tool → engine ToolChoice.
  • FinishReasonstop_reason (end_turn / max_tokens / stop_sequence / tool_use).
  • Anthropic error envelope ({"type":"error","error":{...}}); all DTOs registered in the source-gen ServerJsonContext (AOT-clean, no reflection).

Tests & Docs

  • 26 unit tests (AnthropicConverterTests): message flattening (string/blocks/system), tool_choice, stop_reason mapping, tool_use block emission, request validation, and response/error serialization shape. Full server unit suite green.
  • New docs/ANTHROPIC_API.md (full mapping + streaming sequence) and docs/SERVER.md / ROADMAP.md (Step 34b) / README.md / CLAUDE.md updates.

Notes / Limitations

  • Streaming tool calls are detected post-generation (matching the existing OpenAI streaming endpoint's post-hoc detection), so tool_use blocks are emitted after the text block closes rather than incrementally.
  • image / multimodal content blocks are not yet supported (no multimodal pipeline).
  • Same single-request serialization, validation, and prompt-caching semantics as the OpenAI endpoints apply.

🤖 Generated with Claude Code

Add an Anthropic Messages API layer to DotLLM.Server alongside the existing
OpenAI surface, so clients written for the `anthropic` SDKs can talk to dotLLM
unchanged. Purely additive — the engine, chat template, sampler and
tool-calling pipeline are reused verbatim; only the wire format differs.

Endpoints:
- POST /v1/messages — non-streaming (JSON) and streaming (named SSE events:
  message_start, content_block_start/_delta/_stop, message_delta, message_stop).
- POST /v1/messages/count_tokens — { "input_tokens": N }.

Translation (AnthropicConverter):
- Top-level system (string or text-block array) -> leading system message.
- String-or-block message content; text/tool_use/tool_result blocks mapped to
  ChatMessage/ToolCall and tool-role messages keyed by tool_use_id.
- tools/input_schema -> ToolDefinition; tool_choice auto/any/none/tool.
- FinishReason -> stop_reason (end_turn/max_tokens/stop_sequence/tool_use).
- Anthropic error envelope; AOT-clean source-gen DTOs in ServerJsonContext.

Tests: 26 unit tests (message flattening, tool_choice, stop_reason, tool_use
blocks, validation, response/error serialization shape).
Docs: docs/ANTHROPIC_API.md + SERVER.md/ROADMAP.md/README.md/CLAUDE.md sync.

Closes kkokosa#325

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 15, 2026 13:17

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an Anthropic-compatible Messages API surface to the server (request/response DTOs, converter, and endpoints), plus documentation and unit tests to validate the translation layer and JSON shapes.

Changes:

  • Introduces /v1/messages and /v1/messages/count_tokens endpoints with non-streaming JSON and streaming SSE event support.
  • Adds Anthropic Messages API DTOs and a converter to map Anthropic wire format to dotLLM engine types.
  • Updates JSON source-generation context and documentation; adds unit tests for converter/serialization/validation.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/DotLLM.Tests.Unit/Server/AnthropicConverterTests.cs Adds unit tests for Anthropic conversion, validation, and serialization shapes.
src/DotLLM.Server/ServerJsonContext.cs Registers Anthropic DTOs/events for System.Text.Json source generation.
src/DotLLM.Server/Models/AnthropicMessagesModels.cs Introduces DTOs for Anthropic requests/responses/errors/streaming events.
src/DotLLM.Server/Endpoints/MessagesEndpoint.cs Implements Anthropic-compatible endpoints, streaming SSE, and request validation.
src/DotLLM.Server/EndpointExtensions.cs Wires the new endpoint mapping into the server.
src/DotLLM.Server/AnthropicConverter.cs Adds conversion logic between Anthropic DTOs and engine types.
docs/SERVER.md Documents the new endpoints at a high level and links to full Anthropic API docs.
docs/ROADMAP.md Marks Anthropic Messages API support as done and references docs.
docs/ANTHROPIC_API.md Adds detailed API documentation for request/response and streaming event sequence.
README.md Adds feature bullet + news entry about the new Anthropic-compatible API.
CLAUDE.md Adds a documentation index entry for the new Anthropic API doc.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +259 to +267
public static JsonElement ParseInput(string? arguments)
{
if (string.IsNullOrWhiteSpace(arguments))
return EmptyObject();
try
{
using var doc = JsonDocument.Parse(arguments);
return doc.RootElement.Clone();
}
Comment on lines +331 to +348
internal static string? ValidateRequest(AnthropicMessagesRequest request, bool requireMaxTokens)
{
if (request.Messages is null || request.Messages.Length == 0)
return "messages: at least one message is required";

if (request.Messages.Length > RequestValidator.MaxMessages)
return $"messages: array exceeds maximum of {RequestValidator.MaxMessages}";

if (requireMaxTokens)
{
if (!request.MaxTokens.HasValue)
return "max_tokens: field required";
if (request.MaxTokens.Value <= 0)
return "max_tokens: must be a positive integer";
}

return null;
}
Comment on lines +168 to +170
httpContext.Response.ContentType = "text/event-stream";
httpContext.Response.Headers.CacheControl = "no-cache";
httpContext.Response.Headers.Connection = "keep-alive";
Comment thread docs/ANTHROPIC_API.md
Comment on lines +41 to +44
"tool_choice": {"type": "auto"},
"stream": false,
"lora_adapter": "customer-support"
}
Comment on lines +274 to +278
private static JsonElement EmptyObject()
{
using var doc = JsonDocument.Parse("{}");
return doc.RootElement.Clone();
}
Comment on lines +156 to +166
private static async Task HandleStreamingAsync(
AnthropicMessagesRequest request,
TextGenerator generator,
ServerState state,
HttpContext httpContext,
string prompt,
DotLLM.Core.Configuration.InferenceOptions options,
string messageId, string modelId,
ToolDefinition[]? tools,
int promptTokenCount,
CancellationToken ct)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

server: Anthropic-compatible Messages API (/v1/messages) layer

2 participants