AgentKit.swift

A provider-agnostic agent orchestration framework for Apple platforms.

Build multi-agent systems with typed inter-agent messaging, declarative pipelines, tool calling, memory, persistence, and first-class observability — all designed around Swift 6 strict concurrency.

let pipeline = Pipeline("content-generation") {
    Parallel {
        MusicResearcherAgent()
        NewsResearcherAgent()
    }
    Step(ProducerAgent())
    Step(DJAgent())
}

let session = try await pipeline.run(context: context)

Requirements

Swift 6.0+
macOS 15+ / iOS 18+
Xcode 16+

Installation

Add AgentKit.swift to your Package.swift:

dependencies: [
    .package(url: "https://github.com/your-org/AgentKit.swift.git", from: "0.1.0"),
]

Import the modules you need:

// Core framework (zero external dependencies)
.product(name: "AgentKit", package: "AgentKit.swift"),

// Provider bridges (pick one or both)
.product(name: "AgentKitAnthropic", package: "AgentKit.swift"),
.product(name: "AgentKitOpenAI", package: "AgentKit.swift"),

// Persistence (GRDB-backed)
.product(name: "AgentKitGRDB", package: "AgentKit.swift"),

// Test helpers (mock providers, fixtures, assertions)
.product(name: "AgentKitTesting", package: "AgentKit.swift"),

Architecture

┌──────────────────────────────────────────────────────────┐
│                      Your Application                     │
├──────────────────────────────────────────────────────────┤
│  Pipeline DSL  │  Orchestrators  │  AgentRunner          │
├──────────────────────────────────────────────────────────┤
│                    AgentKit (Core)                        │
│  Agent · Tool · Message · Provider · Memory · Metrics    │
├──────────┬───────────┬────────────┬──────────────────────┤
│ Anthropic│  OpenAI   │    GRDB    │      Testing         │
│ Provider │  Provider │ Persistence│   Mock Providers     │
└──────────┴───────────┴────────────┴──────────────────────┘

AgentKit is the core module with zero external dependencies. It defines all protocols, value types, the runner, orchestration patterns, and observability. Provider bridges, persistence, and testing are separate modules you opt into.

Quick Start

1. Define a Tool

Tools give agents the ability to interact with the outside world. Define typed inputs and outputs — the framework handles JSON Schema generation and serialization.

import AgentKit

struct WeatherTool: Tool {
    static let name = "get_weather"
    static let description = "Get current weather for a city"

    struct Input: ToolInput {
        let city: String
    }

    struct Output: ToolOutput {
        let temperature: Double
        let condition: String
    }

    func call(_ input: Input) async throws -> Output {
        // Your implementation here
        Output(temperature: 72.0, condition: "sunny")
    }
}

2. Define an Agent

An agent declares its identity, model preference, tools, system prompt, and prompt-building logic. The AgentRunner handles the agentic loop (LLM call, tool execution, iteration).

struct WeatherAgent: Agent {
    static let id = "weather_agent"
    static let displayName = "Weather Agent"

    let model: ModelPreference = .balanced
    let systemPrompt = "You are a helpful weather assistant."

    let tools = ToolSet {
        AnyTool(WeatherTool(), inputSchema: .object(
            properties: ["city": .string("City name")],
            required: ["city"]
        ))
    }

    func buildPrompt(context: AgentContext) async throws -> String {
        "What's the weather in San Francisco?"
    }
}

3. Run the Agent

import AgentKitAnthropic

let provider = AnthropicProvider(configuration: AnthropicConfiguration(
    apiKey: "sk-ant-..."
))

let context = AgentContext(provider: provider)
let runner = AgentRunner(maxIterations: 10)
let result = try await runner.run(agent: WeatherAgent(), context: context)

switch result {
case .finalOutput(let message, let metrics):
    print(message.text)
    print("Tokens used: \(metrics.totalTokenUsage.totalTokens)")
case .handoff(let request):
    print("Handed off to: \(request.targetAgentID)")
case .requiresInput(let prompt):
    print("Needs input: \(prompt)")
}
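
The loop that AgentRunner is described as handling (LLM call, tool execution, iteration) can be pictured with a small standalone sketch. Everything below is a hypothetical stand-in written for illustration: `ModelReply`, `LoopError`, and `runLoop` are not AgentKit types, and the real runner may differ in how it records tool results and reports failures.

```swift
// Hypothetical stand-ins for illustration; none of these are AgentKit types.
enum ModelReply {
    case text(String)                              // the model produced a final answer
    case toolCall(name: String, argument: String)  // the model asked for a tool run
}

enum LoopError: Error {
    case maxIterationsExceeded(Int)
    case toolNotFound(String)
}

/// Alternates between calling the model and executing the tool it requests,
/// until the model answers in plain text or the iteration limit is reached.
func runLoop(
    prompt: String,
    maxIterations: Int,
    model: @Sendable ([String]) async throws -> ModelReply,
    tools: [String: @Sendable (String) async throws -> String]
) async throws -> String {
    var transcript = [prompt]
    for _ in 0..<maxIterations {
        switch try await model(transcript) {
        case .text(let answer):
            return answer
        case .toolCall(let name, let argument):
            guard let tool = tools[name] else {
                throw LoopError.toolNotFound(name)
            }
            // Feed the tool result back so the next model call can use it.
            let output = try await tool(argument)
            transcript.append(output)
        }
    }
    throw LoopError.maxIterationsExceeded(maxIterations)
}

// A scripted "model": request the tool first, then answer from its result.
let answer = try await runLoop(
    prompt: "Weather in Paris?",
    maxIterations: 3,
    model: { transcript in
        if transcript.count == 1 {
            return .toolCall(name: "get_weather", argument: "Paris")
        }
        return .text("Forecast: \(transcript.last ?? "unknown")")
    },
    tools: ["get_weather": { city in "sunny in \(city)" }]
)
print(answer)
```

The iteration cap is what keeps a model that never stops calling tools from looping forever, which is the role `maxIterations` plays in the runner above.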

Core Concepts

Agents

The Agent protocol is the central abstraction. Every agent declares:

Property	Required	Default	Description
`id`	Yes	—	Unique identifier (static)
`displayName`	Yes	—	Human-readable name (static)
`model`	Yes	—	Model preference (.cheapest, .balanced, .creative, .best, .specific)
`systemPrompt`	Yes	—	System prompt defining the agent's role
`buildPrompt(context:)`	Yes	—	Builds the user prompt for each execution
`tools`	No	`.empty`	Tools available to the agent
`executionPolicy`	No	`.immediate`	How requests are dispatched
`cachePolicy`	No	`.none`	Prompt caching configuration
`memory`	No	`nil`	Memory retrieval configuration
`runCondition`	No	`.always`	Condition that must be met to run

Typed Agents

For agents that produce structured output, conform to TypedAgent or BatchTypedAgent:

struct RecommendationAgent: TypedAgent {
    static let id = "recommender"
    static let displayName = "Recommender"

    typealias Output = [Recommendation]

    let model: ModelPreference = .balanced
    let systemPrompt = "You return JSON arrays of recommendations."

    func buildPrompt(context: AgentContext) async throws -> String {
        "Recommend 5 jazz albums."
    }

    func process(output: [Recommendation], context: AgentContext) async throws {
        for rec in output {
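            // MusicRecommendation requires both title and artist;
            // this assumes Recommendation carries an artist field as well.
            try await context.publish(MusicRecommendation(title: rec.title, artist: rec.artist))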
        }
    }
}

Tools

Tools conform to the Tool protocol with typed Input and Output. Wrap them in AnyTool for type erasure, and compose them with the ToolSet result builder:

let tools = ToolSet {
    AnyTool(SearchTool(), inputSchema: .object(
        properties: ["query": .string("Search query")],
        required: ["query"]
    ))
    AnyTool(FetchTool(), inputSchema: .object(
        properties: [
            "url": .string("URL to fetch"),
            "format": .enumeration(["json", "text"], "Response format"),
        ],
        required: ["url"]
    ))
}

Inter-Agent Messaging

Agents communicate through typed messages. Define a message type conforming to AgentMessage, publish it from one agent, and query it from another:

struct MusicRecommendation: AgentMessage {
    static let channel = "music.recommendations"
    var id = UUID()
    var priority: MessagePriority = .normal
    var expiresAt: Date? = nil

    let title: String
    let artist: String
}

// Publishing (in one agent's buildPrompt or process):
try await context.publish(MusicRecommendation(title: "Kind of Blue", artist: "Miles Davis"))

// Querying (in another agent's buildPrompt):
let recs = try await context.messages(
    MusicRecommendation.self,
    where: .status(.pending),
    limit: 10
)

Messages support priority ordering (.low, .normal, .high, .critical), status lifecycle (.pending → .consumed / .expired / .failed), and expiration.
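
To make the priority, status, and expiration rules concrete, here is a minimal in-memory sketch. It is not AgentKit's message store: `Priority`, `Status`, `Envelope`, and `Mailbox` are hypothetical names, and the sketch assumes that expired messages are skipped and that higher priorities are delivered first.

```swift
import Foundation

// Hypothetical stand-ins for illustration; AgentKit's store types may differ.
enum Priority: Int, Comparable {
    case low, normal, high, critical
    static func < (lhs: Priority, rhs: Priority) -> Bool { lhs.rawValue < rhs.rawValue }
}

enum Status { case pending, consumed, expired }

struct Envelope {
    let body: String
    let priority: Priority
    let expiresAt: Date?
    var status: Status = .pending
}

struct Mailbox {
    private(set) var envelopes: [Envelope] = []

    mutating func publish(_ envelope: Envelope) { envelopes.append(envelope) }

    /// Marks overdue pending messages as expired, then returns up to `limit`
    /// pending messages, highest priority first, marking them consumed.
    mutating func consume(limit: Int, now: Date = Date()) -> [Envelope] {
        for i in envelopes.indices where envelopes[i].status == .pending {
            if let deadline = envelopes[i].expiresAt, deadline <= now {
                envelopes[i].status = .expired
            }
        }
        let picked = envelopes.indices
            .filter { envelopes[$0].status == .pending }
            .sorted { envelopes[$0].priority > envelopes[$1].priority }
            .prefix(limit)
        for i in picked { envelopes[i].status = .consumed }
        return picked.map { envelopes[$0] }
    }
}

var mailbox = Mailbox()
let now = Date()
mailbox.publish(Envelope(body: "routine", priority: .normal, expiresAt: nil))
mailbox.publish(Envelope(body: "stale", priority: .critical, expiresAt: now.addingTimeInterval(-60)))
mailbox.publish(Envelope(body: "urgent", priority: .high, expiresAt: nil))

let delivered = mailbox.consume(limit: 10, now: now).map(\.body)
print(delivered)  // ["urgent", "routine"]; "stale" expired before delivery
```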

Providers

The framework ships with two provider bridges. Both conform to LLMProvider and support streaming, tool calling, vision, and structured output.

Anthropic (Claude)

import AgentKitAnthropic

let provider = AnthropicProvider(configuration: AnthropicConfiguration(
    apiKey: "sk-ant-...",
    defaultModel: "claude-sonnet-4-20250514",
    enablePromptCaching: true
))

Capabilities: streaming, tool calling, structured output, vision, prompt caching, extended thinking, batch API.

OpenAI (GPT)

import AgentKitOpenAI

let provider = OpenAIProvider(configuration: OpenAIConfiguration(
    apiKey: "sk-...",
    defaultModel: "gpt-4o"
))

Capabilities: streaming, tool calling, structured output, vision.

Provider Routing

Route requests to different providers based on model preference, with automatic fallback:

let router = ProviderRouter(
    providers: [
        NamedProvider(name: "anthropic", provider: anthropicProvider),
        NamedProvider(name: "openai", provider: openaiProvider),
    ],
    routing: { preference in
        switch preference {
        case .cheapest:
            return [ProviderRoute(providerName: "anthropic", model: .claude4Haiku)]
        case .best:
            return [
                ProviderRoute(providerName: "anthropic", model: .claude4Opus),
                ProviderRoute(providerName: "openai", model: .gpt4o), // fallback
            ]
        default:
            return [ProviderRoute(providerName: "anthropic", model: .claude4Sonnet)]
        }
    }
)

// Use the router as the provider — it conforms to LLMProvider
let context = AgentContext(provider: router)
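
The fallback behavior described above (try each route in order, move on when one fails) can be sketched without the framework. The names `Route`, `AllRoutesFailed`, and `complete` are hypothetical, and providers are modeled as plain closures; this is one plausible reading of "automatic fallback", not ProviderRouter's actual implementation.

```swift
// Hypothetical stand-ins for illustration; not AgentKit's ProviderRouter.
struct Route {
    let providerName: String
    let model: String
}

struct AllRoutesFailed: Error {
    let attempts: [String]
}

/// Tries each route in order and returns the first successful completion.
/// Failures are collected so the final error explains what was attempted.
func complete(
    prompt: String,
    routes: [Route],
    providers: [String: @Sendable (_ model: String, _ prompt: String) async throws -> String]
) async throws -> String {
    var attempts: [String] = []
    for route in routes {
        guard let call = providers[route.providerName] else {
            attempts.append("\(route.providerName): not registered")
            continue
        }
        do {
            return try await call(route.model, prompt)
        } catch {
            attempts.append("\(route.providerName): \(error)")
        }
    }
    throw AllRoutesFailed(attempts: attempts)
}

struct Outage: Error {}

// The primary provider fails, so the request falls through to the backup.
let reply = try await complete(
    prompt: "Hello",
    routes: [
        Route(providerName: "primary", model: "large"),
        Route(providerName: "backup", model: "medium"),
    ],
    providers: [
        "primary": { _, _ in throw Outage() },
        "backup": { model, prompt in "\(model) says: \(prompt)" },
    ]
)
print(reply)
```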

Model Preferences

Agents declare a ModelPreference rather than a specific model, allowing the runtime to resolve the best available option:

Preference	Intent
`.cheapest`	Minimize cost (e.g., Haiku, GPT-4o Mini)
`.balanced`	Good quality at reasonable cost (e.g., Sonnet)
`.creative`	Optimized for creative tasks
`.best`	Maximum quality (e.g., Opus, GPT-4o)
`.specific(ModelID)`	Exact model by ID

Built-in model IDs: .claude4Opus, .claude4Sonnet, .claude4Haiku, .gpt4o, .gpt4oMini, .o1, .o3Mini, .foundationModels.

Pipelines

The Pipeline DSL provides a declarative way to compose agents into multi-step workflows:

let pipeline = Pipeline("content-pipeline") {
    // Phase 1: Research (parallel)
    Parallel {
        MusicResearcherAgent()
        NewsResearcherAgent()
    }

    // Phase 2: Produce script (sequential)
    Step(ProducerAgent())

    // Phase 3: Generate final output
    Step(DJAgent())
}
.checkpoint(.afterEachStep)
.costBudget(.perRun(cents: 50))

let session = try await pipeline.run(context: context)
print("Total cost: \(session.metrics.estimatedCost(calculator: .default))")

Pipeline Components

Component	Description
`Step(agent)`	Run a single agent sequentially
`Parallel { ... }`	Run multiple agents concurrently via TaskGroup
`Progressive(agent, lookahead: n)`	Run an agent `n` times (for iterative processing)

Pipeline Modifiers

Modifier	Description
`.checkpoint(.afterEachStep)`	Save execution state after each step
`.trace(.enabled(level: .info))`	Enable execution tracing
`.costBudget(.perRun(cents: 50))`	Set a cost budget for the pipeline

Orchestration Patterns

For more control than the Pipeline DSL, use orchestrators directly:

Chain

Sequential execution where output flows from one agent to the next:

let chain = ChainOrchestrator(agents: [researchAgent, writerAgent, editorAgent])
let result = try await chain.run(input: "Write about Swift concurrency", context: context)

Parallel

Concurrent execution with optional concurrency limits:

let parallel = ParallelOrchestrator(
    agents: [searchAgent, fetchAgent, analyzeAgent],
    maxConcurrency: 2
)
let results = try await parallel.run(input: "Analyze market trends", context: context)
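
A concurrency limit like `maxConcurrency` is commonly built on a task group by seeding it with the first few tasks and adding one more each time a task finishes. The sketch below shows that pattern with the standard `withThrowingTaskGroup` API; `boundedMap` is a hypothetical helper, and whether ParallelOrchestrator works this way internally is an assumption.

```swift
/// Runs `operation` over `inputs` with at most `maxConcurrency` child tasks
/// in flight at once, returning the results in input order.
func boundedMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    maxConcurrency: Int,
    operation: @escaping @Sendable (Input) async throws -> Output
) async throws -> [Output] {
    precondition(maxConcurrency > 0, "maxConcurrency must be positive")
    return try await withThrowingTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: inputs.count)
        var nextIndex = 0

        // Seed the group with the first batch of tasks.
        while nextIndex < min(maxConcurrency, inputs.count) {
            let index = nextIndex
            group.addTask {
                let output = try await operation(inputs[index])
                return (index, output)
            }
            nextIndex += 1
        }

        // Each finished task frees a slot for the next pending input.
        while let (finished, output) = try await group.next() {
            results[finished] = output
            if nextIndex < inputs.count {
                let index = nextIndex
                group.addTask {
                    let output = try await operation(inputs[index])
                    return (index, output)
                }
                nextIndex += 1
            }
        }
        return results.compactMap { $0 }
    }
}

let lengths = try await boundedMap(["search", "fetch", "analyze"], maxConcurrency: 2) { name in
    name.count
}
print(lengths)  // [6, 5, 7]
```

Results are written back by index, so the output order matches the input order even though tasks may finish in any order.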

Handoff

Agent-to-agent delegation — an agent can hand off to another agent, which can hand off to another, and so on:

let handoff = HandoffOrchestrator(
    agents: [triageAgent, billingAgent, supportAgent],
    entryAgentID: "triage",
    maxHandoffs: 5
)
let result = try await handoff.run(input: "I need help with my bill", context: context)

Refinement

A generate-evaluate-refine loop between two agents:

let refinement = RefinementOrchestrator(
    generator: writerAgent,
    evaluator: criticAgent,
    maxRefinements: 3
)
let result = try await refinement.run(input: "Draft a product announcement", context: context)
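
The generate-evaluate-refine loop can be illustrated with two closures standing in for the generator and evaluator agents. `Verdict` and `refine` are hypothetical names, and the sketch assumes the evaluator either accepts a draft or returns feedback that is passed to the next generation; the real orchestrator may exchange richer data.

```swift
// Hypothetical stand-ins for illustration; not AgentKit's RefinementOrchestrator.
enum Verdict {
    case accepted
    case needsWork(feedback: String)
}

/// Generates a draft, then alternates evaluation and regeneration until the
/// evaluator accepts or `maxRefinements` regenerations have been used.
func refine(
    input: String,
    maxRefinements: Int,
    generate: @Sendable (_ task: String, _ feedback: String?) async throws -> String,
    evaluate: @Sendable (_ draft: String) async throws -> Verdict
) async throws -> (draft: String, rounds: Int) {
    var draft = try await generate(input, nil)
    for round in 0..<maxRefinements {
        switch try await evaluate(draft) {
        case .accepted:
            return (draft, round)
        case .needsWork(let feedback):
            draft = try await generate(input, feedback)
        }
    }
    // Budget exhausted: return the latest draft without a final evaluation.
    return (draft, maxRefinements)
}

let outcome = try await refine(
    input: "Draft a product announcement",
    maxRefinements: 3,
    generate: { task, feedback in
        feedback.map { "\(task) [revised: \($0)]" } ?? task
    },
    evaluate: { draft in
        draft.contains("revised") ? Verdict.accepted : Verdict.needsWork(feedback: "add detail")
    }
)
print(outcome.draft, outcome.rounds)
```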

Persistence

AgentKitGRDB provides SQLite-backed persistence for all stores, using GRDB with WAL mode for concurrent reads.

import AgentKitGRDB

// Create a database (or use .inMemory() for testing)
let db = try DatabaseManager(path: "path/to/agents.sqlite")

// Create stores
let messageStore = GRDBMessageStore(databaseManager: db)
let checkpointStore = GRDBCheckpointStore(databaseManager: db)
let memoryStore = GRDBMemoryStore(databaseManager: db)
let metricsStore = GRDBMetricsStore(databaseManager: db)

// Wire into context
let context = AgentContext(
    provider: provider,
    messageStore: messageStore,
    checkpointStore: checkpointStore,
    memoryStore: memoryStore
)

Stores

Store	Protocol	Description
`GRDBMessageStore`	`MessageStore`	Typed inter-agent messages with priority, status, and expiration
`GRDBCheckpointStore`	`CheckpointStore`	Agent execution snapshots for checkpoint-resume
`GRDBMemoryStore`	`MemoryStore`	Episodic and core memory with tag-based retrieval
`GRDBMetricsStore`	—	Step-level execution metrics with aggregation queries

The message, checkpoint, and memory stores also have in-memory implementations (InMemoryMessageStore, InMemoryCheckpointStore, InMemoryMemoryStore), which are used by default when no persistence is configured.

Memory

Agents can be configured with memory for context-aware behavior across sessions:

struct PersonalAssistant: Agent {
    static let id = "assistant"
    static let displayName = "Personal Assistant"

    let model: ModelPreference = .balanced
    let systemPrompt = "You are a personal assistant with memory of past interactions."

    let memory = MemoryConfig(
        coreMemoryTokenBudget: 500,
        retrievalTokenBudget: 800,
        retrievalStrategy: .tagAndRecency(
            recencyWeight: 0.3,
            relevanceWeight: 0.5,
            importanceWeight: 0.2
        ),
        consolidation: .afterSession(summarizeWith: .cheapest)
    )

    func buildPrompt(context: AgentContext) async throws -> String {
        let memories = try await context.retrieveMemories(tags: ["preference"], limit: 5)
        let memoryText = memories.map(\.content).joined(separator: "\n")
        return "Past context:\n\(memoryText)\n\nUser request: ..."
    }
}

Record memories during processing:

try await context.recordMemory(EpisodicMemory(
    content: "User prefers jazz over classical music",
    importance: 0.8,
    tags: ["music", "preference"]
))
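
The three weights in `.tagAndRecency` suggest a weighted sum over recency, relevance, and importance. The sketch below shows one way such a score could be computed. The scoring formulas are assumptions made for illustration (recency halves every 24 hours, relevance is the fraction of query tags a memory carries), and `MemoryRecord`, `Weights`, and `score` are hypothetical names, not AgentKit API.

```swift
import Foundation

// Hypothetical stand-ins for illustration; not AgentKit's memory types.
struct MemoryRecord {
    let content: String
    let tags: Set<String>
    let importance: Double   // expected in 0...1
    let ageInHours: Double
}

struct Weights {
    var recency = 0.3
    var relevance = 0.5
    var importance = 0.2
}

func score(_ memory: MemoryRecord, queryTags: Set<String>, weights: Weights = Weights()) -> Double {
    // Assumption: recency decays by half every 24 hours.
    let recency = pow(0.5, memory.ageInHours / 24)
    // Assumption: relevance is the share of query tags present on the memory.
    let matched = memory.tags.intersection(queryTags).count
    let relevance = queryTags.isEmpty ? 0 : Double(matched) / Double(queryTags.count)
    return weights.recency * recency
        + weights.relevance * relevance
        + weights.importance * memory.importance
}

let jazz = MemoryRecord(
    content: "User prefers jazz over classical music",
    tags: ["music", "preference"], importance: 0.8, ageInHours: 24
)
let weather = MemoryRecord(
    content: "User asked about the weather",
    tags: ["weather"], importance: 0.2, ageInHours: 0
)

// jazz:    0.3 * 0.5 + 0.5 * 1.0 + 0.2 * 0.8 = 0.81
// weather: 0.3 * 1.0 + 0.5 * 0.0 + 0.2 * 0.2 = 0.34
let ranked = [weather, jazz].sorted {
    score($0, queryTags: ["preference"]) > score($1, queryTags: ["preference"])
}
print(ranked.map(\.content))
```

With these weights, a day-old memory that matches the query tag outranks a fresh but unrelated one, because relevance carries the largest weight.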

Metrics and Cost Tracking

Every agent run accumulates step-level metrics:

let result = try await runner.run(agent: myAgent, context: context)

if case .finalOutput(_, let metrics) = result {
    print("Steps: \(metrics.stepCount)")
    print("Total tokens: \(metrics.totalTokenUsage.totalTokens)")
    print("Cached tokens: \(metrics.totalTokenUsage.cachedInputTokens)")
    print("Latency: \(metrics.totalLatency)s")
    print("Estimated cost: \(metrics.estimatedCost(calculator: .default))")
}

The CostCalculator includes default pricing for Claude and GPT models. Customize it for your own pricing:

let calculator = CostCalculator(pricing: [
    .claude4Sonnet: ModelPricing(inputPer1M: 3.0, outputPer1M: 15.0),
    .gpt4o: ModelPricing(inputPer1M: 2.5, outputPer1M: 10.0),
])

Set cost budgets on pipelines to prevent runaway costs:

let pipeline = Pipeline("expensive-pipeline") { ... }
    .costBudget(.perRun(cents: 100))    // Max $1.00 per run
    .costBudget(.daily(cents: 1000))    // Max $10.00 per day
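
Per-million-token pricing and a cents-based budget reduce to simple arithmetic. The sketch below works through one example using the Sonnet figures from the pricing table above; `Pricing` and `costInDollars` are hypothetical names, and the real CostCalculator may also account for cached tokens.

```swift
// Hypothetical stand-ins for illustration; not AgentKit's CostCalculator.
struct Pricing {
    let inputPer1M: Double    // dollars per one million input tokens
    let outputPer1M: Double   // dollars per one million output tokens
}

func costInDollars(inputTokens: Int, outputTokens: Int, pricing: Pricing) -> Double {
    let inputCost = Double(inputTokens) / 1_000_000 * pricing.inputPer1M
    let outputCost = Double(outputTokens) / 1_000_000 * pricing.outputPer1M
    return inputCost + outputCost
}

let sonnet = Pricing(inputPer1M: 3.0, outputPer1M: 15.0)
let cost = costInDollars(inputTokens: 200_000, outputTokens: 50_000, pricing: sonnet)
// 0.2 * 3.00 + 0.05 * 15.00 = 0.60 + 0.75 = 1.35 dollars

let budgetCents = 100.0
let exceedsBudget = cost * 100 > budgetCents
print(cost, exceedsBudget)
```

In this example a run that used 200,000 input tokens and 50,000 output tokens would cost about $1.35 and so would exceed a 100-cent per-run budget.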

Observability

Logging

Structured logging via OSLog with the com.swift-agents subsystem:

let logger = AgentLogger(agentID: "music-researcher")
logger.info("Starting research phase")
logger.debug("Found \(count) results")
logger.error("Search API returned 429")

View logs in Console.app by filtering on subsystem com.swift-agents.

Instruments

AgentSignposts provides os_signpost intervals for profiling in Instruments:

AgentRun — Full agent execution lifecycle
LLMCall — Individual LLM completion requests
ToolExecution — Individual tool invocations

Tracing

The AgentTracer protocol enables custom trace collection:

let tracer = InMemoryTracer()

// After runs, query traces:
let traces = await tracer.traces(since: Date().addingTimeInterval(-3600))
print("Cache hit rate: \(traces.cacheHitRate)")
print("Total cost: \(traces.totalCost)")

Testing

AgentKitTesting provides a complete mock infrastructure:

import AgentKitTesting

// Create a mock provider with canned responses
let provider = MockLLMProvider(responses: [
    .text("Here are my recommendations..."),
    .toolCall(name: "search", arguments: "{\"query\": \"jazz\"}"),
    .text("Based on the search results..."),
])

// Build a test context with all dependencies mocked
let context = try await TestContext.make(
    provider: provider,
    messageStore: InMemoryMessageStore(),
    checkpointStore: InMemoryCheckpointStore()
)

// Run your agent
let runner = AgentRunner()
let result = try await runner.run(agent: MyAgent(), context: context)

// Assert results
assertFinalOutput(result, contains: "recommendations")

if let (message, metrics) = extractFinalOutput(result) {
    #expect(message.text.contains("jazz"))
    assertMetrics(metrics, stepCount: 3)
}

// Inspect recorded requests
let requests = await provider.recordedRequests
#expect(requests.count == 3)

Mock Responses

Mock	Description
`.text("...")`	Simple text response
`.toolCall(name:arguments:)`	Simulates a tool call from the LLM
`.json(codable)`	JSON-encoded response
`.raw(CompletionResponse)`	Full custom response

Seed Messages

Pre-populate the message store for testing agents that depend on inter-agent messages:

let context = try await TestContext.make(
    provider: provider,
    seedMessages: [
        MusicRecommendation(title: "Kind of Blue", artist: "Miles Davis"),
        MusicRecommendation(title: "A Love Supreme", artist: "John Coltrane"),
    ]
)

Streaming

All providers support streaming via AsyncThrowingStream:

let stream = provider.stream(request)

for try await chunk in stream {
    switch chunk {
    case .delta(let textDelta):
        print(textDelta.text, terminator: "")
    case .toolCallDelta(let delta):
        print("Tool call: \(delta.name ?? "building arguments...")")
    case .usage(let usage):
        print("Tokens: \(usage.totalTokens)")
    case .finished(let reason):
        print("Done: \(reason)")
    }
}
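
For readers new to `AsyncThrowingStream`, the sketch below shows both sides of such a stream using only the Swift standard library: a producer that yields chunks through a continuation, and a consumer that assembles the text. `Chunk` and `makeStream` are hypothetical and much simpler than the chunk cases shown above.

```swift
// Hypothetical stand-ins for illustration; not AgentKit's streaming types.
enum Chunk: Sendable {
    case delta(String)
    case finished(reason: String)
}

/// Produces one `.delta` per word, then a `.finished` chunk.
func makeStream(words: [String]) -> AsyncThrowingStream<Chunk, Error> {
    AsyncThrowingStream { continuation in
        let task = Task {
            do {
                for word in words {
                    try Task.checkCancellation()
                    continuation.yield(.delta(word))
                }
                continuation.yield(.finished(reason: "stop"))
                continuation.finish()
            } catch {
                continuation.finish(throwing: error)
            }
        }
        // Stop producing if the consumer stops iterating early.
        continuation.onTermination = { _ in task.cancel() }
    }
}

var text = ""
var finishReason: String?
for try await chunk in makeStream(words: ["Hello", ", ", "world"]) {
    switch chunk {
    case .delta(let piece):
        text += piece
    case .finished(let reason):
        finishReason = reason
    }
}
print(text, finishReason ?? "none")
```

Wiring `onTermination` to task cancellation matters for LLM streams in particular, since abandoning a response early should stop the underlying request rather than let it run to completion.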

Error Handling

All errors are represented by AgentError:

Error	Description
`.providerError(underlying:requestID:)`	LLM API error
`.toolExecutionFailed(toolName:underlying:)`	Tool threw an error
`.maxIterationsExceeded(Int)`	Agent loop hit the iteration limit
`.costBudgetExceeded(budgetCents:actualCents:)`	Cost budget exceeded
`.toolNotFound(String)`	LLM called a tool that doesn't exist
`.checkpointFailed(underlying:)`	Checkpoint save/load error
`.invalidConfiguration(String)`	Configuration error
`.cancelled`	Task was cancelled
`.handoffFailed(targetAgentID:reason:)`	Agent handoff failed
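
Errors with associated values like these are typically handled with pattern-matching `catch` clauses. The sketch below uses a small stand-in enum, `SketchError`, that mirrors three of the cases in the table; it is hypothetical and only illustrates the matching syntax, not AgentError's exact definition.

```swift
// Hypothetical stand-in that mirrors a few cases from the table above.
enum SketchError: Error {
    case maxIterationsExceeded(Int)
    case costBudgetExceeded(budgetCents: Int, actualCents: Int)
    case toolNotFound(String)
}

/// Runs `work` and converts known failures into a readable summary.
func report(_ work: () throws -> String) -> String {
    do {
        return try work()
    } catch SketchError.maxIterationsExceeded(let limit) {
        return "gave up after \(limit) iterations"
    } catch SketchError.costBudgetExceeded(let budget, let actual) {
        return "spent \(actual) cents against a budget of \(budget)"
    } catch is CancellationError {
        return "cancelled"
    } catch {
        // Any case without a dedicated clause ends up here.
        return "unexpected failure: \(error)"
    }
}

let summary = report {
    throw SketchError.costBudgetExceeded(budgetCents: 100, actualCents: 135)
}
print(summary)
```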

Module Reference

Module	Dependencies	Description
`AgentKit`	None	Core protocols, types, runner, pipeline, orchestration, observability
`AgentKitAnthropic`	SwiftAnthropic	Anthropic Claude provider bridge
`AgentKitOpenAI`	SwiftOpenAI	OpenAI GPT provider bridge
`AgentKitGRDB`	GRDB	SQLite persistence for messages, checkpoints, memory, metrics
`AgentKitTesting`	None	Mock providers, test contexts, assertion helpers

License

MIT License
