A provider-agnostic agent orchestration framework for Apple platforms.
Build multi-agent systems with typed inter-agent messaging, declarative pipelines, tool calling, memory, persistence, and first-class observability — all designed around Swift 6 strict concurrency.
let pipeline = Pipeline("content-generation") {
Parallel {
MusicResearcherAgent()
NewsResearcherAgent()
}
Step(ProducerAgent())
Step(DJAgent())
}
let session = try await pipeline.run(context: context)

- Swift 6.0+
- macOS 15+ / iOS 18+
- Xcode 16+
Add AgentKit.swift to your Package.swift:
dependencies: [
.package(url: "https://github.com/your-org/AgentKit.swift.git", from: "0.1.0"),
]

Then add the products you need to your target's dependencies:
// Core framework (zero external dependencies)
.product(name: "AgentKit", package: "AgentKit.swift"),
// Provider bridges (pick one or both)
.product(name: "AgentKitAnthropic", package: "AgentKit.swift"),
.product(name: "AgentKitOpenAI", package: "AgentKit.swift"),
// Persistence (GRDB-backed)
.product(name: "AgentKitGRDB", package: "AgentKit.swift"),
// Test helpers (mock providers, fixtures, assertions)
.product(name: "AgentKitTesting", package: "AgentKit.swift"),

┌──────────────────────────────────────────────────────────┐
│                     Your Application                     │
├──────────────────────────────────────────────────────────┤
│  Pipeline DSL  │  Orchestrators  │  AgentRunner          │
├──────────────────────────────────────────────────────────┤
│                     AgentKit (Core)                      │
│   Agent · Tool · Message · Provider · Memory · Metrics   │
├──────────┬───────────┬────────────┬──────────────────────┤
│ Anthropic│ OpenAI    │ GRDB       │ Testing              │
│ Provider │ Provider  │ Persistence│ Mock Providers       │
└──────────┴───────────┴────────────┴──────────────────────┘
AgentKit is the core module with zero external dependencies. It defines all protocols, value types, the runner, orchestration patterns, and observability. Provider bridges, persistence, and testing are separate modules you opt into.
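Opting into modules happens in your own package manifest. The following is a minimal sketch of a complete `Package.swift` that pulls in the core module, one provider bridge, and the test helpers. The package and target names (`MyApp`, `MyAppTests`) are placeholders chosen for illustration, and the URL is the placeholder used in the installation snippet above:

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyApp", // placeholder name, not part of AgentKit
    platforms: [.macOS(.v15), .iOS(.v18)],
    dependencies: [
        .package(url: "https://github.com/your-org/AgentKit.swift.git", from: "0.1.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Core plus a single provider bridge
                .product(name: "AgentKit", package: "AgentKit.swift"),
                .product(name: "AgentKitAnthropic", package: "AgentKit.swift"),
            ]
        ),
        .testTarget(
            name: "MyAppTests",
            dependencies: [
                "MyApp",
                // Mock providers stay out of the shipping target
                .product(name: "AgentKitTesting", package: "AgentKit.swift"),
            ]
        ),
    ]
)
```

Keeping `AgentKitTesting` in the test target only means the mocks never ship with the app.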
Tools give agents the ability to interact with the outside world. Define typed inputs and outputs — the framework handles JSON Schema generation and serialization.
import AgentKit
struct WeatherTool: Tool {
static let name = "get_weather"
static let description = "Get current weather for a city"
struct Input: ToolInput {
let city: String
}
struct Output: ToolOutput {
let temperature: Double
let condition: String
}
func call(_ input: Input) async throws -> Output {
// Your implementation here
Output(temperature: 72.0, condition: "sunny")
}
}

An agent declares its identity, model preference, tools, system prompt, and prompt-building logic. The AgentRunner handles the agentic loop (LLM call, tool execution, iteration).
struct WeatherAgent: Agent {
static let id = "weather_agent"
static let displayName = "Weather Agent"
let model: ModelPreference = .balanced
let systemPrompt = "You are a helpful weather assistant."
let tools = ToolSet {
AnyTool(WeatherTool(), inputSchema: .object(
properties: ["city": .string("City name")],
required: ["city"]
))
}
func buildPrompt(context: AgentContext) async throws -> String {
"What's the weather in San Francisco?"
}
}

import AgentKitAnthropic
let provider = AnthropicProvider(configuration: AnthropicConfiguration(
apiKey: "sk-ant-..."
))
let context = AgentContext(provider: provider)
let runner = AgentRunner(maxIterations: 10)
let result = try await runner.run(agent: WeatherAgent(), context: context)
switch result {
case .finalOutput(let message, let metrics):
print(message.text)
print("Tokens used: \(metrics.totalTokenUsage.totalTokens)")
case .handoff(let request):
print("Handed off to: \(request.targetAgentID)")
case .requiresInput(let prompt):
print("Needs input: \(prompt)")
}

The Agent protocol is the central abstraction. Every agent declares:

| Property | Required | Default | Description |
|---|---|---|---|
| `id` | Yes | n/a | Unique identifier (static) |
| `displayName` | Yes | n/a | Human-readable name (static) |
| `model` | Yes | n/a | Model preference (`.cheapest`, `.balanced`, `.creative`, `.best`, `.specific`) |
| `systemPrompt` | Yes | n/a | System prompt defining the agent's role |
| `buildPrompt(context:)` | Yes | n/a | Builds the user prompt for each execution |
| `tools` | No | `.empty` | Tools available to the agent |
| `executionPolicy` | No | `.immediate` | How requests are dispatched |
| `cachePolicy` | No | `.none` | Prompt caching configuration |
| `memory` | No | `nil` | Memory retrieval configuration |
| `runCondition` | No | `.always` | Condition that must be met to run |
For agents that produce structured output, conform to TypedAgent or BatchTypedAgent:
struct RecommendationAgent: TypedAgent {
static let id = "recommender"
static let displayName = "Recommender"
typealias Output = [Recommendation]
let model: ModelPreference = .balanced
let systemPrompt = "You return JSON arrays of recommendations."
func buildPrompt(context: AgentContext) async throws -> String {
"Recommend 5 jazz albums."
}
func process(output: [Recommendation], context: AgentContext) async throws {
for rec in output {
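// MusicRecommendation requires both title and artist; this assumes Recommendation exposes both.
try await context.publish(MusicRecommendation(title: rec.title, artist: rec.artist))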
}
}
}

Tools conform to the Tool protocol with typed Input and Output. Wrap them in AnyTool for type erasure, and compose them with the ToolSet result builder:
let tools = ToolSet {
AnyTool(SearchTool(), inputSchema: .object(
properties: ["query": .string("Search query")],
required: ["query"]
))
AnyTool(FetchTool(), inputSchema: .object(
properties: [
"url": .string("URL to fetch"),
"format": .enumeration(["json", "text"], "Response format"),
],
required: ["url"]
))
}

Agents communicate through typed messages. Define a message type conforming to AgentMessage, publish it from one agent, and query it from another:
import Foundation // provides UUID and Date
struct MusicRecommendation: AgentMessage {
static let channel = "music.recommendations"
var id = UUID()
var priority: MessagePriority = .normal
var expiresAt: Date? = nil
let title: String
let artist: String
}
// Publishing (in one agent's buildPrompt or process):
try await context.publish(MusicRecommendation(title: "Kind of Blue", artist: "Miles Davis"))
// Querying (in another agent's buildPrompt):
let recs = try await context.messages(
MusicRecommendation.self,
where: .status(.pending),
limit: 10
)

Messages support priority ordering (.low, .normal, .high, .critical), status lifecycle (.pending → .consumed / .expired / .failed), and expiration.
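To make the interaction between priority, status, and expiration concrete, here is a small self-contained sketch. It assumes that expired messages are swept before a query and that pending messages are returned highest priority first. The types `Priority`, `Status`, and `QueuedMessage` and the function `pendingMessages` are hypothetical and only mirror the concepts above; the real message store may order or expire messages differently:

```swift
import Foundation

// Hypothetical stand-ins for the message model described above.
enum Priority: Int, Comparable {
    case low, normal, high, critical
    static func < (lhs: Priority, rhs: Priority) -> Bool { lhs.rawValue < rhs.rawValue }
}

enum Status { case pending, consumed, expired, failed }

struct QueuedMessage {
    let title: String
    let priority: Priority
    let expiresAt: Date?
    var status: Status = .pending
}

/// Marks overdue pending messages as expired, then returns the remaining
/// pending ones with the highest priority first.
func pendingMessages(in queue: [QueuedMessage], now: Date, limit: Int) -> [QueuedMessage] {
    let swept = queue.map { message -> QueuedMessage in
        var copy = message
        if copy.status == .pending, let deadline = copy.expiresAt, deadline <= now {
            copy.status = .expired
        }
        return copy
    }
    return Array(
        swept
            .filter { $0.status == .pending }
            .sorted { $0.priority > $1.priority }
            .prefix(limit)
    )
}

let now = Date()
let queue = [
    QueuedMessage(title: "stale", priority: .critical, expiresAt: now.addingTimeInterval(-60)),
    QueuedMessage(title: "routine", priority: .normal, expiresAt: nil),
    QueuedMessage(title: "urgent", priority: .high, expiresAt: now.addingTimeInterval(60)),
]
let pending = pendingMessages(in: queue, now: now, limit: 10)
print(pending.map(\.title)) // the expired critical message is skipped
```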
The framework ships with two provider bridges. Both conform to LLMProvider and support streaming, tool calling, vision, and structured output.
import AgentKitAnthropic
let provider = AnthropicProvider(configuration: AnthropicConfiguration(
apiKey: "sk-ant-...",
defaultModel: "claude-sonnet-4-20250514",
enablePromptCaching: true
))

Capabilities: streaming, tool calling, structured output, vision, prompt caching, extended thinking, batch API.
import AgentKitOpenAI
let provider = OpenAIProvider(configuration: OpenAIConfiguration(
apiKey: "sk-...",
defaultModel: "gpt-4o"
))

Capabilities: streaming, tool calling, structured output, vision.
Route requests to different providers based on model preference, with automatic fallback:
let router = ProviderRouter(
providers: [
NamedProvider(name: "anthropic", provider: anthropicProvider),
NamedProvider(name: "openai", provider: openaiProvider),
],
routing: { preference in
switch preference {
case .cheapest:
return [ProviderRoute(providerName: "anthropic", model: .claude4Haiku)]
case .best:
return [
ProviderRoute(providerName: "anthropic", model: .claude4Opus),
ProviderRoute(providerName: "openai", model: .gpt4o), // fallback
]
default:
return [ProviderRoute(providerName: "anthropic", model: .claude4Sonnet)]
}
}
)
// Use the router as the provider — it conforms to LLMProvider
let context = AgentContext(provider: router)

Agents declare a ModelPreference rather than a specific model, allowing the runtime to resolve the best available option:

| Preference | Intent |
|---|---|
| `.cheapest` | Minimize cost (e.g., Haiku, GPT-4o Mini) |
| `.balanced` | Good quality at reasonable cost (e.g., Sonnet) |
| `.creative` | Optimized for creative tasks |
| `.best` | Maximum quality (e.g., Opus, GPT-4o) |
| `.specific(ModelID)` | Exact model by ID |
Built-in model IDs: .claude4Opus, .claude4Sonnet, .claude4Haiku, .gpt4o, .gpt4oMini, .o1, .o3Mini, .foundationModels.
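The automatic fallback described for the router above amounts to trying each route in order until one succeeds. The following is a minimal, self-contained sketch of that idea. `Route`, `ProviderUnavailable`, and `completeWithFallback` are hypothetical names used for illustration, the model strings are placeholders, and the real ProviderRouter may behave differently (for example in which errors trigger a fallback):

```swift
// Hypothetical stand-ins; the real ProviderRouter may behave differently.
struct Route {
    let providerName: String
    let model: String
}

struct ProviderUnavailable: Error {
    let providerName: String
}

/// Tries each route in order and returns the first successful reply.
/// If every route fails, rethrows the last error seen.
func completeWithFallback(
    routes: [Route],
    send: (Route) async throws -> String
) async throws -> String {
    var lastError: Error = ProviderUnavailable(providerName: "none configured")
    for route in routes {
        do {
            return try await send(route)
        } catch {
            lastError = error // remember the failure and try the next route
        }
    }
    throw lastError
}

let routes = [
    Route(providerName: "anthropic", model: "primary-model"),
    Route(providerName: "openai", model: "fallback-model"),
]

// Simulate an outage at the first provider.
let reply = try await completeWithFallback(routes: routes) { route in
    if route.providerName == "anthropic" {
        throw ProviderUnavailable(providerName: route.providerName)
    }
    return "answered by \(route.providerName)"
}
print(reply)
```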
The Pipeline DSL provides a declarative way to compose agents into multi-step workflows:
let pipeline = Pipeline("content-pipeline") {
// Phase 1: Research (parallel)
Parallel {
MusicResearcherAgent()
NewsResearcherAgent()
}
// Phase 2: Produce script (sequential)
Step(ProducerAgent())
// Phase 3: Generate final output
Step(DJAgent())
}
.checkpoint(.afterEachStep)
.costBudget(.perRun(cents: 50))
let session = try await pipeline.run(context: context)
print("Total cost: \(session.metrics.estimatedCost(calculator: .default))")

| Component | Description |
|---|---|
| `Step(agent)` | Run a single agent sequentially |
| `Parallel { ... }` | Run multiple agents concurrently via TaskGroup |
| `Progressive(agent, lookahead: n)` | Run an agent n times (for iterative processing) |

| Modifier | Description |
|---|---|
| `.checkpoint(.afterEachStep)` | Save execution state after each step |
| `.trace(.enabled(level: .info))` | Enable execution tracing |
| `.costBudget(.perRun(cents: 50))` | Set a cost budget for the pipeline |
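The difference between a parallel phase and a sequential step can be sketched with plain structured concurrency. This example does not use the Pipeline DSL. `Worker` and `runParallel` are hypothetical helpers, and the sketch only assumes that a parallel phase fans out over a task group and hands its joined results to the next step:

```swift
// Hypothetical stand-in: a "worker" here is just a named async function.
struct Worker: Sendable {
    let name: String
    let run: @Sendable (_ input: String) async throws -> String
}

/// Runs every worker concurrently on the same input and returns the
/// outputs in declaration order, regardless of completion order.
func runParallel(_ workers: [Worker], input: String) async throws -> [String] {
    try await withThrowingTaskGroup(of: (Int, String).self) { group in
        for (index, worker) in workers.enumerated() {
            group.addTask {
                let output = try await worker.run(input)
                return (index, output)
            }
        }
        var outputs = [String](repeating: "", count: workers.count)
        for try await (index, output) in group {
            outputs[index] = output
        }
        return outputs
    }
}

let researchers = [
    Worker(name: "music") { topic in "music notes on \(topic)" },
    Worker(name: "news") { topic in "news notes on \(topic)" },
]
let producer = Worker(name: "producer") { notes in "script from [\(notes)]" }

// Phase 1 fans out; phase 2 consumes the joined results sequentially.
let research = try await runParallel(researchers, input: "jazz")
let script = try await producer.run(research.joined(separator: "; "))
print(script)
```

Tagging each result with its index keeps the output order deterministic even though child tasks may finish in any order.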
For more control than the Pipeline DSL, use orchestrators directly:
Sequential execution where output flows from one agent to the next:
let chain = ChainOrchestrator(agents: [researchAgent, writerAgent, editorAgent])
let result = try await chain.run(input: "Write about Swift concurrency", context: context)

Concurrent execution with optional concurrency limits:
let parallel = ParallelOrchestrator(
agents: [searchAgent, fetchAgent, analyzeAgent],
maxConcurrency: 2
)
let results = try await parallel.run(input: "Analyze market trends", context: context)

Agent-to-agent delegation, where an agent can hand off to another agent, which can hand off to another, and so on:
let handoff = HandoffOrchestrator(
agents: [triageAgent, billingAgent, supportAgent],
entryAgentID: "triage",
maxHandoffs: 5
)
let result = try await handoff.run(input: "I need help with my bill", context: context)

A generate-evaluate-refine loop between two agents:
let refinement = RefinementOrchestrator(
generator: writerAgent,
evaluator: criticAgent,
maxRefinements: 3
)
let result = try await refinement.run(input: "Draft a product announcement", context: context)

AgentKitGRDB provides SQLite-backed persistence for all stores, using GRDB with WAL mode for concurrent reads.
import AgentKitGRDB
// Create a database (or use .inMemory() for testing)
let db = try DatabaseManager(path: "path/to/agents.sqlite")
// Create stores
let messageStore = GRDBMessageStore(databaseManager: db)
let checkpointStore = GRDBCheckpointStore(databaseManager: db)
let memoryStore = GRDBMemoryStore(databaseManager: db)
let metricsStore = GRDBMetricsStore(databaseManager: db)
// Wire into context
let context = AgentContext(
provider: provider,
messageStore: messageStore,
checkpointStore: checkpointStore,
memoryStore: memoryStore
)

| Store | Protocol | Description |
|---|---|---|
| `GRDBMessageStore` | `MessageStore` | Typed inter-agent messages with priority, status, and expiration |
| `GRDBCheckpointStore` | `CheckpointStore` | Agent execution snapshots for checkpoint-resume |
| `GRDBMemoryStore` | `MemoryStore` | Episodic and core memory with tag-based retrieval |
| `GRDBMetricsStore` | none | Step-level execution metrics with aggregation queries |
The message, checkpoint, and memory stores also have in-memory implementations (InMemoryMessageStore, InMemoryCheckpointStore, InMemoryMemoryStore), which are used by default when no persistence is configured.
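The "in-memory by default, persistent when configured" arrangement is typically achieved by programming against a store protocol. Below is a minimal sketch of that pattern under stated assumptions: `SnapshotStore`, `InMemorySnapshotStore`, and `Context` are hypothetical names, and AgentKit's actual store protocols have their own requirements:

```swift
// Hypothetical protocol and types; AgentKit's real store protocols may differ.
protocol SnapshotStore: Sendable {
    func save(_ snapshot: String, for agentID: String) async throws
    func latest(for agentID: String) async throws -> String?
}

/// Actor-backed default: state lives only for the lifetime of the process.
actor InMemorySnapshotStore: SnapshotStore {
    private var snapshots: [String: [String]] = [:]

    func save(_ snapshot: String, for agentID: String) {
        snapshots[agentID, default: []].append(snapshot)
    }

    func latest(for agentID: String) -> String? {
        snapshots[agentID]?.last
    }
}

/// Callers depend on the protocol, so a persistent store can be swapped in.
struct Context {
    var snapshotStore: any SnapshotStore = InMemorySnapshotStore()
}

let context = Context()
try await context.snapshotStore.save("step-1", for: "weather_agent")
try await context.snapshotStore.save("step-2", for: "weather_agent")
let latest = try await context.snapshotStore.latest(for: "weather_agent")
let missing = try await context.snapshotStore.latest(for: "unknown_agent")
print(latest ?? "none")
```

Using an actor for the in-memory variant serializes access to the dictionary, which keeps the sketch safe under concurrent callers.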
Agents can be configured with memory for context-aware behavior across sessions:
struct PersonalAssistant: Agent {
static let id = "assistant"
static let displayName = "Personal Assistant"
let model: ModelPreference = .balanced
let systemPrompt = "You are a personal assistant with memory of past interactions."
let memory = MemoryConfig(
coreMemoryTokenBudget: 500,
retrievalTokenBudget: 800,
retrievalStrategy: .tagAndRecency(
recencyWeight: 0.3,
relevanceWeight: 0.5,
importanceWeight: 0.2
),
consolidation: .afterSession(summarizeWith: .cheapest)
)
func buildPrompt(context: AgentContext) async throws -> String {
let memories = try await context.retrieveMemories(tags: ["preference"], limit: 5)
let memoryText = memories.map(\.content).joined(separator: "\n")
return "Past context:\n\(memoryText)\n\nUser request: ..."
}
}

Record memories during processing:
try await context.recordMemory(EpisodicMemory(
content: "User prefers jazz over classical music",
importance: 0.8,
tags: ["music", "preference"]
))

Every agent run accumulates step-level metrics:
let result = try await runner.run(agent: myAgent, context: context)
if case .finalOutput(_, let metrics) = result {
print("Steps: \(metrics.stepCount)")
print("Total tokens: \(metrics.totalTokenUsage.totalTokens)")
print("Cached tokens: \(metrics.totalTokenUsage.cachedInputTokens)")
print("Latency: \(metrics.totalLatency)s")
print("Estimated cost: \(metrics.estimatedCost(calculator: .default))")
}

The CostCalculator includes default pricing for Claude and GPT models. Customize it for your own pricing:
let calculator = CostCalculator(pricing: [
.claude4Sonnet: ModelPricing(inputPer1M: 3.0, outputPer1M: 15.0),
.gpt4o: ModelPricing(inputPer1M: 2.5, outputPer1M: 10.0),
])

Set cost budgets on pipelines to prevent runaway costs:
let pipeline = Pipeline("expensive-pipeline") { ... }
.costBudget(.perRun(cents: 100)) // Max $1.00 per run
.costBudget(.daily(cents: 1000)) // Max $10.00 per day

Structured logging via OSLog with the com.swift-agents subsystem:
let logger = AgentLogger(agentID: "music-researcher")
logger.info("Starting research phase")
logger.debug("Found \(count) results")
logger.error("Search API returned 429")

View logs in Console.app by filtering on subsystem com.swift-agents.
AgentSignposts provides os_signpost intervals for profiling in Instruments:
- AgentRun — Full agent execution lifecycle
- LLMCall — Individual LLM completion requests
- ToolExecution — Individual tool invocations
The AgentTracer protocol enables custom trace collection:
let tracer = InMemoryTracer()
// After runs, query traces:
let traces = await tracer.traces(since: Date().addingTimeInterval(-3600))
print("Cache hit rate: \(traces.cacheHitRate)")
print("Total cost: \(traces.totalCost)")

AgentKitTesting provides a complete mock infrastructure:
import AgentKitTesting
// Create a mock provider with canned responses
let provider = MockLLMProvider(responses: [
.text("Here are my recommendations..."),
.toolCall(name: "search", arguments: "{\"query\": \"jazz\"}"),
.text("Based on the search results..."),
])
// Build a test context with all dependencies mocked
let context = TestContext.make(
provider: provider,
messageStore: InMemoryMessageStore(),
checkpointStore: InMemoryCheckpointStore()
)
// Run your agent
let runner = AgentRunner()
let result = try await runner.run(agent: MyAgent(), context: context)
// Assert results
assertFinalOutput(result, contains: "recommendations")
if let (message, metrics) = extractFinalOutput(result) {
#expect(message.text.contains("jazz"))
assertMetrics(metrics, stepCount: 3)
}
// Inspect recorded requests
let requests = await provider.recordedRequests
#expect(requests.count == 3)

| Mock | Description |
|---|---|
| `.text("...")` | Simple text response |
| `.toolCall(name:arguments:)` | Simulates a tool call from the LLM |
| `.json(codable)` | JSON-encoded response |
| `.raw(CompletionResponse)` | Full custom response |
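For readers curious how a scripted mock of this kind can work, here is a self-contained sketch of the general technique: replay canned responses in order and record each request for later assertions. `CannedResponse`, `ScriptExhausted`, and `ScriptedProvider` are hypothetical and written only for this example; they are not the implementation of MockLLMProvider:

```swift
// Hypothetical stand-ins; MockLLMProvider's real API may differ.
enum CannedResponse: Equatable {
    case text(String)
    case toolCall(name: String, arguments: String)
}

struct ScriptExhausted: Error {}

/// Replays a fixed script of responses and records every prompt it receives.
actor ScriptedProvider {
    private var remaining: [CannedResponse]
    private(set) var recordedPrompts: [String] = []

    init(responses: [CannedResponse]) {
        remaining = responses
    }

    func complete(prompt: String) throws -> CannedResponse {
        recordedPrompts.append(prompt)
        guard !remaining.isEmpty else { throw ScriptExhausted() }
        return remaining.removeFirst()
    }
}

let mock = ScriptedProvider(responses: [
    .toolCall(name: "search", arguments: "{\"query\": \"jazz\"}"),
    .text("Based on the search results..."),
])

let first = try await mock.complete(prompt: "Recommend jazz albums")
let second = try await mock.complete(prompt: "Tool result: 3 albums found")
let prompts = await mock.recordedPrompts
print(first, second, prompts.count)
```

Throwing once the script runs out makes a test fail loudly when the agent issues more requests than expected.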
Pre-populate the message store for testing agents that depend on inter-agent messages:
let context = try await TestContext.make(
provider: provider,
seedMessages: [
MusicRecommendation(title: "Kind of Blue", artist: "Miles Davis"),
MusicRecommendation(title: "A Love Supreme", artist: "John Coltrane"),
]
)

All providers support streaming via AsyncThrowingStream:
let stream = provider.stream(request)
for try await chunk in stream {
switch chunk {
case .delta(let textDelta):
print(textDelta.text, terminator: "")
case .toolCallDelta(let delta):
print("Tool call: \(delta.name ?? "building arguments...")")
case .usage(let usage):
print("Tokens: \(usage.totalTokens)")
case .finished(let reason):
print("Done: \(reason)")
}
}

All errors are represented by AgentError:

| Error | Description |
|---|---|
| `.providerError(underlying:requestID:)` | LLM API error |
| `.toolExecutionFailed(toolName:underlying:)` | Tool threw an error |
| `.maxIterationsExceeded(Int)` | Agent loop hit the iteration limit |
| `.costBudgetExceeded(budgetCents:actualCents:)` | Cost budget exceeded |
| `.toolNotFound(String)` | LLM called a tool that doesn't exist |
| `.checkpointFailed(underlying:)` | Checkpoint save/load error |
| `.invalidConfiguration(String)` | Configuration error |
| `.cancelled` | Task was cancelled |
| `.handoffFailed(targetAgentID:reason:)` | Agent handoff failed |
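Because the errors form a single enum, callers can handle them with an exhaustive `switch`. The sketch below redeclares a small subset of the cases so that it compiles on its own. `MiniAgentError`, `summarize`, and `runOverBudget` are hypothetical, and the associated value types (for example `Int` cents) are assumptions rather than the framework's confirmed signatures:

```swift
// Hypothetical subset of the cases listed above, redeclared so the sketch compiles alone.
enum MiniAgentError: Error {
    case maxIterationsExceeded(Int)
    case costBudgetExceeded(budgetCents: Int, actualCents: Int)
    case toolNotFound(String)
    case cancelled
}

/// Turns an error into a short, user-facing summary.
func summarize(_ error: MiniAgentError) -> String {
    switch error {
    case .maxIterationsExceeded(let limit):
        return "stopped after \(limit) iterations"
    case .costBudgetExceeded(let budgetCents, let actualCents):
        return "spent \(actualCents) cents against a \(budgetCents)-cent budget"
    case .toolNotFound(let name):
        return "model requested unknown tool '\(name)'"
    case .cancelled:
        return "run was cancelled"
    }
}

// Simulates a run that blows through its budget.
func runOverBudget() throws -> String {
    throw MiniAgentError.costBudgetExceeded(budgetCents: 50, actualCents: 63)
}

var report = "ok"
do {
    report = try runOverBudget()
} catch let error as MiniAgentError {
    report = summarize(error)
} catch {
    report = "unexpected error: \(error)"
}
print(report)
```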
| Module | Dependencies | Description |
|---|---|---|
| `AgentKit` | None | Core protocols, types, runner, pipeline, orchestration, observability |
| `AgentKitAnthropic` | SwiftAnthropic | Anthropic Claude provider bridge |
| `AgentKitOpenAI` | SwiftOpenAI | OpenAI GPT provider bridge |
| `AgentKitGRDB` | GRDB | SQLite persistence for messages, checkpoints, memory, metrics |
| `AgentKitTesting` | None | Mock providers, test contexts, assertion helpers |