Native Swift SDK for building autonomous AI agents with Apple's FoundationModels design philosophy
SwiftAgent simplifies AI agent development by providing a clean, intuitive API that handles the complexity of agent loops, tool execution, and direct provider communication. Inspired by Apple's FoundationModels framework, it brings the same elegant, declarative approach to cross-platform AI agent development.
import SwiftAgent
@SessionSchema
struct CityExplorerSchema {
@Tool var cityFacts = CityFactsTool()
@Tool var reservation = ReservationTool()
@Grounding(Date.self) var travelDate
@Grounding([String].self) var mustVisitIdeas
@StructuredOutput(ItinerarySummary.self) var itinerary
}
@MainActor
func planCopenhagenWeekend() async throws {
let schema = CityExplorerSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = AgentSession(
model: model,
schema: schema,
instructions: "Design cinematic weekends. Call tools for local intel and reservations.",
)
let response = try await session.run(
to: Prompt {
PromptTag("context") {
"Travel date: \(Date(timeIntervalSinceNow: 86_400))"
"Must-visit ideas:"
"Coffee Collective, Nørrebro"
"Designmuseum Denmark"
"Kødbyens Fiskebar"
}
PromptTag("user-query") {
"Coffee, design, and dinner plans for two days in Copenhagen."
}
},
generating: ItinerarySummary.self,
)
print(response.content.headline)
print(response.content.mustTry.joined(separator: " → "))
for entry in try schema.resolve(session.transcript) {
if case let .toolRun(.cityFacts(run)) = entry, let output = run.output {
print("Local picks:", output)
}
if case let .prompt(prompt) = entry {
print("Groundings:", prompt.sources)
}
}
}
- Features
- Quick Start
- Session Schema
- Streaming Responses
- Streaming State Helpers
- Proxy Servers
- Simulated Session
- Logging
- Recording HTTP Fixtures
- Development Status
- Provider Feature Parity
- Example App
- License
- Acknowledgments
- Zero-Setup Agent Loops: Use AgentSession when you want SwiftAgent to execute tools and continue until the agent has a final answer
- Native Tool Integration: Use SwiftAgent @Generable structs as local tools; inspect calls directly with LanguageModelSession or execute them automatically with AgentSession
- Provider Agnostic: The public session APIs support multiple AI providers (OpenAI + Anthropic included, more coming)
- Apple-Native Design: API inspired by FoundationModels for familiar, intuitive development
- Modern Swift: Built with Swift 6, async/await, and latest concurrency features
- Rich Logging: Comprehensive, human-readable logging for debugging and monitoring
- Flexible Configuration: Fine-tune generation options, tools, and provider-specific settings
SwiftAgent exposes three public layers. Use the lowest layer that matches how much control you want.
LanguageModel is the lowest-level model inference backend. It accepts a provider-neutral ModelRequest and returns one ModelResponse, or streams one turn as ModelStreamEvent values.
Use this layer when you are building your own session, transcript, tool loop, or provider adapter:
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let response = try await model.respond(to: ModelRequest(
messages: [
ModelMessage(
role: .user,
segments: [.text(.init(content: "Write one sentence about Swift concurrency."))]
)
]
))
When using LanguageModel directly across multiple turns, keep the full ModelMessage, ModelResponse, transcript part, and tool-call values instead of flattening them to plain strings. Providers store API bookkeeping in providerMetadata, such as response item IDs, encrypted reasoning payloads, citations, hosted tool references, container IDs, and native tool-call IDs. Basic text loops may still work without that metadata, but provider-specific continuity can degrade.
LanguageModelSession is a stateful, low-level chat/session API around a LanguageModel. It owns transcript state, instructions, tools, schema tools, token usage, response metadata, and streaming snapshots.
It sends tools to the model and records model-emitted tool calls, but it does not execute tools and it does not run an agent loop. After a response contains tool calls, app code can execute whichever tools it wants and explicitly continue with respond(with: toolOutputs) or streamResponse(with: toolOutputs).
let session = LanguageModelSession(
model: model,
tools: [WeatherTool()],
instructions: "Use tools when current weather is needed."
)
let firstTurn = try await session.respond(to: "Weather in Nashville?")
let toolOutputs = firstTurn.transcriptEntries.compactMap { entry -> Transcript.ToolOutput? in
guard case let .toolCalls(toolCalls) = entry,
let call = toolCalls.calls.first
else { return nil }
return Transcript.ToolOutput(
id: call.id,
callId: call.callId,
toolName: call.toolName,
segment: .text(.init(content: "72 F and clear")),
status: .completed
)
}
let finalTurn = try await session.respond(with: toolOutputs)
print(finalTurn.content)
For streaming, use the same one-turn shape with streamResponse(to:) and continue through streamResponse(with: toolOutputs).
AgentSession is the high-level agent runtime. It uses the same model/session primitives, but owns the loop: send a turn, inspect tool calls, execute registered local tools, append tool outputs, and repeat until the model produces a final answer or the configured limits are reached.
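The loop described above can be pictured with a minimal sketch in plain Swift. Every type and function here (ModelTurn, ToolCall, ToolResult, runAgentLoop, scriptedModel) is a hypothetical stand-in invented for illustration, not part of SwiftAgent, and the real runtime handles streaming, transcripts, and errors that this sketch ignores.

```swift
// Hypothetical stand-ins for illustration; these are not SwiftAgent types.
enum ModelTurn {
    case toolCalls([ToolCall])
    case finalAnswer(String)
}

struct ToolCall {
    let callId: String
    let toolName: String
    let arguments: String
}

struct ToolResult {
    let callId: String
    let output: String
}

enum AgentLoopError: Error {
    case iterationLimitReached
}

/// Runs turns until the model returns a final answer or the limit is hit.
func runAgentLoop(
    maxIterations: Int,
    nextTurn: ([ToolResult]) -> ModelTurn,
    executeTool: (ToolCall) -> String
) throws -> String {
    var pendingResults: [ToolResult] = []
    for _ in 0..<maxIterations {
        switch nextTurn(pendingResults) {
        case let .finalAnswer(text):
            return text
        case let .toolCalls(calls):
            // Execute each requested tool and feed the outputs into the next turn.
            pendingResults = calls.map { call in
                ToolResult(callId: call.callId, output: executeTool(call))
            }
        }
    }
    throw AgentLoopError.iterationLimitReached
}

// Scripted "model": asks for the weather once, then answers with the tool output.
func scriptedModel(_ results: [ToolResult]) -> ModelTurn {
    if let weather = results.first {
        return .finalAnswer("Nashville: \(weather.output)")
    }
    return .toolCalls([
        ToolCall(callId: "call_1", toolName: "get_weather", arguments: "Nashville")
    ])
}

let answer = try runAgentLoop(
    maxIterations: 4,
    nextTurn: scriptedModel,
    executeTool: { _ in "72 F and clear" }
)
print(answer)
```

The iteration cap plays the role of the "configured limits" mentioned above: without it, a model that keeps requesting tools would never terminate.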
Use AgentSession when you want SwiftAgent to execute tools and manage the full agent lifecycle:
let session = AgentSession(
model: model,
tools: [WeatherTool()],
instructions: "Answer with current weather when asked."
)
let response = try await session.run(to: "Weather in Nashville?")
print(response.content)
Add SwiftAgent to your Swift project:
// Package.swift
dependencies: [
.package(url: "https://github.com/SwiftedMind/SwiftAgent.git", branch: "main")
]
.product(name: "SwiftAgent", package: "SwiftAgent")
Then import SwiftAgent:
import SwiftAgent
Create a model and a LanguageModelSession with your default instructions, then call respond whenever you need a single-turn answer. The session tracks conversation state for you, so you can start simple and layer on additional features later.
import SwiftAgent
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
// Create a response
let response = try await session.respond(to: "What's the weather like in San Francisco?")
// Process response
print(response.content)
Or use Anthropic:
import SwiftAgent
let model = AnthropicLanguageModel(apiKey: "sk-ant-...", model: "claude-sonnet-4-5")
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
let response = try await session.respond(to: "What's the weather like in San Francisco?")
print(response.content)
Note: Using an API key directly is great for prototyping, but do not ship it in production apps. For shipping apps, use a secure proxy with per-turn tokens. See Proxy Servers for more information.
Create tools using SwiftAgent's @Generable macro for type-safe tool definitions. Tools expose argument and output types that SwiftAgent validates for you, so AgentSession can execute model-requested tools and return strongly typed results without manual JSON parsing.
import SwiftAgent
struct WeatherTool: Tool {
let name = "get_weather"
let description = "Get current weather for a location"
@Generable
struct Arguments {
@Guide(description: "City name")
let city: String
@Guide(description: "Temperature unit")
let unit: String
}
@Generable
struct Output {
let temperature: Double
let condition: String
let humidity: Int
}
func call(arguments: Arguments) async throws -> Output {
return Output(
temperature: 22.5,
condition: "sunny",
humidity: 65
)
}
}
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = AgentSession(
model: model,
tools: [WeatherTool()],
instructions: "You are a helpful assistant.",
)
let response = try await session.run(to: "What's the weather like in San Francisco?")
print(response.content)
Note: LanguageModelSession and AgentSession both take tools as an array, for example tools: [WeatherTool(), OtherTool()]. LanguageModelSession exposes tool calls for app-managed loops; AgentSession executes registered local tools automatically.
Provider-hosted tools use the separate providerTools: parameter. SwiftAgent forwards those definitions to the provider and preserves provider-owned calls in the transcript, but AgentSession does not execute them as local Swift tools.
If a tool call fails in a way the agent can correct (such as an unknown identifier or other validation issue), throw a ToolRunRejection. SwiftAgent forwards the structured content you provide to the model without aborting the loop so the agent can adjust its next action.
SwiftAgent always wraps your payload in a standardized envelope that includes error: true and the reason string so the agent can reliably detect recoverable rejections.
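To make the envelope idea concrete, here is a rough sketch of what such a payload might look like once encoded. Only the error flag and the reason string come from the description above; the RejectionEnvelope type, the details field, and the exact JSON layout are assumptions for illustration and may not match what SwiftAgent actually sends.

```swift
import Foundation

// Hypothetical envelope shape; only `error` and `reason` are described in the text,
// the `details` field and overall layout are assumptions.
struct RejectionEnvelope: Codable, Equatable {
    let error: Bool
    let reason: String
    let details: [String: String]
}

let envelope = RejectionEnvelope(
    error: true,
    reason: "Customer not found",
    details: ["issue": "customerNotFound", "customerId": "cus_123"]
)

// Sorted keys make the encoded output stable, which helps in snapshot tests.
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
let data = try encoder.encode(envelope)
let json = String(decoding: data, as: UTF8.self)
print(json)

// Round-trip to confirm the envelope survives encoding unchanged.
let decoded = try JSONDecoder().decode(RejectionEnvelope.self, from: data)
```

A fixed, machine-readable flag like this lets the model tell a recoverable rejection apart from an ordinary tool result, which is the point of the standardized wrapper.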
For quick cases, attach string-keyed details with the convenience initializer:
struct CustomerLookupTool: Tool {
func call(arguments: Arguments) async throws -> Output {
guard let customer = try await directory.loadCustomer(id: arguments.customerId) else {
throw ToolRunRejection(
reason: "Customer not found",
details: [
"issue": "customerNotFound",
"customerId": arguments.customerId
]
)
}
return Output(summary: customer.summary)
}
}
For richer payloads, pass any @Generable type via the content: initializer to return structured data:
@Generable
struct CustomerLookupRejectionDetails {
var issue: String
var customerId: String
var suggestions: [String]
}
throw ToolRunRejection(
reason: "Customer not found",
content: CustomerLookupRejectionDetails(
issue: "customerNotFound",
customerId: arguments.customerId,
suggestions: ["Ask the user to confirm the identifier"]
)
)
You can force the response to be structured by defining a @Generable type and passing it to the session.respond method:
import SwiftAgent
@Generable
struct WeatherReport {
let temperature: Double
let condition: String
let humidity: Int
}
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
let response = try await session.respond(
to: Prompt("Create a concise weather report for San Francisco."),
generating: WeatherReport.self,
)
// Fully typed response content
print(response.content.temperature)
print(response.content.condition)
print(response.content.humidity)
The response body is now a fully typed WeatherReport. SwiftAgent validates the payload against your schema, so you can use the data immediately in UI or unit tests without defensive decoding.
Every LanguageModelSession maintains a running transcript that records instructions, prompts, reasoning steps, tool calls, explicit tool outputs, and responses. AgentSession exposes the same transcript shape after it executes tools and continues the loop. Iterate over either transcript to drive custom analytics, persistence, or UI updates:
import SwiftAgent
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
for entry in session.transcript {
switch entry {
case let .instructions(instructions):
print("Instructions: ", instructions)
case let .prompt(prompt):
print("Prompt: ", prompt)
case let .reasoning(reasoning):
print("Reasoning: ", reasoning)
case let .toolCalls(toolCalls):
print("Tool Calls: ", toolCalls)
case let .toolOutput(toolOutput):
print("Tool Output: ", toolOutput)
case let .response(response):
print("Response: ", response)
}
}
Note: LanguageModelSession and AgentSession are @Observable, so SwiftUI and other Observation-based UI can track response state, transcripts, token usage, and provider metadata directly. Both keep mutable runtime state behind internal synchronization instead of exposing writable transcript storage.
Track each session's cumulative token consumption to budget response costs or surface usage in settings screens:
import SwiftAgent
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
let response = try await session.respond(to: "What's the weather like in San Francisco?")
print(response.tokenUsage?.inputTokens ?? 0)
print(response.tokenUsage?.outputTokens ?? 0)
print(response.tokenUsage?.reasoningTokens ?? 0)
print(response.tokenUsage?.totalTokens ?? 0)
print(session.tokenUsage?.totalTokens ?? 0)
Note: Each individual response also includes token usage information on LanguageModelSession.Response.
AgentSession exposes aggregated usage for the full run, including every model turn needed to execute tools:
let result = try await agent.run(to: "Plan with current weather.")
print(result.tokenUsage?.totalTokens ?? 0)
print(agent.tokenUsage?.totalTokens ?? 0)
Build rich prompts inline with the Prompt builder DSL. Tags group related context, keep instructions readable, and mirror the structure providers expect when you want to mix prose with metadata.
let response = try await session.respond(to: Prompt {
"You are a friendly assistant who double-checks calculations."
PromptTag("user-question") {
"Explain how Swift's structured concurrency works."
}
PromptTag("formatting") {
"Answer in three concise bullet points."
}
})
print(response.content)
Under the hood SwiftAgent converts the builder result into the exact wire format required by the provider, so you can focus on intent instead of string concatenation.
LanguageModelSession also supports multimodal prompts when the provider accepts images. Attach one image or several images to a direct response request; the session records the image segments in the transcript so follow-up turns can preserve provider metadata.
let image = Transcript.ImageSegment(
source: .url(URL(string: "https://example.com/receipt.jpg")!)
)
let response = try await session.respond(
to: "Read the total and merchant name from this receipt.",
image: image,
)
print(response.content)
Use the same model with AgentSession when image analysis should be part of a larger tool-using workflow.
You can specify generation options for your responses:
import SwiftAgent
let model = OpenAILanguageModel(apiKey: "sk-...", model: "gpt-5", apiVariant: .responses)
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
var options = GenerationOptions(
temperature: 0.7,
maximumResponseTokens: 1_000,
)
options[custom: OpenAILanguageModel.self] = .init(
reasoning: .init(effort: .low, summary: "auto"),
serviceTier: .auto,
)
let response = try await session.respond(
to: "What's the weather like in San Francisco?",
options: options,
)
print(response.content)
These overrides apply only to the current turn, so you can increase creativity or token limits for specific prompts without mutating the session-wide configuration.
Anthropic uses its own generation options:
import SwiftAgent
let model = AnthropicLanguageModel(apiKey: "sk-ant-...", model: "claude-sonnet-4-5")
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
var options = GenerationOptions(maximumResponseTokens: 1_000)
options[custom: AnthropicLanguageModel.self] = .init(thinking: .init(budgetTokens: 1_024))
let response = try await session.respond(
to: "What's the weather like in San Francisco?",
options: options,
)
print(response.content)
SwiftAgent does not send store to OpenAI Responses unless you set it in provider options. OpenAI's Responses API stores responses by default, so omitting store keeps OpenAI's default behavior. Set store: false when you need an explicit no-store request:
options[custom: OpenAILanguageModel.self] = .init(
reasoning: .init(effort: .low, summary: "auto"),
store: false,
)
For reasoning models, no-store continuity can require encrypted reasoning metadata. SwiftAgent preserves provider metadata it receives, including encrypted reasoning payloads, but it does not yet automatically request every provider-specific include needed for full stateless reasoning continuity. See the OpenAI feature matrices linked below for current gaps.
Raw transcripts expose every event as GeneratedContent, which is flexible but awkward when you want to build UI or assertions.
Create a schema using @SessionSchema to describe the tools, groundings, and structured outputs you expect. The schema is runtime-neutral: use it to resolve transcript entries from either LanguageModelSession or AgentSession into strongly typed cases that mirror your declarations.
struct WeatherReportOutput: StructuredOutput {
static let name = "weatherReport"
@Generable
struct Schema {
let temperature: Double
let condition: String
let humidity: Int
}
}
@SessionSchema
struct SessionSchema {
@Tool var weatherTool = WeatherTool()
@Tool var calculatorTool = CalculatorTool()
@Grounding(Date.self) var currentDate
@Grounding(VectorSearchResult.self) var searchResults
@StructuredOutput(WeatherReportOutput.self) var weatherReport
@StructuredOutput(CalculatorOutput.self) var calculatorOutput
}
let sessionSchema = SessionSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = AgentSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant.",
)
Each macro refines a portion of the transcript:
- @Tool links a tool implementation to its decoded entries, giving you typed arguments, outputs, and errors for every invocation.
- @Grounding registers values you inject into prompts (like dates or search results) so they can be replayed alongside the prompt text.
- @StructuredOutput binds a guided generation schema to its decoded result, including partial streaming updates and final values.
Decoded tool runs combine the model's argument payload and your tool's output in one place. That makes it easy to render progress UIs and surface recoverable errors without manually joining separate transcript entries.
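For a sense of the bookkeeping this saves, the sketch below shows the kind of manual join you would otherwise write: pairing each tool call with its output by call ID. CallRecord, OutputRecord, JoinedRun, and joinRuns are hypothetical simplifications made up for this example; SwiftAgent's transcript types carry more information and its resolver may work differently.

```swift
// Hypothetical flat records; SwiftAgent's transcript types are richer than this.
struct CallRecord {
    let callId: String
    let toolName: String
    let arguments: String
}

struct OutputRecord {
    let callId: String
    let output: String
}

struct JoinedRun {
    let toolName: String
    let arguments: String
    let output: String?   // nil while the tool has not produced output yet
}

func joinRuns(calls: [CallRecord], outputs: [OutputRecord]) -> [JoinedRun] {
    // Index outputs by call ID so each call can look up its result directly.
    let outputsById = Dictionary(
        outputs.map { ($0.callId, $0.output) },
        uniquingKeysWith: { _, latest in latest }
    )
    return calls.map { call in
        JoinedRun(
            toolName: call.toolName,
            arguments: call.arguments,
            output: outputsById[call.callId]
        )
    }
}

let runs = joinRuns(
    calls: [
        CallRecord(callId: "a", toolName: "get_weather", arguments: "Nashville"),
        CallRecord(callId: "b", toolName: "get_weather", arguments: "Oslo"),
    ],
    outputs: [OutputRecord(callId: "a", output: "72 F and clear")]
)
```

A run with a nil output corresponds to a call that is still in flight, which is the state a progress UI would render as "running".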
import SwiftAgent
@SessionSchema
struct SessionSchema {
@Tool var weatherTool = WeatherTool()
}
let sessionSchema = SessionSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = AgentSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant.",
)
// let response = try await session.run(to: "What's the weather like in San Francisco?")
// ...
for entry in try sessionSchema.resolve(session.transcript) {
switch entry {
case let .toolRun(toolRun):
switch toolRun {
case let .weatherTool(weatherToolRun):
if let arguments = weatherToolRun.finalArguments {
print(arguments.city, arguments.unit)
}
if let output = weatherToolRun.output {
print(output.condition, output.humidity, output.temperature)
}
default:
break
}
default: break
}
}
When you request structured data, decoded responses slot those values directly into the schema you registered on the session. You can pull the result out of the live response or from the transcript later, depending on your workflow.
import SwiftAgent
@SessionSchema
struct SessionSchema {
@StructuredOutput(WeatherReportOutput.self) var weatherReport
}
let sessionSchema = SessionSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant.",
)
let response = try await session.respond(
to: Prompt("Create a concise weather report for San Francisco."),
generating: WeatherReportOutput.Schema.self,
)
print(response.content) // WeatherReportOutput.Schema object
// Access the structured output in the resolved transcript
for entry in try sessionSchema.resolve(session.transcript) {
switch entry {
case let .response(response):
switch response.structuredSegments[0].content {
case let .weatherReport(weatherReport):
if let weatherReport = weatherReport.finalContent {
print(weatherReport.condition, weatherReport.humidity, weatherReport.temperature)
}
case .unknown:
print("Unknown output")
}
default: break
}
}
Groundings capture extra context you feed the model (like the current time or search snippets) and keep it synchronized with the prompt text. That makes it straightforward to inspect what the model saw and to recreate prompts later for debugging.
import SwiftAgent
@SessionSchema
struct SessionSchema {
@Grounding(Date.self) var currentDate
@StructuredOutput(WeatherReportOutput.self) var weatherReport
}
let sessionSchema = SessionSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant.",
)
let response = try await session.respond(
to: Prompt {
PromptTag("context") {
"The current date is \(Date())."
}
PromptTag("user-query") {
"What's the weather like in San Francisco?"
}
},
)
print(response.content)
// Access the input prompt and its groundings separately in the transcript
for entry in try sessionSchema.resolve(session.transcript) {
switch entry {
case let .prompt(prompt):
print(prompt.input) // User input
// Grounding sources stored alongside the input prompt
// If this prompt was produced by a schema-aware grounding API, sources decode here.
for source in prompt.sources {
switch source {
case let .currentDate(date):
print("Current date: \(date)")
}
}
print(prompt.prompt) // Final prompt sent to the model
default: break
}
}
LanguageModelSession.streamResponse(...) streams one direct model/session response. It emits snapshots while the provider returns reasoning, tool-call arguments, text, or structured content for that turn. It does not execute local tools; if the model emits tool calls, inspect them and explicitly continue with streamResponse(with: toolOutputs).
Use AgentSession.stream(...) when you want a full agent run. Agent streaming includes model events plus tool lifecycle events across loop iterations.
SwiftAgent generates PartiallyGenerated companions for every @Generable type, turning each property into an optional so tokens can land as soon as they are decoded. SwiftAgent surfaces those partial values directly, then swaps in the fully realized type once the model finalizes the turn.
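As a rough mental model, a partially generated companion can be pictured as a mirror of the final type in which every property is optional. The hand-written WeatherSummary and PartialWeatherSummary types below are hypothetical illustrations only; the macro-generated companions are produced for you and may differ in shape and naming.

```swift
// Hand-written illustration; the real companions are generated by macros.
struct WeatherSummary: Equatable {
    let temperature: Double
    let condition: String
    let humidity: Int
}

struct PartialWeatherSummary {
    var temperature: Double?
    var condition: String?
    var humidity: Int?

    /// Returns the complete value only once every field has arrived.
    var finalized: WeatherSummary? {
        guard let temperature, let condition, let humidity else { return nil }
        return WeatherSummary(
            temperature: temperature,
            condition: condition,
            humidity: humidity
        )
    }
}

var partial = PartialWeatherSummary()
partial.condition = "sunny"            // first field decoded from the stream
let earlyResult = partial.finalized    // nil: two fields are still missing

partial.temperature = 22.5
partial.humidity = 65
let finalResult = partial.finalized    // all fields present
```

Because the partial type never requires a complete payload, a view can bind to it from the first token onward and simply show placeholders for properties that are still nil.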
import SwiftAgent
@SessionSchema
struct SessionSchema {
@StructuredOutput(WeatherReportOutput.self) var weatherReport
}
let sessionSchema = SessionSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant.",
)
let stream = session.streamResponse(to: "Write a short note about Swift concurrency.")
for try await snapshot in stream {
// Once the model is sending the response, the snapshot's content will start to populate
if let content = snapshot.content {
print(content)
}
// You can also access the generated transcript as it is streamed in
print(snapshot.transcript)
}
Each snapshot contains the latest response fragment (if the model has started speaking) and the full transcript up to that point, giving you enough context to animate UI or log intermediate steps.
For automatic tool execution, stream an AgentSession and handle agent events:
let agent = AgentSession(
model: model,
tools: [WeatherTool()],
instructions: "Use tools for current weather questions.",
)
for try await event in agent.stream(to: "What's the weather in Nashville?") {
switch event {
case let .modelEvent(modelEvent):
print("Model:", modelEvent)
case let .toolExecutionStarted(call):
print("Running:", call.toolName)
case let .toolOutput(output):
print("Tool output:", output)
case let .completed(result):
print(result.content)
default:
break
}
}
Structured streaming works the same way: SwiftAgent first yields partially generated objects whose properties fill in as tokens arrive, then delivers the final schema once generation completes.
import SwiftAgent
@SessionSchema
struct SessionSchema {
@StructuredOutput(WeatherReportOutput.self) var weatherReport
}
let sessionSchema = SessionSchema()
let model = OpenResponsesLanguageModel(apiKey: "sk-...", model: "openai/gpt-5")
let session = LanguageModelSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant.",
)
let stream = session.streamResponse(
to: Prompt("Create a concise weather report for San Francisco."),
generating: WeatherReportOutput.Schema.self,
)
for try await snapshot in stream {
// Once the model is sending the final response, the snapshot's content will start to populate
if let weatherReport = snapshot.content {
print(weatherReport.condition ?? "Not received yet")
print(weatherReport.humidity ?? "Not received yet")
print(weatherReport.temperature ?? "Not received yet")
}
// You can also access the generated transcript as it is streamed in
let transcript = snapshot.transcript
let resolvedTranscript = try sessionSchema.resolve(transcript)
print(transcript, resolvedTranscript)
}
// You can also observe the transcript during streaming
for entry in try sessionSchema.resolve(session.transcript) {
switch entry {
case let .response(response):
switch response.structuredSegments[0].content {
case let .weatherReport(weatherReport):
switch weatherReport.content {
case let .partial(partialWeatherReport):
print(partialWeatherReport) // Partially populated object
case let .final(finalWeatherReport):
print(finalWeatherReport) // Fully populated object
default:
break // Not yet available
}
case .unknown:
print("Unknown output")
}
default: break
}
}
SwiftAgent keeps SwiftUI views stable by exposing current projections of in-flight data. For agent-owned tool runs, stream AgentSession events and resolve agent.transcript with your schema. currentArguments always returns the partially generated variant of your argument type alongside an isFinal flag, so the view does not need to branch on enum states. When you need the fully validated payload, reach for finalArguments; if you want to respond to streaming transitions, switch over argumentsPhase.
struct WeatherToolRunView: View {
let run: ToolRun<WeatherTool>
var body: some View {
// 1. UI-friendly projection that stays stable while streaming
if let currentArguments = run.currentArguments {
VStack(alignment: .leading, spacing: 4) {
Text("City: \(currentArguments.city ?? "-")")
Text("Unit: \(currentArguments.unit ?? "-")")
if currentArguments.isFinal {
Text("Arguments locked in").font(.caption).foregroundStyle(.secondary)
}
}
.monospacedDigit()
}
// 2. Fully validated payload, available once the model finalizes arguments
if let finalArguments = run.finalArguments {
Text("Resolved location: \(finalArguments.city)")
.font(.footnote)
}
// 3. Underlying phase enum if you need to branch on streaming progress
switch run.argumentsPhase {
case let .partial(partialArguments):
Text("Awaiting completion… \(partialArguments.city ?? "-")")
.font(.caption)
.foregroundStyle(.secondary)
case let .final(finalArguments):
Text("Final: \(finalArguments.city)")
.font(.caption)
.foregroundStyle(.green)
case .none:
EmptyView()
}
}
}
Structured outputs follow the same pattern with snapshot.currentContent: you always receive a partially generated projection that updates in place, while finalContent and contentPhase give you access to the completed schema and the streaming status respectively. The Example App’s playground views lean on these helpers to render incremental suggestions without triggering SwiftUI identity churn.
Sending your OpenAI API key from the device is fine while sketching ideas, but it is not acceptable once you ship. Point the SDK at a proxy you control so the app never sees the provider credential:
let proxyToken = try await backend.issueTurnToken(for: userId)
let httpClient = URLSessionHTTPClient(configuration: .init(
baseURL: URL(string: "https://api.your-backend.com/proxy/v1/")!,
jsonEncoder: JSONEncoder(),
jsonDecoder: JSONDecoder(),
interceptors: HTTPClientInterceptors(
prepareRequest: { request in
request.setValue("Bearer \(proxyToken)", forHTTPHeaderField: "Authorization")
}
)
))
let model = OpenResponsesLanguageModel(
apiKey: proxyToken,
model: "openai/gpt-5",
httpClient: httpClient
)
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
SwiftAgent reuses the base URL you provide and appends the normal Responses API route, for example https://api.your-backend.com/proxy/v1/responses. Your backend should forward that path to OpenAI, attach its secret API key, and return the upstream response. In practice a robust proxy will:
- Validate the catch-all path so only the expected /v1/responses endpoint is reachable.
- Decode and inspect the body before relaying it (for example, enforce a safety_identifier, limit models, or reject obviously abusive payloads).
- Stream the request to OpenAI and pass the response straight through, optionally recording token usage for billing.
Every request emitted by the SDK already matches the Responses API schema, so the proxy does not need to reshape payloads.
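The path and body checks listed above could look roughly like the following on the backend. This is only a sketch under stated assumptions: ProxyPolicy, ProxyRejection, the rejection(forPath:body:) function, the exact allowed path, and the model allowlist are all hypothetical, and a real proxy would add authentication, rate limiting, and streaming.

```swift
import Foundation

// Hypothetical proxy-side policy; the path and allowlist values are illustrative.
enum ProxyPolicy {
    static let allowedPaths: Set<String> = ["/proxy/v1/responses"]
    static let allowedModels: Set<String> = ["openai/gpt-5"]
}

enum ProxyRejection: Equatable {
    case pathNotAllowed
    case malformedBody
    case modelNotAllowed(String)
}

/// Returns nil when the request may be relayed, or the reason it must be rejected.
func rejection(forPath path: String, body: Data) -> ProxyRejection? {
    // 1. Only the expected endpoint is reachable through the catch-all route.
    guard ProxyPolicy.allowedPaths.contains(path) else { return .pathNotAllowed }

    // 2. Decode the body before relaying so it can be inspected.
    guard
        let object = try? JSONSerialization.jsonObject(with: body) as? [String: Any],
        let model = object["model"] as? String
    else { return .malformedBody }

    // 3. Restrict which models clients may request.
    guard ProxyPolicy.allowedModels.contains(model) else { return .modelNotAllowed(model) }
    return nil
}

let goodBody = Data(#"{"model":"openai/gpt-5","input":"Hello"}"#.utf8)
let otherModelBody = Data(#"{"model":"some/other-model"}"#.utf8)

let accepted = rejection(forPath: "/proxy/v1/responses", body: goodBody)
let wrongPath = rejection(forPath: "/proxy/v1/files", body: goodBody)
let wrongModel = rejection(forPath: "/proxy/v1/responses", body: otherModelBody)
let notJSON = rejection(forPath: "/proxy/v1/responses", body: Data("oops".utf8))
```

Returning a typed rejection rather than a bare Boolean makes it easy to map each failure to a distinct HTTP status code in whatever server framework hosts the proxy.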
The same proxied model can be passed to AgentSession when you want automatic tool execution. Per-turn authorization applies to each model turn the agent performs.
Protect the proxy with short-lived tokens instead of static API keys. Before each call to respond or streamResponse, ask your backend for a token that identifies the signed-in user and expires after one turn:
let turnToken = try await backend.issueTurnToken(for: userId)
let model = OpenResponsesLanguageModel(
apiKey: turnToken,
model: "openai/gpt-5",
httpClient: proxyHTTPClient(for: turnToken)
)
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Summarize yesterday's sales numbers.")
Install the turn token in the request headers used by your proxy HTTP client so every internal request for the turn inherits the same authorization.
For quick prototypes you can still pass an API key directly to OpenResponsesLanguageModel or OpenAILanguageModel, but remove direct provider credentials before release.
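On the backend, the short-lived token idea from this section might be modeled as a record with an expiry window. The TurnToken type, its fields, and the 60-second lifetime below are assumptions chosen for illustration; they approximate "expires after one turn" with a time limit, which still lets several internal requests within the same turn reuse the token.

```swift
import Foundation

// Hypothetical backend-side token record; not part of SwiftAgent.
struct TurnToken {
    let userId: String
    let issuedAt: Date
    let lifetime: TimeInterval   // assumed upper bound on how long one turn takes

    /// A token is valid from the moment it is issued until its lifetime elapses.
    func isValid(at now: Date) -> Bool {
        now >= issuedAt && now < issuedAt.addingTimeInterval(lifetime)
    }
}

// Fixed timestamps keep the example deterministic.
let issued = Date(timeIntervalSince1970: 1_000)
let token = TurnToken(userId: "user_42", issuedAt: issued, lifetime: 60)

let validDuringTurn = token.isValid(at: issued.addingTimeInterval(30))
let validAfterExpiry = token.isValid(at: issued.addingTimeInterval(61))
let validBeforeIssue = token.isValid(at: issued.addingTimeInterval(-1))
```

In a real deployment the token would also be signed or stored server-side so that clients cannot forge or extend it.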
You can test direct sessions and agents without making API calls using the built-in simulation system. This is useful for prototyping, testing, and developing UIs before integrating with live APIs.
import SwiftAgent
import SimulatedSession
// Create mockable tool wrappers
struct WeatherToolMock: MockableTool {
var tool: WeatherTool
func mockArguments() -> WeatherTool.Arguments {
.init(city: "San Francisco", unit: "Celsius")
}
func mockOutput() async throws -> WeatherTool.Output {
.init(
temperature: 22.5,
condition: "sunny",
humidity: 65
)
}
}
@SessionSchema
struct SessionSchema {
@Tool var weatherTool = WeatherTool()
}
let sessionSchema = SessionSchema()
let configuration = SimulationConfiguration(defaultGenerations: [
.reasoning(summary: "Simulated Reasoning"),
.toolRun(tool: WeatherToolMock(tool: WeatherTool())),
.response(text: "It's a beautiful sunny day in San Francisco with 22.5°C!"),
])
let model = SimulationLanguageModel(configuration: configuration)
let session = LanguageModelSession(
model: model,
schema: sessionSchema,
instructions: "You are a helpful assistant."
)
let response = try await session.respond(to: "What's the weather like in San Francisco?")
print(response.content) // "It's a beautiful sunny day in San Francisco with 22.5°C!"
Use the same SimulationLanguageModel with AgentSession when you want to exercise the high-level agent runtime in tests or previews.
Logging covers both direct provider/session calls and high-level agent runs. Direct LanguageModelSession calls log model requests and responses. AgentSession adds agent lifecycle, tool call, tool output, and completion events.
// Enable comprehensive logging
SwiftAgentConfiguration.setLoggingEnabled(true)
// Enable full request/response network logging (very verbose but helpful for debugging)
SwiftAgentConfiguration.setNetworkLoggingEnabled(true)
// Logs show:
// 🟢 Agent start — model=gpt-5 | tools=weather, calculator
// 🛠️ Tool call — weather [abc123]
// 📤 Tool output — weather [abc123]
// ✅ Finished
When writing unit tests, it’s often useful to capture real provider payloads and replay them locally.
SwiftAgent includes an opt-in recorder (HTTPReplayRecorder) that attaches to the SDK’s networking layer
via HTTPClientInterceptors and prints paste-ready Swift fixtures.
This is especially useful for streaming responses where you want to replay the full text/event-stream
payload in tests (like the ones using ReplayHTTPClient in this repository).
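For orientation, a recorded text/event-stream payload is plain text made of `event:` and `data:` lines, with a blank line closing each event. The sketch below shows one way a test could split such a payload into events. It is not SwiftAgent's parser and does not use ReplayHTTPClient; the names `ServerSentEvent` and `parseEventStream` are hypothetical, and the sample event names and data are made up for illustration.

```swift
import Foundation

/// One server-sent event: an optional event name plus its data payload.
/// (Hypothetical type for this sketch, not part of SwiftAgent.)
struct ServerSentEvent: Equatable {
    var event: String?
    var data: String
}

/// Splits a raw `text/event-stream` payload into events.
/// Only the `event:` and `data:` fields are handled; comments and other fields are skipped.
func parseEventStream(_ payload: String) -> [ServerSentEvent] {
    var events: [ServerSentEvent] = []
    var eventName: String?
    var dataLines: [String] = []

    // Emits the pending event (if it has any data) and resets the buffers.
    func flush() {
        if !dataLines.isEmpty {
            events.append(ServerSentEvent(event: eventName, data: dataLines.joined(separator: "\n")))
        }
        eventName = nil
        dataLines = []
    }

    let normalized = payload.replacingOccurrences(of: "\r\n", with: "\n")
    for line in normalized.components(separatedBy: "\n") {
        if line.isEmpty { flush(); continue }     // a blank line terminates an event
        if line.hasPrefix(":") { continue }       // comment line
        var field = line
        var value = ""
        if let colon = line.firstIndex(of: ":") {
            field = String(line[..<colon])
            value = String(line[line.index(after: colon)...])
            if value.hasPrefix(" ") { value.removeFirst() }
        }
        switch field {
        case "event": eventName = value
        case "data": dataLines.append(value)
        default: break                            // `id:`, `retry:`, unknown fields
        }
    }
    // No final flush: an event that was not closed by a blank line is dropped,
    // which is what happens to the tail of a partially recorded stream.
    return events
}

// Made-up sample payload in the shape of a recorded stream.
let recorded = """
event: delta
data: {"delta":"Hel"}

event: delta
data: {"delta":"lo"}

data: [DONE]

"""

let events = parseEventStream(recorded)
for event in events {
    print(event.event ?? "(unnamed)", event.data)
}
```

Dropping an unterminated trailing event is a deliberate choice here: it keeps a truncated recording from producing a half-formed event in a test.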
This repository includes a small macOS command-line tool (AgentRecorder) that runs recording scenarios and prints
paste-ready Swift fixtures to stdout.
- Set API keys (either in your shell or in Xcode scheme env vars):
  - OPENAI_API_KEY
  - ANTHROPIC_API_KEY
Alternatively, if you already have a local Secrets.plist (not committed) you can let the CLI read keys from it:
  - Set AGENT_RECORDER_SECRETS_PLIST (or pass --secrets-plist <path>)
  - Provide OpenAI_API_Key_Debug and/or Anthropic_API_Key_Debug keys inside that plist
    - Tip: if you place Secrets.plist in the repo root and run AgentRecorder from the repo root, it will be picked up automatically.
- Run from Xcode:
  - Open SwiftAgent.xcworkspace
  - Select the AgentRecorder scheme
  - Run (stdout/stderr show in Xcode’s Debug console)
- Run from Terminal:
xcodebuild -workspace SwiftAgent.xcworkspace -scheme AgentRecorder -destination "platform=macOS" -derivedDataPath .tmp/DerivedData build
OPENAI_API_KEY=sk-... ./.tmp/DerivedData/Build/Products/Debug/AgentRecorder --list-scenarios
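OPENAI_API_KEY=sk-... ./.tmp/DerivedData/Build/Products/Debug/AgentRecorder --provider openai --scenario openai/streaming-tool-calls/weather
To record from your own code instead, attach HTTPReplayRecorder through HTTPClientInterceptors:
import Foundation
import SwiftAgent
// Read the key from the environment rather than hard-coding it
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""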
let recorder = HTTPReplayRecorder(
options: .init(
includeRequests: false,
includeHeaders: true,
prettyPrintJSON: true
)
)
var interceptors = HTTPClientInterceptors(
prepareRequest: { request in
request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
}
)
interceptors = interceptors.recording(to: recorder)
let configuration = HTTPClientConfiguration(
baseURL: URL(string: "https://api.openai.com/v1/")!,
jsonEncoder: JSONEncoder(),
jsonDecoder: JSONDecoder(),
interceptors: interceptors
)
let httpClient = URLSessionHTTPClient(configuration: configuration)
let model = OpenResponsesLanguageModel(
apiKey: apiKey,
model: "openai/gpt-5",
httpClient: httpClient
)
let session = LanguageModelSession(
model: model,
instructions: "You are a helpful assistant.",
)
_ = try await session.respond(to: "Hello!")
await recorder.printSwiftFixtureSnippet()
Notes:
- Streaming responses are recorded as raw text/event-stream payloads (may be partial if the consumer stops iterating early).
- Enabling stream capture buffers stream bytes in memory; keep it enabled only for debugging/fixture recording.
- Request headers may contain secrets; the recorder redacts common auth header fields before printing.
- Recorder scenarios can exercise direct LanguageModelSession calls or full AgentSession runs; use agent scenarios when fixture coverage needs local tool execution and continuation loops.
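The note about header redaction can be pictured with a small standalone sketch. This is not the recorder's implementation: the function name `redactingSecrets(in:placeholder:)`, the placeholder text, and the list of header names are assumptions chosen for illustration.

```swift
import Foundation

/// Header names treated as sensitive in this sketch (an assumed list, compared case-insensitively).
let sensitiveHeaderNames: Set<String> = [
    "authorization", "proxy-authorization", "x-api-key", "api-key", "cookie",
]

/// Returns a copy of `headers` in which the values of sensitive fields are replaced by a placeholder.
/// (Hypothetical helper, not part of SwiftAgent.)
func redactingSecrets(
    in headers: [String: String],
    placeholder: String = "<redacted>"
) -> [String: String] {
    var redacted = headers
    for name in headers.keys where sensitiveHeaderNames.contains(name.lowercased()) {
        redacted[name] = placeholder
    }
    return redacted
}

// Example values only; no real credentials.
let headers = [
    "Authorization": "Bearer sk-example",
    "Content-Type": "application/json",
    "x-api-key": "example-key",
]
let safeHeaders = redactingSecrets(in: headers)
print(safeHeaders)
```

Because the recorder redacts only common auth fields, it is still worth reviewing printed fixtures for provider-specific or custom secret headers before committing them.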
Work in Progress: SwiftAgent is under active development. APIs may change, and breaking updates are expected. Use in production with caution.
SwiftAgent supports OpenAI, Anthropic, OpenAI-compatible Responses providers, and simulation through the shared LanguageModel contract. Provider APIs do not have identical feature surfaces, so the source tree keeps parity notes beside each implementation.
Those matrices are the best place to check current gaps such as hosted/server-side tools, stored response continuation, encrypted reasoning metadata, and provider-specific request options.
SwiftAgent ships with a SwiftUI demo that showcases the SDK in action. Open the project at Examples/Example App/ExampleApp to explore an agent playground that:
- Configures OpenAI and Anthropic language models with the bundled SessionSchema, calculator tool, weather tool, and a structured weather report output.
- Lets you switch between direct LanguageModelSession streaming and automatic AgentSession streaming.
- Renders prompts, reasoning summaries, tool runs, and final replies in a chat-style transcript UI.
- Demonstrates tool-specific views (calculator and weather) with live argument updates, results, and SwiftUI previews backed by SimulationLanguageModel scenarios.
Use the app to experiment with SwiftAgent locally or as a starting point for integrating the SDK into your own SwiftUI experience.
SwiftAgent is available under the MIT license. See LICENSE for more information.
- Inspired by Apple's FoundationModels framework
- Built with the amazing Swift ecosystem and community
Made with ❤️ for the Swift community