# swiftllm
A blazing-fast universal LLM gateway written in Rust. Route requests to OpenAI, Anthropic, Google Gemini, Mistral, Ollama, and more through a single OpenAI-compatible API.
```text
┌──────────────┐       ┌───────────┐       ┌──────────┐
│   Your App   │──────▶│  swiftllm │──────▶│  OpenAI  │
│  (any SDK)   │       │   :8080   │──────▶│ Anthropic│
└──────────────┘       └───────────┘──────▶│  Gemini  │
                                    ──────▶│ Mistral  │
                                    ──────▶│  Ollama  │
                                           └──────────┘
```
## Why?
Most teams use multiple LLM providers. That means juggling different SDKs, API formats, and billing dashboards. swiftllm gives you:
- One endpoint — drop-in replacement for the OpenAI API. Use any SDK or tool that speaks OpenAI format.
- Automatic routing — requests route to the right provider based on the model name (`gpt-4.1` → OpenAI, `claude-sonnet-4-6` → Anthropic, `gemini-2.0-flash` → Google, `mistral-large-latest` → Mistral, `llama3:latest` → Ollama); see the sketch after this list.
- Streaming support — full SSE streaming with format translation across all providers.
- Single binary — no runtime dependencies, no Docker required. Just download and run.
- ~1ms overhead — built in Rust with async I/O. Adds negligible latency.
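As a concrete sketch of what that buys you: one client, three providers, nothing changed but the model name. This assumes swiftllm is already running on the default port with these models configured (see Quick Start below):

```python
from openai import OpenAI

# One endpoint for everything; swiftllm picks the provider from the model name.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-proxy-api-key")

for model in ["gpt-4.1", "claude-sonnet-4-6", "gemini-2.0-flash"]:
    response = client.chat.completions.create(
        model=model,  # routed to OpenAI, Anthropic, and Google respectively
        messages=[{"role": "user", "content": "Say hello in five words."}],
    )
    print(f"{model}: {response.choices[0].message.content}")
```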
## Quick Start
### Download pre-built binary
Grab the latest release for your platform from the Releases page:
```bash
# Linux
curl -L https://github.com/Elyeden0/swiftllm/releases/latest/download/swiftllm-linux-amd64.tar.gz | tar xz
chmod +x swiftllm

# macOS (Apple Silicon)
curl -L https://github.com/Elyeden0/swiftllm/releases/latest/download/swiftllm-macos-arm64.tar.gz | tar xz
chmod +x swiftllm

# Windows
# Download swiftllm-windows-amd64.zip from the releases page and extract it
```
Then configure and run:
```bash
# Copy the example .env and add your API keys
cp .env.example .env
# Edit .env with your API keys...
./swiftllm
```
The `.env` file must be placed in the same directory as the executable. swiftllm will refuse to start without it.
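Once it's up, a quick liveness probe against the health endpoint works well (a minimal sketch, assuming the default `PORT=8080`):

```python
import urllib.request

# Any 200 from /health means the gateway is accepting requests.
with urllib.request.urlopen("http://localhost:8080/health") as resp:
    print("health:", resp.status)
```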
### From source
```bash
git clone https://github.com/Elyeden0/swiftllm
cd swiftllm
cargo build --release
cp .env.example .env
# Edit .env with your API keys...
./target/release/swiftllm
```
## Usage
Once running, point any OpenAI-compatible client at `http://localhost:8080`:
```bash
# Non-streaming
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Streaming
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'
```
Works with the OpenAI Python SDK:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-proxy-api-key",
)

# Use any model from any provider
response = client.chat.completions.create(
    model="claude-sonnet-4-6",  # Routes to Anthropic
    messages=[{"role": "user", "content": "Hello!"}],
)
```
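Streaming works through the SDK the same way; a minimal sketch:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-proxy-api-key")

# stream=True yields OpenAI-style chunks regardless of the upstream provider.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```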
## Configuration
All configuration is done through a `.env` file. See `.env.example` for all options.
```ini
PORT=8080
AUTH_API_KEYS=your-proxy-api-key
DEFAULT_PROVIDER=openai

OPENAI_API_KEY=sk-...
OPENAI_MODELS=gpt-4o,gpt-4.1,o3,o4-mini
OPENAI_PRIORITY=1

ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODELS=claude-sonnet-4-6,claude-opus-4-6
ANTHROPIC_PRIORITY=2

GEMINI_API_KEY=your-gemini-key
GEMINI_MODELS=gemini-2.0-flash,gemini-2.0-pro
GEMINI_PRIORITY=3

MISTRAL_API_KEY=your-mistral-key
MISTRAL_MODELS=mistral-large-latest,codestral-latest
MISTRAL_PRIORITY=4

OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODELS=llama3:latest,mistral:latest
OLLAMA_PRIORITY=10
```
You can also pass the path explicitly: `swiftllm --env /path/to/.env`
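To double-check what your `.env` actually exposes, list the models through the gateway itself (a sketch using the OpenAI SDK, with the key and port from the example above):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-proxy-api-key")

# GET /v1/models: every model from every configured provider in one list.
for model in client.models.list():
    print(model.id)
```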
## Model routing
Models are routed to providers in this order:
1. Exact match — the model name appears in a provider's `MODELS` list
2. Prefix match — `gpt-*` → OpenAI, `claude-*` → Anthropic, `gemini-*` → Google, `mistral-*` → Mistral, `model:tag` → Ollama
3. Default provider — the `DEFAULT_PROVIDER` fallback
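The router itself is Rust, but the precedence boils down to something like this illustrative Python sketch (all names here are invented for the example):

```python
def pick_provider(model: str, providers: dict[str, set[str]], default: str) -> str:
    # 1. Exact match against each provider's MODELS list.
    for name, models in providers.items():
        if model in models:
            return name
    # 2. Prefix match on well-known model families.
    prefixes = {"gpt-": "openai", "claude-": "anthropic",
                "gemini-": "gemini", "mistral-": "mistral"}
    for prefix, name in prefixes.items():
        if model.startswith(prefix):
            return name
    if ":" in model:  # model:tag naming means a local Ollama model
        return "ollama"
    # 3. Fall back to DEFAULT_PROVIDER.
    return default
```

Under this scheme, a model like `claude-opus-4-6` would still land on Anthropic via the prefix rule even if it were missing from `ANTHROPIC_MODELS`.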
## API Endpoints
| Endpoint | Description |
|---|---|
| `POST /v1/chat/completions` | Chat completions (streaming & non-streaming) |
| `GET /v1/models` | List all configured models |
| `GET /health` | Health check |
| `GET /api/stats` | Usage stats, cost tracking, cache metrics |
| `GET /dashboard` | Live web dashboard |
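The stats endpoint is handy for scripting; a small standard-library sketch (assuming it returns JSON, on the default port):

```python
import json
import urllib.request

# Pull usage stats, cost tracking, and cache metrics from the gateway.
with urllib.request.urlopen("http://localhost:8080/api/stats") as resp:
    stats = json.load(resp)
print(json.dumps(stats, indent=2))
```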
## Roadmap
- Response caching (LRU with configurable TTL)
- Cost tracking & token counting dashboard
- Automatic failover with priority chains
- Embedded web dashboard
- Rate limiting per provider
- Google Gemini provider
- Tool/function call translation
- Request logging & analytics
## License
MIT