Nexus

One API endpoint. Any backend. Zero configuration.

Nexus is a distributed LLM orchestrator that unifies heterogeneous inference backends behind a single, intelligent API gateway. Local first, cloud when needed.

Features

🔍 Auto-Discovery — Finds LLM backends on your network via mDNS
🎯 Intelligent Routing — Routes by model capabilities, load, and latency
🔄 Transparent Failover — Retries with fallback backends automatically
🔌 OpenAI-Compatible — Works with any OpenAI API client
⚡ Zero Config — Just run it — works out of the box with Ollama
🔒 Privacy Zones — Structural enforcement prevents data from reaching cloud backends
💰 Budget Management — Token-aware cost tracking with automatic spend limits
📊 Real-time Dashboard — Monitor backends, models, and requests in your browser
🧠 Quality Tracking — Profiles backend response quality to inform routing decisions
📐 Embeddings API — OpenAI-compatible /v1/embeddings with capability-aware routing
📋 Request Queuing — Holds requests when backends are busy, with priority support
🔧 Model Lifecycle — Load, unload, and migrate models across backends via API
🔮 Fleet Intelligence — Pattern analysis with pre-warming recommendations
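
The routing and failover features above can be pictured with a short Python sketch. It is illustrative only: the `Backend` fields, the ordering rule (least loaded first, then lowest latency), and the `send` callback are all assumptions made for this example, not Nexus's actual data model or scoring logic.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    # All fields are hypothetical; Nexus's real data model may differ.
    name: str
    models: frozenset
    pending_requests: int
    avg_latency_ms: float
    healthy: bool = True


def rank_backends(backends, model):
    """Return healthy backends that serve `model`, best candidate first."""
    candidates = [b for b in backends if b.healthy and model in b.models]
    # Illustrative ordering: least loaded first, then lowest average latency.
    return sorted(candidates, key=lambda b: (b.pending_requests, b.avg_latency_ms))


def route_with_failover(backends, model, send):
    """Try each ranked backend in turn; return (backend name, result)."""
    last_error = None
    for backend in rank_backends(backends, model):
        try:
            return backend.name, send(backend)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next candidate
    raise RuntimeError(f"no backend could serve {model!r}") from last_error
```

In this sketch the caller never sees a failed backend unless every candidate fails, which is the behavior the "Transparent Failover" item describes.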

Supported Backends

Backend	Status	Discovery
Ollama	✅ Supported	mDNS (auto)
LM Studio	✅ Supported	Static config
vLLM	✅ Supported	Static config
llama.cpp	✅ Supported	Static config
exo	✅ Supported	mDNS (auto)
OpenAI	✅ Supported	Static config

Quick Start

# Install from source (run from the root of a cloned repository)
cargo install --path .

# Start with auto-discovery (zero config)
nexus serve

# Or with Docker
docker run -d -p 8000:8000 leocamello/nexus

Once running, send your first request:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3:70b", "messages": [{"role": "user", "content": "Hello!"}]}'

Point any OpenAI-compatible client (Claude Code, Continue.dev, the OpenAI SDK, or plain curl) at http://localhost:8000/v1.
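
For clients without an OpenAI SDK, the same request as the curl example can be built with the Python standard library alone. This is a minimal sketch, assuming Nexus is listening on the Quick Start address; the helper names `build_chat_request` and `send_chat_request` are made up for this example.

```python
import json
import urllib.request

# Assumption: Nexus is reachable at the address used in the Quick Start.
NEXUS_BASE_URL = "http://localhost:8000/v1"


def build_chat_request(model, user_message, base_url=NEXUS_BASE_URL):
    """Build (but do not send) a POST to the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a running instance, `send_chat_request(build_chat_request("llama3:70b", "Hello!"))` should return an OpenAI-style response object; in that format the reply text sits under `choices[0].message.content`.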

→ Full setup guide — installation, configuration, CLI reference, and more.

Architecture

┌──────────────────────────────────────────────────┐
│                Nexus Orchestrator                │
│  - Discovers backends via mDNS                   │
│  - Tracks model capabilities & quality           │
│  - Routes to best available backend              │
│  - Queues requests when backends are busy        │
│  - OpenAI-compatible API + Embeddings            │
└──────────────────────────────────────────────────┘
        │           │           │           │
        ▼           ▼           ▼           ▼
   ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐
   │ Ollama │  │  vLLM  │  │  exo   │  │ OpenAI │
   │  7B    │  │  70B   │  │  32B   │  │ cloud  │
   └────────┘  └────────┘  └────────┘  └────────┘
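
The "queues requests when backends are busy" step in the diagram can be sketched as a bounded priority queue. This is a hypothetical Python illustration, not Nexus's implementation: the `RequestQueue` class, the convention that a lower number means higher priority, and the `max_size` limit are all assumptions.

```python
import heapq
import itertools


class RequestQueue:
    """Bounded priority queue: lower number = higher priority, FIFO within a level."""

    def __init__(self, max_size=100):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves arrival order
        self._max_size = max_size

    def enqueue(self, request, priority=1):
        """Hold a request until a backend frees up; return False if the queue is full."""
        if len(self._heap) >= self._max_size:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), request))
        return True

    def dequeue(self):
        """Return the next request to dispatch, or None if nothing is waiting."""
        if not self._heap:
            return None
        _, _, request = heapq.heappop(self._heap)
        return request
```

The arrival counter keeps requests of equal priority in first-in, first-out order and means the heap never has to compare request objects directly.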

Documentation

	Document	What you'll find
🚀	Getting Started	Installation, configuration, CLI, environment variables
📖	REST API	HTTP endpoints, X-Nexus-* headers, error responses
🔌	WebSocket API	Real-time dashboard protocol
🏗️	Architecture	System design, module structure, data flows
🗺️	Roadmap	Feature index (F01–F23), version history, future plans
🔧	Troubleshooting	Common errors, debugging tips
❓	FAQ	What Nexus is (and isn't), common questions
🤝	Contributing	Dev workflow, coding standards, PR guidelines
📋	Changelog	Release history
🔒	Security	Vulnerability reporting

License

Apache License 2.0 — see LICENSE for details.

Related Projects

exo — Distributed AI inference
LM Studio — Desktop app for local LLMs
Ollama — Easy local LLM serving
vLLM — High-throughput LLM serving
LiteLLM — Cloud LLM API router
