A high-performance AI Gateway built with Hono and AI SDK for edge computing environments. This gateway provides unified access to multiple AI providers with intelligent fallback, streaming support, advanced tools integration, and multimedia generation capabilities.
- Unified Text API: Image and video models accessible through the standard OpenAI chat/responses and Anthropic messages endpoints
- Admin Models: Special administrative models for system management (admin/magic-vision)
- Streaming Support: Real-time responses with progress indicators
- Tool Integration: Python execution, web search, content extraction
- Response Storage: Persistent conversation management with Netlify Blobs
# Install dependencies
bun install
# Start development server
bun dev
# Build for production
bun build
POST /v1/chat/completions
POST /v1/messages
POST /v1/responses
GET /v1/models
GET /v1/responses/:response_id # Get a specific response
GET /v1/responses # List all responses
DELETE /v1/responses/:response_id # Delete a specific response
DELETE /v1/responses/all # Delete all responses
POST /v1/chat/completions (model: admin/magic-vision)
POST /v1/responses (model: admin/magic-vision)
GET /v1/files/:file # Serve a file from Netlify Blobs
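The curl examples later in this README authenticate the OpenAI-format endpoints with an `Authorization: Bearer` header and the Anthropic-format `/v1/messages` endpoint with `x-api-key`. A minimal client-side sketch of that choice follows; `Endpoint` and `buildHeaders` are hypothetical names used for illustration and are not part of the gateway.

```typescript
// Hypothetical client helper: pick the auth header by endpoint path,
// mirroring the curl examples in this README.
type Endpoint =
  | "/v1/chat/completions"
  | "/v1/responses"
  | "/v1/messages"
  | "/v1/models";

function buildHeaders(endpoint: Endpoint, password: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (endpoint === "/v1/messages") {
    // Anthropic-format endpoint: key goes in x-api-key
    headers["x-api-key"] = password;
  } else {
    // OpenAI-format endpoints: bearer token
    headers["Authorization"] = `Bearer ${password}`;
  }
  return headers;
}
```

The returned object can be passed as the `headers` option of whatever HTTP client you use.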
- Gateway: Vercel AI Gateway
- Direct Providers: Vercel AI Gateway (Gateway), OpenAI (ChatGPT), Google Generative AI (Gemini), Groq, Cerebras, OpenRouter, Poe, Volcengine (Doubao), ModelScope, Infini, Nvidia, Mistral, Poixe, Cohere, Morph, GitHub Models (GitHub), GitHub Copilot (Copilot), Cloudflare Gateway (Cloudflare), Meituan (LongCat), and any custom OpenAI chat/completions compatible providers.
- Gemini Image (Nano Banana): Gemini native image generation: t2i and i2i
- GPT Image (image_generation tool): GPT-5 series native image generation: t2i and i2i
- Black Forest Labs: FLUX models via Vercel AI Gateway: t2i and i2i
- Doubao (ByteDance): t2i/i2i (Seedream) and t2v/i2v (Seedance)
- ModelScope: Community models for t2i and i2i (i2i requires Netlify Blobs)
- Hugging Face: Community models for t2i, i2i, t2v, and i2v
If the required environment variables are set, the following tools are enabled by adding tools to the request body, even as an empty array (except when Anthropic-format client tools are provided). In Cherry Studio, enabling the model's built-in search triggers this:
- Code Execution: Python Executor API python_executor or model built-in (Gateway and Custom Gemini code_execution, Gateway Anthropic code_execution, Gateway OpenAI code_interpreter)
- Web Search: Tavily Search API web_search or model built-in (Gateway and Custom Gemini google_search, Gateway OpenAI web_search_preview, Gateway Anthropic web_search, Gateway Grok mode = 'on', Gateway Perplexity always on regardless of tools)
- Content Extraction: Jina Reader API fetch or model built-in (Gateway and Custom url_context)
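Since the tools above are switched on by the mere presence of a tools array, a client only needs to make sure the field exists. The sketch below shows one way to do that; `ChatRequest` and `enableGatewayTools` are hypothetical names, and the request shape is a simplified assumption based on the curl examples in this README.

```typescript
// Simplified request shape (assumption, based on the curl examples).
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  tools?: unknown[];
  stream?: boolean;
}

// Returns a copy of the request that always carries a `tools` array.
// Existing tools are kept; otherwise an empty array is added.
function enableGatewayTools(req: ChatRequest): ChatRequest {
  return { ...req, tools: req.tools ?? [] };
}
```

Serializing the result with `JSON.stringify` gives a body containing `"tools":[]` when no tools were supplied.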
In OpenAI endpoints, research mode is triggered by detecting the keywords research and paper in the conversation. The default search depth and reasoning effort increase, and all tools above (except python_executor) plus the research APIs below are enabled:
- Research APIs: Ensembl API ensembl_api, Semantic Scholar APIs scholar_search and paper_recommendations
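To anticipate whether a conversation is likely to enter research mode, a client could run a similar keyword check before sending. This is only a sketch: the gateway's real detection logic is not shown in this README, and the code assumes that a case-insensitive match on either keyword in any message is enough. `mightTriggerResearchMode` is a hypothetical name.

```typescript
// Assumption: either keyword, matched case-insensitively, triggers research mode.
const RESEARCH_KEYWORDS = ["research", "paper"];

function mightTriggerResearchMode(
  messages: { role: string; content: string }[]
): boolean {
  return messages.some((m) => {
    const text = m.content.toLowerCase();
    return RESEARCH_KEYWORDS.some((k) => text.includes(k));
  });
}
```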
# Required
PASSWORD=your-gateway-password
GATEWAY_API_KEY=your-vercel-ai-gateway-key
# Optional tools
TAVILY_API_KEY=tvly-dev-...
PYTHON_API_KEY=your-python-key
PYTHON_URL=https://your-python-executor.com
# Use Netlify Blobs in non-Netlify platforms
NETLIFY_SITE_ID=your-netlify-site-id
NETLIFY_TOKEN=nfp_...
URL=http://localhost:8888 # Optional site URL to upload files
# Optional provider-specific keys
CHATGPT_API_KEY=sk-proj-...,sk-proj-...,sk-proj-...
GROQ_API_KEY=gsk_...
CEREBRAS_API_KEY=csk-...
GEMINI_API_KEY=AIzaSy...
CHATGPT_API_KEY=sk-proj-...
DOUBAO_API_KEY=your-volcengine-key
MODELSCOPE_API_KEY=ms-...
GITHUB_API_KEY=github_pat_...
OPENROUTER_API_KEY=sk-or-v1-...
NVIDIA_API_KEY=nvapi-...
MISTRAL_API_KEY=your-mistral-key
COHERE_API_KEY=your-cohere-api-key
MORPH_API_KEY=sk-...
INFINI_API_KEY=sk-...
POIXE_API_KEY=sk-...
COPILOT_API_KEY=ghu_...
POE_API_KEY=your-poe-api-key
HUGGINGFACE_API_KEY=hf_...
LONGCAT_API_KEY=ak_...
# Cloudflare Gateway (GPT-OSS series currently not supported)
CLOUDFLARE_API_KEY=your-cloudflare-api-key
CLOUDFLARE_ACCOUNT_ID=your-cloudflare-account-id
CLOUDFLARE_GATEWAY=your-cloudflare-gateway-name
# Custom OpenAI chat/completions format providers
CUSTOM_API_ENDPOINTS={"internai":{"baseURL":"https://chat.intern-ai.org.cn/api/v1"},"lmstudio":{"baseURL":"http://localhost:1234/v1"}}
INTERNAI_API_KEY=each-custom-provider-must-have-at-least-a-key
LMSTUDIO_API_KEY=each-custom-provider-must-have-at-least-a-key
- LLM Providers: provider/model for custom providers, model for Vercel AI Gateway, e.g.:
  - For Gemini native image generation: google/gemini-3-pro-image (Vercel AI Gateway), gemini/gemini-2.5-flash-image (Google Generative AI)
  - For ChatGPT native image generation: add -image suffix to model ID, e.g. openai/gpt-5.1-image (Vercel AI Gateway)
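The provider/model convention ties in with the CUSTOM_API_ENDPOINTS setting from the environment example. The sketch below splits a model ID at its first slash and looks the prefix up in that JSON. The uppercase `<PROVIDER>_API_KEY` naming is inferred from the INTERNAI_API_KEY and LMSTUDIO_API_KEY examples rather than from gateway source, and `resolveCustomModel` plus the model name in the usage line are hypothetical.

```typescript
// Shape of CUSTOM_API_ENDPOINTS as shown in the environment example.
type CustomEndpoints = Record<string, { baseURL: string }>;

interface ResolvedModel {
  provider: string;
  model: string;
  baseURL: string;
  keyEnvVar: string; // assumed naming: <PROVIDER>_API_KEY
}

// Split "provider/model" at the first slash and look up the provider.
// Returns undefined when the prefix is not a configured custom provider.
function resolveCustomModel(
  modelId: string,
  endpointsJson: string
): ResolvedModel | undefined {
  const endpoints = JSON.parse(endpointsJson) as CustomEndpoints;
  const slash = modelId.indexOf("/");
  if (slash <= 0) return undefined;
  const provider = modelId.slice(0, slash);
  const entry = endpoints[provider];
  if (!entry) return undefined;
  return {
    provider,
    model: modelId.slice(slash + 1),
    baseURL: entry.baseURL,
    keyEnvVar: `${provider.toUpperCase()}_API_KEY`,
  };
}
```

For example, `resolveCustomModel("lmstudio/my-local-model", endpointsJson)` would point at the `lmstudio` base URL and the `LMSTUDIO_API_KEY` variable, while an ID whose prefix is not configured yields `undefined`.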
- Black Forest Labs: image/bfl/flux-2-pro, image/bfl/flux-kontext-pro etc. (image/bfl/ + BFL model ID via Vercel AI Gateway)
- Doubao (ByteDance): image/doubao - i2i and t2i.
- Hugging Face: image/huggingface/black-forest-labs/FLUX.1-Kontext-dev etc. (image/huggingface/ + any Hugging Face internal model ID)
- ModelScope: image/modelscope/Qwen/Qwen-Image etc. (image/modelscope/ + any ModelScope internal model ID; i2i requires Netlify Blobs)
- Flags: --size WxH, --ratio A:B, --guidance N, --steps N, --seed N etc. (Send /help for help)
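The flags travel inside the prompt text itself, as in the image examples further down. To illustrate the syntax, the sketch below separates `--name value` pairs from the rest of the prompt. It is not the gateway's parser, which is not shown in this README; `parsePromptFlags` is a hypothetical name and the code assumes every flag takes exactly one whitespace-free value.

```typescript
interface ParsedPrompt {
  prompt: string;
  flags: Record<string, string>;
}

// Pull "--name value" pairs out of a prompt; what remains is the prompt text.
// Assumption: each flag is lowercase and followed by one whitespace-free value.
function parsePromptFlags(input: string): ParsedPrompt {
  const flags: Record<string, string> = {};
  const prompt = input
    .replace(/--([a-z]+)\s+(\S+)/g, (_match: string, name: string, value: string) => {
      flags[name] = value;
      return "";
    })
    .replace(/\s+/g, " ")
    .trim();
  return { prompt, flags };
}
```

Values stay as strings here; converting `--steps` or `--seed` to numbers is left to the caller.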
- Doubao Seedance: video/doubao-seedance, video/doubao-seedance-pro (t2v and i2v)
- Hugging Face: video/Wan-AI/Qwen-Wan2.2-I2V-A14B-vision etc. (Any Hugging Face Inference model, t2v and i2v)
- Flags: --ratio 16:9, --duration 3-12, --resolution 720p etc. (Send /help for help)
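Going the other way, a client can assemble a video prompt from structured options. The sketch below appends the flags listed above to the prompt text. It reads `--duration 3-12` as an allowed range of 3 to 12, presumably seconds, which is an interpretation rather than documented behavior; `VideoOptions` and `buildVideoPrompt` are hypothetical names.

```typescript
interface VideoOptions {
  ratio?: string;      // e.g. "16:9"
  duration?: number;   // assumed to be seconds, in the range 3 to 12
  resolution?: string; // e.g. "720p"
}

// Append video flags to the prompt text, in a fixed order.
function buildVideoPrompt(text: string, opts: VideoOptions): string {
  const parts: string[] = [text.trim()];
  if (opts.ratio !== undefined) parts.push(`--ratio ${opts.ratio}`);
  if (opts.duration !== undefined) {
    if (opts.duration < 3 || opts.duration > 12) {
      throw new RangeError("duration is assumed to be between 3 and 12");
    }
    parts.push(`--duration ${opts.duration}`);
  }
  if (opts.resolution !== undefined) parts.push(`--resolution ${opts.resolution}`);
  return parts.join(" ");
}
```

The result is meant to be used as the `content` of the user message sent to a `video/` model.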
- System Management: admin/magic-vision (Send /help for help)
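The model IDs listed above share a prefix convention: image/, video/, and admin/, with everything else treated as a text model. A sketch of dispatching on that prefix follows; `ModelKind` and `classifyModel` are hypothetical names, and the gateway's own routing may differ.

```typescript
type ModelKind = "image" | "video" | "admin" | "text";

// Classify a model ID by its leading path segment.
function classifyModel(modelId: string): ModelKind {
  if (modelId.startsWith("image/")) return "image";
  if (modelId.startsWith("video/")) return "video";
  if (modelId.startsWith("admin/")) return "admin";
  return "text";
}
```

Native image generation IDs such as openai/gpt-5.1-image carry no image/ prefix, so this sketch classifies them as text models.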
# Basic chat completion with password authentication
curl -X POST "$HOSTNAME/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "cerebras/gpt-oss-120b",
"messages": [
{"role": "user", "content": "What is the capital of France?"}
],
"temperature": 1.0,
"max_tokens": 1000,
"stream": false
}'
# Basic message completion with Anthropic format
curl -X POST "$HOSTNAME/v1/messages" \
-H "Content-Type: application/json" \
-H "x-api-key: $PASSWORD" \
-d '{
"model": "anthropic/claude-sonnet-4",
"messages": [
{"role": "user", "content": "What is the capital of France?"}
],
"max_tokens": 1000,
"stream": false
}'
# Message with tools and web search capabilities
curl -X POST "$HOSTNAME/v1/messages" \
-H "Content-Type: application/json" \
-H "x-api-key: $PASSWORD" \
-d '{
"model": "anthropic/claude-opus-4.5",
"messages": [
{"role": "user", "content": "Search for the latest developments in AI research"}
],
"tools": [
{
"name": "paper_search",
"description": "Search the web for papers",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"}
},
"required": ["query"]
}
}
],
"max_tokens": 2000,
"stream": true
}'
# Image editing using Doubao (ByteDance) models
curl -X POST "$HOSTNAME/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "image/doubao-vision",
"messages": [
{"role": "user", "content": "A beautiful sunset over mountains --size 1280x720 --guidance 7.5"}
],
"stream": true
}'
# Generate images using ModelScope models (model: image/ + ModelScope model ID)
curl -X POST "$HOSTNAME/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "image/Qwen/Qwen-Image",
"messages": [
{"role": "user", "content": "Cyberpunk cityscape at night --steps 30 --guidance 3.5"}
]
}'
# Generate videos using Doubao Seedance models
curl -X POST "$HOSTNAME/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "video/doubao-seedance",
"messages": [
{"role": "user", "content": "A cat playing in a garden --ratio 16:9 --duration 5"}
],
"stream": true
}'
# Chat with search (enabled by adding tools, even an empty array) and custom provider key
curl -X POST "$HOSTNAME/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer gsk_...,gsk_..." \
-d '{
"model": "groq/moonshotai/kimi-k2-instruct",
"messages": [
{"role": "user", "content": "Search for information about CRISPR gene editing"}
],
"tools": []
}'
curl -X POST "$HOSTNAME/v1/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "gemini/gemini-2.5-flash",
"input": [
{"role": "user", "content": [{"type": "input_text", "text": "What are the latest developments in AI?"}]}
],
"tools": []
}'
# Generate image with reasoning steps shown
curl -X POST "$HOSTNAME/v1/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "image/doubao-vision",
"input": [
{"role": "user", "content": [
{"type": "input_text", "text": "Create a logo for a tech startup"},
{"type": "input_image", "image_url": "data:image/jpeg;base64,..."}
]}
],
"stream": true
}'
# Use admin model for system management (here: delete all stored responses)
curl -X POST "$HOSTNAME/v1/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "admin/magic-vision",
"input": "deleteall"
}'
# Create response with conversation continuation
curl -X POST "$HOSTNAME/v1/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD" \
-d '{
"model": "gemini/gemini-2.5-flash",
"input": "Tell me more about that topic",
"previous_response_id": "resp_abc123456789"
}'
# Get all available models including text, image, video, and admin models
curl -X GET "$HOSTNAME/v1/models" \
-H "Authorization: Bearer $PASSWORD"
# POST method and password auth also work for models
curl -X POST "$HOSTNAME/v1/models" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PASSWORD"
# Get response by ID
curl -X GET "$HOSTNAME/v1/responses/resp_abc123" \
-H "Authorization: Bearer $PASSWORD"
# Get response with streaming
curl -X GET "$HOSTNAME/v1/responses/resp_abc123?stream=true" \
-H "Authorization: Bearer $PASSWORD"
# List all stored responses
curl -X GET "$HOSTNAME/v1/responses" \
-H "Authorization: Bearer $PASSWORD"
# List with filters
curl -X GET "$HOSTNAME/v1/responses?prefix=resp_&limit=10" \
-H "Authorization: Bearer $PASSWORD"
# Delete response by ID
curl -X DELETE "$HOSTNAME/v1/responses/resp_abc123" \
-H "Authorization: Bearer $PASSWORD"
# Delete all stored responses
curl -X DELETE "$HOSTNAME/v1/responses/all" \
-H "Authorization: Bearer $PASSWORD"
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Client App    │───▶│   Hono Gateway   │───▶│  AI Providers   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │  Modular System  │
                       │  • Text Models   │
                       │  • Image Models  │
                       │  • Video Models  │
                       │  • Admin Models  │
                       │  • Tools Layer   │
                       └──────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │   Data Storage   │
                       │  • Netlify Blobs │
                       │  • Response Mgmt │
                       │  • Conversation  │
                       └──────────────────┘
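The Data Storage layer in the diagram backs the /v1/responses endpoints and the previous_response_id field shown in the examples. As a rough illustration of how such chaining can work, here is an in-memory stand-in. The gateway itself uses Netlify Blobs and its storage code is not shown in this README, so `StoredResponse`, `ResponseStore`, and every field other than `previous_response_id` are assumptions.

```typescript
// Hypothetical record shape; only previous_response_id comes from the README.
interface StoredResponse {
  id: string;
  input: string;
  output: string;
  previous_response_id?: string;
}

// In-memory stand-in for the blob store (the gateway itself uses Netlify Blobs).
class ResponseStore {
  private items = new Map<string, StoredResponse>();

  save(r: StoredResponse): void {
    this.items.set(r.id, r);
  }

  get(id: string): StoredResponse | undefined {
    return this.items.get(id);
  }

  delete(id: string): boolean {
    return this.items.delete(id);
  }

  // List stored IDs, optionally filtered by prefix and capped by limit.
  list(prefix = "", limit = Infinity): string[] {
    return Array.from(this.items.keys())
      .filter((k) => k.startsWith(prefix))
      .slice(0, limit);
  }

  // Walk previous_response_id links back to the first turn, oldest first.
  history(id: string): StoredResponse[] {
    const chain: StoredResponse[] = [];
    const seen = new Set<string>(); // guards against accidental cycles
    let current = this.items.get(id);
    while (current && !seen.has(current.id)) {
      seen.add(current.id);
      chain.unshift(current);
      current = current.previous_response_id
        ? this.items.get(current.previous_response_id)
        : undefined;
    }
    return chain;
  }
}
```

Calling `history` with the ID passed as previous_response_id returns the earlier turns in order, which a server could prepend to the new input before calling the model.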