Track Every Request. Go Low-Level. Never Hit Rate Limits Again.
The ultimate Claude API proxy with intelligent load balancing across multiple accounts. Full visibility into every request, response, and rate limit.
video_compressed.mp4
- 🚀 Zero Rate Limit Errors - Automatically distribute requests across multiple accounts
- 🤖 Multi-Provider Support - Claude OAuth, Claude API console, Vertex AI, AWS Bedrock, NanoGPT, z.ai, Minimax, OpenRouter, Kilo, Codex (OpenAI OAuth), Anthropic-compatible, and OpenAI-compatible providers
- 🔒 OAuth Token Health - Real-time monitoring of OAuth token status with automatic refresh and health indicators
- 🔗 Custom API Endpoints - Configure custom endpoints for Anthropic accounts for enterprise deployments
- ☁️ OpenAI-Compatible Support - Use OpenAI-compatible providers like OpenRouter, Together AI, and more with Claude API format
- 🧩 Codex / Responses API Compatibility -
POST /v1/responsesandPOST /v1/responses/compactare translated to Anthropic/v1/messages - 🔄 Smart Auto-Fallback - Automatically switch back to preferred accounts when their rate limits reset
- ⚡ Auto-Refresh - Automatically start new usage windows when rate limits reset with 30-minute buffer
- 📊 Request-Level Analytics - Track latency, token usage, and costs in real-time with optimized batch processing
- 🔍 Deep Debugging - Full request/response logging and error traces
- 🔐 API Authentication - Optional API key authentication with secure key management
- ⚡ <10ms Overhead - Minimal performance impact with lazy loading and request deduplication
- 🛡️ Security Hardened - Critical security fixes for authentication bypass, command injection, and credential leakage
- 💸 Free & Open Source - Run it yourself, modify it, own your infrastructure
This project builds upon the excellent foundation of snipeship/ccflare with significant enhancements:
🎯 Core Improvements (v3.0.0):
- Enhanced Security - Critical fixes for authentication bypass, command injection, and PKCE implementation
- OAuth Token Health Monitoring - Real-time status indicators and automatic token refresh with 30-minute buffer
- Extended Provider Support - AWS Bedrock, NanoGPT (with dynamic pricing), Minimax, OpenRouter, Kilo, Codex (OpenAI OAuth), Anthropic-compatible, and OpenAI-compatible providers
- Simplified Load Balancing - Removed tier system for O(1) priority-based selection
- Real-time Analytics Dashboard - Beautiful web UI with fixed request history (no disappearing requests)
- Package Distribution - Available via npm and bun for easy installation
🛠️ Developer Experience:
- Powerful CLI - Complete command-line interface for account management and configuration
- REST API - Complete API for automation and integration
- Cross-Platform Binary - Pre-compiled binary works with Node.js or Bun
- Comprehensive Logging - Request/response tracking with searchable history
- Database Integration - SQLite (default) or PostgreSQL for persistent storage and analytics, supporting Kubernetes multi-pod deployments
📦 Distribution & Updates:
- npm/bun Registry - Install with
npm install -g better-ccflareorbun install -g better-ccflare - npx/bunx Support - Run without installation:
npx better-ccflareorbunx better-ccflare - Smart Update Detection - Web UI detects package manager and shows appropriate update commands
- Version Management - Semantic versioning with automatic update notifications
🏢 Production Ready:
- Enterprise Features - Custom API endpoints, session management, advanced analytics
- Performance Optimized - <10ms overhead with request deduplication and caching
- Reliability - Automatic error recovery, circuit breakers, and health monitoring
- Scalability - Built for high-throughput production environments
- PostgreSQL Support - Set
DATABASE_URL=postgresql://...to use PostgreSQL for Kubernetes multi-pod deployments where SQLite file-sharing is not feasible
npm install -g better-ccflare
# Start better-ccflare (Server + Dashboard)
better-ccflareContinue to Configure Claude SDK.
"C:\\Program Files\\nodejs\\\\node_modules\\better-ccflare\\dist\\better-ccflare" is either misspelled or could not be found, this is a known npm bug on Windows affecting how npm generates wrapper scripts. See Windows Troubleshooting for workarounds.
bun install -g better-ccflare
# Start better-ccflare (Server + Dashboard)
better-ccflareContinue to Configure Claude SDK.
Download the appropriate binary for your platform from GitHub Releases:
wget https://github.com/tombii/better-ccflare/releases/latest/download/better-ccflare-linux-amd64
chmod +x better-ccflare-linux-amd64
./better-ccflare-linux-amd64Continue to Configure Claude SDK.
wget https://github.com/tombii/better-ccflare/releases/latest/download/better-ccflare-linux-arm64
chmod +x better-ccflare-linux-arm64
./better-ccflare-linux-arm64Continue to Configure Claude SDK.
curl -L -o better-ccflare-macos-x86_64 https://github.com/tombii/better-ccflare/releases/latest/download/better-ccflare-macos-x86_64
chmod +x better-ccflare-macos-x86_64
# Remove quarantine attribute (required on macOS to run unsigned binaries)
xattr -d com.apple.quarantine better-ccflare-macos-x86_64
./better-ccflare-macos-x86_64Continue to Configure Claude SDK.
curl -L -o better-ccflare-macos-arm64 https://github.com/tombii/better-ccflare/releases/latest/download/better-ccflare-macos-arm64
chmod +x better-ccflare-macos-arm64
# Remove quarantine attribute (required on macOS to run unsigned binaries)
xattr -d com.apple.quarantine better-ccflare-macos-arm64
./better-ccflare-macos-arm64Continue to Configure Claude SDK.
macOS Gatekeeper Notice: Our macOS binaries are not notarized by Apple as this requires a paid Apple Developer subscription. After downloading, you must remove the quarantine attribute using the xattr command shown above to run the binary. If you prefer not to run unsigned binaries, you can install from source instead.
Download better-ccflare-windows-x64.exe and run it.
Continue to Configure Claude SDK.
# Run with npx (downloads and executes latest version)
npx better-ccflare@latest
# Run with bunx (faster for bun users)
bunx better-ccflare@latestContinue to Configure Claude SDK.
# Clone and install
git clone https://github.com/tombii/better-ccflare
cd better-ccflare
bun install
# Build dashboard (required before first run)
bun run build
# Start better-ccflare (TUI + Server)
bun run better-ccflareContinue to Configure Claude SDK.
Note: You must run bun run build at least once to build the dashboard files before starting the server. This can also be done by running bun run better-ccflare which includes the build step.
better-ccflare supports several environment variables for configuration:
# Server Configuration
PORT=8080 # Server port (default: 8080)
BETTER_CCFLARE_HOST=0.0.0.0 # Server binding host (default: 0.0.0.0, use 127.0.0.1 for localhost-only)
CLIENT_ID=your-client-id # OAuth client ID
BETTER_CCFLARE_CONFIG_PATH=/path/to/config.json # Custom config location
BETTER_CCFLARE_DB_PATH=/path/to/database.db # Custom database path (default: ~/.config/better-ccflare/better-ccflare.db)
# Use this for development/testing with a separate database
# Logging and Debugging
LOG_LEVEL=INFO # Log level (ERROR, WARN, INFO, DEBUG)
LOG_FORMAT=json # Log format (json or text)
better-ccflare_DEBUG=0 # Enable debug mode (1 for enabled)
# SSL/TLS Configuration
SSL_KEY_PATH=/path/to/key.pem # SSL private key path (for HTTPS)
SSL_CERT_PATH=/path/to/cert.pem # SSL certificate path (for HTTPS)
# Load Balancing
LB_STRATEGY=session # Load balancing strategy (default: session)
SESSION_DURATION_MS=18000000 # Session duration in milliseconds (5 hours)
# Retry Configuration
RETRY_ATTEMPTS=3 # Number of retry attempts
RETRY_DELAY_MS=1000 # Initial retry delay in milliseconds
RETRY_BACKOFF=2 # Retry backoff multiplier
# Health endpoint
HEALTH_DETAIL_ENABLED=false # Enable ?detail=1 on /health to expose per-account status (default: off, set true for internal monitoring)
# Agent Discovery
BETTER_CCFLARE_DISCOVER_PLUGIN_AGENTS=false # Set to true to discover agents distributed by Claude Code plugins
# (reads ~/.claude/plugins/installed_plugins.json)
# Storage
STORE_PAYLOADS=false # Disable storing request/response bodies (reduces DB size and memory usage)
# Token counts, costs, model, status and timing are still recorded
# Payload encryption at rest (optional)
# When set, request/response payloads are encrypted with AES-256-GCM before
# being written to `request_payloads`. Existing plaintext rows remain readable.
# Generate with: openssl rand -hex 32
PAYLOAD_ENCRYPTION_KEY= # 64-character hex (32 bytes / AES-256). Unset = plaintext storage.Encryption notes:
- Without a key, payloads are stored as plaintext (no behavior change from prior versions).
- Losing the key makes encrypted rows unreadable — payload reads throw rather than silently returning garbage. Back the key up alongside the database.
- The key is read once at process start (and once per Bun worker). Rotating it requires a re-encrypt migration; not yet built.
Security Notes:
- Use
BETTER_CCFLARE_HOST=127.0.0.1to bind only to localhost for better security - Never commit
.envfiles containing sensitive values to version control - Use environment-specific configuration for production deployments
better-ccflare automatically supports .env files for easy configuration management. You can create a .env file in your project directory:
# Copy the example .env file
cp .env.example .env
# Edit with your configuration
nano .envSupported across all deployment methods:
- CLI Binary: Automatically loads
.envfrom current working directory - Docker Compose: Automatically loads
.envfrom the same directory asdocker-compose.yml - Docker: Mount your
.envfile or pass variables directly
Example .env file:
# Server Configuration
PORT=8080
# SSL/TLS Configuration (optional)
SSL_KEY_PATH=/path/to/ssl/key.pem
SSL_CERT_PATH=/path/to/ssl/cert.pem
# Load Balancing
LB_STRATEGY=session
# Logging and Debugging
LOG_LEVEL=INFO
LOG_FORMAT=pretty
# Database configuration
DATA_RETENTION_DAYS=3
REQUEST_RETENTION_DAYS=90
# Storage (set to false to skip storing request/response bodies, reducing DB size and memory pressure)
STORE_PAYLOADS=trueUsage with different deployment methods:
# CLI (binary or local development)
better-ccflare --serve
# Docker Compose (place .env alongside docker-compose.yml)
docker-compose up
# Docker (mount .env file)
docker run -v $(pwd)/.env:/app/.env:ro -p 8080:8080 ghcr.io/tombii/better-ccflare:latest# Quick start with docker-compose
curl -O https://raw.githubusercontent.com/tombii/better-ccflare/main/docker-compose.yml
# Optional: Create and configure .env file
cp .env.example .env
# Edit .env with your settings (SSL, port, etc.)
nano .env
# Start with docker-compose (automatically loads .env file)
docker-compose up -d
# Or use docker run with environment variables
docker run -d \
--name better-ccflare \
-p 8080:8080 \
-v better-ccflare-data:/data \
-e SSL_KEY_PATH=/path/to/ssl/key.pem \
-e SSL_CERT_PATH=/path/to/ssl/cert.pem \
ghcr.io/tombii/better-ccflare:latest
# View logs
docker logs -f better-ccflareOnce the container is running, open http://localhost:8080 in your browser to add and manage accounts through the Web UI. This is the recommended way — using docker exec to run CLI commands inside the container won't work for OAuth-based account modes since the container has no browser.
🆕 Environment Variable Support: Docker Compose now automatically loads .env files from the same directory as docker-compose.yml. Simply create a .env file alongside your docker-compose.yml file and the container will use those settings.
Available Docker tags:
latest- Latest stable releasemain- Latest build from main branch1.2.28,1.2,1- Specific version tagssha-abc123- Commit-specific tags
See DOCKER.md for detailed Docker documentation.
For running better-ccflare as a native systemd service on Linux (without Docker), see the Systemd Deployment Guide. It covers unit file configuration, memory management with --smol, restart policies, and a preflight script that prevents BUN_JSC_* environment variable crashes.
If you have a Claude Pro or Team subscription and are logged into Claude CLI:
# Set only the base URL - no API key needed!
export ANTHROPIC_BASE_URL=http://localhost:8080
# Make sure to configure your accounts in the better-ccflare dashboard
# Start Claude CLI (uses your existing login)
claudeImportant: When using Claude CLI with an active OAuth login, do NOT set ANTHROPIC_AUTH_TOKEN. Setting both will trigger a warning from Claude CLI about conflicting authentication methods.
If you're NOT using Claude CLI's OAuth login, or prefer API key authentication:
# First, logout from Claude CLI if you're currently logged in
claude /logout
# Then set both the base URL and API key
export ANTHROPIC_BASE_URL=http://localhost:8080
# If better-ccflare has NO API keys configured (open access):
export ANTHROPIC_AUTH_TOKEN=dummy-key
# If better-ccflare HAS API keys configured (protected):
# Generate a key first: better-ccflare --generate-api-key "My VPS"
export ANTHROPIC_AUTH_TOKEN=btr-abcdef1234567890... # Use your real better-ccflare API key
# Make sure to configure your accounts in the better-ccflare dashboard
# Start Claude CLI
claudeUse better-ccflare on a trusted server to avoid storing OAuth credentials on untrusted/temporary machines:
On your trusted server (running better-ccflare):
# Add your Claude account with OAuth
better-ccflare --add-account myaccount --mode claude-oauth --priority 0
# Generate an API key for remote access
better-ccflare --generate-api-key "Remote VPS"
# Save the generated key: btr-abcdef1234567890...
# Start the server (ensure it's accessible remotely)
better-ccflare --serveOn your untrusted/temporary VPS:
# Set the remote better-ccflare URL and API key
export ANTHROPIC_BASE_URL=https://your-server.com:8080
export ANTHROPIC_AUTH_TOKEN=btr-abcdef1234567890... # Your better-ccflare API key
# Start Claude CLI (no need to login - better-ccflare handles auth)
claudeHow it works:
- Claude Code CLI sends requests with your better-ccflare API key
- better-ccflare validates the API key and proxies requests using its stored OAuth credentials
- Your OAuth credentials stay secure on your trusted server
- You can use Claude Code on any machine without storing sensitive credentials
- Have Claude Pro/Team and working locally? Use Option 1 (OAuth only) - simpler and no API key needed
- Working on untrusted/temporary machines? Use Option 3 (Remote VPS setup) - keeps credentials secure
- Using only API keys in better-ccflare? Use Option 2 (logout + API key)
- Getting auth conflict warnings? You have both methods active - choose one and follow its steps above
better-ccflare supports Codex CLI as a client. Codex speaks the OpenAI Responses API; better-ccflare intercepts requests to /v1/responses and /v1/responses/compact and translates them to Anthropic POST /v1/messages internally, routing through your configured account pool.
Configure Codex CLI to point at better-ccflare in ~/.codex/config.toml:
openai_base_url = "http://127.0.0.1:8080/v1"Note: use 127.0.0.1 instead of localhost — Codex CLI has a known issue where localhost resolves to IPv6 first and causes connection failures. The /v1 suffix is required; Codex appends /responses to the base URL.
Or via environment variables:
export OPENAI_BASE_URL=http://127.0.0.1:8080/v1
export OPENAI_API_KEY=dummy-keyCodex CLI requires an API key to start — use dummy-key if better-ccflare API key authentication is not enabled, or your real better-ccflare API key if it is.
Known limitations:
previous_response_idis accepted but ignored — Codex uses this only over WebSocket; for regular HTTP requests it always sends the full conversation history ininput- Built-in tool types (
web_search_preview,code_interpreter,file_search) are silently skipped; onlytype: "function"tools are forwarded to Anthropic - Claude OAuth accounts (Claude Pro/Team,
provider=anthropicwith OAuth tokens) are automatically excluded from Codex CLI traffic — Anthropic bans these when used outside Claude CLI. Anthropic API key accounts are fine and will be used normally.
To enable HTTPS with better-ccflare, you'll need SSL certificates. Here are your options:
# Generate a self-signed certificate on the better-ccflare host
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 365 -nodes \
-subj "/C=US/ST=State/L=City/O=Organization/CN=yourhostname"
# Start better-ccflare with SSL
export SSL_KEY_PATH=/path/to/key.pem
export SSL_CERT_PATH=/path/to/cert.pem
better-ccflare
# Or use command line flags
better-ccflare --ssl-key /path/to/key.pem --ssl-cert /path/to/cert.pemTrust the self-signed certificate on client machines:
For self-signed certificates, you need to add the certificate to your system's trusted certificates:
-
Linux (Ubuntu/Debian):
# Copy cert.pem from the better-ccflare host to your client machine sudo cp cert.pem /usr/local/share/ca-certificates/better-ccflare.crt sudo update-ca-certificates -
Linux (Arch/Manjaro):
# Copy cert.pem from the better-ccflare host to your client machine sudo cp cert.pem /etc/ca-certificates/trust-source/anchors/better-ccflare.crt sudo trust extract-compat -
macOS:
# Copy cert.pem from the better-ccflare host to your client machine sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain cert.pem -
Windows (PowerShell as Administrator):
# Copy cert.pem from the better-ccflare host to your client machine Import-Certificate -FilePath cert.pem -CertStoreLocation Cert:\LocalMachine\Root
Configure Claude Code to use the trusted certificate:
After adding the certificate to your system's trusted store, configure your environment:
# Add to your ~/.bashrc or ~/.zshrc
export NODE_OPTIONS="--use-system-ca"
export ANTHROPIC_BASE_URL=https://yourhostname:8080The NODE_OPTIONS="--use-system-ca" is required for Claude Code and other Node.js-based clients to use the system certificate store. Without this, Node.js will not trust your self-signed certificate even if it's in the system store.
If you're running better-ccflare on a server with a domain name, use Let's Encrypt or your certificate provider:
# Using Let's Encrypt certificates
export SSL_KEY_PATH=/etc/letsencrypt/live/yourdomain.com/privkey.pem
export SSL_CERT_PATH=/etc/letsencrypt/live/yourdomain.com/fullchain.pem
better-ccflare
# Set the base URL to use HTTPS
export ANTHROPIC_BASE_URL=https://yourdomain.com:8080With production certificates from trusted CAs, you don't need NODE_OPTIONS="--use-system-ca" as they are already trusted.
For Docker deployments, we recommend using Traefik as a reverse proxy to handle TLS automatically with Let's Encrypt:
# docker-compose.yml
version: '3.8'
services:
traefik:
image: traefik:v3.0
command:
- "--api.insecure=true"
- "--providers.docker=true"
- "--entrypoints.web.address=:80"
- "--entrypoints.websecure.address=:443"
- "--certificatesresolvers.myresolver.acme.tlschallenge=true"
- "--certificatesresolvers.myresolver.acme.email=your-email@example.com"
- "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./letsencrypt:/letsencrypt
restart: unless-stopped
better-ccflare:
image: ghcr.io/tombii/better-ccflare:latest
labels:
- "traefik.enable=true"
- "traefik.http.routers.ccflare.rule=Host(`your-domain.com`)"
- "traefik.http.routers.ccflare.entrypoints=websecure"
- "traefik.http.routers.ccflare.tls.certresolver=myresolver"
- "traefik.http.services.ccflare.loadbalancer.server.port=8080"
volumes:
- ~/.config/better-ccflare:/root/.config/better-ccflare
restart: unless-stoppedBenefits:
- Automatic TLS certificate generation and renewal via Let's Encrypt
- No need to manually manage SSL certificates
- Built-in HTTP to HTTPS redirection
- Dashboard for monitoring (port 8080 on Traefik)
Client Configuration:
export ANTHROPIC_BASE_URL=https://your-domain.comNo NODE_OPTIONS needed - Traefik provides trusted certificates automatically!
Problem: "Unable to connect to API due to poor internet connection" error even with ANTHROPIC_BASE_URL set
Solutions:
-
Verify the environment variable is set in the same shell/session:
echo $ANTHROPIC_BASE_URL echo $NODE_OPTIONS
-
Test the SSL connection manually:
# Should succeed without errors curl https://yourhostname:8080/health # If you see certificate errors, the cert isn't trusted yet curl -k https://yourhostname:8080/health # -k bypasses cert check for testing
-
Verify the certificate is in the system store:
# Linux ls -la /etc/ssl/certs/ | grep better-ccflare # macOS security find-certificate -a -c yourhostname -p /Library/Keychains/System.keychain
-
Ensure the hostname resolves correctly:
ping yourhostname
-
Check that the server is actually running:
curl -k https://yourhostname:8080/health
If you installed better-ccflare via npm on Windows and encounter an error like:
The command "C:\Program Files\nodejs\\node_modules\better-ccflare\dist\better-ccflare" is either
misspelled or could not be found.
This is a known npm bug on Windows (see npm/cli#969 and nodejs/node#39010) affecting how npm generates wrapper scripts with double backslashes in paths.
Option 1: Use npx (Recommended)
npx better-ccflareThis bypasses the npm wrapper script entirely and runs better-ccflare directly.
Option 2: Use the Pre-compiled Binary
Download the standalone Windows executable from GitHub Releases:
# Download better-ccflare-windows-x64.exe and run it directly
.\better-ccflare-windows-x64.exeOption 3: Update npm
Sometimes updating to the latest npm version fixes the issue:
npm install -g npm@latest
npm install -g better-ccflareOption 4: Direct Execution
If you need to use the npm-installed version, you can execute the binary directly:
node "%APPDATA%\npm\node_modules\better-ccflare\dist\better-ccflare"Option 5: Use Bun Package Manager
Bun doesn't have this bug and works correctly on Windows:
# Install bun from https://bun.sh
bun install -g better-ccflare
better-ccflareThis issue is caused by a bug in npm's wrapper script generation on Windows, where it incorrectly constructs paths with double backslashes (\\nodejs\\\\node_modules). This is a longstanding npm bug that affects many CLI packages, not just better-ccflare.
The issue is being tracked in:
- npm/cli#969 - Generated .cmd script bugs
- nodejs/node#39010 - Double slashes in Windows paths
We recommend using one of the workarounds above until the npm bug is fixed.
- Session-based - Maintain conversation context for Claude OAuth accounts (5hr usage windows), pay-as-you-go for other providers
- Auto-fallback - Automatically switch back to higher priority Claude OAuth accounts when their usage windows reset
- Auto-refresh - Automatically start new usage windows when they reset
- Usage Window Alignment - Sessions automatically align with Claude OAuth usage window resets for optimal resource utilization
- Usage Throttling - Configurable monthly token/cost limits per account with peak-hours auto-pause for Zai accounts
- 503 on Pool Exhaustion - Returns HTTP 503 when all accounts are rate-limited or paused, enabling client-side retry logic
- Rate Limit Audit Trail - Tracks when and why each account became rate-limited (
rate_limited_reason,rate_limited_at)
- Named Combos - Create named fallback chains with ordered (account, model) slots
- Family Activation - Assign one combo per model family (Opus, Sonnet, Haiku) — independent activation toggles
- Auto Waterfall - Requests automatically fall through slots top-to-bottom, skipping unavailable accounts (rate-limited, paused)
- Per-Slot Model Override - Each slot can use a different model, enabling cross-model fallback (e.g., try Opus on provider A, then Sonnet on provider B)
- SessionStrategy Fallback - If all combo slots fail, automatically falls back to normal session-based routing
- Dashboard Management - Drag-and-drop slot builder with account provider badges, enable/disable per combo, and family assignment UI
- Token usage tracking per request with optimized batch processing
- Response time monitoring with intelligent caching
- Rate limit detection and warnings
- Cost estimation and budgeting
- Request deduplication for improved performance
- Lazy-loaded analytics components for faster initial load
- Advanced filtering by accounts, models, API keys, and request status
- API key performance tracking and detailed analytics
- Powerful CLI (
better-ccflare) - Web dashboard (
http://localhost:8080/dashboard) - CLI for account management
- REST API for automation
--doctorcommand for database integrity checks and telemetry- Reasoning effort compatibility layer for OpenAI/Codex routes (downgrade mapping,
count_tokenssupport) /healthendpoint with three-state pool status (healthy/degraded/unhealthy), 503 on degraded/unhealthy, optional?detail=1behindHEALTH_DETAIL_ENABLED
- Automatic failover between accounts
- OAuth token refresh handling
- SQLite database for persistence
- Configurable retry logic
- Custom endpoint support for enterprise deployments
- Enhanced performance with request batching and caching
- Claude OAuth - Anthropic OAuth accounts with 5-hour usage windows and session tracking (rate limit window based)
- Claude Console API - Anthropic API key accounts with pay-as-you-go model (no session tracking)
- AWS Bedrock - Native AWS Bedrock integration with SigV4 authentication, inference profile support (geographic/global/regional), and automatic credential chain resolution via AWS CLI profiles
- Vertex AI - Google Cloud Vertex AI integration with service account authentication
- z.ai, Minimax - API key based providers with pay-as-you-go model
- OpenRouter - OpenRouter integration with native API support and model mapping
- Kilo - Kilo API integration with usage tracking
- Anthropic-Compatible - Custom Anthropic-compatible providers with pay-as-you-go model
- Ollama - Local Ollama instance (v0.14.0+) via native Anthropic-compatible API at
/v1/messages, no API key required - OpenAI-Compatible - OpenAI-compatible providers (Together AI, etc.) with Claude API format
- Universal API Format - Use OpenAI-compatible providers with Claude API format
- Automatic Format Conversion - Seamless conversion between Anthropic and OpenAI request/response formats
- Model Mapping - Map Claude models (Opus, Sonnet, Haiku) to equivalent OpenAI models
- Model Fallbacks - Automatically retry with a fallback model when the requested model is unavailable (e.g., fallback from Opus to Sonnet on Pro subscriptions)
- Streaming Support - Full support for streaming responses from OpenAI-compatible providers
- API Key Authentication - Secure API key management for OpenAI-compatible providers
- Cost Tracking - Automatic cost calculation for usage monitoring and budgeting
If you encounter "All accounts failed" errors, the database runs integrity checks automatically on startup and will guide you to repair if needed. You can also manually run:
bun run cli --repair-dbThis will check integrity, fix NULL values, validate constraints, and optimize the database. See the Troubleshooting Guide for more details.
Full documentation available in docs/:
- Getting Started
- CLI Commands
- Troubleshooting
- Architecture
- API Reference
- Configuration
- Load Balancing Strategies
- Auto-Fallback Guide
- Auto-Refresh Guide
- OpenAI-Compatible Providers
- Combos — Fallback Chains
| Real-time Dashboard | Request Logs |
| Analytics & Usage Tracking | |
For installation:
- npm or bun package manager (for npm/bun installation)
- Node.js >= 18.0.0 (when installed via npm)
- Bun >= 1.2.8 (when installed via bun or running from source)
- Or download pre-compiled binary - No runtime dependencies required!
For usage:
- Claude API accounts (Free, Pro, or Team), z.ai code plan accounts, or Minimax accounts
| Platform | Architecture | Status |
|---|---|---|
| Linux | x86_64 | ✅ Supported (npm + binary) |
| Linux | ARM64 (aarch64) | ✅ Supported (binary only) |
| macOS | Intel (x64) | ✅ Supported (npm + binary) |
| macOS | Apple Silicon (ARM64) | ✅ Supported (binary only) |
| Windows | x86_64 | ✅ Supported (binary only) |
Works on:
- Oracle Cloud ARM instances (Ampere Altra)
- AWS Graviton instances
- Raspberry Pi 3/4/5 (with 64-bit OS)
- Any x86_64 or ARM64 Linux/macOS/Windows system
Not supported:
- ARM32 devices (Raspberry Pi Zero, Pi 1, Pi 2, or 32-bit OS)
Inspired by snipeship/ccflare - thanks for the original idea and implementation!
Special thanks to our contributors:
- @bitcoin4cashqc - SSL/HTTPS support implementation with comprehensive documentation
- @anonym-uz - Critical auto-pause bug fix, analytics performance optimizations, request body truncation, and incremental vacuum implementation
- @makhweeb - Enhanced request handling and analytics improvements
- @jw409 - Fixed OAuth account addition in WSL2 and compiled binaries by replacing unreliable prompt() with readline; systemd deployment guide, BUN_JSC_* crash loop analysis, and preflight environment validator (PR #106)
- @materemias - Testing and validation of Vertex AI provider implementation, thorough debugging of OAuth API key authentication (issue #54), requesting and validating AWS Bedrock support (issue #49), extensive testing of new releases and features; fix request details modal hydration race where payload row committed after parent request row, adding lazy re-fetch via
/api/requests/payload/:idwith 404 fallback (PR #186) - @tqtensor - Comprehensive memory leak fix preventing OOM kills with smart chunk capping, memory monitoring, and optimized cleanup (PR #67)
- @lunetics - Force-reset rate limit feature allowing manual clearing of stale rate-limit locks via API, CLI, and dashboard with immediate usage polling (PR #68), OOM kill prevention with periodic data retention cleanup, 3-day default retention, and time-scoped stats queries (PR #70), model registry sync removing retired models and adding sonnet-4.6 CLI shortcut (PR #71)
- @troykelly - Comprehensive PostgreSQL compatibility fixes including boolean type handling, identifier case preservation, BIGINT string coercion, UNION ALL type alignment, HAVING clause compatibility, parameter ordering corrections, worker initialization, and connection pooling (issue #81); detailed bug report and root cause analysis for
/api/accountsInvalid Date error affecting PostgreSQL BIGINT columns (issue #88) - @cowwoc - Compact reliability fixes, space breakdown after cleanup, requests tab without payloads (PR #149); clear stale rate_limited_until when usage API shows capacity restored (PR #150); prevent manually-paused accounts from being selected as auto-fallback candidates (PR #151); balance new sessions by utilization within same-priority accounts using water-filling algorithm (PR #152); fix ModuleNotFound crash in compiled binary when using Compact Database by embedding vacuum-worker at build time (PR #155); expected usage position indicator on rate limit bars showing projected pacing vs. reset window (PR #156); show explicit rate-limited state when usage data unavailable on startup (PR #161); deduplicate concurrent fetchAndCache calls per account to prevent redundant Anthropic requests (PR #159); make GET /api/accounts cache-only and await usage fetch in refresh endpoint, eliminating blind 5s timeout (PR #162); mark account rate-limited when all models exhausted to prevent stale-state retry loops (PR #163); use retry-after header for dynamic model-exhaustion cooldown instead of hardcoded 1 hour (PR #164); remove implicit sonnet catch-all in getModelList preventing silent unexpected model remaps (PR #165); reduce log noise by aggregating auto-unpause skip messages and suppressing identity model mapping logs (PR #167); reasoning effort compatibility layer for OpenAI/Codex routes with deterministic downgrade mapping and count_tokens path support (PR #172, implemented in PR #188); unified Codex model mapping path (PR #203)
- @wonkooklee - Fixed ZlibError on startup caused by Bun auto-decompressing upstream 404 responses while leaving stale
content-encoding: gzipheaders intact — appliedwithSanitizedProxyHeadersto the model-not-found early-return path, matching every other forward path (PR #243) - @Cotch22 - Fixed garbled non-ASCII text in request details modal by decoding base64 bodies as UTF-8 via
TextDecoderinstead of rawatob()(PR #246) - @zenprocess - LeastUsed load-balancing strategy selecting accounts by lowest utilization to prevent burst pool-exhaustion (PR #193); fix keepalive 429s incorrectly cooling accounts by skipping cooldown application on synthetic keepalive replays, and shorten the no-reset fallback cooldown from 1 hour to 60 seconds (overridable via env) to prevent mass account lockout on burst contention (PR #196)
- @issmirnov - Persisted analytics view controls (view mode, time range, metric, model breakdown, filters) in the URL query string and localStorage so selections survive reloads and are shareable via link, with defensive decoding and cumulative-view coupling enforced centrally (PR #252)
- @CorentinLumineau - Plugin agent discovery scanning Claude Code plugin directories for agent definitions alongside the built-in agents path, with manifest parsing, name validation, and dashboard display (PR #197)
- @d4rken - fixed dashboard "Active" badge showing the most-recently-used account instead of the one the load balancer would actually pick next — introduced a side-effect-free
peek()toLoadBalancingStrategy(sharedpeek-availability.tsmirrors auto-unpause logic without DB writes), dashboard accounts endpoint stampsisPrimary: trueon the predicted pick, badge renamed from "Active" to "Primary"; also inlined the account-row controls previously hidden behind aMoreHorizontaldropdown into flat action buttons (PR #218); Dashboard account row reorganization into wrap-friendly multi-row layout with overflow kebab menu, restored tooltip context for moved actions, and deduped re-auth condition (PR #204); surfaced API key mutation errors inline in create/delete/toggle dialogs, prevented first-key lockout by disabling admin checkbox during loading with fail-safe isFirstKey guard, and added dismissible error banner for role/toggle failures (PR #205); fixed OAuth account creation silently discarding the priority value by adding a priority column to oauth_sessions and threading it through init → session → callback → account creation, plus replaced the constraint-droppingCREATE TABLE … AS SELECTtier-drop rebuild with an explicit schema-preserving migration (PR #206); surfaced update-check failure reason in dashboard tile by including the underlying error message in the backend 500 response body, switching to structured Logger, and displaying the error message under "Check Failed" in the frontend sidebar tile (PR #207); exempted /api/version/check from authentication so the dashboard update tile works correctly when API keys are configured (PR #208); replaced bare error-code list with grouped error UI featuring per-(errorCode, account) dismissal, time-window selector (1h/24h/7d/all) persisted in localStorage, clickable rows opening a details modal with human-readable descriptions and recovery info, and a CTE+window-function backend query joining account rate-limit state (PR #209); added combined-quota "5h Pool" and "7d Pool" metric tiles to the Overview dashboard aggregating per-account utilization across all eligible accounts, with worst-account subline, popover breakdown by contributing/exhausted/excluded/fallback accounts, at-risk projection, and next-reset time (PR #210); fixed SIGTERM race where interval-manager's synchronous process.exit(0) short-circuited async shutdown, dropping in-flight requests and pending DB writes — removed the early exit, added HTTP drain via serverInstance.stop(), reordered shutdown sequence (schedulers → HTTP drain → usage worker → DB flush), and added a 30s watchdog to bound total shutdown time (PR #214); fixed API key copy button silently failing on plain-HTTP hosts by adding acopyTexthelper that prefers the async Clipboard API and falls back to an off-screen textarea +execCommandfor non-secure contexts, and routedCopyButtonand the inline copy buttons inApiKeysTabthrough it so the checkmark animation works in both cases (PR #215); fixed post-processor worker timing out on large databases (multi-GB) by passingfastMode=trueto skip the redundant PRAGMA integrity_check — the main thread already ran it, and re-running it in the worker blew past the 10s startup deadline causing silent request loss (PR #220); replaced the bare recent-errors string list with a grouped error card featuring per-(errorCode, account) dismissal, time-window selector (1h/24h/7d/all), clickable rows opening a details modal with human-readable descriptions, recovery info, and provider-aware copy for model_fallback_429 that escalates to error only when no other accounts are available (PR #212); redesigned request history row layout into a two-row card that never wraps the top-line summary, switched the list view to the summary-only endpoint to eliminate the ~12MB per-page-load body fetch, and fixed the API key and account filter dropdowns to source options from dedicated endpoints so deleted-but-historical entries remain selectable (PR #221); replaced fs.copyFileSync migration backup with VACUUM INTO + atomic rename via a .partial temp file, producing a defragmented consistent snapshot and ensuring the named backup only appears complete (PR #222); added bounded backup retention — after each migration backup, prunes.backup.<ts>files to keep the newest N (default 3, configurable viaBETTER_CCFLARE_MIGRATION_BACKUP_KEEP, 0 to disable), sorting by name-embedded timestamp to survive rsync/clock-drift, skipping operator-renamed files with non-integer suffixes (PR #223); bumped the post-processor worker startup timeout from 10s to 60s and made it overridable viaCF_WORKER_STARTUP_TIMEOUT_MS, preventing silent analytics loss on large (multi-GB) databases where the per-handle PRAGMA work blew past the old deadline (PR #224); narrowed the migration backup gate from a 41-condition OR to only the three genuinely irreversible operations (refresh_token NOT NULL rebuild, account_tier drop, oauth_sessions.tier drop), eliminating multi-GB backup copies on every restart when only additive ALTER TABLE ADD COLUMN migrations are pending (PR #225); resolved all 42 error-level Biome diagnostics across dashboard components, hooks, and openai-formats tests — replacednoNonNullAssertedOptionalChainpatterns in tests with explicit non-null assertions aftertoBeDefined(), convertedforEachcallbacks tofor…ofto fixuseIterableCallbackReturn, added scopedbiome-ignorecomments with explanations fornoArrayIndexKeycases where stable ids aren't available, and fixed two a11y errors in ErrorBoundary (SVG role/aria-label/title, button type attribute) (PR #226); replaced the blocking startupPRAGMA integrity_check(which froze the event loop for 94s on a 7.6 GiB DB) with a dual-timer background scheduler runningquick_checkevery 6h and a fullintegrity_check+foreign_key_checkevery 24h in a dedicatedbun:sqliteworker, added a sticky-corrupt rule so a quickokcannot mask a full corruption finding, exposed the status via an expanded/api/storageresponse and a newPOST /api/storage/integrity/checkon-demand endpoint, and surfaced it in the Overview dashboard via aStorageIntegrityCardand stickyStorageIntegrityBanneron corruption — also addsforeign_key_checkto the full probe sinceintegrity_checkper SQLite docs does not verify foreign keys (PR #227); fixed Logger.{error,warn,info,debug} silently emitting{}for Error instances by normalizing Errors to plain objects with name/message/stack/cause before serialization — affects 159 call sites including the "Failed to intercept/modify request" line in agent-interceptor — and demoted all per-request INFO logs in agent-interceptor and extractSystemPrompt to DEBUG, plus skipped the spurious~/.claude/.claude/agentspath construction for global CLAUDE.md entries since global agents are already loaded unconditionally by AgentRegistry (PR #228); fixedLOG_LEVEL=DEBUGbeing silently ignored —LogLevel.DEBUG === 0is falsy sogetLogLevelFromEnv() || levelfell back to the constructor default (INFO), suppressing all debug output even when explicitly configured; switched to??so the env override applies for DEBUG too, and added regression tests covering DEBUG/WARN/ERROR/unset/unknown values plus thesilentConsoleside-effect (PR #229); eliminated the blocking full VACUUM trap — hourly retention cleanup and "Compact now" calledincrementalVacuumwhich silently fell back to a live full VACUUM on existing DBs (auto_vacuum≠INCREMENTAL), freezing the proxy for minutes every hour — replaced with:configureSqliteandensureSchemasetPRAGMA auto_vacuum = INCREMENTALon fresh DBs, newbootstrapAutoVacuum()migrates existing mode-0 DBs once at startup before HTTP binds (one-time VACUUM cost, never blocks live traffic),incrementalVacuum()now dispatchesPRAGMA incremental_vacuum(N)to a worker thread on mode-2 DBs or logs and returns otherwise (no destructive fallback), drops hourly N from 200000 to 8000 (~32 MiB) to keep writer-slot hold sub-100ms, adds consecutive-skip escalation after 3 missed ticks, and gates the auto_vacuum PRAGMA on mode=0 only to preserve operator-set FULL mode; also removed the dashboard "Compact now" button/endpoint (same hang trap from a different surface) while retainingbun run cli --compactwith a writer-lock probe that refuses if a running service holds it; and dropped the stalefastModeconstructor arg from DatabaseOperations after PR #227's integrity-scheduler rework made it redundant (PR #230); fixedmmapSize=0being silently ignored inconfigureSqlite— the> 0guard meant the defaultmmapSize: 0fell through to bun:sqlite's built-in mmap default (~15 GiB observed on a 15 GiB DB), causing the OOM-killing VACUUM scenario from #230 to recur; also applied memory-bounded PRAGMAs (mmap_size=0,cache_size=-2000,temp_store=FILE) to the full-VACUUM worker thread opened bycompact()/--compact, which had the same gap (PR #231); bounded AsyncDbWriter retention to stop a post-processor worker memory leak — split the single job queue into separate metadata (count-capped at 2000) and payload (count-capped at 1000, byte-capped at 100 MB) queues, added a MAX_JOBS_PER_TICK=50 / MAX_DRAIN_MS_PER_TICK=250ms dual budget so processQueue yields to the event loop instead of monopolizing it, added round-robin scheduling to prevent starvation between queue types, added canAcceptPayload() preflight to skip serialization under backpressure, and surfaced per-queue lengths, byte budget, oldest-job age, and per-kind drop counters in getHealth() (PR #234); enabled the refresh-usage button for Codex OAuth accounts — sends one minimal/responsesrequest (bounded tomax_output_tokens: 1,reasoning.effort: "minimal") to capturex-codex-*rate-limit headers, cancels the response body after the header snapshot, deduplicates concurrent clicks per-account with a shared in-flight promise, and wires the result into the existing usageCache + accounts.rate_limit_reset flow; polling intentionally excluded to avoid burning quota at 90s cadence (PR #219); added "Avg / day" and "Avg / week" burn-rate sub-rows to the Plan Value metric card on the Overview dashboard — computed server-side from fixed 7-day and 30-day windows (filter-independent) with a clamped divisor viaeffectiveBurnRateDaysso thin history doesn't inflate averages, plus a genericsubRowsprop onMetricCardfor the UI (PR #235); routed Anthropic 529 overloaded_error responses through the existing rate-limit failover machinery so configured fallback providers take over during an overload instead of forwarding the 529 directly to the client —parseRateLimittreats 529 as rate-limited, parsing Retry-After (delta-seconds and HTTP-date) and x-ratelimit-reset with aclampResetTimehelper that caps reset times to 24h to prevent hostile headers from permanently disabling an account; mid-streamSseRateLimitSnifferfires onoverloaded_errorframes for Anthropic-shape providers (anthropic/claude-oauth) only with a line-anchored regex to prevent false positives; two newRateLimitReasonvalues (upstream_529_overloaded_with_reset/upstream_529_overloaded_no_reset) surface accurate audit attribution in the dashboard;returnRateLimitedResponseOnExhaustionforwards the original 529 upstream response to the client when the full account pool is exhausted so callers see the real error instead of a generic 503 (PR #236); fixed auto-refresh scheduler probing manually-paused accounts in an endless loop — the eligibility SQL excluded paused accounts only whenauto_pause_on_overage_enabled=0, ignoringpause_reason, while the auto-resume guard insendDummyMessageonly un-pauses accounts wherepause_reason IN (NULL,'overage'); aligned the SQL eligibility query to a positive allowlist (paused=0 OR (overage_enabled=1 AND pause_reason IN (NULL,'overage'))) and tightenedisOveragePausedinaccount-selector.tsto require the same condition, so manual, failure-threshold, and peak_hours pauses are left completely alone; adds a regression test that re-executes the live scheduler SQL against a seeded in-memory SQLite DB covering all seven pause-reason scenarios (PR #237); replaced flat per-account 429 cooldown with adaptive exponential backoff —consecutive_rate_limitscounter increments atomically in SQL on each 429 and drivesmin(upstream_reset, BASE × 2^(n−1))cooldown (capped at MAX, default 30s base / 5min max), counter resets after 5min of clean operation or on successful token refresh, all four 429 paths (response-processor, model_fallback_429, all_models_exhausted_429, mid-stream SSE sniffer) route through a singleapplyRateLimitCooldownhelper, tunables overridable viaCCFLARE_RATE_LIMIT_BACKOFF_BASE_MS/CCFLARE_RATE_LIMIT_BACKOFF_MAX_MS/CCFLARE_RATE_LIMIT_RESET_STABILITY_MS(PR #213)
We welcome contributions! See CONTRIBUTING.md for guidelines.
This repository includes an automated Claude code review system:
- Automatic Review: Runs automatically when a new pull request is opened
- Manual Review: Can be manually triggered by contributors by commenting
/claude-reviewon the PR
MIT - See LICENSE for details
Built with ❤️ for developers who ship