
flux2-api

FastAPI backend for text-to-image generation powered by FLUX.2 [klein] 4B.

Features

  • 🎨 Text-to-image generation with FLUX.2 [klein] 4B
  • ⚡ Sub-2-second inference (4 steps, step-distilled)
  • 🌐 Built-in web demo at /
  • 📊 Request queue with async processing
  • 🔧 Configurable via environment variables
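
The request queue with async processing can be sketched with a plain `asyncio.Queue`: a single worker task drains the queue so only one generation runs on the GPU at a time, while each request awaits a future for its result. This is a minimal illustration of the pattern, not the repository's actual implementation; `fake_generate` is a hypothetical stand-in for the real FLUX.2 inference call.

```python
import asyncio


async def fake_generate(prompt: str) -> bytes:
    # Stand-in for the real (slow, GPU-bound) diffusion call.
    await asyncio.sleep(0)
    return f"PNG:{prompt}".encode()


async def worker(queue: asyncio.Queue) -> None:
    # A single worker serializes GPU access: one generation at a time.
    while True:
        prompt, fut = await queue.get()
        fut.set_result(await fake_generate(prompt))
        queue.task_done()


async def submit(queue: asyncio.Queue, prompt: str) -> bytes:
    # Each request enqueues its prompt plus a future, then waits on it.
    fut = asyncio.get_running_loop().create_future()
    await queue.put((prompt, fut))
    return await fut


async def main() -> list[bytes]:
    queue: asyncio.Queue = asyncio.Queue()
    worker_task = asyncio.create_task(worker(queue))
    # Two concurrent "requests" are processed strictly one after another.
    results = await asyncio.gather(submit(queue, "cat"), submit(queue, "dog"))
    worker_task.cancel()
    return results


print(asyncio.run(main()))
```

The same shape maps onto FastAPI by creating the queue and worker in a startup hook and calling `submit` from the `/generate` handler.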

Quick Start

git clone https://github.com/jamarju/flux2-api-v3.git
cd flux2-api-v3
uv run flux2-api

Open http://localhost:8000 for the web demo.

API

POST /generate

Generate an image from a text prompt.

Request:

{
  "prompt": "A cat holding a sign that says hello world",
  "seed": 42,
  "width": 1024,
  "height": 1024,
  "num_inference_steps": 4
}

All parameters except prompt are optional.
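
Default-filling for the optional parameters can be sketched as below. The width/height/steps defaults mirror the Configuration table; drawing a random seed when none is supplied is an assumption, inferred from the server echoing the seed it used in the `X-Seed` response header.

```python
import random

# Defaults taken from the Configuration table below.
DEFAULTS = {"width": 1024, "height": 1024, "num_inference_steps": 4}


def resolve_params(body: dict) -> dict:
    # prompt is the only required key; everything else falls back to a default.
    if "prompt" not in body:
        raise ValueError("prompt is required")
    params = {**DEFAULTS, **body}
    # Assumption: a missing seed is drawn at random, then reported via X-Seed.
    params.setdefault("seed", random.randrange(2**32))
    return params


print(resolve_params({"prompt": "A cat holding a sign that says hello world"}))
```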

Response: PNG binary image

Response Headers:

| Header | Description |
| --- | --- |
| X-Seed | Seed used for generation |
| X-Generation-Time-Ms | Inference time in milliseconds |
| X-Queue-Wait-Ms | Queue wait time in milliseconds |

Example with curl:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over the ocean"}' \
  -o output.png
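
The same request can be built from Python using only the standard library. This is a client-side sketch of the endpoint described above; sending the request (commented out) would return the PNG bytes, with the seed available via the `X-Seed` response header.

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str, **params) -> urllib.request.Request:
    # Assemble a POST /generate request matching the JSON schema above.
    payload = {"prompt": prompt, **params}
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("http://localhost:8000", "A sunset over the ocean", seed=42)
# With a server running:
#   with urllib.request.urlopen(req) as resp:
#       png = resp.read()                      # PNG binary image
#       seed = resp.headers["X-Seed"]          # seed actually used
```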

GET /health

Health check endpoint.

Response:

{"status": "ok", "model_loaded": true}

Configuration

Environment variables (prefix FLUX2_):

| Variable | Default | Description |
| --- | --- | --- |
| FLUX2_HOST | 0.0.0.0 | Server bind address |
| FLUX2_PORT | 8000 | Server port |
| FLUX2_DEVICE | cuda | PyTorch device |
| FLUX2_MODEL_ID | black-forest-labs/FLUX.2-klein-4B | HuggingFace model ID |
| FLUX2_DEFAULT_STEPS | 4 | Default inference steps |
| FLUX2_DEFAULT_WIDTH | 1024 | Default image width |
| FLUX2_DEFAULT_HEIGHT | 1024 | Default image height |
| FLUX2_GUIDANCE_SCALE | 1.0 | Guidance scale |
| FLUX2_DTYPE | bfloat16 | Model dtype (bfloat16, float16, float32) |
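
The prefix convention amounts to: for each setting, read `FLUX2_<NAME>` from the environment, falling back to the table's default. A minimal sketch of that lookup (the project may well use a settings library such as pydantic-settings instead; this only illustrates the convention):

```python
import os

# Defaults from the table above (subset shown).
DEFAULTS = {
    "HOST": "0.0.0.0",
    "PORT": "8000",
    "DEVICE": "cuda",
    "MODEL_ID": "black-forest-labs/FLUX.2-klein-4B",
    "DTYPE": "bfloat16",
}


def load_settings(env=os.environ) -> dict:
    # Each FLUX2_<NAME> variable, when set, overrides the built-in default.
    return {name: env.get(f"FLUX2_{name}", default) for name, default in DEFAULTS.items()}


print(load_settings({"FLUX2_PORT": "9000"}))
```

For example, `FLUX2_PORT=9000 uv run flux2-api` would bind the server to port 9000 while leaving every other setting at its default.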

Requirements

  • Python ≥ 3.12
  • NVIDIA GPU with ≥ 13 GB VRAM (RTX 3090/4070+)
  • uv package manager
  • CUDA toolkit

Public Access

See docs/cloudflare-tunnel.md for instructions on exposing the service to the internet via Cloudflare Tunnel.

Tests

# Unit tests (requires GPU)
uv run pytest

# Smoke test (starts a real server)
uv run pytest -m smoke

Benchmarks

The benchmark suite measures throughput, latency, and queue wait.

For the full report, see benchmarks/results/REPORT.md.
