Generate anything from your terminal.

A tiny CLI for generating text, images, and video with dead-simple commands. Pipe content in and out. Compare models side by side. See results inline.

$ npm install -g ai-cli
$ ai image "a sunset" -m "openai/gpt-image-2,bfl/flux-2-pro"
 
Saved to /Users/you/output-1.png (3.2s)
Saved to /Users/you/output-2.png (4.1s)

Multi-model comparison.

Run the same prompt across multiple models in parallel. Compare outputs side by side to find the best result. Combine with -n to generate multiple outputs per model.

  • comma-separated model IDs for parallel generation
  • configurable concurrency limits
  • per-job timing and structured JSON output
$ ai image "a sunset" -m "gpt-image-2,flux-2-pro"
 
Saved to /Users/you/output-1.png (3.2s)
Saved to /Users/you/output-2.png (4.7s)
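
As a rough illustration of the fan-out idea behind comma-separated model IDs, the sketch below splits a model list and runs one background job per model. It is not ai-cli's implementation: fake_generate and fan_out are hypothetical names, and the stand-in job only prints a line instead of calling a model.

```shell
#!/bin/sh
# Hypothetical stand-in for a real generation call; not part of ai-cli.
fake_generate() {
  echo "done: $1"
}

# Split a comma-separated model list and start one background job per model.
fan_out() {
  models="$1"
  old_ifs="$IFS"
  IFS=','
  set -- $models        # word-split on commas into positional parameters
  IFS="$old_ifs"
  for model in "$@"; do
    fake_generate "$model" &
  done
  wait                  # block until every background job has finished
}

fan_out "gpt-image-2,flux-2-pro"
```

Because the jobs run concurrently, the order of the printed lines is not guaranteed.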

Pipe everything.

Pipe text in as context, pipe images into video generation, and chain commands together. When output is piped, raw results go to stdout; in an interactive terminal, results are saved to files.

  • text stdin becomes prompt context
  • binary stdin for image-to-image and image-to-video
  • chain: ai image | ai video
$ git diff | ai text "explain these changes"
 
These changes refactor the auth module:
 
1. Splits session logic into its own file
2. Adds token expiry validation
3. Removes deprecated OAuth1 flow
 
$ ai image "a dragon" | ai video "animate this"
Saved to /Users/you/output.mp4
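
The piped-versus-interactive convention described above can be mirrored in your own scripts with the standard `[ -t FD ]` terminal test. This is a minimal sketch under that assumption, not a description of how ai-cli detects pipes internally; read_context and emit_result are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: print piped stdin as context, or nothing when
# stdin is an interactive terminal.
read_context() {
  if [ -t 0 ]; then
    :                   # no piped input; nothing to read
  else
    cat
  fi
}

# Hypothetical helper: $1 is a file holding generated bytes,
# $2 is the path to save to when running interactively.
emit_result() {
  src="$1"
  dest="$2"
  if [ -t 1 ]; then
    # stdout is a terminal: save a file and report where it went
    cp "$src" "$dest"
    echo "Saved to $dest"
  else
    # stdout is a pipe or redirect: stream the raw bytes
    cat "$src"
  fi
}
```

With helpers like these, the same script can serve both a person at a terminal and the next command in a pipeline.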

Hundreds of models, one key.

Access text, image, and video models from OpenAI, Anthropic, Google, Black Forest Labs, ByteDance, and more through Vercel AI Gateway.

  • short names resolve automatically: flux-2-pro, gpt-5.5
  • live model listing from the gateway
  • per-type defaults configurable via env vars
$ ai models --type image
 
openai
  gpt-image-2
  gpt-image-1
bfl
  flux-2-pro
  flux-kontext-pro
google
  imagen-4.0-generate-001
...and more
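
One way short names could resolve to full provider/model IDs is an exact match on the part after the slash. The sketch below assumes that rule and a small hardcoded list built from the models shown above; resolve_model and MODELS are hypothetical, and the gateway's actual resolution logic may differ.

```shell
#!/bin/sh
# IDs assembled from the listing above; a real list would come from `ai models`.
MODELS="openai/gpt-image-2
bfl/flux-2-pro
google/imagen-4.0-generate-001"

# Hypothetical resolver: full provider/model IDs pass through unchanged;
# short names match the model part of an ID exactly.
# Prints nothing when no ID matches.
resolve_model() {
  name="$1"
  case "$name" in
    */*) echo "$name"; return 0 ;;
  esac
  printf '%s\n' "$MODELS" | while IFS= read -r id; do
    if [ "${id#*/}" = "$name" ]; then
      echo "$id"
      break
    fi
  done
}

resolve_model flux-2-pro
```

A real resolver would also need a policy for short names offered by more than one provider; this sketch simply returns the first match.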

Built for composability.

Not a chatbot. A generation tool that fits into any workflow — scripts, CI pipelines, agent toolchains, or just your terminal.

$ ai text "hello"

Inline preview

Generated images and video frames display directly in your terminal using the Kitty graphics protocol. Supports Kitty, Ghostty, WezTerm, Warp, and iTerm2.

Agent-native output

Predictable behavior for scripts and agents. Raw stdout when piped, file saves when interactive. JSON metadata mode for CI pipelines.

Live model discovery

Models are fetched directly from the AI Gateway — no hardcoded lists to maintain. Use short names or full provider/model IDs.

Zero config

No config files, no init command, no setup wizard. Set an API key environment variable and start generating. Defaults work out of the box.
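
Since the only setup is an environment variable, scripts and CI jobs may want to fail fast when it is missing. The sketch below is one way to do that; require_env is a hypothetical helper, and the variable name in the usage note is an assumption, so check the tool's documentation for the name it actually reads.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the named variable is
# set and non-empty; otherwise print an error to stderr and return 1.
require_env() {
  var_name="$1"
  eval "value=\${$var_name:-}"
  if [ -z "$value" ]; then
    echo "error: $var_name is not set" >&2
    return 1
  fi
}
```

For example, a CI step could run `require_env AI_GATEWAY_API_KEY || exit 1` before any generation command, where AI_GATEWAY_API_KEY is assumed here rather than confirmed by this page.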