Model Library

Browse canonical models across providers with performance and coverage highlights.

Visible models

304

Active models

304

Providers covered

605

Model variants

32198

Claude Fable 5

Anthropic Claude Fable 5 is a Mythos-class frontier model for the most demanding reasoning and long-horizon agentic work, with a 1M-token context window, adaptive thinking, and strong vision capabilities. It is Anthropic's most capable widely released model.

Input price

From $0.100/M

Avg speed

46 t/s

First token

3.69s

Providers

Claude Opus 4.8

Anthropic Claude Opus 4.8 is Anthropic's frontier model for complex reasoning, long-horizon agentic coding, and professional knowledge work, with a 1M-token context window and adaptive thinking support.

Input price (+1 free)

From $0.030/M

Avg speed

46 t/s

First token

2.54s

Providers

121

Qwen-72B

Qwen-72B is a 72-billion-parameter language model in Alibaba Cloud's Qwen (Tongyi Qianwen) family, designed for multilingual instruction following, reasoning, and general-purpose text generation.

Input price

From $2.64/M

Avg speed

—

First token

—

Providers

Qwen3 Instruct

Alibaba Qwen3 Instruct is an instruction-tuned variant in the Qwen series, optimized for following instructions and conversational tasks.

Input price

From $0.010/M

Avg speed

—

First token

—

Providers

DeepSeek V4 Pro

DeepSeek V4 Pro is the professional-tier DeepSeek V4 model, targeting frontier reasoning, coding, and agent workflows with maximum capability.

Input price (+9 free)

From $0.0007/M

Avg speed

42 t/s

First token

8.83s

Providers

200

DeepSeek V4 Flash

DeepSeek V4 Flash is a fast, cost-efficient language model in the DeepSeek V4 family, optimized for low-latency chat, coding assistance, and high-throughput API workloads while retaining strong reasoning quality.

Input price (+9 free)

From $0.0007/M

Avg speed

70 t/s

First token

5.81s

Providers

208

MiMo-V2.5-TTS-VoiceDesign

Xiaomi MiMo-V2.5-TTS-VoiceDesign is the voice-design variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling custom voice creation through stylistic prompts. Pricing: free during the limited-time launch period (0x token consumption).

Input price (+2 free)

From $0.0071/M

Avg speed

—

First token

—

Providers

MiMo-V2.5-TTS-VoiceClone

Xiaomi MiMo-V2.5-TTS-VoiceClone is the voice-cloning variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling speech synthesis with cloned target voices. Pricing: free during the limited-time launch period (0x token consumption).

Input price (+2 free)

From $0.0071/M

Avg speed

—

First token

—

Providers

MiMo-V2.5-TTS

Xiaomi MiMo-V2.5-TTS is the text-to-speech model in the V2.5 series on the Xiaomi MiMo API platform, providing high-quality speech synthesis. Pricing: free during the limited-time launch period (0x token consumption).

Input price (+2 free)

From $0.0071/M

Avg speed

—

First token

—

Providers

MiMo-V2.5

Xiaomi MiMo-V2.5 is a native omnimodal sparse MoE model (310B total, 15B active) with unified text, image, video, and audio understanding, built on the MiMo-V2-Flash backbone with dedicated vision and audio encoders. It supports up to 1M tokens of context, strong agentic workflows, and open weights on Hugging Face.

Input price (+1 free)

From $0.0040/M

Avg speed

82 t/s

First token

3.60s

Providers

101

Qwen3 Instant

Alibaba Qwen3 Instant is a fast and efficient language model in the Qwen series, optimized for quick responses and high throughput.

Input price

From $0.010/M

Avg speed

—

First token

—

Providers

MiMo-V2.5-Pro

Xiaomi MiMo-V2.5-Pro is a large open-source language model in the MiMo series, offering advanced reasoning and general-purpose capabilities.

Input price

From $0.0000/M

Avg speed

47 t/s

First token

5.50s

Providers

105

MiniMax M2 Her

MiniMax M2 Her is a character-focused dialogue model in the MiniMax M2 family, tuned for roleplay, persona consistency, and emotionally expressive conversational responses.

Input price

From $1.03/M

Avg speed

—

First token

—

Providers

Kimi K2.6

Moonshot Kimi K2.6 is an open-source native multimodal agent model with 1 trillion MoE parameters, a 256K context window, and state-of-the-art coding, vision, and long-horizon agent swarm capabilities.

Input price (+2 free)

From $0.0021/M

Avg speed

57 t/s

First token

14.85s

Providers

MiMo-V2-TTS

Xiaomi MiMo-V2-TTS is a text-to-speech model in the MiMo series, optimized for natural speech synthesis and voice generation tasks.

Input price (+5 free)

From $0.0071/M

Avg speed

—

First token

—

Providers

Claude Opus 4.7 Max

Anthropic Claude Opus 4.7 Max is a high-capability language model in the Claude series, offering enhanced reasoning, code generation, and multimodal capabilities.

Input price

From $0.0014/M

Avg speed

—

First token

—

Providers

Claude Opus 4.7

Anthropic Claude Opus 4.7 targets frontier-level analysis, complex coding, and autonomous workflows that require deep multi-step reasoning.

Input price (+2 free)

From $0.0014/M

Avg speed

42 t/s

First token

3.95s

Providers

197

DeepSeek V4

DeepSeek V4 is DeepSeek's next-generation foundation model family, built for advanced reasoning, coding, math, and long-context agent applications.

Input price

From $0.050/M

Avg speed

47 t/s

First token

2.49s

Providers

Qwen1.8B Long Context

Alibaba Qwen1.8B Long Context is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.

Input price

From $0.050/M

Avg speed

—

First token

—

Providers

Qwen1.8B

Alibaba Qwen1.8B is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.

Input price

From $0.050/M

Avg speed

—

First token

—

Providers

Aion RP Llama 3.1

Aion RP Llama 3.1 is a roleplay-tuned variant in the Aion series, optimized for character-driven dialogue and creative writing.

Input price

From $0.401/M

Avg speed

—

First token

—

Providers

Hermes 3 Llama 3.1

Nous Research Hermes 3 is a generalist instruct model fine-tuned on Meta Llama 3.1, with strong reasoning, roleplay, multi-turn chat, tool calling, and structured JSON output.

Input price

From $0.0050/M

Avg speed

—

First token

—

Providers

LFM 2.5 1.2B Instruct

LFM 2.5 1.2B Instruct is a compact language model in the LFM series, optimized for low-latency responses and efficient inference.

Input price (+1 free)

From $0.0010/M

Avg speed

—

First token

—

Providers

Llama 3.1

Meta Llama 3.1 extends the Llama 3 family with stronger reasoning, tool use, and long-context support across 8B to 405B scales.

Input price

From $0.0007/M

Avg speed

—

First token

—

Providers
