Browse canonical models across providers with performance and coverage highlights.
Anthropic Claude Fable 5 is a Mythos-class frontier model for the most demanding reasoning and long-horizon agentic work, with a 1M-token context window, adaptive thinking, and strong vision capabilities. It is Anthropic's most capable widely released model.
Input price
From $0.100/M
Avg speed
46 t/s
First token
3.69s
Providers
85
Anthropic Claude Opus 4.8 is Anthropic's frontier model for complex reasoning, long-horizon agentic coding, and professional knowledge work, with a 1M-token context window and adaptive thinking support.
Input price (+1 free)
From $0.030/M
Avg speed
46 t/s
First token
2.54s
Providers
121
Qwen-72B is a 72-billion-parameter language model in Alibaba Cloud's Qwen (Tongyi Qianwen) family, designed for multilingual instruction following, reasoning, and general-purpose text generation.
Input price
From $2.64/M
Avg speed
—
First token
—
Providers
15
Alibaba Qwen3 Instruct is an instruction-tuned variant in the Qwen series, optimized for following instructions and conversational tasks.
Input price
From $0.010/M
Avg speed
—
First token
—
Providers
7
DeepSeek V4 Pro is the professional-tier DeepSeek V4 model, targeting frontier reasoning, coding, and agent workflows with maximum capability.
Input price (+9 free)
From $0.0007/M
Avg speed
42 t/s
First token
8.83s
Providers
200
DeepSeek V4 Flash is a fast, cost-efficient language model in the DeepSeek V4 family, optimized for low-latency chat, coding assistance, and high-throughput API workloads while retaining strong reasoning quality.
Input price (+9 free)
From $0.0007/M
Avg speed
70 t/s
First token
5.81s
Providers
208
Xiaomi MiMo-V2.5-TTS-VoiceDesign is the voice-design variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling custom voice creation through stylistic prompts. Pricing: free during the limited-time launch period (0x token consumption).
Input price (+2 free)
From $0.0071/M
Avg speed
—
First token
—
Providers
41
Xiaomi MiMo-V2.5-TTS-VoiceClone is the voice-cloning variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling speech synthesis with cloned target voices. Pricing: free during the limited-time launch period (0x token consumption).
Input price (+2 free)
From $0.0071/M
Avg speed
—
First token
—
Providers
44
Xiaomi MiMo-V2.5-TTS is the text-to-speech model in the V2.5 series on the Xiaomi MiMo API platform, providing high-quality speech synthesis. Pricing: free during the limited-time launch period (0x token consumption).
Input price (+2 free)
From $0.0071/M
Avg speed
—
First token
—
Providers
44
Xiaomi MiMo-V2.5 is a native omnimodal sparse MoE model (310B total, 15B active) with unified text, image, video, and audio understanding, built on the MiMo-V2-Flash backbone with dedicated vision and audio encoders. It supports up to 1M tokens of context, strong agentic workflows, and open weights on Hugging Face.
Input price (+1 free)
From $0.0040/M
Avg speed
82 t/s
First token
3.60s
Providers
101
Alibaba Qwen3 Instant is a fast and efficient language model in the Qwen series, optimized for quick responses and high throughput.
Input price
From $0.010/M
Avg speed
—
First token
—
Providers
3
Xiaomi MiMo-V2.5-Pro is a large open-source language model in the MiMo series, offering advanced reasoning and general-purpose capabilities.
Input price
From $0.0000/M
Avg speed
47 t/s
First token
5.50s
Providers
105
MiniMax M2 Her is a character-focused dialogue model in the MiniMax M2 family, tuned for roleplay, persona consistency, and emotionally expressive conversational responses.
Input price
From $1.03/M
Avg speed
—
First token
—
Providers
4
Moonshot Kimi K2.6 is an open-source, natively multimodal agent model with a 1-trillion-parameter MoE architecture, a 256K context window, and state-of-the-art coding, vision, and long-horizon agent-swarm capabilities.
Input price (+2 free)
From $0.0021/M
Avg speed
57 t/s
First token
14.85s
Providers
96
Xiaomi MiMo-V2-TTS is a text-to-speech model in the MiMo series, optimized for natural speech synthesis and voice generation tasks.
Input price (+5 free)
From $0.0071/M
Avg speed
—
First token
—
Providers
41
Anthropic Claude Opus 4.7 Max is a high-capability language model in the Claude series, offering enhanced reasoning, code generation, and multimodal capabilities.
Input price
From $0.0014/M
Avg speed
—
First token
—
Providers
26
Anthropic Claude Opus 4.7 targets frontier-level analysis, complex coding, and autonomous workflows that require deep multi-step reasoning.
Input price (+2 free)
From $0.0014/M
Avg speed
42 t/s
First token
3.95s
Providers
197
DeepSeek V4 is DeepSeek's next-generation foundation model family, built for advanced reasoning, coding, math, and long-context agent applications.
Input price
From $0.050/M
Avg speed
47 t/s
First token
2.49s
Providers
12
Alibaba Qwen1.8B Long Context is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.
Input price
From $0.050/M
Avg speed
—
First token
—
Providers
16
Alibaba Qwen1.8B is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.
Input price
From $0.050/M
Avg speed
—
First token
—
Providers
16
Aion RP Llama 3.1 is a roleplay-tuned variant in the Aion series, optimized for character-driven dialogue and creative writing.
Input price
From $0.401/M
Avg speed
—
First token
—
Providers
6
Nous Research Hermes 3 is a generalist instruct model fine-tuned on Meta Llama 3.1, with strong reasoning, roleplay, multi-turn chat, tool calling, and structured JSON output.
Input price
From $0.0050/M
Avg speed
—
First token
—
Providers
24
LFM 2.5 1.2B Instruct is a compact language model in the LFM series, optimized for low-latency responses and efficient inference.
Input price (+1 free)
From $0.0010/M
Avg speed
—
First token
—
Providers
21
Meta Llama 3.1 extends the Llama 3 family with stronger reasoning, tool use, and long-context support across 8B to 405B scales.
Input price
From $0.0007/M
Avg speed
—
First token
—
Providers
28