LMSpeed
The best API speed test tool

© 2026 LMSpeed. All Rights Reserved. Made by Nexmoe with ❤️

Model Library

Browse canonical models across providers with performance and coverage highlights.

  • Visible models: 304
  • Active models: 304
  • Providers covered: 605
  • Model variants: 32,198
Showing 1-24 of 304 models

Claude Fable 5

Anthropic Claude Fable 5 is a Mythos-class frontier model for the most demanding reasoning and long-horizon agentic work, with a 1M-token context window, adaptive thinking, and strong vision capabilities. It is Anthropic's most capable widely released model.

  • Input price: from $0.100/M
  • Avg speed: 46 t/s
  • First token: 3.69s
  • Providers: 85

Claude Opus 4.8

Anthropic Claude Opus 4.8 is Anthropic's frontier model for complex reasoning, long-horizon agentic coding, and professional knowledge work, with a 1M-token context window and adaptive thinking support.

  • Input price: from $0.030/M (+1 free)
  • Avg speed: 46 t/s
  • First token: 2.54s
  • Providers: 121

Qwen-72B

Qwen-72B is a 72-billion-parameter language model in Alibaba Cloud's Qwen (Tongyi Qianwen) family, designed for multilingual instruction following, reasoning, and general-purpose text generation.

  • Input price: from $2.64/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 15

Qwen3 Instruct

Alibaba Qwen3 Instruct is an instruction-tuned variant in the Qwen series, optimized for following instructions and conversational tasks.

  • Input price: from $0.010/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 7

DeepSeek V4 Pro

DeepSeek V4 Pro is the professional-tier DeepSeek V4 model, targeting frontier reasoning, coding, and agent workflows with maximum capability.

  • Input price: from $0.0007/M (+9 free)
  • Avg speed: 42 t/s
  • First token: 8.83s
  • Providers: 200

DeepSeek V4 Flash

DeepSeek V4 Flash is a fast, cost-efficient language model in the DeepSeek V4 family, optimized for low-latency chat, coding assistance, and high-throughput API workloads while retaining strong reasoning quality.

  • Input price: from $0.0007/M (+9 free)
  • Avg speed: 70 t/s
  • First token: 5.81s
  • Providers: 208

MiMo-V2.5-TTS-VoiceDesign

Xiaomi MiMo-V2.5-TTS-VoiceDesign is the voice-design variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling custom voice creation through stylistic prompts. Pricing: free during the limited-time launch period (0x token consumption).

  • Input price: from $0.0071/M (+2 free)
  • Avg speed: n/a
  • First token: n/a
  • Providers: 41

MiMo-V2.5-TTS-VoiceClone

Xiaomi MiMo-V2.5-TTS-VoiceClone is the voice-cloning variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling speech synthesis with cloned target voices. Pricing: free during the limited-time launch period (0x token consumption).

  • Input price: from $0.0071/M (+2 free)
  • Avg speed: n/a
  • First token: n/a
  • Providers: 44

MiMo-V2.5-TTS

Xiaomi MiMo-V2.5-TTS is the text-to-speech model in the V2.5 series on the Xiaomi MiMo API platform, providing high-quality speech synthesis. Pricing: free during the limited-time launch period (0x token consumption).

  • Input price: from $0.0071/M (+2 free)
  • Avg speed: n/a
  • First token: n/a
  • Providers: 44

MiMo-V2.5

Xiaomi MiMo-V2.5 is a native omnimodal sparse MoE model (310B total, 15B active) with unified text, image, video, and audio understanding, built on the MiMo-V2-Flash backbone with dedicated vision and audio encoders. It supports up to 1M tokens of context, strong agentic workflows, and open weights on Hugging Face.

  • Input price: from $0.0040/M (+1 free)
  • Avg speed: 82 t/s
  • First token: 3.60s
  • Providers: 101

Qwen3 Instant

Alibaba Qwen3 Instant is a fast and efficient language model in the Qwen series, optimized for quick responses and high throughput.

  • Input price: from $0.010/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 3

MiMo-V2.5-Pro

Xiaomi MiMo-V2.5-Pro is a large open-source language model in the MiMo series, offering advanced reasoning and general-purpose capabilities.

  • Input price: from $0.0000/M
  • Avg speed: 47 t/s
  • First token: 5.50s
  • Providers: 105

MiniMax M2 Her

MiniMax M2 Her is a character-focused dialogue model in the MiniMax M2 family, tuned for roleplay, persona consistency, and emotionally expressive conversational responses.

  • Input price: from $1.03/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 4

Kimi K2.6

Moonshot Kimi K2.6 is an open-source native multimodal agent model with 1 trillion MoE parameters, a 256K context window, and state-of-the-art coding, vision, and long-horizon agent swarm capabilities.

  • Input price: from $0.0021/M (+2 free)
  • Avg speed: 57 t/s
  • First token: 14.85s
  • Providers: 96

MiMo-V2-TTS

Xiaomi MiMo-V2-TTS is a text-to-speech model in the MiMo series, optimized for natural speech synthesis and voice generation tasks.

  • Input price: from $0.0071/M (+5 free)
  • Avg speed: n/a
  • First token: n/a
  • Providers: 41

Claude Opus 4.7 Max

Anthropic Claude Opus 4.7 Max is a high-capability language model in the Claude series, offering enhanced reasoning, code generation, and multimodal capabilities.

  • Input price: from $0.0014/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 26

Claude Opus 4.7

Anthropic Claude Opus 4.7 targets frontier-level analysis, complex coding, and autonomous workflows that require deep multi-step reasoning.

  • Input price: from $0.0014/M (+2 free)
  • Avg speed: 42 t/s
  • First token: 3.95s
  • Providers: 197

DeepSeek V4

DeepSeek V4 is DeepSeek's next-generation foundation model family, built for advanced reasoning, coding, math, and long-context agent applications.

  • Input price: from $0.050/M
  • Avg speed: 47 t/s
  • First token: 2.49s
  • Providers: 12

Qwen1.8B Long Context

Alibaba Qwen1.8B Long Context is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.

  • Input price: from $0.050/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 16

Qwen1.8B

Alibaba Qwen1.8B is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.

  • Input price: from $0.050/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 16

Aion RP Llama 3.1

Aion RP Llama 3.1 is a roleplay-tuned variant in the Aion series, optimized for character-driven dialogue and creative writing.

  • Input price: from $0.401/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 6

Hermes 3 Llama 3.1

Nous Research Hermes 3 is a generalist instruct model fine-tuned on Meta Llama 3.1, with strong reasoning, roleplay, multi-turn chat, tool calling, and structured JSON output.

  • Input price: from $0.0050/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 24

LFM 2.5 1.2B Instruct

LFM 2.5 1.2B Instruct is a compact language model in the LFM series, optimized for low-latency responses and efficient inference.

  • Input price: from $0.0010/M (+1 free)
  • Avg speed: n/a
  • First token: n/a
  • Providers: 21

Llama 3.1

Meta Llama 3.1 extends the Llama 3 family with stronger reasoning, tool use, and long-context support across 8B to 405B scales.

  • Input price: from $0.0007/M
  • Avg speed: n/a
  • First token: n/a
  • Providers: 28
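
The per-model figures listed above can be combined into rough estimates. The sketch below is illustrative only and is not part of LMSpeed. It assumes that "Avg speed" means output tokens per second, that "First token" means time to first token in seconds, and that "/M" means US dollars per million input tokens. The function names are hypothetical.

```python
def estimate_response_seconds(first_token_s: float,
                              tokens_per_s: float,
                              output_tokens: int) -> float:
    """Rough end-to-end time: wait for the first token, then stream the rest.

    Assumes a constant streaming rate, which real providers do not guarantee.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_token_s + output_tokens / tokens_per_s


def estimate_input_cost_usd(price_per_million: float, input_tokens: int) -> float:
    """Input cost, assuming the listed price is USD per one million input tokens."""
    return price_per_million * input_tokens / 1_000_000


if __name__ == "__main__":
    # Figures taken from the Claude Fable 5 card: 3.69s first token, 46 t/s, $0.100/M.
    seconds = estimate_response_seconds(3.69, 46, output_tokens=500)
    cost = estimate_input_cost_usd(0.100, input_tokens=20_000)
    print(f"~{seconds:.1f}s for 500 output tokens, ~${cost:.4f} for 20,000 input tokens")
```

Models whose speed or first-token value is listed as unavailable cannot be estimated this way. The "from" prices are the lowest across providers, so a given provider may cost more.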
