MiniMax: MiniMax M2

minimax/minimax-m2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by Artificial Analysis, MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency.

By: minimax
Publish time: 2025-10-27

Recent activity on MiniMax M2

Throughput (tokens/s)

Providers	Min (tokens/s)	Max (tokens/s)	Avg (tokens/s)
MiniMax	0.47	45.03	30.67

First Token Latency (ms)

Providers	Min (ms)	Max (ms)	Avg (ms)
MiniMax	798	2761	1847.10
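
As a rough illustration of how the two tables above combine, the sketch below estimates end-to-end response time as first-token latency plus generation time. The averages are copied from the tables; the function name `estimate_response_seconds` and the simple additive model are assumptions made for this example, and real requests will vary with load and prompt size.

```python
# Averages copied from the throughput and first-token-latency tables above.
AVG_FIRST_TOKEN_LATENCY_MS = 1847.10
AVG_THROUGHPUT_TPS = 30.67


def estimate_response_seconds(
    output_tokens: int,
    first_token_ms: float = AVG_FIRST_TOKEN_LATENCY_MS,
    tokens_per_second: float = AVG_THROUGHPUT_TPS,
) -> float:
    """Rough estimate: time to first token plus time to generate the output.

    Assumes generation proceeds at a constant rate, which is a simplification.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_token_ms / 1000.0 + output_tokens / tokens_per_second
```

Under these assumptions, a 1,000-token reply at the average rates would take roughly 34 to 35 seconds, most of it generation time rather than first-token latency.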

Providers for MiniMax M2

ZenMux routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

Provider	Latency	Throughput (tps)	Uptime	Recent uptime
MiniMax	1.68		100.00	Jan 05,2026 - 7 PM-

Price

Item	Price
Input	$0.3 / M tokens
Output	$1.2 / M tokens
Cache read	$0.03 / M tokens

No price listed: Cache write 5m, Cache write 1h, Cache write, Web search, Image, reasoning, Video, Audio, Audio & Video
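
To make the listed rates concrete, here is a minimal cost sketch. The three prices come from the list above; the function name `estimate_cost_usd` and the assumption that cached input tokens are billed at the cache-read rate instead of the regular input rate are illustrative, not confirmed billing rules.

```python
# USD per million tokens, copied from the price list above.
PRICE_PER_M_TOKENS = {"input": 0.3, "output": 1.2, "cache_read": 0.03}


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_input_tokens is the part of input_tokens served from
    cache, billed at the cache-read rate rather than the input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * PRICE_PER_M_TOKENS["input"]
        + cached_input_tokens * PRICE_PER_M_TOKENS["cache_read"]
        + output_tokens * PRICE_PER_M_TOKENS["output"]
    )
    return total / 1_000_000
```

For example, `estimate_cost_usd(10_000, 2_000)` comes to about $0.0054, and the same request with 8,000 of the input tokens read from cache comes to about $0.0032.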

Model limitation

Context: 204.80K tokens
Max output: 128.00K tokens

Supported Parameters

max_completion_tokens

temperature

top_p

frequency_penalty

presence_penalty

seed

logit_bias

logprobs

top_logprobs

response_format

stop

tools

tool_choice

parallel_tool_calls
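
One way to use the list above is to check request options against it before sending anything. The following is a minimal sketch: the parameter names are copied from the list, while the helper `build_chat_request` is a hypothetical convenience for this example and not part of any SDK.

```python
# Parameter names copied from the "Supported Parameters" list above.
SUPPORTED_PARAMETERS = frozenset({
    "max_completion_tokens", "temperature", "top_p", "frequency_penalty",
    "presence_penalty", "seed", "logit_bias", "logprobs", "top_logprobs",
    "response_format", "stop", "tools", "tool_choice", "parallel_tool_calls",
})


def build_chat_request(prompt: str, **params) -> dict:
    """Build keyword arguments for a chat completion request.

    Raises ValueError if a parameter is not in the supported list.
    """
    unsupported = sorted(set(params) - SUPPORTED_PARAMETERS)
    if unsupported:
        raise ValueError(f"unsupported parameters: {', '.join(unsupported)}")
    return {
        "model": "minimax/minimax-m2",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
```

The resulting dictionary could then be unpacked into an OpenAI SDK call, for example `client.chat.completions.create(**build_chat_request("Hello", temperature=0.2, max_completion_tokens=256))`.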

Model Protocol Compatibility

OpenAI Chat Completions

OpenAI Responses

Anthropic Messages

Google VertexAI
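
The protocols listed above use different request shapes. As an illustration only, the sketch below converts a plain-text OpenAI Chat Completions body into the general shape of an Anthropic Messages body, where the system prompt is a top-level field and `max_tokens` is required. The function name `openai_to_anthropic` and the default of 1,024 tokens are assumptions for this example; it handles only string message content and says nothing about how ZenMux itself translates requests.

```python
def openai_to_anthropic(body: dict, default_max_tokens: int = 1024) -> dict:
    """Convert a plain-text Chat Completions body to a Messages-style body.

    Only string `content` values are handled; tool calls and multimodal
    content are out of scope for this sketch.
    """
    system_parts = []
    messages = []
    for msg in body["messages"]:
        if msg["role"] == "system":
            # Messages-style bodies carry the system prompt at the top level.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    converted = {
        "model": body["model"],
        "max_tokens": body.get("max_completion_tokens", default_max_tokens),
        "messages": messages,
    }
    if system_parts:
        converted["system"] = "\n\n".join(system_parts)
    return converted
```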

Sample code and API for MiniMax M2

ZenMux normalizes requests and responses across providers for you.

OpenAI: Python-SDK

python
from openai import OpenAI

# Point the OpenAI SDK at ZenMux's OpenAI-compatible endpoint.
client = OpenAI(
  base_url="https://zenmux.ai/api/v1",
  api_key="<ZENMUX_API_KEY>",  # replace with your ZenMux API key
)

# Chat Completion
completion = client.chat.completions.create(
  model="minimax/minimax-m2",
  messages=[
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ]
)

# Print the text of the first returned choice.
print(completion.choices[0].message.content)
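
Since `tools` and `tool_choice` appear among the supported parameters, a tool-calling round trip might look like the sketch below. The tool `get_weather`, its canned reply, and the helper `run_tool_calls` are hypothetical names invented for this example; the message and tool-call fields follow the OpenAI Chat Completions format, and the code runs offline because it only processes a response object.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool that returns canned data."""
    return f"Sunny in {city}"


# Tool schema in the OpenAI Chat Completions "function" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names requested by the model to local Python callables.
TOOL_IMPLS = {"get_weather": get_weather}


def run_tool_calls(message) -> list[dict]:
    """Run each tool call on an assistant message; return `tool` role messages."""
    results = []
    for call in message.tool_calls or []:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        output = TOOL_IMPLS[call.function.name](**args)
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": output,
        })
    return results
```

With a live client, one would pass `tools=TOOLS` to `client.chat.completions.create(...)`, call `run_tool_calls(completion.choices[0].message)`, then append the assistant message and the returned tool messages to the conversation for a follow-up request.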