MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by Artificial Analysis, MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency.

Throughput:

| Provider | Min (tokens/s) | Max (tokens/s) | Avg (tokens/s) |
|---|---|---|---|
| MiniMax | 1.32 | 53.62 | 34.82 |

Latency:

| Provider | Min (ms) | Max (ms) | Avg (ms) |
|---|---|---|---|
| MiniMax | 798 | 4614 | 1944.69 |
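
To give a rough sense of what the provider figures above mean for a single request, the sketch below combines them into an end-to-end time estimate. It assumes that the latency column measures the delay before output starts and that tokens are then produced at a steady rate; neither assumption is stated in the tables. The function name `estimate_response_seconds` and the 500-token response length are illustrative only.

```python
def estimate_response_seconds(output_tokens, latency_ms, tokens_per_second):
    """Rough estimate: initial latency plus generation time at a steady rate.

    Assumption (not stated in the tables): latency is the delay before
    output starts, and throughput stays constant afterwards.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return latency_ms / 1000.0 + output_tokens / tokens_per_second


# Figures from the MiniMax rows above; the 500-token length is hypothetical.
average_case = estimate_response_seconds(500, latency_ms=1944.69, tokens_per_second=34.82)
slowest_case = estimate_response_seconds(500, latency_ms=4614, tokens_per_second=1.32)

print(f"average case: {average_case:.1f} s")
print(f"slowest case: {slowest_case:.1f} s")
```

Under these assumptions, generation time dominates the estimate for anything but very short responses, so the throughput column matters more than the latency column for long outputs.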

```python
from openai import OpenAI

# Point the OpenAI SDK at the ZenMux endpoint.
client = OpenAI(
    base_url="https://zenmux.ai/api/v1",
    api_key="<ZENMUX_API_KEY>",
)

# Send a single-turn chat request to MiniMax-M2.
completion = client.chat.completions.create(
    model="minimax/minimax-m2",
    messages=[
        {
            "role": "user",
            "content": "What is the meaning of life?"
        }
    ]
)

print(completion.choices[0].message.content)
```
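
For anything beyond a one-off call, it can help to keep the API key out of the source and to build the request in one place. The following is a minimal sketch, not part of the original listing: the environment variable name `ZENMUX_API_KEY`, the helpers `build_request` and `make_client`, and the use of a `system` message are all assumptions for illustration. It also assumes the endpoint accepts the standard chat-completions message roles.

```python
import os

MODEL = "minimax/minimax-m2"           # model ID from the snippet above
BASE_URL = "https://zenmux.ai/api/v1"  # endpoint from the snippet above


def build_request(prompt, system=None):
    """Build keyword arguments for client.chat.completions.create().

    Hypothetical helper: the optional system message assumes the endpoint
    accepts the standard "system" role.
    """
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {"model": MODEL, "messages": messages}


def make_client():
    """Create a client with the key read from the environment.

    ZENMUX_API_KEY is an assumed variable name, not one documented here.
    """
    from openai import OpenAI  # imported lazily so build_request works without the SDK

    return OpenAI(base_url=BASE_URL, api_key=os.environ["ZENMUX_API_KEY"])


# Usage (requires network access and a valid key):
#   client = make_client()
#   completion = client.chat.completions.create(
#       **build_request("Fix the failing test in utils.py", system="You are a coding assistant.")
#   )
#   print(completion.choices[0].message.content)
```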