加载中...

M2.1

Name: MiniMax M2.1 Preview
Availability: InStock
Author: MiniMaxAI

MiniMax M2.1 Preview

Release date: 2025-12-23更新于: 2025-12-24 09:01:041,267

Live demo GitHub Hugging Face Compare

Parameters

2300.0亿

Context length

200K

Chinese support

Supported

Reasoning ability

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators. Learn about our data methodology

M2.1

Model basics

Reasoning traces

Supported

Context length

200K tokens

Max output length

131072 tokens

Model type

聊天大模型

Release date

2025-12-23

Model file size

No data

MoE architecture

Yes

Total params / Active params

2300.0B / 100B

Knowledge cutoff

No data

Inference modes

常规模式（Non-Thinking Mode）思考模式（Thinking Mode）

M2.1

Open source & experience

Code license

MIT License

Weights license

Modified MIT- 免费商用授权

GitHub repo

https://github.com/MiniMax-AI/MiniMax-M2

Hugging Face

https://huggingface.co/MiniMaxAI/MiniMax-M2.1

Live demo

https://agent.minimax.io/

M2.1

Official resources

Paper

MiniMax M2.1: Significantly Enhanced Multi-Language Programming, Built for Real-World Complex Tasks

DataLearnerAI blog

No blog post yet

M2.1

API details

API speed

3/5

💡Default unit: $/1M tokens. If vendors use other units, follow their published pricing.

Standard pricingStandard

Modality	Input	Output
Text	$0.3	$1.2

Cached pricingCache

Modality	Input cache	Output cache
Text	$0.03	$0.375

M2.1

Benchmark Results

综合评估

3 evaluations

Benchmark / mode

Score

Rank/total

MMLU ProThinking

4 / 112

GPQA DiamondThinking

43 / 150

HLEThinking

42 / 99

编程与软件工程

2 evaluations

Benchmark / mode

Score

Rank/total

SWE-bench VerifiedThinking

74.80

18 / 85

SWE-Bench Pro - PublicThinking + With tools

32.60

11 / 12

数学推理

1 evaluations

Benchmark / mode

Score

Rank/total

AIME2025Thinking

54 / 105

Agent能力评测

2 evaluations

Benchmark / mode

Score

Rank/total

τ²-Bench - TelecomThinking + With tools

16 / 26

Aider-PolyglotThinking + With tools

17 / 26

指令跟随

1 evaluations

Benchmark / mode

Score

Rank/total

IF BenchThinking + With tools

6 / 23

AI Agent - 信息收集

1 evaluations

Benchmark / mode

Score

Rank/total

BrowseCompThinking + With tools

47.40

17 / 25

AI Agent - 工具使用

1 evaluations

Benchmark / mode

Score

Rank/total

Terminal Bench 2.0Thinking + With tools

47.90

11 / 18

查看评测深度分析与其他模型对比

M2.1

Publisher

MiniMaxAI

View publisher details

MiniMax M2.1 Preview

Model Overview

2025 年 12 月 23 日，MiniMax 发布 MiniMax M2.1，定位是“面向真实复杂任务”的新一代代码与 Agent 模型：不只更会写 Python，而是把 多语言工程、Web/App 交互与审美、以及办公场景的复合指令执行当成主战场来优化。minimax.io+1

这一版官方给出的自我对比很直白：M2 解决“成本与可用性”，M2.1 要解决“真实工作流里的复杂度”——语言更多、约束更多、链路更长、工具更多。minimax.io+1

1）它到底“升级”在哪：从单点 coding 到全链路交付

如果只用一句话概括 M2.1 的方向：把模型从“写得像”推进到“跑得起来、交付得出、还能好看”。

官方的 Key Highlights 里，有几个点值得拆开看：

多语言优先，而不是 Python 优先。 M2.1 明确把 Rust/Java/Go/C++/Kotlin/ObjC/TS/JS 等作为系统性增强对象——这其实更贴近企业代码库的真实形态：核心服务 + 历史包袱 + 多端应用 + 脚本胶水。minimax.io+1

WebDev / AppDev 的“交互与审美”被当成能力目标。 这点很关键：过去很多 coding benchmark 只评“代码对不对”，但真实产品要的是页面结构、交互逻辑、组件状态、动画效果、甚至整体视觉风格的一致性。M2.1 把 Android/iOS 原生开发也单列出来，等于是把“端到端应用生成”从口号变成了优化项。minimax.io+1

复合指令约束（composite instruction constraints）与办公场景。 你可以把它理解成：模型要同时满足多条约束（格式、步骤、数据口径、工具调用顺序、输出结构），并在“边做边检查”中维持一致性。官方把这条和“Office 场景”绑定，背后其实是更 Agent 化的执行路径。minimax.io+1

2）参数与形态：为什么它仍然强调“轻量”和“激活参数”

MiniMax 的 API 文档里给了一个很明确的结构信息：230B 总参数、每次推理激活 10B。这基本坐实了它仍然是以 MoE / 稀疏激活为核心的效率路线。MiniMax API Docs

在使用侧，OpenRouter 的模型卡片显示 上下文 204,800 tokens，并把它描述为“面向 coding、agentic workflows、modern app development”的轻量 SOTA 模型。OpenRouter

这类结构的实际意义是：如果你的链路是 “写—跑—报错—修—再跑”，或者需要高并发做 review、refactor、生成测试，那么“每次只激活一小部分参数”通常更容易把 吞吐/延迟/成本压到一个可用区间。

3）评测信号：VIBE 这种 benchmark 为什么值得关注？

M2.1 发布里最值得看的不是“又赢了谁”，而是他们自己提出了一个新 benchmark：VIBE（Visual & Interactive Benchmark for Execution）。

它想评的是：模型能否从零搭一个可运行应用，并在真实运行环境里通过交互逻辑与视觉效果的验证；并且它使用 Agent-as-a-Verifier (AaaV) 的方式做自动化评估。minimax.io

官方给出的 VIBE 成绩是：

VIBE 总分：88.6
VIBE-Web：91.5
VIBE-Android：89.7 minimax.io

为什么这类分数比传统“写代码 benchmark”更有解释力？因为它更接近你在 vibe coding / agentic engineering 里真正关心的东西：能不能在一个 runtime 里跑起来、并且结果能被验证。

4）“生态位”变化：它开始被当作 IDE / 工具链的默认大脑

一个很现实的判断方法是：看它是否快速进入主流工具链。

Vercel AI Gateway 在 2025-12-22 的 changelog 里提到 M2.1 已接入（对很多前端/全栈团队来说，这意味着接入成本显著降低）。Vercel
Kilo 也同步宣布可用，并给了偏“日常开发场景”的试用描述（创建/调试/扩展/文档四类任务）。blog.kilo.ai

这类信号比“单次跑分截图”更重要：说明它正在被当成 可稳定上线的工作流模型，而不仅是“偶尔惊艳”。

5）怎么用：把 M2.1 放进你的默认链路

这里给几个偏工程落地的用法建议（不追求花哨，追求能跑通）：

A. 代码重构 / Review / 测试生成：让它吃“变更 + 约束”，别只吃“需求”。

M2.1 强调 refactoring、instruction constraints，你在提示词里最好把“不可改动的行为”“必须保留的接口”“测试必须通过”写成显式约束（并要求输出变更说明 + 风险点），它的优势才更容易体现。minimax.io+1

B. 端到端小应用：按“可运行产物”组织输出。

VIBE 的思路其实在提醒你：不要只让模型吐一段代码；要让它输出一个最小可运行工程（目录结构、依赖、启动命令、关键页面/路由、基础数据/mock），并在每轮迭代里固定验收标准。minimax.io+1

C. 多轮 Agent：尽量保留 reasoning / 中间态。

OpenRouter 的说明里有一句很“工程化”的提醒：多轮对话里要尽量保留 reasoning 信息（reasoning_details）以避免性能退化。哪怕你不用 OpenRouter，这个思路也成立：别丢掉中间状态，否则长链路任务会更容易漂。OpenRouter

6）开放与部署：这次仍然强调“可本地化”

官方在 How to Use 里明确写到：M2.1 API 已上线，并且 模型权重已开源，支持本地部署，推荐的推理框架包括 SGLang、vLLM，以及 MLX / KTransformers 等。minimax.io

对团队来说，这意味着两条路都能走：

要快就直接 API；要控成本/控数据/控延迟，就走本地部署。

怎么看 M2.1 的“关键价值”

M2.1 这次最清晰的信号是：coding 模型的竞争点正在从“写得更像”转向“交付得更完整”——多语言工程、端到端应用、工具链协作、以及可验证的执行闭环。

如果你过去评估 coding 模型只看“补全质量”，那 M2.1 值得你换一种评测方式：把任务写成一个可运行、可验收、可回归的工程流程，再看它在 10～50 次迭代里是否稳定。

DataLearner 官方微信

欢迎关注 DataLearner 官方微信，获得最新 AI 技术推送

加载中...

M2.1

MiniMax M2.1 Preview

Release date: 2025-12-23更新于: 2025-12-24 09:01:041,267

Live demo GitHub Hugging Face Compare

Parameters

2300.0亿

Context length

200K

Chinese support

Supported

Reasoning ability

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators. Learn about our data methodology

M2.1

Model basics

Reasoning traces

Supported

Context length

200K tokens

Max output length

131072 tokens

Model type

聊天大模型

Release date

2025-12-23

Model file size

No data

MoE architecture

Yes

Total params / Active params

2300.0B / 100B

Knowledge cutoff

No data

Inference modes

常规模式（Non-Thinking Mode）思考模式（Thinking Mode）

M2.1

Open source & experience

Code license

MIT License

Weights license

Modified MIT- 免费商用授权

GitHub repo

https://github.com/MiniMax-AI/MiniMax-M2

Hugging Face

https://huggingface.co/MiniMaxAI/MiniMax-M2.1

Live demo

https://agent.minimax.io/

M2.1

Official resources

Paper

MiniMax M2.1: Significantly Enhanced Multi-Language Programming, Built for Real-World Complex Tasks

DataLearnerAI blog

No blog post yet

M2.1

API details

API speed

3/5

💡Default unit: $/1M tokens. If vendors use other units, follow their published pricing.

Standard pricingStandard

Modality	Input	Output
Text	$0.3	$1.2

Cached pricingCache

Modality	Input cache	Output cache
Text	$0.03	$0.375

M2.1

Benchmark Results

综合评估

3 evaluations

Benchmark / mode

Score

Rank/total

MMLU ProThinking

4 / 112

GPQA DiamondThinking

43 / 150

HLEThinking

42 / 99

编程与软件工程

2 evaluations

Benchmark / mode

Score

Rank/total

SWE-bench VerifiedThinking

74.80

18 / 85

SWE-Bench Pro - PublicThinking + With tools

32.60

11 / 12

数学推理

1 evaluations

Benchmark / mode

Score

Rank/total

AIME2025Thinking

54 / 105

Agent能力评测

2 evaluations

Benchmark / mode

Score

Rank/total

τ²-Bench - TelecomThinking + With tools

16 / 26

Aider-PolyglotThinking + With tools

17 / 26

指令跟随

1 evaluations

Benchmark / mode

Score

Rank/total

IF BenchThinking + With tools

6 / 23

AI Agent - 信息收集

1 evaluations

Benchmark / mode

Score

Rank/total

BrowseCompThinking + With tools

47.40

17 / 25

AI Agent - 工具使用

1 evaluations

Benchmark / mode

Score

Rank/total

Terminal Bench 2.0Thinking + With tools

47.90

11 / 18

查看评测深度分析与其他模型对比

M2.1

Publisher

MiniMaxAI

View publisher details

MiniMax M2.1 Preview

Model Overview

1）它到底“升级”在哪：从单点 coding 到全链路交付

如果只用一句话概括 M2.1 的方向：把模型从“写得像”推进到“跑得起来、交付得出、还能好看”。

官方的 Key Highlights 里，有几个点值得拆开看：

2）参数与形态：为什么它仍然强调“轻量”和“激活参数”

在使用侧，OpenRouter 的模型卡片显示 上下文 204,800 tokens，并把它描述为“面向 coding、agentic workflows、modern app development”的轻量 SOTA 模型。OpenRouter

3）评测信号：VIBE 这种 benchmark 为什么值得关注？

M2.1 发布里最值得看的不是“又赢了谁”，而是他们自己提出了一个新 benchmark：VIBE（Visual & Interactive Benchmark for Execution）。

官方给出的 VIBE 成绩是：

VIBE 总分：88.6
VIBE-Web：91.5
VIBE-Android：89.7 minimax.io

4）“生态位”变化：它开始被当作 IDE / 工具链的默认大脑

一个很现实的判断方法是：看它是否快速进入主流工具链。

Vercel AI Gateway 在 2025-12-22 的 changelog 里提到 M2.1 已接入（对很多前端/全栈团队来说，这意味着接入成本显著降低）。Vercel
Kilo 也同步宣布可用，并给了偏“日常开发场景”的试用描述（创建/调试/扩展/文档四类任务）。blog.kilo.ai

这类信号比“单次跑分截图”更重要：说明它正在被当成 可稳定上线的工作流模型，而不仅是“偶尔惊艳”。

5）怎么用：把 M2.1 放进你的默认链路

这里给几个偏工程落地的用法建议（不追求花哨，追求能跑通）：

6）开放与部署：这次仍然强调“可本地化”

对团队来说，这意味着两条路都能走：

要快就直接 API；要控成本/控数据/控延迟，就走本地部署。

怎么看 M2.1 的“关键价值”

DataLearner 官方微信

欢迎关注 DataLearner 官方微信，获得最新 AI 技术推送