See key specs and per-benchmark scores for each model/mode. This page compares evaluation data and core specifications for 2 models.

Muse Spark
Facebook AI Research Lab
Best overall: Gemini 3.1 Pro Preview · 53.88
Best single: Gemini 3.1 Pro Preview · GPQA Diamond 94.30
Modality coverage: Muse Spark · 4 modalities
Head to head (Muse Spark vs. Gemini 3.1 Pro Preview): 5 benchmarks, 2 wins, 3 losses, average diff -6.68
Compare benchmark results across thinking modes and tool usage.
Data is sourced first from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.
Complete scores for each model/mode across selected benchmarks.
5 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | Muse Spark | Gemini 3.1 Pro Preview |
|---|---|---|
| ARC-AGI-2 (general evaluation) | 42.50 (Thinking Enabled) | 77.10 (Thinking Level · High) |
| GPQA Diamond (general evaluation) | 89.50 (Thinking Enabled) | 94.30 (Thinking Level · High) |
| HLE (general evaluation) | 50.40 (Thinking Enabled, Tools) | 44.40 (Thinking Level · High) |
| FrontierMath (mathematical reasoning) | 39.00 (Thinking Enabled) | 36.90 (Thinking Level · High) |
| | 14.60 (Standard Mode) | 16.70 (Standard Mode) |
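The head-to-head summary figures (5 benchmarks, 2 wins, 3 losses, average diff -6.68) and the "Best overall" score of 53.88 appear to follow from the table by simple arithmetic. The sketch below is one way to reproduce them, assuming the average diff is the mean of (Muse Spark minus Gemini 3.1 Pro Preview) per benchmark and "Best overall" is the higher per-model mean. The function name `head_to_head` and the placeholder key for the fifth row, whose benchmark name is missing in the source, are hypothetical; the site's actual method is not documented here.

```python
from statistics import mean

# Scores copied from the comparison table as (Muse Spark, Gemini 3.1 Pro Preview).
# The fifth row has no benchmark name in the source, so its key is a placeholder.
scores = {
    "ARC-AGI-2": (42.50, 77.10),
    "GPQA Diamond": (89.50, 94.30),
    "HLE": (50.40, 44.40),
    "FrontierMath": (39.00, 36.90),
    "(unnamed fifth benchmark)": (14.60, 16.70),
}


def head_to_head(scores):
    """Summarize model A against model B (hypothetical helper, not the site's code)."""
    diffs = [a - b for a, b in scores.values()]
    return {
        "benchmarks": len(diffs),
        "wins": sum(d > 0 for d in diffs),    # benchmarks where A scores higher
        "losses": sum(d < 0 for d in diffs),  # benchmarks where B scores higher
        "average_diff": round(mean(diffs), 2),
        "mean_a": round(mean(a for a, _ in scores.values()), 2),
        "mean_b": round(mean(b for _, b in scores.values()), 2),
    }


summary = head_to_head(scores)
print(summary)
```

Under these assumptions the result matches the page: 2 wins, 3 losses, an average diff of -6.68, and a Gemini 3.1 Pro Preview mean of 53.88 against 47.2 for Muse Spark. Note that a single large gap (ARC-AGI-2, -34.6 points) accounts for most of the average diff.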
Side-by-side input/output token pricing
Licensing, MoE architecture, and multi-modality support.
| Features & specs | Muse Spark (Facebook AI Research Lab) | Gemini 3.1 Pro Preview (Google DeepMind) |
|---|---|---|
| Core specs | | |
| Release | 2026-04-08 | 2026-02-20 |
| Context length (tokens) | 262K | 1M |
| Max output (tokens) | Not provided | 32768 |
| MoE | No | No |
| License | | |
| Code Open Source | Not provided | Not provided |
| Weights Open Source | Not provided | Not provided |
| Commercial use | Not open source | Not open source |
| Modality support | | |
| Text Input/Output | / | Not provided |
| Image Input/Output | / | Not provided |
| Audio Input/Output | / | Not provided |
| Video Input/Output | / | Not provided |
| Resources | | |
| Paper / report | Introducing Muse Spark: Scaling Towards Personal Superintelligence | Gemini 3.1 Pro: A smarter model for your most complex tasks |

Gemini 3.1 Pro Preview
Google DeepMind