Muse Spark vs Gemini 3.1 Pro Preview

Across 9 shared benchmarks, Gemini 3.1 Pro Preview leads overall: Muse Spark wins 3, Gemini 3.1 Pro Preview wins 6, with 0 ties. The average score difference (Muse Spark minus Gemini 3.1 Pro Preview) is -5.42.

Muse Spark

Facebook AI Research Lab · 2026-04-08 · Reasoning model

Gemini 3.1 Pro Preview

Google DeepMind · 2026-02-20 · Multimodal model

Muse Spark: 3 wins (33%) · Gemini 3.1 Pro Preview: 6 wins (67%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 9 shared benchmarks.

General Knowledge

Gemini 3.1 Pro Preview 2/3

Benchmark	Muse Spark	Gemini 3.1 Pro Preview	Diff
ARC-AGI-2	42.50 · rank 28 / 62 · Thinking (No Tools)	77.10 · rank 9 / 62 · Thinking High (No Tools)	-34.60
HLE	58.00 · rank 6 / 172 · Deep Thinking (No Tools, Parallel)	51.40 · rank 22 / 172 · Thinking High (With Tools)	+6.60
GPQA Diamond	89.50 · rank 25 / 187 · Thinking (No Tools)	94.30 · rank 3 / 187 · Thinking High (No Tools)	-4.80

AI Agent - Tool Usage

Even: 1 win each (2 benchmarks)

Benchmark	Muse Spark	Gemini 3.1 Pro Preview	Diff
Terminal Bench 2.0	59.00 · rank 24 / 47 · Thinking (With Tools)	68.50 · rank 8 / 47 · Thinking High (With Tools)	-9.50
MCP-Atlas	82.20 · rank 5 / 27 · Normal (With Tools)	78.20 · rank 9 / 27 · Thinking High (With Tools)	+4.00

Math and Reasoning

Even: 1 win each (2 benchmarks)

Benchmark	Muse Spark	Gemini 3.1 Pro Preview	Diff
FrontierMath - Tier 4	14.60 · rank 23 / 80 · Normal (No Tools)	16.70 · rank 20 / 80 · Normal (No Tools)	-2.10
FrontierMath	39.00 · rank 9 / 60 · Thinking (No Tools)	36.90 · rank 11 / 60 · Thinking High (No Tools)	+2.10

Agent Level Benchmark

Gemini 3.1 Pro Preview 1/1

Benchmark	Muse Spark	Gemini 3.1 Pro Preview	Diff
τ²-Bench - Telecom	92.00 · rank 20 / 35 · Thinking (With Tools)	99.30 · rank 1 / 35 · Thinking High (With Tools)	-7.30

Coding and Software Engineer

Gemini 3.1 Pro Preview 1/1

Benchmark	Muse Spark	Gemini 3.1 Pro Preview	Diff
SWE-bench Verified	77.40 · rank 27 / 112 · Thinking (With Tools)	80.60 · rank 11 / 112 · Thinking High (With Tools)	-3.20

Specs

Field	Muse Spark	Gemini 3.1 Pro Preview
Publisher	Facebook AI Research Lab	Google DeepMind
Release date	2026-04-08	2026-02-20
Model type	Reasoning model	Multimodal model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	262K	1M
Max output	Not available	64K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Muse Spark	Gemini 3.1 Pro Preview
Text input	Not public	$2 / 1M tokens
Text output	Not public	$12 / 1M tokens

One or both models have incomplete public pricing.
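As a rough illustration of how the listed per-million-token rates translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost`, the `PRICES_PER_MILLION` dictionary, and the token counts in the usage line are hypothetical and chosen only for this example. Because Muse Spark pricing is not public, the sketch returns `None` for it rather than guessing a figure.

```python
from typing import Optional

# USD per 1M tokens, copied from the pricing table above.
# None marks a model whose pricing is not public (no value is inferred).
PRICES_PER_MILLION = {
    "Gemini 3.1 Pro Preview": {"input": 2.00, "output": 12.00},
    "Muse Spark": None,
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the estimated text cost in USD, or None if pricing is unknown."""
    rates = PRICES_PER_MILLION.get(model)
    if rates is None:
        return None
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical request: 100,000 input tokens and 10,000 output tokens.
print(estimate_cost("Gemini 3.1 Pro Preview", 100_000, 10_000))
print(estimate_cost("Muse Spark", 100_000, 10_000))
```

The sketch covers text input and output only, matching the two rows in the table; any other billing dimensions are outside what this page records.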

Summary

Gemini 3.1 Pro Preview leads in: General Knowledge (2/3), Agent Level Benchmark (1/1), Coding and Software Engineer (1/1)
Tied in: AI Agent - Tool Usage, Math and Reasoning

On average across the 9 shared benchmarks, Gemini 3.1 Pro Preview scores 5.42 higher.

Largest single-benchmark gap: ARC-AGI-2, where Muse Spark scores 42.50 vs 77.10 for Gemini 3.1 Pro Preview (-34.60).
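The summary figures can be recomputed from the nine score pairs in the benchmark tables. The Python sketch below is one way to do that, assuming each difference is taken as Muse Spark minus Gemini 3.1 Pro Preview. The names `SCORES` and `summarize` are hypothetical and not part of any DataLearner tooling.

```python
from statistics import mean

# (benchmark, Muse Spark score, Gemini 3.1 Pro Preview score), from the tables above.
SCORES = [
    ("ARC-AGI-2", 42.50, 77.10),
    ("HLE", 58.00, 51.40),
    ("GPQA Diamond", 89.50, 94.30),
    ("Terminal Bench 2.0", 59.00, 68.50),
    ("MCP-Atlas", 82.20, 78.20),
    ("FrontierMath - Tier 4", 14.60, 16.70),
    ("FrontierMath", 39.00, 36.90),
    ("τ²-Bench - Telecom", 92.00, 99.30),
    ("SWE-bench Verified", 77.40, 80.60),
]


def summarize(scores):
    """Tally wins and ties, and compute the average and largest score gap."""
    # Difference is Muse Spark minus Gemini; rounding removes float noise.
    diffs = {name: round(muse - gemini, 2) for name, muse, gemini in scores}
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "muse_wins": sum(d > 0 for d in diffs.values()),
        "gemini_wins": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "average_diff": round(mean(diffs.values()), 2),
        "largest_gap": (largest, diffs[largest]),
    }


print(summarize(SCORES))
```

A plain mean over benchmarks weights every benchmark equally, so a single large gap such as ARC-AGI-2 has a strong effect on the average.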

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
