Gemma 4 31B vs Gemma 3 - 27B (IT)

Across 3 shared benchmarks, Gemma 4 31B leads overall: Gemma 4 31B wins 3, Gemma 3 - 27B (IT) wins 0, with 0 ties and an average score difference of +36.63.

Gemma 4 31B

DeepMind · 2026-04-02 · Chat model

Gemma 3 - 27B (IT)

Google DeepMind · 2025-03-12 · Chat model

Gemma 4 31B: 3 wins (100%) · Gemma 3 - 27B (IT): 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 3 shared benchmarks.

General Knowledge

Gemma 4 31B leads 2/2

Benchmark	Gemma 4 31B	Gemma 3 - 27B (IT)	Diff
GPQA Diamond	84.30 · 58 / 187 · Thinking (No Tools)	42.40 · 169 / 187 · Normal (No Tools)	+41.90
MMLU Pro	85.20 · 24 / 132 · Thinking (No Tools)	67.50 · 100 / 132 · Normal (No Tools)	+17.70

Coding and Software Engineering

Gemma 4 31B leads 1/1

Benchmark	Gemma 4 31B	Gemma 3 - 27B (IT)	Diff
LiveCodeBench	80.00 · 30 / 123 · Thinking (No Tools)	29.70 · 119 / 123 · Normal (No Tools)	+50.30

Specs

Field	Gemma 4 31B	Gemma 3 - 27B (IT)
Publisher	DeepMind	Google DeepMind
Release date	2026-04-02	2025-03-12
Model type	Chat model	Chat model
Architecture	Dense	Dense
Parameters	30.7B	27B
Context length	256K	128K
Max output	32K	Not available

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Gemma 4 31B	Gemma 3 - 27B (IT)
Text input	Not public	$0.09 / 1M tokens
Text output	Not public	$0.16 / 1M tokens

One or both models have incomplete public pricing.
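
The per-token rates above can be turned into a rough cost estimate. The sketch below is illustrative only: the function name and the workload size (2M input tokens, 500K output tokens) are assumptions for the example, not figures from the page, and a missing rate is returned as "unknown" rather than guessed.

```python
from typing import Optional

PER_MILLION = 1_000_000  # listed prices are per 1M tokens


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate: Optional[float],
    output_rate: Optional[float],
) -> Optional[float]:
    """Estimate cost in USD; return None when either rate is not public."""
    if input_rate is None or output_rate is None:
        return None  # do not infer missing pricing
    cost = (input_tokens * input_rate + output_tokens * output_rate) / PER_MILLION
    return round(cost, 6)


# Rates from the pricing table (USD per 1M tokens); None means "Not public".
GEMMA_3_27B_IT = {"input": 0.09, "output": 0.16}
GEMMA_4_31B = {"input": None, "output": None}

# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

gemma3_cost = estimate_cost_usd(
    INPUT_TOKENS, OUTPUT_TOKENS, GEMMA_3_27B_IT["input"], GEMMA_3_27B_IT["output"]
)
gemma4_cost = estimate_cost_usd(
    INPUT_TOKENS, OUTPUT_TOKENS, GEMMA_4_31B["input"], GEMMA_4_31B["output"]
)

print("Gemma 3 - 27B (IT):", gemma3_cost)
print("Gemma 4 31B:", gemma4_cost)
```

For this assumed workload, the Gemma 3 - 27B (IT) estimate is about $0.26 (2 × $0.09 for input plus 0.5 × $0.16 for output), while the Gemma 4 31B estimate stays undefined because its rates are not public.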

Summary

Gemma 4 31B leads in: General Knowledge (2/2), Coding and Software Engineering (1/1)

On average across the 3 shared benchmarks, Gemma 4 31B scores 36.63 higher.

Largest single-benchmark gap: LiveCodeBench — Gemma 4 31B 80 vs Gemma 3 - 27B (IT) 29.70 (+50.30).
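
The summary figures can be re-derived from the three score pairs in the benchmark tables. This is a minimal sketch, assuming the difference is simply the first model's score minus the second's and that the average is an unweighted mean; the function and variable names are illustrative, not part of the page's tooling.

```python
# Scores from the benchmark tables: (Gemma 4 31B, Gemma 3 - 27B (IT)).
scores = {
    "GPQA Diamond": (84.30, 42.40),
    "MMLU Pro": (85.20, 67.50),
    "LiveCodeBench": (80.00, 29.70),
}


def score_gaps(table):
    """Per-benchmark difference (first model minus second), rounded to 2 places."""
    return {name: round(a - b, 2) for name, (a, b) in table.items()}


def summarize(table):
    """Count wins and ties, and compute the mean gap and the largest gap."""
    gaps = score_gaps(table)
    largest = max(gaps, key=lambda name: abs(gaps[name]))
    return {
        "wins_first": sum(1 for g in gaps.values() if g > 0),
        "wins_second": sum(1 for g in gaps.values() if g < 0),
        "ties": sum(1 for g in gaps.values() if g == 0),
        "mean_gap": round(sum(gaps.values()) / len(gaps), 2),
        "largest_gap": (largest, gaps[largest]),
    }


summary = summarize(scores)
print(summary)
```

Note that the compared runs use different modes (Thinking for Gemma 4 31B, Normal for Gemma 3 - 27B (IT)), so the gaps describe the listed records rather than a like-for-like setting.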

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
