Gemma 4 31B vs Gemma 3 - 27B (IT)
Across 3 shared benchmarks, Gemma 4 31B leads overall: Gemma 4 31B wins 3, Gemma 3 - 27B (IT) wins 0, with 0 ties and an average score difference of +36.63.
Gemma 4 31B
Google DeepMind · 2026-04-02 · AI model
Gemma 3 - 27B (IT)
Google DeepMind · 2025-03-12 · AI model
Gemma 4 31B: 3 wins (100%) · Gemma 3 - 27B (IT): 0 wins (0%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 3 shared benchmarks.
General Knowledge
Gemma 4 31B 2/2
| Benchmark | Gemma 4 31B | Gemma 3 - 27B (IT) | Diff |
|---|---|---|---|
| GPQA Diamond | 84.30 · 50 / 175 · Thinking (No Tools) | 42.40 · 158 / 175 · Normal (No Tools) | +41.90 |
| MMLU Pro | 85.20 · 21 / 124 · Thinking (No Tools) | 67.50 · 94 / 124 · Normal (No Tools) | +17.70 |
Coding and Software Engineer
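Gemma 4 31B 1/1
| Benchmark | Gemma 4 31B | Gemma 3 - 27B (IT) | Diff |
|---|---|---|---|
| LiveCodeBench | 80 | 29.70 | +50.30 |
Specs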
| Field | Gemma 4 31B | Gemma 3 - 27B (IT) |
|---|---|---|
| Publisher | Google DeepMind | Google DeepMind |
| Release date | 2026-04-02 | 2025-03-12 |
| Model type | AI model | AI model |
| Architecture | Dense | Dense |
| Parameters (billions) | 31.0 | 27.0 |
| Context length | 256K | 128K |
| Max output | 32768 | Not available |
Summary
- Gemma 4 31B leads in: General Knowledge (2/2), Coding and Software Engineer (1/1)
On average across the 3 shared benchmarks, Gemma 4 31B scores 36.63 higher.
Largest single-benchmark gap: LiveCodeBench — Gemma 4 31B 80 vs Gemma 3 - 27B (IT) 29.70 (+50.30).
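The summary figures can be checked against the per-benchmark scores listed on this page. The following is a minimal sketch, not the site's actual generator: the variable names are illustrative, and it assumes a "win" means a strictly higher score and that the average is the plain mean of the three differences.

```python
# Scores as listed on this page: (Gemma 4 31B, Gemma 3 - 27B (IT)).
scores = {
    "GPQA Diamond": (84.30, 42.40),
    "MMLU Pro": (85.20, 67.50),
    "LiveCodeBench": (80.00, 29.70),
}

# Per-benchmark difference, rounded to two decimals like the Diff column.
diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}

# Assumption: a win is a strictly higher score; equal scores count as a tie.
wins_gemma4 = sum(d > 0 for d in diffs.values())
wins_gemma3 = sum(d < 0 for d in diffs.values())
ties = sum(d == 0 for d in diffs.values())

# Assumption: the reported average is the unweighted mean of the differences.
avg_diff = round(sum(diffs.values()) / len(diffs), 2)

# Benchmark with the largest absolute gap.
largest = max(diffs, key=lambda name: abs(diffs[name]))

print(wins_gemma4, wins_gemma3, ties)  # 3 0 0
print(avg_diff)                        # 36.63
print(largest, diffs[largest])         # LiveCodeBench 50.3
```

Under these assumptions the sketch reproduces the 3 to 0 win count, the +36.63 average difference, and LiveCodeBench as the largest gap.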
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.