Gemini 3.1 Flash-Lite

Name: Gemini 3.1 Flash-Lite
Author: Google Deep Mind

Multimodal modelGemini 3.1

Release date: 2026-05-07Knowledge cutoff: 2025-0118

Live demoGitHubHugging FaceCompare

Parameters

Not disclosed

Context length

Chinese support

Supported

Reasoning ability

Google DeepMind 于 2026 年 5 月 7 日 GA 的 Gemini 3.1 Flash-Lite，面向高吞吐、低延迟和成本敏感场景，支持 1M 输入、约 64K 输出以及最高 High 档 thinking。

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators. Learn about our data methodology

Gemini 3.1 Flash-Lite

Model basics

Reasoning traces

Supported

Thinking modes

Standard Mode (Default)Thinking Level · LowThinking Level · MediumThinking Level · High

Context length

1M tokens

Max output length

64K tokens

Model type

Multimodal model

Modality (in / out)

Text, Image, Audio, Video → Text

Release date

2026-05-07

Model file size

No data

MoE architecture

Total params / Active params

No data / N/A

Knowledge cutoff

2025-01

Gemini 3.1 Flash-Lite

Open source & experience

Code license

不开源

Weights license

不开源

GitHub repo

GitHub link unavailable

Hugging Face

Hugging Face link unavailable

Live demo

https://aistudio.google.com/

Gemini 3.1 Flash-Lite

Official resources

Paper

Gemini 3.1 Flash-Lite | Gemini Enterprise Agent Platform

DataLearnerAI blog

No blog post yet

Gemini 3.1 Flash-Lite

API details

API speed

5/5

No public API pricing yet.

Gemini 3.1 Flash-Lite

Benchmark Results

Gemini 3.1 Flash-Lite currently shows benchmark results led by LiveBench (61 / 115, score 61.68), MCP-Atlas (19 / 23, score 57.10). This page also consolidates core specs, context limits, and API pricing so you can evaluate the model from benchmark results and deployment constraints together.