Llama 4 Behemoth Instruct

Name: Llama-4-Behemoth-17B-128E-Instruct
Author: Facebook AI Research Lab

Multimodal model | Llama 4

Release date: 2025-04-05
Updated: 2025-04-06 08:27:26

Parameters

2T

Context length

1000K

Chinese support

Supported

Llama-4-Behemoth-17B-128E-Instruct is a multimodal AI model published by Facebook AI Research Lab and released on 2025-04-05. It has 2T parameters and a 1000K-token context length, requires about 4000GB of storage, is distributed under the Llama4 License, and scores 95.00 on MATH-500.
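The 2T parameter count and the roughly 4000GB storage figure quoted above are consistent with 16-bit weights. A minimal sketch of that arithmetic follows, assuming 2 bytes per parameter and decimal gigabytes (1 GB = 10^9 bytes); neither assumption is stated on this page, and the function name is purely illustrative.

```python
def estimate_weight_storage_gb(total_params: float, bytes_per_param: float = 2.0) -> float:
    """Estimate raw weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same precision, and no
    optimizer state, tokenizer files, or metadata are included.
    """
    return total_params * bytes_per_param / 1e9


TOTAL_PARAMS = 2e12  # "2T", as listed on this page

# 16-bit weights (assumed precision, 2 bytes per parameter)
print(estimate_weight_storage_gb(TOTAL_PARAMS))

# 8-bit weights (hypothetical quantized variant, 1 byte per parameter)
print(estimate_weight_storage_gb(TOTAL_PARAMS, bytes_per_param=1.0))
```

Under these assumptions the 16-bit estimate matches the listed 4000GB, which suggests the file size on this page may simply be derived from the parameter count rather than measured from published files.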

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

Model basics

Reasoning traces

Not supported

Thinking modes

Not supported

Context length

1000K tokens

Max output length

4K tokens

Model type

Multimodal model

Modality (in / out)

Text, Image, Audio, Video → Text

Release date

2025-04-05

Model file size

4000GB

MoE architecture

Total params / Active params

2T / N/A

Knowledge cutoff

No data

Open source & experience

Code license

Llama4 License

Weights license

Llama4 License (free for commercial use)

GitHub repo

https://github.com/meta-llama/llama-models/tree/main/models/llama4

Hugging Face

https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E

Live demo

No live demo

Official resources

Paper

The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation

DataLearnerAI blog

No blog post yet

API details

API speed

3/5

No public API pricing yet.

Benchmark Results

Llama 4 Behemoth Instruct currently has benchmark results on MMLU Pro (score 82.20, ranked 49 of 126), GPQA Diamond (score 73.70, ranked 98 of 179), and MATH-500 (score 95, ranked 25 of 44). This page also lists the core specifications, context limits, and API details, so the benchmark results can be weighed alongside deployment constraints.
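To compare standings across leaderboards of different sizes, the rank figures can be normalized. The sketch below assumes each pair such as 49 / 126 means "rank / number of models evaluated" (the page does not define the notation) and uses a hypothetical helper, `top_fraction`, to express each rank as a share of its leaderboard.

```python
def top_fraction(rank: int, total: int) -> float:
    """Return rank / total, i.e. the share of the leaderboard at or above this rank."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return rank / total


# (rank, total, score) as listed on this page
results = {
    "MMLU Pro": (49, 126, 82.20),
    "GPQA Diamond": (98, 179, 73.70),
    "MATH-500": (25, 44, 95.0),
}

for name, (rank, total, score) in results.items():
    share = top_fraction(rank, total)
    print(f"{name}: score {score}, rank {rank}/{total}, top {share:.0%}")
```

A smaller fraction means a stronger relative position. Because each leaderboard covers a different set of models, these fractions are only a rough guide and are not directly comparable across benchmarks.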