Nemotron 3 Ultra

Name: NVIDIA Nemotron 3 Ultra 550B-A55B
Author: NVIDIA

Reasoning model

Release date: 2026-06-04
Knowledge cutoff: 2026-05

Parameters

550B

Context length

1M tokens

Chinese support

Supported

Reasoning ability

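Nemotron 3 Ultra, released by NVIDIA on June 4, 2026, has 550B total parameters and 55B active parameters. It uses a hybrid LatentMoE / Mamba-2 / Attention architecture and supports a 1M-token context with reasoning that can be switched on or off.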

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces

Supported

Thinking modes

Thinking Mode (default), Standard Mode

Context length

1M tokens

Max output length

No data

Model type

Reasoning model

Modality (in / out)

Text → Text

Release date

2026-06-04

Model file size

No data

MoE architecture

Yes

Total params / Active params

550B / 55B

Knowledge cutoff

2026-05
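
The total and active parameter counts listed above imply two quick figures: the share of parameters active per token, and a rough size for the weights. The following is a minimal sketch, not an official calculation. It assumes the nominal counts (550B and 55B) and 2 bytes per parameter for BF16, a precision suggested only by the Hugging Face repository name. It ignores checkpoint metadata, KV cache, and runtime overhead, so it does not replace the "No data" entry for model file size.

```python
# Back-of-the-envelope figures from the nominal parameter counts on this page.
TOTAL_PARAMS = 550e9    # total parameters (nominal)
ACTIVE_PARAMS = 55e9    # active parameters per token (nominal)
BYTES_PER_BF16 = 2      # assumption: weights stored as 16-bit bfloat


def active_fraction(total: float, active: float) -> float:
    """Share of all parameters that are active for a single token."""
    return active / total


def weight_bytes(params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Approximate raw weight storage, excluding any overhead."""
    return params * bytes_per_param


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
size_tb = weight_bytes(TOTAL_PARAMS) / 1e12  # decimal terabytes

print(f"active fraction: {frac:.0%}")
print(f"approx. BF16 weights: {size_tb:.1f} TB")
```

Under these assumptions, about one tenth of the parameters are active per token, and the raw BF16 weights come to roughly 1.1 TB.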

Open source & experience

Code license

Free for commercial use

Weights license

Free for commercial use

GitHub repo

https://github.com/NVIDIA/NeMo

Hugging Face

https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

Live demo

https://build.nvidia.com/nvidia/nemotron-3-ultra-550b-a55b

Official resources

Paper

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

DataLearnerAI blog

No blog post yet

API details

API speed

4/5

No public API pricing yet.

Benchmark Results

Nemotron 3 Ultra currently has results on two benchmarks: LongBench v2 (rank 4 of 11, score 61.90) and LiveBench (rank 88 of 115, score 51.78). This page also consolidates core specs, context limits, and API pricing so you can weigh benchmark results against deployment constraints.
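
To compare the two placements on a common scale, each can be turned into a relative position. This sketch assumes the LongBench v2 and LiveBench figures above are a rank followed by the number of models ranked, which the page does not state explicitly. The helper name `top_share` is hypothetical.

```python
def top_share(rank: int, total: int) -> float:
    """Fraction of the leaderboard at or above this rank (lower is better)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return rank / total


# Figures from this page, read as (rank, models ranked): an assumption.
placements = {
    "LongBench v2": (4, 11),
    "LiveBench": (88, 115),
}

for name, (rank, total) in placements.items():
    print(f"{name}: rank {rank} of {total}, top {top_share(rank, total):.0%}")
```

Read this way, the LongBench v2 placement is in the upper part of its leaderboard and the LiveBench placement is in the lower part. The two leaderboards differ in size and in which models they cover, so the shares are only loosely comparable.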