DataLearner logoDataLearnerAI
Latest AI Insights
Model Leaderboards
Benchmarks
Model Directory
Model Comparison
Resource Center
Tools
LanguageEnglish
DataLearner logoDataLearner AI

A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

Products

  • Leaderboards
  • Model comparison
  • Datasets

Resources

  • Tutorials
  • Editorial
  • Tool directory

Company

  • About
  • Privacy policy
  • Data methodology
  • Contact

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.

Privacy policyTerms of service
Page navigation
目录
Model catalogDeepSeek-V4-Pro
DE

DeepSeek-V4-Pro

推理大模型

DeepSeek V4 Pro

Release date: 2026-04-24更新于: 2026-04-24 13:39:15.488知识截止: 2025-05484
Live demoGitHubHugging FaceCompare
Parameters
16000.0亿
Context length
1M
Chinese support
Supported
Reasoning ability

DeepSeek V4 Pro is an AI model published by DeepSeek-AI, released on 2026-04-24, for 推理大模型, with 16000.0B parameters, and 1M tokens context length, under the MIT License license.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators. Learn about our data methodology

DeepSeek-V4-Pro

Model basics

Reasoning traces
Supported
Thinking modes
Thinking Level · Max (Default)Standard ModeThinking Level · High
Context length
1M tokens
Max output length
384000 tokens
Model type
推理大模型
Release date
2026-04-24
Model file size
No data
MoE architecture
Yes
Total params / Active params
16000.0B / 490B
Knowledge cutoff
2025-05
DeepSeek-V4-Pro

Open source & experience

Code license
MIT License
Weights license
MIT License- 免费商用授权
GitHub repo
GitHub link unavailable
Hugging Face
https://huggingface.co/collections/deepseek-ai/deepseek-v4
Live demo
https://chat.deepseek.com
DeepSeek-V4-Pro

Official resources

Paper
DeepSeek-V4 Technical Report
DataLearnerAI blog
No blog post yet
DeepSeek-V4-Pro

API details

API speed
4/5
💡Default unit: $/1M tokens. If vendors use other units, follow their published pricing.
Learn about pricing modes
Standard
TypeConditionInputOutput
Text-$1.74/ 1M$3.48/ 1M
Cache PricingPrompt Cache
TypeTTLWriteRead
Text1d$1.74/ 1M$0.145/ 1M
DeepSeek-V4-Pro

Benchmark Results

DeepSeek-V4-Pro currently shows benchmark results led by LiveCodeBench (1 / 118, score 93.50), IMO-AnswerBench (1 / 17, score 89.80), SWE-bench Verified (7 / 103, score 80.60). This page also consolidates core specs, context limits, and API pricing so you can evaluate the model from benchmark results and deployment constraints together.

Thinking
All modesNormalThinking
Thinking mode details (2)
All thinking modesDefault (Max)Deep
Tool usage
All modesWith toolsNo tools
Internet
All modesOfflineInternet enabled

编程与软件工程

3 evaluations
Benchmark / mode
Score
Rank/total
SWE-bench Verified
HighTools
79.40
14 / 103
SWE-bench Multilingual
HighTools
74.10
4 / 17
SWE-Bench Pro - Public
HighTools
54.40
15 / 36

AI Agent - 工具使用

1 evaluations
Benchmark / mode
Score
Rank/total
Terminal Bench 2.0
HighTools
63.30
12 / 43
View benchmark analysisCompare with other models
DeepSeek-V4-Pro

Publisher

DeepSeek-AI
DeepSeek-AI
View publisher details
DeepSeek V4 Pro

Model Overview

DeepSeek-V4-Pro 预览版:性能比肩顶级闭源模型

DeepSeek-V4-Pro 是 DeepSeek 于 2026 年 4 月 24 日正式发布并开源的旗舰级大语言模型预览版,属于 DeepSeek-V4 系列的高端型号。该模型旨在将百万级上下文、强化的 Agent 能力和顶级推理性能融为一体,打破开源模型与闭源模型之间的能力边界。

架构与技术规格

DeepSeek-V4-Pro 采用混合专家(MoE)架构,总参数量达到 1.6 万亿(1.6T),每次推理激活参数约 490 亿(49B)。其上下文窗口原生支持 100 万 token(1M),最大输出长度可达 384K token。为实现高效长上下文处理,V4 引入了创新的混合注意力机制,融合压缩稀疏注意力(CSA)与重压缩注意力(HCA),并结合 DSA 稀疏注意力技术。在 100 万 token 的极端场景下,该架构使单 token 推理计算量降至前代 V3.2 的 27%,KV 缓存占用仅为 10%。模型预训练数据量达 33 万亿 token,并采用了 Muon 优化器、流形约束超连接(mHC)等新型训练策略。

核心能力与支持模态

DeepSeek-V4-Pro 目前为纯文本模型,不支持视觉输入或多模态识别。其核心能力聚焦于三大领域:Agent 能力经过专项优化,在 Agentic Coding 评测中达到开源模型最佳水平,内部测试表明其编码体验优于 Sonnet 4.5,交付质量接近 Opus 4.6 非思考模式;世界知识大幅领先所有开源模型,仅次于 Gemini-Pro-3.1 等顶尖闭源模型;推理性能在数学、STEM 和竞赛代码评测中超越所有公开评测的开源模型,比肩世界顶级闭源模型。模型同时支持非思考模式与思考模式,用户可通过参数调节思考强度以应对复杂推理任务。

性能评价

根据官方公布的基准测试,DeepSeek-V4-Pro 在 Agent 任务、知识问答和推理等维度均显著超越前代模型和同期开源竞品,与 GPT-5.4 xHigh、Gemini-3.1-Pro 等闭源顶尖模型互有胜负。DeepSeek 已将其作为内部 Agentic Coding 的首选模型,并针对 Claude Code、OpenClaw、OpenCode、CodeBuddy 等主流 Agent 产品进行了深度适配优化。

应用场景与限制

官方推荐场景包括:复杂逻辑推理、深度代码生成、大型文档分析、自动化 Agent 工作流等。当前局限在于:尚不支持多模态输入;在深度思考模式下与 Opus 4.6 仍有差距;作为预览版,未来 API 稳定性和功能可能存在调整。

访问方式与许可

模型已全面开源,权重和技术报告可通过 Hugging Face 和魔搭社区获取。许可证采用 MIT License,允许商用、修改和再分发。API 服务已同步上线,开发者通过修改 model_name 为 deepseek-v4-pro 即可调用,兼容 OpenAI ChatCompletions 和 Anthropic 接口格式。

DataLearner 官方微信

欢迎关注 DataLearner 官方微信,获得最新 AI 技术推送

DataLearner 官方微信二维码