DeepSeek-V3

Name: DeepSeek-V3
Author: DeepSeek-AI

Chat large language model

Release date: 2024-12-26
Updated: 2025-03-21 11:14:41

Parameters

681.0B

Context length

128K

Chinese support

Supported

Reasoning ability

DeepSeek-V3 is a chat large language model published by DeepSeek-AI and released on 2024-12-26. It has 681.0B parameters and a 128K-token context length, requires about 687.9 GB of storage, and its weights are released under the DEEPSEEK LICENSE AGREEMENT.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces

Not supported

Thinking modes

Thinking modes not supported

Context length

128K tokens

Max output length

No data

Model type

Chat large language model

Release date

2024-12-26

Model file size

687.9 GB

MoE architecture

Total params / Active params

681.0B / 37B
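As a rough sanity check on the figures above, the listed file size can be divided by the total parameter count to estimate the average storage per parameter. This is a minimal sketch, not a statement about how the weights are actually stored: it reads the page's 6810.0亿 as 681.0 billion (one 亿 is 10^8) and assumes "GB" means 10^9 bytes. The function name `bytes_per_parameter` is hypothetical.

```python
# Rough storage-per-parameter estimate from the figures listed on this page.
TOTAL_PARAMS = 6810.0 * 1e8   # 6810.0亿: one 亿 is 10^8, so 681.0 billion
FILE_SIZE_GB = 687.9          # model file size as listed above


def bytes_per_parameter(file_size_gb: float, total_params: float,
                        bytes_per_gb: float = 1e9) -> float:
    """Average bytes stored per parameter (assumes decimal GB by default)."""
    return file_size_gb * bytes_per_gb / total_params


if __name__ == "__main__":
    ratio = bytes_per_parameter(FILE_SIZE_GB, TOTAL_PARAMS)
    print(f"{ratio:.2f} bytes per parameter")
```

A result close to one byte per parameter is what one would expect if most weights were stored in an 8-bit format, but the page itself does not state the storage precision.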

Knowledge cutoff

No data

Open source & experience

Code license

MIT License

Weights license

DEEPSEEK LICENSE AGREEMENT (free for commercial use)

GitHub repo

https://github.com/deepseek-ai/DeepSeek-V3

Hugging Face

https://huggingface.co/deepseek-ai/DeepSeek-V3

Live demo

No live demo

Official resources

Paper

Introducing DeepSeek-V3

DataLearnerAI blog

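A new milestone for open-source large models: DeepSeek AI open-sources the 651-billion-parameter DeepSeek V3, which scores significantly better than the 405-billion-parameter Llama3.1 405B and rivals Sonnet 3.5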

API details

API speed

No data

No public API pricing yet.

Benchmark Results

DeepSeek-V3 currently shows benchmark results led by BBH (3 / 20, score 92.30), MATH (7 / 42, score 87.80), HumanEval (9 / 39, score 89). This page also consolidates core specs, context limits, and API pricing so you can evaluate the model from benchmark results and deployment constraints together.
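The page does not say how the "led by" benchmarks are chosen. One plausible reading is that they are the entries with the best rank relative to the number of models ranked. The sketch below tests that reading against the rank/total pairs listed in this section; the function names and the selection rule are assumptions, not the site's documented method.

```python
# (benchmark, rank, total) triples copied from the results listed on this page.
RESULTS = [
    ("BBH", 3, 20), ("MMLU", 17, 65), ("MMLU Pro", 78, 124),
    ("GPQA Diamond", 139, 175), ("GPQA", 5, 14), ("HumanEval", 9, 39),
    ("LiveCodeBench", 105, 118), ("MATH", 7, 42), ("MATH-500", 39, 44),
    ("AIME 2024", 52, 62), ("FrontierMath", 46, 57), ("SimpleQA", 29, 45),
    ("Creative Writing", 15, 23), ("Simple Bench", 27, 27),
]


def relative_rank(rank: int, total: int) -> float:
    """Rank as a fraction of the field; smaller means closer to the top."""
    return rank / total


def top_benchmarks(results, k=3):
    """Names of the k benchmarks with the smallest relative rank (assumed rule)."""
    ordered = sorted(results, key=lambda r: relative_rank(r[1], r[2]))
    return [name for name, _, _ in ordered[:k]]


if __name__ == "__main__":
    print(top_benchmarks(RESULTS))  # ['BBH', 'MATH', 'HumanEval']
```

Under this rule the top three match the benchmarks named in the summary. That is consistent with the assumed rule but does not confirm it.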

Overall evaluation

5 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| BBH | Standard Mode | 92.30 | 3 / 20 |
| MMLU | Standard Mode | 88.50 | 17 / 65 |
| MMLU Pro | Standard Mode | 75.90 | 78 / 124 |
| GPQA Diamond | Standard Mode | 59.10 | 139 / 175 |
| GPQA | Standard Mode | 59.10 | 5 / 14 |

Coding and software engineering

2 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| HumanEval | Standard Mode | 89 | 9 / 39 |
| LiveCodeBench | Standard Mode | 34.60 | 105 / 118 |

Mathematical reasoning

4 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| MATH | Standard Mode | 87.80 | 7 / 42 |
| MATH-500 | Standard Mode | 87.80 | 39 / 44 |
| AIME 2024 | Standard Mode | | 52 / 62 |
| FrontierMath | Standard Mode | 1.70 | 46 / 57 |

General knowledge Q&A

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| SimpleQA | Standard Mode | 24.90 | 29 / 45 |

Writing and creative work

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| Creative Writing | Standard Mode | 81.60 | 15 / 23 |

Commonsense reasoning

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| Simple Bench | Standard Mode | 18.90 | 27 / 27 |

Publisher

DeepSeek-AI

Model Overview

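DeepSeek V3 is an open-source large language model from DeepSeek AI and the third generation of its open-source LLMs. It uses a Mixture-of-Experts (MoE) architecture with 681 billion total parameters, of which 37 billion are activated on each inference step. The model was trained on 14.8 trillion tokens using 2.788 million H800 GPU hours, and it scores strongly across its reported evaluations.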

This version is the one produced after post-training.
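The overview figures can be combined into two back-of-the-envelope ratios: the share of parameters active per inference step, and the average number of training tokens processed per GPU hour. The sketch below uses only the numbers stated in the overview (681 billion total and 37 billion active parameters, 14.8 trillion tokens, 2.788 million H800 GPU hours). The function names are hypothetical, and the throughput is an average over the whole stated budget, not a measured rate for any training stage.

```python
# Figures as stated in the model overview on this page.
TOTAL_PARAMS = 681e9     # 6810亿 total parameters
ACTIVE_PARAMS = 37e9     # 370亿 parameters activated per inference step
TRAIN_TOKENS = 14.8e12   # 14.8万亿 training tokens
GPU_HOURS = 2.788e6      # 278.8万 H800 GPU hours


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that are active at once."""
    return active / total


def tokens_per_gpu_hour(tokens: float, gpu_hours: float) -> float:
    """Average training tokens processed per GPU hour over the full budget."""
    return tokens / gpu_hours


if __name__ == "__main__":
    print(f"active share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
    print(f"tokens per GPU hour: {tokens_per_gpu_hour(TRAIN_TOKENS, GPU_HOURS):,.0f}")
```

With these inputs, roughly 5.4% of the parameters are active at a time, and the average works out to about 5.3 million tokens per GPU hour.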
