DeepSeek-V3
DeepSeek-V3 is an AI model published by DeepSeek-AI and released on 2024-12-26. It is a chat large language model with 681B parameters and a 128K-token context length. It requires about 687.9 GB of storage and is distributed under the DEEPSEEK LICENSE AGREEMENT.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
DeepSeek-V3's strongest benchmark results are currently BBH (rank 3 of 20, score 92.30), MATH (rank 7 of 42, score 87.80), and HumanEval (rank 9 of 39, score 89). This page also consolidates core specs, context limits, and API pricing, so you can weigh benchmark results and deployment constraints together.
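DeepSeek-V3 is the third-generation open-source large language model from DeepSeek AI. It uses a Mixture-of-Experts (MoE) architecture with 681 billion total parameters, of which 37 billion are activated on each inference pass. The model was trained on 14.8 trillion tokens using 2.788 million H800 GPU hours, and it scores strongly across benchmark evaluations.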
This version is the post-trained release, that is, the model after Post Training.
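The headline figures above are easier to compare once they are turned into ratios. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted on this page. The constant and function names are illustrative, and it assumes that "GB" means decimal gigabytes (10^9 bytes).

```python
# Figures as quoted on this page; names are illustrative, not from any DeepSeek API.
TOTAL_PARAMS = 681e9       # total parameters (681 billion)
ACTIVE_PARAMS = 37e9       # parameters activated per inference pass
TRAIN_TOKENS = 14.8e12     # training tokens (14.8 trillion)
H800_GPU_HOURS = 2.788e6   # H800 GPU hours used for training
STORAGE_BYTES = 687.9e9    # assumption: 687.9 GB read as decimal gigabytes


def summarize():
    """Derive simple ratios from the quoted specs."""
    return {
        # Share of the MoE model's weights that take part in one forward pass.
        "active_fraction": ACTIVE_PARAMS / TOTAL_PARAMS,
        # Average on-disk size of one parameter.
        "bytes_per_param": STORAGE_BYTES / TOTAL_PARAMS,
        # Training throughput averaged over the whole run.
        "tokens_per_gpu_hour": TRAIN_TOKENS / H800_GPU_HOURS,
        # Training tokens seen per total parameter.
        "tokens_per_param": TRAIN_TOKENS / TOTAL_PARAMS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.4f}")
```

Under these assumptions, about 5.4% of the parameters are active per pass, and the checkpoint works out to roughly 1.01 bytes per parameter, which would be consistent with 8-bit weights. Average training throughput comes to roughly 5.3 million tokens per H800 GPU hour.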
