DeepSeek-V2-MoE-236B-Chat
DeepSeek-V2-MoE-236B-Chat is an AI model published by DeepSeek-AI, released on 2024-05-06, as a chat large language model, with 236B parameters and a 128K-token context length, requiring about 472GB of storage, under the DEEPSEEK LICENSE AGREEMENT.
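The ~472GB storage figure is consistent with storing 236B parameters at 16-bit precision (2 bytes per parameter); a quick sanity check:

```python
# Rough storage estimate: 236B parameters at bf16/fp16 (2 bytes each).
params = 236e9
bytes_per_param = 2  # bf16/fp16
total_gb = params * bytes_per_param / 1e9
print(round(total_gb))  # 472
```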
Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
DeepSeek-V2 is an open-source large language model from DeepSeek-AI, the LLM company under the quantitative fund High-Flyer (幻方量化), and among the largest open-source LLMs at release, with 236 billion parameters. It uses a Mixture-of-Experts (MoE) architecture that activates about 21 billion of those parameters per token during inference.
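The sparse-activation idea behind MoE can be sketched with a toy top-k router; all sizes and names below are illustrative, not DeepSeek-V2's actual configuration:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy Mixture-of-Experts layer: route one token to its top-k experts.

    x:       (hidden,) input vector for a single token
    gate_w:  (hidden, n_experts) router weights
    experts: list of (hidden, hidden) expert weight matrices
    Only k experts run per token, so most parameters stay inactive.
    """
    logits = x @ gate_w
    topk = np.argsort(logits)[-k:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()              # softmax over the selected experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
hidden, n_experts = 8, 16
x = rng.standard_normal(hidden)
gate_w = rng.standard_normal((hidden, n_experts))
experts = [rng.standard_normal((hidden, hidden)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (8,)
```

With k=2 of 16 experts active, roughly 1/8 of the expert parameters participate in each forward pass, which is the same principle by which DeepSeek-V2 activates 21B of its 236B parameters.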
DeepSeek-V2-236B-Chat was trained on a dataset of 8.1 trillion tokens, and is the variant further aligned through supervised fine-tuning and reinforcement learning.
