DeepSeek-V2-236B

Name: DeepSeek-V2-MoE-236B
Author: DeepSeek-AI

Foundation model | DeepSeek V2

Release date: 2024-05-06
Updated: 2024-05-07 08:37:17

Parameters

236B

Context length

128K

Chinese support

Supported

DeepSeek-V2-MoE-236B is a foundation model published by DeepSeek-AI and released on 2024-05-06. It has 236B parameters and a 128K context length, requires about 472GB of storage, and is distributed under the DEEPSEEK LICENSE AGREEMENT.
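
The storage figure is consistent with storing each of the 236B parameters in 2 bytes. The page does not state the weight precision, so the sketch below assumes 16-bit weights (for example BF16 or FP16) and decimal gigabytes, and it ignores any file metadata overhead. The function name `estimate_weight_size_gb` is hypothetical and used only for illustration.

```python
def estimate_weight_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate checkpoint size in decimal gigabytes (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same width
    (2 bytes by default, i.e. a 16-bit format).
    """
    return num_params * bytes_per_param / 1e9


# 236B parameters at an assumed 2 bytes each
size_gb = estimate_weight_size_gb(236e9)
print(f"{size_gb:.0f} GB")
```

Under the same assumption, passing `bytes_per_param=1` gives a rough estimate for an 8-bit quantized copy of the weights.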

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

Model basics

Reasoning traces

Not supported

Thinking modes

Not supported

Context length

128K tokens

Max output length

No data

Model type

Foundation model

Modality (in / out)

No data

Release date

2024-05-06

Model file size

472GB

MoE architecture

Total params / Active params

236B / N/A

Knowledge cutoff

No data

Open source & experience

Code license

MIT License

Weights license

DEEPSEEK LICENSE AGREEMENT (free for commercial use)

GitHub repo

https://github.com/deepseek-ai/DeepSeek-V2

Hugging Face

https://huggingface.co/deepseek-ai/DeepSeek-V2

Live demo

No live demo

Official resources

Paper

No paper available

DataLearnerAI blog

No blog post yet

API details

API speed

No data

No public API pricing yet.

Benchmark Results

No benchmark data to show.

Publisher

DeepSeek-AI
