DeepSeek-R1-Distill-Qwen-7B

Name: DeepSeek-R1-Distill-Qwen-7B
Author: DeepSeek-AI

Reasoning model | DeepSeek R1 Distill

Release date: 2025-01-20
Updated: 2025-02-27 22:11:47

Parameters

7B

Context length

128K

Chinese support

Supported

DeepSeek-R1-Distill-Qwen-7B is a reasoning model published by DeepSeek-AI and released on 2025-01-20. It has 7B parameters and a 128K context length, requires about 14GB of storage, is distributed under the MIT License, and scores 91.40 on MATH-500.
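The 14GB storage figure is consistent with 7B parameters stored at 2 bytes each. The sketch below shows that arithmetic. It assumes 16-bit weights and decimal gigabytes (1 GB = 10^9 bytes), neither of which is stated on this page, and the function name `estimate_weight_size_gb` is hypothetical. It ignores tokenizer files, metadata, and runtime memory such as the KV cache.

```python
def estimate_weight_size_gb(num_params: float, bytes_per_param: float = 2) -> float:
    """Approximate raw weight size in decimal gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 is an assumption (16-bit weights such as FP16/BF16);
    the page itself only lists the parameter count and the 14GB total.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 7B parameters at an assumed 2 bytes per parameter
    print(estimate_weight_size_gb(7e9))
    # The same parameter count at a hypothetical 4-bit quantization (0.5 bytes each)
    print(estimate_weight_size_gb(7e9, bytes_per_param=0.5))
```

The same helper gives a rough lower bound on the memory needed to hold the weights at other precisions, which is useful when checking whether the model fits on a given device.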

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

Model basics

Reasoning traces

Supported

Thinking modes

Not supported

Context length

128K tokens

Max output length

No data

Model type

Reasoning model

Modality (in / out)

No data

Release date

2025-01-20

Model file size

14GB

MoE architecture

No

Total params / Active params

7B / N/A

Knowledge cutoff

No data

Open source & experience

Code license

MIT License

Weights license

MIT License - free for commercial use

GitHub repo

GitHub link unavailable

Hugging Face

https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Live demo

No live demo

Official resources

Paper

DataLearnerAI blog

No blog post yet

API details

API speed

No data

No public API pricing yet.

Benchmark Results

The benchmark results currently listed for DeepSeek-R1-Distill-Qwen-7B are AIME 2024 (score 53.30, rank 45 of 62), MATH-500 (score 91.40, rank 32 of 44), and GPQA Diamond (score 49.50, rank 158 of 182). This page also consolidates core specs, context limits, and API pricing, so the model can be evaluated on benchmark results and deployment constraints together.