DeepSeek-R1-Distill-Llama-70B

Reasoning model | DeepSeek R1 Distill

Release date: 2025-01-20
Updated: 2025-02-08 12:08:54
Parameters
70B
Context length
128K
Chinese support
Not supported

DeepSeek-R1-Distill-Llama-70B is a reasoning model published by DeepSeek-AI and released on 2025-01-20. It has 70B parameters and a 128K context length, requires about 140GB of storage, is released under the MIT License, and scores 94.50 on MATH-500.

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

Model basics

Reasoning traces
Supported
Thinking modes
Not supported
Context length
128K tokens
Max output length
No data
Model type
Reasoning model
Modality (in / out)
No data
Release date
2025-01-20
Model file size
140GB
MoE architecture
No
Total params / Active params
70B / N/A
Knowledge cutoff
No data
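
The 140GB file size listed above is consistent with storing 70B parameters at 2 bytes each, as with 16-bit weights (BF16 or FP16). The page does not state the precision, so that is an assumption here; the helper name `estimate_weight_size_gb` and the other precisions shown are purely illustrative. A rough sketch of the arithmetic:

```python
def estimate_weight_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Rough size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 70e9  # 70B parameters, as listed on this page

# Assumed precisions; the page does not say which one the 140GB figure uses.
PRECISIONS = [("16-bit (BF16/FP16)", 2), ("8-bit", 1), ("4-bit", 0.5)]

for label, nbytes in PRECISIONS:
    size_gb = estimate_weight_size_gb(NUM_PARAMS, nbytes)
    print(f"{label}: ~{size_gb:.0f} GB")
```

This counts only the raw weights. Tokenizer files, metadata, and any runtime memory (activations, KV cache) are not included, so actual disk and memory needs may differ.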
Open source & experience

Code license
Weights license
MIT License (free for commercial use)
Live demo
No live demo
Official resources

Paper
No paper available
DataLearnerAI blog
No blog post yet
API details

API speed
No data
No public API pricing yet.
Benchmark Results

The benchmark results listed for DeepSeek-R1-Distill-Llama-70B are MATH-500 (score 94.50, rank 27 / 44) and GPQA Diamond (score 65.20, rank 130 / 179). This page also consolidates core specs, context limits, and API details so you can weigh benchmark results and deployment constraints together.

Thinking

General Knowledge

1 evaluation
Benchmark / mode | Score | Rank / total
GPQA Diamond | 65.20 | 130 / 179

Math and Reasoning

1 evaluation
Benchmark / mode | Score | Rank / total
MATH-500 | 94.50 | 27 / 44
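
The two rank / total figures come from leaderboards of different sizes. One option for comparing them is to convert each rank into the share of listed models that this model matches or outranks. This is a minimal sketch, assuming rank 1 is the best score and there are no ties; the function name `share_at_or_below` is hypothetical and not part of the site.

```python
def share_at_or_below(rank: int, total: int) -> float:
    """Fraction of ranked models placed at or below the given rank (rank 1 = best)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank + 1) / total


# Rank / total pairs as listed on this page.
results = {"MATH-500": (27, 44), "GPQA Diamond": (130, 179)}

for name, (rank, total) in results.items():
    share = share_at_or_below(rank, total)
    print(f"{name}: rank {rank} of {total} -> {share:.1%} of models at or below")
```

Under these assumptions, both ranks fall below the midpoint of their leaderboards, even though the MATH-500 score itself is high.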
