DeepSeek-R1-Distill-Llama-70B

Name: DeepSeek-R1-Distill-Llama-70B
Author: DeepSeek-AI

Reasoning model

Release date: 2025-01-20
Updated: 2025-02-08 12:08:54

DeepSeek-R1-Distill-Llama-70B is a reasoning model published by DeepSeek-AI and released on 2025-01-20. It has 70B parameters and a 128K-token context length, requires about 140GB of storage, and is distributed under the MIT License.
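The storage figure lines up with the parameter count if each weight is stored in 16 bits (2 bytes). The page does not state the precision, so treat the 2-byte default below as an assumption; the helper name `estimate_weight_size_gb` is hypothetical and only illustrates the arithmetic.

```python
def estimate_weight_size_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate size of the model weights in decimal gigabytes.

    bytes_per_param=2.0 assumes 16-bit weights (an assumption, not stated
    on this page). Real checkpoints add a small amount of metadata on top.
    """
    return num_params * bytes_per_param / 1e9


# 70B parameters at 2 bytes each
print(f"{estimate_weight_size_gb(70e9):.0f} GB")  # prints "140 GB"

# The same parameter count at a hypothetical 4-bit quantization (0.5 bytes each)
print(f"{estimate_weight_size_gb(70e9, bytes_per_param=0.5):.0f} GB")  # prints "35 GB"
```

This is only a rough sizing aid: it covers the weights on disk, not the memory needed for activations or the KV cache at a 128K-token context.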

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

Model basics

Reasoning traces: Supported
Thinking modes: Not supported
Context length: 128K tokens
Max output length: No data
Model type: Reasoning model
Release date: 2025-01-20
Model file size: 140GB
MoE architecture: No
Total params / Active params: 70B / N/A
Knowledge cutoff: No data
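Because reasoning traces are supported, downstream code usually needs to separate the trace from the final answer. The sketch below assumes the trace is wrapped in `<think>...</think>` tags, the convention used by DeepSeek-R1 family outputs; this page does not specify the delimiter, and some serving stacks return the trace in a separate field instead. The function name `split_reasoning` is hypothetical.

```python
import re
from typing import Tuple

# Assumption: the reasoning trace is delimited by <think>...</think>.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no trace is present."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


sample = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # prints "2 + 2 is 4."
print(answer)     # prints "The answer is 4."
```

Keeping the trace separate makes it easy to log it for debugging while showing users only the answer.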

Open source & experience

Code license: MIT License
Weights license: MIT License (free for commercial use)
GitHub repo: https://github.com/deepseek-ai/DeepSeek-R1
Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
Live demo: No live demo

Benchmark Results

DeepSeek-R1-Distill-Llama-70B has two benchmark results on record: MATH-500 (score 94.50, rank 27 of 44) and GPQA Diamond (score 65.20, rank 126 of 175). This page also consolidates core specs, context limits, and API pricing, so the model can be evaluated on benchmark results and deployment constraints together.

General Knowledge (1 evaluation)

Benchmark / mode | Score | Rank / total
GPQA Diamond (Standard Mode) | 65.20 | 126 / 175

Math and Reasoning (1 evaluation)

Benchmark / mode | Score | Rank / total
MATH-500 (Standard Mode) | 94.50 | 27 / 44
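The rank/total figures are easier to compare across leaderboards of different sizes when expressed as a fraction. A minimal sketch, assuming rank 1 is the best position on each leaderboard (the page does not say so explicitly); `rank_fraction` is a hypothetical helper name.

```python
def rank_fraction(rank: int, total: int) -> float:
    """Fraction of the leaderboard at or above this rank (smaller is better).

    Assumes rank 1 is the top position.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return rank / total


results = [("MATH-500", 27, 44), ("GPQA Diamond", 126, 175)]
for name, rank, total in results:
    print(f"{name}: rank {rank} of {total}, top {rank_fraction(rank, total):.1%}")
# MATH-500: rank 27 of 44, top 61.4%
# GPQA Diamond: rank 126 of 175, top 72.0%
```

The two leaderboards list different sets of models, so these fractions describe position within each list rather than a like-for-like comparison between the two benchmarks.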
