GPT-3

Name: Generative Pre-trained Transformer 3
Author: OpenAI

Release date: 2020-05-28
Updated: 2023-08-16 22:09:19

Parameters

175 billion

Chinese support

Supported

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces

Not supported

Context length

2K tokens

Max output length

No data

Model type

Foundation large language model

Release date

2020-05-28

Model file size

No data

MoE architecture

Total params / Active params

175B / N/A

Knowledge cutoff

No data

Inference modes

No mode data

Open source & experience

Code license

Not open source

Weights license

Not open source

GitHub repo

GitHub link unavailable

Hugging Face

Hugging Face link unavailable

Live demo

No live demo

Official resources

Paper

Language Models are Few-Shot Learners

DataLearnerAI blog

No blog post yet

API details

API speed

No data

No public API pricing yet.

Benchmark Scores

No benchmark data to show.

Publisher

OpenAI

Model Overview

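GPT-3 is one of the most powerful large pre-trained language models that OpenAI has released to date. It is OpenAI's third-generation autoregressive language model. Compared with GPT-2 (1.5 billion parameters), GPT-3 is roughly two orders of magnitude larger, at 175 billion parameters.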

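Because GPT-3 is so capable, OpenAI judged that it could be misused for harmful purposes and therefore did not release the model publicly. That decision runs against OpenAI's founding identity as a non-profit, open AI research organization, and it initially drew heavy criticism. Since then it has become clear that the model can indeed be used to cause harm, and other major companies have also stopped releasing models of this kind, so the criticism has gradually subsided.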

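GPT-3 was trained on a large volume of unlabeled internet text. According to Wikipedia, the datasets and their weights in the training mix are as follows: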

Dataset	Tokens	Weight in training mix
Common Crawl	410 billion	60%
WebText2	19 billion	22%
Books1	12 billion	8%
Books2	55 billion	8%
Wikipedia	3 billion	3%
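The training-mix weights are not proportional to dataset size, which is easier to see with a little arithmetic. The sketch below uses only the numbers from the table; the function name `sampling_ratios` and the "weight divided by token share" ratio are illustrative choices made here, not anything defined by OpenAI. Note that the listed weights sum to 101% because they are rounded.

```python
# Token counts (in billions) and training-mix weights (in percent),
# copied from the table above.
DATASETS = {
    "Common Crawl": (410, 60),
    "WebText2": (19, 22),
    "Books1": (12, 8),
    "Books2": (55, 8),
    "Wikipedia": (3, 3),
}


def sampling_ratios(datasets):
    """Return {name: (token_share_pct, mix_weight_pct, weight / share)}.

    A ratio below 1 means the dataset is sampled less often than its size
    alone would suggest; a ratio above 1 means it is sampled more often.
    """
    total_tokens = sum(tokens for tokens, _ in datasets.values())
    result = {}
    for name, (tokens, weight) in datasets.items():
        share = 100.0 * tokens / total_tokens
        result[name] = (share, weight, weight / share)
    return result


if __name__ == "__main__":
    for name, (share, weight, ratio) in sampling_ratios(DATASETS).items():
        print(f"{name:<13} share={share:5.1f}%  weight={weight:3d}%  ratio={ratio:4.2f}")
```

Under these numbers, Common Crawl holds about 82% of all tokens but only 60% of the mix, while Wikipedia holds about 0.6% of the tokens and 3% of the mix, so the smaller curated sources are sampled well above their share of the raw data.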

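Because GPT-3's training data is so broad, the model does not need further training for individual language tasks. A task can instead be specified with a few examples in the prompt, which is the few-shot setting named in the paper's title.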
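As a rough illustration of using the model without task-specific training, the sketch below assembles a few-shot prompt as plain text. The helper name `build_few_shot_prompt` and the `Input:` / `Output:` labels are hypothetical choices made for this example, and the call that would send the prompt to a model is deliberately left out, since this page documents no public API.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a plain-text prompt: an instruction, worked examples, then the query.

    `examples` is a list of (input_text, output_text) pairs. The prompt ends
    with an open "Output:" line for the model to complete.
    """
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Translate English to French.",
        [("sea otter", "loutre de mer"), ("cheese", "fromage")],
        "peppermint",
    )
    print(prompt)
```

Switching to a different task only means changing the instruction and the example pairs; no model weights are updated.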

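GPT-3 can do many things on its own, and OpenAI has also fine-tuned it for different domains, producing many domain-specific applications, including code generation and image generation.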

Foundation model

GPT
