Qwen-1.8B
Qwen-1.8B is an AI model published by Alibaba and released on 2023-11-30. It is a foundation (base) large language model with 1.8B parameters and an 8K-token context length, requiring about 3.6GB of storage, and is distributed under the Tongyi Qianwen RESEARCH LICENSE AGREEMENT.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
An open-source large language model with 1.8 billion parameters, released by Alibaba DAMO Academy.
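The storage figure of about 3.6GB is consistent with 1.8 billion parameters stored at 2 bytes each. The sketch below reproduces that arithmetic. The 16-bit precision is an assumption (the page does not state which precision the figure refers to), and the helper name `estimate_weight_storage_gb` is hypothetical. The 8-bit and 4-bit numbers are rough lower bounds only, since quantized checkpoints also carry scales and other metadata.

```python
def estimate_weight_storage_gb(num_params: float, bytes_per_param: float) -> float:
    """Rough size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 1.8e9  # 1.8 billion parameters, as stated on this page

# Assumed precisions; only the 16-bit case matches the ~3.6GB figure quoted here.
fp16_gb = estimate_weight_storage_gb(NUM_PARAMS, 2.0)   # 16-bit weights
int8_gb = estimate_weight_storage_gb(NUM_PARAMS, 1.0)   # 8-bit weights, lower bound
int4_gb = estimate_weight_storage_gb(NUM_PARAMS, 0.5)   # 4-bit weights, lower bound

print(f"16-bit: {fp16_gb:.1f} GB, 8-bit: {int8_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")
```

The estimate covers weights only; memory needed at inference time (activations, KV cache) is not included.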
Qwen-1.8B is a base model. Alibaba has also open-sourced an aligned, chat-optimized version, Qwen-1.8B-Chat, along with quantized versions. The available models are:
Qwen-1.8B base model: https://huggingface.co/Qwen/Qwen-1_8B
Qwen-1.8B-Chat, the aligned chat-optimized model: https://huggingface.co/Qwen/Qwen-1_8B-Chat
Qwen-1.8B-Chat-Int8, the Int8-quantized aligned model: https://huggingface.co/Qwen/Qwen-1_8B-Chat-Int8
Qwen-1.8B-Chat-Int4, the Int4-quantized aligned model: https://huggingface.co/Qwen/Qwen-1_8B-Chat-Int4
For an introduction to the model, see: https://www.datalearner.com/blog/1051701271552217
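The Hugging Face links above can be turned into repository IDs and loaded with the `transformers` library. The following is a minimal sketch rather than an official recipe: the helper names `repo_id` and `load_model` are hypothetical, and passing `trust_remote_code=True` is an assumption based on these repositories shipping their own modeling code. The quantized variants may need additional packages that are not shown here.

```python
HF_PREFIX = "https://huggingface.co/"

# URLs copied from the list on this page.
MODEL_URLS = {
    "base": "https://huggingface.co/Qwen/Qwen-1_8B",
    "chat-int8": "https://huggingface.co/Qwen/Qwen-1_8B-Chat-Int8",
    "chat-int4": "https://huggingface.co/Qwen/Qwen-1_8B-Chat-Int4",
}


def repo_id(url: str) -> str:
    """Strip the site prefix so that 'Qwen/Qwen-1_8B' style IDs remain."""
    if not url.startswith(HF_PREFIX):
        raise ValueError(f"not a Hugging Face URL: {url}")
    return url[len(HF_PREFIX):]


def load_model(variant: str = "base"):
    """Download and load the tokenizer and model (needs network access)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    rid = repo_id(MODEL_URLS[variant])
    # Assumption: the repository requires its custom code to be trusted.
    tokenizer = AutoTokenizer.from_pretrained(rid, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(rid, trust_remote_code=True)
    return tokenizer, model
```

Calling `load_model("base")` downloads several gigabytes of weights on first use, so it is kept out of module import time.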
Evaluation results for Qwen-1.8B are as follows:
MMLU results:
| Model | Avg. |
|---|---|
| GPT-Neo-1.3B | 24.6 |
| OPT-1.3B | 25.1 |
| Pythia-1B | 26.6 |
| Bloom-1.1B | 26.7 |
| Bloom-1.7B | 27.7 |
| Bloomz-1.7B | 30.7 |
| Bloomz-3B | 33.3 |
| Qwen-1.8B | 45.3 |
Code evaluation results for Qwen-1.8B (HumanEval):
| Model | Pass@1 |
|---|---|
| GPT-Neo-1.3B | 3.66 |
| GPT-Neo-2.7B | 7.93 |
| Pythia-1B | 3.67 |
| Pythia-2.8B | 5.49 |
| Bloom-1.1B | 2.48 |
| Bloom-1.7B | 4.03 |
| Bloom-3B | 6.48 |
| Bloomz-1.7B | 4.38 |
| Bloomz-3B | 6.71 |
| Qwen-1.8B | 15.2 |
Math evaluation results for Qwen-1.8B (GSM8K):
| Model | Acc. |
|---|---|
| GPT-Neo-1.3B | 1.97 |
| GPT-Neo-2.7B | 1.74 |
| Pythia-1B | 2.20 |
| Pythia-2.8B | 3.11 |
| Openllama-3B | 3.11 |
| Bloom-1.1B | 1.82 |
| Bloom-1.7B | 2.05 |
| Bloom-3B | 1.82 |
| Bloomz-1.7B | 2.05 |
| Bloomz-3B | 3.03 |
| Qwen-1.8B | 32.3 |
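To read the three tables side by side, it can help to compute how far Qwen-1.8B sits above the strongest baseline on each benchmark. The sketch below does this with the scores copied from the tables above; the function name `margin_over_best_baseline` is hypothetical, and the margins are plain differences in points, not a statement about statistical significance.

```python
# Scores copied from the three tables on this page.
BENCHMARKS = {
    "MMLU": {
        "GPT-Neo-1.3B": 24.6, "OPT-1.3B": 25.1, "Pythia-1B": 26.6,
        "Bloom-1.1B": 26.7, "Bloom-1.7B": 27.7, "Bloomz-1.7B": 30.7,
        "Bloomz-3B": 33.3, "Qwen-1.8B": 45.3,
    },
    "HumanEval": {
        "GPT-Neo-1.3B": 3.66, "GPT-Neo-2.7B": 7.93, "Pythia-1B": 3.67,
        "Pythia-2.8B": 5.49, "Bloom-1.1B": 2.48, "Bloom-1.7B": 4.03,
        "Bloom-3B": 6.48, "Bloomz-1.7B": 4.38, "Bloomz-3B": 6.71,
        "Qwen-1.8B": 15.2,
    },
    "GSM8K": {
        "GPT-Neo-1.3B": 1.97, "GPT-Neo-2.7B": 1.74, "Pythia-1B": 2.20,
        "Pythia-2.8B": 3.11, "Openllama-3B": 3.11, "Bloom-1.1B": 1.82,
        "Bloom-1.7B": 2.05, "Bloom-3B": 1.82, "Bloomz-1.7B": 2.05,
        "Bloomz-3B": 3.03, "Qwen-1.8B": 32.3,
    },
}

TARGET = "Qwen-1.8B"


def margin_over_best_baseline(scores: dict, target: str = TARGET):
    """Return (best baseline name, target score minus that baseline's score)."""
    baselines = {name: s for name, s in scores.items() if name != target}
    # On ties, max() keeps the first model in table order.
    best = max(baselines, key=baselines.get)
    return best, round(scores[target] - baselines[best], 2)


margins = {name: margin_over_best_baseline(s) for name, s in BENCHMARKS.items()}
for name, (best, gap) in margins.items():
    print(f"{name}: {TARGET} leads {best} by {gap} points")
```

Note that several baselines in the tables (for example Bloomz-3B and Pythia-2.8B) have more parameters than Qwen-1.8B, so the margins compare against larger models as well.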
