Links to quantized Llama 2 models
TheBlokeAI published quantized versions of Llama 2 within a few hours of its release. That was fast!

| Quantized model | Model link | 4-bit (int4) size | Max memory |
| --- | --- | --- | --- |
| Llama-2-70B-GPTQ | https://huggingface.co/TheBloke/Llama-2-70B-GPTQ | 35.33 GB | |
| Llama-2-70B-chat-GPTQ | https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ | 35.33 GB | |
| Llama-2-13B-chat-GPTQ | https://huggingface.co/TheBloke/Llama-2-13B-chat-GPTQ | 7.26 GB | |
| Llama-2-13B-chat-GGML | https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML | 7.32 GB | 9.82 GB |
| Llama-2-13B-GPTQ | https://huggingface.co/TheBloke/Llama-2-13B-GPTQ | 7.26 GB | |
| Llama-2-13B-GGML | https://huggingface.co/TheBloke/Llama-2-13B-GGML | 7.32 GB | 9.82 GB |
| Llama-2-7B-GGML | https://huggingface.co/TheBloke/Llama-2-7B-GGML | 3.79 GB | 6.29 GB |
| Llama-2-7B-GPTQ | https://huggingface.co/TheBloke/Llama-2-7B-GPTQ | 3.90 GB | |
| Llama-2-7B-Chat-GGML | https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML | 3.79 GB | 6.29 GB |
| Llama-2-7b-Chat-GPTQ | https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ | 3.90 GB | |
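
As a rough sanity check on the 4-bit sizes in the table, the sketch below estimates the weight size from a parameter count and a bit width. The function name `estimate_weight_size_gb` and the nominal parameter counts (7e9, 13e9, 70e9, read off the model names) are assumptions for illustration, not figures from the model cards.

```python
def estimate_weight_size_gb(num_params: float, bits_per_weight: int = 4) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    # Each weight takes bits_per_weight / 8 bytes.
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts, assumed from the model names (not exact).
NOMINAL_PARAMS = {"7B": 7e9, "13B": 13e9, "70B": 70e9}

for name, params in NOMINAL_PARAMS.items():
    size = estimate_weight_size_gb(params)
    print(f"Llama-2-{name}: roughly {size:.2f} GB at 4 bits")
```

The sizes listed in the table are somewhat larger than these estimates. A plausible reason, stated here as an assumption, is that the files also carry quantization metadata and keep some tensors at higher precision.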
