@@ -70,6 +70,7 @@ Unified Checkpoint 大模型存储格式在模型参数分布上支持动态扩
| [LLama2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-2-7b, meta-llama/Llama-2-7b-chat, meta-llama/Llama-2-13b, meta-llama/Llama-2-13b-chat, meta-llama/Llama-2-70b, meta-llama/Llama-2-70b-chat |
| [LLama3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct |
| [LLama3.1](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Meta-Llama-3.1-8B, meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3.1-405B, meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Llama-Guard-3-8B |
+ | [LLama3.2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-Guard-3-1B |
| [Baichuan](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/baichuan) | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat |
| [Baichuan2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/baichuan) | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat |
| [Bloom](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/bloom) | bigscience/bloom-560m, bigscience/bloom-560m-bf16, bigscience/bloom-1b1, bigscience/bloom-3b, bigscience/bloom-7b1, bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-3b, bigscience/bloomz-7b1-mt, bigscience/bloomz-7b1-p3, bigscience/bloomz-7b1, bellegroup/belle-7b-2m |
@@ -85,7 +86,7 @@ Unified Checkpoint 大模型存储格式在模型参数分布上支持动态扩
| [Qwen2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct |
| [Qwen2-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2-Math-1.5B, Qwen/Qwen2-Math-1.5B-Instruct, Qwen/Qwen2-Math-7B, Qwen/Qwen2-Math-7B-Instruct, Qwen/Qwen2-Math-72B, Qwen/Qwen2-Math-72B-Instruct, Qwen/Qwen2-Math-RM-72B |
| [Qwen2.5](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-0.5B, Qwen/Qwen2.5-0.5B-Instruct, Qwen/Qwen2.5-1.5B, Qwen/Qwen2.5-1.5B-Instruct, Qwen/Qwen2.5-3B, Qwen/Qwen2.5-3B-Instruct, Qwen/Qwen2.5-7B, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-14B, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-32B, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B, Qwen/Qwen2.5-72B-Instruct |
- | [Qwen2.5-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Math-1.5B, Qwen/Qwen2.5-Math-1.5B-Instruct, Qwen/Qwen2.5-Math-7B, Qwen/Qwen2.5-Math-7B-Instruct, Qwen/Qwen2.5-Math-72B, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Math-RM-72B |
+ | [Qwen2.5-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Math-1.5B, Qwen/Qwen2.5-Math-1.5B-Instruct, Qwen/Qwen2.5-Math-7B, Qwen/Qwen2.5-Math-7B-Instruct, Qwen/Qwen2.5-Math-72B, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Math-RM-72B |
| [Qwen2.5-Coder](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Coder-1.5B, Qwen/Qwen2.5-Coder-1.5B-Instruct, Qwen/Qwen2.5-Coder-7B, Qwen/Qwen2.5-Coder-7B-Instruct |
| [Yuan2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/yuan/) | IEITYuan/Yuan2-2B, IEITYuan/Yuan2-51B, IEITYuan/Yuan2-102B |
@@ -96,9 +97,6 @@ Unified Checkpoint 大模型存储格式在模型参数分布上支持动态扩
| :---------------------:| :--------:| :------------:| :--------:| :------------:| :------:| :------:| :----------:|
| | | Basic capability | Sequence parallelism | stage1 | stage2 | stage3 | |
| Llama | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | Llama2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | Llama3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | Llama3.1 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Qwen | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Qwen1.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Qwen2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
@@ -119,7 +117,7 @@ Unified Checkpoint 大模型存储格式在模型参数分布上支持动态扩
| Model name / capability support | Pretrain | SFT | LoRA | Prefix Tuning | DPO | RLHF | Quantization | Torch convert |
| :------------------:| :--------:| :---:| :----:| :-------------:| :---:| :----:| :------------:| :-------------:|
- | LLaMA | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+ | Llama | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Qwen | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
| Mixtral | ✅ | ✅ | ✅ | ❌ | 🚧 | 🚧 | 🚧 | 🚧 |
| Mistral | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
@@ -151,7 +149,7 @@ Unified Checkpoint 大模型存储格式在模型参数分布上支持动态扩
* python >= 3.8
* paddlepaddle >= 3.0.0b0
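
- If you have not yet installedPaddlePaddle, please refer to the [PaddlePaddle official website](https://www.paddlepaddle.org.cn/) to install it.
+ If you have not yet installed PaddlePaddle, please refer to the [PaddlePaddle official website](https://www.paddlepaddle.org.cn/) to install it.

### pip installation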