Commit f0d1394

up
2 parents b42068f + 943a0ff commit f0d1394

44 files changed: 1455 additions and 1477 deletions

Makefile

Lines changed: 3 additions & 2 deletions
@@ -35,8 +35,9 @@ lint:
 test: unit-test
 
 unit-test:
-	PYTHONPATH=$(shell pwd) pytest \
-		-n auto --cov paddlenlp \
+	PYTHONPATH=$(shell pwd) pytest -v \
+		-n auto \
+		--cov paddlenlp \
 		--cov-report xml:coverage.xml
 
 # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

applications/question_answering/unsupervised_qa/README.md

Lines changed: 4 additions & 5 deletions
@@ -22,7 +22,7 @@
 - [问题生成](#问题生成)
 - [过滤模型](#过滤模型)
 - [语义索引和召回模型](#语义索引和召回模型)
-- [排序模型](排序模型)
+- [排序模型](#排序模型)
 - [References](#References)
 
 ## 简介
@@ -74,7 +74,7 @@
 
 **语义索引**:针对给定问答对语料,我们基于RocketQA(即`rocketqa-zh-base-query-encoder`)对问答对进行语义向量化,并通过ElasticSearch的ANN服务构建索引库。
 
-**召回排序**:给定用户查询,我们给予RocketQA的query-encoder和cross-encoder分别进行召回和排序操作,得到目标的问答对,从而返回给用户查询结果。
+**召回排序**:给定用户查询,我们基于RocketQA的query-encoder和cross-encoder分别进行召回和排序操作,得到目标的问答对,从而返回给用户查询结果。
 
 **Pipelines**:由于本项目设计的模块较多,我们使用PaddleNLP Pipelines进行模块的组合和项目的构建。大体来说,我们的Pipelines包含两个具体的pipeline和三个服务。两个pipeline分别是qa_generation_pipeline和dense_faq_pipeline;三个服务分别是基于ElasticSearch的ANN在线索引库服务,基于RestAPI的模型后端服务以及基于Streamlit的前端WebUI服务。
 
@@ -124,7 +124,6 @@ python run_pipelines_example.py --device cpu --source_file data/source_file.txt
 
 
 ## 可视化无监督检索式问答系统
-<!-- **【注意】** 关于构建Web可视化问答对自动生成智能检索式问答系统,请参考[Pipelines-无监督智能检索问答系统](../../../pipelines/examples/unsupervised_question_answering/README.md)。 -->
 开发者可以基于Pipelines进一步构建Web可视化的无监督检索式问答系统,其效果如下,
 <div align="center">
 <img src="https://user-images.githubusercontent.com/20476674/199488926-c64d3f4e-8117-475f-afe6-b02088105d09.gif" >
@@ -217,7 +216,7 @@ python -u run_corpus_preparation.py \
 <!-- ### 检索模型训练部署
 在已有问答语料库和语义检索模型前提下,模型部署首先要把语义检索模型由动态图转换成静态图,然后转换成serving的格式,此外还需要基于Milvus和问答语料库构建语义检索引擎。
 
-关于如何对语义检索模型进行无监督训练,以及针对给定问答语料库进行模型部署,请参考[faq_system](../README.md)。 -->
+关于如何对语义检索模型进行无监督训练,以及针对给定问答语料库进行模型部署,请参考faq_system -->
 
 ### 基于Pipelines构建问答系统
 本项目提供了基于Pipelines的低成本构建问答对自动生成智能检索问答系统的能力。开发者只需要提供非结构化的纯文本,就可以使用本项目预制的问答对生成模块生成大量的问答对,并基于此快速搭建一个针对自己业务的检索问答系统,并可以提供Web可视化产品服务。Web可视化产品服务支持问答检索、在线问答对生成,在线文件上传和解析,在线索引库更新等功能,用户也可根据需要自行调整。具体的构建流程请参考[Pipelines-无监督智能检索问答系统](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/pipelines/examples/unsupervised-question-answering)
@@ -231,7 +230,7 @@ python -u run_corpus_preparation.py \
 #### 自定义数据
 在许多情况下,我们需要使用本地数据集来微调模型从而得到定制化的能力,让生成的问答对更接近于理想分布,本项目支持使用固定格式本地数据集文件进行微调。
 
-这里我们提供预先标注好的文件样例[train.json](https://paddlenlp.bj.bcebos.com/applications/unsupervised_qa/train.json)[dev.json](https://paddlenlp.bj.bcebos.com/applications/unsupervised_qa/test.json),开发者可直接下载放入`data`目录,此外也可自行构建本地数据集,具体来说,本地数据集主要包含以下文件:
+这里我们提供预先标注好的文件样例[train.json](https://paddlenlp.bj.bcebos.com/applications/unsupervised_qa/train.json)[dev.json](https://paddlenlp.bj.bcebos.com/applications/unsupervised_qa/dev.json),开发者可直接下载放入`data`目录,此外也可自行构建本地数据集,具体来说,本地数据集主要包含以下文件:
 ```text
 data
 ├── train.json # 训练数据集文件

applications/question_answering/unsupervised_qa/run.sh

Lines changed: 0 additions & 102 deletions
This file was deleted.

applications/text_classification/hierarchical/README.md

Lines changed: 3 additions & 2 deletions
@@ -370,7 +370,7 @@ export/
 使用裁剪功能需要安装 paddleslim:
 
 ```shell
-pip install paddleslim==2.2.2
+pip install paddleslim==2.4.1
 ```
 
 开始模型裁剪训练,默认为GPU训练,使用CPU训练只需将设备参数配置改为`--device "cpu"`
@@ -379,6 +379,7 @@ python prune.py \
     --device "gpu" \
     --dataset_dir "data" \
     --output_dir "prune" \
+    --learning_rate 3e-5 \
     --per_device_train_batch_size 32 \
     --per_device_eval_batch_size 32 \
     --num_train_epochs 10 \
@@ -394,7 +395,7 @@ python prune.py \
 * `device`: 选用什么设备进行裁剪,选择cpu、gpu。如使用gpu训练,可使用参数--gpus指定GPU卡号。
 * `per_device_train_batch_size`:训练集裁剪训练过程批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数;默认为32。
 * `per_device_eval_batch_size`:开发集评测过程批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数;默认为32。
-* `learning_rate`:训练最大学习率;默认为3e-5。
+* `learning_rate`:训练最大学习率;默认为5e-5。
 * `num_train_epochs`: 训练轮次,使用早停法时可以选择100;默认为10。
 * `logging_steps`: 训练过程中日志打印的间隔steps数,默认100。
 * `save_steps`: 训练过程中保存模型checkpoint的间隔steps数,默认100。

applications/text_classification/multi_class/README.md

Lines changed: 3 additions & 2 deletions
@@ -392,7 +392,7 @@ export/
 使用裁剪功能需要安装 paddleslim:
 
 ```shell
-pip install paddleslim==2.2.2
+pip install paddleslim==2.4.1
 ```
 
 开始模型裁剪训练,默认为GPU训练,使用CPU训练只需将设备参数配置改为`--device "cpu"`
@@ -401,6 +401,7 @@ python prune.py \
     --device "gpu" \
     --dataset_dir "data" \
     --output_dir "prune" \
+    --learning_rate 3e-5 \
     --per_device_train_batch_size 32 \
     --per_device_eval_batch_size 32 \
     --num_train_epochs 10 \
@@ -416,7 +417,7 @@ python prune.py \
 * `device`: 选用什么设备进行裁剪,选择cpu、gpu。如使用gpu训练,可使用参数--gpus指定GPU卡号。
 * `per_device_train_batch_size`:训练集裁剪训练过程批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数;默认为32。
 * `per_device_eval_batch_size`:开发集评测过程批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数;默认为32。
-* `learning_rate`:训练最大学习率;默认为3e-5。
+* `learning_rate`:训练最大学习率;默认为5e-5。
 * `num_train_epochs`: 训练轮次,使用早停法时可以选择100;默认为10。
 * `logging_steps`: 训练过程中日志打印的间隔steps数,默认100。
 * `save_steps`: 训练过程中保存模型checkpoint的间隔steps数,默认100。

applications/text_classification/multi_label/README.md

Lines changed: 3 additions & 2 deletions
@@ -367,7 +367,7 @@ export/
 使用裁剪功能需要安装 paddleslim:
 
 ```shell
-pip install paddleslim==2.2.2
+pip install paddleslim==2.4.1
 ```
 
 开始模型裁剪训练,默认为GPU训练,使用CPU训练只需将设备参数配置改为`--device "cpu"`
@@ -376,6 +376,7 @@ python prune.py \
     --device "gpu" \
     --dataset_dir "data" \
     --output_dir "prune" \
+    --learning_rate 3e-5 \
     --per_device_train_batch_size 32 \
     --per_device_eval_batch_size 32 \
     --num_train_epochs 10 \
@@ -391,7 +392,7 @@ python prune.py \
 * `device`: 选用什么设备进行裁剪,选择cpu、gpu。如使用gpu训练,可使用参数--gpus指定GPU卡号。
 * `per_device_train_batch_size`:训练集裁剪训练过程批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数;默认为32。
 * `per_device_eval_batch_size`:开发集评测过程批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数;默认为32。
-* `learning_rate`:训练最大学习率;默认为3e-5。
+* `learning_rate`:训练最大学习率;默认为5e-5。
 * `num_train_epochs`: 训练轮次,使用早停法时可以选择100;默认为10。
 * `logging_steps`: 训练过程中日志打印的间隔steps数,默认100。
 * `save_steps`: 训练过程中保存模型checkpoint的间隔steps数,默认100。

docs/locale/en/LC_MESSAGES/source/paddlenlp.transformers.tokenizer_utils_faster.po renamed to docs/locale/en/LC_MESSAGES/source/paddlenlp.transformers.tokenizer_utils_fast.po

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ msgstr ""
 "Content-Transfer-Encoding: 8bit\n"
 "Generated-By: Babel 2.10.1\n"
 
-#: ../source/paddlenlp.transformers.tokenizer_utils_faster.rst:2
-msgid "tokenizer\\_utils\\_faster"
+#: ../source/paddlenlp.transformers.tokenizer_utils_fast.rst:2
+msgid "tokenizer\\_utils\\_fast"
 msgstr ""
 

docs/source/paddlenlp.transformers.rst

Lines changed: 1 addition & 1 deletion
@@ -79,5 +79,5 @@ paddlenlp.transformers
    paddlenlp.transformers.sentencepiece_model_pb2
    paddlenlp.transformers.tokenizer_utils
    paddlenlp.transformers.tokenizer_utils_base
-   paddlenlp.transformers.tokenizer_utils_faster
+   paddlenlp.transformers.tokenizer_utils_fast
    paddlenlp.transformers.utils

docs/source/paddlenlp.transformers.tokenizer_utils_faster.rst renamed to docs/source/paddlenlp.transformers.tokenizer_utils_fast.rst

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
-tokenizer\_utils\_faster
+tokenizer\_utils\_fast
 ======================================================
 
-.. automodule:: paddlenlp.transformers.tokenizer_utils_faster
+.. automodule:: paddlenlp.transformers.tokenizer_utils_fast
    :members:
    :no-undoc-members:
    :show-inheritance:

examples/language_model/moe/dygraph/framework/group_sharded.py

Lines changed: 14 additions & 27 deletions
@@ -26,28 +26,21 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-import os
 from types import MethodType
 
 import paddle
-from paddle.optimizer import Optimizer
-from paddle.fluid.framework import in_dygraph_mode
-from paddle.fluid.clip import ClipGradBase, _squared_l2_norm
-from paddle.fluid.dygraph import base as imperative_base
-from paddle.fluid import core, framework
-from paddle.incubate.distributed.models.moe.grad_clip import ClipGradForMOEByGlobalNorm
-
-# Old version
-from paddle.distributed.fleet.meta_optimizers.dygraph_optimizer.sharding_optimizer_stage2 import (
-    ShardingOptimizerStage2,
-)
-from paddle.distributed.fleet.meta_parallel.sharding.sharding_stage2 import ShardingStage2
-from paddle.distributed.fleet.meta_parallel.sharding.sharding_stage3 import ShardingStage3
 
 # New version
-from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_optimizer_stage2 import GroupShardedOptimizerStage2
-from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_stage2 import GroupShardedStage2
-from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_stage3 import GroupShardedStage3
+from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_optimizer_stage2 import (
+    GroupShardedOptimizerStage2,
+)
+from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_stage2 import (
+    GroupShardedStage2,
+)
+from paddle.fluid import core
+from paddle.fluid.dygraph import base as imperative_base
+from paddle.incubate.distributed.models.moe.grad_clip import ClipGradForMOEByGlobalNorm
+from paddle.optimizer import Optimizer
 
 
 class ClipGradForShardedMOEByGlobalNorm(ClipGradForMOEByGlobalNorm):
@@ -139,16 +132,10 @@ def check_dtype(param):
     )
 
     # convert model/optimizer
-    if in_dygraph_mode():
-        optimizer = GroupShardedOptimizerStage2(params=sharded_params, optim=optimizer, group=group, offload=offload)
-        model = GroupShardedStage2(
-            model, optimizer, group=group, sync_buffers=sync_buffers, buffer_max_size=buffer_max_size
-        )
-    else:
-        optimizer = ShardingOptimizerStage2(params=sharded_params, optim=optimizer, group=group, offload=offload)
-        model = ShardingStage2(
-            model, optimizer, group=group, sync_buffers=sync_buffers, buffer_max_size=buffer_max_size
-        )
+    optimizer = GroupShardedOptimizerStage2(params=sharded_params, optim=optimizer, group=group, offload=offload)
+    model = GroupShardedStage2(
+        model, optimizer, group=group, sync_buffers=sync_buffers, buffer_max_size=buffer_max_size
+    )
 
     clear_func = model._clear_gradients
     for opt in model._sharding_optimizers:

examples/language_model/moe/dygraph/framework/storage_process.py

Lines changed: 8 additions & 14 deletions
@@ -12,18 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from paddle.framework import core
-import numpy as np
 from collections import OrderedDict
 
-from paddle.fluid.framework import in_dygraph_mode, _in_legacy_dygraph
-
-if in_dygraph_mode():
-    from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_storage import ParamStorage, GradStorage
-elif _in_legacy_dygraph():
-    from paddle.distributed.fleet.utils.internal_storage import ParamStorage, GradStorage
-
-from paddle.distributed.fleet.meta_parallel.sharding.sharding_utils import Type
+import numpy as np
+from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_storage import (
+    GradStorage,
+    ParamStorage,
+)
+from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_utils import Type
+from paddle.framework import core
 
 alignment = {
     "gpu": 256,
@@ -37,10 +34,7 @@
 def assign_group_by_size(parameters, group_size=256 * 1024 * 1024):
     is_sparse_gradient = [False] * len(parameters)
 
-    if in_dygraph_mode():
-        group_indices = core.eager_assign_group_by_size(parameters, is_sparse_gradient, [group_size, group_size])
-    elif _in_legacy_dygraph():
-        group_indices = core.assign_group_by_size(parameters, is_sparse_gradient, [group_size, group_size])
+    group_indices = core.eager_assign_group_by_size(parameters, is_sparse_gradient, [group_size, group_size])
 
     var_groups = OrderedDict()
     for group_idx, indices in enumerate(group_indices):
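
Note on `assign_group_by_size`: the hunk above keeps only the eager-mode call, which returns lists of parameter indices bucketed under a byte limit. As a rough illustration of what size-based bucketing means, here is a minimal pure-Python sketch. It is not Paddle's implementation: `assign_group_by_size_sketch` and its greedy rule are hypothetical stand-ins, and the real `core.eager_assign_group_by_size` may differ (for example, in how it treats dtypes or sparse gradients).

```python
def assign_group_by_size_sketch(sizes_in_bytes, group_size=256 * 1024 * 1024):
    """Greedily bucket tensor indices so each bucket stays within group_size bytes.

    Hypothetical illustration only; not the Paddle algorithm.
    """
    groups = []
    current, current_bytes = [], 0
    for idx, nbytes in enumerate(sizes_in_bytes):
        # Close the current bucket if adding this tensor would exceed the limit.
        if current and current_bytes + nbytes > group_size:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(idx)
        current_bytes += nbytes
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Four tensors with a 300-byte limit.
    print(assign_group_by_size_sketch([100, 200, 300, 50], group_size=300))
```

A tensor larger than the limit still gets a bucket of its own in this sketch, so no index is ever dropped.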

examples/language_model/moe/dygraph/run_moe_pretrain.py

Lines changed: 2 additions & 6 deletions
@@ -38,10 +38,7 @@
 from paddle.distributed.fleet.meta_parallel.sharding.group_sharded_utils import (
     GroupShardedScaler,
 )
-from paddle.distributed.fleet.meta_parallel.sharding.sharding_utils import (
-    ShardingScaler,
-)
-from paddle.fluid.framework import core, in_dygraph_mode
+from paddle.fluid.framework import core
 from paddle.incubate.distributed.models import moe
 from utils import get_timers, set_timers
 from visualdl import LogWriter
@@ -426,8 +423,7 @@ def do_train(args):
             scaler = fleet.distributed_scaler(scaler)
             scaler._unscale = MethodType(unscale_method, scaler)
         else:
-            wrap_scale_func = GroupShardedScaler if in_dygraph_mode() else ShardingScaler
-            scaler = wrap_scale_func(scaler)
+            scaler = GroupShardedScaler(scaler)
 
         model = paddle.amp.decorate(models=model, optimizers=None, level="O2", save_dtype="float32")
 

paddlenlp/experimental/autonlp/auto_trainer_base.py

Lines changed: 1 addition & 1 deletion
@@ -126,7 +126,7 @@ def export(self, export_path, trial_id=None):
         """
         model_result = self._get_model_result(trial_id=trial_id)
         exported_model_path = os.path.join(model_result.log_dir, self.export_path)
-        shutil.copytree(exported_model_path, export_path, dirs_exist_ok=True)
+        shutil.copytree(exported_model_path, export_path)
         logger.info(f"Exported to {export_path}")
 
     @abstractmethod
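
Note on the `shutil.copytree` change: `dirs_exist_ok` was added in Python 3.8. Dropping it presumably keeps the call usable on older interpreters, though the commit does not state the reason. The cost is that `export` now raises `FileExistsError` when `export_path` already exists. The sketch below shows the difference using temporary directories; `copy_export` and its `overwrite` switch are hypothetical helpers for illustration, not part of PaddleNLP.

```python
import shutil
import tempfile
from pathlib import Path


def copy_export(src: str, dst: str, overwrite: bool = False) -> None:
    """Copy a directory tree; `overwrite` is a hypothetical switch (Python 3.8+)."""
    shutil.copytree(src, dst, dirs_exist_ok=overwrite)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "exported"
        dst = Path(tmp) / "target"
        src.mkdir()
        (src / "model.txt").write_text("weights")
        dst.mkdir()  # the destination already exists

        try:
            copy_export(str(src), str(dst))  # same as copytree(src, dst)
            strict_failed = False
        except FileExistsError:
            strict_failed = True

        copy_export(str(src), str(dst), overwrite=True)  # merges into dst
        merged = (dst / "model.txt").read_text()
    return strict_failed, merged


if __name__ == "__main__":
    print(demo())
```

Callers who need to re-export to the same path would have to remove the target directory first or pick a fresh `export_path`.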
