
Commit 18d789a

fix some bugs

1 parent 76a118b commit 18d789a

File tree

6 files changed: +17 −13 lines changed

csrc/cpu/README.md

Lines changed: 5 additions & 3 deletions

````diff
@@ -1,14 +1,16 @@
 # cpu-custom-ops
 
 ## Quick start
-
-### 1. Environment setup
+### 1. Detailed CPU inference tutorial
+[cpu](../../llm/docs/cpu_install.md)
+###
+### 2. Environment setup
 ```shell
 # check whether the machine supports AVX512 instructions
 lscpu | grep avx512*
 ```
 
-### 2. Install the CPU custom ops and third-party libraries
+### 3. Install the CPU custom ops and third-party libraries
 ```shell
 # it is recommended to install the third-party libraries with gcc 9.4.0
 bash setup.sh
````
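The `lscpu | grep avx512*` check above can also be done programmatically. A minimal Python sketch, assuming a Linux host where `/proc/cpuinfo` lists CPU feature flags (the helper name is ours, not from the repo):

```python
# Sketch: detect AVX512 support the same way `lscpu | grep avx512*` does,
# by scanning the feature flags reported in /proc/cpuinfo (Linux only).
def has_avx512(cpuinfo_path="/proc/cpuinfo"):
    try:
        with open(cpuinfo_path) as f:
            text = f.read()
    except OSError:
        return False  # not Linux, or /proc unavailable
    # Any avx512 feature flag (avx512f, avx512bw, ...) counts.
    return any(flag.startswith("avx512") for flag in text.split())

if __name__ == "__main__":
    print("AVX512 supported:", has_avx512())
```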

csrc/cpu/setup.sh

Lines changed: 3 additions & 3 deletions

```diff
@@ -17,7 +17,7 @@
 # apt-get install numactl
 
 # 1. download XFT
-if [ ! -d xFasterTransformer]; then
+if [ ! -d xFasterTransformer ]; then
 git clone https://github.com/intel/xFasterTransformer.git
 fi
 
@@ -55,12 +55,12 @@ rm -rf build
 mkdir build && cd build
 cmake ..
 make -j
+cd ..
 
 #xft
 export XFT_HEADER_DIR=$PWD
 export XFT_LIB_DIR=$XFT_HEADER_DIR/build
 export LD_LIBRARY_PATH=$XFT_LIB_DIR:$LD_LIBRARY_PATH
-
 #setup cpu paddle_nlp ops
 cd ..
-python ./src/setup_cpu.py install
+python ./src/setup_cpu.py install --user
```
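The first hunk fixes a shell syntax bug: `[` is the `test` command and the closing `]` must be a separate argument, so `[ ! -d xFasterTransformer]` (no space before `]`) fails with a missing-`]` error. The guard itself is the common clone-only-if-absent idiom, which looks like this in Python (a sketch; the helper name is ours, the directory name comes from the script above):

```python
import os
import subprocess

def clone_if_missing(repo_url, dest_dir):
    # Mirrors the shell guard: if [ ! -d xFasterTransformer ]; then git clone ...; fi
    # Skipping when the directory exists makes the setup script safe to re-run.
    if not os.path.isdir(dest_dir):
        subprocess.run(["git", "clone", repo_url, dest_dir], check=True)
        return True   # cloned
    return False      # already present, skipped
```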

csrc/cpu/src/setup_cpu.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -112,7 +112,7 @@ def check_avx512_bf16__support():
 
 custom_kernel_dot_module = CppExtension(
     sources=[
-        "../generation/save_with_output.cc",
+        "../gpu/save_with_output.cc",
         "./src/token_penalty_multi_scores.cc",
         "./src/stop_generation_multi_ends.cc",
         "./src/set_value_by_flags.cc",
@@ -129,6 +129,6 @@ def check_avx512_bf16__support():
 setup(
     name="paddlenlp_ops",
     version="1.0",
-    description="custom kernel fot compiling",
+    description="custom kernel for compiling",
     ext_modules=[custom_kernel_dot_module],
 )
```
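The first hunk repoints a stale source path (`save_with_output.cc` is now looked up under `../gpu/` rather than `../generation/`); the second fixes a typo. Since a wrong path only surfaces at compile time, a small pre-flight check can fail faster. A sketch, using the source paths from the diff above (the helper name is ours):

```python
import os

# Sources as listed in the CppExtension above (paths relative to csrc/cpu/).
CPU_OP_SOURCES = [
    "../gpu/save_with_output.cc",
    "./src/token_penalty_multi_scores.cc",
    "./src/stop_generation_multi_ends.cc",
    "./src/set_value_by_flags.cc",
]

def missing_sources(sources, base_dir="."):
    # Return the entries that do not exist on disk, so a broken path
    # (like the old ../generation/ one) is reported before compilation starts.
    return [s for s in sources if not os.path.isfile(os.path.join(base_dir, s))]
```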

llm/docs/cpu_install.md

Lines changed: 3 additions & 3 deletions

```diff
@@ -3,9 +3,9 @@ PaddleNLP has deeply optimized the llama family of models on CPUs that support AVX instructions
 
 ### Check the hardware:
 
-| Chip type | GCC version |
-| --- | --- |
-| Intel(R) Xeon(R) Platinum 8463B | 9.4.0 |
+| Chip type | GCC version | cmake version |
+| --- | --- | --- |
+| Intel(R) Xeon(R) Platinum 8463B | 9.4.0 | >=3.18 |
 
 **Note: to verify whether your machine supports AVX instructions, just run the following command in the system environment and check whether it produces any output:**
 ```
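The updated table adds a cmake >= 3.18 requirement alongside gcc 9.4.0. A hedged Python sketch for verifying this on a build host (function names are ours; it assumes `cmake --version` prints a line of the form `cmake version X.Y.Z`):

```python
import re
import shutil
import subprocess

def parse_cmake_version(version_output):
    # "cmake version 3.18.4" -> (3, 18, 4); None if the format is unexpected.
    m = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", version_output)
    return tuple(int(g) for g in m.groups()) if m else None

def cmake_at_least(required=(3, 18, 0)):
    # True only if cmake is installed and its version meets the requirement.
    if shutil.which("cmake") is None:
        return False
    out = subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout
    version = parse_cmake_version(out)
    return version is not None and version >= required
```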

paddlenlp/experimental/transformers/fused_transformer_layers.py

Lines changed: 4 additions & 1 deletion

```diff
@@ -40,7 +40,10 @@
         "The paddlenlp_ops package is not installed. you can read the docs and install it by hand, "
         "you can refer to: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/csrc/README.md"
     )
-from paddlenlp_ops import rebuild_padding_v2
+if (
+    paddle.device.get_all_custom_device_type() is not None and len(paddle.device.get_all_custom_device_type()) > 0
+) or core.is_compiled_with_cuda():
+    from paddlenlp_ops import rebuild_padding_v2
 
 if core.is_compiled_with_cuda():
     if os.getenv("FLAGS_CUTLASS_FP8_GEMM", "False") == "True":
```
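The hunk above stops importing `rebuild_padding_v2` unconditionally: on a CPU-only build where `paddlenlp_ops` was compiled without that op, the bare import would fail at module load time, so it is now gated behind a device check. The general pattern is a guarded optional import, sketched here with the standard library (the helper name is ours, not PaddleNLP API):

```python
import importlib

def optional_import(module_name, attr=None):
    # Import a module (or one attribute from it) if available, else return None,
    # so callers can feature-gate at runtime instead of crashing at import time.
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None) if attr else module

# Usage: resolves to None unless the optional package is actually installed.
rebuild_padding_v2 = optional_import("paddlenlp_ops", "rebuild_padding_v2")
```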

paddlenlp/experimental/transformers/llama/modeling.py

Lines changed: 0 additions & 1 deletion

```diff
@@ -291,7 +291,6 @@ def forward(
     @paddle.no_grad()
     # avx
     def set_state_dict(self, state_dict):
-        self.transformer_block.init_weight()
         unfused_state_dict = {}
         head_size = self.hidden_size // self.num_attention_heads
         split_fn = split_param_func()
```
