Optimize attention output linear fp8 memory #10204

Conversation

phlrain (Collaborator) commented Mar 19, 2025

The input x of the output linear layer is the output of flash attention (FA). That tensor is already needed for FA's backward pass, so when the output linear runs in FP8, save_for_backward can keep x directly; saving x_fp8 instead would add extra GPU memory. This saves roughly 100 MB per decoder layer.
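
To make the pattern concrete, here is a minimal, self-contained sketch of the idea. It is not the PR's actual code: the real implementation uses kitchen_quant and an FP8 GEMM, while toy_quant and the plain float matmul below are stand-ins used only for illustration. The forward quantizes x for the GEMM but saves the original x (already retained for flash attention's backward), and the backward re-quantizes x on the fly instead of keeping a cached x_fp8.

import paddle
from paddle.autograd import PyLayer


def toy_quant(x):
    # Toy per-tensor quantization standing in for the real FP8 kernel
    # (kitchen_quant in this PR); int8 is used here only for illustration.
    scale = paddle.max(paddle.abs(x)) / 127.0
    q = paddle.clip(paddle.round(x / scale), -127.0, 127.0).astype('int8')
    return q, scale


class FP8OutputLinearSketch(PyLayer):
    @staticmethod
    def forward(ctx, x, weight):
        x_q, x_scale = toy_quant(x)
        # Toy "FP8" GEMM: dequantize and run a float matmul for illustration.
        out = paddle.matmul(x_q.astype('float32') * x_scale, weight)
        # Key point of this PR: x is the flash-attention output and is already
        # kept alive for flash-attention's backward, so saving x costs no extra
        # memory. Saving x_q instead would keep one more tensor alive
        # (roughly 100 MB per decoder layer, per the description above).
        ctx.save_for_backward(x, weight)
        return out

    @staticmethod
    def backward(ctx, dout):
        x, weight = ctx.saved_tensor()
        # Re-quantize x on the fly instead of caching the quantized copy.
        x_q, x_scale = toy_quant(x)
        dx = paddle.matmul(dout, weight, transpose_y=True)
        dweight = paddle.matmul(x_q.astype('float32') * x_scale, dout, transpose_x=True)
        return dx, dweight


# Usage sketch
x = paddle.randn([4, 8])
w = paddle.randn([8, 16])
x.stop_gradient = False
w.stop_gradient = False
FP8OutputLinearSketch.apply(x, w).sum().backward()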

paddle-bot (bot) commented Mar 19, 2025

Thanks for your contribution!

sneaxiy merged commit 0a77769 into PaddlePaddle:dsv3_dev on Mar 19, 2025
1 of 5 checks passed
Review thread on the diff:

x_quant, x_scale = kitchen_quant(
    x, backend=kitchen.ops.Backend.CUTLASS, is_1d_scaled=True, return_transpose=False
)
weight_t = weight.T.contiguous()

Won't these transposes cause a performance problem?
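
For reference, weight.T.contiguous() materializes a fresh transposed copy of the weight on every call, which is presumably what the question is about. One common way to amortize that cost, sketched below with a hypothetical helper that is not part of this PR, is to cache the transposed weight between optimizer steps; note that the cache itself keeps an extra copy of the weight alive, so it trades memory back for speed and works against the memory goal of this PR.

import paddle


class CachedTransposedWeight:
    # Hypothetical helper (not from this PR): reuse one transposed copy of a
    # weight instead of recomputing weight.T.contiguous() on every forward.
    def __init__(self, weight):
        self.weight = weight
        self._weight_t = None

    def get(self):
        if self._weight_t is None:
            self._weight_t = self.weight.T.contiguous()
        return self._weight_t

    def invalidate(self):
        # Call after each optimizer step, once the weight values have changed.
        self._weight_t = None


# Usage sketch
w = paddle.randn([8, 16])
cache = CachedTransposedWeight(w)
w_t = cache.get()    # transposed and copied once
w_t = cache.get()    # reused, no new allocation
cache.invalidate()   # drop the cache after the weight is updated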

Another review thread on the diff:

x_t_shape = x_t_shape.numpy()

x_t_shape is not used anywhere after this line.
