
[Accuracy diff No.127] Fix accuracy diff for paddle.nn.functional.sigmoid_focal_loss API #73430


Closed
wants to merge 1 commit into from

Contributor

@NKNaN NKNaN commented Jun 18, 2025

PR Category

Execute Infrastructure

PR Types

Improvements

Description

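On inspection, the original focal loss implementation assumes that label only takes the values 0.0 or 1.0, whereas the API documentation describes label as ranging from 0 to 1. This PR therefore rewrites the focal loss implementation following the torch conversion code in PaddleAPITest, so that label inputs anywhere between 0 and 1 are supported.

The default value of the reduction parameter in PaddleAPITest's SigmoidFocalLossRule also needs to be changed to sum.

Re-test results:
(image)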


paddle-bot bot commented Jun 18, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Contributor


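This looks like it directly replaces the C implementation with a Python one 😶‍🌫️😶‍🌫️. That does not really meet the contribution standards of the paddle library: it breaks the distinction between dynamic and static graph modes, and the change is too large.

paddleapitest is only a rough testing project; the paddle implementation is the source of truth.

Please take a look at the contribution guide 🫡: https://www.paddlepaddle.org.cn/documentation/docs/zh/dev_guides/index_cn.html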

Contributor Author

@NKNaN NKNaN Jun 18, 2025


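Then should I instead modify the torch conversion rule in SigmoidFocalLossRule to follow paddle's implementation?
(If paddle's implementation is confirmed to be correct, the description of label in paddle's documentation may also need to be changed: it should be required to be 0 or 1.)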

Contributor


See the API documentation: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/nn/functional/sigmoid_focal_loss_cn.html#sigmoid-focal-loss

There, label is a Tensor whose values can be anything in [0, 1]; nothing in it says that "label takes the value 0.0 or 1.0". Test code:

import paddle

# paddle.nn.functional.sigmoid_focal_loss(Tensor([270072, 80],"float32"), Tensor([270072, 80],"float32"), )

logit = paddle.randn([270072, 80], dtype="float32")
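# paddle.uniform defaults to min=-1.0, max=1.0, so restrict the range to keep labels in [0, 1]
label = paddle.uniform([270072, 80], dtype="float32", min=0.0, max=1.0)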

result = paddle.nn.functional.sigmoid_focal_loss(logit, label)
print(result)

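However, the accuracy issue could be caused either by SigmoidFocalLossRule being written incorrectly or by a problem in the paddle kernel code. For the latter, you can check how the kernel handles it: paddle/phi/kernels/gpu/sigmoid_cross_entropy_with_logits_kernel.cu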
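As a reference point when reading such a kernel, here is a minimal scalar sketch of sigmoid cross entropy with logits in two algebraically equivalent forms: the textbook form and a commonly used numerically stable rearrangement. This is an illustration written with the standard library only; the function names are hypothetical and it has not been checked against the Paddle kernel.

```python
import math


def sigmoid(x):
    # Logistic function, written so that exp() never overflows.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits_naive(logit, label):
    # Textbook form: -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)).
    # Breaks down for large |x| because sigmoid saturates to exactly 0 or 1.
    p = sigmoid(logit)
    return -label * math.log(p) - (1.0 - label) * math.log(1.0 - p)


def bce_with_logits_stable(logit, label):
    # Equivalent rearrangement: max(x, 0) - x*z + log(1 + exp(-|x|)).
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))
```

For moderate logits the two functions agree to floating-point precision; for a logit such as 1000.0 the naive form fails on `log(0)` while the stable form still returns a finite value.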

Contributor Author


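The formula in the documentation is
−label ∗ [alpha ∗ (1−sigmoid(logit))**gamma] ∗ log(sigmoid(logit)) − (1−label) ∗ [(1−alpha) ∗ sigmoid(logit)**gamma] ∗ log(1−sigmoid(logit))

My understanding of the current paddle implementation is:
loss = -label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit)) // _C_ops.sigmoid_cross_entropy_with_logits
// This step can be seen as the formula above with the alpha and gamma terms removed

pred = sigmoid(logit)
p_t = pred * label + (1 - pred) * (1 - label)
alpha_t = alpha * label + (1 - alpha) * (1 - label)
loss = alpha_t * loss = [alpha * label + (1 - alpha) * (1 - label)] * [-label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit))]
// From here on you can see that if label is not 0 or 1, the computed result is not what the documented formula gives; the same holds for gamma_t later.

What I mean is that although this API accepts label values between 0 and 1 and does return a result, that result is not the one given by the documented formula. Only when the label passed to this API is 0 or 1 does the result match the documented formula. (Before this change, the corresponding unit tests also only had cases where label is 0 or 1.)

If needed, the paddle API composition in this change can also be replaced with the corresponding _C_ops.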
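The point above can be checked numerically. The following is a minimal scalar sketch, using only the standard library, of the documented formula and of the composed computation described in this comment. The function names are hypothetical, and the default `alpha` and `gamma` values are illustrative choices, not something stated in this thread.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def focal_loss_documented(logit, label, alpha=0.25, gamma=2.0):
    # Documented formula: alpha and gamma are applied per term.
    p = sigmoid(logit)
    pos = -label * alpha * (1.0 - p) ** gamma * math.log(p)
    neg = -(1.0 - label) * (1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return pos + neg


def focal_loss_composed(logit, label, alpha=0.25, gamma=2.0):
    # Composed form: cross entropy scaled by alpha_t and (1 - p_t)**gamma.
    p = sigmoid(logit)
    ce = -label * math.log(p) - (1.0 - label) * math.log(1.0 - p)
    p_t = p * label + (1.0 - p) * (1.0 - label)
    alpha_t = alpha * label + (1.0 - alpha) * (1.0 - label)
    return alpha_t * (1.0 - p_t) ** gamma * ce


for label in (0.0, 1.0, 0.5):
    a = focal_loss_documented(1.0, label)
    b = focal_loss_composed(1.0, label)
    print(f"label={label}: documented={a:.6f} composed={b:.6f}")
```

With label 0.0 or 1.0 the two values coincide; with label 0.5 they differ, which is the mismatch being described.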

Contributor


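I see what you mean now. The expression in the code is indeed inconsistent with the documentation. What the documentation means is: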
$\text{loss} = -\text{Labels} \cdot \alpha \cdot (1 - \sigma(\text{Logit}))^{\gamma} \log(\sigma(\text{Logit})) - (1 - \text{Labels}) \cdot (1 - \alpha) \cdot \sigma(\text{Logit})^{\gamma} \log(1 - \sigma(\text{Logit}))$

whereas what the code actually computes is:
$\text{loss} = \alpha_t \cdot (1 - p_t)^{\gamma} \cdot \left[ -\text{Labels} \cdot \log(\text{pred}) - (1 - \text{Labels}) \cdot \log(1 - \text{pred}) \right]$
$\text{loss} = \left[ \text{Labels} \cdot \alpha + (1 - \text{Labels}) \cdot (1 - \alpha) \right] \cdot \left( 1 - \text{Labels} \cdot \text{pred} - (1 - \text{Labels}) \cdot (1 - \text{pred}) \right)^{\gamma} \cdot \left[ -\text{Labels} \cdot \log(\text{pred}) - (1 - \text{Labels}) \cdot \log(1 - \text{pred}) \right]$

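This does look like a deviation from the documented design, but for now we consider it a reasonable formulation; see Section 3.2 of the original paper: https://arxiv.org/pdf/1708.02002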
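To make the relationship between the two expressions explicit, here is a short worked substitution for the binary cases, writing $\text{pred} = \sigma(\text{Logit})$. It is a sketch derived only from the formulas quoted in this comment.

```latex
\begin{align*}
\text{Labels}=1:&\quad \alpha_t=\alpha,\quad p_t=\text{pred}
  \;\Rightarrow\; \text{loss} = -\alpha\,(1-\text{pred})^{\gamma}\,\log(\text{pred}) \\
\text{Labels}=0:&\quad \alpha_t=1-\alpha,\quad p_t=1-\text{pred}
  \;\Rightarrow\; \text{loss} = -(1-\alpha)\,\text{pred}^{\gamma}\,\log(1-\text{pred})
\end{align*}
```

These are exactly the first and second terms of the documented formula, so the two expressions agree for labels in $\{0, 1\}$. For a label strictly between 0 and 1, the product of the three bracketed factors produces cross terms that the documented formula does not contain, so the results differ.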

Contributor Author


Yes, Section 3 of the paper also specifies y \in {+/- 1}. So how about modifying get_numpy_tensor in PaddleAPITest/tester/api_config/config_analyzer.py so that the input generated for this API is restricted to 0.0 or 1.0?

Contributor


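Yes, the input needs to be restricted. Both the paper and the paddle API documentation say the labels are 0/1. I also checked torchvision's sigmoid_focal_loss: its implementation makes the same assumption and follows a similar computation.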
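A minimal sketch of what restricting the generated labels to 0.0 or 1.0 could look like. The helper `make_binary_labels` is hypothetical and uses only the standard library; it is not the actual get_numpy_tensor code from PaddleAPITest, which is not shown in this thread.

```python
import random


def make_binary_labels(rows, cols, seed=None):
    # Hypothetical helper: draw each label from {0.0, 1.0}
    # instead of sampling a float from a continuous range.
    rng = random.Random(seed)
    return [[float(rng.randint(0, 1)) for _ in range(cols)] for _ in range(rows)]


labels = make_binary_labels(4, 3, seed=0)
for row in labels:
    print(row)
```

Passing a seed keeps the generated test input reproducible between runs.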

@luotao1 luotao1 added the HappyOpenSource Pro (advanced Happy Open Source activity, with more challenging tasks) label Jun 18, 2025
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (develop@c683e64). Learn more about missing BASE report.

Additional details and impacted files
@@             Coverage Diff             @@
##             develop    #73430   +/-   ##
===========================================
  Coverage           ?   100.00%           
===========================================
  Files              ?         1           
  Lines              ?        11           
  Branches           ?         0           
===========================================
  Hits               ?        11           
  Misses             ?         0           
  Partials           ?         0           


@NKNaN
Contributor Author

NKNaN commented Jun 19, 2025

Fixed in PFCCLab/PaddleAPITest#292

@NKNaN NKNaN closed this Jun 19, 2025
Labels
contributor (External developers), HappyOpenSource Pro (advanced Happy Open Source activity, with more challenging tasks)

6 participants