[Accuracy diff No.127] Fix accuracy diff for paddle.nn.functional.sigmoid_focal_loss API #73430

NKNaN · 2025-06-18T08:18:40Z

PR Category

Execute Infrastructure

PR Types

Improvements

Description

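On inspection, the original focal loss implementation assumes that label takes only the values 0.0 or 1.0, whereas the API documentation describes the range of label as 0 to 1. This PR therefore modifies the focal loss implementation, following the torch conversion code in PaddleAPITest, to support label inputs with values between 0 and 1.

The default value of the reduction parameter in PaddleAPITest's SigmoidFocalLossRule also needs to be changed to sum.

Regression test results: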

paddle-bot · 2025-06-18T08:18:45Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

cangtianhuang · 2025-06-18T08:31:07Z

python/paddle/nn/functional/loss.py

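This appears to replace the C implementation directly with a Python implementation 😶‍🌫️😶‍🌫️, which does not really meet the contribution standards of the paddle library: it breaks the separation between dynamic and static graph modes, and the change is too large.

paddleapitest is only a rough testing project; the paddle implementation is the source of truth.

Please take a look at the contribution guide 🫡: https://www.paddlepaddle.org.cn/documentation/docs/zh/dev_guides/index_cn.html

Then should the torch conversion rule in SigmoidFocalLossRule be modified to match paddle's implementation?
(If paddle's implementation is confirmed to be correct, the description of label in the paddle docs may also need changing: the value should be required to be 0 or 1.)

See the API documentation: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/nn/functional/sigmoid_focal_loss_cn.html#sigmoid-focal-loss

There, label is a Tensor whose values may be anything in [0, 1]; nothing says that "label takes the value 0.0 or 1.0". Test code: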

import paddle

# paddle.nn.functional.sigmoid_focal_loss(Tensor([270072, 80],"float32"), Tensor([270072, 80],"float32"), )
logit = paddle.randn([270072, 80], dtype="float32")
# paddle.uniform samples from [-1, 1) by default, so the range is set explicitly to keep label in [0, 1)
label = paddle.uniform([270072, 80], dtype="float32", min=0.0, max=1.0)
result = paddle.nn.functional.sigmoid_focal_loss(logit, label)
print(result)

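However, the accuracy problem may be because SigmoidFocalLossRule is written incorrectly, or because the paddle kernel code has a problem. For the latter, you can check how the kernel handles it: paddle/phi/kernels/gpu/sigmoid_cross_entropy_with_logits_kernel.cu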

The formula in the documentation is
-label * [alpha * (1 - sigmoid(logit))**gamma] * log(sigmoid(logit)) - (1 - label) * [(1 - alpha) * sigmoid(logit)**gamma] * log(1 - sigmoid(logit))

My understanding of paddle's current implementation is:
loss = -label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit)) // _C_ops.sigmoid_cross_entropy_with_logits
// This step can be seen as the formula above with the alpha and gamma terms removed

pred = sigmoid(logit)
p_t = pred * label + (1 - pred) * (1 - label)
alpha_t = alpha * label + (1 - alpha) * (1 - label)
loss = alpha_t * loss = [alpha * label + (1 - alpha) * (1 - label)] * [-label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit))]
// From here on you can see that if label is not 0 or 1, the computed result is not what the documented formula gives; the same holds for the later gamma_t.

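What I mean is that although this API accepts label values between 0 and 1 and does return a result, that result is not what the documented formula gives. Only when the label inputs are 0 or 1 does the result match the documented formula. (Before this change, the corresponding unit tests also only had cases where label is 0 or 1.)

Replacing the paddle API composition in the current change with the corresponding _C_ops would also work.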
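The mismatch described above can be checked numerically. The following is a minimal pure-Python sketch, not paddle's kernel: the function names `documented_focal_loss` and `modulated_focal_loss` are hypothetical, the defaults alpha=0.25 and gamma=2.0 are assumed, and the loss is computed per element with no reduction or normalizer.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def documented_focal_loss(logit, label, alpha=0.25, gamma=2.0):
    """Per-element loss following the formula quoted from the API docs."""
    p = sigmoid(logit)
    pos = -label * alpha * (1.0 - p) ** gamma * math.log(p)
    neg = -(1.0 - label) * (1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return pos + neg


def modulated_focal_loss(logit, label, alpha=0.25, gamma=2.0):
    """Per-element loss in the alpha_t * (1 - p_t)**gamma * BCE form from this thread."""
    p = sigmoid(logit)
    bce = -label * math.log(p) - (1.0 - label) * math.log(1.0 - p)
    p_t = p * label + (1.0 - p) * (1.0 - label)
    alpha_t = alpha * label + (1.0 - alpha) * (1.0 - label)
    return alpha_t * (1.0 - p_t) ** gamma * bce


# Binary labels should agree; a fractional label should not.
for label in (0.0, 1.0, 0.5):
    doc = documented_focal_loss(0.3, label)
    mod = modulated_focal_loss(0.3, label)
    print(f"label={label}: documented={doc:.6f} modulated={mod:.6f}")
```

With a label of 0.0 or 1.0 the two functions agree up to floating-point rounding, while a label of 0.5 gives visibly different values, which is the behavior this comment describes.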

I see what you mean now. The expression in the code is indeed inconsistent with the documentation. The documentation means:
$\text{loss} = -\text{Labels} \cdot \alpha \cdot (1 - \sigma(\text{Logit}))^{\gamma} \log(\sigma(\text{Logit})) - (1 - \text{Labels}) \cdot (1 - \alpha) \cdot \sigma(\text{Logit})^{\gamma} \log(1 - \sigma(\text{Logit}))$

whereas the modulated result computed by the code is:
$\text{loss} = \alpha_t \cdot (1 - p_t)^{\gamma} \cdot \left[ -\text{Labels} \cdot \log(\text{pred}) - (1 - \text{Labels}) \cdot \log(1 - \text{pred}) \right]$
$\text{loss} = \left[ \text{Labels} \cdot \alpha + (1 - \text{Labels}) \cdot (1 - \alpha) \right] \cdot \left( 1 - \text{Labels} \cdot \text{pred} - (1 - \text{Labels}) \cdot (1 - \text{pred}) \right)^{\gamma} \cdot \left[ -\text{Labels} \cdot \log(\text{pred}) - (1 - \text{Labels}) \cdot \log(1 - \text{pred}) \right]$

This does appear to deviate from the design, but for now it is considered a reasonable formulation. See Section 3.2 of the original paper: https://arxiv.org/pdf/1708.02002
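As a check on the two expressions in this comment, substituting the two binary label values into the modulated form recovers the documented formula term by term. This short derivation is an added illustration, with $\text{pred} = \sigma(\text{Logit})$:

$\text{Labels} = 1:\quad \alpha_t = \alpha,\quad p_t = \text{pred} \quad\Rightarrow\quad \text{loss} = -\alpha \cdot (1 - \text{pred})^{\gamma} \log(\text{pred})$
$\text{Labels} = 0:\quad \alpha_t = 1 - \alpha,\quad p_t = 1 - \text{pred} \quad\Rightarrow\quad \text{loss} = -(1 - \alpha) \cdot \text{pred}^{\gamma} \log(1 - \text{pred})$

For a label strictly between 0 and 1, the products of $\alpha_t$, $(1 - p_t)^{\gamma}$, and the cross-entropy term contain cross terms that do not appear in the documented formula, so the two expressions generally differ.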

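Yes, Section 3 of the paper also specifies $y \in \{\pm 1\}$. How about changing the input that get_numpy_tensor in PaddleAPITest/tester/api_config/config_analyzer.py generates for this API, restricting it to 0.0 or 1.0?

Sure, the input needs to be restricted. Both the paper and the paddle API docs say the labels are 0/1. I also looked at torchvision: its sigmoid_focal_loss implementation makes the same assumption and follows a similar computation.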
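The kind of input restriction agreed on here could look like the sketch below. It is only an illustration: the helper name `make_binary_labels` is hypothetical, it uses the standard library instead of NumPy, and it is not the actual change made to get_numpy_tensor.

```python
import random


def make_binary_labels(shape, seed=None):
    """Hypothetical helper: build a 2-D nested list of labels that are exactly 0.0 or 1.0."""
    rng = random.Random(seed)  # seeded generator so test inputs are reproducible
    rows, cols = shape
    return [[float(rng.randint(0, 1)) for _ in range(cols)] for _ in range(rows)]


labels = make_binary_labels((4, 5), seed=0)
for row in labels:
    print(row)
```

Sampling integers in {0, 1} and casting to float, rather than sampling uniformly in [0, 1], keeps the test inputs inside the domain where the implementation and the documented formula agree.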

codecov-commenter · 2025-06-18T12:03:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (develop@c683e64). Learn more about missing BASE report.

Additional details and impacted files

@@             Coverage Diff             @@
##             develop    #73430   +/-   ##
===========================================
  Coverage           ?   100.00%           
===========================================
  Files              ?         1           
  Lines              ?        11           
  Branches           ?         0           
===========================================
  Hits               ?        11           
  Misses             ?         0           
  Partials           ?         0


NKNaN · 2025-06-19T06:22:50Z

Fixed in PFCCLab/PaddleAPITest#292.

fix focal_loss accuracy

2fe264f

paddle-bot bot added the contributor External developers label Jun 18, 2025

NKNaN mentioned this pull request Jun 18, 2025

[Accuracy diff No.127] Fix accuracy diff for paddle.nn.functional.sigmoid_focal_loss API PFCCLab/PaddleAPITest#292

Merged

cangtianhuang reviewed Jun 18, 2025


luotao1 mentioned this pull request Jun 18, 2025

[Open Source Task] Paddle CPU/GPU Kernel accuracy issue rollout #72667

Open

luotao1 added the HappyOpenSource Pro (advanced Happy Open Source activity, more challenging tasks) label Jun 18, 2025

luotao1 assigned luotao1 and lshpku Jun 18, 2025

NKNaN closed this Jun 19, 2025
