[Accuracy diff No.127] Fix accuracy diff for paddle.nn.functional.sigmoid_focal_loss API #73430
Conversation
Your PR was submitted successfully. Thank you for your contribution to this open-source project!
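This looks like it directly replaces the C implementation with a Python implementation 😶‍🌫️😶‍🌫️. That does not meet the Paddle contribution standards: it breaks the distinction between dynamic and static graphs, and the change is too large.
PaddleAPITest is only a rough testing project; the Paddle implementation is always the source of truth~
Please refer to the contribution guide 🫡: https://www.paddlepaddle.org.cn/documentation/docs/zh/dev_guides/index_cn.html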
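Then should we change the torch conversion rule in SigmoidFocalLossRule to follow Paddle's implementation?
(If Paddle's implementation is confirmed to be correct, the description of label in the Paddle docs may also need to change: it should be required to be 0 or 1.)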
Here label is a Tensor whose values may be anything in [0, 1]; the docs nowhere say that label must be 0.0 or 1.0~ Test code:
import paddle
# paddle.nn.functional.sigmoid_focal_loss(Tensor([270072, 80],"float32"), Tensor([270072, 80],"float32"), )
logit = paddle.randn([270072, 80], dtype="float32")
label = paddle.uniform([270072, 80], dtype="float32", min=0.0, max=1.0)  # paddle.uniform defaults to [-1, 1)
result = paddle.nn.functional.sigmoid_focal_loss(logit, label)
print(result)
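That said, the accuracy diff may come from SigmoidFocalLossRule being written incorrectly, or from a problem in the Paddle kernel code. For the latter, check how the kernel handles it: paddle/phi/kernels/gpu/sigmoid_cross_entropy_with_logits_kernel.cu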
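For anyone comparing the kernel against a reference, here is a minimal sketch of the usual numerically stable way to evaluate sigmoid cross-entropy with logits. It assumes the kernel computes something equivalent to this standard form; that has not been checked against the .cu file here, and the function names are made up for illustration.

```python
import math


def bce_with_logits_naive(x, z):
    """-z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)), evaluated directly."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -z * math.log(p) - (1.0 - z) * math.log(1.0 - p)


def bce_with_logits_stable(x, z):
    """Algebraically equal form that avoids overflow for large |x|."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
```

The two agree for moderate logits, but only the second stays finite for very large |x|, so a float32 accuracy diff on extreme logits would not necessarily point at the focal-loss logic itself.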
The formula in the documentation is:
-label * [alpha * (1 - sigmoid(logit))**gamma] * log(sigmoid(logit)) - (1 - label) * [(1 - alpha) * sigmoid(logit)**gamma] * log(1 - sigmoid(logit))
As I understand it, Paddle's current implementation is:
loss = -label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit)) // _C_ops.sigmoid_cross_entropy_with_logits
// this step can be seen as the formula above with the alpha and gamma terms removed
pred = sigmoid(logit)
p_t = pred * label + (1 - pred) * (1 - label)
alpha_t = alpha * label + (1 - alpha) * (1 - label)
loss = alpha_t * loss = [alpha * label + (1 - alpha) * (1 - label)] * [-label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit))]
// From here on, if label is not 0 or 1 the result no longer matches the documented formula; the same holds for the later gamma_t.
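My point is that although this API accepts label values between 0 and 1 and does return a result, that result is not what the documented formula gives. The two agree only when label is 0 or 1. (Before this change, the corresponding unit tests also only covered cases where label is 0 or 1.)
Replacing the composed Paddle API calls in this change with the corresponding _C_ops is also fine.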
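To make the mismatch concrete, here is a small scalar sketch in plain Python. It transcribes the documented formula and the composed computation described in this comment; the function names are hypothetical, and alpha=0.25, gamma=2.0 are taken as the usual defaults. It is not Paddle's actual code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def focal_documented(logit, label, alpha=0.25, gamma=2.0):
    """The documented formula, applied literally to one element."""
    p = sigmoid(logit)
    pos = -label * alpha * (1.0 - p) ** gamma * math.log(p)
    neg = -(1.0 - label) * (1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return pos + neg


def focal_composed(logit, label, alpha=0.25, gamma=2.0):
    """The composition from this thread: cross-entropy * alpha_t * gamma_t."""
    p = sigmoid(logit)
    ce = -label * math.log(p) - (1.0 - label) * math.log(1.0 - p)
    p_t = p * label + (1.0 - p) * (1.0 - label)
    alpha_t = alpha * label + (1.0 - alpha) * (1.0 - label)
    gamma_t = (1.0 - p_t) ** gamma
    return alpha_t * gamma_t * ce
```

With label equal to 0.0 or 1.0 the two functions agree up to rounding, because every cross term between the positive and negative branches vanishes. With a soft label such as 0.5 the cross terms survive and the results diverge noticeably.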
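I see what you mean. The expression in the code is indeed inconsistent with the documentation. What the documentation means is:
whereas the modulated result in the code is:
This does deviate somewhat from the design, but for now we consider it a reasonable formulation. See Section 3.2 of the original paper: https://arxiv.org/pdf/1708.02002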
Yes, Section 3 of the paper also specifies y \in {+/- 1}. How about changing get_numpy_tensor in PaddleAPITest/tester/api_config/config_analyzer.py so that the inputs it generates for this API are restricted to 0.0 or 1.0?
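Sure, the input needs to be restricted. Both the paper and the Paddle API docs say the labels are 0/1. I also looked at torchvision's sigmoid_focal_loss: it makes the same assumption and follows a similar computation.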
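A rough sketch of what such a restriction could look like with NumPy. The helper name make_binary_labels and its signature are hypothetical; this is not the actual get_numpy_tensor code, only an illustration of drawing labels that are exactly 0.0 or 1.0.

```python
import numpy as np


def make_binary_labels(shape, dtype="float32", seed=None):
    """Hypothetical helper: draw labels that are exactly 0.0 or 1.0."""
    rng = np.random.default_rng(seed)
    # integers(0, 2) samples from {0, 1}; the upper bound is exclusive.
    return rng.integers(0, 2, size=shape).astype(dtype)
```

In the real test generator this would presumably be selected only when the API under test is paddle.nn.functional.sigmoid_focal_loss and the tensor being generated is the label argument.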
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #73430   +/-   ##
===========================================
  Coverage           ?   100.00%
===========================================
  Files              ?         1
  Lines              ?        11
  Branches           ?         0
===========================================
  Hits               ?        11
  Misses             ?         0
  Partials           ?         0
PR Category
Execute Infrastructure
PR Types
Improvements
Description
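On inspection, the original focal loss implementation assumes that label is 0.0 or 1.0, whereas the API documentation describes label as ranging from 0 to 1. This PR therefore changes the focal loss implementation to follow the torch conversion code in PaddleAPITest, so that label inputs between 0 and 1 are supported.
The default value of the reduction parameter in PaddleAPITest's SigmoidFocalLossRule also needs to be changed to sum.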
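On the reduction default: as far as I know, paddle.nn.functional.sigmoid_focal_loss defaults to reduction='sum', while torchvision.ops.sigmoid_focal_loss defaults to reduction='none', so a conversion rule that omits the argument compares a scalar against a per-element tensor. The sketch below uses a hypothetical reduce_loss helper on plain Python lists to show the difference; it is not code from either library.

```python
def reduce_loss(per_element, reduction="sum"):
    """Hypothetical helper mirroring the three usual reduction modes."""
    values = list(per_element)
    if reduction == "none":
        return values
    if reduction == "sum":
        return sum(values)
    if reduction == "mean":
        return sum(values) / len(values)
    raise ValueError(f"unsupported reduction: {reduction!r}")


losses = [0.5, 0.25, 0.25]
as_none = reduce_loss(losses, reduction="none")  # per-element values
as_sum = reduce_loss(losses)                     # one scalar, the 'sum' default
```

Passing reduction explicitly on both sides of the comparison avoids depending on either library's default.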
Regression test results:
