Commit db3bf4a

[cherry-pick]update for 3.0.2 (#15775)
* update for 3.0.2
* fix typo
1 parent 32f1434 commit db3bf4a

File tree

4 files changed

+171
-39
lines changed

README.md

Lines changed: 39 additions & 0 deletions
@@ -39,6 +39,42 @@ In addition to providing an excellent model library, PaddleOCR 3.0 also provides easy-to-learn, easy-to-use tools
3939

4040

4141
## 📣 Recent updates
42+
43+
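🔥🔥2025.06.19: **PaddleOCR 3.0.2** released, including:
44+
45+
- **New Features:**
46+
- The default model download source has been changed from `BOS` to `HuggingFace`. Users can also set the environment variable `PADDLE_PDX_MODEL_SOURCE` to `BOS` to switch the model download source to Baidu Object Storage (BOS).
47+
- Added service invocation examples in six languages (C++, Java, Go, C#, Node.js, and PHP) for pipelines such as PP-OCRv5, PP-StructureV3, and PP-ChatOCRv4.
48+
- Improved the layout region sorting algorithm in the PP-StructureV3 pipeline, refining the sorting logic for complex vertical layouts to further improve sorting results on complex layouts.
49+
- Improved the model selection logic: when a language is specified but a model version is not, the latest model version that supports that language is selected automatically.
50+
- Set a default upper limit on the MKL-DNN cache size to prevent unbounded growth. Users can also configure the cache capacity.
51+
- Updated the default configuration for high-performance inference to support Paddle MKL-DNN acceleration, and improved the automatic configuration logic so that it makes smarter choices.
52+
- Adjusted the default device selection logic to take into account which computing devices the installed Paddle framework actually supports, making program behavior more intuitive.
53+
- Added an Android example for PP-OCRv5. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/deployment/on_device_deployment.html)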
54+
55+
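- **Bug Fixes:**
56+
- Fixed an issue where some CLI parameters of PP-StructureV3 did not take effect.
57+
- Fixed an issue where `export_paddlex_config_to_yaml` did not work correctly in some cases.
58+
- Fixed a mismatch between the actual behavior of `save_path` and its documented behavior.
59+
- Fixed multithreading errors that could occur when using MKL-DNN in basic serving deployment.
60+
- Fixed an incorrect channel order in image preprocessing for the Latex-OCR model.
61+
- Fixed an incorrect channel order when the text recognition module saves visualization images.
62+
- Fixed an incorrect channel order in table visualization results in PP-StructureV3.
63+
- Fixed a variable overflow when computing `overlap_ratio` in extremely rare cases in the PP-StructureV3 pipeline.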
64+
65+
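- **Documentation Improvements:**
66+
- Updated the description of the `enable_mkldnn` parameter in the documentation so that it more accurately reflects the program's actual behavior.
67+
- Fixed errors in the documentation's descriptions of the `lang` and `ocr_version` parameters.
68+
- Added instructions for exporting pipeline configuration files via the CLI.
69+
- Fixed missing columns in the PP-OCRv5 performance data table.
70+
- Polished the benchmark metrics for PP-StructureV3 under different configurations.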
71+
72+
- **Others:**
73+
- Relaxed version constraints on dependencies such as numpy and pandas, restoring support for Python 3.12.
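The first new-feature entry in these release notes says the download source can be switched to BOS through the `PADDLE_PDX_MODEL_SOURCE` environment variable. A minimal sketch of doing that from Python follows. The helper name `set_model_source` is hypothetical, and setting the variable before importing the library is an assumption about when the value is read, not something the notes state.

```python
import os


def set_model_source(source: str = "BOS") -> str:
    """Set the model download source for the current process.

    `set_model_source` is a hypothetical helper for illustration only.
    The variable name and the value "BOS" come from the release notes.
    """
    os.environ["PADDLE_PDX_MODEL_SOURCE"] = source
    return os.environ["PADDLE_PDX_MODEL_SOURCE"]


# Assumption: the variable should be set before the library is imported,
# so call the helper first and import afterwards.
set_model_source("BOS")
# from paddleocr import PaddleOCR  # import only after the variable is set
```

Exporting the variable in the shell before launching the program should have the same effect for that process.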
74+
75+
<details>
76+
<summary><strong>History Log</strong></summary>
77+
4278
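🔥🔥2025.06.05: **PaddleOCR 3.0.1** released, including:
4379

4480
- **Optimisation of certain models and model configurations:**
@@ -65,6 +101,9 @@ In addition to providing an excellent model library, PaddleOCR 3.0 also provides easy-to-learn, easy-to-use tools
65101
2. 💻 Native support for **ERNIE 4.5 Turbo**, with compatibility for large models deployed via PaddleNLP, Ollama, vLLM, and similar tools.
66102
3. 🤝 Integrated [PP-DocBee2](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/examples/ppdocbee2), supporting extraction and understanding of common complex-document information such as printed text, handwriting, seals, tables, and charts.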
67103

104+
[More logs](https://paddlepaddle.github.io/PaddleOCR/latest/update/update.html)
105+
106+
</details>
68107

69108
## ⚡ Quick Start
70109
### 1. Try It Online

README_en.md

Lines changed: 41 additions & 15 deletions
@@ -44,7 +44,46 @@ In addition to providing an outstanding model library, PaddleOCR 3.0 also offers
4444

4545
## 📣 Recent updates
4646

47-
#### **🔥🔥 2025.06.05: Release of PaddleOCR 3.0.1, includes:**
47+
#### 🔥🔥**2025.06.19: Release of PaddleOCR 3.0.2, includes:**
48+
49+
- **New Features:**
50+
51+
- The default download source has been changed from `BOS` to `HuggingFace`. Users can also change the environment variable `PADDLE_PDX_MODEL_SOURCE` to `BOS` to set the model download source back to Baidu Object Storage (BOS).
52+
- Added service invocation examples for six languages—C++, Java, Go, C#, Node.js, and PHP—for pipelines like PP-OCRv5, PP-StructureV3, and PP-ChatOCRv4.
53+
- Improved the layout partition sorting algorithm in the PP-StructureV3 pipeline, enhancing the sorting logic for complex vertical layouts to deliver better results.
54+
- Enhanced model selection logic: when a language is specified but a model version is not, the system will automatically select the latest model version supporting that language.
55+
- Set a default upper limit for MKL-DNN cache size to prevent unlimited growth, while also allowing users to configure cache capacity.
56+
- Updated default configurations for high-performance inference to support Paddle MKL-DNN acceleration and optimized the logic for automatic configuration selection for smarter choices.
57+
- Adjusted the logic for obtaining the default device to consider the actual support for computing devices by the installed Paddle framework, making program behavior more intuitive.
58+
- Added an Android example for PP-OCRv5. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/on_device_deployment.html).
59+
60+
- **Bug Fixes:**
61+
62+
- Fixed an issue with some CLI parameters in PP-StructureV3 not taking effect.
63+
- Resolved an issue where `export_paddlex_config_to_yaml` would not function correctly in certain cases.
64+
- Corrected the discrepancy between the actual behavior of `save_path` and its documentation description.
65+
- Fixed potential multithreading errors when using MKL-DNN in basic service deployment.
66+
- Corrected channel order errors in image preprocessing for the Latex-OCR model.
67+
- Fixed channel order errors in saving visualized images within the text recognition module.
68+
- Resolved channel order errors in visualized table results within the PP-StructureV3 pipeline.
69+
- Fixed an overflow issue in the calculation of `overlap_ratio` in extremely rare cases in the PP-StructureV3 pipeline.
70+
71+
- **Documentation Improvements:**
72+
73+
- Updated the description of the `enable_mkldnn` parameter in the documentation to accurately reflect the program's actual behavior.
74+
- Fixed errors in the documentation regarding the `lang` and `ocr_version` parameters.
75+
- Added instructions for exporting pipeline configuration files via the CLI.
76+
- Fixed missing columns in the performance data table for PP-OCRv5.
77+
- Refined benchmark metrics for PP-StructureV3 across different configurations.
78+
79+
- **Others:**
80+
81+
- Relaxed version restrictions on dependencies like numpy and pandas, restoring support for Python 3.12.
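The model selection entry above (language given, version omitted, latest supporting version chosen) can be sketched as follows. This only illustrates the described behavior and is not PaddleOCR's implementation: the version names, the `SUPPORTED_LANGS` table, and `select_version` are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical support table (version -> supported languages).
# These entries are placeholders, not PaddleOCR's real language coverage.
SUPPORTED_LANGS: Dict[str, Set[str]] = {
    "v3": {"en", "fr", "de"},
    "v4": {"en", "ch"},
    "v5": {"en", "ch", "japan"},
}


def version_key(version: str) -> int:
    """Order versions by their numeric part, e.g. 'v5' -> 5."""
    return int(version.lstrip("v"))


def select_version(lang: str, version: Optional[str] = None) -> str:
    """Return the requested version, or the newest one that supports `lang`."""
    if version is not None:
        if lang not in SUPPORTED_LANGS.get(version, set()):
            raise ValueError(f"{version} does not support language {lang!r}")
        return version
    candidates = [v for v, langs in SUPPORTED_LANGS.items() if lang in langs]
    if not candidates:
        raise ValueError(f"no model version supports language {lang!r}")
    return max(candidates, key=version_key)


print(select_version("en"))        # newest version that lists "en"
print(select_version("fr"))        # falls back to an older version
print(select_version("ch", "v4"))  # an explicit version is respected
```

With this placeholder table, a language covered only by an older version still resolves, while an explicit version is never silently overridden.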
82+
83+
<details>
84+
<summary><strong>History Log</strong></summary>
85+
86+
#### **2025.06.05: Release of PaddleOCR 3.0.1, includes:**
4887

4988
- **Optimisation of certain models and model configurations:**
5089
- Updated the default model configuration for PP-OCRv5, changing both detection and recognition from mobile to server models. To improve default performance in most scenarios, the parameter `limit_side_len` in the configuration has been changed from 736 to 64.
@@ -68,20 +107,7 @@ In addition to providing an outstanding model library, PaddleOCR 3.0 also offers
68107
2. 💻 Native support for **ERNIE 4.5 Turbo**, with compatibility for large-model deployments via PaddleNLP, Ollama, vLLM, and more.
69108
3. 🤝 Integrated [PP-DocBee2](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/examples/ppdocbee2), enabling extraction and understanding of printed text, handwriting, seals, tables, charts, and other common elements in complex documents.
70109

71-
<details>
72-
<summary><strong>The history of updates </strong></summary>
73-
74-
75-
- 🔥🔥2025.03.07: Release of **PaddleOCR v2.10**, including:
76-
77-
- **12 new self-developed models:**
78-
- **[Layout Detection series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/layout_detection.html)**(3 models): PP-DocLayout-L, M, and S -- capable of detecting 23 common layout types across diverse document formats(papers, reports, exams, books, magazines, contracts, etc.) in English and Chinese. Achieves up to **90.4% mAP@0.5** , and lightweight features can process over 100 pages per second.
79-
- **[Formula Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/formula_recognition.html)**(2 models): PP-FormulaNet-L and S -- supports recognition of 50,000+ LaTeX expressions, handling both printed and handwritten formulas. PP-FormulaNet-L offers **6% higher accuracy** than comparable models; PP-FormulaNet-S is 16x faster while maintaining similar accuracy.
80-
- **[Table Structure Recognition series](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_structure_recognition.html)**(2 models): SLANeXt_wired and SLANeXt_wireless -- newly developed models with **6% accuracy improvement** over SLANet_plus in complex table recognition.
81-
- **[Table Classification](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/table_classification.html)**(1 model):
82-
PP-LCNet_x1_0_table_cls -- an ultra-lightweight classifier for wired and wireless tables.
83-
84-
[Learn more](https://paddlepaddle.github.io/PaddleOCR/latest/en/update.html)
110+
[History Log](https://paddlepaddle.github.io/PaddleOCR/latest/en/update/update.html)
85111

86112
</details>
87113

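Three of the 3.0.2 bug fixes listed in this file concern channel order. As a generic illustration of that class of bug (not the actual PaddleOCR fix), the sketch below shows how a pixel stored in blue-green-red order reads as the wrong colour when interpreted as red-green-blue, and how swapping the first and last channels corrects it. The function name and the sample pixels are made up for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def swap_red_blue(image: Image) -> Image:
    """Reverse the channel order of every pixel (BGR <-> RGB)."""
    return [[(c, b, a) for (a, b, c) in row] for row in image]


# One row with a pure-blue and a pure-red pixel, stored in BGR order.
bgr_image: Image = [[(255, 0, 0), (0, 0, 255)]]

# Read as RGB without conversion, the first pixel would look red, not blue.
rgb_image = swap_red_blue(bgr_image)
print(rgb_image)
```

Applying the swap twice returns the original image, so a conversion that is skipped or applied one time too many produces the same visible symptom.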
docs/update/update.en.md

Lines changed: 47 additions & 12 deletions
@@ -6,26 +6,61 @@ hide:
66
---
77

88
### Recent Updates
9+
#### **🔥🔥 2025.06.19: Release of PaddleOCR v3.0.2, which includes:**
10+
11+
- **New Features:**
12+
13+
- The default download source has been changed from `BOS` to `HuggingFace`. Users can also change the environment variable `PADDLE_PDX_MODEL_SOURCE` to `BOS` to set the model download source back to Baidu Object Storage (BOS).
14+
- Added service invocation examples for six languages—C++, Java, Go, C#, Node.js, and PHP—for pipelines like PP-OCRv5, PP-StructureV3, and PP-ChatOCRv4.
15+
- Improved the layout partition sorting algorithm in the PP-StructureV3 pipeline, enhancing the sorting logic for complex vertical layouts to deliver better results.
16+
- Enhanced model selection logic: when a language is specified but a model version is not, the system will automatically select the latest model version supporting that language.
17+
- Set a default upper limit for MKL-DNN cache size to prevent unlimited growth, while also allowing users to configure cache capacity.
18+
- Updated default configurations for high-performance inference to support Paddle MKL-DNN acceleration and optimized the logic for automatic configuration selection for smarter choices.
19+
- Adjusted the logic for obtaining the default device to consider the actual support for computing devices by the installed Paddle framework, making program behavior more intuitive.
20+
- Added an Android example for PP-OCRv5. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/on_device_deployment.html).
21+
22+
- **Bug Fixes:**
23+
24+
- Fixed an issue with some CLI parameters in PP-StructureV3 not taking effect.
25+
- Resolved an issue where `export_paddlex_config_to_yaml` would not function correctly in certain cases.
26+
- Corrected the discrepancy between the actual behavior of `save_path` and its documentation description.
27+
- Fixed potential multithreading errors when using MKL-DNN in basic service deployment.
28+
- Corrected channel order errors in image preprocessing for the Latex-OCR model.
29+
- Fixed channel order errors in saving visualized images within the text recognition module.
30+
- Resolved channel order errors in visualized table results within the PP-StructureV3 pipeline.
31+
- Fixed an overflow issue in the calculation of `overlap_ratio` in extremely rare cases in the PP-StructureV3 pipeline.
32+
33+
- **Documentation Improvements:**
34+
35+
- Updated the description of the `enable_mkldnn` parameter in the documentation to accurately reflect the program's actual behavior.
36+
- Fixed errors in the documentation regarding the `lang` and `ocr_version` parameters.
37+
- Added instructions for exporting pipeline configuration files via the CLI.
38+
- Fixed missing columns in the performance data table for PP-OCRv5.
39+
- Refined benchmark metrics for PP-StructureV3 pipeline across different configurations.
40+
41+
- **Others:**
42+
43+
- Relaxed version restrictions on dependencies like numpy and pandas, restoring support for Python 3.12.
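The MKL-DNN cache entry above describes a default upper bound with a user-configurable capacity. The notes do not say how the bound is enforced, so the following is only a generic sketch of a size-capped cache with least-recently-used eviction. The class name `BoundedCache` and the default capacity of 10 are assumptions for illustration, not values taken from Paddle.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class BoundedCache:
    """A cache that never holds more than `capacity` entries.

    Hypothetical illustration: the least recently used entry is evicted
    once the limit is exceeded, so the cache cannot grow without bound.
    """

    def __init__(self, capacity: int = 10) -> None:  # default is an assumption
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry

    def __len__(self) -> int:
        return len(self._items)


cache = BoundedCache(capacity=2)  # the user overrides the default bound
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" becomes the most recently used entry
cache.put("c", 3)   # exceeds the bound, so "b" is evicted
print(len(cache), cache.get("a"), cache.get("b"))
```

The trade-off is the usual one for bounded caches: memory stays flat, at the cost of recomputing entries that were evicted.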
944

1045
#### **🔥🔥 2025.06.05: Release of PaddleOCR v3.0.1, which includes:**
1146

1247
- **Optimisation of certain models and model configurations:**
13-
- Updated the default model configuration for PP-OCRv5, changing both detection and recognition from mobile to server models. To improve default performance in most scenarios, the parameter `limit_side_len` in the configuration has been changed from 736 to 64.
14-
- Added a new text line orientation classification model `PP-LCNet_x1_0_textline_ori` with an accuracy of 99.42%. The default text line orientation classifier for OCR, PP-StructureV3, and PP-ChatOCRv4 pipelines has been updated to this model.
15-
- Optimised the text line orientation classification model `PP-LCNet_x0_25_textline_ori`, improving accuracy by 3.3 percentage points to a current accuracy of 98.85%.
48+
- Updated the default model configuration for PP-OCRv5, changing both detection and recognition from mobile to server models. To improve default performance in most scenarios, the parameter `limit_side_len` in the configuration has been changed from 736 to 64.
49+
- Added a new text line orientation classification model `PP-LCNet_x1_0_textline_ori` with an accuracy of 99.42%. The default text line orientation classifier for OCR, PP-StructureV3, and PP-ChatOCRv4 pipelines has been updated to this model.
50+
- Optimised the text line orientation classification model `PP-LCNet_x0_25_textline_ori`, improving accuracy by 3.3 percentage points to a current accuracy of 98.85%.
1651

1752
- **Optimisation of issues present in version 3.0.0:**
18-
- **Improved CLI usage experience:** When using the PaddleOCR CLI without passing any parameters, a usage prompt is now provided.
19-
- **New parameters added:** PP-ChatOCRv3 and PP-StructureV3 now support the `use_textline_orientation` parameter.
20-
- **CPU inference speed optimisation:** All pipeline CPU inferences now enable MKL-DNN by default.
21-
- **Support for C++ inference:** The detection and recognition concatenation part of PP-OCRv5 now supports C++ inference.
53+
- **Improved CLI usage experience:** When using the PaddleOCR CLI without passing any parameters, a usage prompt is now provided.
54+
- **New parameters added:** PP-ChatOCRv3 and PP-StructureV3 now support the `use_textline_orientation` parameter.
55+
- **CPU inference speed optimisation:** All pipeline CPU inferences now enable MKL-DNN by default.
56+
- **Support for C++ inference:** The detection and recognition concatenation part of PP-OCRv5 now supports C++ inference.
2257

2358
- **Fixes for issues present in version 3.0.0:**
24-
- Fixed an issue where PP-StructureV3 encountered CPU inference errors due to the inability to use MKL-DNN with formula and table recognition models.
25-
- Fixed an issue where GPU environments encountered the error `FatalError: Process abort signal is detected by the operating system` during inference.
26-
- Fixed type hint issues in some Python 3.8 environments.
27-
- Fixed the issue where the method `PPStructureV3.concatenate_markdown_pages` was missing.
28-
- Fixed an issue where specifying both `lang` and `model_name` when instantiating `paddleocr.PaddleOCR` resulted in `model_name` being ineffective.
59+
- Fixed an issue where PP-StructureV3 encountered CPU inference errors due to the inability to use MKL-DNN with formula and table recognition models.
60+
- Fixed an issue where GPU environments encountered the error `FatalError: Process abort signal is detected by the operating system` during inference.
61+
- Fixed type hint issues in some Python 3.8 environments.
62+
- Fixed the issue where the method `PPStructureV3.concatenate_markdown_pages` was missing.
63+
- Fixed an issue where specifying both `lang` and `model_name` when instantiating `paddleocr.PaddleOCR` resulted in `model_name` being ineffective.
2964

3065
#### **🔥🔥 2025.05.20: PaddleOCR 3.0 Official Release Highlights**
3166
