[wip] update v2.1 readme #736

Merged
merged 7 commits into from
Oct 16, 2024
15 changes: 15 additions & 0 deletions CITATION.cff
@@ -0,0 +1,15 @@
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Paddle Multimodal Integration and eXploration
message: >-
  If you use this repository, please cite it using the metadata from this file.
type: software
authors:
  - given-names: PaddleMIX Authors
repository-code: 'https://github.com/PaddlePaddle/PaddleMIX'
repository: 'https://github.com/PaddlePaddle/PaddleMIX'
keywords:
  - paddlemix
license: Apache-2.0
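
The CITATION.cff metadata above maps fairly directly onto a BibTeX `@software` entry. The following is a minimal sketch of that mapping, not the converter GitHub or cffinit actually uses; the `cff_to_bibtex` helper is hypothetical, the metadata is hard-coded rather than parsed from YAML, and author names are simply joined as given name followed by family name.

```python
def cff_to_bibtex(meta, key):
    """Render a minimal @software BibTeX entry from CFF-style metadata.

    Hypothetical helper for illustration; it handles only the fields
    present in the CITATION.cff file shown in this diff.
    """
    authors = " and ".join(
        # CFF stores given and family names separately; join whichever exist.
        " ".join(p for p in (a.get("given-names"), a.get("family-names")) if p)
        for a in meta["authors"]
    )
    fields = {
        "author": "{" + authors + "}",
        "license": "{" + meta["license"] + "}",
        "title": "{{" + meta["title"] + "}}",  # double braces preserve capitalization
        "url": "{" + meta["repository-code"] + "}",
    }
    body = ",\n".join(f"  {name} = {value}" for name, value in fields.items())
    return f"@software{{{key},\n{body}\n}}"


# Metadata copied by hand from the CITATION.cff content in this diff.
meta = {
    "title": "Paddle Multimodal Integration and eXploration",
    "authors": [{"given-names": "PaddleMIX Authors"}],
    "repository-code": "https://github.com/PaddlePaddle/PaddleMIX",
    "license": "Apache-2.0",
}

entry = cff_to_bibtex(meta, key="PaddleMIX_Authors_Paddle_Multimodal_Integration")
print(entry)
```

A real conversion would parse the YAML file and handle optional fields such as `version` and `date-released`, which this sketch omits.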
9 changes: 8 additions & 1 deletion README.md
@@ -24,7 +24,7 @@ PaddleMIX是基于飞桨的多模态大模型开发套件,聚合图像、文
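**🔥2024.10.11 Released PaddleMIX v2.1**
* Supports [PaddleNLP 3.0 beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0), giving early access to its latest features.
* Adds cutting-edge models including [Qwen2-VL](./paddlemix/examples/qwen2_vl/), [InternVL2](./paddlemix/examples/internvl2/), and [Stable Diffusion 3 (SD3)](https://github.com/PaddlePaddle/PaddleMIX/blob/develop/ppdiffusers/examples/dreambooth/README_sd3.md).
-* Releases the in-house multimodal data capability tagging model [PP-InsCapTagger](./paddlemix/datacopilot/example/pp_inscaptagger/), which can be used for data analysis and filtering. Experiments show that it can cut the data volume by 50% while preserving model performance, greatly improving training efficiency.
+* DataCopilot releases the in-house multimodal data capability tagging model [PP-InsCapTagger](./paddlemix/datacopilot/example/pp_inscaptagger/), which can be used for data analysis and filtering. Experiments show that it can cut the data volume by 50% while preserving model performance, greatly improving training efficiency.
* The multimodal large models InternVL2, LLaVA, SD3, and SDXL are adapted to Ascend 910B, providing training and inference on domestically produced compute chips.

**2024.09.11 Update**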
@@ -145,6 +145,7 @@ sh build_env.sh
<li><a href="paddlemix/examples/evaclip">EVA-CLIP</a></li>
<li><a href="paddlemix/examples/llava">LLaVA</a></li>
<li><a href="paddlemix/examples/llava">LLaVA-1.5</a></li>
<li><a href="paddlemix/examples/llava">LLaVA-1.6</a></li>
<li><a href="paddlemix/examples/llava">LLaVA-NeXT</a></li>
<li><a href="paddlemix/examples/qwen_vl">Qwen-VL</a></li>
<li><a href="paddlemix/examples/qwen2_vl">Qwen2-VL</a></li>
@@ -169,13 +170,19 @@ sh build_env.sh
<ul>
<li><a href="paddlemix/examples/imagebind">ImageBind</a></li>
</ul>
</ul>
<li><b>Data Analysis</b></li>
<ul>
<li><a href="./paddlemix/datacopilot/example/pp_inscaptagger/">PP-InsCapTagger</a></li>
</ul>
</td>
<td>
<ul>
</ul>
<li><b>Text-to-Image</b></li>
<ul>
<li><a href="ppdiffusers/examples/stable_diffusion">Stable Diffusion</a></li>
<li><a href="ppdiffusers/examples/dreambooth/README_sd3.md">Stable Diffusion 3 (SD3)</a></li>
<li><a href="ppdiffusers/examples/controlnet">ControlNet</a></li>
<li><a href="ppdiffusers/examples/t2i-adapter">T2I-Adapter</a></li>
<li><a href="ppdiffusers/examples/text_to_image_laion400m">LDM</a></li>
29 changes: 29 additions & 0 deletions paddlemix/datacopilot/example/pp_inscaptagger/readme.md
@@ -110,3 +110,32 @@ LLaVA v1.5模型SFT阶段训练时,使用的指令微调数据集为[LLaVA-Ins
| llava-1.5-7b <br> (tag 50%/our) | 70.24 | 57.12 | 78.32 | 62.14 | 37.11 | 1476 <br> 338 |

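After tagging and optimization with PP-InsCapTagger, training on 50% of the dataset performs roughly on par with training on the original dataset, greatly improving model training efficiency.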
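
The idea of using tags to shrink a training set can be illustrated with a small sketch. This is not the selection procedure used by PaddleMIX or PP-InsCapTagger, which this diff does not show; the `select_by_tag_rarity` function, the `tags` field, and the inverse-frequency scoring are all assumptions made purely for illustration.

```python
from collections import Counter


def select_by_tag_rarity(samples, keep_ratio=0.5):
    """Keep the `keep_ratio` fraction of samples whose tags are rarest overall.

    Hypothetical heuristic: each tag contributes 1 / (its frequency in the
    dataset), so samples carrying uncommon capabilities are kept first.
    """
    tag_counts = Counter(tag for s in samples for tag in s["tags"])

    def score(sample):
        # Untagged samples score 0 and are dropped first.
        return sum(1.0 / tag_counts[tag] for tag in sample["tags"])

    keep = max(1, round(len(samples) * keep_ratio))
    # sorted() is stable, so ties keep their original dataset order.
    ranked = sorted(samples, key=score, reverse=True)
    return ranked[:keep]


# Toy dataset; the tag names are made up.
samples = [
    {"id": 0, "tags": ["ocr", "counting"]},
    {"id": 1, "tags": ["ocr"]},
    {"id": 2, "tags": ["ocr"]},
    {"id": 3, "tags": ["spatial reasoning"]},
]

subset = select_by_tag_rarity(samples)
print([s["id"] for s in subset])  # prints [0, 3]
```

Here the two samples that only carry the common `ocr` tag are dropped, while the samples covering the rarer tags survive, so the halved set still covers every tag.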



## Citation
If you use `PP-InsCapTagger` in your work, please cite it as follows:

<details>
<summary> bibtex </summary>

```bibtex

@software{PaddleMIX_Authors_Paddle_Multimodal_Integration,
author = {PaddleMIX Authors},
license = {Apache-2.0},
title = {{Paddle Multimodal Integration and eXploration}},
url = {https://github.com/PaddlePaddle/PaddleMIX}
}

@software{Lv_Instance_Capability_Tagger_2024,
author = {Lv, Wenyu and Huang, Kui and Zhao, Yian},
license = {Apache-2.0},
month = oct,
title = {{Instance Capability Tagger: Enhancing Multimodal Data Efficiency for Model Training}},
url = {https://github.com/lyuwenyu/PP-InsCapTagger},
version = {1.0},
year = {2024}
}
```
</details>
2 changes: 1 addition & 1 deletion paddlemix/datacopilot/nn/inscaptagger.py
@@ -39,7 +39,7 @@ def __init__(self, model_name_or_path, max_new_tokens = 4096, dtype='float16') -


     def init_model(self, model_name_or_path, max_new_tokens, dtype):
-        tokenizer = AutoTokenizerMIX.from_pretrained(model_name_or_path, use_fast=False)
+        tokenizer = AutoTokenizerMIX.from_pretrained(model_name_or_path)
         model_config = AutoConfigMIX.from_pretrained(model_name_or_path)
         model = AutoModelMIX.from_pretrained(model_name_or_path, dtype=dtype)
         model.eval()