[Tencent Hunyuan Team] Add HunyuanDiT-v1.2 Support #8747

Merged on Jul 1, 2024 (4 commits)

gnobitab (Contributor) commented Jul 1, 2024

We made small changes to hunyuan_transformer_2d.py and embeddings.py to support Hunyuan-DiT v1.2 inference.
The changes add logic to skip style_embedder and image_meta_size, since neither has any effect in the current inference framework.
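As a rough illustration of the idea (not the code from this PR), the sketch below gates the style embedding and the image meta size input behind a single flag. The flag name `use_style_cond_and_image_meta_size` and the `style_embedder` name come from this thread; the class `ToyTimeExtraEmbedding`, the other parameter names, and the placeholder values are hypothetical.

```python
class ToyTimeExtraEmbedding:
    """Toy stand-in for the extra conditioning embedding; NOT the diffusers class."""

    def __init__(self, use_style_cond_and_image_meta_size=True):
        self.use_style_cond_and_image_meta_size = use_style_cond_and_image_meta_size
        # Parameters every version needs (hypothetical names, placeholder values).
        self.params = {
            "timestep_embedder.weight": [0.0],
            "pooler.weight": [0.0],
        }
        if use_style_cond_and_image_meta_size:
            # Only created when the flag is on, so a checkpoint saved with the
            # flag off never contains this key.
            self.params["style_embedder.weight"] = [0.0]

    def conditioning_inputs(self, timestep, pooled_text, image_meta_size=None, style=None):
        """Return the (name, value) pairs that would be fused into the embedding."""
        inputs = [("timestep", timestep), ("pooled_text", pooled_text)]
        if self.use_style_cond_and_image_meta_size:
            inputs.append(("image_meta_size", image_meta_size))
            inputs.append(("style", style))
        return inputs


v1_2 = ToyTimeExtraEmbedding(use_style_cond_and_image_meta_size=False)
print(sorted(v1_2.params))
print([name for name, _ in v1_2.conditioning_inputs(timestep=10, pooled_text="text")])
```

With the flag off, the toy module neither creates the style embedding nor consumes the two extra inputs, which is the behavior the description above attributes to v1.2.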

Please have a look. Thank you.

Test script:

```python
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16
)
pipe.to("cuda")

# NOTE: HunyuanDiT supports both Chinese and English prompts.
prompt = "一个宇航员在骑马"
# prompt = "An astronaut riding a horse"
generator = torch.Generator(device="cuda").manual_seed(0)  # fixed seed for reproducibility
image = pipe(
    height=1024, width=1024, prompt=prompt, generator=generator,
    num_inference_steps=50, guidance_scale=5.0,
).images[0]

image.save("img.png")
```

@s9anus98a

The config attributes {'use_style_cond_and_image_meta_size': False} were passed to HunyuanDiT2DModel, but are not expected and will be ignored. Please verify your config.json configuration file.


ValueError Traceback (most recent call last)

in <cell line: 4>()
2 from diffusers import HunyuanDiTPipeline
3
----> 4 pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16)
5 pipe.to("cuda")
6

4 frames

/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
748 missing_keys = set(model.state_dict().keys()) - set(state_dict.keys())
749 if len(missing_keys) > 0:
--> 750 raise ValueError(
751 f"Cannot load {cls} from {pretrained_model_name_or_path} because the following keys are"
752 f" missing: \n {', '.join(missing_keys)}. \n Please make sure to pass"

ValueError: Cannot load <class 'diffusers.models.transformers.hunyuan_transformer_2d.HunyuanDiT2DModel'> from /root/.cache/huggingface/hub/models--Tencent-Hunyuan--HunyuanDiT-v1.2-Diffusers/snapshots/bf329a9a93c2346d0986d91263207d3226d2858d/transformer because the following keys are missing:
time_extra_emb.style_embedder.weight.
Please make sure to pass low_cpu_mem_usage=False and device_map=None if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
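The check that raises here is the set difference visible in the traceback (`set(model.state_dict().keys()) - set(state_dict.keys())`). A minimal sketch of why it fires, assuming the installed diffusers ignores the new config flag and therefore still builds the style embedder: the helper `find_missing_keys` and every key name except `time_extra_emb.style_embedder.weight` are hypothetical.

```python
def find_missing_keys(model_keys, checkpoint_keys):
    """Keys the instantiated model expects but the checkpoint does not provide."""
    return set(model_keys) - set(checkpoint_keys)


# Assumption for illustration: the v1.2 checkpoint ships without a style embedder.
checkpoint_keys = {"time_extra_emb.timestep_embedder.weight"}

# A model class that ignores the flag still creates the style embedder.
old_model_keys = checkpoint_keys | {"time_extra_emb.style_embedder.weight"}
# A model class that honors the flag does not.
new_model_keys = set(checkpoint_keys)

print(find_missing_keys(old_model_keys, checkpoint_keys))  # {'time_extra_emb.style_embedder.weight'}
print(find_missing_keys(new_model_keys, checkpoint_keys))  # set()
```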

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

yiyixuxu (Collaborator) commented Jul 1, 2024

Don't forget to run make style.

gnobitab (Contributor, Author) commented Jul 1, 2024

> ValueError: Cannot load <class 'diffusers.models.transformers.hunyuan_transformer_2d.HunyuanDiT2DModel'> [...] because the following keys are missing: time_extra_emb.style_embedder.weight.

Please use the code provided in this PR to load the model.

xingchaoliu and others added 2 commits July 1, 2024 15:09
yiyixuxu merged commit a3904d7 into huggingface:main Jul 1, 2024
14 of 15 checks passed
@s9anus98a

> Please use the code provided in the current PR to load the model

What makes HunyuanDiT (HYDIT) so slow compared with other DiT models at 50 steps, such as SD3 and PixArt-Sigma?

neonhuang commented Jul 9, 2024

Hello, I still get this error:

    pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 881, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_loading_utils.py", line 703, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py", line 750, in from_pretrained
    raise ValueError(
ValueError: Cannot load <class 'diffusers.models.transformers.hunyuan_transformer_2d.HunyuanDiT2DModel'> from /root/.cache/huggingface/hub/models--Tencent-Hunyuan--HunyuanDiT-v1.2-Diffusers/snapshots/5e96094e0ad19e7f475de8711f03634ca0ccc40c/transformer because the following keys are missing:
 time_extra_emb.style_embedder.weight.
 Please make sure to pass low_cpu_mem_usage=False and device_map=None if you want to randomly initialize those weights or else make sure your checkpoint file is correct.

yiyixuxu (Collaborator) commented Jul 9, 2024

@neonhuang
Hi! I think you need to install diffusers from source and use the most recent version in order to load the v1.2 checkpoint.
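Installing from source is typically done with `pip install git+https://github.com/huggingface/diffusers`. One way to tell whether an installed build already understands the new config flag is to inspect the model constructor's signature. The sketch below shows the technique on two toy classes; `accepts_config_key`, `OldToyModel`, and `NewToyModel` are hypothetical names, and applying the helper to `HunyuanDiT2DModel` is an untested assumption.

```python
import inspect


def accepts_config_key(cls, key):
    """Return True if cls.__init__ declares a parameter named `key`."""
    return key in inspect.signature(cls.__init__).parameters


class OldToyModel:
    """Stand-in for a model class from before the flag was added."""

    def __init__(self, num_layers=40):
        self.num_layers = num_layers


class NewToyModel:
    """Stand-in for a model class that accepts the flag."""

    def __init__(self, num_layers=40, use_style_cond_and_image_meta_size=True):
        self.num_layers = num_layers
        self.use_style_cond_and_image_meta_size = use_style_cond_and_image_meta_size


flag = "use_style_cond_and_image_meta_size"
print(accepts_config_key(OldToyModel, flag))  # False
print(accepts_config_key(NewToyModel, flag))  # True

# With diffusers installed, the same check might look like this (assumption):
# from diffusers import HunyuanDiT2DModel
# print(accepts_config_key(HunyuanDiT2DModel, flag))
```

If the check returns False for the real class, the warning "The config attributes {'use_style_cond_and_image_meta_size': False} were passed ... but are not expected and will be ignored" quoted earlier in the thread would be the expected symptom.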

@neonhuang

> @neonhuang Hi! I think you need to install diffusers from source and use the most recent version in order to load the v1.2 checkpoint

The model I am using was downloaded automatically by pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16).
models--Tencent-Hunyuan--HunyuanDiT-v1.2-Diffusers# tree -L 4
.
├── blobs
│   ├── 05592dd7745eb8b9b7e19753208af804f6393e45
│   ├── 2c0c539ab8e8fba3877cc94bc483e427f74c525f817a809b028ebc8d96d75a94
│   ├── 36c538ea7c70fa3060bfdedfd8c8efab370b699c
│   ├── 3b49976cb1fe40da28a600d783f4686024c97eb01f224c54305cce55ddcd8a5e
│   ├── 3b6e6f54f337932574c05d1377c08aaa9d2062a0
│   ├── 40e63298d71c4975e0ac423c78a80db851cf975f
│   ├── 49a263c94c555ebd2d2e3f866d38aed4032aed5c
│   ├── 5d9f5f2314c1932c9dba328bcddeb8a42925556b
│   ├── 6246906805d02aca01714c71e4c8d77b69a7a131
│   ├── 742efa052c5730728cb140208c25c73268749ad4
│   ├── 8553f834bfa38c7d2c56f3dc65d5e0d943f006c4
│   ├── 8af691cadb78047d55721259355d708e87ddbba1b7845df9377d9a5ae917b45d
│   ├── 98a14dc6fe8d71c83576f135a87c61a16561c9c080abba418d2cc976ee034f88
│   ├── 9bbecc17cabbcbd3112c14d6982b51403b264bfa
│   ├── b2eb1a3fb1d8809a06b9af8aaab746c9a35b468f
│   ├── c57897ebb275499b3c6d5284f1187860f86741e6
│   ├── c6c6348af2cb4d5852fe51102ce39605903dbe7925c005cf8995506cc21ea914
│   └── ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
├── refs
│   └── main
└── snapshots
    └── 5e96094e0ad19e7f475de8711f03634ca0ccc40c
        ├── model_index.json -> ../../blobs/8553f834bfa38c7d2c56f3dc65d5e0d943f006c4
        ├── scheduler
        │   └── scheduler_config.json -> ../../../blobs/40e63298d71c4975e0ac423c78a80db851cf975f
        ├── text_encoder
        │   ├── config.json -> ../../../blobs/49a263c94c555ebd2d2e3f866d38aed4032aed5c
        │   └── model.safetensors -> ../../../blobs/c6c6348af2cb4d5852fe51102ce39605903dbe7925c005cf8995506cc21ea914
        ├── text_encoder_2
        │   ├── config.json -> ../../../blobs/5d9f5f2314c1932c9dba328bcddeb8a42925556b
        │   ├── model-00001-of-00002.safetensors -> ../../../blobs/2c0c539ab8e8fba3877cc94bc483e427f74c525f817a809b028ebc8d96d75a94
        │   ├── model-00002-of-00002.safetensors -> ../../../blobs/3b49976cb1fe40da28a600d783f4686024c97eb01f224c54305cce55ddcd8a5e
        │   └── model.safetensors.index.json -> ../../../blobs/b2eb1a3fb1d8809a06b9af8aaab746c9a35b468f
        ├── tokenizer
        │   ├── special_tokens_map.json -> ../../../blobs/9bbecc17cabbcbd3112c14d6982b51403b264bfa
        │   ├── tokenizer_config.json -> ../../../blobs/c57897ebb275499b3c6d5284f1187860f86741e6
        │   └── vocab.txt -> ../../../blobs/6246906805d02aca01714c71e4c8d77b69a7a131
        ├── tokenizer_2
        │   ├── special_tokens_map.json -> ../../../blobs/05592dd7745eb8b9b7e19753208af804f6393e45
        │   ├── spiece.model -> ../../../blobs/ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
        │   └── tokenizer_config.json -> ../../../blobs/3b6e6f54f337932574c05d1377c08aaa9d2062a0
        ├── transformer
        │   ├── config.json -> ../../../blobs/36c538ea7c70fa3060bfdedfd8c8efab370b699c
        │   └── diffusion_pytorch_model.safetensors -> ../../../blobs/8af691cadb78047d55721259355d708e87ddbba1b7845df9377d9a5ae917b45d
        └── vae
            ├── config.json -> ../../../blobs/742efa052c5730728cb140208c25c73268749ad4
            └── diffusion_pytorch_model.safetensors -> ../../../blobs/98a14dc6fe8d71c83576f135a87c61a16561c9c080abba418d2cc976ee034f88
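A downloaded snapshot like this one can be checked independently of the installed diffusers version by reading transformer/config.json directly. The sketch below is a minimal example under that assumption; the helper name `read_style_flag` is hypothetical, and the demo runs against a throwaway directory rather than the real cache.

```python
import json
import tempfile
from pathlib import Path


def read_style_flag(snapshot_dir):
    """Return use_style_cond_and_image_meta_size from transformer/config.json.

    Returns None when the key is absent from the config.
    """
    config_path = Path(snapshot_dir) / "transformer" / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return config.get("use_style_cond_and_image_meta_size")


# Demo on a temporary directory that mimics the snapshot layout.
with tempfile.TemporaryDirectory() as tmp:
    transformer_dir = Path(tmp) / "transformer"
    transformer_dir.mkdir()
    (transformer_dir / "config.json").write_text(
        json.dumps({"use_style_cond_and_image_meta_size": False}), encoding="utf-8"
    )
    demo_flag = read_style_flag(tmp)

print(demo_flag)  # False
```

For the real cache, `snapshot_dir` would be the snapshot folder named in the traceback, for example the `snapshots/5e96094e0ad19e7f475de8711f03634ca0ccc40c` directory. A value of False there, combined with the missing-key error, points at the installed library rather than at a corrupted download.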

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
* add v1.2 support

---------

Co-authored-by: xingchaoliu <xingchaoliu@tencent.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
5 participants