[Tencent Hunyuan Team] Add HunyuanDiT-v1.2 Support #8747

gnobitab · 2024-07-01T04:08:02Z

We made small changes to hunyuan_transformer_2d.py and embeddings.py to support Hunyuan-DiT v1.2 inference.
The change adds logic to skip style_embedder and image_meta_size, which have no effect in the current inference framework anyway.

Please have a look. Thank you.
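
To illustrate the idea of gating the extra conditioning behind a flag, here is a minimal pure-Python sketch. It is not the code from this PR: the function name build_extra_condition and the plain-list inputs are hypothetical, and only the flag name use_style_cond_and_image_meta_size comes from the model config reported in this thread.

```python
def build_extra_condition(
    pooled_text,
    image_meta_size=None,
    style=None,
    use_style_cond_and_image_meta_size=True,
):
    """Concatenate conditioning vectors (toy version using plain lists).

    Hypothetical helper: the real model works on tensors and learned
    embeddings; this only shows the control flow behind the flag.
    """
    parts = [list(pooled_text)]
    if use_style_cond_and_image_meta_size:
        # v1.0/v1.1-style path: both extra inputs are required.
        if image_meta_size is None or style is None:
            raise ValueError("image_meta_size and style are required when the flag is on")
        parts.append(list(image_meta_size))
        parts.append(list(style))
    # v1.2-style path (flag off): only the pooled text condition is used.
    return [value for part in parts for value in part]


with_extras = build_extra_condition([0.1, 0.2], image_meta_size=[1024, 1024], style=[0])
text_only = build_extra_condition([0.1, 0.2], use_style_cond_and_image_meta_size=False)
print(len(with_extras), len(text_only))  # 5 2
```

With the flag off, the style and size inputs are never read, so a checkpoint trained without them needs no weights for those branches.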

Test script:

import torch
from diffusers import HunyuanDiTPipeline

# Load the v1.2 checkpoint in half precision and move it to the GPU.
pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16)
pipe.to("cuda")

# NOTE: HunyuanDiT supports both Chinese and English prompts.
prompt = "一个宇航员在骑马"
# prompt = "An astronaut riding a horse"
generator = torch.Generator(device="cuda").manual_seed(0)
image = pipe(height=1024, width=1024, prompt=prompt, generator=generator, num_inference_steps=50, guidance_scale=5.0).images[0]

image.save("img.png")

s9anus98a · 2024-07-01T05:36:03Z

The config attributes {'use_style_cond_and_image_meta_size': False} were passed to HunyuanDiT2DModel, but are not expected and will be ignored. Please verify your config.json configuration file.

ValueError Traceback (most recent call last)

in <cell line: 4>()
2 from diffusers import HunyuanDiTPipeline
3
----> 4 pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16)
5 pipe.to("cuda")
6

4 frames

/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
748 missing_keys = set(model.state_dict().keys()) - set(state_dict.keys())
749 if len(missing_keys) > 0:
--> 750 raise ValueError(
751 f"Cannot load {cls} from {pretrained_model_name_or_path} because the following keys are"
752 f" missing: \n {', '.join(missing_keys)}. \n Please make sure to pass"

ValueError: Cannot load <class 'diffusers.models.transformers.hunyuan_transformer_2d.HunyuanDiT2DModel'> from /root/.cache/huggingface/hub/models--Tencent-Hunyuan--HunyuanDiT-v1.2-Diffusers/snapshots/bf329a9a93c2346d0986d91263207d3226d2858d/transformer because the following keys are missing:
time_extra_emb.style_embedder.weight.
Please make sure to pass low_cpu_mem_usage=False and device_map=None if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
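
The check at line 748 of the traceback is a set difference between the keys the model class defines and the keys the checkpoint provides. The sketch below reproduces that logic with plain sets, as an assumption about why the error appears: the ToyModel class and the key time_extra_emb.pooler.weight are hypothetical placeholders, while time_extra_emb.style_embedder.weight and the flag name are taken from the error above.

```python
class ToyModel:
    """Toy stand-in for a model class; not the diffusers implementation."""

    def __init__(self, use_style_cond_and_image_meta_size=True):
        # Placeholder key that exists in every variant (hypothetical name).
        self.keys = {"time_extra_emb.pooler.weight"}
        if use_style_cond_and_image_meta_size:
            # Only built when the flag is on, as in pre-v1.2 models.
            self.keys.add("time_extra_emb.style_embedder.weight")


def find_missing_keys(model, checkpoint_keys):
    """Mirror of the traceback's check: model keys minus checkpoint keys."""
    return model.keys - set(checkpoint_keys)


# A v1.2-style checkpoint ships without the style embedder weight.
v12_checkpoint_keys = {"time_extra_emb.pooler.weight"}

# An older model definition ignores the unknown flag and keeps its default.
old_definition = ToyModel()
# A definition that understands the flag skips the style embedder.
new_definition = ToyModel(use_style_cond_and_image_meta_size=False)

print(sorted(find_missing_keys(old_definition, v12_checkpoint_keys)))
# ['time_extra_emb.style_embedder.weight']
print(sorted(find_missing_keys(new_definition, v12_checkpoint_keys)))
# []
```

This would be consistent with the warning that the config attribute was "not expected and will be ignored": the flag is dropped, the default applies, and the loader then expects a weight the checkpoint does not contain.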

HuggingFaceDocBuilderDev · 2024-07-01T06:01:42Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

src/diffusers/models/embeddings.py

yiyixuxu · 2024-07-01T06:11:27Z

Don't forget to run make style.

gnobitab · 2024-07-01T06:55:26Z

Please use the code provided in the current PR to load the model

s9anus98a · 2024-07-01T07:57:32Z

What makes HunyuanDiT (HYDIT) so slow compared with other DiTs at 50 steps, such as SD3 and PixArt-Sigma?

neonhuang · 2024-07-09T02:46:45Z

Hello, I am still hitting the error at pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 881, in from_pretrained
loaded_sub_model = load_sub_model(
File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_loading_utils.py", line 703, in load_sub_model
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py", line 750, in from_pretrained
raise ValueError(
ValueError: Cannot load <class 'diffusers.models.transformers.hunyuan_transformer_2d.HunyuanDiT2DModel'> from /root/.cache/huggingface/hub/models--Tencent-Hunyuan--HunyuanDiT-v1.2-Diffusers/snapshots/5e96094e0ad19e7f475de8711f03634ca0ccc40c/transformer because the following keys are missing:
time_extra_emb.style_embedder.weight.
Please make sure to pass low_cpu_mem_usage=False and device_map=None if you want to randomly initialize those weights or else make sure your checkpoint file is correct

yiyixuxu · 2024-07-09T03:08:33Z

@neonhuang
Hi! I think you need to install diffusers from source and use the most recent version in order to load the v1.2 checkpoint
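
One way to tell whether an installed model class is new enough is to check whether its constructor declares the config flag. The helper below is a hedged sketch: accepts_config_key, OldStyleModel, and NewStyleModel are hypothetical names used only for illustration, and the demo runs on dummy classes rather than on diffusers itself.

```python
import inspect


def accepts_config_key(cls, key):
    """Return True if cls.__init__ declares a parameter named `key`."""
    return key in inspect.signature(cls.__init__).parameters


class OldStyleModel:
    """Dummy class standing in for a release without v1.2 support."""

    def __init__(self, hidden_size=1152):
        self.hidden_size = hidden_size


class NewStyleModel:
    """Dummy class standing in for a build that knows the flag."""

    def __init__(self, hidden_size=1152, use_style_cond_and_image_meta_size=True):
        self.hidden_size = hidden_size
        self.use_style_cond_and_image_meta_size = use_style_cond_and_image_meta_size


FLAG = "use_style_cond_and_image_meta_size"
print(accepts_config_key(OldStyleModel, FLAG))  # False
print(accepts_config_key(NewStyleModel, FLAG))  # True
```

With diffusers installed, the same helper could be pointed at HunyuanDiT2DModel. Whether signature introspection sees through the library's config-registration decorator is an assumption that has not been verified here, so printing diffusers.__version__ remains the simpler check.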

neonhuang · 2024-07-09T04:11:04Z

The model I am using was downloaded automatically by pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16).
models--Tencent-Hunyuan--HunyuanDiT-v1.2-Diffusers# tree -L 4
.
├── blobs
│   ├── 05592dd7745eb8b9b7e19753208af804f6393e45
│   ├── 2c0c539ab8e8fba3877cc94bc483e427f74c525f817a809b028ebc8d96d75a94
│   ├── 36c538ea7c70fa3060bfdedfd8c8efab370b699c
│   ├── 3b49976cb1fe40da28a600d783f4686024c97eb01f224c54305cce55ddcd8a5e
│   ├── 3b6e6f54f337932574c05d1377c08aaa9d2062a0
│   ├── 40e63298d71c4975e0ac423c78a80db851cf975f
│   ├── 49a263c94c555ebd2d2e3f866d38aed4032aed5c
│   ├── 5d9f5f2314c1932c9dba328bcddeb8a42925556b
│   ├── 6246906805d02aca01714c71e4c8d77b69a7a131
│   ├── 742efa052c5730728cb140208c25c73268749ad4
│   ├── 8553f834bfa38c7d2c56f3dc65d5e0d943f006c4
│   ├── 8af691cadb78047d55721259355d708e87ddbba1b7845df9377d9a5ae917b45d
│   ├── 98a14dc6fe8d71c83576f135a87c61a16561c9c080abba418d2cc976ee034f88
│   ├── 9bbecc17cabbcbd3112c14d6982b51403b264bfa
│   ├── b2eb1a3fb1d8809a06b9af8aaab746c9a35b468f
│   ├── c57897ebb275499b3c6d5284f1187860f86741e6
│   ├── c6c6348af2cb4d5852fe51102ce39605903dbe7925c005cf8995506cc21ea914
│   └── ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
├── refs
│   └── main
└── snapshots
    └── 5e96094e0ad19e7f475de8711f03634ca0ccc40c
        ├── model_index.json -> ../../blobs/8553f834bfa38c7d2c56f3dc65d5e0d943f006c4
        ├── scheduler
        │   └── scheduler_config.json -> ../../../blobs/40e63298d71c4975e0ac423c78a80db851cf975f
        ├── text_encoder
        │   ├── config.json -> ../../../blobs/49a263c94c555ebd2d2e3f866d38aed4032aed5c
        │   └── model.safetensors -> ../../../blobs/c6c6348af2cb4d5852fe51102ce39605903dbe7925c005cf8995506cc21ea914
        ├── text_encoder_2
        │   ├── config.json -> ../../../blobs/5d9f5f2314c1932c9dba328bcddeb8a42925556b
        │   ├── model-00001-of-00002.safetensors -> ../../../blobs/2c0c539ab8e8fba3877cc94bc483e427f74c525f817a809b028ebc8d96d75a94
        │   ├── model-00002-of-00002.safetensors -> ../../../blobs/3b49976cb1fe40da28a600d783f4686024c97eb01f224c54305cce55ddcd8a5e
        │   └── model.safetensors.index.json -> ../../../blobs/b2eb1a3fb1d8809a06b9af8aaab746c9a35b468f
        ├── tokenizer
        │   ├── special_tokens_map.json -> ../../../blobs/9bbecc17cabbcbd3112c14d6982b51403b264bfa
        │   ├── tokenizer_config.json -> ../../../blobs/c57897ebb275499b3c6d5284f1187860f86741e6
        │   └── vocab.txt -> ../../../blobs/6246906805d02aca01714c71e4c8d77b69a7a131
        ├── tokenizer_2
        │   ├── special_tokens_map.json -> ../../../blobs/05592dd7745eb8b9b7e19753208af804f6393e45
        │   ├── spiece.model -> ../../../blobs/ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
        │   └── tokenizer_config.json -> ../../../blobs/3b6e6f54f337932574c05d1377c08aaa9d2062a0
        ├── transformer
        │   ├── config.json -> ../../../blobs/36c538ea7c70fa3060bfdedfd8c8efab370b699c
        │   └── diffusion_pytorch_model.safetensors -> ../../../blobs/8af691cadb78047d55721259355d708e87ddbba1b7845df9377d9a5ae917b45d
        └── vae
            ├── config.json -> ../../../blobs/742efa052c5730728cb140208c25c73268749ad4
            └── diffusion_pytorch_model.safetensors -> ../../../blobs/98a14dc6fe8d71c83576f135a87c61a16561c9c080abba418d2cc976ee034f88
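
Given the layout above, the flag that triggers the warning can be read straight from transformer/config.json inside a snapshot directory. The following is a minimal sketch under stated assumptions: read_transformer_flag is a hypothetical helper, and the demo runs against a throwaway directory that mimics the layout rather than against the real cache.

```python
import json
import tempfile
from pathlib import Path

FLAG = "use_style_cond_and_image_meta_size"


def read_transformer_flag(snapshot_dir):
    """Return the flag from <snapshot>/transformer/config.json, or None if absent."""
    config_path = Path(snapshot_dir) / "transformer" / "config.json"
    with config_path.open(encoding="utf-8") as handle:
        config = json.load(handle)
    return config.get(FLAG)


# Demo against a temporary directory shaped like a snapshot folder.
with tempfile.TemporaryDirectory() as tmp:
    transformer_dir = Path(tmp) / "transformer"
    transformer_dir.mkdir()
    (transformer_dir / "config.json").write_text(json.dumps({FLAG: False}), encoding="utf-8")
    demo_value = read_transformer_flag(tmp)

print(demo_value)  # False
```

If the downloaded config sets the flag to False but the installed library still reports the missing style_embedder weight, the checkpoint files are probably fine and the library version is the more likely culprit.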

xingchaoliu added 2 commits July 1, 2024 12:04

add v1.2 support

200bb54

change docstring

91f4dc8

yiyixuxu reviewed Jul 1, 2024

src/diffusers/models/embeddings.py (resolved)

xingchaoliu and others added 2 commits July 1, 2024 15:09

fix yiyi comment

816574b

style

98d2890

yiyixuxu merged commit a3904d7 into huggingface:main Jul 1, 2024
14 of 15 checks passed

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024

[Tencent Hunyuan Team] Add HunyuanDiT-v1.2 Support (#8747)

ad8cf58

* add v1.2 support --------- Co-authored-by: xingchaoliu <xingchaoliu@tencent.com> Co-authored-by: yiyixuxu <yixu310@gmail.com>
