Commit 37c9697

hlky and sayakpaul authored
Add IP-Adapter example to Flux docs (#10633)
* Add IP-Adapter example to Flux docs
* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
1 parent 9684c52 commit 37c9697

1 file changed, with 47 additions and 0 deletions

docs/source/en/api/pipelines/flux.md

@@ -309,6 +309,53 @@ image.save("output.png")

When unloading the Control LoRA weights, call `pipe.unload_lora_weights(reset_to_overwritten_params=True)` to reset the `pipe.transformer` completely back to its original form. The resultant pipeline can then be used with methods like [`DiffusionPipeline.from_pipe`]. More details about this argument are available in [this PR](https://github.com/huggingface/diffusers/pull/10397).

## IP-Adapter

<Tip>

Check out [IP-Adapter](../../../using-diffusers/ip_adapter) to learn more about how IP-Adapters work.

</Tip>

An IP-Adapter lets you prompt Flux with images in addition to the text prompt. This is especially useful when you have reference images for complex concepts that are difficult to articulate through text alone.

```python
import torch
from diffusers import FluxPipeline
from diffusers.utils import load_image

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Reference image that serves as the image prompt
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flux_ip_adapter_input.jpg").resize((1024, 1024))

# Load the IP-Adapter weights together with the CLIP image encoder
pipe.load_ip_adapter(
    "XLabs-AI/flux-ip-adapter",
    weight_name="ip_adapter.safetensors",
    image_encoder_pretrained_model_name_or_path="openai/clip-vit-large-patch14"
)
pipe.set_ip_adapter_scale(1.0)

# Generate with both the text prompt and the image prompt
image = pipe(
    width=1024,
    height=1024,
    prompt="wearing sunglasses",
    negative_prompt="",
    true_cfg=4.0,
    generator=torch.Generator().manual_seed(4444),
    ip_adapter_image=image,
).images[0]

image.save('flux_ip_adapter_output.jpg')
```

<div class="justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flux_ip_adapter_output.jpg"/>
    <figcaption class="mt-2 text-sm text-center text-gray-500">IP-Adapter examples with prompt "wearing sunglasses"</figcaption>
</div>

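As a rough intuition for what the value passed to `pipe.set_ip_adapter_scale` controls, the following is a conceptual sketch only, not the diffusers implementation. It assumes the usual IP-Adapter formulation in which the image-conditioned attention output is weighted by the scale and added to the text-conditioned output. The `blend` helper and the toy vectors are hypothetical names and values made up for illustration.

```python
def blend(text_out, image_out, scale):
    """Add the image-branch output, weighted by `scale`, to the text-branch output."""
    return [t + scale * i for t, i in zip(text_out, image_out)]


# Hypothetical toy values standing in for attention outputs
text_out = [0.25, -0.5, 1.0]   # text-conditioned branch
image_out = [1.0, 0.5, -1.0]   # image-conditioned branch

no_image = blend(text_out, image_out, scale=0.0)    # image prompt has no effect
full_image = blend(text_out, image_out, scale=1.0)  # image prompt at full weight
```

Under this assumption, a scale of `0.0` leaves the text-only result unchanged, while larger values let the reference image influence the output more strongly.
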
## Running FP16 inference

Flux can generate high-quality images with FP16 (e.g. to accelerate inference on Turing/Volta GPUs), but the outputs differ from those produced with FP32/BF16. The cause is that some activations in the text encoders have to be clipped when running in FP16, which affects the overall image. Forcing the text encoders to run in FP32 removes this output difference. See [here](https://github.com/huggingface/diffusers/pull/9097#issuecomment-2272292516) for details.
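
To illustrate why clipping changes values, here is a small stand-alone sketch that uses only the Python standard library. It is not the code path diffusers uses. It round-trips a few made-up activation values through IEEE 754 half precision, clipping anything beyond the largest finite FP16 value (65504). The `to_fp16` helper and the sample values are hypothetical.

```python
import struct

FP16_MAX = 65504.0  # largest finite value in IEEE 754 half precision


def to_fp16(x: float) -> float:
    """Clip `x` to the finite FP16 range, then round it to half precision."""
    clipped = max(-FP16_MAX, min(FP16_MAX, x))
    # The "e" format code packs a float as a 16-bit half-precision value
    return struct.unpack("<e", struct.pack("<e", clipped))[0]


# Hypothetical activation values, two of them outside the FP16 range
activations = [0.5, 1234.5678, 70000.0, -1.0e6]
fp16_values = [to_fp16(a) for a in activations]

for original, half in zip(activations, fp16_values):
    print(f"{original:>12} -> {half}")
```

Values inside the range only lose some precision, whereas out-of-range values collapse onto ±65504. That collapse is the kind of change that does not occur in FP32 or BF16, both of which have a much wider exponent range.
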
