Update text_inversion.mdx #393

johnowhitaker · 2022-09-07T12:46:43Z

Getting in a bit of background info as a starting point for these docs

HuggingFaceDocBuilderDev · 2022-09-07T12:49:46Z

The documentation is not available anymore as the PR was closed or merged.

patil-suraj

Thanks a lot for working on this!

patil-suraj · 2022-09-07T13:17:30Z

docs/source/training/text_inversion.mdx

-To start, use the [`DiffusionPipeline`] for quick inference and sample generations!
+Textual Inversion is a technique for capturing novel concepts from a small number of example images in a way that can later be used to control text-to-image pipelines. It does so by learning new 'words' in the embedding space of the pipeline's text encoder. These special words can then be used within text prompts to achieve very fine-grained control of the resulting images. 

+![Textual Inversion example](https://textual-inversion.github.io/static/images/editing/colorful_teapot.JPG)
+_By using just 3-5 images you can teach new concepts to a model such as Stable Diffusion for personalized image generation ([image source](https://github.com/rinongal/textual_inversion))._
+
+This technique was introduced in [An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion](https://arxiv.org/abs/2208.01618). The paper demonstrated the concept using a [latent diffusion model](https://github.com/CompVis/latent-diffusion) but the idea has since been applied to other variants such as [Stable Diffusion](https://huggingface.co/docs/diffusers/main/en/conceptual/stable_diffusion).
+
+
+## How It Works
+
+![Diagram from the paper showing overview](https://textual-inversion.github.io/static/images/training/training.JPG)
+_Architecture Overview from the [textual inversion blog post](https://textual-inversion.github.io/)_
+
+Before a text prompt can be used in a diffusion model, it must first be processed into a numerical representation. This typically involves tokenizing the text, converting each token to an embedding and then feeding those embeddings through a model (typically a transformer) whose output will be used as the conditioning for the diffusion model. 
+
+Textual inversion learns a new token embedding (v* in the diagram above). A prompt (that includes a token which will be mapped to this new embedding) is used in conjunction with a noised version of one or more training images as inputs to the generator model, which attempts to predict the denoised version of the image. The embedding is optimized based on how well the model does at this task - an embedding that better captures the object or style shown by the training images will give more useful information to the diffusion model and thus result in a lower denoising loss. After many steps (typically several thousand) with a variety of prompt and image variants the learned embedding should hopefully capture the essence of the new concept being taught.
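
The optimization described in the added text can be illustrated with a toy sketch. This is not the diffusers training script and uses no library API: the vocabulary, the averaging `encode` function, and the fixed blend that stands in for the frozen denoiser are all hypothetical simplifications. The point it tries to show is that only the placeholder token's embedding row receives updates, driven by the denoising loss, while every other embedding stays frozen.

```python
import random

random.seed(0)

# Hypothetical toy vocabulary; "<concept>" plays the role of the new 'word' (v*).
vocab = {"a": 0, "photo": 1, "of": 2, "<concept>": 3}
DIM = 4

# One embedding vector per token. Only the "<concept>" row will be trained.
embeddings = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in vocab]
frozen_before = [row[:] for row in embeddings[:3]]

def encode(prompt):
    """Whitespace-tokenize, look up embeddings, and average them.
    A stand-in for the text encoder (a transformer in a real pipeline)."""
    ids = [vocab[tok] for tok in prompt.split()]
    cond = [sum(embeddings[i][d] for i in ids) / len(ids) for d in range(DIM)]
    return cond, ids

def train(images, prompt="a photo of <concept>", steps=1000, lr=2.0):
    pid = vocab["<concept>"]
    losses = []
    for _ in range(steps):
        clean = random.choice(images)
        noisy = [x + random.gauss(0, 0.1) for x in clean]
        cond, ids = encode(prompt)
        # Stand-in for the frozen denoiser: blends the noisy input with the
        # text conditioning to predict the clean image.
        pred = [0.5 * (n + c) for n, c in zip(noisy, cond)]
        err = [p - x for p, x in zip(pred, clean)]
        losses.append(sum(e * e for e in err) / DIM)
        # Gradient of the mean squared error with respect to the placeholder
        # embedding: d(loss)/d(cond) = err / DIM, and
        # d(cond)/d(embedding) = (occurrences of the token) / (prompt length).
        scale = ids.count(pid) / len(ids)
        for d in range(DIM):
            embeddings[pid][d] -= lr * (err[d] / DIM) * scale
    return losses

# Three slightly different "images" of the same concept (toy 4-value vectors).
images = [
    [2.0, -2.0, 1.0, -1.0],
    [2.05, -1.95, 1.0, -1.05],
    [1.95, -2.05, 1.05, -1.0],
]
losses = train(images)
print(f"mean loss, first 100 steps: {sum(losses[:100]) / 100:.4f}")
print(f"mean loss, last 100 steps:  {sum(losses[-100:]) / 100:.4f}")
```

In a real setup the encoder and denoiser are large frozen networks and the gradient comes from automatic differentiation, but the structure is the same: one trainable embedding row, everything else fixed.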


Very nice write-up!

Update text_inversion.mdx

f9bc16b

Getting in a bit of background info

johnowhitaker added 4 commits September 7, 2022 14:56

fixed typo mode -> model

496be28

Link SD and re-write a few bits for clarity

e1e228c

Copied in info from the example script

d6bf063

As suggested by surajpatil :)

removed an unnecessary heading

529d10d

patil-suraj approved these changes Sep 7, 2022

patil-suraj merged commit 5b4f595 into huggingface:main Sep 7, 2022

PhaneeshB pushed a commit to nod-ai/diffusers that referenced this pull request Mar 1, 2023

seperate importer and benchmark deps (huggingface#393)

d38e37b
