Add Latte: Latent Diffusion Transformer for Video Generation #7223

Closed
@kabachuha

Model/Pipeline/Scheduler description

Latte is a text-to-video diffusion transformer (similar to Sora) that builds on and improves over the DiT and PixArt-alpha text-to-image models.

The official implementation is already built on diffusers (see latte_t2v.py), so adding it here should be straightforward.

Open source status

  • The model implementation is available.
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Official repo: https://github.com/Vchitect/Latte
Model on Hugging Face: https://huggingface.co/maxin-cn/Latte
Paper: https://arxiv.org/abs/2401.03048v1
Project page: https://maxin-cn.github.io/latte_project/

Labels: stale (issues that haven't received updates)
