[Community] Training AutoencoderKL #894

@AI-Guru

Hi!

I am working on latent diffusion for audio and music. It seems to me that Diffusers 🧨 is the place to be! There is a feature I would like to request: training AutoencoderKL (the variational autoencoder).

What I would love to do is train AutoencoderKL on square and non-square images with one or more channels. I checked the implementation, and since it is fully convolutional, this seems perfectly possible.

A good start would be a script or notebook that shows how to train AutoencoderKL on a Hugging Face dataset. In the long run, it could even become a Trainer.
