Closed
Labels
community-examples, hacktoberfest, stale (Issues that haven't received updates)
Description
Hi!
I am working on latent diffusion for audio and music. It seems to me that Diffusers 🧨 is the place to be! There is a feature I would like to request: Training AutoencoderKL (Variational Autoencoder).
What I would love to do is train AutoencoderKL on square and non-square images with one or more channels. I checked the implementation, and since the model is fully convolutional, this should be perfectly possible.
A good start would be a script or notebook that shows how to train AutoencoderKL on a Hugging Face dataset. In the long run, it could even become a Trainer.