UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Abstract
Text-to-image diffusion models have demonstrated remarkable capabilities in transforming textual prompts into coherent images, yet the computational cost of their inference remains a persistent challenge. To address this issue, we present UFOGen, a novel generative model designed for ultra-fast, one-step text-to-image synthesis. In contrast to conventional approaches that focus on improving samplers or employing distillation techniques for diffusion models, UFOGen adopts a hybrid methodology, integrating diffusion models with a GAN objective. Leveraging a newly introduced diffusion-GAN objective and initialization with pre-trained diffusion models, UFOGen excels in efficiently generating high-quality images conditioned on textual descriptions in a single step. Beyond traditional text-to-image generation, UFOGen showcases versatility in applications. Notably, UFOGen stands among the pioneering models enabling one-step text-to-image generation and diverse downstream tasks, presenting a significant advancement in the landscape of efficient generative models.
*Work done as a student researcher at Google; † indicates equal contribution.
Community
I think the LCM (and LCM-LoRA) results are much better than they have showcased here.
Sure it's capable of much better results at higher steps... but at 1 step or 2 as shown in the example?
I'd guess that they used the original LCM_Dreamshaper_v7 model, which produces results like that at 2 and 4 steps.
The distilled LCM SDXL produces much better images at 2 and 4 steps than the examples.
It's normal in a paper that uses Stable Diffusion outputs to select the worst outputs, even if they are bragging about the Stable Diffusion outputs.
UFOGen: Revolutionizing Text-to-Image Generation with One-Step Diffusion GANs
Links:
Subscribe: https://www.youtube.com/@Arxflix
Twitter: https://x.com/arxflix
LMNT (Partner): https://lmnt.com/