The first open Stable Diffusion 3-like architecture model is JUST out - but it is not SD3!
It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B parameter DiT (diffusion transformer) text-to-image model, trained with multi-lingual CLIP + multi-lingual T5 text-encoders for English and Chinese understanding
Try it out yourself here: https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit too slow as the model is chunky and the research code isn't super optimized for inference speed yet)
In the paper they claim it is the SOTA open-source model based on human preference evaluation!