DiffSynth Studio
Introduction
DiffSynth is a new diffusion engine. We have restructured architectures, including the Text Encoder, UNet, and VAE, among others, maintaining compatibility with models from the open-source community while enhancing computational performance. This version is currently in its initial stage, supporting SD and SDXL architectures. In the future, we plan to develop more interesting features based on this new codebase.
Installation
Create Python environment:
conda env create -f environment.yml
We find that conda sometimes cannot install cupy correctly; please install it manually. See the cupy installation documentation for more details.
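For example, on a machine with CUDA 12.x, the prebuilt wheel can usually be installed with pip (swap the suffix for your CUDA version):

pip install cupy-cuda12x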
Enter the Python environment:
conda activate DiffSynthStudio
Usage (in WebUI)
python -m streamlit run Diffsynth_Studio.py
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/93085557-73f3-4eee-a205-9829591ef954
Usage (in Python code)
Example 1: Stable Diffusion
We can generate images with very high resolution. Please see examples/sd_text_to_image.py for more details.
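As a rough illustration of what the script does, the sketch below loads a checkpoint and samples a high-resolution image. The imports, class names, and model path are assumptions for illustration, not the exact API; the example script contains the real calls.

# Hypothetical sketch of the text-to-image flow; names and paths are assumed,
# see examples/sd_text_to_image.py for the actual API.
import torch
from diffsynth import ModelManager, SDImagePipeline  # assumed imports

model_manager = ModelManager()  # assumed model loader
model_manager.load_models(["models/stable_diffusion/your_model.safetensors"])  # assumed path
pipe = SDImagePipeline.from_model_manager(model_manager)  # assumed constructor

torch.manual_seed(0)  # make the sample reproducible
image = pipe(
    prompt="a beautiful orchard, detailed, best quality",
    negative_prompt="lowres, blurry",
    height=2048, width=2048,  # the engine targets very high resolutions
    num_inference_steps=30,
)
image.save("image.png")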
Example 2: Stable Diffusion XL
Generate images with Stable Diffusion XL. Please see examples/sdxl_text_to_image.py for more details.
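The flow mirrors Example 1 with an SDXL checkpoint in place of the SD one; to run the bundled script directly:

python examples/sdxl_text_to_image.py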
Example 3: Stable Diffusion XL Turbo
Generate images with Stable Diffusion XL Turbo. You can see examples/sdxl_turbo.py for more details, but we highly recommend using it in the WebUI.
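SDXL Turbo is distilled for very few sampling steps with classifier-free guidance effectively disabled, so a call to it differs from Example 1 mainly in two parameters. The names below are assumptions; the example script has the real ones.

# 'pipe' is an SDXL Turbo pipeline built as in the Example 1 sketch.
image = pipe(
    prompt="a red sports car, studio lighting",
    num_inference_steps=1,  # turbo models are distilled for 1-4 steps
    cfg_scale=1.0,          # guidance disabled (assumed parameter name)
)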
Example 4: Toon Shading (Diffutoon)
This example is implemented based on Diffutoon. This approach is well suited to rendering high-resolution videos with rapid motion. You can easily modify the parameters in the config dict. See examples/diffutoon_toon_shading.py.
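The config dict bundles input, output, and sampling parameters in one place. The keys and runner below are illustrative assumptions about its shape, not the exact schema; the example script defines the real one.

# Hypothetical config shape; the actual schema is in examples/diffutoon_toon_shading.py.
from diffsynth import SDVideoPipelineRunner  # assumed import

config = {
    "input_video": "input.mp4",    # source footage to toon-shade (assumed key)
    "output_video": "output.mp4",  # rendered result (assumed key)
    "prompt": "best quality, anime style, flat shading",
    "num_inference_steps": 10,
    "controlnet_scale": 0.5,       # strength of structural guidance (assumed key)
}
SDVideoPipelineRunner().run(config)  # assumed entry point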
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/b54c05c5-d747-4709-be5e-b39af82404dd
Example 5: Toon Shading with Editing Signals (Diffutoon)
Coming soon.
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/20528af5-5100-474a-8cdc-440b9efdd86c
Example 6: Toon Shading (in native Python code)
This example is provided for developers. If you prefer not to manage parameters through a config, see examples/sd_toon_shading.py to learn how to use the pipeline in native Python code.
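In native Python code, the same toon-shading run becomes an explicit loop over frames. The sketch below is a simplified assumption of that flow (the frame I/O and the input_image parameter are illustrative); the example script shows the real pipeline setup.

# Hypothetical per-frame loop; see examples/sd_toon_shading.py for the real code.
import imageio
import numpy as np
from PIL import Image

frames = [Image.fromarray(f) for f in imageio.mimread("input.mp4", memtest=False)]
styled = [pipe(prompt="anime style", input_image=f) for f in frames]  # assumed parameter
imageio.mimsave("output.mp4", [np.asarray(f) for f in styled], fps=30)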
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/607c199b-6140-410b-a111-3e4ffb01142c
Example 7: Text to Video
Given a prompt, DiffSynth Studio can generate a video using a Stable Diffusion model and an AnimateDiff model. We can break the limit on the number of frames! See examples/sd_text_to_video.py.
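AnimateDiff motion modules are trained on short clips (typically 16 frames), which normally caps the output length; the example script lifts that cap. A hypothetical invocation is sketched below; the names are assumptions, and 'pipe' stands for an SD pipeline combined with an AnimateDiff motion module as set up in the example script.

# Hypothetical sketch; see examples/sd_text_to_video.py for the real API.
frames = pipe(
    prompt="a cat walking on a rainy street, cinematic",
    num_frames=128,          # well beyond the motion module's 16-frame window
    num_inference_steps=25,
)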
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/8f556355-4079-4445-9b48-e9da77699437
Example 8: Video Stylization
We provide an example for video stylization. In this pipeline, the rendered video is completely different from the original video, so a powerful deflickering algorithm is required. We use FastBlend to implement the deflickering module. Please see examples/sd_video_rerender.py for more details.
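Conceptually the pipeline has two stages: re-render each frame with a high denoising strength, then deflicker the result with FastBlend. The sketch below assumes that flow; the argument names and the fastblend_deflicker function are placeholders, and the example script contains the real calls.

# Hypothetical two-stage flow; 'pipe' and 'frames' as in the Example 6 sketch.
rendered = [
    pipe(prompt="oil painting style", input_image=f, denoising_strength=0.9)  # assumed args
    for f in frames
]
smoothed = fastblend_deflicker(rendered, guide=frames)  # placeholder for the FastBlend module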
https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/59fb2f7b-8de0-4481-b79f-0c3a7361a1ea
Example 9: Prompt Processing
If you are not a native English speaker, we provide a translation service. Our prompter can translate prompts from other languages into English and refine them using the "BeautifulPrompt" models. Please see examples/sd_prompt_refining.py for more details.
Prompt: "一个漂亮的女孩" ("a beautiful girl"). The translation model translates it into English.
Prompt: "一个漂亮的女孩" ("a beautiful girl"). The translation model translates it into English, and the refining model then refines the translated prompt for better visual quality.
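Under the hood the prompter chains two models: a translator that maps the prompt into English, and a BeautifulPrompt-style refiner that expands it for better visual quality. The sketch below assumes that two-stage interface; the class names are illustrative, and the example script has the real one.

# Hypothetical two-stage prompt processing; names are assumptions,
# see examples/sd_prompt_refining.py for the actual API.
from diffsynth import Translator, BeautifulPrompt  # assumed imports

prompt = "一个漂亮的女孩"               # "a beautiful girl"
english = Translator()(prompt)          # stage 1: translate to English
refined = BeautifulPrompt()(english)    # stage 2: refine the English prompt
image = pipe(prompt=refined)            # 'pipe' as in the Example 1 sketch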