---
license: openrail++
tags:
- text-to-image
- stable-diffusion
library_name: diffusers
inference: false
pipeline_tag: text-to-image
---

# SDXL-Lightning

![Intro Image](sdxl_lightning_samples.jpg)

SDXL-Lightning is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps. For more information, please refer to our research paper: [SDXL-Lightning: Progressive Adversarial Diffusion Distillation](https://arxiv.org/abs/2402.13929). We open-source the model as part of the research.

Our models are distilled from [stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0). This repository contains checkpoints for the 1-step, 2-step, 4-step, and 8-step distilled models. The 2-step, 4-step, and 8-step models generate high-quality images. The 1-step model is more experimental.

We provide both full UNet and LoRA checkpoints. The full UNet models have the best quality while the LoRA models can be applied to other base models.

## Demos

* Generate with all configurations, best quality: [Demo](https://huggingface.co/spaces/ByteDance/SDXL-Lightning)
* Real-time generation as you type, lightning-fast: [Demo from fastsdxl.ai](https://fastsdxl.ai/)

## Checkpoints

* `sdxl_lightning_Nstep.safetensors`: All-in-one checkpoint, for ComfyUI.
* `sdxl_lightning_Nstep_unet.safetensors`: UNet checkpoint only, for Diffusers.
* `sdxl_lightning_Nstep_lora.safetensors`: LoRA checkpoint, for Diffusers and ComfyUI.
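
The naming pattern above can be captured in a small helper. This is only a sketch: `checkpoint_name` and its `kind` argument are hypothetical names introduced here, not part of this repository or of Diffusers. The 1-step file names follow the `_x0` variants used later in this document, and the helper assumes that no 1-step LoRA exists because none is listed.

```python
def checkpoint_name(steps: int, kind: str = "unet") -> str:
    """Build a checkpoint file name from the step count and checkpoint kind.

    kind is one of "full" (all-in-one, ComfyUI), "unet" (Diffusers),
    or "lora" (Diffusers and ComfyUI). Hypothetical helper for illustration.
    """
    if steps not in (1, 2, 4, 8):
        raise ValueError("checkpoints are listed only for 1, 2, 4, and 8 steps")
    if kind not in ("full", "unet", "lora"):
        raise ValueError("kind must be 'full', 'unet', or 'lora'")

    if steps == 1:
        # The 1-step checkpoints carry an `_x0` suffix ("sample" prediction).
        if kind == "lora":
            raise ValueError("no 1-step LoRA checkpoint is listed in this document")
        suffix = "" if kind == "full" else "_unet"
        return f"sdxl_lightning_1step{suffix}_x0.safetensors"

    suffix = "" if kind == "full" else f"_{kind}"
    return f"sdxl_lightning_{steps}step{suffix}.safetensors"


print(checkpoint_name(4, "unet"))  # sdxl_lightning_4step_unet.safetensors
print(checkpoint_name(1, "full"))  # sdxl_lightning_1step_x0.safetensors
```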

## Diffusers Usage

Please always use the correct checkpoint for the corresponding inference steps.

### 2-Step, 4-Step, 8-Step UNet

```python
import torch
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_unet.safetensors" # Use the correct ckpt for your step setting!

# Load model.
unet = UNet2DConditionModel.from_config(base, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(base, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")

# Ensure sampler uses "trailing" timesteps.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")

# Use the same number of inference steps as the loaded checkpoint and disable CFG (guidance_scale=0).
pipe("A girl smiling", num_inference_steps=4, guidance_scale=0).images[0].save("output.png")
```
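
To illustrate what `timestep_spacing="trailing"` changes, the sketch below computes the sampled timesteps in plain Python. It assumes 1000 training timesteps and assumes the arithmetic matches how the scheduler spaces its steps; the function names are hypothetical and the scheduler's own implementation remains the reference. The point is that trailing spacing starts sampling from the final (noisiest) training timestep, whereas leading spacing does not.

```python
def trailing_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000) -> list[int]:
    """Timesteps counted back from the last training timestep (assumed formula)."""
    step_ratio = num_train_timesteps / num_inference_steps
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


def leading_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000) -> list[int]:
    """Timesteps counted up from zero, shown only for contrast (no offset applied)."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio for i in range(num_inference_steps)][::-1]


print(trailing_timesteps(4))  # [999, 749, 499, 249]
print(leading_timesteps(4))   # [750, 500, 250, 0]
```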

### 2-Step, 4-Step, 8-Step LoRA

Use the LoRA checkpoints only if your base model is not the standard SDXL base model. Otherwise, use our UNet checkpoints for better quality.

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_lora.safetensors" # Use the correct ckpt for your step setting!

# Load model.
pipe = StableDiffusionXLPipeline.from_pretrained(base, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo, ckpt))
pipe.fuse_lora()

# Ensure sampler uses "trailing" timesteps.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")

# Use the same number of inference steps as the loaded checkpoint and disable CFG (guidance_scale=0).
pipe("A girl smiling", num_inference_steps=4, guidance_scale=0).images[0].save("output.png")
```

### 1-Step UNet

The 1-step model is experimental and its quality is much less stable. Consider using the 2-step model for much better quality.

Unlike the other checkpoints, the 1-step model uses "sample" prediction instead of "epsilon" prediction, so the scheduler must be configured with `prediction_type="sample"`.

```python
import torch
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_1step_unet_x0.safetensors" # Use the correct ckpt for your step setting!

# Load model.
unet = UNet2DConditionModel.from_config(base, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(base, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")

# Ensure sampler uses "trailing" timesteps and "sample" prediction type.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing", prediction_type="sample")

# Use the same number of inference steps as the loaded checkpoint and disable CFG (guidance_scale=0).
pipe("A girl smiling", num_inference_steps=1, guidance_scale=0).images[0].save("output.png")
```
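
The settings that differ between the Diffusers examples above can be gathered in one place to avoid mismatches between checkpoint and sampler configuration. The following is a minimal sketch; `lightning_settings` is a hypothetical helper, not an API of Diffusers or of this repository. It only returns keyword arguments and does not load any model.

```python
def lightning_settings(steps: int) -> dict:
    """Return scheduler and pipeline-call kwargs for a given step count.

    Hypothetical helper: mirrors the values used in the examples above.
    """
    if steps not in (1, 2, 4, 8):
        raise ValueError("checkpoints are provided only for 1, 2, 4, and 8 steps")

    # All checkpoints use "trailing" timestep spacing.
    scheduler = {"timestep_spacing": "trailing"}
    if steps == 1:
        # Only the 1-step checkpoint predicts the sample instead of epsilon.
        scheduler["prediction_type"] = "sample"

    # The step count must match the checkpoint, and CFG is disabled.
    call = {"num_inference_steps": steps, "guidance_scale": 0}
    return {"scheduler": scheduler, "call": call}


settings = lightning_settings(1)
print(settings["scheduler"])  # {'timestep_spacing': 'trailing', 'prediction_type': 'sample'}

# Intended use with an already loaded pipeline `pipe` (not executed here):
# pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, **settings["scheduler"])
# pipe("A girl smiling", **settings["call"]).images[0].save("output.png")
```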


## ComfyUI Usage

Always use the checkpoint that matches your number of inference steps, and use the Euler sampler with the `sgm_uniform` scheduler.

### 2-Step, 4-Step, 8-Step Full

1. Download the full checkpoint (`sdxl_lightning_Nstep.safetensors`) to `/ComfyUI/models/checkpoints`.
1. Download our [ComfyUI full workflow](comfyui/sdxl_lightning_workflow_full.json).

![SDXL-Lightning ComfyUI Full Workflow](comfyui/sdxl_lightning_workflow_full.jpg)

### 2-Step, 4-Step, 8-Step LoRA

Use the LoRA checkpoints only if your base model is not the standard SDXL base model. Otherwise, use our full checkpoints for better quality.

1. Prepare your own base model.
1. Download the LoRA checkpoint (`sdxl_lightning_Nstep_lora.safetensors`) to `/ComfyUI/models/loras`.
1. Download our [ComfyUI LoRA workflow](comfyui/sdxl_lightning_workflow_lora.json).

![SDXL-Lightning ComfyUI LoRA Workflow](comfyui/sdxl_lightning_workflow_lora.jpg)

### 1-Step

The 1-step model is experimental and its quality is much less stable. Consider using the 2-step model for much better quality.

1. Update your ComfyUI to the latest version.
1. Download the full checkpoint (`sdxl_lightning_1step_x0.safetensors`) to `/ComfyUI/models/checkpoints`.
1. Download our [ComfyUI full 1-step workflow](comfyui/sdxl_lightning_workflow_full_1step.json).

![SDXL-Lightning ComfyUI Full 1-Step Workflow](comfyui/sdxl_lightning_workflow_full_1step.jpg)
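
The download destinations in the ComfyUI steps above depend only on the checkpoint type, which a short helper can make explicit. This is a sketch under stated assumptions: `comfyui_destination` is a hypothetical name, and `/ComfyUI` is the installation root used in this document, which may differ on your machine.

```python
from pathlib import PurePosixPath


def comfyui_destination(filename: str, comfy_root: str = "/ComfyUI") -> PurePosixPath:
    """Return the ComfyUI model folder for a checkpoint file (hypothetical helper)."""
    if "_unet" in filename:
        # UNet-only checkpoints are intended for Diffusers, not ComfyUI.
        raise ValueError("UNet-only checkpoints are for Diffusers; use a full or LoRA checkpoint")
    folder = "loras" if filename.endswith("_lora.safetensors") else "checkpoints"
    return PurePosixPath(comfy_root) / "models" / folder


print(comfyui_destination("sdxl_lightning_4step.safetensors"))       # /ComfyUI/models/checkpoints
print(comfyui_destination("sdxl_lightning_4step_lora.safetensors"))  # /ComfyUI/models/loras

# One possible way to fetch a file into that folder (requires network access, not executed here):
# from huggingface_hub import hf_hub_download
# name = "sdxl_lightning_4step.safetensors"
# hf_hub_download("ByteDance/SDXL-Lightning", name, local_dir=str(comfyui_destination(name)))
```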


## Cite Our Work

```
@misc{lin2024sdxllightning,
      title={SDXL-Lightning: Progressive Adversarial Diffusion Distillation}, 
      author={Shanchuan Lin and Anran Wang and Xiao Yang},
      year={2024},
      eprint={2402.13929},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```