---
title: Real-Time Latent Consistency Model Text to Image
emoji: 💬🖼️
colorFrom: gray
colorTo: indigo
sdk: docker
pinned: false
suggested_hardware: a10g-small
disable_embedding: true
---

# Real-Time Latent Consistency Model

This demo showcases the [Latent Consistency Model (LCM)](https://latent-consistency-models.github.io/) using [Diffusers](https://huggingface.co/docs/diffusers/using-diffusers/lcm) with an MJPEG stream server. You can read more about LCM + LoRAs with diffusers [here](https://huggingface.co/blog/lcm_lora).
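
For readers unfamiliar with MJPEG streaming, here is a minimal sketch of how a server can frame JPEG images for a `multipart/x-mixed-replace` HTTP response. The boundary name `frame` and the helper `mjpeg_part` are illustrative assumptions, not necessarily what this demo's server uses.

```python
# Illustrative sketch only: `mjpeg_part` and the boundary name are
# hypothetical and not taken from this repository.
BOUNDARY = b"frame"

# Content-Type header the HTTP response carries for an MJPEG stream.
STREAM_CONTENT_TYPE = "multipart/x-mixed-replace; boundary=" + BOUNDARY.decode()


def mjpeg_part(jpeg_bytes: bytes, boundary: bytes = BOUNDARY) -> bytes:
    """Wrap one encoded JPEG image as a single part of the multipart stream."""
    header = (
        b"--" + boundary + b"\r\n"
        + b"Content-Type: image/jpeg\r\n"
        + b"Content-Length: " + str(len(jpeg_bytes)).encode() + b"\r\n\r\n"
    )
    return header + jpeg_bytes + b"\r\n"
```

The browser replaces the displayed image each time a new part arrives, which is what makes the output look like live video.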

You need a webcam to run this demo. 🤗

See a collection of live demos [here](https://huggingface.co/collections/latent-consistency/latent-consistency-model-demos-654e90c52adb0688a0acbe6f).

## Running Locally

You need Python 3.10, Node > 19, and one of the following: a CUDA-capable GPU, a Mac with an M1/M2/M3 chip, or an Intel Arc GPU.


## Install

```bash
python -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
cd frontend && npm install && npm run build && cd ..
# fastest pipeline
python run.py --reload --pipeline img2imgSD21Turbo
```

# Pipelines
You can build your own pipeline by following the examples [here](pipelines).
Don't forget to build the frontend first:
```bash
cd frontend && npm install && npm run build && cd ..
```

# LCM
### Image to Image

```bash
python run.py --reload --pipeline img2img 
```

### Text to Image

```bash
python run.py --reload --pipeline txt2img 
```

### Image to Image ControlNet Canny


```bash
python run.py --reload --pipeline controlnet 
```


# LCM + LoRA

With LCM-LoRA, inference can run in as few as 4 steps. [Learn more here](https://huggingface.co/blog/lcm_lora) or read the [technical report](https://huggingface.co/papers/2311.05556).



### Image to Image ControlNet Canny LoRA

```bash
python run.py --reload --pipeline controlnetLoraSD15
```
Or use SDXL. Note that SDXL is slower than SD15 because inference runs on 1024x1024 images.

```bash
python run.py --reload --pipeline controlnetLoraSDXL
```
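
To make the cost difference concrete, the sketch below compares how many latent positions the denoiser handles per image. It assumes SD15 runs at 512x512 and that both model families use a VAE that downsamples by a factor of 8; neither figure is stated in this README.

```python
# Assumptions (not stated in this README): SD15 at 512x512, and a VAE
# downsampling factor of 8 for both model families.
VAE_FACTOR = 8


def latent_elements(width: int, height: int, factor: int = VAE_FACTOR) -> int:
    """Number of spatial latent positions the denoiser processes per image."""
    return (width // factor) * (height // factor)


sd15 = latent_elements(512, 512)     # 64 * 64 latent grid
sdxl = latent_elements(1024, 1024)   # 128 * 128 latent grid
print(f"SDXL processes {sdxl / sd15:.0f}x as many latent positions as SD15")
```

Under these assumptions the ratio is 4x per step, before accounting for any difference in model size.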

### Text to Image

```bash
python run.py --reload --pipeline txt2imgLora
```

or 

```bash
python run.py --reload --pipeline txt2imgLoraSDXL
```


### Setting environment variables


`TIMEOUT`: limit the user session timeout  
`SAFETY_CHECKER`: disable if you want the NSFW filter off  
`MAX_QUEUE_SIZE`: limit the number of users on the current app instance  
`TORCH_COMPILE`: enable if you want to use torch compile for faster inference; works well on A100 GPUs  
`USE_TAESD`: enable if you want to use Autoencoder Tiny
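
As a rough illustration of how an app might read these settings, here is a minimal sketch. The `Settings` dataclass, the `load_settings` helper, the default values, the unit of `TIMEOUT`, and the accepted truthy strings are all assumptions for illustration; the actual parsing in this repository may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical: which strings count as "enabled" is an assumption.
TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    timeout: float        # TIMEOUT; assumed to be seconds, 0 meaning no limit
    safety_checker: bool  # SAFETY_CHECKER
    max_queue_size: int   # MAX_QUEUE_SIZE; 0 assumed to mean unlimited
    torch_compile: bool   # TORCH_COMPILE
    use_taesd: bool       # USE_TAESD


def _flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret an environment value as a boolean flag."""
    value = env.get(name)
    return default if value is None else value.strip().lower() in TRUTHY


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, using assumed defaults."""
    return Settings(
        timeout=float(env.get("TIMEOUT", "0")),
        safety_checker=_flag(env, "SAFETY_CHECKER"),
        max_queue_size=int(env.get("MAX_QUEUE_SIZE", "0")),
        torch_compile=_flag(env, "TORCH_COMPILE"),
        use_taesd=_flag(env, "USE_TAESD"),
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check, for example `load_settings({"TIMEOUT": "120", "MAX_QUEUE_SIZE": "4"})`.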

If you run using `bash build-run.sh`, you can set the `PIPELINE` variable to choose the pipeline you want to run:

```bash
PIPELINE=txt2imgLoraSDXL bash build-run.sh
```

You can also set environment variables when running `run.py` directly:

```bash
TIMEOUT=120 SAFETY_CHECKER=True MAX_QUEUE_SIZE=4 python run.py --reload --pipeline txt2imgLoraSDXL
```

If you're running locally and want to test on Mobile Safari, the web server needs to be served over HTTPS. Generate a self-signed certificate and pass it to `run.py`, or follow the instructions in my [comment](https://github.com/radames/Real-Time-Latent-Consistency-Model/issues/17#issuecomment-1811957196).

```bash
openssl req -newkey rsa:4096 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem
python run.py --reload --ssl-certfile=certificate.pem --ssl-keyfile=key.pem
```

## Docker

You need the NVIDIA Container Toolkit for Docker. The image defaults to the `controlnet` pipeline.

```bash
docker build -t lcm-live .
docker run -ti -p 7860:7860 --gpus all lcm-live
```

To avoid downloading models again, reuse the model data from the host. You can change `~/.cache/huggingface` to any other directory, but if you use `huggingface-cli` locally, you can share the same cache:

```bash
docker run -ti -p 7860:7860 -e HF_HOME=/data -v ~/.cache/huggingface:/data  --gpus all lcm-live
```
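
The mount works because the Hugging Face libraries look up their cache root through `HF_HOME`, falling back to `~/.cache/huggingface` when it is unset. The helper below is a hypothetical sketch of that lookup order, not the library's actual code:

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_hf_home(env: Mapping[str, str] = os.environ) -> Path:
    """Hypothetical helper mirroring the documented HF_HOME fallback order."""
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser()
    # XDG_CACHE_HOME, when set, replaces the default ~/.cache root.
    cache_root = env.get("XDG_CACHE_HOME") or "~/.cache"
    return (Path(cache_root) / "huggingface").expanduser()


# In the container started above HF_HOME is /data, so downloads land in the
# directory mounted from the host.
container_cache = resolve_hf_home({"HF_HOME": "/data"})
```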
 

Or pass environment variables with `-e`, for example to choose the pipeline:

```bash
docker run -ti -e PIPELINE=txt2imgLoraSDXL -p 7860:7860 --gpus all lcm-live
```
# Development Mode


```bash
python run.py --reload  
```

# Demo on Hugging Face

https://huggingface.co/spaces/radames/Real-Time-Latent-Consistency-Model

https://github.com/radames/Real-Time-Latent-Consistency-Model/assets/102277/c4003ac5-e7ff-44c0-97d3-464bb659de70