README.md
CHANGED
@@ -1,3 +1,46 @@
---
license: bigscience-bloom-rail-1.0
---
tags:
- stable-diffusion
- text-to-image
inference: false
---

# Stable Diffusion v2 Model Card
This model card focuses on the Stable Diffusion v2 model, available [here](https://github.com/Stability-AI/stablediffusion).

This `stable-diffusion-2` model is resumed from [stable-diffusion-2-base](https://huggingface.co/stabilityai/stable-diffusion-2-base) (`512-base-ema.ckpt`) and trained for 150k steps using a [v-objective](https://arxiv.org/abs/2202.00512) on the same dataset. Training was then resumed for another 140k steps on `768x768` images.
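
For reference, the linked v-objective paper has the network predict `v = alpha_t * eps - sigma_t * x0` instead of the noise `eps`. The sketch below illustrates that target on plain Python floats. It assumes a variance-preserving schedule (`alpha_t**2 + sigma_t**2 == 1`); the function names and the toy values are illustrative and are not taken from this model's training code.

```python
import math


def noisy_sample(x0, eps, alpha_t, sigma_t):
    """Forward diffusion: z_t = alpha_t * x0 + sigma_t * eps."""
    return [alpha_t * x + sigma_t * e for x, e in zip(x0, eps)]


def v_target(x0, eps, alpha_t, sigma_t):
    """Regression target under v-prediction: v = alpha_t * eps - sigma_t * x0."""
    return [alpha_t * e - sigma_t * x for x, e in zip(x0, eps)]


def x0_from_v(z_t, v, alpha_t, sigma_t):
    """Recover x0 from z_t and v; valid when alpha_t**2 + sigma_t**2 == 1."""
    return [alpha_t * z - sigma_t * vi for z, vi in zip(z_t, v)]


# Toy example with a variance-preserving pair (alpha_t, sigma_t) = (cos(phi), sin(phi)).
phi = 0.3
alpha_t, sigma_t = math.cos(phi), math.sin(phi)
x0 = [0.5, -1.0, 2.0]    # stand-in for clean latents
eps = [0.1, 0.7, -0.4]   # stand-in for Gaussian noise
z_t = noisy_sample(x0, eps, alpha_t, sigma_t)
v = v_target(x0, eps, alpha_t, sigma_t)
recovered = x0_from_v(z_t, v, alpha_t, sigma_t)
```

Because `alpha_t * z_t - sigma_t * v` simplifies to `(alpha_t**2 + sigma_t**2) * x0`, `recovered` matches `x0` up to floating-point error under the stated assumption.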

![image](https://github.com/Stability-AI/stablediffusion/blob/main/assets/stable-samples/txt2img/768/merged-0005.png?raw=true)

- Use it with the [`stablediffusion`](https://github.com/Stability-AI/stablediffusion) repository: download the `768-v-ema.ckpt` [here](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/768-v-ema.ckpt).
- Use it with 🧨 [`diffusers`](https://huggingface.co/stabilityai/stable-diffusion-2#examples)
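
For the first option, the checkpoint can also be fetched from a script rather than the browser. This is a minimal sketch, assuming the `huggingface_hub` package is installed (this card does not list it as a requirement); `fetch_checkpoint` is a hypothetical helper name.

```python
REPO_ID = "stabilityai/stable-diffusion-2"
CHECKPOINT = "768-v-ema.ckpt"


def fetch_checkpoint(repo_id=REPO_ID, filename=CHECKPOINT):
    """Download the checkpoint into the local Hugging Face cache and return its path."""
    # Imported lazily so that defining the helper does not require the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `fetch_checkpoint()` needs network access and downloads a multi-gigabyte file on first use; later calls reuse the cached copy.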

## Model Details
- **Developed by:** Robin Rombach, Patrick Esser
- **Model type:** Diffusion-based text-to-image generation model
- **Language(s):** English
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses a fixed, pretrained text encoder ([OpenCLIP-ViT/H](https://github.com/mlfoundations/open_clip)).
- **Resources for more information:** [GitHub Repository](https://github.com/Stability-AI/).
- **Cite as:**

      @InProceedings{Rombach_2022_CVPR,
          author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
          title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
          booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
          month     = {June},
          year      = {2022},
          pages     = {10684-10695}
      }


## Examples

Use [🤗's Diffusers library](https://github.com/huggingface/diffusers) to run Stable Diffusion 2 in a simple and efficient manner.

```bash
pip install --upgrade git+https://github.com/huggingface/diffusers.git transformers accelerate scipy
```
Run the pipeline. If you don't swap the scheduler, it runs with the default DDIM; this example swaps it to EulerDiscreteScheduler:
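
A minimal sketch of such a pipeline run with the Euler scheduler swapped in. It assumes a CUDA-capable GPU and network access to download the weights. `StableDiffusionPipeline`, `EulerDiscreteScheduler`, and `from_pretrained(..., subfolder="scheduler")` are existing `diffusers` APIs; the `generate` helper, the dtype choice, and the output file name are illustrative and not prescribed by this card.

```python
MODEL_ID = "stabilityai/stable-diffusion-2"


def generate(prompt, out_path="astronaut_rides_horse.png", device="cuda"):
    """Generate one image for `prompt` and save it to `out_path` (hypothetical helper)."""
    # Imported lazily so that defining the helper does not load the heavy dependencies.
    import torch
    from diffusers import EulerDiscreteScheduler, StableDiffusionPipeline

    # Load the scheduler config stored with the model and use Euler instead of the default.
    scheduler = EulerDiscreteScheduler.from_pretrained(MODEL_ID, subfolder="scheduler")

    # Assumption: half precision on a GPU, full precision otherwise.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(
        MODEL_ID, scheduler=scheduler, torch_dtype=dtype
    )
    pipe = pipe.to(device)

    # The pipeline returns a list of PIL images; keep the first one.
    image = pipe(prompt).images[0]
    image.save(out_path)
    return out_path
```

For example, `generate("a photo of an astronaut riding a horse on mars")` downloads the model on first use and writes the result to `astronaut_rides_horse.png`.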