update README
- .gitattributes +1 -0
- README.md +57 -0
- examples/controlnet.png +3 -0
- examples/framework.png +3 -0
- examples/qualitative.png +3 -0
- examples/teaser.png +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
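The new `*.png` rule routes every PNG in the repository through Git LFS, which is why the four example images added in this commit are stored as LFS objects. As a rough illustration, the sketch below checks which paths the slash-free patterns in this hunk would catch. It matches each pattern against the file's basename with Python's `fnmatch`, which only approximates gitattributes matching (Git's own matcher has more rules), and `is_lfs_tracked` is a hypothetical helper, not part of Git or Git LFS.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Slash-free patterns from the hunk above (the last one is the new rule).
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "*.png"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: a pattern without '/' is matched against the basename."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for path in ["examples/teaser.png", "examples/framework.png", "README.md"]:
        print(path, is_lfs_tracked(path))
```

In a real checkout, `git check-attr filter -- examples/teaser.png` reports the attribute Git actually applies, and is the authoritative check.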
README.md
CHANGED
@@ -1,3 +1,60 @@
 ---
+tags:
+- text-to-image
+- stable-diffusion
 license: apache-2.0
+language:
+- en
+library_name: diffusers
 ---
+
+# EasyRef Model Card
+
+<div align="center">
+
+[**Project Page**](https://easyref-gen.github.io/) **|** [**Paper**](https://arxiv.org/pdf/2412.09618) **|** [**Code**](https://github.com/TempleX98/EasyRef) **|** [🤗 **Demo**](https://huggingface.co/spaces/zongzhuofan/EasyRef)
+
+
+</div>
+
+## Introduction
+
+EasyRef models the consistent visual elements shared across a group of reference images, using a single generalist multimodal LLM in a zero-shot setting.
+
+<div align="center">
+<img src='examples/framework.png'>
+</div>
+
+## Demos
+More visualization examples are available on our [project page](https://easyref-gen.github.io/).
+### Style, Identity, and Character Preservation
+<img src='examples/teaser.png'>
+
+### Comparison with IP-Adapter
+
+<img src='examples/qualitative.png'>
+
+### Compatibility with ControlNet
+
+<img src='examples/controlnet.png'>
+
+## Inference
+We provide inference code for EasyRef with SDXL in [**easyref_demo**](https://github.com/TempleX98/EasyRef/blob/main/easyref_demo.ipynb).
+
+### Usage Tips
+- EasyRef performs best when provided with multiple reference images (more than 2).
+- For better identity preservation, we strongly recommend uploading multiple square face images in which the face occupies most of the frame.
+- Using multimodal prompts (both reference images and a non-empty text prompt) can achieve better results.
+- We set `scale=1.0` by default. Lowering the `scale` value leads to more diverse but less consistent generation results.
+
+## Cite
+If you find EasyRef useful for your research and applications, please cite us using this BibTeX:
+
+```bibtex
+@article{easyref,
+title={EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM},
+author={Zong, Zhuofan and Jiang, Dongzhi and Ma, Bingqi and Song, Guanglu and Shao, Hao and Shen, Dazhong and Liu, Yu and Li, Hongsheng},
+journal={arXiv preprint arXiv:2412.09618},
+year={2024}
+}
+```
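The usage tips in the README above can be turned into a small pre-flight check before calling the model. The sketch below is illustrative only: `center_square_box` and `check_reference_set` are hypothetical helper names, not part of the EasyRef code base, and the checks simply restate the tips (more than two references, square images, a non-empty text prompt). The square-image check matters mainly for the identity-preservation tip.

```python
from typing import List, Tuple

# (left, upper, right, lower): the box layout that Pillow's Image.crop accepts.
Box = Tuple[int, int, int, int]

MIN_REFERENCES = 3  # README: "multiple reference images (more than 2)"


def center_square_box(width: int, height: int) -> Box:
    """Return the largest centered square crop box for an image of the given size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def check_reference_set(sizes: List[Tuple[int, int]], prompt: str) -> List[str]:
    """Return human-readable warnings for reference image sizes and a text prompt."""
    warnings = []
    if len(sizes) < MIN_REFERENCES:
        warnings.append(f"only {len(sizes)} reference image(s); the README suggests more than 2")
    non_square = [i for i, (w, h) in enumerate(sizes) if w != h]
    if non_square:
        warnings.append(f"non-square reference images at indices {non_square}")
    if not prompt.strip():
        warnings.append("empty text prompt; the README suggests a non-empty prompt")
    return warnings


if __name__ == "__main__":
    print(center_square_box(1024, 768))
    print(check_reference_set([(512, 512), (640, 480)], ""))
```

The box returned by `center_square_box` can be passed to an image library's crop call to square up a reference image before it is handed to the pipeline in the demo notebook.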
examples/controlnet.png
ADDED (Git LFS)
examples/framework.png
ADDED (Git LFS)
examples/qualitative.png
ADDED (Git LFS)
examples/teaser.png
ADDED (Git LFS)