A157801 committed
Commit de9cf4e · 1 Parent(s): b37bc2b

update README

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,60 @@
  ---
+ tags:
+ - text-to-image
+ - stable-diffusion
  license: apache-2.0
+ language:
+ - en
+ library_name: diffusers
  ---
+
+ # EasyRef Model Card
+
+ <div align="center">
+
+ [**Project Page**](https://easyref-gen.github.io/) **|** [**Paper**](https://arxiv.org/pdf/2412.09618) **|** [**Code**](https://github.com/TempleX98/EasyRef) **|** [🤗 **Demo**](https://huggingface.co/spaces/zongzhuofan/EasyRef)
+
+
+ </div>
+
+ ## Introduction
+
+ EasyRef is capable of modeling the consistent visual elements of various group image references with a single generalist multimodal LLM in a zero-shot setting.
+
+ <div align="center">
+ <img src='examples/framework.png'>
+ </div>
+
+ ## Demos
+ More visualization examples are available in our [project page](https://easyref-gen.github.io/).
+ ### Style, Identity, and Character Preservation
+ <img src='examples/teaser.png'>
+
+ ### Comparison with IP-Adapter
+
+ <img src='examples/qualitative.png'>
+
+ ### Compatibility with ControlNet
+
+ <img src='examples/controlnet.png'>
+
+ ## Inference
+ We provide the inference code of EasyRef with SDXL in [**easyref_demo**](https://github.com/TempleX98/EasyRef/blob/main/easyref_demo.ipynb).
+
+ ### Usage Tips
+ - EasyRef performs best when provided with multiple reference images (more than 2).
+ - To ensure better identity preservation, we strongly recommend that users upload multiple square face images, ensuring the face occupies the majority of each image.
+ - Using multimodal prompts (both reference images and non-empty text prompt) can achieve better results.
+ - We set `scale=1.0` by default. Lowering the `scale` value leads to more diverse but less consistent generation results.
+
+ ## Cite
+ If you find EasyRef useful for your research and applications, please cite us using this BibTeX:
+
+ ```bibtex
+ @article{easyref,
+ title={EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM},
+ author={Zong, Zhuofan and Jiang, Dongzhi and Ma, Bingqi and Song, Guanglu and Shao, Hao and Shen, Dazhong and Liu, Yu and Li, Hongsheng},
+ journal={arXiv preprint arXiv:2412.09618},
+ year={2024}
+ }
+ ```
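
The Usage Tips in the README diff above ask for more than 2 reference images, square crops, and a non-empty text prompt. The following is a minimal pre-flight sketch of those three checks, not part of EasyRef itself: `center_square_box` and `check_inputs` are hypothetical helper names introduced here for illustration, and the centered crop assumes the face already sits near the middle of the image. The returned box uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts.

```python
from typing import List, Sequence, Tuple


def center_square_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, upper, right, lower) of the largest centered square."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def check_inputs(sizes: Sequence[Tuple[int, int]], prompt: str) -> List[str]:
    """Collect warnings for inputs that deviate from the README's usage tips.

    `sizes` holds one (width, height) pair per reference image.
    """
    warnings = []
    if len(sizes) <= 2:
        # The README recommends more than 2 reference images.
        warnings.append("fewer than 3 reference images")
    if any(w != h for w, h in sizes):
        warnings.append("some reference images are not square")
    if not prompt.strip():
        warnings.append("text prompt is empty")
    return warnings


if __name__ == "__main__":
    sizes = [(640, 480), (512, 512)]
    print(check_inputs(sizes, "a portrait in watercolor style"))
    print([center_square_box(w, h) for w, h in sizes])
```

A centered crop is only a starting point: if the face is off-center, the crop box would have to come from a face detector or be chosen by hand.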
examples/controlnet.png ADDED

Git LFS Details

  • SHA256: f38c3173769e97d6baedb27d68cffb0cea22512b44f98a71bd55aa5f220e767c
  • Pointer size: 132 Bytes
  • Size of remote file: 3.04 MB
examples/framework.png ADDED

Git LFS Details

  • SHA256: 86f1ba397b70973e257572e25d2dfc9e7fe34d0836c9ffa852be5c32bdcf8832
  • Pointer size: 131 Bytes
  • Size of remote file: 218 kB
examples/qualitative.png ADDED

Git LFS Details

  • SHA256: b9ba730253d3a9070dd8698441a88752a00257695682ca6bc1a314af022b17db
  • Pointer size: 132 Bytes
  • Size of remote file: 4 MB
examples/teaser.png ADDED

Git LFS Details

  • SHA256: af3643dbb20cc10d8fe736ffd0fda7e0da30d555034de8b1a528f6b575629881
  • Pointer size: 132 Bytes
  • Size of remote file: 2.39 MB
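
The SHA256 values in the Git LFS Details entries above are the SHA-256 digests of the actual image files, so a local copy can be checked against them. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the relative paths assume the script runs from the repository root after the LFS files have been fetched. A checkout that still contains LFS pointer files will report a mismatch.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_oid(path, expected_sha256):
    """Compare a file's SHA-256 with the value listed for it."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# Values copied from the Git LFS Details entries of this commit.
EXPECTED = {
    "examples/controlnet.png": "f38c3173769e97d6baedb27d68cffb0cea22512b44f98a71bd55aa5f220e767c",
    "examples/framework.png": "86f1ba397b70973e257572e25d2dfc9e7fe34d0836c9ffa852be5c32bdcf8832",
    "examples/qualitative.png": "b9ba730253d3a9070dd8698441a88752a00257695682ca6bc1a314af022b17db",
    "examples/teaser.png": "af3643dbb20cc10d8fe736ffd0fda7e0da30d555034de8b1a528f6b575629881",
}

if __name__ == "__main__":
    for name, oid in EXPECTED.items():
        if not Path(name).exists():
            print(name, "not found locally")
        else:
            print(name, "ok" if matches_lfs_oid(name, oid) else "MISMATCH")
```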