update README

Files changed (6)

.gitattributes +1 -0
README.md +57 -0
examples/controlnet.png +3 -0
examples/framework.png +3 -0
examples/qualitative.png +3 -0
examples/teaser.png +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,60 @@
 ---
+tags:
+- text-to-image
+- stable-diffusion
 license: apache-2.0
+language:
+- en
+library_name: diffusers
 ---
+# EasyRef Model Card
+<div align="center">
+[**Project Page**](https://easyref-gen.github.io/) **|** [**Paper**](https://arxiv.org/pdf/2412.09618) **|** [**Code**](https://github.com/TempleX98/EasyRef) **|** [🤗 **Demo**](https://huggingface.co/spaces/zongzhuofan/EasyRef)
+</div>
+## Introduction
+EasyRef uses a single generalist multimodal LLM to model the consistent visual elements shared across a group of reference images in a zero-shot setting.
+<div  align="center">
+<img src='examples/framework.png'>
+</div>
+## Demos
+More visualization examples are available on our [project page](https://easyref-gen.github.io/).
+### Style, Identity, and Character Preservation
+<img src='examples/teaser.png'>
+### Comparison with IP-Adapter
+<img src='examples/qualitative.png'>
+### Compatibility with ControlNet
+<img src='examples/controlnet.png'>
+## Inference
+We provide the inference code of EasyRef with SDXL in [**easyref_demo**](https://github.com/TempleX98/EasyRef/blob/main/easyref_demo.ipynb).
+### Usage Tips
+- EasyRef performs best when provided with multiple reference images (more than 2).
+- To ensure better identity preservation, we strongly recommend that users upload multiple square face images, ensuring the face occupies the majority of each image.
+- Using multimodal prompts (both reference images and a non-empty text prompt) can achieve better results.
+- We set `scale=1.0` by default. Lowering the `scale` value leads to more diverse but less consistent generation results.
+## Cite
+If you find EasyRef useful for your research and applications, please cite us using this BibTeX:
+```bibtex
+@article{easyref,
+  title={EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM},
+  author={Zong, Zhuofan and Jiang, Dongzhi and Ma, Bingqi and Song, Guanglu and Shao, Hao and Shen, Dazhong and Liu, Yu and Li, Hongsheng},
+  journal={arXiv preprint arXiv:2412.09618},
+  year={2024}
+}
+```

examples/controlnet.png ADDED

Git LFS Details

SHA256: f38c3173769e97d6baedb27d68cffb0cea22512b44f98a71bd55aa5f220e767c
Pointer size: 132 Bytes
Size of remote file: 3.04 MB

examples/framework.png ADDED

Git LFS Details

SHA256: 86f1ba397b70973e257572e25d2dfc9e7fe34d0836c9ffa852be5c32bdcf8832
Pointer size: 131 Bytes
Size of remote file: 218 kB

examples/qualitative.png ADDED

Git LFS Details

SHA256: b9ba730253d3a9070dd8698441a88752a00257695682ca6bc1a314af022b17db
Pointer size: 132 Bytes
Size of remote file: 4 MB

examples/teaser.png ADDED

Git LFS Details

SHA256: af3643dbb20cc10d8fe736ffd0fda7e0da30d555034de8b1a528f6b575629881
Pointer size: 132 Bytes
Size of remote file: 2.39 MB
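
The SHA256 values in the Git LFS details above can be used to check a downloaded image. The following is a minimal sketch, not part of this commit: it assumes the PNG has already been fetched to a local path (the path in the usage note is hypothetical) and that the pointer files follow the standard Git LFS v1 layout of `version`, `oid sha256:` and `size` lines. The helper names are illustrative only.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_oid(path, expected_oid_hex):
    """Check a local file against the SHA256 shown in the LFS details."""
    return sha256_of_file(path) == expected_oid_hex.lower()


def lfs_pointer_text(oid_hex, size_bytes):
    """Build the text of a Git LFS v1 pointer file for a given oid and size."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid_hex}\n"
        f"size {size_bytes}\n"
    )
```

For example, `matches_lfs_oid("examples/teaser.png", "af3643dbb20cc10d8fe736ffd0fda7e0da30d555034de8b1a528f6b575629881")` should return `True` for an intact local copy. The pointer layout would also explain the listed pointer sizes: under this layout the fixed text takes 125 bytes, so a 7-digit byte count gives 132 bytes and a 6-digit count (as for the 218 kB `framework.png`) gives 131. The exact byte counts are not shown on this page.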