ponytail committed on
Commit
a70b13c
·
verified ·
1 Parent(s): f5426be

Update README.md

Files changed (1)
  1. README.md +9 -1
README.md CHANGED
@@ -103,7 +103,15 @@ verified_shikra: https://github.com/shikras/shikra
  ## Citation
  
  ```
- Coming soon!!!
+ @misc{dai2024humanvlmfoundationhumanscenevisionlanguage,
+ title={HumanVLM: Foundation for Human-Scene Vision-Language Model},
+ author={Dawei Dai and Xu Long and Li Yutang and Zhang Yuanhui and Shuyin Xia},
+ year={2024},
+ eprint={2411.03034},
+ archivePrefix={arXiv},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/abs/2411.03034},
+ }
  ```
  
  ## contact