erwannd committed (verified)
Commit 1f311bb · Parent(s): 4225c96

Update README.md

Files changed (1): README.md (+19 -0)
README.md CHANGED
@@ -22,6 +22,12 @@ This model aims to detect visual manipulation in bar charts.
 - **Model type:** Multi-Modal LLM
 - **Finetuned from model:** llava-1.6-mistral-7b
 
+## How to Get Started with the Model
+
+This is not a HuggingFace-based model; please refer to this
+[Colab notebook](https://colab.research.google.com/drive/1UpnztYv46faXj-kmFpL_GAbOCjP2u6zM?usp=sharing)
+to run inference. It runs only on a GPU.
+
 ## Training Details
 
 Finetuned with LoRA for 1 epoch on ~2700 images of misleading and non-misleading bar charts.
@@ -49,3 +55,16 @@ bias="none"
 #### Training Hyperparameters
 
 - **Training regime:** bf16 non-mixed precision
+
+
+## Citation
+
+<!-- If there is a paper or blog post introducing the model, the APA and BibTeX information for that should go in this section. -->
+
+**References:**
+
+- Liu, Haotian, Li, Chunyuan, Li, Yuheng, Li, Bo, Zhang, Yuanhan, Shen, Sheng, & Lee, Yong Jae. (2024, January). **LLaVA-NeXT: Improved reasoning, OCR, and world knowledge**. Retrieved from [https://llava-vl.github.io/blog/2024-01-30-llava-next/](https://llava-vl.github.io/blog/2024-01-30-llava-next/).
+
+- Liu, Haotian, Li, Chunyuan, Li, Yuheng, & Lee, Yong Jae. (2023). **Improved Baselines with Visual Instruction Tuning**. *arXiv:2310.03744*.
+
+- Liu, Haotian, Li, Chunyuan, Wu, Qingyang, & Lee, Yong Jae. (2023). **Visual Instruction Tuning**. *NeurIPS*.
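The "How to Get Started with the Model" section added above defers inference entirely to the linked Colab notebook. As a rough orientation only, adapter inference with the upstream LLaVA codebase ([github.com/haotian-liu/LLaVA](https://github.com/haotian-liu/LLaVA)) typically follows the pattern below; this is a sketch, not the notebook's code, and `path/to/lora-adapter`, `chart.png`, and the prompt are placeholders. It assumes the LLaVA repo is installed and a CUDA GPU is available.

```python
# Sketch of inference via the upstream LLaVA repo (github.com/haotian-liu/LLaVA),
# assuming it is installed and a CUDA GPU is present. Paths are placeholders.
from llava.eval.run_llava import eval_model
from llava.mm_utils import get_model_name_from_path

model_path = "path/to/lora-adapter"              # placeholder: downloaded LoRA weights
model_base = "liuhaotian/llava-v1.6-mistral-7b"  # base checkpoint named by the card

# eval_model expects an argparse-like object; this mirrors upstream's quick start.
args = type("Args", (), {
    "model_path": model_path,
    "model_base": model_base,
    "model_name": get_model_name_from_path(model_path),
    "query": "Is this bar chart misleading? Explain.",  # placeholder prompt
    "conv_mode": None,
    "image_file": "chart.png",                          # placeholder image
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)  # prints the model's answer to stdout
```

Note that upstream's loader decides whether to treat the checkpoint as a LoRA adapter based on the model name, so the adapter directory name should contain "lora".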
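On the training side, the second hunk's context line `bias="none"` is one of the card's LoRA settings, and the hyperparameters list a bf16 non-mixed-precision regime over 1 epoch. A minimal PEFT sketch consistent with those two facts might look like the following; the rank, alpha, dropout, and target modules are placeholders the card does not state.

```python
import torch
from peft import LoraConfig

# Only bias="none" and the bf16 regime come from the card;
# every other value here is a placeholder for illustration.
lora_config = LoraConfig(
    r=16,                 # placeholder rank
    lora_alpha=32,        # placeholder scaling factor
    lora_dropout=0.05,    # placeholder dropout
    bias="none",          # stated in the card's LoRA settings
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder
    task_type="CAUSAL_LM",
)

# "bf16 non-mixed precision" suggests the base model is held in bfloat16
# outright rather than trained under an autocast/mixed-precision wrapper:
dtype = torch.bfloat16
```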