This model aims to detect visual manipulation in bar charts.
- **Model type:** Multi-Modal LLM
- **Finetuned from model:** llava-1.6-mistral-7b
## How to Get Started with the Model

This is not a HuggingFace-based model; please refer to this [Colab notebook](https://colab.research.google.com/drive/1UpnztYv46faXj-kmFpL_GAbOCjP2u6zM?usp=sharing) to run inference. Inference only works on a GPU.
## Training Details

Finetuned with LoRA for 1 epoch on ~2700 images of misleading and non-misleading bar charts.
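LoRA leaves the base weight matrix `W` frozen and learns a low-rank update `ΔW = (α/r)·B·A`, with `B` initialized to zero so training starts exactly at the base model. A minimal pure-Python sketch of that idea (the shapes and the `alpha`/`r` values below are illustrative, not the configuration used for this model):

```python
def lora_delta(B, A, alpha, r):
    """Low-rank update (alpha/r) * B @ A, with B: d_out x r and A: r x d_in."""
    scale = alpha / r
    return [[scale * sum(B[i][k] * A[k][j] for k in range(r))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def lora_forward(x, W, B, A, alpha, r):
    """y = (W + (alpha/r) * B @ A) @ x for a vector x of length d_in."""
    delta = lora_delta(B, A, alpha, r)
    return [sum((W[i][j] + delta[i][j]) * x[j] for j in range(len(x)))
            for i in range(len(W))]

# B starts at zero, so the adapted model initially matches the base model.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight (illustrative)
A = [[0.5, -0.5]]              # r=1 trainable down-projection
B0 = [[0.0], [0.0]]            # zero-initialized up-projection
x = [2.0, 4.0]
assert lora_forward(x, W, B0, A, alpha=16, r=1) == [2.0, 4.0]
```

Only `A` and `B` receive gradients during finetuning, which is why a 7B-parameter base like llava-1.6-mistral-7b can be adapted on a modest dataset of ~2700 images.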
#### Training Hyperparameters

- **Training regime:** bf16 non-mixed precision
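bf16 keeps float32's 8-bit exponent (so its dynamic range matches fp32) but shrinks the mantissa from 23 bits to 7; "non-mixed" means weights and gradients stay in bf16 throughout, rather than keeping fp32 master weights under autocast. A self-contained sketch of bf16 quantization (truncation variant; real hardware typically rounds to nearest even):

```python
import struct

def bf16_trunc(x: float) -> float:
    """Drop the low 16 bits of a float32, leaving a 7-bit mantissa (bfloat16)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# The exponent field survives intact, so large fp32 values stay representable
# (unlike fp16, which overflows past ~65504); only precision is lost.
assert bf16_trunc(3.14159274) == 3.140625
assert bf16_trunc(1.0) == 1.0
```

The reduced mantissa is usually an acceptable trade for the halved memory footprint and fp32-like range, which is why bf16 is a common choice for LoRA finetuning.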
## Citation

<!-- If there is a paper or blog post introducing the model, the APA and BibTeX information for that should go in this section. -->

**APA:**

- Liu, Haotian, Li, Chunyuan, Li, Yuheng, Li, Bo, Zhang, Yuanhan, Shen, Sheng, & Lee, Yong Jae. (2024, January). **LLaVA-NeXT: Improved reasoning, OCR, and world knowledge**. Retrieved from [https://llava-vl.github.io/blog/2024-01-30-llava-next/](https://llava-vl.github.io/blog/2024-01-30-llava-next/)
- Liu, Haotian, Li, Chunyuan, Li, Yuheng, & Lee, Yong Jae. (2023). **Improved Baselines with Visual Instruction Tuning**. *arXiv:2310.03744*.
- Liu, Haotian, Li, Chunyuan, Wu, Qingyang, & Lee, Yong Jae. (2023). **Visual Instruction Tuning**. *NeurIPS*.