kaist-ai/volcano-7b



Seongyun committed on Nov 13, 2023

Commit 256f910 · 1 Parent(s): 2683acc

Update README.md

Files changed (1)

README.md +7 -4


@@ -5,16 +5,19 @@ Volcano employs a single LMM to generate initial responses, feedback, and revisi
 # Model details
 **Model type:**
-Volcano is a multimodal self-feedback guided revision model that was trained using the vicuna model with visual instruction tuning data and multimodal feedback and revision data obtained through gpt-3.5-turbo, following the methodology of LLaVA.
 **Model date:**
 Volcano-7b was trained in October 2023.
 **Paper or resources for more information:**
-## Training dataset
-- 274k Volcano-train data
 - 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
 - 158K GPT-generated multimodal instruction-following data.
 - 450K academic-task-oriented VQA data mixture.
-- 40K ShareGPT data.

 # Model details
 **Model type:**
+Volcano is a multimodal self-feedback guided revision model that was fine-tuned by mixing the visual instruction tuning dataset used in LLaVA-1.5 with multimodal feedback and revision data collected through gpt-3.5-turbo, applied to the vicuna model.
 **Model date:**
 Volcano-7b was trained in October 2023.
 **Paper or resources for more information:**
+# Training dataset
+- 274k multimodal feedback and revision data
 - 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
 - 158K GPT-generated multimodal instruction-following data.
 - 450K academic-task-oriented VQA data mixture.
+- 40K ShareGPT data
+# Evaluation dataset