kaist-ai/volcano-7b

text-generation · visual-question-answering · image-captioning


Seongyun committed on Nov 13, 2023

Commit 2683acc · 1 Parent(s): fc14337

Update README.md

Files changed (1)

README.md +18 -1

README.md CHANGED

@@ -1,3 +1,20 @@
 # Overview
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550c4f27bbfce1878f5f280/AnqbCNf6pRiQ_5uNX0r4d.png)
-Volcano employs a single LMM to generate initial responses, feedback, and revisions, as well as decisions to accept revisions. It follows a sequential procedure of an iterative critique-revision-decide loop.

+Volcano employs a single LMM to generate initial responses, feedback, and revisions, as well as decisions to accept revisions. It follows a sequential procedure of an iterative critique-revision-decide loop.
+# Model details
+**Model type:**
+Volcano is a multimodal model for self-feedback-guided revision. It is built on the Vicuna model and was trained on visual instruction tuning data together with multimodal feedback and revision data collected with gpt-3.5-turbo, following the LLaVA methodology.
+**Model date:**
+Volcano-7b was trained in October 2023.
+**Paper or resources for more information:**
+## Training dataset
+- 274K Volcano-train data.
+- 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
+- 158K GPT-generated multimodal instruction-following data.
+- 450K academic-task-oriented VQA data mixture.
+- 40K ShareGPT data.
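
The critique-revision-decide loop described in the Overview can be sketched as below. This is only an illustration under stated assumptions, not the released inference code: the `generate(image, prompt)` callable, the bracketed prompt tags, the `"B"` verdict convention, the `max_iters` cap, and the rule that a rejected revision ends the loop are all hypothetical choices made for the example.

```python
from typing import Any, Callable

# Hypothetical interface: one multimodal model exposed as generate(image, prompt) -> str.
Generate = Callable[[Any, str], str]


def critique_revise_decide(generate: Generate, image: Any, question: str,
                           max_iters: int = 3) -> str:
    """Run the loop with a single model and return the final answer."""
    # Initial response from the single LMM.
    answer = generate(image, f"[answer] {question}")
    for _ in range(max_iters):
        # Critique: the same model writes feedback on its current answer.
        feedback = generate(image, f"[feedback] Q: {question} A: {answer}")
        # Revise: the same model rewrites the answer using that feedback.
        revision = generate(
            image, f"[revise] Q: {question} A: {answer} Feedback: {feedback}")
        # Decide: the same model chooses between the current answer (A)
        # and the revision (B).
        verdict = generate(
            image, f"[decide] Q: {question} A: {answer} B: {revision}")
        if verdict.strip().upper() != "B":
            break  # assumption: a rejected revision ends the loop
        answer = revision  # accepted: the revision becomes the current answer
    return answer
```

Any callable with the `generate(image, prompt)` shape can be passed in, so the control flow can be exercised with a stub before wiring in a real model. The key point it shows is that all four roles (answer, feedback, revision, decision) are served by the same model.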