Update README.md
README.md CHANGED
@@ -5,7 +5,7 @@ Volcano employs a single LMM to generate initial responses, feedback, and revisi
 # Model details
 
 **Model type:**
-Volcano is a multimodal self-feedback guided revision model that was fine-tuned by mixing the visual instruction tuning dataset used in LLaVA-
+Volcano-7b is a multimodal self-feedback guided revision model that was fine-tuned by mixing the visual instruction tuning dataset used in [LLaVA-v1.5](https://llava-vl.github.io/) with multimodal feedback and revision data collected through gpt-3.5-turbo, applied to the [vicuna-7b-v1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5) model.
 
 **Model date:**
 Volcano-7b was trained in October 2023.
@@ -13,7 +13,7 @@ Volcano-7b was trained in October 2023.
 **Paper or resources for more information:**
 
 # Training dataset
--
+- 274K multimodal feedback and revision data
 - 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
 - 158K GPT-generated multimodal instruction-following data.
 - 450K academic-task-oriented VQA data mixture.