kaist-ai/volcano-7b

text-generation

visual-question-answering

image-captioning

Seongyun committed on Nov 13, 2023

Commit c449933 · 1 Parent(s): 9314773

Update README.md

Files changed (1)

README.md +19 -1

@@ -1,3 +1,21 @@
+---
+tags:
+- image-to-text
+- visual-question-answering
+- image-captioning
+datasets:
+- kaist-ai/Feedback-Collection
+license: apache-2.0
+language:
+- en
+pipeline_tag: image-to-text
+library_name: transformers
+---
+## Links for Reference
+- **Repository:**
+- **Paper:**
 # Overview
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550c4f27bbfce1878f5f280/AnqbCNf6pRiQ_5uNX0r4d.png)
 Volcano employs a single LMM to generate initial responses, feedback, and revisions, as well as decisions to accept revisions. It follows a sequential procedure of an iterative critique-revision-decide loop.
@@ -22,4 +40,4 @@ Volcano-7b was trained in October 2023.
 You can find [here](https://huggingface.co/datasets/kaist-ai/volcano-train) the dataset used to train Volcano, which includes all the aforementioned datasets.
 # Evaluation dataset
-A collection of three multimodal hallucination benchmarks ([MMHal-Bench](https://huggingface.co/datasets/Shengcao1006/MMHal-Bench), [Pope](https://github.com/RUCAIBox/POPE), [GAVIE](https://github.com/FuxiaoLiu/LRV-Instruction)) and two multimodal understanding benchmarks ([MM-Vet](https://github.com/yuweihao/MM-Vet), [MMBench](https://github.com/open-compass/MMBench)).
+A collection of three multimodal hallucination benchmarks ([MMHal-Bench](https://huggingface.co/datasets/Shengcao1006/MMHal-Bench), [Pope](https://github.com/RUCAIBox/POPE), [GAVIE](https://github.com/FuxiaoLiu/LRV-Instruction)) and two multimodal understanding benchmarks ([MM-Vet](https://github.com/yuweihao/MM-Vet), [MMBench](https://github.com/open-compass/MMBench)).
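
The Overview text in this README describes a single LMM that produces an initial response, feedback, a revision, and a decision on whether to accept the revision, repeated as a critique-revision-decide loop. The following is a minimal sketch of that control flow only, not the authors' implementation. Everything named here is an assumption for illustration: the `lmm` callable (a hypothetical function that maps a prompt to text and is assumed to already have the image attached), the prompt wording, the `max_iterations` cap, and the rule that the loop stops as soon as a revision is rejected.

```python
from typing import Callable


def critique_revise_decide(
    lmm: Callable[[str], str],
    question: str,
    max_iterations: int = 3,  # assumed cap; the README does not state one
) -> str:
    """Run a critique-revision-decide loop with one model (illustrative only).

    `lmm` is a hypothetical callable: prompt in, generated text out.
    All prompt templates below are made up for this sketch.
    """
    # Step 0: the model produces an initial response.
    response = lmm(f"[answer]\nQuestion: {question}")

    for _ in range(max_iterations):
        # Step 1 (critique): the same model writes feedback on its response.
        feedback = lmm(
            f"[critique]\nQuestion: {question}\nResponse: {response}"
        )

        # Step 2 (revision): the same model rewrites the response using the feedback.
        revision = lmm(
            f"[revise]\nQuestion: {question}\nResponse: {response}\n"
            f"Feedback: {feedback}"
        )

        # Step 3 (decide): the same model picks between the current response (A)
        # and the revision (B).
        decision = lmm(
            f"[decide]\nQuestion: {question}\nA: {response}\nB: {revision}\n"
            "Which answer is better, A or B?"
        )

        if decision.strip().upper().startswith("B"):
            # Revision accepted: keep it and try to improve it again.
            response = revision
        else:
            # Revision rejected: stop early (assumed stopping rule).
            break

    return response
```

To try it, pass any function that takes a prompt string and returns a string, for example a thin wrapper around a model's generate call with the image bound in advance.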