yonatanbitton committed
Commit 2604091 · 1 parent: c6c05f8
Upload visit_bench_leaderboard.tsv
visit_bench_leaderboard.tsv CHANGED (+16 -12)
@@ -1,12 +1,16 @@
-Model
-Human Verified GPT-4 Reference
-LLaVA (13B)
-
-
-
-
-VisualGPT (Da Vinci 003)
-MiniGPT-4 (7B)
-OpenFlamingo (9B)
-
-
+Category	Model	Elo	matches	Win vs. Reference (w/ # ratings)
+Single Image	Human Verified GPT-4 Reference	1370	5442	-
+Single Image	LLaVA (13B)	1106	5446	17.81% (n=494)
+Single Image	LlamaAdapter-v2 (7B)	1082	5445	13.75% (n=502)
+Single Image	mPLUG-Owl (7B)	1081	5452	15.29% (n=497)
+Single Image	InstructBLIP (13B)	1011	5444	13.73% (n=517)
+Single Image	Otter (9B)	991	5450	6.84% (n=512)
+Single Image	VisualGPT (Da Vinci 003)	972	5445	1.52% (n=527)
+Single Image	MiniGPT-4 (7B)	921	5442	3.26% (n=522)
+Single Image	OpenFlamingo (9B)	877	5449	2.86% (n=524)
+Single Image	PandaGPT (13B)	826	5441	2.63% (n=533)
+Single Image	Multimodal GPT	763	5450	0.18% (n=544)
+Multiple Images	Human Verified GPT-4 Reference	1192	180	-
+Multiple Images	mPLUG-Owl	995	180	6.67% (n=60)
+Multiple Images	Otter	911	180	1.69% (n=59)
+Multiple Images	OpenFlamingo	902	180	1.67% (n=60)