---
license: apache-2.0
tags:
- llava
pipeline_tag: image-text-to-text
---
**Base Model**: BLIP2-t5 pretrained version

**Finetune data**: 
* LLAVA 150k (for multi-round conversations, one instruction-answer pair is sampled)
* MiniGPT4 3500 pairs
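
The single-pair sampling for multi-round conversations could look roughly like the sketch below. It assumes each conversation is a list of alternating instruction and answer turns stored as dicts with a `value` key, and that the pair is drawn uniformly at random; the function name `sample_one_pair`, the data layout, and the sampling rule are illustrative assumptions, not taken from the actual preprocessing code.

```python
import random


def sample_one_pair(conversation, rng=random):
    """Pick one (instruction, answer) pair from an alternating list of turns.

    Assumption: even-indexed turns are instructions and odd-indexed turns
    are the matching answers. A trailing unanswered instruction is ignored.
    """
    # Group consecutive turns: (turn 0, turn 1), (turn 2, turn 3), ...
    pairs = [
        (conversation[i]["value"], conversation[i + 1]["value"])
        for i in range(0, len(conversation) - 1, 2)
    ]
    # Uniform random choice is an assumption about what "sample" means here.
    return rng.choice(pairs)
```

A single-round conversation yields its only pair, so the same function can be applied to every record.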

**Hyper-parameters**: 

* BLIP2-flant5-xl + LLAVA (initial commits)
  * **v0**:
    * lr = 2e-5, decayed to 0.0 with a cosine lr scheduler
    * gbs = 32
    * image size = 480
    * weight decay = 0.05

  * **v1 (same as LLAVA)**:
    * lr = 2e-5
    * gbs = 32
    * image size = 224
    * weight decay = 0.0

* Others
  * lr = 2e-5
  * gbs = 32
  * image size = 224
  * weight decay = 0.0
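
For reference, the cosine decay used in the v0 setting (lr from 2e-5 down to 0.0) can be sketched as below. This is a minimal illustration that assumes no warmup phase and a known total number of training steps; `cosine_lr` and `total_steps` are hypothetical names, and the actual training code may differ.

```python
import math


def cosine_lr(step, total_steps, base_lr=2e-5, min_lr=0.0):
    """Learning rate at `step` under plain cosine decay (no warmup assumed)."""
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    # Half-cosine curve from base_lr (progress = 0) to min_lr (progress = 1).
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The learning rate starts at `base_lr`, passes through the midpoint between `base_lr` and `min_lr` halfway through training, and reaches `min_lr` at the final step.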