xjtupanda committed on
Commit
a4c328c
·
verified ·
1 Parent(s): 749e0f9

Update README.md

Files changed (1)
  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ datasets:
+ - xwwu/video-chatgpt
+ - Share14/ShareGemini
+ base_model:
+ - HuggingFaceM4/Idefics3-8B-Llama3
+ pipeline_tag: video-text-to-text
+ tags:
+ - Idefics3
+ - finetune
+ - MLLM
+ license: apache-2.0
+ language:
+ - en
+ library_name: transformers
+ ---
+
+
+ <h1>T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs</h1>
+
+
+ [GitHub](https://github.com/xjtupanda/T2Vid) | [Paper](https://arxiv.org/pdf/24xx.xxxxx)
+
+
+ ## Model Summary
+
+ * This model is part of the [T2Vid](https://github.com/xjtupanda/T2Vid) project.
+ * The video-LLM is fine-tuned from the image-LLM [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3).
+
+
+ ## License
+
+ #### Model License
+
+ * The model is built on top of the pre-trained model [HuggingFaceM4/Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3). We release the fine-tuned Idefics3 checkpoints under the Apache 2.0 license.
+ * The code in this repo is released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) License.
+
+
+ #### Statement
+ * As an LLM, Idefics3-8B-Llama3 generates content by learning from a large amount of text, but it cannot comprehend, express personal opinions, or make value judgments. Anything generated by Idefics3-8B-Llama3 does not represent the views and positions of the model developers.
+ * We will not be liable for any problems arising from the use of the Idefics3-8B-Llama3 open-source model, including but not limited to data security issues, risks of public opinion, or any risks and problems arising from the misdirection, misuse, or dissemination of the model.
+
+
+ ## Training dataset
+ - 100K video instruction data from Video-ChatGPT
+ - 100K video caption data from ShareGemini