Visual Question Answering
Transformers
Safetensors
English
videollama2_mixtral
text-generation
multimodal large language model
large video-language model
Inference Endpoints
File size: 134 Bytes
00ce666
 
 
version https://git-lfs.github.com/spec/v1
oid sha256:9d40c9d92899ba50f642ee3bf1f045a829511022d8c9d49aa5933dece8e7c9d4
size 933796424
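
The three lines above are a Git LFS pointer: a spec version, an object ID (hash algorithm plus digest), and the size in bytes of the real file the pointer stands in for. As a minimal sketch of how such a pointer can be read, the Python below splits each line into a key and a value. The function name `parse_lfs_pointer` and the returned dictionary keys are illustrative assumptions, not part of Git LFS or of this repository.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer text copied from the file shown above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9d40c9d92899ba50f642ee3bf1f045a829511022d8c9d49aa5933dece8e7c9d4
size 933796424
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size_bytes"])
```

The `size` field describes the tracked object (about 934 MB here), not the pointer itself, which is the 134-byte file listed at the top of the page.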