---
pipeline_tag: image-text-to-text
---

This repository contains the VisVM model described in [Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension](https://huggingface.co/papers/2412.03704).

Code: https://github.com/si0wang/VisVM