pipeline_tag: image-text-to-text

This repository contains VisVM (Vision Value Model), the model described in the paper "Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension".

Code: https://github.com/si0wang/VisVM