VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published 11 days ago • 65
Article CinePile 2.0 - making stronger datasets with adversarial refinement Oct 23, 2024 • 13
Article Breaking resolution curse of vision-language models By visheratin • Feb 24, 2024 • 13
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Paper • 2403.12968 • Published Mar 19, 2024 • 25
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models Paper • 2406.04271 • Published Jun 6, 2024 • 29
FeatUp: A Model-Agnostic Framework for Features at Any Resolution Paper • 2403.10516 • Published Mar 15, 2024 • 16
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Dec 13, 2024 • 143
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control Paper • 2307.15818 • Published Jul 28, 2023 • 29
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset Paper • 2403.09029 • Published Mar 14, 2024 • 55