MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models • Paper • 2501.00316 • Published 4 days ago • 18
MapQaTor: A System for Efficient Annotation of Map Query Datasets • Paper • 2412.21015 • Published 5 days ago • 7
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM • Paper • 2501.00599 • Published 3 days ago • 29
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings • Paper • 2501.01257 • Published 1 day ago • 31
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control • Paper • 2501.01427 • Published 1 day ago • 32
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining • Paper • 2501.00958 • Published 2 days ago • 47
MLLM-as-a-Judge for Image Safety without Human Labeling • Paper • 2501.00192 • Published 4 days ago • 14
ProgCo: Program Helps Self-Correction of Large Language Models • Paper • 2501.01264 • Published 1 day ago • 19
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis • Paper • 2412.19723 • Published 8 days ago • 63