Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding Paper • 2501.07888 • Published 4 days ago • 12