Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 24 days ago • 136
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization Paper • 2501.01245 • Published 4 days ago • 5
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published 6 days ago • 37