DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark Paper • 2405.19707 • Published May 30, 2024 • 6
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations Paper • 2410.08049 • Published Oct 10, 2024 • 8
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding Paper • 2411.14347 • Published Nov 21, 2024 • 13
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 59
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 105