Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle Paper • 2407.19548 • Published Jul 28, 2024 • 25
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model Paper • 2409.01199 • Published Sep 2, 2024 • 14
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model Paper • 2411.17459 • Published Nov 26, 2024 • 10
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 33
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published Jun 6, 2024 • 73
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators Paper • 2404.05014 • Published Apr 7, 2024 • 33
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 49
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models Paper • 2311.16103 • Published Nov 27, 2023 • 1
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment Paper • 2310.01852 • Published Oct 3, 2023 • 2
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection Paper • 2311.10122 • Published Nov 16, 2023 • 26