BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published Nov 16, 2024 • 44
Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking Paper • 2303.05475 • Published Mar 9, 2023
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published Oct 17, 2024 • 53
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis Paper • 2403.12963 • Published Mar 19, 2024 • 7
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling Paper • 2401.15977 • Published Jan 29, 2024 • 37