The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 14 days ago • 82
Enhancing Human-Like Responses in Large Language Models Paper • 2501.05032 • Published 14 days ago • 49
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published 13 days ago • 39
Presto! Distilling Steps and Layers for Accelerating Music Generation Paper • 2410.05167 • Published Oct 7, 2024 • 17
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 124
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher Paper • 2408.14176 • Published Aug 26, 2024 • 61
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 39
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond Paper • 2408.03900 • Published Aug 7, 2024 • 10
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis Paper • 2407.09732 • Published Jul 13, 2024 • 8