Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024 • 127
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models Paper • 2406.02430 • Published Jun 4, 2024 • 31
An Image is Worth 32 Tokens for Reconstruction and Generation Paper • 2406.07550 • Published Jun 11, 2024 • 57
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild Paper • 2407.04172 • Published Jul 4, 2024 • 22
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 124
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published Sep 13, 2024 • 50