MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts Paper • 2407.21770 • Published Jul 31, 2024
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16, 2024
Meta-Transformer: A Unified Framework for Multimodal Learning Paper • 2307.10802 • Published Jul 20, 2023