Collections
Discover the best community collections!
Collections including paper arxiv:2408.12637
-
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
Paper • 2408.11812 • Published • 5 -
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands
Paper • 2408.11048 • Published • 4 -
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
Paper • 2408.08441 • Published • 8 -
Building and better understanding vision-language models: insights and future directions
Paper • 2408.12637 • Published • 124
-
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Paper • 2408.10188 • Published • 51 -
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
Paper • 2408.08872 • Published • 98 -
Building and better understanding vision-language models: insights and future directions
Paper • 2408.12637 • Published • 124 -
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Paper • 2408.12528 • Published • 51