IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published 9 days ago • 13
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published 9 days ago • 13
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published 29 days ago • 50
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 200
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 567
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want Paper • 2403.20271 • Published Mar 29, 2024 • 3
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Paper • 2409.15278 • Published Sep 23, 2024 • 24
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want Paper • 2403.20271 • Published Mar 29, 2024 • 3
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Paper • 2409.15278 • Published Sep 23, 2024 • 24