Learnings from Scaling Visual Tokenizers for Reconstruction and Generation Paper • 2501.09755 • Published 5 days ago • 30
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with a vision-centric design. The models support 4K HD input, long-context video, and grounding. • 8 items • Updated about 8 hours ago • 10
Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches Paper • 2408.04567 • Published Aug 8, 2024 • 25
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published Jun 24, 2024 • 60
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space Paper • 2311.13570 • Published Nov 22, 2023 • 3
Exploiting Diffusion Prior for Real-World Image Super-Resolution Paper • 2305.07015 • Published May 11, 2023 • 4