OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis • Paper • 2412.19723 • Published 8 days ago • 63
PERSE: Personalized 3D Generative Avatars from A Single Portrait • Paper • 2412.21206 • Published 4 days ago • 14
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization • Paper • 2412.18525 • Published 11 days ago • 59
Bringing Objects to Life: 4D generation from 3D objects • Paper • 2412.20422 • Published 6 days ago • 32
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage • Paper • 2412.15484 • Published 15 days ago • 14
Revisiting In-Context Learning with Long Context Language Models • Paper • 2412.16926 • Published 13 days ago • 27
Large Motion Video Autoencoding with Cross-modal Video VAE • Paper • 2412.17805 • Published 11 days ago • 23
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning • Paper • 2412.15797 • Published 15 days ago • 15
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization • Paper • 2412.17739 • Published 12 days ago • 37
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models • Paper • 2412.19645 • Published 8 days ago • 13
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition • Paper • 2412.19712 • Published 8 days ago • 14
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models • Paper • 2412.18605 • Published 10 days ago • 17
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs • Paper • 2412.18925 • Published 10 days ago • 82
DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation • Paper • 2412.15200 • Published 15 days ago • 9
UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency • Paper • 2412.15216 • Published 15 days ago • 5
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis • Paper • 2412.15214 • Published 15 days ago • 15
No More Adam: Learning Rate Scaling at Initialization is All You Need • Paper • 2412.11768 • Published 19 days ago • 41
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation • Paper • 2412.14015 • Published 17 days ago • 12