Multimodal Latent Language Modeling with Next-Token Diffusion Paper • 2412.08635 • Published 25 days ago • 41
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Paper • 2412.07589 • Published 27 days ago • 46 • 4
Research Paper Collection Research Papers from Researcher/Member of MeissonFlow. • 1 item • Updated 26 days ago
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Paper • 2412.07589 • Published 27 days ago • 46
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing Paper • 2412.04280 • Published Dec 5, 2024 • 13
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation Paper • 2410.13848 • Published Oct 17, 2024 • 32
Generalizable Entity Grounding via Assistance of Large Language Model Paper • 2402.02555 • Published Feb 4, 2024
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries Paper • 2404.00086 • Published Mar 29, 2024
SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow Paper • 2405.20282 • Published May 30, 2024