poeroz's Collections
Paper list
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters
Paper
•
2403.02677
•
Published
•
16
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
Paper
•
2403.03003
•
Published
•
9
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
Paper
•
2403.01487
•
Published
•
14
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper
•
2403.00522
•
Published
•
44
FuseChat: Knowledge Fusion of Chat Models
Paper
•
2402.16107
•
Published
•
36
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper
•
2403.07508
•
Published
•
74
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper
•
2403.09611
•
Published
•
125
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Paper
•
2403.06764
•
Published
•
26
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Paper
•
2403.05525
•
Published
•
40
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Paper
•
2403.03853
•
Published
•
61
Enhancing Vision-Language Pre-training with Rich Supervisions
Paper
•
2403.03346
•
Published
•
14
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
Paper
•
2403.13248
•
Published
•
78
When Do We Not Need Larger Vision Models?
Paper
•
2403.13043
•
Published
•
25
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
Paper
•
2403.11703
•
Published
•
16
Aria: An Open Multimodal Native Mixture-of-Experts Model
Paper
•
2410.05993
•
Published
•
107
Personalized Visual Instruction Tuning
Paper
•
2410.07113
•
Published
•
69
Paper
•
2410.07073
•
Published
•
62
MM-Ego: Towards Building Egocentric Multimodal LLMs
Paper
•
2410.07177
•
Published
•
21
UniMuMo: Unified Text, Music and Motion Generation
Paper
•
2410.04534
•
Published
•
19
Video Instruction Tuning With Synthetic Data
Paper
•
2410.02713
•
Published
•
38
Distilling an End-to-End Voice Assistant Without Instruction Training Data
Paper
•
2410.02678
•
Published
•
22