What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024