Article • Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions • By mikelabs • Nov 19, 2024 • 3
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content • Paper • 2410.10783 • Published Oct 14, 2024 • 26