Model,Language Model,Open Source,Text Recognition,Text Referring,Text Spotting,Relation Extraction,Element Parsing,Mathematical Calculation,Visual Text Understanding,Knowledge Reasoning,Average Score,Link
Qwen-VL,,,7.2,5.3,10.7,11.5,11.2,9.2
Qwen-VL-chat,,,9.5,8.2,9.3,11.0,21.1,11.8
Qwen2-VL-8B,,,51.3,51.4,21.6,52.5,37.5,42.9
InternVL2-8B,,,20.6,45.2,23.2,54.4,38.1,36.3
InternVL2-26B,,,21.9,46.0,34.8,50.9,34.8,37.7
InternVL2.5-8B,,,52.8,52.8,28.6,56.4,40.5,46.2
InternVL2.5-26B,,,32.4,56.1,32.6,56.3,43.6,44.2
TextMonkey,,,23.5,14.8,8.4,19.9,12.2,15.8
LLaVA-Next-8B,,,5.7,2.9,12.2,7.5,17.2,9.1
Monkey,,,4.6,11.2,8.4,21.5,20.0,13.1
XComposer2-4KHD,,,16.7,18.8,12.1,27.5,2.3,15.5
Molmo-7B,,,7.1,15.0,9.2,9.0,23.7,12.8
EMU2-chat,,,2.3,0.5,8.5,1.0,7.3,3.9
mPLUG-Owl3,,,6.6,17.9,9.7,6.0,26.1,13.3
CogVLM-chat,,,5.5,10.0,9.8,1.5,2.5,5.9
Deepseek-VL-7B,,,8.0,13.3,15.7,5.5,18.5,12.2
GLM-4V-9B,,,24.4,60.6,20.4,52.8,25.2,36.6
MiniCPM-V-2.6,,,51.0,29.9,21.2,34.0,33.6,33.9
TextHarmony,,,1.8,4.5,8.2,1.5,11.9,5.6
ViLA1.5-8B,,,5.4,8.8,8.5,3.0,15.5,8.2
LLaVAR,,,2.3,1.7,8.9,0,2.5,3.1
DocOwl2,,,4.2,10.3,8.6,4.0,9.6,7.3
UReader,,,6.8,2.7,8.4,2.5,7.2,5.5
Yi-VL-6B,,,4.8,4.4,8.5,4.0,25.0,9.4
Janus-1.3B,,,7.6,8.7,11.4,4.5,10.7,8.6
Cambrian-1-8B,,,5.3,14.9,12.6,8.5,8.1,9.9
LLaVA-OV-7B,,,14.8,15.7,13.7,16.0,28.7,17.8
Eagle-X5-7B,,,7.5,12.0,11.6,5.0,19.2,11.1
Idefics3-8B,,,7.0,15.5,15.9,9.0,18.1,13.1
Ovis1.6-3B,,,11.5,23.7,22.8,28.8,18.9,21.1
Pixtral-12B,,,13.4,10.9,21.0,7.0,20.7,14.6
GLM-4V-Plus,,,34.5,60.6,23.9,49.8,28.2,39.4
GPT-4V,,,49.9,52.2,34.6,40.8,22.9,40.1
GPT-4o,,,21.6,53.0,29.8,38.5,18.2,32.2
GPT-4o-mini,,,13.1,38.9,27.2,28.8,16.9,25.0
Gemini-Pro,,,52.5,47.3,30.9,51.5,33.4,43.1
Claude3.5-sonnet,,,21.0,56.2,35.2,55.0,30.5,39.6
Step-1V,,,56.7,41.1,37.6,38.3,39.2,42.6