openbmb
/

MiniCPM-V

@@ -4,8 +4,10 @@ pipeline_tag: visual-question-answering
 ## MiniCPM-V
 ### News
--  [5/20]🔥 GPT-4V level multimodal model [**MiniCPM-Llama3-V 2.5**](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5) is out.
--  [4/11]🔥 [**MiniCPM-V 2.0**](https://huggingface.co/openbmb/MiniCPM-V-2) is out.
 **MiniCPM-V** (i.e., OmniLMM-3B) is an efficient version with promising performance for deployment. The model is built based on SigLip-400M and [MiniCPM-2.4B](https://github.com/OpenBMB/MiniCPM/), connected by a perceiver resampler. Notable features of OmniLMM-3B include:

 ## MiniCPM-V
 ### News
+- [2025.01.14] 🔥🔥 We open source [**MiniCPM-o 2.6**](https://huggingface.co/openbmb/MiniCPM-o-2_6), with significant performance improvement over **MiniCPM-V 2.6**, and support real-time speech-to-speech conversation and multimodal live streaming. Try it now.
+- [2024.08.06] 🔥 We open-source [**MiniCPM-V 2.6**](https://huggingface.co/openbmb/MiniCPM-V-2_6), which outperforms GPT-4V on single image, multi-image and video understanding. It advances popular features of MiniCPM-Llama3-V 2.5, and can support real-time video understanding on iPad.
+- [2024.05.20] 🔥 GPT-4V level multimodal model [**MiniCPM-Llama3-V 2.5**](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5) is out.
+- [2024.04.11] 🔥 [**MiniCPM-V 2.0**](https://huggingface.co/openbmb/MiniCPM-V-2) is out.
 **MiniCPM-V** (i.e., OmniLMM-3B) is an efficient version with promising performance for deployment. The model is built based on SigLip-400M and [MiniCPM-2.4B](https://github.com/OpenBMB/MiniCPM/), connected by a perceiver resampler. Notable features of OmniLMM-3B include: