Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published 18 days ago • 70
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 25 days ago • 21
Evaluating and Aligning CodeLLMs on Human Preference Paper • 2412.05210 • Published about 1 month ago • 47
Article • ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 183
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 56
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 41
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 112
Post 2545 Let’s dive into the exciting releases from the Chinese community last week 🔥🚀
More details 👉 https://huggingface.co/zh-ai-community
Code model:
✨ Qwen 2.5 coder by Alibaba Qwen: Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f
✨ OpenCoder by InflyAI, a fully open code model 🙌: infly/opencoder-672cec44bbb86c39910fb55e
Image model:
✨ Hunyuan3D-1.0 by Tencent: tencent/Hunyuan3D-1
MLLM:
✨ JanusFlow by DeepSeek: deepseek-ai/JanusFlow-1.3B
✨ Mono-InternVL-2B by OpenGVLab: OpenGVLab/Mono-InternVL-2B
Video model:
✨ CogVideoX 1.5 by ChatGLM: THUDM/CogVideoX1.5-5B-SAT
Audio model:
✨ Fish Agent by FishAudio: fishaudio/fish-agent-v0.1-3b
Dataset:
✨ OPI dataset by BAAIBeijing: BAAI/OPI
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 113
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models Paper • 2411.05830 • Published Nov 5, 2024 • 20