RAG Playground: A Framework for Systematic Evaluation of Retrieval Strategies and Prompt Engineering in RAG Systems Paper • 2412.12322 • Published 21 days ago • 1
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 13 days ago • 34
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published 14 days ago • 21
The Open Source Advantage in Large Language Models (LLMs) Paper • 2412.12004 • Published 21 days ago • 9
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 24 days ago • 136
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials Paper • 2412.09605 • Published 25 days ago • 26
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published 26 days ago • 38
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published about 1 month ago • 123
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 105
Imagine360: Immersive 360 Video Generation from Perspective Anchor Paper • 2412.03552 • Published Dec 4, 2024 • 26
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 20
ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data Paper • 2411.15004 • Published Nov 22, 2024 • 1
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published Nov 28, 2024 • 13
SketchAgent: Language-Driven Sequential Sketch Generation Paper • 2411.17673 • Published Nov 26, 2024 • 18