Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 4 days ago • 24
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 17 days ago • 116
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 11 days ago • 33
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 12 days ago • 40
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 12 days ago • 42
Outcome-Refining Process Supervision for Code Generation Paper • 2412.15118 • Published 15 days ago • 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published 12 days ago • 21
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published 16 days ago • 47
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published 19 days ago • 16
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations Paper • 2412.13171 • Published 17 days ago • 31
Smaller Language Models Are Better Instruction Evolvers Paper • 2412.11231 • Published 20 days ago • 26
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 25 days ago • 64
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published 26 days ago • 72
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published about 1 month ago • 45
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 55
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS Paper • 2411.18478 • Published Nov 27, 2024 • 32
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published Nov 21, 2024 • 29
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published Nov 20, 2024 • 18