WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning Paper • 2411.02337 • Published Nov 4, 2024 • 35
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published Nov 7, 2024 • 49
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 63
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization Paper • 2410.08815 • Published Oct 11, 2024 • 44
Game-theoretic LLM: Agent Workflow for Negotiation Games Paper • 2411.05990 • Published Nov 8, 2024 • 7
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published Nov 16, 2024 • 44
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published Nov 28, 2024 • 13
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published 29 days ago • 48
Critical Tokens Matter: Token-Level Contrastive Estimation Enhence LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 55
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 20
RL Zero: Zero-Shot Language to Behaviors without any Supervision Paper • 2412.05718 • Published 27 days ago • 4
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation Paper • 2412.10704 • Published 21 days ago • 15
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment Paper • 2412.13746 • Published 17 days ago • 9
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture Paper • 2412.11834 • Published 19 days ago • 6
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents Paper • 2412.13194 • Published 17 days ago • 12
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing Paper • 2412.14711 • Published 16 days ago • 14
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning Paper • 2412.15797 • Published 15 days ago • 15
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published 16 days ago • 69
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design Paper • 2412.14590 • Published 16 days ago • 13
ericsonwillians/distilbert-base-uncased-steam-sentiment Text Classification • Updated 22 days ago • 16
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 11 days ago • 33