- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 47
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 19
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 605
Collections including paper arxiv:2402.17753
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
- miracl/miracl-corpus
  Viewer • Updated • 77.2M • 5.47k • 44
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
  Paper • 2306.05685 • Published • 32
- How is ChatGPT's behavior changing over time?
  Paper • 2307.09009 • Published • 23

- MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
  Paper • 2310.11954 • Published • 25
- Training Chain-of-Thought via Latent-Variable Inference
  Paper • 2312.02179 • Published • 8
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
  Paper • 2401.16158 • Published • 19
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 36

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 62
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 8
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7

- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 14
- AgentTuning: Enabling Generalized Agent Abilities for LLMs
  Paper • 2310.12823 • Published • 35
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
  Paper • 2308.10848 • Published • 1
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 9