World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models Paper • 2306.08685 • Published Jun 14, 2023 • 1
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Paper • 2402.19446 • Published Feb 29, 2024
DANLI: Deliberative Agent for Following Natural Language Instructions Paper • 2210.12485 • Published Oct 22, 2022
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning Paper • 2405.10292 • Published May 16, 2024 • 1
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 69
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning Paper • 2406.11896 • Published Jun 14, 2024 • 19
Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models Paper • 1910.03756 • Published Oct 9, 2019
Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring Paper • 2106.03427 • Published Jun 7, 2021
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning Paper • 2009.08115 • Published Sep 17, 2020
Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context Paper • 1911.10484 • Published Nov 24, 2019
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? Paper • 2311.00047 • Published Oct 31, 2023 • 8
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation Paper • 2402.16846 • Published Feb 26, 2024