-
Learning Language Games through Interaction
Paper • 1606.02447 • Published -
Naturalizing a Programming Language via Interactive Learning
Paper • 1704.06956 • Published -
Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
Paper • 1802.08802 • Published -
Mapping Natural Language Commands to Web Elements
Paper • 1808.09132 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2412.14470
-
Agent-SafetyBench: Evaluating the Safety of LLM Agents
Paper • 2412.14470 • Published • 11 -
RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking
Paper • 2409.17458 • Published -
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
Paper • 2406.14598 • Published
-
Video Creation by Demonstration
Paper • 2412.09551 • Published • 8 -
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper • 2412.07589 • Published • 46 -
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Paper • 2412.06531 • Published • 71 -
APOLLO: SGD-like Memory, AdamW-level Performance
Paper • 2412.05270 • Published • 38
-
More Agents Is All You Need
Paper • 2402.05120 • Published • 51 -
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Paper • 2402.07456 • Published • 42 -
Generative Agents: Interactive Simulacra of Human Behavior
Paper • 2304.03442 • Published • 12 -
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 8
-
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 187 -
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Paper • 2311.16502 • Published • 35 -
BLINK: Multimodal Large Language Models Can See but Not Perceive
Paper • 2404.12390 • Published • 24 -
RULER: What's the Real Context Size of Your Long-Context Language Models?
Paper • 2404.06654 • Published • 34
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 14 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 24 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 31 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7