NATURAL PLAN: Benchmarking LLMs on Natural Language Planning • Paper • 2406.04520 • Published Jun 6, 2024 • 11
Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses • Paper • 2312.00763 • Published Dec 1, 2023 • 19
Instruction-Following Evaluation for Large Language Models • Paper • 2311.07911 • Published Nov 14, 2023 • 19
InstructExcel: A Benchmark for Natural Language Instruction in Excel • Paper • 2310.14495 • Published Oct 23, 2023 • 1
How FaR Are Large Language Models From Agents with Theory-of-Mind? • Paper • 2310.03051 • Published Oct 4, 2023 • 34
Large Language Models Cannot Self-Correct Reasoning Yet • Paper • 2310.01798 • Published Oct 3, 2023 • 33