Challenges in Trustworthy Human Evaluation of Chatbots Paper • 2412.04363 • Published Dec 5, 2024 • 2 • 2
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18, 2024 • 44
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18, 2024 • 44
UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations Paper • 2311.08469 • Published Nov 14, 2023 • 10