BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games • Paper 2411.13543 • Published Nov 20, 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis • Paper 2410.02749 • Published Oct 3, 2024