BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published Nov 20, 2024 • 18
TinyCodeLM Collection Tiny generative language models for code -- pretrained for Python understanding, instruction tuned to synthesize Python with diff sequences • 4 items • Updated Oct 13, 2024