CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs Paper • 2410.01999 • Published Oct 2, 2024 • 10
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale Paper • 2409.16299 • Published Sep 9, 2024 • 11
Learning to Predict Program Execution by Modeling Dynamic Dependency on Code Graphs Paper • 2408.02816 • Published Aug 5, 2024 • 5
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 87
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities Paper • 2406.14562 • Published Jun 20, 2024 • 28
REPOEXEC: Evaluate Code Generation with a Repository-Level Executable Benchmark Paper • 2406.11927 • Published Jun 17, 2024 • 11
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17, 2024 • 58
AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology Paper • 2406.11912 • Published Jun 16, 2024 • 26