CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 4 days ago • 40
Evaluating and Aligning CodeLLMs on Human Preference Paper • 2412.05210 • Published about 1 month ago • 47
LongIns: A Challenging Long-context Instruction-based Exam for LLMs Paper • 2406.17588 • Published Jun 25, 2024 • 22
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 259