The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models Paper • 2501.09653 • Published 1 day ago
Long Code Arena: a Set of Benchmarks for Long-Context Code Models Paper • 2406.11612 • Published Jun 17, 2024
STACC: Code Comment Classification using SentenceTransformers Paper • 2302.13149 • Published Feb 25, 2023
Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries Paper • 2301.01701 • Published Jan 4, 2023
Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge Paper • 2302.07735 • Published Feb 13, 2023