Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements • Oct 25, 2024 • 1
Code Evaluation Collection • Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29, 2024 • 15
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14, 2024 • 75
GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence Paper • 2310.05388 • Published Oct 9, 2023 • 4