Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck Paper • 2404.07647 • Published Apr 11, 2024 • 4
SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning Paper • 2401.07950 • Published Jan 15, 2024 • 4
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models Paper • 2312.06585 • Published Dec 11, 2023 • 28
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning Paper • 2412.16849 • Published 17 days ago • 7