Fine-tuning - a kd303 Collection

Fine-tuning

updated 7 days ago

Extending Llama-3's Context Ten-Fold Overnight

Paper • 2404.19553 • Published Apr 30, 2024 • 33

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4, 2024 • 91

Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck

Paper • 2404.07647 • Published Apr 11, 2024 • 4

SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning

Paper • 2401.07950 • Published Jan 15, 2024 • 4

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Paper • 2412.16849 • Published 17 days ago • 7