yicui's Collections: Training
• DataComp-LM: In search of the next generation of training sets for language models (arXiv 2406.11794)
• Training Language Models on Synthetic Edit Sequences Improves Code Synthesis (arXiv 2410.02749)
• Fewer Truncations Improve Language Modeling (arXiv 2404.10830)
• How to Train Long-Context Language Models (Effectively) (arXiv 2410.02660)
• Towards a Unified View of Preference Learning for Large Language Models: A Survey (arXiv 2409.02795)
• ORPO: Monolithic Preference Optimization without Reference Model (arXiv 2403.07691)
• Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process (arXiv 2405.11870)
• LoRA Dropout as a Sparsity Regularizer for Overfitting Control (arXiv 2404.09610)
• Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning (arXiv 2402.13669)
• What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective (arXiv 2410.23743)
• Rho-1: Not All Tokens Are What You Need (arXiv 2404.07965)
• Loss-to-Loss Prediction: Scaling Laws for All Datasets (arXiv 2411.12925)
• RedPajama: an Open Dataset for Training Large Language Models (arXiv 2411.12372)