- Rethinking Optimization and Architecture for Tiny Language Models
  Paper • 2402.02791 • Published • 13
- More Agents Is All You Need
  Paper • 2402.05120 • Published • 53
- Scaling Laws for Forgetting When Fine-Tuning Large Language Models
  Paper • 2401.05605 • Published
- Aligning Large Language Models with Counterfactual DPO
  Paper • 2401.09566 • Published • 2
Collections
Collections including paper arxiv:2405.20541
- AlpaGasus: Training A Better Alpaca with Fewer Data
  Paper • 2307.08701 • Published • 22
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 6
- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 22
- SlimPajama-DC: Understanding Data Combinations for LLM Training
  Paper • 2309.10818 • Published • 10