Collections
Collections including paper arxiv:2305.14387
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
  Paper • 2403.18421 • Published • 22
- Long-form factuality in large language models
  Paper • 2403.18802 • Published • 24
- stanford-crfm/BioMedLM
  Text Generation • Updated • 1.23k • 407
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 50

- Proximal Policy Optimization Algorithms
  Paper • 1707.06347 • Published • 4
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 50
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 145
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 16