Collections
Collections including paper arxiv:2305.18290
Collection 1
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
  Paper • 2401.01967 • Published
- Secrets of RLHF in Large Language Models Part I: PPO
  Paper • 2307.04964 • Published • 28
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 123
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
  Paper • 2404.05961 • Published • 65

Collection 2
- A General Theoretical Paradigm to Understand Learning from Human Preferences
  Paper • 2310.12036 • Published • 13
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 64
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52

Collection 3
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62
- A Survey on Data Selection for Language Models
  Paper • 2402.16827 • Published • 4
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
  Paper • 2402.00159 • Published • 62
- The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
  Paper • 2306.01116 • Published • 33

Collection 4
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52
- HyperCLOVA X Technical Report
  Paper • 2404.01954 • Published • 21
- Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
  Paper • 2404.09956 • Published • 12
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 83

Collection 5
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
  Paper • 2403.18421 • Published • 23
- Long-form factuality in large language models
  Paper • 2403.18802 • Published • 25
- stanford-crfm/BioMedLM
  Text Generation • Updated • 1.71k • 409
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52

Collection 6
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52
- ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
  Paper • 2402.09320 • Published • 6
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 41
- Dueling RL: Reinforcement Learning with Trajectory Preferences
  Paper • 2111.04850 • Published • 2

Collection 7
- PERL: Parameter Efficient Reinforcement Learning from Human Feedback
  Paper • 2403.10704 • Published • 58
- WARM: On the Benefits of Weight Averaged Reward Models
  Paper • 2401.12187 • Published • 18
- RewardBench: Evaluating Reward Models for Language Modeling
  Paper • 2403.13787 • Published • 21
- DreamReward: Text-to-3D Generation with Human Preference
  Paper • 2403.14613 • Published • 36