Collections

Collections including paper arxiv:2404.09656
Papers - Reward Model - Training
Collection, May 6, 2024
Papers - Reward Model - Bradley-Terry
https://web.stanford.edu/class/archive/stats/stats200/stats200.1172/Lecture24.pdf
Papers - Reward Model
Collection, Apr 19, 2024
Papers - Fine-tuning - DPO
Refer to additional papers: https://link.springer.com/article/10.1007/s10994-014-5458-8 and https://link.springer.com/article/10.1007/BF00992696
RLHF
Collection, 22 days ago
Papers - Fine-tuning
Collection, Dec 22, 2024