Llama 3.1 SPPO Finetunes Collection: versions of Llama 3.1 fine-tuned using Self-Play Preference Optimization (SPPO): https://uclaml.github.io/SPPO/ (3 items, updated Jul 24, 2024)
Llama 3.1 Instruct SPPO Collection: Llama 3.1 models fine-tuned with Self-Play Preference Optimization (SPPO): https://uclaml.github.io/SPPO/ (3 items, updated Jul 24, 2024)