A SimPO-like DPO method, trained on SimPO data. AlpacaEval: 44.8 (+2).
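The card does not spell out the training objective, so the following is only a reference sketch of the published SimPO loss for a single preference pair, which this model's "SimPO-like" method may or may not match exactly. The function name `simpo_loss`, its arguments, and the default `beta` and `gamma` values are illustrative assumptions, not settings taken from this model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def simpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    len_chosen: int,
    len_rejected: int,
    beta: float = 2.0,   # illustrative value, not from the model card
    gamma: float = 1.0,  # illustrative target margin, not from the model card
) -> float:
    """Reference SimPO loss for one (chosen, rejected) response pair.

    logp_* are summed token log-probabilities under the policy being trained;
    len_* are the response lengths in tokens.
    """
    # Length-normalized implicit rewards. Unlike standard DPO, no reference
    # model log-probabilities appear here.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected

    # The chosen response should beat the rejected one by at least gamma.
    margin = reward_chosen - reward_rejected - gamma

    # -log(sigmoid(margin)) equals softplus(-margin).
    return softplus(-margin)
```

The loss shrinks as the average per-token log-probability of the chosen response rises relative to the rejected one, and the `gamma` term keeps pushing until the gap exceeds the target margin.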
Model tree for zhou-xl/xpo-lla-3-8b-instruct
Base model
meta-llama/Meta-Llama-3-8B-Instruct