Model Details
Model Description
A Llama-3.1-8B model fine-tuned with the ORPO trainer.
Training Details
Training Data
The mlabonne/orpo-dpo-mix-40k dataset was used to fine-tune this model.
[More Information Needed]
Training Procedure
The model was trained with the ORPO trainer. Only the first 5K rows of the dataset (5K out of 40K) were used for fine-tuning.
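As a rough illustration of the row selection described above, the following sketch shows one way the first 5K rows could be picked with the Hugging Face `datasets` library. It is not the author's training script: the split name `train`, the helper names `first_rows_split` and `load_training_subset`, and the use of split slicing are assumptions made for the example.

```python
DATASET_ID = "mlabonne/orpo-dpo-mix-40k"  # dataset named in this card
NUM_ROWS = 5_000  # "first 5K rows" stated in this card


def first_rows_split(num_rows: int, split: str = "train") -> str:
    """Build a split-slicing string such as 'train[:5000]'.

    The split name "train" is an assumption, not something this card states.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be positive")
    return f"{split}[:{num_rows}]"


def load_training_subset(num_rows: int = NUM_ROWS):
    """Load only the first `num_rows` rows (hypothetical helper).

    Requires the `datasets` package and network access.
    """
    # Imported lazily so first_rows_split() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=first_rows_split(num_rows))
```

Calling `load_training_subset()` would return a dataset holding the first 5,000 rows, which could then be passed to an ORPO trainer. The trainer configuration is not given in this card, so it is left out of the sketch.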