Model Details

Model Description

A Llama-3.1-8B model fine-tuned with the ORPO (Odds Ratio Preference Optimization) trainer.
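For readers unfamiliar with ORPO, the objective can be sketched for a single preference pair as follows. This is a minimal illustration in plain Python, not the training code used for this model: the function names are hypothetical, the inputs are assumed to be length-averaged log-probabilities of the chosen and rejected responses, and the weight `lam=0.1` is an illustrative value rather than the setting used here.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp); avg_logp must be < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """ORPO loss for one pair: SFT term plus weighted odds-ratio term.

    chosen_logp / rejected_logp are assumed to be length-averaged
    log-probabilities. lam is an illustrative weight, not this model's setting.
    """
    # Supervised fine-tuning term: negative log-likelihood of the chosen response.
    nll = -chosen_logp
    # Log of the odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * odds_ratio_term
```

The loss is lower when the chosen response is more likely than the rejected one, so a single term both fits the preferred answers and pushes the rejected ones down, without a separate reference model.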

Training Details

Training Data

This model was fine-tuned on the mlabonne/orpo-dpo-mix-40k dataset.

[More Information Needed]

Training Procedure

The model was trained with the ORPO trainer on only the first 5K rows of the dataset (5K out of 40K).
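Selecting the first 5K rows could look roughly like the sketch below. It assumes the Hugging Face `datasets` library and that the dataset exposes a `train` split. The helper names `first_rows_split` and `load_training_subset` are hypothetical, and the author's actual preprocessing is not documented here.

```python
def first_rows_split(split: str = "train", n_rows: int = 5000) -> str:
    """Build a split-slicing expression such as 'train[:5000]'."""
    return f"{split}[:{n_rows}]"


def load_training_subset(n_rows: int = 5000):
    """Load the first n_rows rows of mlabonne/orpo-dpo-mix-40k.

    Assumes the `datasets` package is installed and that the dataset
    has a 'train' split; both are assumptions, not stated in this card.
    """
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "mlabonne/orpo-dpo-mix-40k",
        split=first_rows_split("train", n_rows),
    )
```

Calling `load_training_subset()` downloads the dataset on first use. Slicing in the split expression avoids materializing the remaining rows as a separate step.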

Safetensors
Model size: 4.65B params
Tensor type: BF16, F32, U8