# cls_finred_llama3_v3
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the generator dataset. It achieves the following results on the evaluation set:
- Loss: 0.4113
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
- mixed_precision_training: Native AMP
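
For reference, here is a minimal sketch of a `transformers.TrainingArguments` configuration mirroring the values above. The original training script is not published, so the output directory, optimizer name, and precision flag are assumptions:

```python
# Hypothetical reconstruction of the training configuration listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cls_finred_llama3_v3",  # assumed output path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 2 * 4 = 8
    lr_scheduler_type="constant",
    warmup_ratio=0.03,              # note: a "constant" schedule applies no warmup in practice
    num_train_epochs=2,
    fp16=True,                      # assumed; the card only says "Native AMP"
    optim="adamw_torch",            # Adam with betas=(0.9, 0.999), eps=1e-8 are the defaults
)
```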
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7177        | 0.1116 | 20   | 0.6751          |
| 0.6323        | 0.2232 | 40   | 0.6166          |
| 0.6119        | 0.3347 | 60   | 0.5802          |
| 0.5471        | 0.4463 | 80   | 0.5532          |
| 0.5299        | 0.5579 | 100  | 0.5321          |
| 0.5265        | 0.6695 | 120  | 0.5062          |
| 0.5306        | 0.7810 | 140  | 0.4888          |
| 0.5094        | 0.8926 | 160  | 0.4764          |
| 0.4769        | 1.0042 | 180  | 0.4640          |
| 0.342         | 1.1158 | 200  | 0.4644          |
| 0.3271        | 1.2273 | 220  | 0.4534          |
| 0.342         | 1.3389 | 240  | 0.4448          |
| 0.3659        | 1.4505 | 260  | 0.4395          |
| 0.3159        | 1.5621 | 280  | 0.4284          |
| 0.3356        | 1.6736 | 300  | 0.4248          |
| 0.3476        | 1.7852 | 320  | 0.4165          |
| 0.3168        | 1.8968 | 340  | 0.4113          |
### Framework versions
- PEFT 0.11.1
- Transformers 4.41.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
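
Since PEFT is listed above, the model is presumably a LoRA-style adapter on top of the base model. Below is a minimal inference sketch, assuming the adapter is hosted as `Sorour/cls_finred_llama3_v3` and you have access to the gated base model; the prompt is illustrative only, as the card does not document the task format:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "Sorour/cls_finred_llama3_v3"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative prompt; the actual training prompt format is undocumented.
messages = [{"role": "user", "content": "Classify the relation: Apple acquired Beats in 2014."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```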