peft-llama-bmr-test
This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.8785
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
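
For readers who want to reproduce the setup, the list above maps roughly onto the following Hugging Face TrainingArguments. This is a hedged sketch, not the author's actual script: output_dir, the evaluation/logging cadence, and the choice of fp16 (rather than bf16) for "Native AMP" are assumptions, and the PEFT/LoRA adapter configuration is not described in this card.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training arguments implied by the
# hyperparameter list above (Transformers 4.46.x API).
training_args = TrainingArguments(
    output_dir="peft-llama-bmr-test",   # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,      # effective train batch size of 8
    num_train_epochs=1,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                          # "Native AMP"; bf16 is equally plausible, assumption
    eval_strategy="steps",              # assumed: the results table reports validation loss every 10 steps
    eval_steps=10,
    logging_steps=10,
)
```

With a per-device batch size of 2 and 4 gradient accumulation steps on a single device, the effective batch size is 8, which matches the total_train_batch_size listed above.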
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.5744 | 0.0183 | 10 | 2.4501 |
2.3995 | 0.0367 | 20 | 2.3571 |
2.3147 | 0.0550 | 30 | 2.2727 |
2.2684 | 0.0734 | 40 | 2.2084 |
2.1813 | 0.0917 | 50 | 2.1577 |
2.1365 | 0.1100 | 60 | 2.1166 |
2.1404 | 0.1284 | 70 | 2.0843 |
2.0564 | 0.1467 | 80 | 2.0595 |
2.0717 | 0.1651 | 90 | 2.0398 |
1.9985 | 0.1834 | 100 | 2.0215 |
1.9806 | 0.2017 | 110 | 2.0039 |
1.9974 | 0.2201 | 120 | 1.9896 |
1.9678 | 0.2384 | 130 | 1.9800 |
2.0037 | 0.2568 | 140 | 1.9709 |
1.9703 | 0.2751 | 150 | 1.9647 |
2.0012 | 0.2934 | 160 | 1.9599 |
1.9475 | 0.3118 | 170 | 1.9525 |
2.0115 | 0.3301 | 180 | 1.9474 |
1.9348 | 0.3485 | 190 | 1.9428 |
2.0 | 0.3668 | 200 | 1.9378 |
1.9661 | 0.3851 | 210 | 1.9332 |
1.9389 | 0.4035 | 220 | 1.9292 |
1.9141 | 0.4218 | 230 | 1.9264 |
1.9356 | 0.4402 | 240 | 1.9222 |
1.9395 | 0.4585 | 250 | 1.9193 |
1.9322 | 0.4768 | 260 | 1.9157 |
1.9227 | 0.4952 | 270 | 1.9133 |
1.9244 | 0.5135 | 280 | 1.9102 |
1.8914 | 0.5319 | 290 | 1.9076 |
1.8998 | 0.5502 | 300 | 1.9051 |
1.8878 | 0.5685 | 310 | 1.9035 |
1.9012 | 0.5869 | 320 | 1.9012 |
1.9044 | 0.6052 | 330 | 1.8993 |
1.9121 | 0.6236 | 340 | 1.8971 |
1.9032 | 0.6419 | 350 | 1.8949 |
1.9058 | 0.6602 | 360 | 1.8933 |
1.9262 | 0.6786 | 370 | 1.8919 |
1.8939 | 0.6969 | 380 | 1.8905 |
1.8734 | 0.7153 | 390 | 1.8891 |
1.9305 | 0.7336 | 400 | 1.8878 |
1.8918 | 0.7519 | 410 | 1.8869 |
1.8988 | 0.7703 | 420 | 1.8854 |
1.8895 | 0.7886 | 430 | 1.8843 |
1.8887 | 0.8070 | 440 | 1.8836 |
1.8958 | 0.8253 | 450 | 1.8826 |
1.8966 | 0.8436 | 460 | 1.8819 |
1.901 | 0.8620 | 470 | 1.8811 |
1.9045 | 0.8803 | 480 | 1.8806 |
1.8838 | 0.8987 | 490 | 1.8800 |
1.8869 | 0.9170 | 500 | 1.8795 |
1.8603 | 0.9354 | 510 | 1.8792 |
1.8871 | 0.9537 | 520 | 1.8789 |
1.8874 | 0.9720 | 530 | 1.8787 |
1.8852 | 0.9904 | 540 | 1.8785 |
Framework versions
- PEFT 0.13.2
- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
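
With the framework versions listed above, the adapter can be loaded on top of the base model roughly as follows. This is a minimal sketch, assuming the adapter is hosted at abdullahzubairwan/peft-llama-bmr-test (the repository named in the model tree below) and that the gated base weights are accessible; the prompt and generation settings are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3.1-8B"              # base model named in the description
adapter_id = "abdullahzubairwan/peft-llama-bmr-test"  # assumed adapter repo id (see model tree)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)

# Attach the PEFT adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Hello"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```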
Model tree for abdullahzubairwan/peft-llama-bmr-test
Base model
meta-llama/Llama-3.1-8B