llama2-7b-qlora-finetuned_1

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on an unspecified dataset (the card records the dataset as None). Evaluation-set results for each logged step are listed in the training results table at the end of this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 16
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 3
mixed_precision_training: Native AMP
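
How these settings interact can be sketched in plain Python. This is a minimal illustration, not the training script. It assumes the linear scheduler ramps the learning rate from 0 to its peak over the warmup steps and then decays it linearly to 0 at the final optimizer step. Two values are assumptions that the card does not state: `NUM_DEVICES = 1` (chosen only because 4 × 16 already equals the reported total batch size of 64) and `TOTAL_STEPS = 1797` (estimated from roughly 599 steps per epoch in the results table, times 3 epochs).

```python
PEAK_LR = 2e-4          # learning_rate
WARMUP_STEPS = 100      # lr_scheduler_warmup_steps
PER_DEVICE_BATCH = 4    # train_batch_size
GRAD_ACCUM_STEPS = 16   # gradient_accumulation_steps
NUM_DEVICES = 1         # assumption: the card does not list a device count
TOTAL_STEPS = 1797      # assumption: ~599 steps per epoch * 3 epochs


def effective_batch_size(per_device, grad_accum, num_devices):
    """Number of examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to peak_lr, then linear decay to 0 at step `total`."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for s in (0, 50, 100, 900, TOTAL_STEPS):
        print(f"step {s:>4}: lr = {linear_lr(s):.6f}")
```

Under these assumptions the learning rate reaches its peak of 0.0002 at step 100, which is also the first logged evaluation step.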

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time
8.4619	0.1669	100	0.5918	0.0048
0.5531	0.3339	200	0.5314	0.0048
0.5311	0.5008	300	0.5164	0.0048
0.5179	0.6677	400	0.5114	0.0048
0.5168	0.8346	500	0.5072	0.0048
0.5124	1.0016	600	0.5034	0.0048
0.5053	1.1685	700	0.5003	0.0048
0.5047	1.3354	800	0.5001	0.0048
0.5008	1.5023	900	0.4967	0.0048
0.4985	1.6693	1000	0.4969	0.0048
0.4998	1.8362	1100	0.4941	0.0048
0.4987	2.0031	1200	0.4978	0.0048
0.4939	2.1701	1300	0.4933	0.0048
0.4907	2.3370	1400	0.4923	0.0048
0.4947	2.5039	1500	0.4910	0.0048
0.4896	2.6708	1600	0.4901	0.0048
0.4923	2.8378	1700	0.4896	0.0048
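
The table above can be read programmatically to pick a checkpoint and to check how steadily the validation loss fell. The sketch below hard-codes the rows from the table. The helper names (`best_checkpoint`, `steps_per_epoch`, `regressions`) are hypothetical and not part of any library.

```python
# (step, epoch, training_loss, validation_loss), copied from the table above
ROWS = [
    (100, 0.1669, 8.4619, 0.5918),
    (200, 0.3339, 0.5531, 0.5314),
    (300, 0.5008, 0.5311, 0.5164),
    (400, 0.6677, 0.5179, 0.5114),
    (500, 0.8346, 0.5168, 0.5072),
    (600, 1.0016, 0.5124, 0.5034),
    (700, 1.1685, 0.5053, 0.5003),
    (800, 1.3354, 0.5047, 0.5001),
    (900, 1.5023, 0.5008, 0.4967),
    (1000, 1.6693, 0.4985, 0.4969),
    (1100, 1.8362, 0.4998, 0.4941),
    (1200, 2.0031, 0.4987, 0.4978),
    (1300, 2.1701, 0.4939, 0.4933),
    (1400, 2.3370, 0.4907, 0.4923),
    (1500, 2.5039, 0.4947, 0.4910),
    (1600, 2.6708, 0.4896, 0.4901),
    (1700, 2.8378, 0.4923, 0.4896),
]


def best_checkpoint(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    step, epoch, _, _ = rows[-1]
    return round(step / epoch)


def regressions(rows):
    """Steps at which validation loss rose relative to the previous evaluation."""
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[3] > prev[3]]


if __name__ == "__main__":
    print("best checkpoint:", best_checkpoint(ROWS))
    print("steps per epoch:", steps_per_epoch(ROWS))
    print("validation loss rose at steps:", regressions(ROWS))
```

On these rows the lowest validation loss is 0.4896 at step 1700, and the loss rose slightly at steps 1000 and 1200 before continuing to fall.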