peft-llama-bmr-test
This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.8785
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
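
For readers who want to reproduce the setup, the list above maps roughly onto the following Hugging Face TrainingArguments. This is a hedged sketch, not the author's actual script: output_dir, the evaluation/logging cadence, and the choice of fp16 (rather than bf16) for "Native AMP" are assumptions, and the PEFT/LoRA adapter configuration is not described in this card.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training arguments implied by the
# hyperparameter list above (Transformers 4.46.x API).
training_args = TrainingArguments(
    output_dir="peft-llama-bmr-test",   # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,      # effective train batch size of 8
    num_train_epochs=1,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                          # "Native AMP"; bf16 is equally plausible, assumption
    eval_strategy="steps",              # assumed: the results table reports validation loss every 10 steps
    eval_steps=10,
    logging_steps=10,
)
```

With a per-device batch size of 2 and 4 gradient accumulation steps on a single device, the effective batch size is 8, which matches the total_train_batch_size listed above.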
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.5744 | 0.0183 | 10 | 2.4501 |
2.3995 | 0.0367 | 20 | 2.3571 |
2.3147 | 0.0550 | 30 | 2.2727 |
2.2684 | 0.0734 | 40 | 2.2084 |
2.1813 | 0.0917 | 50 | 2.1577 |
2.1365 | 0.1100 | 60 | 2.1166 |
2.1404 | 0.1284 | 70 | 2.0843 |
2.0564 | 0.1467 | 80 | 2.0595 |
2.0717 | 0.1651 | 90 | 2.0398 |
1.9985 | 0.1834 | 100 | 2.0215 |
1.9806 | 0.2017 | 110 | 2.0039 |
1.9974 | 0.2201 | 120 | 1.9896 |
1.9678 | 0.2384 | 130 | 1.9800 |
2.0037 | 0.2568 | 140 | 1.9709 |
1.9703 | 0.2751 | 150 | 1.9647 |
2.0012 | 0.2934 | 160 | 1.9599 |
1.9475 | 0.3118 | 170 | 1.9525 |
2.0115 | 0.3301 | 180 | 1.9474 |
1.9348 | 0.3485 | 190 | 1.9428 |
2.0 | 0.3668 | 200 | 1.9378 |
1.9661 | 0.3851 | 210 | 1.9332 |
1.9389 | 0.4035 | 220 | 1.9292 |
1.9141 | 0.4218 | 230 | 1.9264 |
1.9356 | 0.4402 | 240 | 1.9222 |
1.9395 | 0.4585 | 250 | 1.9193 |
1.9322 | 0.4768 | 260 | 1.9157 |
1.9227 | 0.4952 | 270 | 1.9133 |
1.9244 | 0.5135 | 280 | 1.9102 |
1.8914 | 0.5319 | 290 | 1.9076 |
1.8998 | 0.5502 | 300 | 1.9051 |
1.8878 | 0.5685 | 310 | 1.9035 |
1.9012 | 0.5869 | 320 | 1.9012 |
1.9044 | 0.6052 | 330 | 1.8993 |
1.9121 | 0.6236 | 340 | 1.8971 |
1.9032 | 0.6419 | 350 | 1.8949 |
1.9058 | 0.6602 | 360 | 1.8933 |
1.9262 | 0.6786 | 370 | 1.8919 |
1.8939 | 0.6969 | 380 | 1.8905 |
1.8734 | 0.7153 | 390 | 1.8891 |
1.9305 | 0.7336 | 400 | 1.8878 |
1.8918 | 0.7519 | 410 | 1.8869 |
1.8988 | 0.7703 | 420 | 1.8854 |
1.8895 | 0.7886 | 430 | 1.8843 |
1.8887 | 0.8070 | 440 | 1.8836 |
1.8958 | 0.8253 | 450 | 1.8826 |
1.8966 | 0.8436 | 460 | 1.8819 |
1.901 | 0.8620 | 470 | 1.8811 |
1.9045 | 0.8803 | 480 | 1.8806 |
1.8838 | 0.8987 | 490 | 1.8800 |
1.8869 | 0.9170 | 500 | 1.8795 |
1.8603 | 0.9354 | 510 | 1.8792 |
1.8871 | 0.9537 | 520 | 1.8789 |
1.8874 | 0.9720 | 530 | 1.8787 |
1.8852 | 0.9904 | 540 | 1.8785 |
Framework versions
- PEFT 0.13.2
- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
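
With the framework versions listed above, the adapter can be loaded on top of the base model roughly as follows. This is a minimal sketch, assuming the adapter is hosted at abdullahzubairwan/peft-llama-bmr-test (the repository named in the model tree below) and that the gated base weights are accessible; the prompt and generation settings are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3.1-8B"              # base model named in the description
adapter_id = "abdullahzubairwan/peft-llama-bmr-test"  # assumed adapter repo id (see model tree)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)

# Attach the PEFT adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Hello"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```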
Model tree for abdullahzubairwan/peft-llama-bmr-test
Base model
meta-llama/Llama-3.1-8B