---
library_name: peft
license: llama3.1
base_model: meta-llama/Meta-Llama-3.1-8B
tags:
- generated_from_trainer
model-index:
- name: peft-llama-bmr-test
  results: []
datasets:
- malaysia-ai/crawl-my-website
language:
- ms
---

# peft-llama-bmr-test

This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) on the [malaysia-ai/crawl-my-website](https://huggingface.co/datasets/malaysia-ai/crawl-my-website) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.8785

## Model description

This is a PEFT adapter for meta-llama/Meta-Llama-3.1-8B, fine-tuned on Malay-language (`ms`) web-crawl text. More information needed.

## Intended uses & limitations

More information needed. A hedged loading sketch is given under "Inference example" at the end of this card.

## Training and evaluation data

Trained and evaluated on the malaysia-ai/crawl-my-website dataset; more information needed on the exact splits and preprocessing.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch reproducing this configuration appears under "Reproducing the training configuration" at the end of this card):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.5744        | 0.0183 | 10   | 2.4501          |
| 2.3995        | 0.0367 | 20   | 2.3571          |
| 2.3147        | 0.0550 | 30   | 2.2727          |
| 2.2684        | 0.0734 | 40   | 2.2084          |
| 2.1813        | 0.0917 | 50   | 2.1577          |
| 2.1365        | 0.1100 | 60   | 2.1166          |
| 2.1404        | 0.1284 | 70   | 2.0843          |
| 2.0564        | 0.1467 | 80   | 2.0595          |
| 2.0717        | 0.1651 | 90   | 2.0398          |
| 1.9985        | 0.1834 | 100  | 2.0215          |
| 1.9806        | 0.2017 | 110  | 2.0039          |
| 1.9974        | 0.2201 | 120  | 1.9896          |
| 1.9678        | 0.2384 | 130  | 1.9800          |
| 2.0037        | 0.2568 | 140  | 1.9709          |
| 1.9703        | 0.2751 | 150  | 1.9647          |
| 2.0012        | 0.2934 | 160  | 1.9599          |
| 1.9475        | 0.3118 | 170  | 1.9525          |
| 2.0115        | 0.3301 | 180  | 1.9474          |
| 1.9348        | 0.3485 | 190  | 1.9428          |
| 2.0000        | 0.3668 | 200  | 1.9378          |
| 1.9661        | 0.3851 | 210  | 1.9332          |
| 1.9389        | 0.4035 | 220  | 1.9292          |
| 1.9141        | 0.4218 | 230  | 1.9264          |
| 1.9356        | 0.4402 | 240  | 1.9222          |
| 1.9395        | 0.4585 | 250  | 1.9193          |
| 1.9322        | 0.4768 | 260  | 1.9157          |
| 1.9227        | 0.4952 | 270  | 1.9133          |
| 1.9244        | 0.5135 | 280  | 1.9102          |
| 1.8914        | 0.5319 | 290  | 1.9076          |
| 1.8998        | 0.5502 | 300  | 1.9051          |
| 1.8878        | 0.5685 | 310  | 1.9035          |
| 1.9012        | 0.5869 | 320  | 1.9012          |
| 1.9044        | 0.6052 | 330  | 1.8993          |
| 1.9121        | 0.6236 | 340  | 1.8971          |
| 1.9032        | 0.6419 | 350  | 1.8949          |
| 1.9058        | 0.6602 | 360  | 1.8933          |
| 1.9262        | 0.6786 | 370  | 1.8919          |
| 1.8939        | 0.6969 | 380  | 1.8905          |
| 1.8734        | 0.7153 | 390  | 1.8891          |
| 1.9305        | 0.7336 | 400  | 1.8878          |
| 1.8918        | 0.7519 | 410  | 1.8869          |
| 1.8988        | 0.7703 | 420  | 1.8854          |
| 1.8895        | 0.7886 | 430  | 1.8843          |
| 1.8887        | 0.8070 | 440  | 1.8836          |
| 1.8958        | 0.8253 | 450  | 1.8826          |
| 1.8966        | 0.8436 | 460  | 1.8819          |
| 1.9010        | 0.8620 | 470  | 1.8811          |
| 1.9045        | 0.8803 | 480  | 1.8806          |
| 1.8838        | 0.8987 | 490  | 1.8800          |
| 1.8869        | 0.9170 | 500  | 1.8795          |
| 1.8603        | 0.9354 | 510  | 1.8792          |
| 1.8871        | 0.9537 | 520  | 1.8789          |
| 1.8874        | 0.9720 | 530  | 1.8787          |
| 1.8852        | 0.9904 | 540  | 1.8785          |

### Framework versions

- PEFT 0.13.2
- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
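
## Inference example

A minimal loading sketch, not an official usage snippet: it assumes the adapter weights are published under `peft-llama-bmr-test` (replace with the full `<org>/peft-llama-bmr-test` repository id on the Hub) and that you have access to the gated base model. The base model is loaded first and the adapter is attached with `PeftModel.from_pretrained`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3.1-8B"
adapter_id = "peft-llama-bmr-test"  # assumption: replace with the full Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption: a GPU with bf16 support
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter
model.eval()

# Malay prompt, matching the card's `ms` language tag
inputs = tokenizer("Kuala Lumpur ialah", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```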
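
## Reproducing the training configuration

A sketch of a `transformers.TrainingArguments` object matching the hyperparameters listed above, not the author's actual training script. The `output_dir`, the logging/evaluation cadence, and the choice of fp16 for "Native AMP" are assumptions; the adapter's PEFT/LoRA settings are not shown on this card and are omitted.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="peft-llama-bmr-test",  # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,     # 2 x 4 = total train batch size 8
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                         # assumption: "Native AMP" via fp16
    eval_strategy="steps",
    eval_steps=10,                     # matches the 10-step cadence in the results table
    logging_steps=10,
)
```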