Jsoo/llama3.2-3b-hard

Files changed (5)

README.md +29 -129
adapter_model.safetensors +1 -1
runs/Oct23_12-48-29_DESKTOP-UVSTCBR/events.out.tfevents.1729655317.DESKTOP-UVSTCBR.210884.0 +3 -0
tokenizer.json +2 -2
training_args.bin +1 -1

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7021
 ## Model description
@@ -46,138 +46,38 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.01
-- num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch  | Step  | Validation Loss |
-|:-------------:|:------:|:-----:|:---------------:|
-| 2.8038        | 0.08   | 100   | 2.5553          |
-| 2.507         | 0.16   | 200   | 2.4913          |
-| 2.4522        | 0.24   | 300   | 2.4433          |
-| 2.417         | 0.32   | 400   | 2.4122          |
-| 2.3949        | 0.4    | 500   | 2.3822          |
-| 2.3586        | 0.48   | 600   | 2.3510          |
-| 2.3359        | 0.56   | 700   | 2.3276          |
-| 2.3081        | 0.64   | 800   | 2.3043          |
-| 2.307         | 0.72   | 900   | 2.2872          |
-| 2.2741        | 0.8    | 1000  | 2.2695          |
-| 2.2594        | 0.88   | 1100  | 2.2515          |
-| 2.2591        | 0.96   | 1200  | 2.2363          |
-| 2.2163        | 1.04   | 1300  | 2.2191          |
-| 2.1996        | 1.12   | 1400  | 2.2057          |
-| 2.1729        | 1.2    | 1500  | 2.1928          |
-| 2.1718        | 1.28   | 1600  | 2.1799          |
-| 2.1586        | 1.3600 | 1700  | 2.1692          |
-| 2.1386        | 1.44   | 1800  | 2.1530          |
-| 2.1338        | 1.52   | 1900  | 2.1418          |
-| 2.1189        | 1.6    | 2000  | 2.1307          |
-| 2.1055        | 1.6800 | 2100  | 2.1187          |
-| 2.1074        | 1.76   | 2200  | 2.1075          |
-| 2.0919        | 1.8400 | 2300  | 2.0959          |
-| 2.0812        | 1.92   | 2400  | 2.0845          |
-| 2.0621        | 2.0    | 2500  | 2.0743          |
-| 2.0172        | 2.08   | 2600  | 2.0666          |
-| 2.0159        | 2.16   | 2700  | 2.0602          |
-| 2.0075        | 2.24   | 2800  | 2.0476          |
-| 2.0042        | 2.32   | 2900  | 2.0394          |
-| 2.0062        | 2.4    | 3000  | 2.0262          |
-| 1.989         | 2.48   | 3100  | 2.0157          |
-| 1.9808        | 2.56   | 3200  | 2.0086          |
-| 1.9792        | 2.64   | 3300  | 1.9985          |
-| 1.9751        | 2.7200 | 3400  | 1.9904          |
-| 1.963         | 2.8    | 3500  | 1.9810          |
-| 1.9498        | 2.88   | 3600  | 1.9718          |
-| 1.9495        | 2.96   | 3700  | 1.9662          |
-| 1.9053        | 3.04   | 3800  | 1.9578          |
-| 1.8905        | 3.12   | 3900  | 1.9486          |
-| 1.8873        | 3.2    | 4000  | 1.9412          |
-| 1.8963        | 3.2800 | 4100  | 1.9347          |
-| 1.8847        | 3.36   | 4200  | 1.9274          |
-| 1.8819        | 3.44   | 4300  | 1.9187          |
-| 1.8789        | 3.52   | 4400  | 1.9151          |
-| 1.8635        | 3.6    | 4500  | 1.9057          |
-| 1.8557        | 3.68   | 4600  | 1.9010          |
-| 1.8518        | 3.76   | 4700  | 1.8927          |
-| 1.8444        | 3.84   | 4800  | 1.8863          |
-| 1.8318        | 3.92   | 4900  | 1.8801          |
-| 1.8387        | 4.0    | 5000  | 1.8737          |
-| 1.7994        | 4.08   | 5100  | 1.8701          |
-| 1.7866        | 4.16   | 5200  | 1.8634          |
-| 1.8005        | 4.24   | 5300  | 1.8623          |
-| 1.7951        | 4.32   | 5400  | 1.8558          |
-| 1.7818        | 4.4    | 5500  | 1.8477          |
-| 1.7874        | 4.48   | 5600  | 1.8426          |
-| 1.7771        | 4.5600 | 5700  | 1.8386          |
-| 1.7574        | 4.64   | 5800  | 1.8353          |
-| 1.7758        | 4.72   | 5900  | 1.8273          |
-| 1.7864        | 4.8    | 6000  | 1.8244          |
-| 1.7741        | 4.88   | 6100  | 1.8262          |
-| 1.7638        | 4.96   | 6200  | 1.8151          |
-| 1.7485        | 5.04   | 6300  | 1.8085          |
-| 1.7239        | 5.12   | 6400  | 1.8017          |
-| 1.7231        | 5.2    | 6500  | 1.7985          |
-| 1.7212        | 5.28   | 6600  | 1.7950          |
-| 1.7183        | 5.36   | 6700  | 1.7907          |
-| 1.7234        | 5.44   | 6800  | 1.7856          |
-| 1.7082        | 5.52   | 6900  | 1.7830          |
-| 1.7128        | 5.6    | 7000  | 1.7792          |
-| 1.7114        | 5.68   | 7100  | 1.7743          |
-| 1.7193        | 5.76   | 7200  | 1.7714          |
-| 1.7093        | 5.84   | 7300  | 1.7672          |
-| 1.6974        | 5.92   | 7400  | 1.7643          |
-| 1.7176        | 6.0    | 7500  | 1.7599          |
-| 1.6657        | 6.08   | 7600  | 1.7575          |
-| 1.679         | 6.16   | 7700  | 1.7560          |
-| 1.6663        | 6.24   | 7800  | 1.7526          |
-| 1.6634        | 6.32   | 7900  | 1.7499          |
-| 1.6736        | 6.4    | 8000  | 1.7466          |
-| 1.661         | 6.48   | 8100  | 1.7448          |
-| 1.6535        | 6.5600 | 8200  | 1.7438          |
-| 1.6734        | 6.64   | 8300  | 1.7395          |
-| 1.6611        | 6.72   | 8400  | 1.7370          |
-| 1.6841        | 6.8    | 8500  | 1.7337          |
-| 1.6735        | 6.88   | 8600  | 1.7331          |
-| 1.6679        | 6.96   | 8700  | 1.7316          |
-| 1.6459        | 7.04   | 8800  | 1.7305          |
-| 1.6438        | 7.12   | 8900  | 1.7296          |
-| 1.6436        | 7.2    | 9000  | 1.7283          |
-| 1.6293        | 7.28   | 9100  | 1.7278          |
-| 1.6424        | 7.36   | 9200  | 1.7252          |
-| 1.64          | 7.44   | 9300  | 1.7244          |
-| 1.6114        | 7.52   | 9400  | 1.7227          |
-| 1.6331        | 7.6    | 9500  | 1.7214          |
-| 1.628         | 7.68   | 9600  | 1.7173          |
-| 1.6464        | 7.76   | 9700  | 1.7159          |
-| 1.6355        | 7.84   | 9800  | 1.7138          |
-| 1.6489        | 7.92   | 9900  | 1.7127          |
-| 1.6436        | 8.0    | 10000 | 1.7113          |
-| 1.6108        | 8.08   | 10100 | 1.7108          |
-| 1.6252        | 8.16   | 10200 | 1.7097          |
-| 1.6228        | 8.24   | 10300 | 1.7087          |
-| 1.617         | 8.32   | 10400 | 1.7084          |
-| 1.6255        | 8.4    | 10500 | 1.7079          |
-| 1.6212        | 8.48   | 10600 | 1.7070          |
-| 1.6146        | 8.56   | 10700 | 1.7068          |
-| 1.625         | 8.64   | 10800 | 1.7060          |
-| 1.6282        | 8.72   | 10900 | 1.7056          |
-| 1.614         | 8.8    | 11000 | 1.7054          |
-| 1.612         | 8.88   | 11100 | 1.7051          |
-| 1.6145        | 8.96   | 11200 | 1.7040          |
-| 1.6125        | 9.04   | 11300 | 1.7037          |
-| 1.6282        | 9.12   | 11400 | 1.7030          |
-| 1.6085        | 9.2    | 11500 | 1.7030          |
-| 1.6008        | 9.28   | 11600 | 1.7027          |
-| 1.6109        | 9.36   | 11700 | 1.7024          |
-| 1.6318        | 9.44   | 11800 | 1.7023          |
-| 1.5976        | 9.52   | 11900 | 1.7022          |
-| 1.5975        | 9.6    | 12000 | 1.7022          |
-| 1.6108        | 9.68   | 12100 | 1.7021          |
-| 1.6158        | 9.76   | 12200 | 1.7021          |
-| 1.6232        | 9.84   | 12300 | 1.7021          |
-| 1.6109        | 9.92   | 12400 | 1.7021          |
-| 1.6005        | 10.0   | 12500 | 1.7021          |
 ### Framework versions
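
The hyperparameters in this hunk list a cosine scheduler with `lr_scheduler_warmup_ratio: 0.01`, and the removed table ends at step 12500. A minimal sketch of the shape such a schedule usually takes (linear warmup, then half-cosine decay to zero) is given below. It assumes that the warmup length is the ratio times the total step count, rounded up. The name `cosine_lr_factor` and the value of `BASE_LR` are hypothetical, because the learning rate is not visible in this part of the diff.

```python
import math


def cosine_lr_factor(step: int, total_steps: int, warmup_ratio: float = 0.01) -> float:
    """Multiplier applied to the base learning rate at a given optimizer step."""
    # Assumption: warmup length is the ratio of total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to 1 during warmup.
        return step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from 1 down to 0.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 12500  # last step in the removed table (epoch 10.0)
BASE_LR = 1e-4       # hypothetical value, not taken from the diff

if __name__ == "__main__":
    for step in (0, 125, 6250, 12000, 12500):
        print(step, BASE_LR * cosine_lr_factor(step, TOTAL_STEPS))
```

Under these assumptions the multiplier is already below 1% of its peak by step 12000, which would be consistent with the validation loss in the removed table settling at 1.7021 over the last few hundred steps.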

 This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.9956
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.01
+- num_epochs: 20
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 2.6788        | 0.3901 | 100  | 2.2881          |
+| 2.4361        | 0.7801 | 200  | 2.2154          |
+| 2.3903        | 1.1702 | 300  | 2.1747          |
+| 2.3166        | 1.5602 | 400  | 2.1358          |
+| 2.2868        | 1.9503 | 500  | 2.1058          |
+| 2.2048        | 2.3403 | 600  | 2.0800          |
+| 2.1999        | 2.7304 | 700  | 2.0613          |
+| 2.1711        | 3.1204 | 800  | 2.0471          |
+| 2.1038        | 3.5105 | 900  | 2.0329          |
+| 2.1115        | 3.9005 | 1000 | 2.0185          |
+| 2.0859        | 4.2906 | 1100 | 2.0129          |
+| 2.0455        | 4.6806 | 1200 | 2.0084          |
+| 2.0338        | 5.0707 | 1300 | 2.0022          |
+| 1.9991        | 5.4608 | 1400 | 2.0011          |
+| 1.9948        | 5.8508 | 1500 | 1.9966          |
+| 1.948         | 6.2409 | 1600 | 1.9977          |
+| 1.9773        | 6.6309 | 1700 | 1.9909          |
+| 1.9228        | 7.0210 | 1800 | 1.9915          |
+| 1.8997        | 7.4110 | 1900 | 1.9947          |
+| 1.9212        | 7.8011 | 2000 | 1.9868          |
+| 1.8786        | 8.1911 | 2100 | 2.0092          |
+| 1.8762        | 8.5812 | 2200 | 2.0070          |
+| 1.8724        | 8.9712 | 2300 | 2.0023          |
+| 1.8604        | 9.3613 | 2400 | 1.9978          |
+| 1.8436        | 9.7513 | 2500 | 1.9956          |
 ### Framework versions
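
The added table can be checked directly. The sketch below copies its epoch, step, and validation-loss columns and reports the row with the lowest validation loss, the final row, and the implied number of steps per epoch. The helper names are hypothetical. The diff itself does not explain why the table stops at epoch 9.7513 while `num_epochs` is 20.

```python
# (epoch, step, validation_loss) rows copied from the added table.
ROWS = [
    (0.3901, 100, 2.2881), (0.7801, 200, 2.2154), (1.1702, 300, 2.1747),
    (1.5602, 400, 2.1358), (1.9503, 500, 2.1058), (2.3403, 600, 2.0800),
    (2.7304, 700, 2.0613), (3.1204, 800, 2.0471), (3.5105, 900, 2.0329),
    (3.9005, 1000, 2.0185), (4.2906, 1100, 2.0129), (4.6806, 1200, 2.0084),
    (5.0707, 1300, 2.0022), (5.4608, 1400, 2.0011), (5.8508, 1500, 1.9966),
    (6.2409, 1600, 1.9977), (6.6309, 1700, 1.9909), (7.0210, 1800, 1.9915),
    (7.4110, 1900, 1.9947), (7.8011, 2000, 1.9868), (8.1911, 2100, 2.0092),
    (8.5812, 2200, 2.0070), (8.9712, 2300, 2.0023), (9.3613, 2400, 1.9978),
    (9.7513, 2500, 1.9956),
]


def best_row(rows):
    """Return the (epoch, step, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    epoch, step, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    print("best row: ", best_row(ROWS))
    print("final row:", ROWS[-1])
    print("steps per epoch (approx.):", round(steps_per_epoch(ROWS)))
```

The reported evaluation loss of 1.9956 matches the final row, not the lowest value in the table (1.9868 at step 2000), so the headline number reflects the last logged evaluation, not the best one.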

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6218ff3ca163c1ffc14aa4de9f6aebbb7b331ae0a484960314076edb87c7199c
 size 36715216

 version https://git-lfs.github.com/spec/v1
+oid sha256:516752b5acbc965b718d6b574b27a362a8ead0175f19a71abb25be26c449d19a
 size 36715216

runs/Oct23_12-48-29_DESKTOP-UVSTCBR/events.out.tfevents.1729655317.DESKTOP-UVSTCBR.210884.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2c25b30b39fbe1cc2afddeabd842aa76850ab098ed9317892b72370589074803
+size 17903

tokenizer.json CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:76cfe2f054560aae896b2b75e273dc97a39e304d4ad19c44a9727a1d6b33c4cc
-size 17210021

 version https://git-lfs.github.com/spec/v1
+oid sha256:9c85066e7642934ed09b44155e6566b0b5dab2637fb9433439ba5c9c7f8b50d3
+size 17210018

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b80f78d46420cd5b82723bd77d68e989c62705f2fccb568de3478cb149ae85d6
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:94c489243f58e0385e630c2a899dc27c4bd2d04b61d534b43b69d108a9380236
 size 5432
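
The binary files in this commit appear only as Git LFS pointers: three lines giving the spec version, a SHA-256 object ID, and the size in bytes. As a rough illustration of how such a pointer relates to the file it stands in for, the sketch below builds and parses pointer text using only the standard library. The function names `lfs_pointer` and `parse_pointer` are hypothetical and are not part of any Git LFS tooling.

```python
import hashlib

LFS_SPEC = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(data: bytes) -> str:
    """Build the pointer text for a blob: spec version, SHA-256 oid, byte size."""
    oid = hashlib.sha256(data).hexdigest()
    return f"version {LFS_SPEC}\noid sha256:{oid}\nsize {len(data)}\n"


def parse_pointer(text: str) -> dict:
    """Split pointer text into its key/value fields, with size as an int."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"],
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    # Old and new adapter pointers as shown in the diff above.
    old = parse_pointer(
        f"version {LFS_SPEC}\n"
        "oid sha256:6218ff3ca163c1ffc14aa4de9f6aebbb7b331ae0a484960314076edb87c7199c\n"
        "size 36715216\n"
    )
    new = parse_pointer(
        f"version {LFS_SPEC}\n"
        "oid sha256:516752b5acbc965b718d6b574b27a362a8ead0175f19a71abb25be26c449d19a\n"
        "size 36715216\n"
    )
    print("same size:", old["size"] == new["size"])
    print("same content:", old["oid"] == new["oid"])
```

Read this way, the `adapter_model.safetensors` and `training_args.bin` entries keep their size while the object ID changes, which indicates new contents of identical length, whereas `tokenizer.json` shrinks by three bytes.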