# t5-small-finetuned-xsum

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.2438
- Rouge1: 39.0888
- Rouge2: 16.4223
- Rougel: 39.0782
- Rougelsum: 38.9078
- Gen Len: 12.5217
## Model description
More information needed
## Intended uses & limitations
More information needed
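Since this is a T5 summarization fine-tune, inference can be sketched with the `transformers` summarization pipeline. The snippet below uses the base `t5-small` checkpoint so it stays runnable as written; to use this model, substitute the repo id under which this fine-tuned checkpoint is published (not stated in this card):

```python
from transformers import pipeline

# Substitute the actual repo id of this fine-tuned checkpoint;
# "t5-small" (the base model) is used here only as a runnable placeholder.
summarizer = pipeline("summarization", model="t5-small")

text = (
    "The tower is 324 metres tall, about the same height as an 81-storey "
    "building, and the tallest structure in Paris."
)
# The pipeline applies T5's "summarize: " task prefix from the model config.
summary = summarizer(text, max_length=30, min_length=5)[0]["summary_text"]
print(summary)
```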
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 13 | 2.9857 | 35.054 | 9.3338 | 34.7182 | 34.6078 | 12.8261 |
No log | 2.0 | 26 | 2.9275 | 36.982 | 10.81 | 36.5767 | 36.2152 | 12.2174 |
No log | 3.0 | 39 | 2.8723 | 34.1773 | 9.1875 | 33.6848 | 33.5565 | 13.1739 |
No log | 4.0 | 52 | 2.8198 | 35.5137 | 9.1875 | 34.9118 | 34.881 | 12.3043 |
No log | 5.0 | 65 | 2.7754 | 34.8751 | 9.1875 | 34.2983 | 34.1915 | 12.4348 |
No log | 6.0 | 78 | 2.7416 | 33.8933 | 9.72 | 33.9317 | 33.7063 | 12.5217 |
No log | 7.0 | 91 | 2.7116 | 35.2289 | 9.5122 | 35.285 | 35.2747 | 13.0 |
No log | 8.0 | 104 | 2.6810 | 34.0577 | 8.5778 | 34.055 | 34.0948 | 13.3913 |
No log | 9.0 | 117 | 2.6567 | 32.9647 | 8.0047 | 32.9981 | 33.0779 | 13.9565 |
No log | 10.0 | 130 | 2.6316 | 32.6956 | 8.4205 | 32.7352 | 32.6825 | 14.0435 |
No log | 11.0 | 143 | 2.6114 | 32.9656 | 8.7371 | 33.0318 | 33.0445 | 14.6087 |
No log | 12.0 | 156 | 2.5918 | 33.1517 | 8.7584 | 33.1374 | 33.2133 | 14.2174 |
No log | 13.0 | 169 | 2.5751 | 34.4539 | 9.3237 | 34.4496 | 34.4093 | 13.3043 |
No log | 14.0 | 182 | 2.5601 | 33.1915 | 9.4081 | 33.3879 | 33.1639 | 14.8696 |
No log | 15.0 | 195 | 2.5445 | 33.1915 | 9.4081 | 33.3879 | 33.1639 | 14.8696 |
No log | 16.0 | 208 | 2.5266 | 33.7765 | 9.3237 | 33.8297 | 33.7816 | 14.4348 |
No log | 17.0 | 221 | 2.5162 | 35.5923 | 9.3237 | 35.5421 | 35.701 | 13.5217 |
No log | 18.0 | 234 | 2.5053 | 35.9508 | 9.2685 | 35.9326 | 36.0466 | 13.0435 |
No log | 19.0 | 247 | 2.4925 | 34.5022 | 9.3237 | 34.4987 | 34.5612 | 14.0435 |
No log | 20.0 | 260 | 2.4797 | 33.6928 | 8.834 | 33.7552 | 33.8446 | 14.2609 |
No log | 21.0 | 273 | 2.4650 | 34.1559 | 8.834 | 34.2838 | 34.2381 | 14.6957 |
No log | 22.0 | 286 | 2.4544 | 36.3845 | 11.2985 | 36.4183 | 36.4242 | 13.6957 |
No log | 23.0 | 299 | 2.4436 | 35.3914 | 10.6288 | 35.2846 | 35.38 | 13.6522 |
No log | 24.0 | 312 | 2.4311 | 34.0662 | 10.1449 | 34.0753 | 34.1617 | 14.3043 |
No log | 25.0 | 325 | 2.4245 | 35.8294 | 10.3069 | 35.8467 | 35.863 | 12.6957 |
No log | 26.0 | 338 | 2.4154 | 35.4974 | 10.3069 | 35.4581 | 35.531 | 13.2174 |
No log | 27.0 | 351 | 2.4071 | 34.7645 | 9.9172 | 34.8136 | 34.714 | 12.913 |
No log | 28.0 | 364 | 2.3971 | 37.3155 | 13.3583 | 37.4269 | 37.4838 | 12.3913 |
No log | 29.0 | 377 | 2.3978 | 36.9809 | 13.3583 | 37.1434 | 37.1564 | 12.913 |
No log | 30.0 | 390 | 2.3977 | 35.7727 | 12.9485 | 35.9758 | 35.8481 | 13.3478 |
No log | 31.0 | 403 | 2.3889 | 36.9445 | 15.3623 | 37.0191 | 36.7931 | 13.6957 |
No log | 32.0 | 416 | 2.3791 | 36.9312 | 15.3623 | 37.0096 | 36.7726 | 13.7826 |
No log | 33.0 | 429 | 2.3698 | 38.0341 | 15.3623 | 38.1607 | 37.8796 | 13.6957 |
No log | 34.0 | 442 | 2.3566 | 38.0813 | 15.3623 | 38.2011 | 37.9199 | 13.4348 |
No log | 35.0 | 455 | 2.3508 | 38.8091 | 15.5572 | 39.0225 | 38.6247 | 12.8261 |
No log | 36.0 | 468 | 2.3421 | 38.1147 | 16.2361 | 38.2251 | 37.947 | 13.7826 |
No log | 37.0 | 481 | 2.3368 | 38.6438 | 15.5572 | 38.8191 | 38.5675 | 13.0435 |
No log | 38.0 | 494 | 2.3402 | 38.7863 | 15.9985 | 38.9217 | 38.543 | 13.0 |
2.6022 | 39.0 | 507 | 2.3356 | 38.9191 | 15.9985 | 39.1569 | 38.6792 | 12.6957 |
2.6022 | 40.0 | 520 | 2.3275 | 37.4775 | 15.2311 | 37.5157 | 37.3333 | 13.0435 |
2.6022 | 41.0 | 533 | 2.3233 | 38.1397 | 15.5276 | 38.2189 | 38.0466 | 12.6087 |
2.6022 | 42.0 | 546 | 2.3179 | 38.8484 | 16.743 | 38.8904 | 38.6264 | 12.4348 |
2.6022 | 43.0 | 559 | 2.3129 | 38.6908 | 16.743 | 38.751 | 38.5224 | 12.7391 |
2.6022 | 44.0 | 572 | 2.3033 | 38.9067 | 17.0916 | 38.8757 | 38.6766 | 12.913 |
2.6022 | 45.0 | 585 | 2.2955 | 37.9908 | 16.3768 | 38.0649 | 37.9065 | 13.1304 |
2.6022 | 46.0 | 598 | 2.2905 | 37.5099 | 16.3599 | 37.575 | 37.3251 | 13.3913 |
2.6022 | 47.0 | 611 | 2.2858 | 37.9779 | 16.3599 | 38.0428 | 37.861 | 13.3043 |
2.6022 | 48.0 | 624 | 2.2846 | 37.9779 | 16.3599 | 38.0428 | 37.861 | 13.3043 |
2.6022 | 49.0 | 637 | 2.2801 | 39.738 | 17.3604 | 39.8135 | 39.4361 | 12.5652 |
2.6022 | 50.0 | 650 | 2.2778 | 39.738 | 17.3604 | 39.8135 | 39.4361 | 12.5652 |
2.6022 | 51.0 | 663 | 2.2787 | 39.4204 | 17.2625 | 39.6063 | 39.1539 | 12.6957 |
2.6022 | 52.0 | 676 | 2.2742 | 40.4777 | 16.9406 | 40.6081 | 40.2489 | 12.4348 |
2.6022 | 53.0 | 689 | 2.2710 | 37.833 | 16.1811 | 37.919 | 37.8445 | 13.0435 |
2.6022 | 54.0 | 702 | 2.2670 | 37.4441 | 15.9708 | 37.5638 | 37.4827 | 13.3043 |
2.6022 | 55.0 | 715 | 2.2680 | 38.3007 | 16.4741 | 38.2323 | 38.0907 | 13.6957 |
2.6022 | 56.0 | 728 | 2.2662 | 37.7769 | 16.1816 | 37.786 | 37.6852 | 13.5217 |
2.6022 | 57.0 | 741 | 2.2657 | 37.855 | 16.1816 | 37.8885 | 37.8099 | 13.2174 |
2.6022 | 58.0 | 754 | 2.2631 | 37.7579 | 16.1816 | 37.8278 | 37.7224 | 13.3043 |
2.6022 | 59.0 | 767 | 2.2626 | 37.7579 | 16.1816 | 37.8278 | 37.7224 | 13.3043 |
2.6022 | 60.0 | 780 | 2.2600 | 39.4733 | 17.6398 | 39.195 | 39.1691 | 13.0435 |
2.6022 | 61.0 | 793 | 2.2603 | 39.4733 | 17.6398 | 39.195 | 39.1691 | 13.0435 |
2.6022 | 62.0 | 806 | 2.2592 | 39.397 | 17.6398 | 39.1585 | 39.1322 | 13.1304 |
2.6022 | 63.0 | 819 | 2.2572 | 39.3316 | 17.6398 | 39.0728 | 39.0356 | 13.4348 |
2.6022 | 64.0 | 832 | 2.2555 | 39.3316 | 17.6398 | 39.0728 | 39.0356 | 13.4348 |
2.6022 | 65.0 | 845 | 2.2549 | 39.3316 | 17.6398 | 39.0728 | 39.0356 | 13.4348 |
2.6022 | 66.0 | 858 | 2.2576 | 39.3566 | 17.6398 | 39.1001 | 39.0755 | 13.2609 |
2.6022 | 67.0 | 871 | 2.2570 | 38.8037 | 17.6398 | 38.6503 | 38.4646 | 13.087 |
2.6022 | 68.0 | 884 | 2.2573 | 38.8037 | 17.6398 | 38.6503 | 38.4646 | 13.087 |
2.6022 | 69.0 | 897 | 2.2570 | 39.3566 | 17.6398 | 39.1001 | 39.0755 | 13.2609 |
2.6022 | 70.0 | 910 | 2.2558 | 39.9086 | 17.2751 | 39.7524 | 39.6725 | 13.087 |
2.6022 | 71.0 | 923 | 2.2564 | 40.3049 | 17.4384 | 40.3022 | 40.1013 | 12.913 |
2.6022 | 72.0 | 936 | 2.2577 | 39.316 | 17.6398 | 39.0515 | 38.975 | 13.3478 |
2.6022 | 73.0 | 949 | 2.2571 | 39.3566 | 17.6398 | 39.1001 | 39.0755 | 13.2609 |
2.6022 | 74.0 | 962 | 2.2545 | 39.3566 | 17.6398 | 39.1001 | 39.0755 | 13.2609 |
2.6022 | 75.0 | 975 | 2.2526 | 38.8037 | 17.6398 | 38.6503 | 38.4646 | 13.087 |
2.6022 | 76.0 | 988 | 2.2519 | 39.3566 | 17.6398 | 39.1001 | 39.0755 | 13.2609 |
2.0634 | 77.0 | 1001 | 2.2514 | 38.6902 | 16.4479 | 38.1227 | 37.9513 | 13.1739 |
2.0634 | 78.0 | 1014 | 2.2503 | 38.6012 | 16.4479 | 38.0664 | 37.8945 | 13.2609 |
2.0634 | 79.0 | 1027 | 2.2489 | 38.6012 | 16.4479 | 38.0664 | 37.8945 | 13.2609 |
2.0634 | 80.0 | 1040 | 2.2473 | 38.6012 | 16.4479 | 38.0664 | 37.8945 | 13.2609 |
2.0634 | 81.0 | 1053 | 2.2454 | 38.1692 | 16.4479 | 37.6072 | 37.2747 | 13.087 |
2.0634 | 82.0 | 1066 | 2.2442 | 38.1692 | 16.4479 | 37.6072 | 37.2747 | 13.087 |
2.0634 | 83.0 | 1079 | 2.2448 | 38.4097 | 16.4479 | 38.3716 | 38.1151 | 13.087 |
2.0634 | 84.0 | 1092 | 2.2451 | 38.6179 | 16.4479 | 38.4637 | 38.2782 | 13.2609 |
2.0634 | 85.0 | 1105 | 2.2453 | 38.6419 | 16.4479 | 38.5054 | 38.3152 | 13.087 |
2.0634 | 86.0 | 1118 | 2.2450 | 38.9217 | 15.5035 | 38.8622 | 38.6592 | 12.8261 |
2.0634 | 87.0 | 1131 | 2.2447 | 38.9217 | 15.5035 | 38.8622 | 38.6592 | 12.8261 |
2.0634 | 88.0 | 1144 | 2.2449 | 38.9217 | 15.5035 | 38.8622 | 38.6592 | 12.8261 |
2.0634 | 89.0 | 1157 | 2.2446 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 90.0 | 1170 | 2.2448 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 91.0 | 1183 | 2.2449 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 92.0 | 1196 | 2.2445 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 93.0 | 1209 | 2.2442 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 94.0 | 1222 | 2.2439 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 95.0 | 1235 | 2.2443 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 96.0 | 1248 | 2.2442 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 97.0 | 1261 | 2.2439 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 98.0 | 1274 | 2.2440 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 99.0 | 1287 | 2.2437 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
2.0634 | 100.0 | 1300 | 2.2438 | 39.0888 | 16.4223 | 39.0782 | 38.9078 | 12.5217 |
### Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.13.0
- Tokenizers 0.13.3