---
license: apache-2.0
base_model: google/mt5-large
tags:
- generated_from_keras_callback
model-index:
- name: pakawadeep/mt5-large-finetuned-ctfl-augmented
results: []
---
<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->
# pakawadeep/mt5-large-finetuned-ctfl-augmented
This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on an unknown dataset.
It achieves the following results on the evaluation set:
- Train Loss: 0.3089
- Validation Loss: 0.6932
- Train Rouge1: 8.6987
- Train Rouge2: 1.2871
- Train Rougel: 8.8402
- Train Rougelsum: 8.9993
- Train Gen Len: 11.9208
- Epoch: 23
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
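The optimizer above applies decoupled weight decay: the decay term is added to the parameter update directly rather than folded into the gradient. A minimal plain-Python sketch of a single update step with the hyperparameters listed (this is an illustration of the update rule, not the actual `AdamWeightDecay` implementation from Transformers):

```python
import math

def adamw_step(param, grad, m, v, t,
               lr=2e-5, beta_1=0.9, beta_2=0.999,
               eps=1e-7, weight_decay_rate=0.01):
    """One AdamW update for a scalar parameter; t is the 1-based step index."""
    m = beta_1 * m + (1 - beta_1) * grad          # first-moment estimate
    v = beta_2 * v + (1 - beta_2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta_1 ** t)                 # bias correction
    v_hat = v / (1 - beta_2 ** t)
    # Decoupled weight decay: added outside the adaptive Adam term.
    param = param - lr * (m_hat / (math.sqrt(v_hat) + eps)
                          + weight_decay_rate * param)
    return param, m, v
```

For example, `adamw_step(1.0, 0.5, 0.0, 0.0, 1)` nudges the parameter slightly below 1.0, with the decay contributing `lr * 0.01 * param` of that movement.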
### Training results
| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
| 7.1544 | 4.7384 | 0.2888 | 0.0 | 0.2888 | 0.2475 | 16.6436 | 0 |
| 5.1911 | 2.1738 | 1.4851 | 0.3438 | 1.5242 | 1.5572 | 12.8564 | 1 |
| 3.6169 | 1.7018 | 5.5693 | 0.5776 | 5.5281 | 5.6931 | 11.6931 | 2 |
| 2.8957 | 1.4685 | 5.8581 | 0.8251 | 5.8581 | 5.9818 | 11.0347 | 3 |
| 1.9769 | 1.2807 | 6.6832 | 1.8152 | 6.8069 | 6.8688 | 11.4505 | 4 |
| 1.5772 | 1.1149 | 6.5064 | 1.1881 | 6.7185 | 6.7185 | 11.6485 | 5 |
| 1.3661 | 0.9914 | 8.4158 | 2.3762 | 8.4158 | 8.5809 | 11.8762 | 6 |
| 1.2399 | 0.8926 | 7.9915 | 2.1287 | 8.0269 | 8.2037 | 11.9604 | 7 |
| 1.0788 | 0.8530 | 8.4158 | 2.1287 | 8.6987 | 8.6987 | 11.9901 | 8 |
| 0.9825 | 0.8069 | 8.8637 | 2.3762 | 8.9345 | 9.0288 | 11.9653 | 9 |
| 0.9078 | 0.7803 | 8.4866 | 1.8812 | 8.6987 | 8.7341 | 11.9653 | 10 |
| 0.8409 | 0.7522 | 8.4866 | 1.8812 | 8.6987 | 8.7341 | 11.9802 | 11 |
| 0.7715 | 0.7171 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9851 | 12 |
| 0.7063 | 0.7045 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9505 | 13 |
| 0.6558 | 0.6797 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9554 | 14 |
| 0.6074 | 0.6651 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9455 | 15 |
| 0.5571 | 0.6555 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9356 | 16 |
| 0.5126 | 0.6531 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9257 | 17 |
| 0.4744 | 0.6481 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9406 | 18 |
| 0.4356 | 0.6521 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9406 | 19 |
| 0.3982 | 0.6618 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9307 | 20 |
| 0.3667 | 0.6628 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9208 | 21 |
| 0.3371 | 0.6723 | 8.2390 | 1.2871 | 8.4512 | 8.4866 | 11.9307 | 22 |
| 0.3089 | 0.6932 | 8.6987 | 1.2871 | 8.8402 | 8.9993 | 11.9208 | 23 |
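The Rouge1 column above is a unigram-overlap F1 score. A minimal sketch of that metric, assuming simple whitespace tokenization (the card's scores were presumably produced by a full ROUGE library with stemming and proper tokenization, so this is illustrative only):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram ROUGE-1 F1 with naive whitespace tokenization."""
    pred = Counter(prediction.split())
    ref = Counter(reference.split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For instance, `rouge1_f1("the cat sat", "the cat sat down")` yields precision 1.0 and recall 0.75, so F1 = 6/7.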
### Framework versions
- Transformers 4.38.2
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2