long_t5_bos

This model is a fine-tuned version of google/long-t5-tglobal-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8505
  • Rouge1: 0.2354
  • Rouge2: 0.0709
  • Rougel: 0.1685
  • Rougelsum: 0.1685
  • Gen Len: 121.0824
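The ROUGE scores above were produced by the evaluation pipeline (typically the rouge_score package, which also applies stemming). As a rough intuition for what Rouge1 measures, here is a minimal, simplified unigram-overlap F1 in plain Python; it is an illustration only, not the scorer used for the numbers above:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: F1 of unigram overlap, no stemming or tokenizer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Clipped overlap: each reference token can be matched at most as
    # many times as it occurs in the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat on the mat", "a cat sat on a mat"), 4))  # 0.6667
```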

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
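With lr_scheduler_type set to linear and no warmup steps listed (the zero-warmup assumption is mine), the learning rate simply decays from 1e-05 to 0 over the run. A small sketch of that schedule:

```python
def linear_lr(step: int, total_steps: int = 380, base_lr: float = 1e-05) -> float:
    """Linear decay from base_lr to 0 over total_steps.

    total_steps = 380 comes from the training-results table below;
    zero warmup is an assumption, as the card lists no warmup setting.
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 1e-05 at the first step
print(linear_lr(190))  # 5e-06 at the halfway point
```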

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum Gen Len
No log 1.0 19 2.3142 0.2398 0.0511 0.1415 0.1414 104.2824
No log 2.0 38 2.1693 0.2385 0.0504 0.1413 0.1413 103.3765
No log 3.0 57 2.0813 0.2431 0.0529 0.1453 0.1456 103.7412
No log 4.0 76 2.0266 0.2425 0.0521 0.144 0.1443 103.2588
No log 5.0 95 1.9855 0.2367 0.049 0.1415 0.1413 100.8824
No log 6.0 114 1.9570 0.2396 0.0523 0.1452 0.1451 101.5412
No log 7.0 133 1.9336 0.2437 0.0549 0.149 0.1489 99.6706
No log 8.0 152 1.9153 0.2442 0.0592 0.1554 0.1553 103.1176
No log 9.0 171 1.9022 0.2397 0.0584 0.1557 0.1555 103.7294
No log 10.0 190 1.8911 0.2363 0.061 0.159 0.1587 108.9412
No log 11.0 209 1.8834 0.2366 0.0639 0.1644 0.1641 112.7765
No log 12.0 228 1.8758 0.2399 0.0671 0.1646 0.1647 109.3529
No log 13.0 247 1.8691 0.2469 0.0715 0.1728 0.1726 113.4588
No log 14.0 266 1.8640 0.2439 0.071 0.1731 0.1727 114.3176
No log 15.0 285 1.8598 0.242 0.0711 0.17 0.1703 113.9647
No log 16.0 304 1.8561 0.2434 0.0732 0.1707 0.1708 113.1059
No log 17.0 323 1.8534 0.2391 0.0723 0.1705 0.1705 119.5647
No log 18.0 342 1.8517 0.2358 0.0709 0.1684 0.1682 121.0824
No log 19.0 361 1.8508 0.2354 0.0709 0.1685 0.1685 121.0824
No log 20.0 380 1.8505 0.2354 0.0709 0.1685 0.1685 121.0824
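The step counts in the table are internally consistent: 19 optimizer steps per epoch over 20 epochs gives the final step count of 380, and at train_batch_size 16 this implies roughly 19 × 16 ≈ 304 training examples (an upper bound, assuming no gradient accumulation and allowing a partial last batch):

```python
steps_per_epoch = 19  # from the table: step 19 at epoch 1.0
epochs = 20           # num_epochs
batch_size = 16       # train_batch_size

total_steps = steps_per_epoch * epochs
max_train_examples = steps_per_epoch * batch_size  # last batch may be partial

print(total_steps)        # 380, matching the final table row
print(max_train_examples) # 304
```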

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.0+cu118
  • Datasets 3.0.0
  • Tokenizers 0.19.1