flan-t5-base-turkish-summarisation-qlora

This model is a QLoRA (PEFT) adapter fine-tuned from google/flan-t5-base on the Turkish portion of the MLSUM dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3322
  • Rouge1: 17.1573
  • Rouge2: 11.0617
  • Rougel: 16.4608
  • Rougelsum: 16.5524
  • Gen Len: 20.0

Model description

This is a parameter-efficient (QLoRA) adapter for google/flan-t5-base, trained for abstractive summarisation of Turkish text from the MLSUM dataset. The adapter is loaded on top of the base checkpoint with the PEFT library; the base model's weights are not modified.
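A minimal usage sketch (an assumption, not stated in the card): it presumes the adapter is published as kaixkhazaki/flan-t5-base-turkish-summarisation-qlora and attaches to the base checkpoint via PEFT; the input text and generation settings are illustrative only.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the QLoRA adapter weights on top.
base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = PeftModel.from_pretrained(
    base, "kaixkhazaki/flan-t5-base-turkish-summarisation-qlora"
)
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

# Flan-T5 is instruction-tuned, so a "summarize:" prefix is a reasonable prompt
# (the exact prompt format used in training is not documented in this card).
article = "..."  # Turkish news article text goes here
inputs = tokenizer("summarize: " + article, return_tensors="pt")
summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```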

Intended uses & limitations

More information needed

Training and evaluation data

Training and evaluation use the MLSUM dataset; the model name indicates the Turkish split. Preprocessing details and split sizes are not documented.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 1
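With lr_scheduler_type: cosine and no warmup, the learning rate decays from 3e-05 toward zero over the single epoch. A minimal sketch of that decay curve (the total step count is an estimate extrapolated from the training log, not stated in the card):

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 3e-5) -> float:
    # Half-period cosine decay from base_lr at step 0 down to 0 at total_steps,
    # i.e. a cosine schedule with no warmup.
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 12464  # rough steps per epoch, extrapolated from 12000 steps at epoch 0.9628
print(cosine_lr(0, total))          # starts at the configured 3e-05
print(cosine_lr(total // 2, total)) # roughly half the base rate at mid-epoch
print(cosine_lr(total, total))      # decays to ~0 by the end of training
```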

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.6967        | 0.0802 | 1000  | 1.3809          | 17.1243 | 10.8379 | 16.3406 | 16.4851   | 19.9931 |
| 1.5702        | 0.1605 | 2000  | 1.3652          | 17.1131 | 10.9041 | 16.3464 | 16.4593   | 20.0    |
| 1.5445        | 0.2407 | 3000  | 1.3527          | 17.1524 | 11.0021 | 16.417  | 16.5102   | 20.0    |
| 1.5434        | 0.3209 | 4000  | 1.3503          | 17.102  | 10.9546 | 16.3797 | 16.4841   | 20.0    |
| 1.5155        | 0.4012 | 5000  | 1.3451          | 17.1088 | 10.9964 | 16.3958 | 16.4918   | 20.0    |
| 1.5344        | 0.4814 | 6000  | 1.3392          | 17.1348 | 10.9911 | 16.4023 | 16.4937   | 20.0    |
| 1.5197        | 0.5616 | 7000  | 1.3461          | 17.1305 | 11.0374 | 16.4125 | 16.5098   | 20.0    |
| 1.4971        | 0.6418 | 8000  | 1.3360          | 17.1411 | 11.0398 | 16.4332 | 16.5279   | 20.0    |
| 1.5239        | 0.7221 | 9000  | 1.3350          | 17.0932 | 10.9845 | 16.3934 | 16.498    | 20.0    |
| 1.5081        | 0.8023 | 10000 | 1.3347          | 17.1313 | 11.0259 | 16.4294 | 16.5297   | 20.0    |
| 1.4829        | 0.8825 | 11000 | 1.3312          | 17.1703 | 11.0584 | 16.4678 | 16.5651   | 20.0    |
| 1.5018        | 0.9628 | 12000 | 1.3322          | 17.1573 | 11.0617 | 16.4608 | 16.5524   | 20.0    |
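The Rouge1, Rouge2, and Rougel columns report unigram, bigram, and longest-common-subsequence overlap between generated and reference summaries. A simplified ROUGE-1 F1 sketch for intuition (whitespace tokenisation only; the reported scores come from a proper ROUGE implementation such as the rouge_score package, typically with stemming):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    # Clipped unigram overlap, then the harmonic mean of precision and recall.
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat", "the cat sat on the mat"))  # ~0.667
```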

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.0
  • PyTorch 2.2.2
  • Datasets 3.2.0
  • Tokenizers 0.21.0
