First training run with the ul2-small-dutch model (no prompt-tuning, full model finetuning) on the increased dataset
metadata
library_name: transformers
license: apache-2.0
base_model: yhavinga/ul2-small-dutch
tags:
  - generated_from_trainer
model-index:
  - name: ul2-small-dutch-finetuned-oba-book-search-1
    results: []

ul2-small-dutch-finetuned-oba-book-search-1

This model is a fine-tuned version of yhavinga/ul2-small-dutch on an unspecified dataset. It achieves the following results on the evaluation set at the final evaluation step (step 17500):

  • Loss: 3.2663
  • Top-5-accuracy: 0.0597

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 3
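
To make the scheduler settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the listed `learning_rate` (0.3) and `lr_scheduler_warmup_steps` (1000). The function name `cosine_lr_with_warmup` and the `total_steps` value of 17691 are assumptions: the total is estimated from the epoch and step columns of the results table, not stated in the card, and the formula is the common half-cosine decay, which may differ in detail from what the training run actually used.

```python
import math


def cosine_lr_with_warmup(step, base_lr=0.3, warmup_steps=1000, total_steps=17691):
    """Learning rate at a given optimizer step.

    total_steps=17691 is an estimate (about 5897 steps per epoch * 3 epochs),
    not a value reported in the model card.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Print the schedule at a few illustrative steps.
for step in (0, 500, 1000, 9000, 17691):
    print(step, round(cosine_lr_with_warmup(step), 4))
```

Under these assumptions the learning rate peaks at step 1000 and then decays for the rest of training.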

Training results

| Training Loss | Epoch | Step | Validation Loss | Top-5-accuracy |
|---|---|---|---|---|
| 6.4124 | 0.0848 | 500 | 6.3536 | 0.0 |
| 6.7952 | 0.1696 | 1000 | 6.8349 | 0.0 |
| 6.7698 | 0.2544 | 1500 | 6.5339 | 0.0 |
| 6.8332 | 0.3392 | 2000 | 6.5288 | 0.0 |
| 6.7705 | 0.4239 | 2500 | 7.1165 | 0.0 |
| 6.997 | 0.5087 | 3000 | 6.8968 | 0.0 |
| 6.7059 | 0.5935 | 3500 | 6.8425 | 0.0 |
| 6.8888 | 0.6783 | 4000 | 6.3301 | 0.0 |
| 6.4895 | 0.7631 | 4500 | 6.7270 | 0.0 |
| 6.6173 | 0.8479 | 5000 | 6.1497 | 0.0796 |
| 6.2961 | 0.9327 | 5500 | 5.9938 | 0.0796 |
| 6.2324 | 1.0175 | 6000 | 5.8314 | 0.0 |
| 6.3068 | 1.1023 | 6500 | 6.1160 | 0.0 |
| 5.9052 | 1.1870 | 7000 | 5.8330 | 0.0 |
| 6.2636 | 1.2718 | 7500 | 5.7442 | 0.0 |
| 5.9638 | 1.3566 | 8000 | 5.9032 | 0.0 |
| 5.4903 | 1.4414 | 8500 | 5.2657 | 0.0 |
| 5.3572 | 1.5262 | 9000 | 5.3021 | 0.0 |
| 5.268 | 1.6110 | 9500 | 4.8651 | 0.0 |
| 5.1901 | 1.6958 | 10000 | 4.8621 | 0.0 |
| 4.9671 | 1.7806 | 10500 | 4.8482 | 0.0 |
| 4.8376 | 1.8654 | 11000 | 4.7743 | 0.0 |
| 4.7365 | 1.9501 | 11500 | 5.2166 | 0.0 |
| 4.529 | 2.0349 | 12000 | 4.4299 | 0.0796 |
| 4.3585 | 2.1197 | 12500 | 4.2365 | 0.0199 |
| 4.2613 | 2.2045 | 13000 | 4.0390 | 0.0 |
| 4.1379 | 2.2893 | 13500 | 3.8705 | 0.1791 |
| 3.9627 | 2.3741 | 14000 | 3.6937 | 0.1791 |
| 3.8334 | 2.4589 | 14500 | 3.6306 | 0.0 |
| 3.6814 | 2.5437 | 15000 | 3.5674 | 0.2388 |
| 3.6308 | 2.6285 | 15500 | 3.4017 | 0.1990 |
| 3.5401 | 2.7132 | 16000 | 3.3286 | 0.0597 |
| 3.5336 | 2.7980 | 16500 | 3.2694 | 0.0199 |
| 3.4879 | 2.8828 | 17000 | 3.2742 | 0.0398 |
| 3.4277 | 2.9676 | 17500 | 3.2663 | 0.0597 |
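
The epoch and step columns also allow a rough estimate of the training-set size, which the card does not state. The sketch below works from the last logged row (step 17500 at epoch 2.9676) and the listed `train_batch_size` of 16. It assumes a single device and no gradient accumulation, neither of which the card confirms, so the example count is an approximation. All variable names are illustrative.

```python
# Values taken from the last row of the results table and the hyperparameter list.
last_step = 17500
last_epoch = 2.9676
train_batch_size = 16
num_epochs = 3

# Optimizer steps per epoch, inferred from the step/epoch ratio.
steps_per_epoch = round(last_step / last_epoch)

# Estimated total optimizer steps over the full run.
total_steps = steps_per_epoch * num_epochs

# Approximate number of training examples.
# Assumption: one device, no gradient accumulation; the final batch of an
# epoch may be partial, so this is a slight overestimate at most.
approx_train_examples = steps_per_epoch * train_batch_size

print("steps per epoch:", steps_per_epoch)
print("estimated total steps:", total_steps)
print("approximate training examples:", approx_train_examples)
```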

Framework versions

  • Transformers 4.44.2
  • PyTorch 1.13.0+cu116
  • Datasets 3.0.0
  • Tokenizers 0.19.1