barthez-deft-chimie

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model is one of the preliminary experiments, and it underperforms the models published in the paper, which use MBartHez with HAL/Wiki pre-training and copy mechanisms.
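Since the card gives no usage instructions, the following is a minimal sketch of how a BARThez fine-tune like this one could be loaded with the Transformers sequence-to-sequence classes. The card does not state the full Hub identifier, so `MODEL_ID` is a placeholder, and the decoding settings (`max_length`, `num_beams`) and the 512-token input truncation are illustrative assumptions rather than the settings used for the reported scores.

```python
# Hypothetical Hub id: the card gives only the model name, not the owner
# namespace, so replace this with the full "<namespace>/barthez-deft-chimie".
MODEL_ID = "barthez-deft-chimie"


def generation_kwargs(max_length: int = 64, num_beams: int = 4) -> dict:
    """Decoding settings; both defaults are illustrative assumptions."""
    return {"max_length": max_length, "num_beams": num_beams, "early_stopping": True}


def generate(text: str, model_id: str = MODEL_ID, **overrides) -> str:
    """Run the fine-tuned model on one input text and return the decoded output."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncating to 512 tokens is an assumption, not a documented limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **{**generation_kwargs(), **overrides},
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with a French input text downloads the weights on first use; pass `max_length=...` or `num_beams=...` to override the assumed decoding settings.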

It achieves the following results on the evaluation set:

  • Loss: 2.0710
  • Rouge1: 31.8947
  • Rouge2: 16.7563
  • Rougel: 23.5428
  • Rougelsum: 23.4918
  • Gen Len: 38.5256
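To make the ROUGE figures above easier to read, here is a simplified sketch of a ROUGE-1 F1 score on the same 0 to 100 scale. It assumes plain whitespace tokenization and lowercasing; the card does not say which ROUGE implementation or preprocessing produced the reported numbers, so this illustrates the idea and will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a generated text, scaled to 0-100."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

ROUGE-2 applies the same computation to bigrams, and ROUGE-L replaces the n-gram overlap with the longest common subsequence.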

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
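As a rough illustration of what the `linear` scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (none are listed above) and takes the total of 2360 steps from the final row of the training results table; the training-set size estimate further assumes a single device and no gradient accumulation.

```python
BASE_LR = 3e-05
STEPS_PER_EPOCH = 118   # step count after epoch 1 in the results table
NUM_EPOCHS = 20
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 2360, matching the last table row
WARMUP_STEPS = 0        # assumption: the card lists no warmup setting
TRAIN_BATCH_SIZE = 4

# Upper bound on training examples, assuming one device and no gradient accumulation.
MAX_TRAIN_EXAMPLES = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the rate starts at 3e-05, is halved by the end of epoch 10 (step 1180), and reaches zero at step 2360.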

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 3.8022        | 1.0   | 118  | 2.5491          | 16.8208 | 7.0027  | 13.957  | 14.0479   | 19.1538 |
| 2.9286        | 2.0   | 236  | 2.3074          | 17.5356 | 7.8717  | 14.4874 | 14.5044   | 19.9487 |
| 2.5422        | 3.0   | 354  | 2.2322          | 19.6491 | 9.4156  | 15.9467 | 15.9433   | 19.7051 |
| 2.398         | 4.0   | 472  | 2.1500          | 18.7166 | 9.859   | 15.7535 | 15.8036   | 19.9231 |
| 2.2044        | 5.0   | 590  | 2.1372          | 19.978  | 10.6235 | 16.1348 | 16.1274   | 19.6154 |
| 1.9405        | 6.0   | 708  | 2.0992          | 20.226  | 10.551  | 16.6928 | 16.7211   | 19.9744 |
| 1.8544        | 7.0   | 826  | 2.0841          | 19.8869 | 10.8456 | 16.1072 | 16.097    | 19.8846 |
| 1.7536        | 8.0   | 944  | 2.0791          | 19.3017 | 9.4921  | 16.1541 | 16.2167   | 19.859  |
| 1.6914        | 9.0   | 1062 | 2.0710          | 21.3848 | 10.4088 | 17.1963 | 17.2254   | 19.8846 |
| 1.654         | 10.0  | 1180 | 2.1069          | 22.3811 | 10.7987 | 18.7595 | 18.761    | 19.9231 |
| 1.5899        | 11.0  | 1298 | 2.0919          | 20.8546 | 10.6958 | 16.8637 | 16.9499   | 19.8077 |
| 1.4661        | 12.0  | 1416 | 2.1065          | 22.3677 | 11.7472 | 18.262  | 18.3      | 19.9744 |
| 1.4205        | 13.0  | 1534 | 2.1164          | 20.5845 | 10.7825 | 16.9972 | 17.0216   | 19.9359 |
| 1.3797        | 14.0  | 1652 | 2.1240          | 22.2561 | 11.303  | 17.5064 | 17.5815   | 19.9744 |
| 1.3724        | 15.0  | 1770 | 2.1187          | 23.2825 | 11.912  | 18.5208 | 18.5499   | 19.9359 |
| 1.3404        | 16.0  | 1888 | 2.1394          | 22.1305 | 10.5258 | 17.772  | 17.8202   | 19.9744 |
| 1.2846        | 17.0  | 2006 | 2.1502          | 21.567  | 11.0557 | 17.2562 | 17.2974   | 20.0    |
| 1.2871        | 18.0  | 2124 | 2.1572          | 22.5871 | 11.702  | 18.2906 | 18.3826   | 19.9744 |
| 1.2422        | 19.0  | 2242 | 2.1613          | 23.0935 | 11.6824 | 18.6087 | 18.6777   | 19.9744 |
| 1.2336        | 20.0  | 2360 | 2.1581          | 22.6789 | 11.4363 | 18.1661 | 18.2346   | 19.9487 |
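The table can be read programmatically to see which epoch is best under each criterion. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table; the selection logic is an illustration, not necessarily how the published checkpoint was chosen. It does show that the evaluation loss of 2.0710 reported at the top of this card matches epoch 9.

```python
# (epoch, validation loss, Rouge1), copied from the training results table.
RESULTS = [
    (1, 2.5491, 16.8208), (2, 2.3074, 17.5356), (3, 2.2322, 19.6491),
    (4, 2.1500, 18.7166), (5, 2.1372, 19.978), (6, 2.0992, 20.226),
    (7, 2.0841, 19.8869), (8, 2.0791, 19.3017), (9, 2.0710, 21.3848),
    (10, 2.1069, 22.3811), (11, 2.0919, 20.8546), (12, 2.1065, 22.3677),
    (13, 2.1164, 20.5845), (14, 2.1240, 22.2561), (15, 2.1187, 23.2825),
    (16, 2.1394, 22.1305), (17, 2.1502, 21.567), (18, 2.1572, 22.5871),
    (19, 2.1613, 23.0935), (20, 2.1581, 22.6789),
]

# Lowest validation loss and highest Rouge1 can point to different epochs.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]:.4f})")
print(f"Highest Rouge1: epoch {best_by_rouge1[0]} ({best_by_rouge1[2]:.4f})")
```

Validation loss bottoms out at epoch 9 and rises slowly afterwards while training loss keeps falling, whereas Rouge1 peaks later, at epoch 15.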

Framework versions

  • Transformers 4.10.2
  • PyTorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3