Whisper Large V3 GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-large-v3 on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set:

  • Loss: 1.0552
  • Bleu: 11.86
  • Chrf: 28.37
  • Wer: 127.1049
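
A Wer value above 100 can look like a reporting error. Assuming Wer here is the usual word error rate, that is (substitutions + deletions + insertions) divided by the number of reference words, expressed as a percentage, it has no upper bound: a hypothesis much longer than its reference contributes many insertions. The sketch below is a plain-Python illustration of that definition, not the evaluation script used for this model; the function name word_error_rate and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the reference words seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference word missing from hypothesis
                current[j - 1] + 1,      # insertion: extra word in hypothesis
                previous[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        previous = current
    return 100.0 * previous[-1] / len(ref)


# Hypothetical example: the hypothesis repeats itself, so insertions
# outnumber the words in the reference.
reference = "good morning"
hypothesis = "good morning good morning good morning"
print(word_error_rate(reference, hypothesis))
```

Under this definition, repeated or runaway output on even a few evaluation segments can push the corpus-level figure past 100 while Bleu and Chrf remain non-zero.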

Model description

More information needed

Intended uses & limitations

More information needed
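
Although this section is not yet filled in, the title and the base model suggest the model takes Irish (GA) speech and produces English (EN) text. The following is a minimal sketch of that use, assuming the checkpoint loads through the Transformers pipeline API in the same way as other Whisper fine-tunes, which this card does not confirm. The function name translate_irish_speech is hypothetical, and passing a file path assumes ffmpeg is installed for audio decoding.

```python
MODEL_ID = "ymoslem/whisper-large-v3-ga2en-v3.1.0-r"


def translate_irish_speech(audio_path: str) -> str:
    """Return the model's English text for one Irish audio file."""
    # Imported lazily so the heavy dependency loads only when the function runs.
    from transformers import pipeline

    # Whisper checkpoints are served through the automatic-speech-recognition task.
    translator = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = translator(audio_path)
    return result["text"]
```

A call would look like translate_irish_speech("sample_ga.wav"), where the file name is a placeholder. Given the evaluation scores reported above, output should be reviewed before it is relied on.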

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 8000
  • mixed_precision_training: Native AMP
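
To make the scheduler settings concrete, the sketch below computes the learning rate implied by a linear schedule with warmup, using the values listed above. It assumes the warmup length is the warmup ratio times the total number of steps, rounded up, and that the rate then decays linearly to zero at the last step; the card does not spell out these details, and the function name learning_rate_at is introduced here for illustration.

```python
import math

LEARNING_RATE = 1e-4   # learning_rate
TRAINING_STEPS = 8000  # training_steps
WARMUP_RATIO = 0.03    # lr_scheduler_warmup_ratio

# Assumption: warmup length is the ratio times the total step count, rounded up.
WARMUP_STEPS = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def learning_rate_at(step: int) -> float:
    """Learning rate after `step` updates under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: rise linearly from 0 to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


for step in (0, 120, 240, 4000, 8000):
    print(step, learning_rate_at(step))
```

With these settings the warmup covers the first 240 of the 8000 steps, so the peak rate of 0.0001 is reached between the second and third evaluation points in the training results.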

Training results

Training Loss Epoch Step Bleu Chrf Validation Loss Wer
2.5918 0.0138 100 0.61 8.48 2.1791 238.2260
2.476 0.0276 200 0.63 10.43 2.1702 275.7317
2.2358 0.0414 300 4.76 19.98 2.0420 120.0810
2.1778 0.0552 400 2.78 12.85 1.9506 86.8528
1.9779 0.0690 500 4.53 18.47 1.8609 137.1905
1.9435 0.0828 600 6.67 22.37 1.7726 82.4403
1.7928 0.0966 700 4.54 17.32 1.7445 133.8586
1.9004 0.1103 800 1.58 12.65 1.7290 195.2724
1.7856 0.1241 900 4.84 17.5 1.6990 83.9262
1.6783 0.1379 1000 8.46 24.24 1.6329 113.5074
1.6095 0.1517 1100 7.35 20.22 1.6083 102.5214
1.6328 0.1655 1200 11.46 25.29 1.5267 76.5871
1.6093 0.1793 1300 6.51 17.77 1.4947 112.4719
1.5776 0.1931 1400 6.21 19.86 1.4952 90.6348
1.4767 0.2069 1500 4.86 19.57 1.4515 145.1148
1.3447 0.2207 1600 6.77 19.96 1.3974 90.5448
1.3273 0.2345 1700 4.77 16.31 1.4323 152.1837
1.4253 0.2483 1800 3.95 15.66 1.3598 173.2553
1.3505 0.2621 1900 11.25 23.4 1.3517 80.3692
1.2593 0.2759 2000 12.71 26.55 1.3236 77.5777
1.2483 0.2897 2100 17.88 32.0 1.2825 73.3003
1.161 0.3034 2200 10.08 20.69 1.2567 115.8937
1.1597 0.3172 2300 8.61 19.54 1.2581 93.8766
1.0937 0.3310 2400 12.37 25.67 1.2577 99.0095
1.0606 0.3448 2500 6.46 23.47 1.2228 172.9401
1.039 0.3586 2600 9.55 21.56 1.2186 89.7794
1.0193 0.3724 2700 3.08 17.58 1.1844 281.8100
1.1153 0.3862 2800 2.69 18.38 1.1693 350.2927
1.012 0.4 2900 3.56 14.74 1.1233 194.9122
0.8936 0.4138 3000 5.21 17.38 1.1161 158.3521
0.8893 0.4276 3100 11.52 25.02 1.1119 80.9095
0.9491 0.4414 3200 5.93 20.91 1.1213 174.0207
0.9233 0.4552 3300 5.54 20.95 1.0656 186.2224
0.8915 0.4690 3400 7.26 23.99 1.0736 155.6506
0.8296 0.4828 3500 6.74 21.46 1.0461 146.1054
0.8163 0.4966 3600 11.35 24.11 1.0706 101.8010
0.8115 0.5103 3700 12.84 26.92 1.0199 115.8487
0.8245 0.5241 3800 12.47 24.29 1.0163 101.9361
0.7988 0.5379 3900 15.29 28.54 0.9891 92.7960
0.769 0.5517 4000 15.23 28.15 0.9885 92.7060
0.9048 0.5655 4100 11.58 25.38 1.1588 84.6466
1.015 0.5793 4200 8.93 18.79 1.1907 86.6276
0.9254 0.5931 4300 7.96 20.76 1.1832 80.2792
0.9458 0.6069 4400 12.03 25.59 1.1789 82.6204
0.9783 0.6207 4500 7.62 20.23 1.1607 100.8555
0.9935 0.6345 4600 8.89 21.49 1.2477 81.7650
0.9747 0.6483 4700 14.51 28.26 1.1994 76.5421
0.9794 0.6621 4800 16.11 27.49 1.1219 81.1796
0.8919 0.6759 4900 5.19 19.48 1.1540 139.9820
0.8333 0.6897 5000 9.38 20.8 1.1388 84.6015
0.9083 0.7034 5100 6.71 22.08 1.1244 176.0018
0.8039 0.7172 5200 11.42 21.77 1.1072 107.2040
0.8064 0.7310 5300 8.89 17.34 1.0705 122.8276
0.8319 0.7448 5400 7.64 24.95 1.0968 170.0585
0.7984 0.7586 5500 10.44 24.66 1.1110 79.2886
0.7288 0.7724 5600 10.4 23.09 1.0820 82.5754
0.8128 0.7862 5700 12.13 25.86 1.1287 96.9833
0.7016 0.8 5800 4.84 21.49 1.0698 207.7893
0.7456 0.8138 5900 5.53 22.33 1.0809 204.9077
0.7575 0.8276 6000 6.24 27.03 1.0611 196.4430
0.6076 0.8414 6100 7.93 22.14 1.0868 134.7591
0.6913 0.8552 6200 8.25 19.46 1.0786 84.1963
0.6251 0.8690 6300 8.69 21.0 1.0372 83.4309
0.6357 0.8828 6400 13.83 25.16 1.0408 83.2508
0.666 0.8966 6500 9.45 21.12 1.0528 101.8910
0.6397 0.9103 6600 8.21 20.5 1.0394 118.1450
0.6475 0.9241 6700 4.72 20.26 1.0438 191.9856
0.642 0.9379 6800 4.84 21.12 1.0421 200.1801
0.6867 0.9517 6900 5.44 21.48 1.0231 214.3629
0.5254 0.9655 7000 9.96 24.2 1.0436 131.6074
0.599 0.9793 7100 16.23 30.07 1.0231 86.4475
0.6589 0.9931 7200 12.51 26.46 1.0365 107.5191
0.3222 1.0069 7300 9.22 24.16 1.0790 131.4723
0.3309 1.0207 7400 7.17 25.54 1.1012 166.2314
0.3402 1.0345 7500 14.56 28.4 1.0839 98.1990
0.3004 1.0483 7600 15.49 29.84 1.0615 104.0522
0.2561 1.0621 7700 11.72 28.66 1.0724 125.1688
0.3021 1.0759 7800 10.85 28.55 1.0592 130.3917
0.2932 1.0897 7900 11.62 28.17 1.0554 123.8631
0.2619 1.1034 8000 11.86 28.37 1.0552 127.1049
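
Bleu, Chrf and Wer swing widely between neighbouring evaluations while validation loss moves more smoothly, so the final step is not necessarily the strongest checkpoint on every metric. The sketch below assumes one simply wants to compare logged evaluations by a chosen metric. The EvalRow type is introduced here for illustration, and only four rows are copied from the table.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    bleu: float
    chrf: float
    validation_loss: float
    wer: float


# A few rows copied from the table above (Step, Bleu, Chrf, Validation Loss, Wer).
ROWS = [
    EvalRow(2100, 17.88, 32.0, 1.2825, 73.3003),
    EvalRow(4000, 15.23, 28.15, 0.9885, 92.7060),
    EvalRow(7100, 16.23, 30.07, 1.0231, 86.4475),
    EvalRow(8000, 11.86, 28.37, 1.0552, 127.1049),
]

# Higher is better for Bleu; lower is better for validation loss.
best_by_bleu = max(ROWS, key=lambda row: row.bleu)
best_by_loss = min(ROWS, key=lambda row: row.validation_loss)

print("highest Bleu at step", best_by_bleu.step)
print("lowest validation loss at step", best_by_loss.step)
```

The two criteria pick different steps even on this small sample, which is worth keeping in mind when reading the headline numbers reported for step 8000.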

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2+git70dfd51
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 1.54B params (Safetensors), tensor type F32

Model tree for ymoslem/whisper-large-v3-ga2en-v3.1.0-r

  • Base model: openai/whisper-large-v3 (this model is a fine-tune)

Evaluation results

Self-reported on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop:

  • Bleu: 11.860
  • Wer: 127.105