hubert-large-ll60k-librispeech-clean-100h-demo-dist

This model was trained from scratch on a dataset whose name was not recorded (the field reads None). It achieves the following results on the evaluation set at the final training step (epoch 50.0, step 5600):

  • Cer: 0.0316
  • Loss: 0.2143
  • Wer: 0.0995
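
For readers unfamiliar with the two error rates above, the following is a minimal sketch of how WER (word error rate) and CER (character error rate) are conventionally defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length in words or characters. The card does not say which implementation produced the reported numbers, so the function names here (`edit_distance`, `wer`, `cer`) are illustrative and the per-utterance normalization is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical transcripts, not taken from the evaluation set.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

Evaluation toolkits usually report a corpus-level figure by summing edits and reference lengths over all utterances before dividing, which can differ slightly from averaging per-utterance rates.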

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 256
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
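
The totals in the list above follow from the per-device values, and the linear scheduler can be written out in a few lines. The sketch below is only an illustration under stated assumptions: gradient accumulation is taken to be 1 (it is not listed on the card), the total step count of 5600 is read from the last row of the training results table, and `linear_lr` is a hypothetical helper that mirrors the usual "linear warmup, then linear decay to zero" shape rather than the exact code used in training.

```python
PER_DEVICE_TRAIN_BATCH = 32
PER_DEVICE_EVAL_BATCH = 8
NUM_DEVICES = 8
GRAD_ACCUM_STEPS = 1  # assumption: not listed among the hyperparameters

# Effective batch sizes across all devices.
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES

PEAK_LR = 3e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 5600  # final step in the training results table
NUM_EPOCHS = 50

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS


def linear_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * max(0.0, remaining)


print(total_train_batch, total_eval_batch, steps_per_epoch)
for step in (0, 250, 500, 3050, 5600):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate reaches its peak of 0.0003 at step 500, a little under five epochs in, and decays to zero at step 5600.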

Training results

| Training Loss | Epoch | Step | Cer    | Validation Loss | Wer    |
|---------------|-------|------|--------|-----------------|--------|
| 2.911         | 0.89  | 100  | 1.0    | 2.9202          | 1.0    |
| 2.6638        | 1.79  | 200  | 1.0    | 2.6310          | 1.0    |
| 0.3898        | 2.68  | 300  | 0.0968 | 0.3892          | 0.3366 |
| 0.2156        | 3.57  | 400  | 0.0591 | 0.2250          | 0.2090 |
| 0.1517        | 4.46  | 500  | 0.0474 | 0.1834          | 0.1695 |
| 0.1059        | 5.36  | 600  | 0.0428 | 0.1668          | 0.1502 |
| 0.0825        | 6.25  | 700  | 0.0393 | 0.1662          | 0.1406 |
| 0.0679        | 7.14  | 800  | 0.0393 | 0.1747          | 0.1357 |
| 0.0602        | 8.04  | 900  | 0.0390 | 0.1767          | 0.1334 |
| 0.0587        | 8.93  | 1000 | 0.0376 | 0.1708          | 0.1292 |
| 0.0517        | 9.82  | 1100 | 0.0372 | 0.1677          | 0.1255 |
| 0.0413        | 10.71 | 1200 | 0.0361 | 0.1771          | 0.1234 |
| 0.0418        | 11.61 | 1300 | 0.0358 | 0.1731          | 0.1229 |
| 0.0424        | 12.5  | 1400 | 0.0348 | 0.1796          | 0.1191 |
| 0.0469        | 13.39 | 1500 | 0.0358 | 0.1848          | 0.1207 |
| 0.0414        | 14.29 | 1600 | 0.0367 | 0.1863          | 0.1213 |
| 0.0338        | 15.18 | 1700 | 0.0347 | 0.1889          | 0.1177 |
| 0.0334        | 16.07 | 1800 | 0.0360 | 0.1900          | 0.1188 |
| 0.0315        | 16.96 | 1900 | 0.0346 | 0.1901          | 0.1158 |
| 0.0317        | 17.86 | 2000 | 0.0341 | 0.1790          | 0.1134 |
| 0.0264        | 18.75 | 2100 | 0.0356 | 0.1864          | 0.1159 |
| 0.0271        | 19.64 | 2200 | 0.0341 | 0.1861          | 0.1150 |
| 0.0272        | 20.54 | 2300 | 0.0339 | 0.1945          | 0.1129 |
| 0.0278        | 21.43 | 2400 | 0.0343 | 0.1950          | 0.1131 |
| 0.0254        | 22.32 | 2500 | 0.0330 | 0.2015          | 0.1097 |
| 0.0204        | 23.21 | 2600 | 0.0326 | 0.1952          | 0.1069 |
| 0.0259        | 24.11 | 2700 | 0.0330 | 0.1976          | 0.1103 |
| 0.0325        | 25.0  | 2800 | 0.0328 | 0.1958          | 0.1088 |
| 0.0359        | 25.89 | 2900 | 0.0346 | 0.1908          | 0.1105 |
| 0.0265        | 26.79 | 3000 | 0.0337 | 0.1991          | 0.1096 |
| 0.0223        | 27.68 | 3100 | 0.0345 | 0.1948          | 0.1107 |
| 0.025         | 28.57 | 3200 | 0.0330 | 0.2046          | 0.1077 |
| 0.0242        | 29.46 | 3300 | 0.0335 | 0.2055          | 0.1072 |
| 0.0187        | 30.36 | 3400 | 0.0307 | 0.1980          | 0.1021 |
| 0.0219        | 31.25 | 3500 | 0.0322 | 0.1998          | 0.1054 |
| 0.0198        | 32.14 | 3600 | 0.0322 | 0.2104          | 0.1048 |
| 0.0181        | 33.04 | 3700 | 0.0325 | 0.2093          | 0.1050 |
| 0.0166        | 33.93 | 3800 | 0.0315 | 0.2120          | 0.1032 |
| 0.0212        | 34.82 | 3900 | 0.0300 | 0.2021          | 0.1003 |
| 0.0214        | 35.71 | 4000 | 0.0316 | 0.2045          | 0.1033 |
| 0.016         | 36.61 | 4100 | 0.0302 | 0.2022          | 0.1000 |
| 0.0169        | 37.5  | 4200 | 0.0299 | 0.2060          | 0.0996 |
| 0.0191        | 38.39 | 4300 | 0.0307 | 0.2114          | 0.1006 |
| 0.0218        | 39.29 | 4400 | 0.0314 | 0.2066          | 0.1015 |
| 0.0182        | 40.18 | 4500 | 0.0300 | 0.2054          | 0.0988 |
| 0.0185        | 41.07 | 4600 | 0.0303 | 0.2050          | 0.0994 |
| 0.0171        | 41.96 | 4700 | 0.0306 | 0.2136          | 0.0994 |
| 0.0171        | 42.86 | 4800 | 0.0318 | 0.2062          | 0.1007 |
| 0.0161        | 43.75 | 4900 | 0.0319 | 0.2101          | 0.1013 |
| 0.0168        | 44.64 | 5000 | 0.0306 | 0.2111          | 0.0985 |
| 0.015         | 45.54 | 5100 | 0.0318 | 0.2110          | 0.1003 |
| 0.0126        | 46.43 | 5200 | 0.0319 | 0.2086          | 0.0999 |
| 0.0153        | 47.32 | 5300 | 0.0310 | 0.2095          | 0.0981 |
| 0.0172        | 48.21 | 5400 | 0.0310 | 0.2130          | 0.0985 |
| 0.017         | 49.11 | 5500 | 0.0316 | 0.2137          | 0.0994 |
| 0.0152        | 50.0  | 5600 | 0.0316 | 0.2143          | 0.0995 |
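
The reported results come from the final step, which is not the best row under every metric. As a small sketch of checkpoint selection, the snippet below copies five rows from the table above (the `Eval` tuple and variable names are illustrative, not part of the training code) and picks the best step per metric. The selected steps are the same ones obtained when all rows of the table are considered.

```python
from collections import namedtuple

# (step, character error rate, validation loss, word error rate)
Eval = namedtuple("Eval", ["step", "cer", "val_loss", "wer"])

# A subset of rows copied from the training results table.
ROWS = [
    Eval(700, 0.0393, 0.1662, 0.1406),
    Eval(2000, 0.0341, 0.1790, 0.1134),
    Eval(4200, 0.0299, 0.2060, 0.0996),
    Eval(5300, 0.0310, 0.2095, 0.0981),
    Eval(5600, 0.0316, 0.2143, 0.0995),
]

best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_wer = min(ROWS, key=lambda r: r.wer)
best_by_cer = min(ROWS, key=lambda r: r.cer)

print(best_by_loss.step, best_by_wer.step, best_by_cer.step)  # prints: 700 5300 4200
```

Validation loss bottoms out at step 700 and rises afterwards, while WER and CER keep improving until much later, so which checkpoint counts as best depends on the selection metric.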

Framework versions

  • Transformers 4.39.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.8.0
  • Tokenizers 0.15.2

Model size: 315M params (Safetensors, tensor type F32)