Automatic Speech Recognition
ESPnet
Javanese
audio

RESULTS

Environments

  • date: Fri Jul 9 17:56:55 PDT 2021
  • python version: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
  • espnet version: espnet 0.10.0
  • pytorch version: pytorch 1.8.1+cu102
  • Git hash: 5830a6b49a60ae10b8c113a2b9635ec2273fbdab
    • Commit date: Fri Jul 9 08:36:41 2021 -0700

asr_train_asr_raw_bpe1000

WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_batch_size1_asr_model_valid.acc.best/dev_iban|473|11006|2.5|54.0|43.5|0.1|97.7|100.0|
|decode_asr_batch_size1_asr_model_valid.acc.best/java_test|1740|12117|81.9|16.4|1.7|0.9|19.0|52.3|
|decode_asr_batch_size1_asr_model_valid.acc.best/test_id_commonvoice|1643|9565|15.7|69.4|14.9|3.3|87.6|99.9|

CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_batch_size1_asr_model_valid.acc.best/dev_iban|473|67025|53.0|17.6|29.3|5.4|52.3|100.0|
|decode_asr_batch_size1_asr_model_valid.acc.best/java_test|1740|80419|95.4|2.6|2.0|0.8|5.4|52.3|
|decode_asr_batch_size1_asr_model_valid.acc.best/test_id_commonvoice|1643|61563|69.4|14.3|16.3|5.0|35.6|99.9|

TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_batch_size1_asr_model_valid.acc.best/dev_iban|473|22012|1.2|96.4|2.4|12.1|110.8|100.0|
|decode_asr_batch_size1_asr_model_valid.acc.best/java_test|1740|26604|84.6|10.6|4.8|1.2|16.6|52.3|
|decode_asr_batch_size1_asr_model_valid.acc.best/test_id_commonvoice|1643|27446|39.7|42.0|18.3|3.1|63.5|99.9|
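
As a reading aid for the tables above, the Err column can be cross-checked as Sub + Del + Ins, all expressed as percentages of the reference length. Because insertions are counted against the reference length, Err can exceed 100 (as in the TER row for dev_iban). The following is a minimal sketch, not part of the ESPnet recipe: the helper names `error_rate` and `check_rows` are hypothetical, and the 0.11 tolerance is an assumption that allows for each column having been rounded to one decimal independently.

```python
# Rows copied from the WER, CER and TER tables:
# (table, dataset, Corr, Sub, Del, Ins, Err), all values in percent.
ROWS = [
    ("WER", "dev_iban", 2.5, 54.0, 43.5, 0.1, 97.7),
    ("WER", "java_test", 81.9, 16.4, 1.7, 0.9, 19.0),
    ("WER", "test_id_commonvoice", 15.7, 69.4, 14.9, 3.3, 87.6),
    ("CER", "dev_iban", 53.0, 17.6, 29.3, 5.4, 52.3),
    ("CER", "java_test", 95.4, 2.6, 2.0, 0.8, 5.4),
    ("CER", "test_id_commonvoice", 69.4, 14.3, 16.3, 5.0, 35.6),
    ("TER", "dev_iban", 1.2, 96.4, 2.4, 12.1, 110.8),
    ("TER", "java_test", 84.6, 10.6, 4.8, 1.2, 16.6),
    ("TER", "test_id_commonvoice", 39.7, 42.0, 18.3, 3.1, 63.5),
]


def error_rate(sub, dele, ins):
    """Total error rate in percent: substitutions + deletions + insertions."""
    return round(sub + dele + ins, 1)


def check_rows(rows, tol=0.11):
    """Return (table, dataset) pairs whose reported Err differs from
    Sub + Del + Ins by more than tol (assumed rounding slack)."""
    mismatches = []
    for table, dataset, _corr, sub, dele, ins, err in rows:
        if abs(error_rate(sub, dele, ins) - err) > tol:
            mismatches.append((table, dataset))
    return mismatches


if __name__ == "__main__":
    # An empty list means every row is consistent within the tolerance.
    print(check_rows(ROWS))
```

Rows where the recomputed sum differs from the reported Err by 0.1 (for example WER on dev_iban) are consistent with rounding rather than with an error in the table.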