
Whisper Small Xhosa - Beijuka Bruno

This model is a fine-tuned version of openai/whisper-small on the NCHLT_speech_corpus/Xhosa dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6980
  • Model Preparation Time: 0.0077
  • Wer: 68.2920
  • Cer: 30.3879
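The WER and CER figures above are the standard edit-distance metrics: word error rate is the word-level Levenshtein distance between reference and hypothesis divided by the number of reference words, and character error rate is the same computation over characters (evaluation pipelines typically delegate this to a library such as `jiwer`). A minimal self-contained sketch of that definition, with made-up illustrative strings:

```python
def edit_distance(ref, hyp):
    # Levenshtein distance between two token sequences via dynamic programming.
    m, n = len(ref), len(hyp)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # i deletions
    for j in range(n + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # deletion
                           dp[i][j - 1] + 1,      # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[m][n]

def wer(reference, hypothesis):
    # Word error rate in percent, as reported in the table above.
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    # Character error rate in percent: same distance, over characters.
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)
```

One substitution in a three-word reference yields a WER of 33.33; an identical hypothesis scores 0.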

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 100
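The hyperparameters above map directly onto a Transformers `Seq2SeqTrainingArguments` configuration. A minimal sketch (the `output_dir` name is an assumption; data-loading and `Seq2SeqTrainer` setup are omitted):

```python
from transformers import Seq2SeqTrainingArguments

# Configuration mirroring the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-xhosa",  # assumed name, not from the card
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
)
```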

Training results

| Training Loss | Epoch | Step  | Validation Loss | Model Preparation Time | Wer     | Cer     |
|:-------------:|:-----:|:-----:|:---------------:|:----------------------:|:-------:|:-------:|
| 0.8602        | 1.0   | 671   | 0.5255          | 0.0077                 | 41.0282 | 8.8247  |
| 0.1398        | 2.0   | 1342  | 0.4073          | 0.0077                 | 31.3360 | 7.2621  |
| 0.0391        | 3.0   | 2013  | 0.3998          | 0.0077                 | 27.0138 | 6.4473  |
| 0.015         | 4.0   | 2684  | 0.3925          | 0.0077                 | 26.7518 | 6.1563  |
| 0.0081        | 5.0   | 3355  | 0.4090          | 0.0077                 | 24.4597 | 5.5205  |
| 0.0045        | 6.0   | 4026  | 0.4028          | 0.0077                 | 23.8376 | 5.5429  |
| 0.0029        | 7.0   | 4697  | 0.4145          | 0.0077                 | 23.6739 | 5.5518  |
| 0.0024        | 8.0   | 5368  | 0.4073          | 0.0077                 | 23.2482 | 5.4936  |
| 0.0024        | 9.0   | 6039  | 0.4387          | 0.0077                 | 24.0013 | 5.5876  |
| 0.0034        | 10.0  | 6710  | 0.4386          | 0.0077                 | 24.7872 | 5.9861  |
| 0.004         | 11.0  | 7381  | 0.4363          | 0.0077                 | 23.4447 | 5.4265  |
| 0.0027        | 12.0  | 8052  | 0.4621          | 0.0077                 | 23.0190 | 5.5966  |
| 0.0021        | 13.0  | 8723  | 0.4396          | 0.0077                 | 23.5102 | 5.6638  |
| 0.0048        | 14.0  | 9394  | 0.4492          | 0.0077                 | 24.0341 | 5.8608  |
| 0.0033        | 15.0  | 10065 | 0.4337          | 0.0077                 | 22.3314 | 5.4578  |
| 0.002         | 16.0  | 10736 | 0.4442          | 0.0077                 | 23.1827 | 5.5787  |
| 0.0024        | 17.0  | 11407 | 0.4648          | 0.0077                 | 22.8225 | 5.6056  |
| 0.0037        | 18.0  | 12078 | 0.4663          | 0.0077                 | 24.3615 | 6.1428  |
| 0.0022        | 19.0  | 12749 | 0.4539          | 0.0077                 | 24.0013 | 7.9158  |
| 0.0018        | 20.0  | 13420 | 0.4558          | 0.0077                 | 21.7420 | 5.4802  |
| 0.0008        | 21.0  | 14091 | 0.4777          | 0.0077                 | 22.5606 | 5.4488  |
| 0.0032        | 22.0  | 14762 | 0.4593          | 0.0077                 | 22.7898 | 5.9906  |
| 0.0018        | 23.0  | 15433 | 0.4592          | 0.0077                 | 24.2305 | 8.0949  |
| 0.002         | 24.0  | 16104 | 0.4815          | 0.0077                 | 23.7394 | 6.2145  |
| 0.0019        | 25.0  | 16775 | 0.5052          | 0.0077                 | 21.6110 | 5.6503  |
| 0.0008        | 26.0  | 17446 | 0.4684          | 0.0077                 | 22.7570 | 6.9129  |
| 0.0003        | 27.0  | 18117 | 0.4699          | 0.0077                 | 21.2181 | 5.3772  |
| 0.0024        | 28.0  | 18788 | 0.4757          | 0.0077                 | 23.7721 | 5.9145  |
| 0.0017        | 29.0  | 19459 | 0.4717          | 0.0077                 | 22.0367 | 5.3459  |
| 0.0009        | 30.0  | 20130 | 0.4779          | 0.0077                 | 23.4119 | 5.6951  |
| 0.0009        | 31.0  | 20801 | 0.4776          | 0.0077                 | 22.6588 | 5.7085  |
| 0.0022        | 32.0  | 21472 | 0.4808          | 0.0077                 | 23.6084 | 5.8428  |
| 0.0015        | 33.0  | 22143 | 0.5002          | 0.0077                 | 23.7394 | 6.0309  |
| 0.0013        | 34.0  | 22814 | 0.4833          | 0.0077                 | 22.4623 | 5.7041  |
| 0.001         | 35.0  | 23485 | 0.4900          | 0.0077                 | 22.4951 | 5.4265  |
| 0.0006        | 36.0  | 24156 | 0.4907          | 0.0077                 | 22.3641 | 5.4847  |
| 0.0004        | 37.0  | 24827 | 0.4819          | 0.0077                 | 21.9057 | 5.5071  |
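When training runs this long, a common follow-up is to select the checkpoint with the lowest validation WER rather than the final one. A minimal sketch using a few (epoch, WER) pairs excerpted from the table above:

```python
# (epoch, validation WER) pairs excerpted from the training results table.
history = [
    (20, 21.7420),
    (25, 21.6110),
    (27, 21.2181),  # lowest WER in the full table
    (29, 22.0367),
    (37, 21.9057),  # final logged epoch
]

# Pick the epoch whose checkpoint had the lowest validation WER.
best_epoch, best_wer = min(history, key=lambda row: row[1])
```

On the full table, this selects epoch 27 (WER 21.2181) rather than the last epoch.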

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.1.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size: 242M parameters (Safetensors, F32)

Model tree for asr-africa/whisper_NCHLT_speech_corpus_Xhosa_5hr_v1
