---
tags:
- generated_from_keras_callback
model-index:
- name: wav2vec2-xls-r-300m-mixed
  results: []
---

# wav2vec2-xls-r-300m-mixed

Fine-tuned from https://huggingface.co/facebook/wav2vec2-xls-r-300m on https://github.com/huseinzol05/malaya-speech/tree/master/data/mixed-stt

This model was fine-tuned on 3 languages,

1. Malay
2. Singlish
3. Mandarin

**This model was trained on a single RTX 3090 Ti with 24 GB of VRAM, provided by https://mesolitica.com/**.

## Evaluation set

The evaluation set comes from https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt with sizes,

```
len(malay), len(singlish), len(mandarin)
-> (765, 3579, 614)
```

The model achieves the following results on the evaluation set, computed in [evaluate-gpu.ipynb](evaluate-gpu.ipynb):

Mixed evaluation,

```
CER: 0.0481054244857041
WER: 0.1322198446007387
CER with LM: 0.041196586938584696
WER with LM: 0.09880169127621556
```
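For readers unfamiliar with the metrics, CER and WER are edit-distance rates over characters and words respectively. The following is a minimal sketch of the usual definitions, not the code from evaluate-gpu.ipynb: the function names `edit_distance`, `wer` and `cer` are illustrative, whitespace tokenization for words is an assumption (the notebook may tokenize differently, which matters especially for Mandarin), and the example sentence is made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes whitespace tokenization and a non-empty reference."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; spaces count as characters in this sketch."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical reference/hypothesis pair, for illustration only.
example_ref = "saya suka makan nasi"
example_hyp = "saya suka makan nasik"
example_wer = wer(example_ref, example_hyp)  # one wrong word out of four
example_cer = cer(example_ref, example_hyp)  # one extra character out of twenty
print(example_wer, example_cer)
```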

Malay evaluation,

```
CER: 0.051636391937588406
WER: 0.19561999547293663
CER with LM: 0.03917689630621449
WER with LM: 0.12710746406824835
```

Singlish evaluation,

```
CER: 0.0494915200071987
WER: 0.12763802881676573
CER with LM: 0.04271234986432335
WER with LM: 0.09677160640413336
```

Mandarin evaluation,

```
CER: 0.035626554824269824
WER: 0.07993515937860181
CER with LM: 0.03487760945087219
WER with LM: 0.07536807168546154
```
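The mixed scores can be cross-checked against the per-language scores. The sketch below assumes that each mixed figure is the per-language figure weighted by utterance count; this is not stated in the card and has not been checked against evaluate-gpu.ipynb. The names `sizes`, `per_language`, `mixed` and `weighted_mean` are illustrative only; the numbers are copied from this card. Under that assumption the recomputed values agree with the reported mixed values, which is consistent with averaging a per-utterance score over the pooled evaluation set.

```python
# Utterance counts and scores copied from this model card.
sizes = {"malay": 765, "singlish": 3579, "mandarin": 614}

per_language = {
    "CER": {"malay": 0.051636391937588406,
            "singlish": 0.0494915200071987,
            "mandarin": 0.035626554824269824},
    "WER": {"malay": 0.19561999547293663,
            "singlish": 0.12763802881676573,
            "mandarin": 0.07993515937860181},
    "CER with LM": {"malay": 0.03917689630621449,
                    "singlish": 0.04271234986432335,
                    "mandarin": 0.03487760945087219},
    "WER with LM": {"malay": 0.12710746406824835,
                    "singlish": 0.09677160640413336,
                    "mandarin": 0.07536807168546154},
}

mixed = {
    "CER": 0.0481054244857041,
    "WER": 0.1322198446007387,
    "CER with LM": 0.041196586938584696,
    "WER with LM": 0.09880169127621556,
}


def weighted_mean(scores, weights):
    """Mean of per-language scores, weighted by utterance count (an assumption)."""
    total = sum(weights.values())
    return sum(scores[lang] * weights[lang] for lang in weights) / total


recomputed = {metric: weighted_mean(scores, sizes)
              for metric, scores in per_language.items()}

for metric, value in recomputed.items():
    print(f"{metric}: recomputed={value:.6f} reported={mixed[metric]:.6f}")
```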

The language model is from https://huggingface.co/huseinzol05/language-model-bahasa-manglish-combined
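
To make the effect of the language model easier to read, the reported WER pairs can be turned into relative reductions. This is a small arithmetic sketch over the numbers in this card; the names `wer_scores` and `relative_reduction` are illustrative and are not part of any library.

```python
# (WER without LM, WER with LM), copied from the evaluation results in this card.
wer_scores = {
    "mixed": (0.1322198446007387, 0.09880169127621556),
    "malay": (0.19561999547293663, 0.12710746406824835),
    "singlish": (0.12763802881676573, 0.09677160640413336),
    "mandarin": (0.07993515937860181, 0.07536807168546154),
}


def relative_reduction(baseline, improved):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


wer_reduction = {split: relative_reduction(without_lm, with_lm)
                 for split, (without_lm, with_lm) in wer_scores.items()}

for split, reduction in wer_reduction.items():
    print(f"{split}: {reduction:.1%} relative WER reduction with LM")
```

On these figures the language model helps Malay the most (roughly 35% relative WER reduction) and Mandarin the least (roughly 6%), with about a quarter of the WER removed on the mixed set.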