metadata
datasets:
- mozilla-foundation/common_voice_17_0
- TalTechNLP/VoxLingua107
language:
- af
- ar
- ak
- en
- ee
- fr
- ha
- ig
- li
- ln
- mg
- xh
- yo
- zu
- pt
- wo
- ts
- to
- sw
- sn
- ny
base_model:
- utter-project/mHuBERT-147
AfriHuBERT: A self-supervised speech representation model for African languages
Model description
This is a multilingual self-supervised speech model based on mHuBERT-147.
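Because the model builds on mHuBERT-147, it can presumably be loaded like any HuBERT checkpoint in `transformers`. The sketch below is a minimal example under several assumptions: `MODEL_ID` points at the base model named in this card and should be replaced with the actual AfriHuBERT repository id, the repository is assumed to ship a preprocessor config, and the input is assumed to be 16 kHz mono audio. The helper `expected_num_frames` is hypothetical and assumes the standard HuBERT base convolutional front end (roughly one frame per 20 ms).

```python
# Assumption: placeholder id (the base model listed in this card).
# Replace with the actual AfriHuBERT repository id.
MODEL_ID = "utter-project/mHuBERT-147"

# Assumption: standard HuBERT base feature encoder (kernel size, stride) per layer.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def expected_num_frames(num_samples: int) -> int:
    """Number of output frames the conv front end yields for a raw waveform."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        # Unpadded 1-D convolution output length.
        length = (length - kernel) // stride + 1
    return length


def extract_features(waveform, sampling_rate: int = 16000, model_id: str = MODEL_ID):
    """Return frame-level hidden states of shape (1, frames, hidden_size)."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import HubertModel, Wav2Vec2FeatureExtractor

    # Assumes the repository provides a preprocessor config.
    feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
    model = HubertModel.from_pretrained(model_id)
    model.eval()

    inputs = feature_extractor(
        waveform, sampling_rate=sampling_rate, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state
```

Calling `extract_features(audio)` with a 1-D float array of 16 kHz samples should return one vector per frame. Under the assumptions above, `expected_num_frames(len(audio))` gives the frame count to expect, which can serve as a quick sanity check on the output shape.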
Pretraining data
- Dataset: AfriHuBERT was trained on data from 8 major sources, which include: BibleTTS
Language Coverage
AfriHuBERT covers 44 languages in total.