metadata
datasets:
  - mozilla-foundation/common_voice_17_0
  - TalTechNLP/VoxLingua107
language:
  - af
  - ar
  - ak
  - en
  - ee
  - fr
  - ha
  - ig
  - li
  - ln
  - mg
  - xh
  - yo
  - zu
  - pt
  - wo
  - ts
  - to
  - sw
  - sn
  - ny
base_model:
  - utter-project/mHuBERT-147

AfriHuBERT: A self-supervised speech representation model for African languages

Model description

AfriHuBERT is a multilingual self-supervised speech model based on mHuBERT-147.
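Because the model builds on mHuBERT-147, it can presumably be loaded with the HuBERT classes in the `transformers` library. The following is a minimal sketch, not an official usage recipe. It rests on several assumptions: the repository id `ajesujoba/AfriHuBERT` is inferred from the page header rather than stated in this card, the repository is assumed to ship a feature-extractor config, the input is assumed to be mono 16 kHz audio, and the convolutional front end is assumed to match the standard HuBERT-base layout. The helper names `expected_num_frames` and `extract_features` are hypothetical.

```python
from typing import List

# Assumption: Hub repository id inferred from the page header, not stated in the card.
MODEL_ID = "ajesujoba/AfriHuBERT"

# Assumption: standard HuBERT-base conv front end, as (kernel, stride) per layer.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def expected_num_frames(num_samples: int) -> int:
    """Number of frames the conv front end yields for a raw waveform."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        length = (length - kernel) // stride + 1
    return length


def extract_features(waveform: List[float], model_id: str = MODEL_ID):
    """Return the last hidden states for a mono 16 kHz waveform."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from transformers import HubertModel, Wav2Vec2FeatureExtractor

    # Assumption: the repository provides a preprocessor config for the extractor.
    extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
    model = HubertModel.from_pretrained(model_id)
    model.eval()

    inputs = extractor(waveform, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state  # shape: (1, frames, hidden_size)
```

Calling `extract_features([0.0] * 16_000)` downloads the checkpoint and, under the assumptions above, should return one hidden-state vector per roughly 20 ms of audio; `expected_num_frames(16_000)` gives the corresponding frame count without loading the model.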

Pretraining data

  • Dataset: AfriHuBERT was trained on data from 8 major sources, including BibleTTS.

Language Coverage

AfriHuBERT covers 44 languages in total