metadata
license: cc-by-nc-4.0
datasets:
  - mozilla-foundation/common_voice_17_0
  - facebook/voxpopuli
  - mrfakename/librivox-full-catalog-archive
language:
  - fi
base_model:
  - SWivid/F5-TTS
pipeline_tag: text-to-speech

This repository contains three Finnish models for F5-TTS. Listen to the speech samples for each model.

The models cannot read digits. Write numbers out as words before synthesis.
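One way to do this is a small preprocessing step, shown here as a minimal sketch. The function names (`number_to_finnish`, `spell_out_numbers`) are hypothetical and not part of F5-TTS. The sketch assumes non-negative integers below one million, produces the basic (nominative) form only, and does not handle decimals, ordinals, or Finnish case inflection, so review the output before synthesis.

```python
import re

# Basic (nominative) Finnish words for the digits 0-9.
ONES = ["nolla", "yksi", "kaksi", "kolme", "neljä", "viisi",
        "kuusi", "seitsemän", "kahdeksan", "yhdeksän"]


def _below_thousand(n: int) -> str:
    """Spell out 1..999 as a single compound word ('' for 0)."""
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds == 1:
        parts.append("sata")
    elif hundreds > 1:
        parts.append(ONES[hundreds] + "sataa")
    if rest == 10:
        parts.append("kymmenen")
    elif 10 < rest < 20:
        parts.append(ONES[rest - 10] + "toista")
    else:
        tens, ones = divmod(rest, 10)
        if tens:
            parts.append(ONES[tens] + "kymmentä")
        if ones:
            parts.append(ONES[ones])
    return "".join(parts)


def number_to_finnish(n: int) -> str:
    """Spell out an integer in the range 0..999999 (assumed limit of this sketch)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only supports 0..999999")
    if n == 0:
        return ONES[0]
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands == 1:
        words.append("tuhat")
    elif thousands > 1:
        words.append(_below_thousand(thousands) + "tuhatta")
    if rest:
        words.append(_below_thousand(rest))
    return " ".join(words)


def spell_out_numbers(text: str) -> str:
    """Replace each run of digits with Finnish words; larger numbers stay as-is."""
    def _replace(match: re.Match) -> str:
        value = int(match.group())
        return number_to_finnish(value) if value < 1_000_000 else match.group()
    return re.sub(r"\d+", _replace, text)
```

For example, `spell_out_numbers("Minulla on 3 kissaa.")` returns `"Minulla on kolme kissaa."`.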


The first round uses the Common Voice and Vox Populi Finnish datasets.

  • 20241206

  • Speakers: Several speakers from different corpora

  • Use these with "f5-tts_infer-gradio":

Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_common_voice_fi_vox_populi_fi_20241206.safetensors

Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/vocab.txt


The second round is based on the Common Voice, LibriVox and Vox Populi Finnish datasets. Use this model as the default.

  • 20241217

  • Speakers: Several speakers from different corpora

  • Use these with "f5-tts_infer-gradio":

Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_commonvoice_fi_librivox_fi_vox_populi_fi_20241217/model_last_20241217.safetensors

Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_commonvoice_fi_librivox_fi_vox_populi_fi_20241217/vocab.txt


The third round is based on the same Common Voice, LibriVox and Vox Populi Finnish datasets as the second round. It is no better than the second-round model.

  • 20250125

  • Speakers: Several speakers from different corpora

  • Use these with "f5-tts_infer-gradio":

Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_commonvoice_fi_librivox_fi_vox_populi_fi_20250125/model_last_20250125.safetensors

Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_commonvoice_fi_librivox_fi_vox_populi_fi_20250125/vocab.txt
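The `hf://` paths above can be pasted into "f5-tts_infer-gradio" as they are. If you would rather have local copies of the files, the following is a minimal sketch of one way to fetch them. It assumes the `huggingface_hub` package is installed; the helper names `split_hf_path` and `download_hf_path` are hypothetical and only illustrate how an `hf://owner/repo/file` path maps to a repository ID and a file name.

```python
def split_hf_path(path: str) -> tuple[str, str]:
    """Split 'hf://owner/repo/path/in/repo' into (repo_id, filename)."""
    prefix = "hf://"
    if not path.startswith(prefix):
        raise ValueError(f"not an hf:// path: {path!r}")
    # owner / repo / everything else (the file may sit in a subfolder)
    parts = path[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected hf://owner/repo/file, got {path!r}")
    owner, repo, filename = parts
    return f"{owner}/{repo}", filename


def download_hf_path(path: str) -> str:
    """Download the file behind an hf:// path and return its local cache path."""
    # Imported here so that the parsing helper works without the package.
    from huggingface_hub import hf_hub_download

    repo_id, filename = split_hf_path(path)
    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `download_hf_path` with the second-round model path, for instance, would place the `.safetensors` file in the local Hugging Face cache and return its location.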