# Pretrained Models Dependency
The model dependencies of Amphion are as follows:
- [Pretrained Models Dependency](#pretrained-models-dependency)
- [Amphion Singing BigVGAN](#amphion-singing-bigvgan)
- [Amphion Speech HiFi-GAN](#amphion-speech-hifi-gan)
- [Amphion DiffWave](#amphion-diffwave)
- [ContentVec](#contentvec)
- [WeNet](#wenet)
- [Whisper](#whisper)
- [RawNet3](#rawnet3)
Instructions for downloading them are given below.
## Amphion Singing BigVGAN
We fine-tuned the official BigVGAN pretrained model on over 120 hours of singing voice data. The fine-tuned checkpoint can be downloaded [here]( You need to download the `` and `args.json` files into `Amphion/pretrained/bigvgan`:
┣ pretrained
┃ ┣ bigvgan
┃ ┃ ┣
┃ ┃ ┣ args.json
## Amphion Speech HiFi-GAN
We trained our HiFi-GAN model on 685 hours of speech data. It can be downloaded [here]( You need to download the whole `hifigan_speech` folder into `Amphion/pretrained/hifigan`.
┣ pretrained
┃ ┣ hifigan
┃ ┃ ┣ hifigan_speech
┃ ┃ ┃ ┣ log
┃ ┃ ┃ ┣ result
┃ ┃ ┃ ┣ checkpoint
┃ ┃ ┃ ┣ args.json
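After downloading, it can help to confirm that the folder matches the tree above before running anything. The following is a minimal sketch using only the Python standard library; the helper name `missing_entries` and the constant `HIFIGAN_LAYOUT` are hypothetical, and the relative root `Amphion/pretrained` assumes the script is run from the directory that contains your Amphion checkout.

```python
from pathlib import Path

# Expected entries, relative to Amphion/pretrained, copied from the tree above.
HIFIGAN_LAYOUT = [
    "hifigan/hifigan_speech/log",
    "hifigan/hifigan_speech/result",
    "hifigan/hifigan_speech/checkpoint",
    "hifigan/hifigan_speech/args.json",
]


def missing_entries(pretrained_root, expected):
    """Return the expected relative paths that do not exist under pretrained_root."""
    root = Path(pretrained_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("Amphion/pretrained", HIFIGAN_LAYOUT)
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("HiFi-GAN layout looks complete.")
```

The same helper can be reused for the other models in this document by passing a different list of expected paths.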
## Amphion DiffWave
We trained our DiffWave model on 125 hours of speech data and around 80 hours of singing voice data. It can be downloaded [here]( You need to download the whole `diffwave` folder into `Amphion/pretrained/diffwave`.
┣ pretrained
┃ ┣ diffwave
┃ ┃ ┣ diffwave_speech
┃ ┃ ┃ ┣ samples
┃ ┃ ┃ ┣ checkpoint
┃ ┃ ┃ ┣ args.json
## ContentVec
You can download the pretrained ContentVec model [here]( Note that we use the `ContentVec_legacy-500 classes` checkpoint. The tree below assumes that you download the `` into `Amphion/pretrained/contentvec`.
┣ pretrained
┃ ┣ contentvec
┃ ┃ ┣
## WeNet
You can download the pretrained WeNet model [here]( Taking the `wenetspeech` pretrained checkpoint as an example, assume you download `wenetspeech_u2pp_conformer_exp.tar.gz` into `Amphion/pretrained/wenet`. Extract it and modify its configuration file as follows:
cd Amphion/pretrained/wenet
### Extract the experiment directory
tar -xvf wenetspeech_u2pp_conformer_exp.tar.gz
### Specify the updated path in train.yaml
cd 20220506_u2pp_conformer_exp
vim train.yaml
# TODO: Change the value of "cmvn_file" (Line 2) to the absolute path of the `global_cmvn` file. (Eg: [YourPath]/Amphion/pretrained/wenet/20220506_u2pp_conformer_exp/global_cmvn)
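The manual `vim` edit above can also be scripted. The sketch below is one possible approach, assuming `train.yaml` contains a single top-level `cmvn_file:` key as described; the function name `set_cmvn_file` is hypothetical. It replaces the value with a regular expression instead of parsing the YAML, so the rest of the file is left byte-for-byte unchanged.

```python
import re
from pathlib import Path


def set_cmvn_file(train_yaml, global_cmvn):
    """Rewrite the first `cmvn_file:` entry in train_yaml to the absolute path of global_cmvn."""
    yaml_path = Path(train_yaml)
    cmvn_abs = Path(global_cmvn).resolve()
    text = yaml_path.read_text(encoding="utf-8")

    # Capture the indentation and key; only the value after the colon is replaced.
    pattern = re.compile(r"^([ \t]*cmvn_file[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    # A function replacement avoids backslash-escape issues with Windows paths.
    new_text, count = pattern.subn(lambda m: m.group(1) + str(cmvn_abs), text, count=1)
    if count == 0:
        raise ValueError(f"no 'cmvn_file' key found in {yaml_path}")

    yaml_path.write_text(new_text, encoding="utf-8")
    return cmvn_abs
```

For the example checkpoint, you would call it as `set_cmvn_file(exp_dir / "train.yaml", exp_dir / "global_cmvn")` with `exp_dir = Path("Amphion/pretrained/wenet/20220506_u2pp_conformer_exp")`.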
The final file structure tree looks like this:
┣ pretrained
┃ ┣ wenet
┃ ┃ ┣ 20220506_u2pp_conformer_exp
┃ ┃ ┃ ┣
┃ ┃ ┃ ┣ global_cmvn
┃ ┃ ┃ ┣ train.yaml
┃ ┃ ┃ ┣ units.txt
## Whisper
The official pretrained Whisper checkpoints are available [here]( In Amphion, we use the `medium` Whisper model by default. You can download it as follows:
cd Amphion/pretrained
mkdir whisper
cd whisper
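The directory setup above can also be done from Python. The sketch below assumes the `openai-whisper` package is installed and uses its `whisper.load_model` function with the `download_root` argument; the wrapper name `download_whisper` is hypothetical. Whether Amphion expects the checkpoint at exactly this location is an assumption to verify against your own configuration.

```python
from pathlib import Path


def download_whisper(name="medium", root="Amphion/pretrained/whisper"):
    """Fetch an official Whisper checkpoint into root and return the loaded model."""
    # Imported lazily so this module can be imported without openai-whisper installed.
    import whisper

    target = Path(root)
    target.mkdir(parents=True, exist_ok=True)
    # load_model downloads the checkpoint into download_root if it is not
    # already present there, then loads it into memory.
    return whisper.load_model(name, download_root=str(target))
```

Note that this call also loads the model into memory, so it is heavier than a plain file download.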
The final file structure tree looks like this:
┣ pretrained
┃ ┣ whisper
┃ ┃ ┣
## RawNet3
The official pretrained RawNet3 checkpoints are available [here]( You need to download the `` file and put it in the `Amphion/pretrained/rawnet3` folder.
The final file structure tree looks like this:
┣ pretrained
┃ ┣ rawnet3
┃ ┃ ┣
# (Optional) Model Dependencies for Evaluation
When using Amphion's evaluation pipelines, terminals without access to `` may encounter error messages such as "OSError: Can't load tokenizer for ...". To work around this, download the dependent models for evaluation in advance, store them here under `Amphion/pretrained`, and follow [this README](../egs/metrics/ to configure your environment to load local models.
The dependent models of Amphion's evaluation pipeline are as follows (sorted alphabetically):
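To illustrate what loading from local files can look like, here is a minimal sketch. It is not the configuration described in the linked README; it assumes the `transformers` package is installed and uses `AutoTokenizer.from_pretrained` with `local_files_only=True`. The names `PRETRAINED_ROOT`, `resolve_model_source`, and `load_tokenizer_offline` are hypothetical.

```python
from pathlib import Path

# Hypothetical default; adjust to where your Amphion checkout lives.
PRETRAINED_ROOT = Path("Amphion/pretrained")


def resolve_model_source(model_id, pretrained_root=PRETRAINED_ROOT):
    """Return the local directory for model_id if it exists, otherwise the id itself."""
    local_dir = Path(pretrained_root) / model_id
    return str(local_dir) if local_dir.is_dir() else model_id


def load_tokenizer_offline(model_id, pretrained_root=PRETRAINED_ROOT):
    """Load a tokenizer strictly from files on disk (requires transformers)."""
    # Imported lazily so the path helper above works without transformers installed.
    from transformers import AutoTokenizer

    source = resolve_model_source(model_id, pretrained_root)
    # local_files_only=True makes the call fail fast instead of trying the network.
    return AutoTokenizer.from_pretrained(source, local_files_only=True)
```

With the models stored as described in the following sections, `load_tokenizer_offline("bert-base-uncased")` would read from `Amphion/pretrained/bert-base-uncased`.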
- [Evaluation Pipeline Models Dependency](#optional-model-dependencies-for-evaluation)
- [bert-base-uncased](#bert-base-uncased)
- [facebook/bart-base](#facebookbart-base)
- [roberta-base](#roberta-base)
- [wavlm](#wavlm)
Instructions for downloading them are given below.
## bert-base-uncased
To load `bert-base-uncased` locally, follow [this link]( to download all files for the `bert-base-uncased` model and store them under `Amphion/pretrained/bert-base-uncased`, matching the following file structure tree:
┣ pretrained
┃ ┣ bert-base-uncased
┃ ┃ ┣ config.json
┃ ┃ ┣ coreml
┃ ┃ ┃ ┣ fill-mask
┃ ┃ ┃ ┣ float32_model.mlpackage
┃ ┃ ┃ ┣ Data
┃ ┃ ┃ ┣
┃ ┃ ┃ ┣ model.mlmodel
┃ ┃ ┣ flax_model.msgpack
┃ ┃ ┣ model.onnx
┃ ┃ ┣ model.safetensors
┃ ┃ ┣ pytorch_model.bin
┃ ┃ ┣
┃ ┃ ┣ rust_model.ot
┃ ┃ ┣ tf_model.h5
┃ ┃ ┣ tokenizer_config.json
┃ ┃ ┣ tokenizer.json
┃ ┃ ┣ vocab.txt
## facebook/bart-base
To load `facebook/bart-base` locally, follow [this link]( to download all files for the `facebook/bart-base` model and store them under `Amphion/pretrained/facebook/bart-base`, matching the following file structure tree:
┣ pretrained
┃ ┣ facebook
┃ ┃ ┣ bart-base
┃ ┃ ┃ ┣ config.json
┃ ┃ ┃ ┣ flax_model.msgpack
┃ ┃ ┃ ┣ gitattributes.txt
┃ ┃ ┃ ┣ merges.txt
┃ ┃ ┃ ┣ model.safetensors
┃ ┃ ┃ ┣ pytorch_model.bin
┃ ┃ ┃ ┣ README.txt
┃ ┃ ┃ ┣ rust_model.ot
┃ ┃ ┃ ┣ tf_model.h5
┃ ┃ ┃ ┣ tokenizer.json
┃ ┃ ┃ ┣ vocab.json
## roberta-base
To load `roberta-base` locally, follow [this link]( to download all files for the `roberta-base` model and store them under `Amphion/pretrained/roberta-base`, matching the following file structure tree:
┣ pretrained
┃ ┣ roberta-base
┃ ┃ ┣ config.json
┃ ┃ ┣ dict.txt
┃ ┃ ┣ flax_model.msgpack
┃ ┃ ┣ gitattributes.txt
┃ ┃ ┣ merges.txt
┃ ┃ ┣ model.safetensors
┃ ┃ ┣ pytorch_model.bin
┃ ┃ ┣ README.txt
┃ ┃ ┣ rust_model.ot
┃ ┃ ┣ tf_model.h5
┃ ┃ ┣ tokenizer.json
┃ ┃ ┣ vocab.json
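The three text models above can also be fetched in one step on a machine that does have network access, then copied to the offline machine. The sketch below assumes the `huggingface_hub` package is installed and that the hub repository ids match the model names used in this document; `local_dir_for` and `download_eval_models` are hypothetical helper names. File names produced this way may differ slightly from the trees shown above.

```python
from pathlib import Path

# Model names as used in this document; assumed to be valid hub repository ids.
EVAL_MODEL_IDS = ["bert-base-uncased", "facebook/bart-base", "roberta-base"]


def local_dir_for(model_id, pretrained_root="Amphion/pretrained"):
    """Map a model id to its folder, e.g. facebook/bart-base -> <root>/facebook/bart-base."""
    return Path(pretrained_root) / model_id


def download_eval_models(pretrained_root="Amphion/pretrained", model_ids=EVAL_MODEL_IDS):
    """Download every file of each model into the layout described above."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for model_id in model_ids:
        target = local_dir_for(model_id, pretrained_root)
        target.mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=model_id, local_dir=str(target))
```

Mirroring the model id in the directory path is what keeps `facebook/bart-base` nested under a `facebook` folder, as in the tree above.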
## wavlm
The official pretrained WavLM checkpoints are available [here]( The file structure tree is as follows:
┣ wavlm
┃ ┣ config.json
┃ ┣ preprocessor_config.json
┃ ┣ pytorch_model.bin