metadata

license: other
tags:
  - text-to-speech
  - LLMs
  - zero-shot text-to-speech
inference: false
datasets:
  - LJSpeech
extra_gated_prompt: >-
  One more step before getting this model.

  This model is open access and available to all, with a license further
  specifying rights and usage.


  Any organization or individual is prohibited from using any technology
  mentioned in this paper to generate someone's speech without their consent,
  including but not limited to government leaders, political figures, and
  celebrities. If you do not comply with this condition, you could be in
  violation of copyright laws.



  By clicking on "Access repository" below, you accept that your *contact
  information* (email address and username) can be shared with the model authors
  as well.
    
extra_gated_fields:
  I have read the License and agree with its terms: checkbox

MVoice Model Card

Model Details

Model type: Voice LLM for zero-shot text-to-speech
Language(s): English, Mandarin
Resources for more information: MVoice GitHub Repository, MVoice Paper.
Cite as:

@article{huang2023make,
  title={Make-A-Voice: Unified Voice Synthesis With Discrete Representation},
  author={Huang, Rongjie and Zhang, Chunlei and Wang, Yongqi and Yang, Dongchao and Liu, Luping and Ye, Zhenhui and Jiang, Ziyue and Weng, Chao and Zhao, Zhou and Yu, Dong},
  journal={arXiv preprint arXiv:2305.19269},
  year={2023}
}

This model card is based on the DALL-E Mini model card.