sanchit-gandhi committed
Commit · 33d12bd
1 Parent(s): 80ca0fc
html
app.py
CHANGED
@@ -325,8 +325,8 @@ with gr.Blocks(css=css) as block:
 
 <p>Tips for ensuring good generation:
 <ul>
-<li>Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise</li>
-<li>When using the fine-tuned model, include the term "Jenny" to pick out her voice</li>
+<li>Include the term <b>"very clear audio"</b> to generate the highest quality audio, and "very noisy audio" for high levels of background noise</li>
+<li>When using the fine-tuned model, include the term <b>"Jenny"</b> to pick out her voice</li>
 <li>Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech</li>
 <li>The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt</li>
 </ul>
@@ -368,9 +368,8 @@ with gr.Blocks(css=css) as block:
 <p>To improve the prosody and naturalness of the speech further, we're scaling up the amount of training data to 50k hours of speech.
 The v1 release of the model will be trained on this data, as well as inference optimisations, such as flash attention
 and torch compile, that will improve the latency by 2-4x. If you want to find out more about how this model was trained and even fine-tune it yourself, check-out the
-<a href="https://github.com/huggingface/parler-tts"> Parler-TTS</a> repository on GitHub
-
-<p>The Parler-TTS codebase and its associated checkpoints are licensed under <a href='https://github.com/huggingface/parler-tts?tab=Apache-2.0-1-ov-file#readme'> Apache 2.0</a>.</p>
+<a href="https://github.com/huggingface/parler-tts"> Parler-TTS</a> repository on GitHub. The Parler-TTS codebase and its
+associated checkpoints are licensed under <a href='https://github.com/huggingface/parler-tts?tab=Apache-2.0-1-ov-file#readme'> Apache 2.0</a>.</p>
 """
 )
 
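For readers who want to apply the tips listed in the first hunk, here is a minimal sketch of composing a description prompt in plain Python. The helper name `build_description` and its parameters are hypothetical and are not part of app.py or the Parler-TTS library. It only builds a string and makes no claim about how the model interprets it.

```python
def build_description(traits, quality="very clear audio", speaker=None):
    """Compose a voice description string following the tips in the diff.

    Hypothetical helper for illustration only: it is not part of app.py
    or the Parler-TTS API, and it performs no audio generation.
    """
    # Name the speaker (e.g. "Jenny" for the fine-tuned model) when given.
    who = speaker if speaker else "A speaker"
    # Remaining features (rate, pitch, reverberation) go in as plain phrases.
    sentence = f"{who} speaks {', '.join(traits)}" if traits else f"{who} speaks"
    # End with the audio-quality term, e.g. "very clear audio".
    return f"{sentence}, with {quality}."


description = build_description(
    ["slowly", "in a low-pitched voice"], speaker="Jenny"
)
print(description)
```

The resulting string would be passed as the description prompt. The text to be spoken is supplied separately, and punctuation such as commas in that text is what the tips say controls pauses.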