OuteAI/OuteTTS-0.3-500M

edwko committed 4 days ago · Commit e6297e8 (verified) · 1 Parent(s): 997b839

Update README.md

Files changed (1)

README.md +9 -7

@@ -117,12 +117,7 @@ model_config = outetts.HFModelConfig_v2(
 interface = outetts.InterfaceHF(model_version="0.3", cfg=model_config)
 # You can create a speaker profile for voice cloning, which is compatible across all backends.
-# speaker = interface.create_speaker(
-#    audio_path="path/to/audio/file.wav",
-#    transcript=None,            # Set to None to use Whisper for transcription
-#    whisper_model="turbo",      # Optional: specify Whisper model (default: "turbo")
-#    whisper_device=None,        # Optional: specify device for Whisper (default: None)
-# )
 # interface.save_speaker(speaker, "speaker.json")
 # speaker = interface.load_speaker("speaker.json")
@@ -144,8 +139,15 @@ output = interface.generate(config=gen_cfg)
 # Save the generated speech to a file
 output.save("output.wav")
 ```
 > [!IMPORTANT]
-> ## For additional usage examples and recommendations, visit the [GitHub repository](https://github.com/edwko/OuteTTS?tab=readme-ov-file#usage).
 ---

 interface = outetts.InterfaceHF(model_version="0.3", cfg=model_config)
 # You can create a speaker profile for voice cloning, which is compatible across all backends.
+# speaker = interface.create_speaker(audio_path="path/to/audio/file.wav")
 # interface.save_speaker(speaker, "speaker.json")
 # speaker = interface.load_speaker("speaker.json")
 # Save the generated speech to a file
 output.save("output.wav")
 ```
+### Additional Usage Examples
+> [!IMPORTANT]
+> For additional usage examples and recommendations, visit the: [GitHub repository](https://github.com/edwko/OuteTTS?tab=readme-ov-file#usage).
+### Generation Performance
 > [!IMPORTANT]
+> The model performs best with 30-second generation batches. This window is reduced based on the length of your speaker samples. For example, if the speaker reference sample is 10 seconds, the effective window becomes approximately 20 seconds. I am currently working on adding batched generation capabilities to the library, along with further improvements that are not yet implemented.
 ---
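
The window arithmetic in the Generation Performance note can be sketched as below. This is only an illustration, not part of the `outetts` library: the helper name `effective_generation_window` and the constant `MAX_WINDOW_SECONDS` are hypothetical, and the 30-second figure is taken from the note, which describes the result as approximate.

```python
# Hypothetical helper (not an outetts API): estimate how many seconds of
# speech can be generated in one pass once the speaker reference sample
# has used up part of the window.

MAX_WINDOW_SECONDS = 30.0  # best-performing window, as stated in the note


def effective_generation_window(
    speaker_sample_seconds: float,
    max_window_seconds: float = MAX_WINDOW_SECONDS,
) -> float:
    """Return the approximate remaining generation window in seconds."""
    if speaker_sample_seconds < 0:
        raise ValueError("speaker_sample_seconds must be non-negative")
    # The reference sample reduces the window; never report a negative value.
    return max(0.0, max_window_seconds - speaker_sample_seconds)


if __name__ == "__main__":
    # A 10-second reference sample leaves roughly 20 seconds for generation.
    print(effective_generation_window(10.0))
```

Under this assumption, a shorter speaker reference sample leaves more room for generated speech in a single pass.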