---
license: mit
language:
- en
pipeline_tag: text-to-audio
tags:
- audiocraft
- audiogen
- styletts2
- shift
- audeering
- sound
- audio-generation
- text-to-speech
- mimic3
---
# Affective TTS / SoundScapes
- [SHIFT TTS tool](https://github.com/audeering/shift)
- Analysis of TTS emotionality [#1](https://huggingface.co/dkounadis/artificial-styletts2/discussions/2)
- Soundscapes `trees, water, ..` via [AudioGen](https://huggingface.co/dkounadis/artificial-styletts2/discussions/3)
- `landscape2soundscape.py` generates a soundscape, overlays TTS on it, and creates a video from an image.
## Available Voices
<a href="https://audeering.github.io/shift/">Listen to available voices!</a>
## Flask API
<details>
<summary>
Create virtualenv
</summary>
Clone
```
git clone https://huggingface.co/dkounadis/artificial-styletts2
```
Install
```
virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd artificial-styletts2/
pip install -r requirements.txt
```
</details>
Start the Flask API server
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=2 python api.py
```
## Landscape 2 Soundscape
The following requires `api.py` to be already running, e.g. in a tmux session.
```
# TTS & soundscape - overlay to .mp4
python landscape2soundscape.py
```
For SHIFT demo / Collaboration with [SMB](https://www.smb.museum/home/)
- YouTube Videos
[![01](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____01_Schick_AII840_001.jpg)](https://youtu.be/SSi3gUO4GtY)
[![02](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____02_Constable_AI555_001.jpg)](https://youtu.be/2YjxAPkdXIc)
[![03](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____03_Schinkel_WS200-002.jpg)](https://youtu.be/BhMh02knkco)
[![05](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____05_Blechen_FV40_001.jpg)](https://youtu.be/a3qk9S87v60)
[![06](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____06_Menzel_AI900_001.jpg)](https://youtu.be/3M0y9OYzDfU)
[![07](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____07_Courbet_AI967_001.jpg)](https://youtu.be/OBY666_By1k)
[![08](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____08_Monet_AI1013_001.jpg)](https://youtu.be/gnGCYLcdLsA)
[![10](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____10_Boecklin_967648_NG2-80_001_rsz.jpg)](https://www.youtube.com/watch?v=Y8QyYUgLaCg)
[![11](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____11_Liebermann_NG4-94_001.jpg)](https://youtu.be/XDDzxDSrhb0)
[![12](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____12_Slevogt_AII1022_001.jpg)](https://youtu.be/I3YYKiUzHpA)
# Live Demo - Paplay
Special Flask API for playing sounds live
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python live_api.py
```
Client - Describe any sound with words and it will be played back to you.
```
python live_demo.py  # asks for text input and plays the soundscape
```
# Simple Demo
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python demo.py
```
# AudioBook
Convert your `.docx` to audio `.wav` using multiple voices, then concatenate the `.wav` audiobooks made with each voice into a full one.
Known issue: the concatenated audiobook has noisy speech, while the individual single-voice audiobooks are clean. This appears to be an issue with ffmpeg. Therefore, for now, the SHIFT repo only produces
a single-voice audiobook. The multiple-voice `audiobook.py` is archived here.
```
# uses Flask api.py
# needs to load ../shift/assets/INCLUSION_IN_MUSEUMS_audiobook.docx
python audiobook.py
```
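The concatenation step described above can also be sketched without ffmpeg, using only Python's standard `wave` module. This is a minimal sketch and not what `audiobook.py` does: the function name `concatenate_wavs` is hypothetical, and it assumes the per-voice files are PCM `.wav` (the `wave` module does not read float-encoded WAV). It refuses to join files whose channel count, sample width, or sample rate differ, since such a mismatch is one possible (unconfirmed) source of noisy output.

```python
import wave


def concatenate_wavs(input_paths, output_path):
    """Append the audio frames of several PCM WAV files into one file.

    Hypothetical helper for illustration. Raises ValueError if the inputs
    differ in channel count, sample width, or sample rate.
    """
    if not input_paths:
        raise ValueError("input_paths must not be empty")

    reference = None  # (channels, sample width in bytes, sample rate)
    chunks = []
    for path in input_paths:
        with wave.open(path, "rb") as wav_in:
            fmt = (
                wav_in.getnchannels(),
                wav_in.getsampwidth(),
                wav_in.getframerate(),
            )
            if reference is None:
                reference = fmt
            elif fmt != reference:
                raise ValueError(
                    f"{path} has format {fmt}, expected {reference}"
                )
            # Read the raw frames of this file in one go.
            chunks.append(wav_in.readframes(wav_in.getnframes()))

    with wave.open(output_path, "wb") as wav_out:
        wav_out.setnchannels(reference[0])
        wav_out.setsampwidth(reference[1])
        wav_out.setframerate(reference[2])
        for chunk in chunks:
            wav_out.writeframes(chunk)
    return output_path
```

For example, `concatenate_wavs(["voice_a.wav", "voice_b.wav"], "full_audiobook.wav")` would write both recordings back to back, where all three file names are placeholders.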