---
license: mit
language:
- en
pipeline_tag: text-to-speech
tags:
- audiocraft
- audiogen
- styletts2
- audio
- synthesis
- shift
- audeering
- dkounadis
- sound
- scene
- acoustic-scene
- audio-generation
---


# Affective TTS / SoundScapes

  - [SHIFT TTS tool](https://github.com/audeering/shift) 
  - Analysis of emotionality [#1](https://huggingface.co/dkounadis/artificial-styletts2/discussions/2)
  - Soundscape `e.g. trees, water` via [AudioGen](https://huggingface.co/dkounadis/artificial-styletts2/discussions/3)
  - `landscape2soundscape.py` shows how to overlay TTS speech and a soundscape on an image and render the result as a video

## Available Voices

<a href="https://audeering.github.io/shift/">Listen to available voices!</a>

## Flask API

Clone this repo

```
git clone https://huggingface.co/dkounadis/artificial-styletts2
```
Build the environment

```
virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd artificial-styletts2/
pip install -r requirements.txt
```

Start API

```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=2 python api.py
```

The commands below require `api.py` to be running already, e.g. in a `tmux` session.
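
Since the scripts depend on a running server, it can help to check that the port is reachable before launching them. The following is a minimal sketch, not part of this repo: `wait_for_port` is a hypothetical helper, and the port to pass in is an assumption (Flask's default is 5000, but `api.py` may bind to a different one).

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some server is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 5000, timeout=10.0)` would return `False` after about ten seconds if nothing is listening on that (assumed) port.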

## Landscape 2 Soundscape

```shell
# TTS & soundscape - overlay to .mp4
python landscape2soundscape.py
```

# YouTube Videos / Examples

Substituting the native voice with a TTS voice. Original video:

[![Native voice ANBPR video](assets/native_video_thumb.png)](https://www.youtube.com/watch?v=tmo2UbKYAqc)

The same video, with the native voice replaced by an English TTS voice of similar emotion:


[![Same video w. Native voice replaced with English TTS](assets/tts_video_thumb.png)](https://www.youtube.com/watch?v=geI1Vqn4QpY)


<details>
<summary>

Video dubbing from subtitles `.srt`

</summary>

## Video Dubbing

[![Review demo SHIFT](assets/review_demo_thumb.png)](https://www.youtube.com/watch?v=bpt7rOBENcQ)

Generate dubbed video:


```shell
python tts.py --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4
```
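
To illustrate what dubbing from a `.srt` file involves, the sketch below parses subtitle text into timed cues that a TTS step could then synthesize one by one. It is an assumption-laden illustration, not the parser used by `tts.py`: `Cue` and `parse_srt` are hypothetical names.

```python
import re
from dataclasses import dataclass

# SRT timing lines look like: 00:01:02,500 --> 00:01:05,000
_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            m = _TIME.search(line)
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the subtitle text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(start, end, text))
                break
    return cues
```

Each cue's `start` and `end` give the time window in which the synthesized speech for `text` would need to fit.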


</details>

## Joint Application of D3.1 & D3.2

[![Subtitles to Video](assets/caption_to_video_thumb.png)](https://youtu.be/wWC8DpOKVvQ)


Create a video from an image and a text file:

```shell
python tts.py --text sample.txt --image assets/image_from_T31.jpg
```

## Landscape 2 Soundscape





```shell
# Loads image & text & sound-scene text and creates .mp4
python landscape2soundscape.py
```
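
As a rough illustration of the overlay idea, the sketch below mixes a speech track with an attenuated, looped background track. It is not the implementation in `landscape2soundscape.py`: `overlay` and `bg_gain` are hypothetical names, samples are assumed to be mono floats in the range [-1, 1], and plain lists are used only to keep the example dependency-free.

```python
def overlay(speech: list[float], background: list[float], bg_gain: float = 0.3) -> list[float]:
    """Mix speech over a looped, attenuated background; output has the speech's length."""
    if not background:
        return list(speech)
    mixed = []
    for i, s in enumerate(speech):
        # Loop the background if it is shorter than the speech.
        b = background[i % len(background)] * bg_gain
        # Clip to the valid [-1, 1] range for float audio.
        mixed.append(max(-1.0, min(1.0, s + b)))
    return mixed
```

Keeping `bg_gain` well below 1 is one simple way to keep the speech intelligible over the soundscape.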

Created for the SHIFT demo in collaboration with [SMB](https://www.smb.museum/home/). YouTube videos:


[![01](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____01_Schick_AII840_001.jpg)](https://youtu.be/SSi3gUO4GtY)

[![02](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____02_Constable_AI555_001.jpg)](https://youtu.be/2YjxAPkdXIc)

[![10](uc_spk_Landscape2Soundscape_Masterpieces_pics/thumb____10_Boecklin_967648_NG2-80_001_rsz.jpg)](https://www.youtube.com/watch?v=Y8QyYUgLaCg)


# Live Demo - Paplay

Flask server

```shell
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python live_api.py
```

Client (Ubuntu)

```shell
python live_demo.py  # prompts for text input and plays the soundscape
```