license: mit
language:
  - en
pipeline_tag: text-to-speech
tags:
  - audiocraft
  - audiogen
  - styletts2
  - audio
  - synthesis
  - shift
  - audeering
  - dkounadis
  - sound
  - scene
  - acoustic-scene
  - audio-generation

Affective TTS / SoundScapes

  • SHIFT TTS tool
  • Analysis of emotionality #1
  • Soundscapes (e.g. trees, water) via AudioGen
  • landscape2soundscape.py shows how to overlay TTS and a soundscape onto an image and create a video

Available Voices

Listen to available voices!

Flask API

Clone this repo

git clone https://huggingface.co/dkounadis/artificial-styletts2

Build env

virtualenv --python=python3 ~/.envs/.my_env
source ~/.envs/.my_env/bin/activate
cd shift/
pip install -r requirements.txt

Start API

CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=2 python api.py
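The environment variables in the command above can also be set from Python, which may be convenient when the API is launched from a script. The following is a minimal sketch, not part of this repo: `build_api_env` and `start_api` are hypothetical helper names, and the defaults simply mirror the command above.

```python
import os
import subprocess
import sys


def build_api_env(gpu_index, hf_home="./hf_home", base_env=None):
    """Return a copy of the environment with the three variables used above."""
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate GPUs in PCI bus order, so the index matches what nvidia-smi reports.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the chosen GPU; inside the process it appears as device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # Directory where Hugging Face libraries cache downloaded models.
    env["HF_HOME"] = hf_home
    return env


def start_api(gpu_index=2, script="api.py"):
    """Launch the API script as a child process (hypothetical helper)."""
    return subprocess.Popen([sys.executable, script], env=build_api_env(gpu_index))
```

Calling `start_api(gpu_index=2)` from inside `shift/` would be roughly equivalent to the shell command above; pick the GPU index that is free on your machine.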

The following commands require api.py to be already running, e.g. in a tmux session.
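Before running the client scripts, it can help to confirm that the API process is actually accepting connections. The sketch below only checks that a TCP port is open; the host and port are assumptions (5000 is Flask's default, but api.py may be configured differently), and `api_is_listening` is a hypothetical helper, not something shipped with this repo.

```python
import socket


def api_is_listening(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is listening on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 5000 is an assumption; adjust it to whatever api.py binds to.
    print("API reachable:", api_is_listening())
```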

Landscape 2 Soundscape

# TTS & soundscape - overlay to .mp4
python landscape2soundscape.py

YouTube Videos / Examples

Substitute Native voice via TTS

Native voice ANBPR video

The same video, with the native voice replaced by an English TTS voice of similar emotion

Same video w. Native voice replaced with English TTS

Video dubbing from subtitles .srt

Video Dubbing

Review demo SHIFT

Generate dubbed video:

python tts.py --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4
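Dubbing from subtitles relies on the timing information stored in the .srt file: each cue has a start time, an end time and the text to be spoken. As an illustration of that input format only (this is not the parsing code used by tts.py, and `parse_srt` is a hypothetical name), a minimal standard-library sketch could look like this:

```python
import re

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(stamp):
    """Convert an SRT timestamp to seconds as a float."""
    match = _TIME.search(stamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    h, m, s, ms = (int(g) for g in match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(content):
    """Parse SRT text into a list of (start_sec, end_sec, text) tuples."""
    segments = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", content.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                # Join multi-line cue text into a single sentence for TTS.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                segments.append((_seconds(start), _seconds(end), text))
                break
    return segments
```

Each tuple gives the window into which the synthesized speech for that cue has to fit, e.g. `parse_srt(open("assets/head_of_fortuna_en.srt", encoding="utf-8").read())`.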

Joint Application of D3.1 & D3.2

Create a video from an image and text:

python tts.py --text sample.txt --image assets/image_from_T31.jpg

Landscape 2 Soundscape

# Loads image & text & sound-scene text and creates .mp4
python landscape2soundscape.py
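Conceptually, overlaying speech on a soundscape means adding a quieter background signal to the speech signal, sample by sample. The sketch below illustrates that idea with plain Python lists of float samples in [-1, 1]; it is an assumption-laden illustration, not the mixing code in landscape2soundscape.py. The `overlay` name, the looping of the background and the default gain of 0.3 are all hypothetical choices.

```python
def overlay(speech, soundscape, soundscape_gain=0.3):
    """Mix speech over a quieter, looped soundscape.

    Both inputs are sequences of float samples in [-1, 1] at the same
    sample rate (an assumption of this sketch).
    """
    if not soundscape:
        return list(speech)
    mixed = []
    for i, sample in enumerate(speech):
        # Loop the background if it is shorter than the speech.
        background = soundscape[i % len(soundscape)] * soundscape_gain
        # Clip to the valid range so the sum cannot overflow when saved as PCM.
        mixed.append(max(-1.0, min(1.0, sample + background)))
    return mixed
```

The output has the same length as the speech, so the resulting audio track ends when the spoken text ends.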

For SHIFT demo / Collaboration with SMB

  • YouTube Videos

01

02

10

Live Demo - Paplay

Flask

CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python live_api.py

Client (Ubuntu)

python live_demo.py  # prompts for text input and plays the soundscape