Committed by vtrungnhan9 · Commit 7bfe67a · verified · 1 Parent(s): 0c05877

update readme

  1. README.md +75 -1
README.md CHANGED
@@ -6,4 +6,78 @@ language:
  library_name: diffusers
  tags:
  - music
- ---
+ ---
+ # AudioLDM 2 Music for Zalo AI Challenge 2023
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ This checkpoint was obtained by fine-tuning AudioLDM 2 Music (https://huggingface.co/cvssp/audioldm2-music) on the Zalo AI Challenge 2023 dataset and MusicCaps (https://www.kaggle.com/datasets/googleai/musiccaps).
+
+ ## Uses
+
+ First, install the required packages:
+
+ ```
+ pip install --upgrade diffusers transformers accelerate
+ ```
+
+ ### Text-to-Audio
+
+ ```python
+ from diffusers import AudioLDM2Pipeline
+ import torch
+
+ repo_id = "vtrungnhan9/audioldm2-music-zac2023"
+ pipe = AudioLDM2Pipeline.from_pretrained(repo_id, torch_dtype=torch.float16)  # load weights in half precision
+ pipe = pipe.to("cuda")  # requires a CUDA-capable GPU
+
+ prompt = "This music is instrumental. The tempo is medium with synthesiser arrangements, digital drums and electronic music. The music is upbeat, pulsating, youthful, buoyant, exciting, punchy, psychedelic and has propulsive beats with a dance groove. This music is Techno Pop/EDM."
+ neg_prompt = "bad quality"
+ audio = pipe(prompt, negative_prompt=neg_prompt, num_inference_steps=200, audio_length_in_s=10.0, guidance_scale=10).audios[0]  # first generated waveform
+ ```
+
+ The resulting audio output can be saved as a .wav file:
+ ```python
+ import scipy.io.wavfile  # import the submodule explicitly; a bare `import scipy` may not load it
+
+ scipy.io.wavfile.write("techno.wav", rate=16000, data=audio)  # 16 kHz sample rate
+ ```
+
+ Alternatively, play it directly in a Jupyter Notebook / Google Colab:
+ ```python
+ from IPython.display import Audio
+
+ Audio(audio, rate=16000)
+ ```
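
Editor's note (not part of the commit): the `.wav` export in the README passes the float waveform straight to `scipy.io.wavfile.write`, which stores it as a floating-point WAV; some players may not open that format. The following is a minimal sketch, assuming the generated `audio` is a mono sequence of floats in [-1.0, 1.0] at 16 kHz, that writes 16-bit PCM using only the Python standard library. The helper name `save_wav_pcm16` and the sine tone standing in for the pipeline output are illustrative assumptions, not part of the README.

```python
import math
import os
import struct
import tempfile
import wave


def save_wav_pcm16(path, samples, rate=16000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Hypothetical helper for illustration; `samples` can be any iterable of
    floats, for example the NumPy array returned by the pipeline.
    """
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to the signed 16-bit range.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(bytes(frames))


# Stand-in for the generated `audio`: one second of a 440 Hz sine tone.
demo = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
out_path = os.path.join(tempfile.gettempdir(), "demo_pcm16.wav")
save_wav_pcm16(out_path, demo)
```

With the real pipeline output, the call would be `save_wav_pcm16("techno.wav", audio)`. The per-sample loop favors clarity over speed; for long clips a vectorized NumPy conversion would be the more practical choice.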
+
+ ## Training Details
+
+ ### Training Data
+
+ * The challenge dataset can be downloaded from https://challenge.zalo.ai/portal/background-music-generation
+ * MusicCaps can be downloaded from https://www.kaggle.com/datasets/googleai/musiccaps
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ For the training procedure, please refer to https://github.com/declare-lab/tango/blob/master/train.py
+
+ ## Citation
+
+ **BibTeX:**
+
+ ```
+ @article{liu2023audioldm2,
+   title={AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining},
+   author={Haohe Liu and Qiao Tian and Yi Yuan and Xubo Liu and Xinhao Mei and Qiuqiang Kong and Yuping Wang and Wenwu Wang and Yuxuan Wang and Mark D. Plumbley},
+   journal={arXiv preprint arXiv:2308.05734},
+   year={2023}
+ }
+ ```
+
+ ## Model Card Contact
+
+
+