RoyalCities
committed on
Update README.md
README.md
CHANGED
@@ -1,5 +1,673 @@
|
|
1 |
-
---
|
2 |
-
|
3 |
-
|
4 |
-
|
5 |
-
|
1 |
+
---
|
2 |
+
language: en
|
3 |
+
tags:
|
4 |
+
- audio
|
5 |
+
- music-generation
|
6 |
+
- sample-generation
|
7 |
+
- EDM
|
8 |
+
- Audio-to-Audio
|
9 |
+
- fine-tuning
|
10 |
+
- stable-audio
|
11 |
+
datasets:
|
12 |
+
- custom
|
13 |
+
model_name: Audialab - EDM Elements 2024
|
14 |
+
base_model: stabilityai/stable-audio-open-1.0
|
15 |
+
license: other
|
16 |
+
license_name: stabilityai-community-license
|
17 |
+
license_link: https://stability.ai/license
|
18 |
+
---
|
19 |
+
|
20 |
+
|
21 |
+
<center><img src="https://i.imgur.com/T8diXrL.jpeg" alt="DS Image" width="120%"></center>
|
22 |
+
|
23 |
+
<center>
|
24 |
+
<h2 style="font-size: 30px;"><u>Audialab - EDM Elements 2024</u></h2>
|
25 |
+
</center>
|
26 |
+
<center>
|
27 |
+
<h2 style="font-size: 19px;">Introduction</h2>
|
28 |
+
</center>
|
29 |
+
This model specializes in generating high-quality, key-locked and tempo-synced samples to support granular music production workflows. It features <b>high musicality, robust Audio-to-Audio capabilities, and can generate several effects with each sample.</b> The model can generate a wide variety of musical components, with a specialization in supersaw chord progressions, top-line melody leads, bell plucks, and bass pluck riffs. All output is BPM-synced and key-locked to any note within the 12-tone chromatic scale, in both major and minor keys. Developed over several weeks, this model was trained on a custom internal dataset and features several distinct sound types:
|
30 |
+
|
31 |
+
<table style="width: 100%; border-collapse: collapse; margin: 20px 0;">
|
32 |
+
<tr>
|
33 |
+
<th style="border: 1px solid #000; padding: 12px; text-align: left; background-color: #f2f2f2;">Sound Description</th>
|
34 |
+
<th style="border: 1px solid #000; padding: 12px; text-align: center; background-color: #f2f2f2;">Audio</th>
|
35 |
+
</tr>
|
36 |
+
<tr>
|
37 |
+
<td style="border: 1px solid #000; padding: 12px; font-weight: bold;">1. Sine / Bell Plucks</td>
|
38 |
+
<td style="border: 1px solid #000; padding: 12px; text-align: center;">
|
39 |
+
<audio controls style="width: 200px;">
|
40 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bell%2C%20Pluck%2C%20chord%20progression%20with%20slow%20melody%2C%20B%20minor%2C%20high%20reverb%2C%20rising%20low-pass%2C%208%20bars%2C%20140BPM%2C%202.mp3" type="audio/mpeg">
|
41 |
+
Your browser does not support the audio element.
|
42 |
+
</audio>
|
43 |
+
</td>
|
44 |
+
</tr>
|
45 |
+
<tr>
|
46 |
+
<td style="border: 1px solid #000; padding: 12px; font-weight: bold;">2. Lead Square Legato Synth</td>
|
47 |
+
<td style="border: 1px solid #000; padding: 12px; text-align: center;">
|
48 |
+
<audio controls style="width: 200px;">
|
49 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/legato_ex_1.mp3" type="audio/mpeg">
|
50 |
+
Your browser does not support the audio element.
|
51 |
+
</audio>
|
52 |
+
</td>
|
53 |
+
</tr>
|
54 |
+
<tr>
|
55 |
+
<td style="border: 1px solid #000; padding: 12px; font-weight: bold;">3. Lead Square Warm Synth</td>
|
56 |
+
<td style="border: 1px solid #000; padding: 12px; text-align: center;">
|
57 |
+
<audio controls style="width: 200px;">
|
58 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Synth%2C%20saw%2C%20fast%20speed%2C%20falling%20arp%2C%20medium%20reverb%2C%20falling%20high-cut%2C%20C%20major%2C%204%2C%20150%2C%201.mp3" type="audio/mpeg">
|
59 |
+
Your browser does not support the audio element.
|
60 |
+
</audio>
|
61 |
+
</td>
|
62 |
+
</tr>
|
63 |
+
<tr>
|
64 |
+
<td style="border: 1px solid #000; padding: 12px; font-weight: bold;">4. Bass Plucks</td>
|
65 |
+
<td style="border: 1px solid #000; padding: 12px; text-align: center;">
|
66 |
+
<audio controls style="width: 200px;">
|
67 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bass%2C%20Bounce%2C%20Catchy%20Melody%2C%20G%20minor%2C%208%20bars%2C%20110BPM%2C%203.mp3" type="audio/mpeg">
|
68 |
+
Your browser does not support the audio element.
|
69 |
+
</audio>
|
70 |
+
</td>
|
71 |
+
</tr>
|
72 |
+
<tr>
|
73 |
+
<td style="border: 1px solid #000; padding: 12px; font-weight: bold;">5. Supersaw Leads & Chord Progressions</td>
|
74 |
+
<td style="border: 1px solid #000; padding: 12px; text-align: center;">
|
75 |
+
<audio controls style="width: 200px;">
|
76 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Supersaw%2C%20dance%20chord%20progression%2C%20top%20triplet%20melody%2C%20F%23%20minor%2C%20medium%20reverb%2C%2C%208%20bars%2C%20130BPM%2C%202.mp3" type="audio/mpeg">
|
77 |
+
Your browser does not support the audio element.
|
78 |
+
</audio>
|
79 |
+
</td>
|
80 |
+
</tr>
|
81 |
+
</table>
|
82 |
+
|
83 |
+
Furthermore, <b>the model can generate multiple post-processing effects,</b> with <i>several</i> levels of control from the prompt alone. All effects can be prompted <b>individually or together.</b>
|
84 |
+
|
85 |
+
- **1. Reverb** - Small Reverb, Medium Reverb, High Reverb
|
86 |
+
- **2. EQ Sweeps** - Rising Low-Pass, Falling High-Cut
|
87 |
+
- **3. Gate Effect** - Quarter-Beat Gate, Half-Beat Gate
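
The effect terms above can be collected into a single prompt fragment. The following is a minimal sketch, assuming at most one option per effect category is used at a time (this README does not say whether, for example, both EQ sweeps can be combined). `fx_terms` and its lookup tables are hypothetical names for illustration, not part of any Audialab or Stable Audio tooling.

```python
# Hypothetical lookup tables built from the effect names listed above.
REVERB = {"small": "Small Reverb", "medium": "Medium Reverb", "high": "High Reverb"}
SWEEP = {"rising": "Rising Low-Pass", "falling": "Falling High-Cut"}
GATE = {"quarter": "Quarter-Beat Gate", "half": "Half-Beat Gate"}


def fx_terms(reverb=None, sweep=None, gate=None):
    """Return the prompt terms for the requested effects, in list order.

    Assumption: one option per category. Passing None skips that category.
    """
    terms = []
    for choice, table in ((reverb, REVERB), (sweep, SWEEP), (gate, GATE)):
        if choice is None:
            continue
        if choice not in table:
            raise ValueError(f"unknown effect option: {choice!r}")
        terms.append(table[choice])
    return terms


# Example: a high reverb together with a rising low-pass sweep.
print(", ".join(fx_terms(reverb="high", sweep="rising")))
```

The returned terms can be joined with commas and placed in the FX position of a prompt.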
|
88 |
+
|
89 |
+
<b>The model has strong Audio-to-Audio / Style Transfer capabilities.</b> For example, you can upload a .wav sample and have the model turn it into, say, a Supersaw version with a rising low-pass.
|
90 |
+
|
91 |
+
<div style="text-align: center; margin: 20px 0;">
|
92 |
+
<table style="width: 100%; border-collapse: collapse; margin: 0 auto;">
|
93 |
+
<thead>
|
94 |
+
<tr>
|
95 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Prompt</th>
|
96 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Settings</th>
|
97 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Before</th>
|
98 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">After</th>
|
99 |
+
</tr>
|
100 |
+
</thead>
|
101 |
+
<tbody>
|
102 |
+
<tr>
|
103 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
104 |
+
Supersaw, high reverb, rising low-pass
|
105 |
+
</td>
|
106 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
107 |
+
<b>CFG:</b> 10.0 / <b>Noise:</b> 1.29
|
108 |
+
</td>
|
109 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
110 |
+
<audio controls style="width: 150px;">
|
111 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/a2a_1_in.mp3" type="audio/mpeg">
|
112 |
+
Your browser does not support the audio element.
|
113 |
+
</audio>
|
114 |
+
</td>
|
115 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
116 |
+
<audio controls style="width: 150px;">
|
117 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/a2a_1_in_130a2a_out_%20Supersaw%2C%20high%20reverb%2C%20rising%20low-pass%20CFG%2010%20noise%201_29.mp3" type="audio/mpeg">
|
118 |
+
Your browser does not support the audio element.
|
119 |
+
</audio>
|
120 |
+
</td>
|
121 |
+
</tr>
|
122 |
+
</tbody>
|
123 |
+
</table>
|
124 |
+
</div>
|
125 |
+
|
126 |
+
<center>
|
127 |
+
<h2 style="font-size: 19px;">Model Features</h2>
|
128 |
+
</center>
|
129 |
+
|
130 |
+
- **Multiple Types of Sample Generation:** - Outputs multiple types of high-quality, EDM-centric samples with several variations on melody and chord progressions.
|
131 |
+
- **Dynamic FX Chain:** - Add multiple effects based on the prompt alone (prompts above), e.g. add "Medium Reverb" for a medium reverb effect on the output.
|
132 |
+
- **Tonal Versatility:** - Generates stems in any key across the 12-tone chromatic scale, in both major and minor scales.
|
133 |
+
- **Triplet Time Signature Support:** - Add "triplets" to any prompt and get triplet outputs.
|
134 |
+
- **AI Style Transfer / Audio to Audio:** - The model has robust AI Style Transfer capabilities.
|
135 |
+
- **Speed Controls Independent of BPM:** - The model can be prompted with "Slow Speed", "Medium Speed" and "Fast Speed", which are independent of BPM. For example, ask for an arp at "fast speed" at 140BPM versus "slow speed" at 140BPM and the model will subdivide the notes accordingly while staying at the same BPM.
|
136 |
+
- **Simplified Scale Notation:** - Scales are written using <b><i>sharps only</i></b> in the following format:
|
137 |
+
|
138 |
+
<pre>
|
139 |
+
<b>Minor Scales</b>
|
140 |
+
A minor, A# minor, B minor, C minor, C# minor, D minor, D# minor,
|
141 |
+
E minor, F minor, F# minor, G minor, G# minor
|
142 |
+
|
143 |
+
<b>Major Scales</b>
|
144 |
+
A major, A# major, B major, C major, C# major, D major, D# major,
|
145 |
+
E major, F major, F# major, G major, G# major
|
146 |
+
</pre>
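
Because keys are written with sharps only, a key that is usually spelled with a flat (for example Bb minor) needs to be rewritten before prompting. The sketch below shows one way to do that with standard enharmonic equivalents. `normalize_key` is a hypothetical helper, not part of the model or its tooling.

```python
# Enharmonic rewrites from flat spellings to the sharps-only notation above.
FLAT_TO_SHARP = {
    "Ab": "G#", "Bb": "A#", "Cb": "B", "Db": "C#",
    "Eb": "D#", "Fb": "E", "Gb": "F#",
}
VALID_ROOTS = ("A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#")


def normalize_key(key: str) -> str:
    """Turn e.g. 'Bb minor' into 'A# minor'; raise ValueError if unrecognized."""
    parts = key.strip().split()
    if len(parts) != 2:
        raise ValueError(f"expected '<root> <major|minor>', got {key!r}")
    root, quality = parts[0].replace("♭", "b"), parts[1].lower()
    root = root[:1].upper() + root[1:]          # 'f#' -> 'F#', 'bb' -> 'Bb'
    root = FLAT_TO_SHARP.get(root, root)        # flats become sharps
    if root not in VALID_ROOTS or quality not in ("major", "minor"):
        raise ValueError(f"unsupported key: {key!r}")
    return f"{root} {quality}"


print(normalize_key("Bb minor"))
```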
|
147 |
+
|
148 |
+
|
149 |
+
<center>
|
150 |
+
<h2 style="font-size: 19px;">Training Methodology</h2>
|
151 |
+
</center>
|
152 |
+
|
153 |
+
This model was designed to understand and generate several types of samples:
|
154 |
+
1. **Chord Progression Samples**
|
155 |
+
2. **Chord Progression with Melody Samples**
|
156 |
+
3. **Melody Only Samples**
|
157 |
+
4. **Supersaw Dance Chord Progressions**
|
158 |
+
5. **Bass Plucks / Riffs**
|
159 |
+
6. **Arps**
|
160 |
+
|
161 |
+
|
162 |
+
Exposure to varied musical motifs and distinct sample types gives the model strong musicality and melodic structure, with output samples ready to use in music production.
|
163 |
+
|
164 |
+
<center>
|
165 |
+
<h2 style="font-size: 24px;"><u>Prompt Structure</u></h2>
|
166 |
+
</center>
|
167 |
+
|
168 |
+
This model works best with the Audialab Engine (VST) or the RC GitHub fork. The RC GitHub fork also features a randomized prompt button tuned to this model's metadata.
|
169 |
+
|
170 |
+
If you wish to prompt outside of these interfaces, use the following format for best results:
|
171 |
+
<pre><b>
|
172 |
+
[Sound Type], [Modifier][Chord Progression], [Modifier][Melody Type], [Key], [FX], [BPM], [Bar Count]
|
173 |
+
</b></pre>
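
The template above can be assembled programmatically. This is a minimal sketch: `build_prompt` and its parameter names are hypothetical, and optional fields are simply skipped when empty. The field order follows the template line. The audio examples in this README place the bar count before the BPM, so a `bars_first` flag covers that variant.

```python
def build_prompt(sound_type, key, bpm, bars,
                 progression=None, melody=None, fx=(), bars_first=False):
    """Join the prompt fields with commas, skipping any that are empty."""
    tempo = [f"{bpm}BPM", f"{bars} bars"]
    if bars_first:
        tempo.reverse()  # "8 bars, 130BPM" instead of "130BPM, 8 bars"
    fields = [sound_type, progression, melody, key, *fx, *tempo]
    return ", ".join(field for field in fields if field)


print(build_prompt(
    "Supersaw", "F# minor", 130, 8,
    progression="dance chord progression",
    melody="top triplet melody",
    fx=["medium reverb"],
))
```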
|
174 |
+
|
175 |
+
[Sound Type Prompts]
|
176 |
+
- **1. Bell Plucks** - 'Pluck, Sine, Bright, Clean, Bell'
|
177 |
+
- **2. Lead Square Legato Synth** - 'Lead, Square, Synth, Buzzy, Legato'
|
178 |
+
- **3. Lead Square Warm Synth** - 'Lead, Saw, Synth, Warm, Supersaw'
|
179 |
+
- **4. Pluck Bass** - 'Bass, Punchy, Pluck, Clean, Sub, Sine'
|
180 |
+
- **5. Supersaw Leads & Chord Progressions** - 'Supersaw, Synth, Warm, Saw'
|
181 |
+
|
182 |
+
<b><u>Major prompt terms and their effects</u></b>
|
183 |
+
- **Chord Progression** - produces chord progressions
|
184 |
+
- **Arp** - produces arps
|
185 |
+
- **Melody** - adds melodies
|
186 |
+
- **Triplets** - outputs triplet time
|
187 |
+
- **Bounce** - more syncopation / off-beat rhythm
|
188 |
+
- **Epic** - encourages more interesting melodic output
|
189 |
+
- **Simple** - slower / simpler melodies / chord progressions
|
190 |
+
- **Slow speed** - slower chord progressions, arps and melodies
|
191 |
+
- **Medium speed** - encourages 4/4 chord changes or melodies
|
192 |
+
- **Fast speed** - higher subdivisions / faster arps and melodies
|
193 |
+
- **Complex** - adds complex melodies or chord structure.
|
194 |
+
- **Rising** - encourages rising arps / melodies
|
195 |
+
- **Falling** - encourages falling arps / melodies
|
196 |
+
- **See random prompt document for further terms / examples** [HERE](./prompt_list_examples.md)
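
For experimentation outside the provided interfaces, the terms above can be sampled at random. The sketch below is only an illustration: it is not the randomized prompt button from the RC GitHub fork, `random_prompt` is a hypothetical name, and pairing any modifier with any structure term is an assumption (some combinations may work better than others). The sound types, keys, BPM values and bar counts are the ones listed in this README.

```python
import random

SOUND_TYPES = [
    "Pluck, Sine, Bright, Clean, Bell",
    "Lead, Square, Synth, Buzzy, Legato",
    "Lead, Saw, Synth, Warm, Supersaw",
    "Bass, Punchy, Pluck, Clean, Sub, Sine",
    "Supersaw, Synth, Warm, Saw",
]
MODIFIERS = ["bounce", "epic", "simple", "complex", "slow speed",
             "medium speed", "fast speed", "triplets", "rising", "falling"]
STRUCTURES = ["chord progression", "arp", "melody"]
ROOTS = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]
BPMS = [100, 110, 120, 128, 130, 140, 150]
BARS = [4, 8]


def random_prompt(rng: random.Random) -> str:
    """Sample one value per field and join them in template order."""
    parts = [
        rng.choice(SOUND_TYPES),
        f"{rng.choice(MODIFIERS)} {rng.choice(STRUCTURES)}",
        f"{rng.choice(ROOTS)} {rng.choice(['major', 'minor'])}",
        f"{rng.choice(BPMS)}BPM",
        f"{rng.choice(BARS)} bars",
    ]
    return ", ".join(parts)


# A seeded generator makes the sampled prompt reproducible.
print(random_prompt(random.Random(0)))
```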
|
197 |
+
|
198 |
+
<center>
|
199 |
+
<h2 style="font-size: 20px;"><u>Prompt Examples with Audio</u></h2>
|
200 |
+
</center>
|
201 |
+
<!-- Audio Examples Table -->
|
202 |
+
<div style="text-align: center; margin: 20px 0;">
|
203 |
+
<table style="width: 100%; border-collapse: collapse; margin: 0 auto;">
|
204 |
+
<thead>
|
205 |
+
<tr>
|
206 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: left;">Prompt</th>
|
207 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Example 1</th>
|
208 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Example 2</th>
|
209 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Example 3</th>
|
210 |
+
</tr>
|
211 |
+
</thead>
|
212 |
+
<tbody>
|
213 |
+
<tr>
|
214 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
215 |
+
<b>Bell pluck,</b> chord progression, top catchy melody, E major, <b>high reverb,</b> 8 bars, 100BPM
|
216 |
+
</td>
|
217 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
218 |
+
<audio controls style="width: 150px;">
|
219 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Bell%20pluck%2C%20chord%20progression%2C%20top%20catchy%20melody%2C%20E%20major%2C%20%20high%20reverb%2C%208%20bars%2C%20100BPM%2C%201.mp3" type="audio/mpeg">
|
220 |
+
Your browser does not support the audio element.
|
221 |
+
</audio>
|
222 |
+
</td>
|
223 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
224 |
+
<audio controls style="width: 150px;">
|
225 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Bell%20pluck%2C%20chord%20progression%2C%20top%20catchy%20melody%2C%20E%20major%2C%20%20high%20reverb%2C%208%20bars%2C%20100BPM%2C%202.mp3" type="audio/mpeg">
|
226 |
+
Your browser does not support the audio element.
|
227 |
+
</audio>
|
228 |
+
</td>
|
229 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
230 |
+
<audio controls style="width: 150px;">
|
231 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Bell%20pluck%2C%20chord%20progression%2C%20top%20catchy%20melody%2C%20E%20major%2C%20%20high%20reverb%2C%208%20bars%2C%20100BPM%2C%203.mp3" type="audio/mpeg">
|
232 |
+
Your browser does not support the audio element.
|
233 |
+
</audio>
|
234 |
+
</td>
|
235 |
+
</tr>
|
236 |
+
<tr>
|
237 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
238 |
+
<b>Saw, supersaw,</b> chord progression, top melody, E minor, <b>rising low-pass,</b> 4 bars, 128BPM
|
239 |
+
</td>
|
240 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
241 |
+
<audio controls style="width: 150px;">
|
242 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Saw%2C%20supersaw%2C%20chord%20progression%2C%20top%20melody%2C%20E%20minor%2C%20rising%20low-pass%2C%20%2C%204%20bars%2C%20128BPM%2C%201.mp3" type="audio/mpeg">
|
243 |
+
Your browser does not support the audio element.
|
244 |
+
</audio>
|
245 |
+
</td>
|
246 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
247 |
+
<audio controls style="width: 150px;">
|
248 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Saw%2C%20supersaw%2C%20chord%20progression%2C%20top%20melody%2C%20E%20minor%2C%20rising%20low-pass%2C%20%2C%204%20bars%2C%20128BPM%2C%202.mp3" type="audio/mpeg">
|
249 |
+
Your browser does not support the audio element.
|
250 |
+
</audio>
|
251 |
+
</td>
|
252 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
253 |
+
<audio controls style="width: 150px;">
|
254 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Saw%2C%20supersaw%2C%20chord%20progression%2C%20top%20melody%2C%20E%20minor%2C%20rising%20low-pass%2C%20%2C%204%20bars%2C%20128BPM%2C%203.mp3" type="audio/mpeg">
|
255 |
+
Your browser does not support the audio element.
|
256 |
+
</audio>
|
257 |
+
</td>
|
258 |
+
</tr>
|
259 |
+
<tr>
|
260 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
261 |
+
<b>Sine, Bass,</b> Bounce, Catchy Melody, G minor, 8 bars, 110BPM
|
262 |
+
</td>
|
263 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
264 |
+
<audio controls style="width: 150px;">
|
265 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bass%2C%20Bounce%2C%20Catchy%20Melody%2C%20G%20minor%2C%208%20bars%2C%20110BPM%2C%201.mp3" type="audio/mpeg">
|
266 |
+
Your browser does not support the audio element.
|
267 |
+
</audio>
|
268 |
+
</td>
|
269 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
270 |
+
<audio controls style="width: 150px;">
|
271 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bass%2C%20Bounce%2C%20Catchy%20Melody%2C%20G%20minor%2C%208%20bars%2C%20110BPM%2C%202.mp3" type="audio/mpeg">
|
272 |
+
Your browser does not support the audio element.
|
273 |
+
</audio>
|
274 |
+
</td>
|
275 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
276 |
+
<audio controls style="width: 150px;">
|
277 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bass%2C%20Bounce%2C%20Catchy%20Melody%2C%20G%20minor%2C%208%20bars%2C%20110BPM%2C%203.mp3" type="audio/mpeg">
|
278 |
+
Your browser does not support the audio element.
|
279 |
+
</audio>
|
280 |
+
</td>
|
281 |
+
</tr>
|
282 |
+
<tr>
|
283 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
284 |
+
<b>Sine, Bell, Pluck,</b> chord progression with slow melody, B minor, <b>high reverb, rising low-pass,</b> 8 bars, 140BPM
|
285 |
+
</td>
|
286 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
287 |
+
<audio controls style="width: 150px;">
|
288 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bell%2C%20Pluck%2C%20chord%20progression%20with%20slow%20melody%2C%20B%20minor%2C%20high%20reverb%2C%20rising%20low-pass%2C%208%20bars%2C%20140BPM%2C%201.mp3" type="audio/mpeg">
|
289 |
+
Your browser does not support the audio element.
|
290 |
+
</audio>
|
291 |
+
</td>
|
292 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
293 |
+
<audio controls style="width: 150px;">
|
294 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bell%2C%20Pluck%2C%20chord%20progression%20with%20slow%20melody%2C%20B%20minor%2C%20high%20reverb%2C%20rising%20low-pass%2C%208%20bars%2C%20140BPM%2C%202.mp3" type="audio/mpeg">
|
295 |
+
Your browser does not support the audio element.
|
296 |
+
</audio>
|
297 |
+
</td>
|
298 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
299 |
+
<audio controls style="width: 150px;">
|
300 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Sine%2C%20Bell%2C%20Pluck%2C%20chord%20progression%20with%20slow%20melody%2C%20B%20minor%2C%20high%20reverb%2C%20rising%20low-pass%2C%208%20bars%2C%20140BPM%2C%203.mp3" type="audio/mpeg">
|
301 |
+
Your browser does not support the audio element.
|
302 |
+
</audio>
|
303 |
+
</td>
|
304 |
+
</tr>
|
305 |
+
<tr>
|
306 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
307 |
+
<b>Supersaw,</b> dance chord progression, top triplet melody, F# minor, <b>medium reverb,</b> 8 bars, 130BPM
|
308 |
+
</td>
|
309 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
310 |
+
<audio controls style="width: 150px;">
|
311 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Supersaw%2C%20dance%20chord%20progression%2C%20top%20triplet%20melody%2C%20F%23%20minor%2C%20medium%20reverb%2C%2C%208%20bars%2C%20130BPM%2C%201.mp3" type="audio/mpeg">
|
312 |
+
Your browser does not support the audio element.
|
313 |
+
</audio>
|
314 |
+
</td>
|
315 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
316 |
+
<audio controls style="width: 150px;">
|
317 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Supersaw%2C%20dance%20chord%20progression%2C%20top%20triplet%20melody%2C%20F%23%20minor%2C%20medium%20reverb%2C%2C%208%20bars%2C%20130BPM%2C%202.mp3" type="audio/mpeg">
|
318 |
+
Your browser does not support the audio element.
|
319 |
+
</audio>
|
320 |
+
</td>
|
321 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
322 |
+
<audio controls style="width: 150px;">
|
323 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Supersaw%2C%20dance%20chord%20progression%2C%20top%20triplet%20melody%2C%20F%23%20minor%2C%20medium%20reverb%2C%2C%208%20bars%2C%20130BPM%2C%203.mp3" type="audio/mpeg">
|
324 |
+
Your browser does not support the audio element.
|
325 |
+
</audio>
|
326 |
+
</td>
|
327 |
+
</tr>
|
328 |
+
<tr>
|
329 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
330 |
+
<b>Synth, saw,</b> fast speed, falling arp, <b>medium reverb, falling high-cut,</b> C major, 4 bars, 150 BPM
|
331 |
+
</td>
|
332 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
333 |
+
<audio controls style="width: 150px;">
|
334 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Synth%2C%20saw%2C%20fast%20speed%2C%20falling%20arp%2C%20medium%20reverb%2C%20falling%20high-cut%2C%20C%20major%2C%204%2C%20150%2C%201.mp3" type="audio/mpeg">
|
335 |
+
Your browser does not support the audio element.
|
336 |
+
</audio>
|
337 |
+
</td>
|
338 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
339 |
+
<audio controls style="width: 150px;">
|
340 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Synth%2C%20saw%2C%20fast%20speed%2C%20falling%20arp%2C%20medium%20reverb%2C%20falling%20high-cut%2C%20C%20major%2C%204%2C%20150%2C%202.mp3" type="audio/mpeg">
|
341 |
+
Your browser does not support the audio element.
|
342 |
+
</audio>
|
343 |
+
</td>
|
344 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
345 |
+
<audio controls style="width: 150px;">
|
346 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/Synth%2C%20saw%2C%20fast%20speed%2C%20falling%20arp%2C%20medium%20reverb%2C%20falling%20high-cut%2C%20C%20major%2C%204%2C%20150%2C%203.mp3" type="audio/mpeg">
|
347 |
+
Your browser does not support the audio element.
|
348 |
+
</audio>
|
349 |
+
</td>
|
350 |
+
</tr>
|
351 |
+
<tr>
|
352 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
353 |
+
<b>legato, square,</b> catchy melody, <b>high reverb,</b> E minor, 8 bars, 130BPM
|
354 |
+
</td>
|
355 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
356 |
+
<audio controls style="width: 150px;">
|
357 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/legato_ex_1.mp3" type="audio/mpeg">
|
358 |
+
Your browser does not support the audio element.
|
359 |
+
</audio>
|
360 |
+
</td>
|
361 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
362 |
+
<audio controls style="width: 150px;">
|
363 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/legato_ex_2.mp3" type="audio/mpeg">
|
364 |
+
Your browser does not support the audio element.
|
365 |
+
</audio>
|
366 |
+
</td>
|
367 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
368 |
+
<audio controls style="width: 150px;">
|
369 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/legato_ex_3.mp3" type="audio/mpeg">
|
370 |
+
Your browser does not support the audio element.
|
371 |
+
</audio>
|
372 |
+
</td>
|
373 |
+
</tr>
|
374 |
+
<tr>
|
375 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: left;">
|
376 |
+
<b>supersaw,</b> complex, epic, arp, D# minor, 8 bars, 140BPM
|
377 |
+
</td>
|
378 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
379 |
+
<audio controls style="width: 150px;">
|
380 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/supersaw%2C%20complex%2C%20epic%2C%20arp%2C%20D%23%20minor%2C%208%20bars%2C%20140BPM%2C%201.mp3" type="audio/mpeg">
|
381 |
+
Your browser does not support the audio element.
|
382 |
+
</audio>
|
383 |
+
</td>
|
384 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
385 |
+
<audio controls style="width: 150px;">
|
386 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/supersaw%2C%20complex%2C%20epic%2C%20arp%2C%20D%23%20minor%2C%208%20bars%2C%20140BPM%2C%202.mp3" type="audio/mpeg">
|
387 |
+
Your browser does not support the audio element.
|
388 |
+
</audio>
|
389 |
+
</td>
|
390 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
391 |
+
<audio controls style="width: 150px;">
|
392 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/supersaw%2C%20complex%2C%20epic%2C%20arp%2C%20D%23%20minor%2C%208%20bars%2C%20140BPM%2C%203.mp3" type="audio/mpeg">
|
393 |
+
Your browser does not support the audio element.
|
394 |
+
</audio>
|
395 |
+
</td>
|
396 |
+
</tr>
|
397 |
+
</tbody>
|
398 |
+
</table>
|
399 |
+
</div>
|
400 |
+
|
401 |
+
<center>
|
402 |
+
<h2 style="font-size: 20px;"><u>Audio to Audio / Style Transfer Showcase</u></h2>
|
403 |
+
</center>
|
404 |
+
|
405 |
+
<div style="text-align: center; margin: 20px 0;">
|
406 |
+
<table style="width: 100%; border-collapse: collapse; margin: 0 auto;">
|
407 |
+
<thead>
|
408 |
+
<tr>
|
409 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Prompt</th>
|
410 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Settings</th>
|
411 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">Before</th>
|
412 |
+
<th style="border: 1px solid #000; padding: 8px; text-align: center;">After</th>
|
413 |
+
</tr>
|
414 |
+
</thead>
|
415 |
+
<tbody>
|
416 |
+
<tr>
|
417 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
418 |
+
supersaw, medium reverb
|
419 |
+
</td>
|
420 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
421 |
+
<b>CFG:</b> 9.0 / <b>Noise:</b> 1.16
|
422 |
+
</td>
|
423 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
424 |
+
<audio controls style="width: 150px;">
|
425 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/in_a2a_example_guitar_130_4.mp3" type="audio/mpeg">
|
426 |
+
Your browser does not support the audio element.
|
427 |
+
</audio>
|
428 |
+
</td>
|
429 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
430 |
+
<audio controls style="width: 150px;">
|
431 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/in_a2a_example_guitar_130_4_output_in_a2a_out_guitar_supersaw%2C%20medium%20reverb%2C%20cfg%209%20noise%201.16%2C.mp3" type="audio/mpeg">
|
432 |
+
Your browser does not support the audio element.
|
433 |
+
</audio>
|
434 |
+
</td>
|
435 |
+
</tr>
|
436 |
+
<tr>
|
437 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
438 |
+
Pluck Bell, Medium Reverb
|
439 |
+
</td>
|
440 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
441 |
+
<b>CFG:</b> 8.2 / <b>Noise:</b> 1.21
|
442 |
+
</td>
|
443 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
444 |
+
<audio controls style="width: 150px;">
|
445 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/a2a_example_piano_150_8.mp3" type="audio/mpeg">
|
446 |
+
Your browser does not support the audio element.
|
447 |
+
</audio>
|
448 |
+
</td>
|
449 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
450 |
+
<audio controls style="width: 150px;">
|
451 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/a2a_out_piano_%20150%20pluck%2C%20bell%2C%20medium%20reverb%20cfg%208_2%20noise%201_21_out.mp3" type="audio/mpeg">
|
452 |
+
Your browser does not support the audio element.
|
453 |
+
</audio>
|
454 |
+
</td>
|
455 |
+
</tr>
|
456 |
+
<tr>
|
457 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
458 |
+
Supersaw, high reverb, rising low-pass
|
459 |
+
</td>
|
460 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
461 |
+
<b>CFG:</b> 10 / <b>Noise:</b> 1.29
|
462 |
+
</td>
|
463 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
464 |
+
<audio controls style="width: 150px;">
|
465 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/a2a_1_in.mp3" type="audio/mpeg">
|
466 |
+
Your browser does not support the audio element.
|
467 |
+
</audio>
|
468 |
+
</td>
|
469 |
+
<td style="border: 1px solid #000; padding: 8px; text-align: center;">
|
470 |
+
<audio controls style="width: 150px;">
|
471 |
+
<source src="https://huggingface.co/adlb/Audialab_EDM_Elements/resolve/main/audio_examples/a2a_1_in_130a2a_out_%20Supersaw%2C%20high%20reverb%2C%20rising%20low-pass%20CFG%2010%20noise%201_29.mp3" type="audio/mpeg">
|
472 |
+
Your browser does not support the audio element.
|
473 |
+
</audio>
|
474 |
+
</td>
|
475 |
+
</tr>
|
476 |
+
</tbody>
|
477 |
+
</table>
|
478 |
+
</div>
|
479 |
+
|
480 |
+
|
481 |
+
|
482 |
+
### Prompt Examples
|
483 |
+
You can find a handy list of starter prompts [HERE](./prompt_list_examples.md)
|
484 |
+
|
485 |
+
|
486 |
+
#### BPMs/Bars:
|
487 |
+
|
488 |
+
The BPMs range from 100BPM up to 150BPM. The main tempos are **100BPM, 110BPM, 120BPM, 128BPM, 130BPM, 140BPM, 150BPM**.
|
489 |
+
|
490 |
+
There are 2 bar settings: **4 bars** and **8 bars**.
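
As a rough guide to how long a loop runs at each setting, the arithmetic can be sketched as below. This assumes a 4/4 time signature (four beats per bar), which is not stated above, and `loop_seconds` is a hypothetical helper name used only for illustration.

```python
def loop_seconds(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    """Length in seconds of `bars` bars at `bpm`.

    beats_per_bar=4 is an assumption (4/4 time), not something the model card states.
    """
    return bars * beats_per_bar * 60.0 / bpm


# Print the loop length for every listed tempo and both bar settings.
for bpm in (100, 110, 120, 128, 130, 140, 150):
    four = loop_seconds(bpm, 4)
    eight = loop_seconds(bpm, 8)
    print(f"{bpm} BPM: 4 bars = {four:.2f}s, 8 bars = {eight:.2f}s")
```

Under that assumption, the longest case (8 bars at 100BPM) comes to 19.2 seconds and the shortest (4 bars at 150BPM) to 6.4 seconds.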
|
491 |
+
|
492 |
+
<center>
|
493 |
+
<h2 style="font-size: 24px;"><u>Usage Guide</u></h2>
|
494 |
+
</center>
|
495 |
+
|
496 |
+
<center>
|
497 |
+
<h2 style="font-size: 22px;"><u>VST Support</u></h2>
|
498 |
+
</center>
|
499 |
+
|
500 |
+
<b>
|
501 |
+
<p align="center" style="font-size: 18px;">
|
502 |
+
This model has direct VST compatibility in <a href="https://audialab.com/products/deep-sampler-2/" style="font-size: 20px;">Audialab Engine.</a></p>
|
503 |
+
</b>
|
504 |
+
|
505 |
+
|
506 |
+
<center><img src="https://i.imgur.com/q7VqxTZ.jpeg" alt="DS Image" width="60%"></center>
|
507 |
+
|
508 |
+
|
509 |
+
<center>
|
510 |
+
<h2 style="font-size: 19px;"><u>Gradio Interfaces</u></h2>
|
511 |
+
</center>
|
512 |
+
|
513 |
+
|
514 |
+
<b>
|
515 |
+
<p align="center" style="font-size: 20px;">
|
516 |
+
You can find a <a href="https://github.com/RoyalCities/RC-stable-audio-tools" style="font-size: 20px;">direct link to a custom GitHub interface here.</a>
|
517 |
+
</p>
|
518 |
+
|
519 |
+
<p align="center" style="font-size: 20px;">
|
520 |
+
If you wish to use the original Stable Audio GitHub repository, you can <a href="https://github.com/Stability-AI/stable-audio-tools" style="font-size: 20px;">follow this link.</a>
|
521 |
+
</p>
|
522 |
+
</b>
|
523 |
+
|
524 |
+
<p align="center" style="font-size: 18px; line-height: 1.5;">
|
525 |
+
<b>
|
526 |
+
You will find 2 checkpoint files in the repo:<br>
|
527 |
+
Audialab_EDM_Elements.ckpt - Full 32-bit model<br>
|
528 |
+
and<br>
|
529 |
+
Audialab_EDM_Elements_Small.ckpt - 16-bit low VRAM version<br>
|
530 |
+
along with the associated config file - model_config.json<br>
|
531 |
+
<br>
|
532 |
+
To use the model, choose <i>either</i> the full model or the 16-bit quantized model. Then place the .ckpt file and the config .json inside their own sub-folder in the "models" folder and launch the Gradio interface.
|
533 |
+
</b>
|
534 |
+
</p>
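
The folder layout described above can be sketched in Python as follows. This is a minimal illustration, not part of either Gradio project: `install_checkpoint`, the `subfolder` default, and both directory arguments are hypothetical names, and the sketch only assumes that the interface reads from a "models" folder as stated above.

```python
import shutil
from pathlib import Path


def install_checkpoint(
    download_dir: str,
    tools_dir: str,
    ckpt_name: str = "Audialab_EDM_Elements_Small.ckpt",
    subfolder: str = "Audialab_EDM_Elements",  # hypothetical sub-folder name
) -> Path:
    """Copy one checkpoint plus model_config.json into <tools_dir>/models/<subfolder>."""
    target = Path(tools_dir) / "models" / subfolder
    target.mkdir(parents=True, exist_ok=True)
    # Only one of the two .ckpt files is needed, alongside the shared config.
    for name in (ckpt_name, "model_config.json"):
        shutil.copy2(Path(download_dir) / name, target / name)
    return target


# Example call (both paths are placeholders for your own locations):
# install_checkpoint("downloads", "RC-stable-audio-tools")
```

Pass `ckpt_name="Audialab_EDM_Elements.ckpt"` to install the full 32-bit model instead of the low-VRAM version.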
|
535 |
+
|
536 |
+
|
537 |
+
<center>
|
538 |
+
<h2 style="font-size: 24px;"><u>Dataset Breakdown</u></h2>
|
539 |
+
</center>
|
540 |
+
|
541 |
+
<center>
|
542 |
+
<h2 style="font-size: 19px;">Overview</h2>
|
543 |
+
</center>
|
544 |
+
|
545 |
+
- **Total .wav files**: 606,976
|
546 |
+
- **Total Size**: 58.03 GB (pre-encode)
|
547 |
+
- **Sample Rate**: 44100 Hz
|
548 |
+
<center>
|
549 |
+
<h2 style="font-size: 19px; text-align: center;">Sample Breakdown</h2>
|
550 |
+
</center>
|
551 |
+
|
552 |
+
<table align="center" style="width: 80%; border-collapse: collapse; margin: 0 auto; margin-bottom: 30px;">
|
553 |
+
<thead>
|
554 |
+
<tr>
|
555 |
+
<th style="border: 1px solid black; padding: 8px;">Sound Type</th>
|
556 |
+
<th style="border: 1px solid black; padding: 8px;">Sample Count</th>
|
557 |
+
</tr>
|
558 |
+
</thead>
|
559 |
+
<tbody>
|
560 |
+
<tr>
|
561 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Bell Pluck</b></td>
|
562 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">100,553</td>
|
563 |
+
</tr>
|
564 |
+
<tr>
|
565 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Warm Supersaw</b></td>
|
566 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">100,320</td>
|
567 |
+
</tr>
|
568 |
+
<tr>
|
569 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Square Lead</b></td>
|
570 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">54,475</td>
|
571 |
+
</tr>
|
572 |
+
<tr>
|
573 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Square Buzzy Lead</b></td>
|
574 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">54,845</td>
|
575 |
+
</tr>
|
576 |
+
<tr>
|
577 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Pluck Bass</b></td>
|
578 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">54,732</td>
|
579 |
+
</tr>
|
580 |
+
</tbody>
|
581 |
+
</table>
|
582 |
+
|
583 |
+
<center>
|
584 |
+
<h2 style="font-size: 19px; text-align: center;">Augment Breakdown</h2>
|
585 |
+
</center>
|
586 |
+
|
587 |
+
<table align="center" style="width: 80%; border-collapse: collapse; margin: 0 auto;">
|
588 |
+
<thead>
|
589 |
+
<tr>
|
590 |
+
<th style="border: 1px solid black; padding: 8px;">Augment Category</th>
|
591 |
+
<th style="border: 1px solid black; padding: 8px;">File Count</th>
|
592 |
+
</tr>
|
593 |
+
</thead>
|
594 |
+
<tbody>
|
595 |
+
<tr>
|
596 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Low Cut Rise w/ Half Gate</b></td>
|
597 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">22,763</td>
|
598 |
+
</tr>
|
599 |
+
<tr>
|
600 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>High Cut Faller w/ Half Gate</b></td>
|
601 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">22,849</td>
|
602 |
+
</tr>
|
603 |
+
<tr>
|
604 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Reverb Augments</b></td>
|
605 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">90,180</td>
|
606 |
+
</tr>
|
607 |
+
<tr>
|
608 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Gate Augments</b></td>
|
609 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">30,045</td>
|
610 |
+
</tr>
|
611 |
+
<tr>
|
612 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>High Cut Faller Quarter Bar Gate</b></td>
|
613 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">22,703</td>
|
614 |
+
</tr>
|
615 |
+
<tr>
|
616 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>Low Cut Rise w/ Quarter Gate</b></td>
|
617 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">22,589</td>
|
618 |
+
</tr>
|
619 |
+
<tr>
|
620 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;"><b>High Cut & Low Cut Augments</b></td>
|
621 |
+
<td style="border: 1px solid black; padding: 8px; text-align: center;">30,922</td>
|
622 |
+
</tr>
|
623 |
+
</tbody>
|
624 |
+
</table>
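
As a quick consistency check, the counts in the two tables above can be summed and compared with the total file count given in the Overview. The numbers below are copied from the tables; the variable names are illustrative only.

```python
# Counts copied from the Sample Breakdown table.
sample_counts = {
    "Bell Pluck": 100_553,
    "Warm Supersaw": 100_320,
    "Square Lead": 54_475,
    "Square Buzzy Lead": 54_845,
    "Pluck Bass": 54_732,
}

# Counts copied from the Augment Breakdown table.
augment_counts = {
    "Low Cut Rise w/ Half Gate": 22_763,
    "High Cut Faller w/ Half Gate": 22_849,
    "Reverb Augments": 90_180,
    "Gate Augments": 30_045,
    "High Cut Faller Quarter Bar Gate": 22_703,
    "Low Cut Rise w/ Quarter Gate": 22_589,
    "High Cut & Low Cut Augments": 30_922,
}

sample_total = sum(sample_counts.values())
augment_total = sum(augment_counts.values())
grand_total = sample_total + augment_total

print(f"samples:  {sample_total:,}")
print(f"augments: {augment_total:,}")
print(f"total:    {grand_total:,}")
print(f"augment share: {augment_total / grand_total:.1%}")
```

The two tables add up to 364,925 and 242,051 files respectively, which together match the 606,976 total, so FX augments make up roughly 40% of the dataset.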
|
625 |
+
|
626 |
+
While the dataset appears quite large, it features *heavy augmentation*. For example, a single melody could be stretched into multiple keys and multiple tempos with slight changes for variance. This means the training time must be carefully watched to ensure the model doesn't overfit
|
627 |
+
to specific melodic patterns.
|
628 |
+
|
629 |
+
Further, the FX augments use the original stem with only a single preset applied, for example "high reverb". This stretches the data further and lets the model home in on that specific component of the sound, since it has seen both the no-FX version and the otherwise identical FX-applied version.
|
630 |
+
|
631 |
+
<center>
|
632 |
+
<h2 style="font-size: 19px;">Technical Specifications</h2>
|
633 |
+
</center>
|
634 |
+
|
635 |
+
- **Platform**: Runpod
|
636 |
+
- **Monitoring Tool**: Weights and Biases
|
637 |
+
- **Total Steps**: 33,436
|
638 |
+
- **Final Checkpoint Step**: 28,452
|
639 |
+
- **Learning Rate**: 5e-5
|
640 |
+
- **Dataset type**: Pre-encode
|
641 |
+
- **Optimizer**: AdamW
|
642 |
+
- **Scheduler**: InverseLR
|
643 |
+
- **Batch Size**: 32
|
644 |
+
- **Hardware**: 2x NVIDIA A40 GPUs
|
645 |
+
|
646 |
+
See config file for further details.
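
To relate the step counts above to passes over the dataset, the arithmetic can be sketched as below. It rests on two assumptions that the list does not confirm: that the batch size of 32 is the global batch across both GPUs, and that each of the 606,976 files counts as one training example.

```python
total_files = 606_976   # from the dataset Overview
batch_size = 32         # assumed to be the global batch size
total_steps = 33_436
final_checkpoint_step = 28_452

# Optimizer steps needed to see every file once, under the assumptions above.
steps_per_epoch = total_files / batch_size

for label, steps in (
    ("final checkpoint", final_checkpoint_step),
    ("end of run", total_steps),
):
    print(f"{label}: {steps / steps_per_epoch:.2f} epochs")
```

Under those assumptions one epoch is 18,968 steps, so the released checkpoint sits at about 1.5 passes over the data and the full run at about 1.76, which is consistent with the short schedule that heavy augmentation calls for.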
|
647 |
+
|
648 |
+
<center>
|
649 |
+
<h2 style="font-size: 24px;"><u>Limitations and Biases</u></h2>
|
650 |
+
</center>
|
651 |
+
|
652 |
+
The model itself has very high musicality and can essentially be used as an infinite sample and melody generator. One limitation, however, is that because it was trained on our own dataset, it is not set up for genre prompting or for sound types outside of the ones mentioned above. For example, a prompt like "House, Pad, Guitar" would not produce such samples.
|
653 |
+
|
654 |
+
With more scale, genre prompting and additional sound types can be added in future models.
|
655 |
+
|
656 |
+
Lastly, as mentioned, it has very robust audio-to-audio capabilities; however, the input audio should be within a similar frequency space to the desired output. For example, if you had a piano progression and wanted to change it to a supersaw, it will do very well. But if you use a bass sample as the input (which usually falls within the lower frequency spectrum) and ask for a supersaw version (which falls within the mid-to-high frequency range), then the model
|
657 |
+
will struggle with the conversion. A workaround is to pitch the input sample up or down to get near the desired output frequency range. For example, if you wanted to style-transfer a piano melody to a bass pluck, pitching the input piano sample down a few semitones before converting to bass will give better results.
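
To estimate how far to pitch an input before conversion, the standard equal-temperament relationship (each semitone multiplies frequency by 2^(1/12)) can be sketched as below. The helper names and the example frequencies are illustrative assumptions; the model card does not prescribe specific shift amounts.

```python
import math


def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of `semitones` (12-tone equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def semitones_between(source_hz: float, target_hz: float) -> float:
    """Semitones needed to move a tone at `source_hz` to `target_hz` (negative = shift down)."""
    return 12.0 * math.log2(target_hz / source_hz)


# Illustrative only: a melody centred near 440 Hz moved towards a bass register near 110 Hz.
shift = semitones_between(440.0, 110.0)
print(f"shift by {shift:.0f} semitones (ratio {semitone_ratio(shift):.2f})")
```

The resulting semitone value can then be applied with whatever pitch-shifting tool you already use in your DAW or audio editor before feeding the sample to audio-to-audio.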
|
658 |
+
|
659 |
+
<center>
|
660 |
+
<h2 style="font-size: 24px;"><u>Additional details</u></h2>
|
661 |
+
</center>
|
662 |
+
|
663 |
+
- **Dataset:** Internal IP + Tooling
|
664 |
+
- **Gate Module:** Custom NumPy FX
|
665 |
+
- **Reverb Effect:** Solaris Reverb
|
666 |
+
- **EQ:** Simple EQ Effect (riser/faller)
|
667 |
+
|
668 |
+
<center>
|
669 |
+
<h2 style="font-size: 24px;"><u>License</u></h2>
|
670 |
+
</center>
|
671 |
+
|
672 |
+
|
673 |
+
This model is licensed under the Stability AI Community License. It is available for non-commercial use or limited commercial use by entities with annual revenues below USD $1M. For revenues exceeding USD $1M, please refer to the [LICENSE](./LICENSE.md) for detailed terms.
|