RoyalCities committed
Commit 3610af9 · verified · 1 Parent(s): a16674b

Update README.md

Files changed (1):
  1. README.md +2 -1
README.md CHANGED
@@ -38,6 +38,7 @@ This finetuned Stable Audio Open model specializes in Vocal / Operatic Chord Pro
 
 - **Multiple Types of Stem Generation:** Outputs three types of voicings with a focus on Chord Progressions only,
 - **Tonal Versatility:** Generates stems in any key across the 12-tone chromatic scale, in both major and minor scales.
+- **Audio-To-Audio:** Generates interesting vocal timbres when paired with vocal stems.
 - **Simplified Scale Notation:** Scales are written using <b><i>sharps only</i></b> in the following format:
 
 <pre>
@@ -222,7 +223,7 @@ See config file for further details.
 
 The Model has high accuracy when it comes to staying in key due to the balance in the dataset. The metadata however was designed in such a way that the model is mainly designed to generate chord progressions only - as opposed to the piano model which also generated melodies. This is due to Vocal choir samples often having very long attacks so often sit at the back of a mix or to fill out the frequency space.
 
-I have noticed some light artifacting in the outputs - in particular the ensemble progressions. It almost sounds like the model is trying to add other instrumentation in. I think this may be due to the base model primarily being trained on music stems rather than vocals but I cannot say for certain - this will need to be corrected in a future model or when there is far more vocal data.
+I have noticed some light noise in the outputs - in particular the ensemble progressions. It almost sounds like the model is trying to add other instrumentation in. I think this may be due to the base model primarily being trained on music stems rather than vocals but I cannot say for certain - this will need to be corrected in a future model or when there is far more vocal data.
 
 Best use case may be to add some light reverb or post-processing for best results / use in a song.
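The README's "Simplified Scale Notation" bullet says keys are written with sharps only across the 12-tone chromatic scale, in both major and minor. As a minimal sketch of what that key space looks like, the snippet below enumerates every sharps-only key name; the exact prompt template the model expects is in the README's `<pre>` block (not shown in this diff excerpt), so the `"<note> <mode>"` formatting here is an assumption for illustration.

```python
# Sharps-only spelling of the 12-tone chromatic scale (no flats such as Db or Eb),
# per the model card's "Simplified Scale Notation" convention.
SHARP_NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def all_keys():
    """Return every supported key (12 notes x major/minor), sharps only.

    The "<note> <mode>" string format is a hypothetical stand-in for the
    README's actual prompt format.
    """
    return [f"{note} {mode}" for note in SHARP_NOTES for mode in ("Major", "Minor")]

keys = all_keys()
print(len(keys))   # 24 keys in total
print(keys[:4])
```

A list like this is handy for batch-generating stems in every key, or for validating that a user-supplied key string uses the sharps-only spelling before prompting the model.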