JanBabela committed on
Commit f2e7c58 · 1 Parent(s): 88bdd0f

Update index.html

Files changed (1)
  1. index.html +7 -5
index.html CHANGED
@@ -10,20 +10,22 @@
  <div class="card">
  <h1>Riffusion-Melodiff-v1</h1>
  <p>Riffusion-Melodiff is simple, but interesting idea, (that I have not seen anywhere else) how to modify your music.</p>
- <p>Riffusion-Melodiff is built on a top of Riffusion model, which is fine-tuned Stable Diffusion model to generate Mel Spectrogram. (Spectrogram is kind of
  visual representation of music by dividing waveforms into frequencies.) Riffusion-Melodiff does not contain new model, there was no new training, nor fine-tunig.
  It uses the same model as Riffusion only in a different way.</p>
  <p>Riffusion-Melodiff uses Img2Img pipeline from Diffusers library to modify images of Mel Spectrograms to produce cover versions of music. Just upload your audio
  in wav format (if you have audio in a different format, transfer it first to wav by online converter). Then you may use Img2img pipeline from the Diffusers library
- with your prompr, seed and strength. Stregth parameter decides, how much will modified audio relate to initial audio and how much it will relate to the prompt.
  When strength is too low the spectrogram is too similar with original one and we do not receive new modification. When strength is too high, then spectrogram is too
  close to the new promopt, which may cause loss of melody and/or tempo from the base image. Good values of strength are usually about 0,4-0,5.</p>
  <p>Good modifications are possible for proper prompt, seed and strength values. Those modifications will keep the tempo and melody from the initial audio, but
  they will change eg. instrument, playing that melody. Also with this pipeline longer than 5s music modifications are possible. If you cut your audio into 5s pieces
- and use the same prompt, seed and strength for each modification, generated smaples will be somewhat consistent. So if you concatenate them together, you will have
  longer audio modified.</p>
- <p>Quality of the generated music is not amazing, (mediocre, I would say) and it needs a bit of prompt and seed engineering. But it shows one way, how to do it
- in the future.</p>
  <p>
  Colab notebook is included, where you can find step by step, how to do it.
  <a href="https://huggingface.co/spaces/JanBabela/Riffusion-Melodiff-v1/blob/main/melodiff_v1.ipynb" target="_blank">Melodiff_v1</a>.
 
  <div class="card">
  <h1>Riffusion-Melodiff-v1</h1>
  <p>Riffusion-Melodiff is simple, but interesting idea, (that I have not seen anywhere else) how to modify your music.</p>
+ <p>Riffusion-Melodiff is built on a top of
+ <a href="https://huggingface.co/riffusion/riffusion-model-v1" target="_blank">Riffusion </a>
+ model, which is fine-tuned Stable Diffusion model to generate Mel Spectrograms. (Spectrogram is kind of
  visual representation of music by dividing waveforms into frequencies.) Riffusion-Melodiff does not contain new model, there was no new training, nor fine-tunig.
  It uses the same model as Riffusion only in a different way.</p>
  <p>Riffusion-Melodiff uses Img2Img pipeline from Diffusers library to modify images of Mel Spectrograms to produce cover versions of music. Just upload your audio
  in wav format (if you have audio in a different format, transfer it first to wav by online converter). Then you may use Img2img pipeline from the Diffusers library
+ with your prompt, seed and strength. Stregth parameter decides, how much will modified audio relate to initial audio and how much it will relate to the prompt.
  When strength is too low the spectrogram is too similar with original one and we do not receive new modification. When strength is too high, then spectrogram is too
  close to the new promopt, which may cause loss of melody and/or tempo from the base image. Good values of strength are usually about 0,4-0,5.</p>
  <p>Good modifications are possible for proper prompt, seed and strength values. Those modifications will keep the tempo and melody from the initial audio, but
  they will change eg. instrument, playing that melody. Also with this pipeline longer than 5s music modifications are possible. If you cut your audio into 5s pieces
+ and use the same prompt, seed and strength for each modification, generated samples will be somewhat consistent. So if you concatenate them together, you will have
  longer audio modified.</p>
+ <p>Quality of the generated music is not amazing, (mediocre, I would say) and it needs a bit of prompt and seed engineering. But it shows one way, how to make cover
+ versions of music in the future.</p>
  <p>
  Colab notebook is included, where you can find step by step, how to do it.
  <a href="https://huggingface.co/spaces/JanBabela/Riffusion-Melodiff-v1/blob/main/melodiff_v1.ipynb" target="_blank">Melodiff_v1</a>.
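
Note on the workflow described in the page text: the "cut into 5s pieces, reuse the same prompt, seed and strength, then concatenate" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the linked notebook. `Img2ImgSettings`, `split_into_chunks`, `modify_long_audio` and the `modify_chunk` callback are hypothetical names; `modify_chunk` stands in for the whole audio -> spectrogram -> Img2Img -> audio round trip, which is not reproduced here. The `effective_steps` helper reflects my understanding that the Diffusers Img2Img pipeline runs roughly `strength * num_inference_steps` denoising steps; treat that, and the default of 50 steps, as assumptions to check against the installed version. Plain Python lists are used instead of arrays to keep the sketch dependency-free.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Img2ImgSettings:
    """Hypothetical container for the three knobs named in the page text."""
    prompt: str
    seed: int
    strength: float
    num_inference_steps: int = 50  # assumed default, not taken from the source

    def effective_steps(self) -> int:
        """Approximate number of denoising steps that are actually run."""
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength must be between 0 and 1")
        return min(int(self.num_inference_steps * self.strength),
                   self.num_inference_steps)


def split_into_chunks(samples: Sequence[float], sample_rate: int,
                      chunk_seconds: float = 5.0) -> List[Sequence[float]]:
    """Cut a mono waveform into consecutive pieces; the last one may be shorter."""
    chunk_len = int(round(chunk_seconds * sample_rate))
    if chunk_len <= 0:
        raise ValueError("chunk length must be positive")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


def modify_long_audio(
    samples: Sequence[float],
    sample_rate: int,
    settings: Img2ImgSettings,
    modify_chunk: Callable[[Sequence[float], Img2ImgSettings], Sequence[float]],
) -> List[float]:
    """Apply the same settings to every chunk and concatenate the results."""
    output: List[float] = []
    for chunk in split_into_chunks(samples, sample_rate):
        # The identical settings object is passed for each piece, so prompt,
        # seed and strength cannot drift between chunks.
        output.extend(modify_chunk(chunk, settings))
    return output
```

In practice `modify_chunk` would wrap the actual pipeline call with a freshly seeded generator for each chunk, so that every piece starts from the same noise. Keeping the settings in one immutable object is a design choice that makes it harder to change a parameter by accident halfway through a long track.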