cemsubakan committed
Commit 5568849 · Parent(s): 6dbfa1f
Update README.md

README.md CHANGED
@@ -24,12 +24,12 @@ metrics:
 <br/><br/>
 
 # SepFormer trained on WHAMR! for speech enhancement (8k sampling frequency)
-This repository provides all the necessary tools to perform speech enhancement (denoising + dereverberation) with a [SepFormer](https://arxiv.org/abs/2010.13154v2) model, implemented with SpeechBrain, and pretrained on [WHAMR!](http://wham.whisper.ai/) dataset with 8k sampling frequency, which is basically a version of WSJ0-Mix dataset with environmental noise and reverberation in 8k. For a better experience we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The given model performance is
+This repository provides all the necessary tools to perform speech enhancement (denoising + dereverberation) with a [SepFormer](https://arxiv.org/abs/2010.13154v2) model, implemented with SpeechBrain and pretrained on the [WHAMR!](http://wham.whisper.ai/) dataset at an 8 kHz sampling frequency. WHAMR! is essentially the WSJ0-Mix dataset augmented with environmental noise and reverberation. For a better experience, we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The given model achieves 10.59 dB SI-SNR on the test set of the WHAMR! dataset.
 
 
 | Release | Test-Set SI-SNR | Test-Set PESQ |
 |:-------------:|:--------------:|:--------------:|
-| 01-12-21 |
+| 01-12-21 | 10.59 | 2.84 |
 
 
 ## Install SpeechBrain
@@ -48,17 +48,14 @@ Please notice that we encourage you to read our tutorials and learn more about [
 from speechbrain.pretrained import SepformerSeparation as separator
 import torchaudio
 
-model = separator.from_hparams(source="speechbrain/sepformer-
+model = separator.from_hparams(source="speechbrain/sepformer-whamr-enhancement", savedir='pretrained_models/sepformer-whamr-enhancement')
 
 # for custom file, change path
-est_sources = model.separate_file(path='speechbrain/sepformer-
+est_sources = model.separate_file(path='speechbrain/sepformer-whamr-enhancement/example_whamr.wav')
 
-torchaudio.save("
-torchaudio.save("source2hat.wav", est_sources[:, :, 1].detach().cpu(), 16000)
-```
+torchaudio.save("enhanced_whamr.wav", est_sources[:, :, 0].detach().cpu(), 8000)
 
-
-If your signal has a different sample rate, resample it (e.g, using torchaudio or sox) before using the interface.
+```
 
 ### Inference on GPU
 To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
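As a usage note, the snippet below is a minimal sketch (not part of the model card above) of GPU inference via `run_opts={"device":"cuda"}` combined with manual resampling to the model's 8 kHz rate. The input file name `my_noisy_recording.wav` is a placeholder; `from_hparams`, `separate_batch`, and `torchaudio.functional.resample` are standard SpeechBrain/torchaudio calls.

```python
import torchaudio
from speechbrain.pretrained import SepformerSeparation as separator

# Load the pretrained enhancement model directly onto the GPU.
model = separator.from_hparams(
    source="speechbrain/sepformer-whamr-enhancement",
    savedir="pretrained_models/sepformer-whamr-enhancement",
    run_opts={"device": "cuda"},
)

# Load a (mono) recording at an arbitrary sample rate and resample it to the
# 8 kHz rate the model was trained on. The path is a placeholder.
noisy, fs = torchaudio.load("my_noisy_recording.wav")
if fs != 8000:
    noisy = torchaudio.functional.resample(noisy, orig_freq=fs, new_freq=8000)

# separate_batch expects a [batch, time] tensor on the model's device.
est_sources = model.separate_batch(noisy.to(model.device))

# The enhanced signal is the first estimated source; save it at 8 kHz.
torchaudio.save("enhanced_8k.wav", est_sources[:, :, 0].detach().cpu(), 8000)
```

Omitting `run_opts` keeps the model on the CPU, as in the snippet from the README above.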