Titouan committed · Commit f627588 · 1 Parent(s): 7ff95b4

update readme

README.md CHANGED
@@ -24,7 +24,7 @@ SpeechBrain. For a better experience we encourage you to learn more about

 | Release | Test clean WER | Test other WER | GPUs |
 |:-------------:|:--------------:|:--------------:|:--------:|
-| 05-03-21 | 2.
+| 05-03-21 | 2.55 | 5.99 | 2xV100 32GB |

 ## Pipeline description

@@ -32,11 +32,8 @@ This ASR system is composed with 3 different but linked blocks:
 1. Tokenizer (unigram) that transforms words into subword units and trained with
 the train transcriptions of LibriSpeech.
 2. Neural language model (Transformer LM) trained on the full 10M words dataset.
-3. Acoustic model
-
-frequency domain. Then, a bidirectional LSTM with projection layers is connected
-to a final DNN to obtain the final acoustic representation that is given to
-the CTC and attention decoders.
+3. Acoustic model made of a transformer encoder and a joint decoder with CTC +
+transformer. Hence, the decoding also incorporate the CTC probabilities.

 ## Intended uses & limitations

@@ -61,7 +58,7 @@ Please notice that we encourage you to read our tutorials and learn more about
 ```python
 from speechbrain.pretrained import EncoderDecoderASR

-asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-
+asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-transformer-transformerlm-librispeech")
 asr_model.transcribe_file("path_to_your_file.wav")

 ```
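
Note on the new pipeline description: the added README line says the decoding also incorporates the CTC probabilities. One common way to do that is to interpolate the attention-decoder and CTC log-probabilities of each hypothesis. The sketch below only illustrates that idea and is not SpeechBrain's implementation. The function name `joint_score`, the `ctc_weight` value, and the toy scores are all assumptions made for the example.

```python
import math


def joint_score(attn_logp: float, ctc_logp: float, ctc_weight: float = 0.4) -> float:
    """Interpolate attention and CTC log-probabilities for one hypothesis.

    ctc_weight is a hypothetical value; the README does not state one.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return (1.0 - ctc_weight) * attn_logp + ctc_weight * ctc_logp


# Toy hypotheses: (text, attention log-prob, CTC log-prob). Values are made up.
hypotheses = [
    ("the cat sad", math.log(0.50), math.log(0.20)),
    ("the cat sat", math.log(0.45), math.log(0.40)),
]

# Pick the hypothesis with the highest combined score.
best = max(hypotheses, key=lambda h: joint_score(h[1], h[2]))
print(best[0])
```

With these made-up numbers the attention score alone would prefer the first hypothesis, while the combined score prefers the second, which is the effect the README sentence alludes to.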