AndyGo committed on
Commit 269e301 · 1 Parent(s): 01b781c

Create README.md

  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ language: "ru"
+ thumbnail:
+ tags:
+ - automatic-speech-recognition
+ - CTC
+ - Attention
+ - pytorch
+ - speechbrain
+ license: "apache-2.0"
+ datasets:
+ - buriy-audiobooks-2-val
+ metrics:
+ - wer
+ - cer
+ ---
+
+ | Release | Test WER | GPUs |
+ |:-------------:|:--------------:| :--------:|
+ | 22-05-11 | - | 1xK80 24GB |
+
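The `wer` and `cer` metrics listed in the metadata are both edit-distance ratios: the number of word-level (or character-level) substitutions, insertions, and deletions divided by the reference length. The following is a minimal sketch of that definition in plain Python. The function names are illustrative, and this is not the scoring code used to evaluate this model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For example, `wer("привет как дела", "привет так дела")` counts one substituted word out of three.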
+ ## Pipeline description
+ (based on the SpeechBrain model card text)
+
+ This ASR system is composed of 3 distinct but linked blocks:
+ - Tokenizer (unigram) that transforms words into subword units and is trained on
+ the training transcriptions of LibriSpeech.
+ - Neural language model (RNNLM) trained on the full (380K words) dataset.
+ - Acoustic model (CRDNN + CTC/Attention). The CRDNN architecture is made of
+ N blocks of convolutional neural networks with normalisation and pooling on the
+ frequency domain. Then, a bidirectional LSTM is connected to a final DNN to obtain
+ the final acoustic representation that is given to the CTC and attention decoders.
+
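To make the CTC side of the acoustic model more concrete, here is a minimal sketch of the CTC collapse rule: merge consecutive repeated labels, then drop the blank symbol. It assumes a hypothetical per-frame label sequence and blank index 0. The released model decodes with its own CTC/attention search together with the RNNLM, so this only illustrates the rule and is not the model's decoder.

```python
def ctc_collapse(frame_labels, blank=0):
    """Collapse a per-frame label sequence into an output token sequence.

    Consecutive duplicates are merged first, then blanks are removed, so a
    blank between two identical labels keeps both of them.
    """
    output = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


# Hypothetical frame-level argmax labels (0 is the assumed blank index).
frames = [0, 1, 1, 0, 1, 2, 2, 0]
tokens = ctc_collapse(frames)
```

Here `tokens` keeps both occurrences of label 1 because a blank separates them, while the repeated 2 is merged into one.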
+ The system is trained with recordings sampled at 16 kHz (single channel).
+ The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling *transcribe_file*, if needed.
+
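As a rough illustration of what "resampling + mono channel selection" means, the sketch below downmixes a multi-channel signal and resamples it with linear interpolation, using plain Python lists. Both choices (averaging the channels, linear interpolation) are assumptions made for readability. SpeechBrain applies its own normalization inside `transcribe_file`, so none of this is needed in practice.

```python
def to_mono(channels):
    """Average equal-length channels sample by sample (assumed downmix strategy)."""
    n_channels = len(channels)
    return [sum(samples) / n_channels for samples in zip(*channels)]


def resample_linear(signal, orig_sr, target_sr):
    """Resample a mono signal with linear interpolation (illustrative, not high quality)."""
    if orig_sr == target_sr or not signal:
        return list(signal)
    out_len = int(len(signal) * target_sr / orig_sr)
    resampled = []
    for i in range(out_len):
        pos = i * orig_sr / target_sr        # position in the original signal
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)    # clamp at the last sample
        frac = pos - lo
        resampled.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return resampled


# Hypothetical stereo clip at 32 kHz, reduced to mono at 16 kHz.
stereo = [[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]]
mono_16k = resample_linear(to_mono(stereo), orig_sr=32000, target_sr=16000)
```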
+ ## Install SpeechBrain
+
+ First, install SpeechBrain with the following command:
+
+ ```
+ pip install speechbrain
+ ```
+
+ Please note that SpeechBrain encourages you to read the tutorials and learn more about
+ [SpeechBrain](https://speechbrain.github.io).
+
+ ### Transcribing your own audio files (in Russian)
+
+ ```python
+ from speechbrain.pretrained import EncoderDecoderASR
+ asr_model = EncoderDecoderASR.from_hparams(source="AndyGo/speech-brain-asr-crdnn-rnnlm-buriy-audiobooks-2-val", savedir="pretrained_models/speech-brain-asr-crdnn-rnnlm-buriy-audiobooks-2-val")
+ asr_model.transcribe_file('speech-brain-asr-crdnn-rnnlm-buriy-audiobooks-2-val/example.wav')
+ ```
+
+ ### Inference on GPU
+ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
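A minimal sketch of the GPU option, assuming you want to fall back to the CPU when CUDA is unavailable. The helpers `pick_device` and `load_asr` are hypothetical names introduced here for illustration. Only `EncoderDecoderASR.from_hparams` and its `run_opts` argument come from the README text.

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when requested and available, otherwise "cpu"."""
    try:
        import torch
        has_cuda = torch.cuda.is_available()
    except ImportError:
        # Without PyTorch there is no CUDA device to select.
        has_cuda = False
    return "cuda" if (prefer_gpu and has_cuda) else "cpu"


def load_asr(source, savedir, prefer_gpu=True):
    """Load the pretrained ASR model on the selected device (hypothetical helper)."""
    # Imported lazily so that pick_device can be used without SpeechBrain installed.
    from speechbrain.pretrained import EncoderDecoderASR
    return EncoderDecoderASR.from_hparams(
        source=source,
        savedir=savedir,
        run_opts={"device": pick_device(prefer_gpu)},
    )


# Usage (downloads the model on first call):
# asr_model = load_asr(
#     "AndyGo/speech-brain-asr-crdnn-rnnlm-buriy-audiobooks-2-val",
#     "pretrained_models/speech-brain-asr-crdnn-rnnlm-buriy-audiobooks-2-val",
# )
```

Choosing the device at load time keeps the same script usable on machines with and without a GPU.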