Update README.md
README.md
CHANGED
@@ -8,6 +8,10 @@ tags:
 
 # Cascaded English Speech2Text Translation
 This is a pipeline for speech-to-text translation from English speech into text in any target language, based on a cascaded approach that consists of ASR and translation.
+The pipeline employs [distil-whisper/distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3) for ASR (English speech -> English text)
+and [facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B) for text translation.
+The input must be English speech, while the translation can be in any language NLLB was trained on. All available languages and their language codes are listed
+[here](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200).
 
 ## Usage
 Here is an example that translates English speech into Japanese text.
@@ -23,7 +27,7 @@ from transformers import pipeline
 # load model
 pipe = pipeline(
     model="japanese-asr/en-cascaded-s2t-translation",
-    model_translation="facebook/nllb-200-
+    model_translation="facebook/nllb-200-3.3B",
     tgt_lang="jpn_Jpan",
     model_kwargs={"attn_implementation": "sdpa"},
     chunk_length_s=15,
@@ -34,3 +38,9 @@ pipe = pipeline(
 output = pipe("./sample.wav")
 ```
 
+Other NLLB models can be used by setting `model_translation`, for example:
+- [facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B)
+- [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M)
+- [facebook/nllb-200-distilled-1.3B](https://huggingface.co/facebook/nllb-200-distilled-1.3B)
+- [facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B)
+
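
The change above says other NLLB checkpoints and other target languages can be selected through `model_translation` and `tgt_lang`. A minimal sketch of how those two settings might be assembled and sanity-checked before loading anything is given here. The helper `build_pipeline_kwargs` and the constant `NLLB_CHECKPOINTS` are hypothetical names, not part of the repository. The allow-list only mirrors the four checkpoints named in the README, so it is deliberately conservative. The language check only tests the general shape of FLORES-200 codes (three lowercase letters, an underscore, a four-letter script tag such as `Jpan`), not membership in the actual list.

```python
import re

# Checkpoints named in the README change; other NLLB checkpoints may also work,
# this tuple is only an illustrative allow-list (assumption).
NLLB_CHECKPOINTS = (
    "facebook/nllb-200-3.3B",
    "facebook/nllb-200-distilled-600M",
    "facebook/nllb-200-distilled-1.3B",
    "facebook/nllb-200-1.3B",
)

# Shape of a FLORES-200 code such as "jpn_Jpan": language code + "_" + script code.
# This is a format check only, not a lookup in the real language list.
_FLORES_CODE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def build_pipeline_kwargs(tgt_lang, model_translation=NLLB_CHECKPOINTS[0]):
    """Hypothetical helper: collect the keyword arguments shown in the README example."""
    if not _FLORES_CODE.fullmatch(tgt_lang):
        raise ValueError(f"{tgt_lang!r} does not look like a FLORES-200 code")
    if model_translation not in NLLB_CHECKPOINTS:
        raise ValueError(f"{model_translation!r} is not one of the listed checkpoints")
    return {
        "model": "japanese-asr/en-cascaded-s2t-translation",
        "model_translation": model_translation,
        "tgt_lang": tgt_lang,
        "model_kwargs": {"attn_implementation": "sdpa"},
        "chunk_length_s": 15,
    }


# Example: French output with the smallest listed checkpoint.
kwargs = build_pipeline_kwargs("fra_Latn", "facebook/nllb-200-distilled-600M")
print(kwargs["model_translation"], kwargs["tgt_lang"])
```

The resulting dictionary is meant to be unpacked into the `pipeline(...)` call from the README, together with whatever further arguments that call takes; the diff elides them, so they are not reproduced here.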