updating readme
README.md
---
language:
- yo
- en
datasets:
- JW300 + [Menyo-20k](https://huggingface.co/datasets/menyo20k_mt)
---
# mT5_base_yor_eng_mt
## Model description
**mT5_base_yor_eng_mt** is a **machine translation** model from the Yorùbá language to English, based on a fine-tuned mT5-base model. It establishes a **strong baseline** for automatically translating texts from Yorùbá to English.

Specifically, this model is a *mT5_base* model that was fine-tuned on the JW300 Yorùbá corpus and [Menyo-20k](https://huggingface.co/datasets/menyo20k_mt).
#### How to use
You can use this model with the Transformers *pipeline* for machine translation (MT).
```python
from transformers import MT5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned Yorùbá-to-English model and the mT5-base tokenizer
model = MT5ForConditionalGeneration.from_pretrained("Davlan/mt5_base_yor_eng_mt")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-base")

# Tokenize a Yorùbá sentence and generate its translation
input_string = "Akọni ajìjàgbara obìnrin tó sun àtìmalé torí owó orí"
inputs = tokenizer.encode(input_string, return_tensors="pt")
generated_tokens = model.generate(inputs)

# Decode the generated token IDs back into English text
results = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(results)
```
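As an alternative to calling the model classes directly, the same checkpoint should also work through the `text2text-generation` pipeline. A minimal sketch, assuming the model and tokenizer names from the snippet above:

```python
from transformers import pipeline

# Minimal sketch: wrap the same checkpoint in a text2text-generation
# pipeline (model and tokenizer names assumed from the snippet above).
translator = pipeline(
    "text2text-generation",
    model="Davlan/mt5_base_yor_eng_mt",
    tokenizer="google/mt5-base",
)
print(translator("Akọni ajìjàgbara obìnrin tó sun àtìmalé torí owó orí"))
```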
#### Limitations and bias
This model is limited by its training data, the JW300 Yorùbá corpus and the Menyo-20k dataset, which cover a specific span of time and a narrow range of domains. It may not generalize well for all use cases in different domains.