Davlan committed
Commit 37bd44c · 1 Parent(s): b8d1d1d

updating readme

Files changed (1):
  1. README.md +13 -10
README.md CHANGED
@@ -1,10 +1,12 @@
 Hugging Face's logo
 ---
-language: yo
+language:
+- yo
+- en
 datasets:
 - JW300 + [Menyo-20k](https://huggingface.co/datasets/menyo20k_mt)
 ---
-# mT5_base_yoruba_adr
+# mT5_base_yor_eng_mt
 ## Model description
 **mT5_base_yor_eng_mt** is a **machine translation** model from Yorùbá to English based on a fine-tuned mT5-base model. It establishes a **strong baseline** for automatically translating texts from Yorùbá to English.
 
@@ -13,14 +15,15 @@ Specifically, this model is a *mT5_base* model that was fine-tuned on JW300 Yor
 #### How to use
 You can use this model with Transformers *pipeline* for MT.
 ```python
-from transformers import AutoTokenizer, AutoModelForTokenClassification
-from transformers import pipeline
-tokenizer = AutoTokenizer.from_pretrained("")
-model = AutoModelForTokenClassification.from_pretrained("")
-nlp = pipeline("", model=model, tokenizer=tokenizer)
-example = "Emir of Kano turban Zhang wey don spend 18 years for Nigeria"
-ner_results = nlp(example)
-print(ner_results)
+from transformers import MT5ForConditionalGeneration, T5Tokenizer
+
+model = MT5ForConditionalGeneration.from_pretrained("Davlan/mt5_base_yor_eng_mt")
+tokenizer = T5Tokenizer.from_pretrained("google/mt5-base")
+input_string = "Akọni ajìjàgbara obìnrin tó sun àtìmalé torí owó orí"
+inputs = tokenizer.encode(input_string, return_tensors="pt")
+generated_tokens = model.generate(inputs)
+results = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+print(results)
 ```
 #### Limitations and bias
 This model is limited by its training data, the JW300 and Menyo-20k parallel corpora, which cover a narrow range of domains; it may not generalize well to all use cases.
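The card's prose mentions the Transformers *pipeline*, while the committed snippet calls the model API directly. A minimal pipeline-based sketch, assuming the generic `text2text-generation` task wraps this seq2seq checkpoint (this code is not part of the commit):

```python
from transformers import pipeline

# Sketch only: wrap the checkpoint in a text2text-generation pipeline.
# Model and tokenizer ids are taken from the snippet in the diff above.
translator = pipeline(
    "text2text-generation",
    model="Davlan/mt5_base_yor_eng_mt",
    tokenizer="google/mt5-base",
)

# Same Yorùbá example sentence as in the committed snippet; the pipeline
# returns a list of dicts with a "generated_text" field.
print(translator("Akọni ajìjàgbara obìnrin tó sun àtìmalé torí owó orí"))
```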