Michael Beukman committed on
Commit fac445f · 1 Parent(s): 88f2392

Fixed a typo.

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -21,7 +21,7 @@ More information, and other similar models can be found in the [main Github repo
 
 ## About
 This models is transformer based and was fine-tuned on the MasakhaNER dataset. It is a named entity recognition dataset, containing mostly news articles in 10 different African languages.
-The model was fine-tuned for 50 epochs, with a maximum sequence length of 200, 32 batch size, 5e-5 learning rate. This process was repeated 5 times (with different random seeds), and this uploaded model performed the best out of those 5 seeds (aggregate F1 on on test set).
+The model was fine-tuned for 50 epochs, with a maximum sequence length of 200, 32 batch size, 5e-5 learning rate. This process was repeated 5 times (with different random seeds), and this uploaded model performed the best out of those 5 seeds (aggregate F1 on test set).
 
 This model was fine-tuned by me, Michael Beukman while doing a project at the University of the Witwatersrand, Johannesburg. This is version 1, as of 20 November 2021.
 This models is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
@@ -103,7 +103,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForTokenClassification.from_pretrained(model_name)
 
 nlp = pipeline("ner", model=model, tokenizer=tokenizer)
-example = "A (Luo) sentence that may contain entities"
+example = "Jii 2 moko jowito ngimagi ka machielo 1 to ohinyore marach mokalo e masira makoch mar apaya mane otimore e apaya mawuok Oyugis kochimo Chabera e sub county ma Rachuonyo East e County ma Homa Bay ewii odhiambo makawuononi"
 
 ner_results = nlp(example)
 print(ner_results)
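The `print(ner_results)` call in the README snippet shows one prediction per sub-word token, which can be hard to read for a sentence as long as the new Luo example. The sketch below is one possible way to merge those token-level predictions into whole entity spans. It is not part of this commit or the README: `group_entities` is a hypothetical helper, and `sample_tokens` is hand-written for illustration (it is not real output from this model). It assumes each prediction is a dict with `entity`, `start`, and `end` keys and BIO-style tags such as `B-LOC` and `I-LOC`.

```python
from typing import Dict, List


def group_entities(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge B-/I- tagged token predictions into whole entity spans.

    Hypothetical helper: assumes each token dict has "entity", "start", "end".
    """
    spans: List[Dict] = []
    for tok in tokens:
        if tok["entity"] == "O":
            continue  # skip tokens that are not part of any entity
        prefix, _, label = tok["entity"].partition("-")
        # An I- token extends the previous span only if the label matches
        # and it starts right after (or one space after) that span.
        continues = (
            prefix == "I"
            and bool(spans)
            and spans[-1]["label"] == label
            and tok["start"] <= spans[-1]["end"] + 1
        )
        if continues:
            spans[-1]["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written sample predictions (NOT real model output), using a fragment
# of the example sentence from the diff.
sample_text = "Oyugis kochimo Chabera"
sample_tokens = [
    {"entity": "B-LOC", "start": 0, "end": 3},
    {"entity": "I-LOC", "start": 3, "end": 6},
    {"entity": "B-LOC", "start": 15, "end": 22},
]

grouped = group_entities(sample_text, sample_tokens)
for span in grouped:
    print(span["label"], span["text"])
```

With real predictions, the call would be `group_entities(example, ner_results)`. The `transformers` pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that performs a similar grouping internally, which may be preferable to a hand-rolled helper.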