Michael Beukman committed on
Commit · 5de71b0
1 Parent(s): 28ec6e2
Fixed a typo.
README.md CHANGED
@@ -21,7 +21,7 @@ More information, and other similar models can be found in the [main Github repo
 
 ## About
 This models is transformer based and was fine-tuned on the MasakhaNER dataset. It is a named entity recognition dataset, containing mostly news articles in 10 different African languages.
-The model was fine-tuned for 50 epochs, with a maximum sequence length of 200, 32 batch size, 5e-5 learning rate. This process was repeated 5 times (with different random seeds), and this uploaded model performed the best out of those 5 seeds (aggregate F1 on
+The model was fine-tuned for 50 epochs, with a maximum sequence length of 200, 32 batch size, 5e-5 learning rate. This process was repeated 5 times (with different random seeds), and this uploaded model performed the best out of those 5 seeds (aggregate F1 on test set).
 
 This model was fine-tuned by me, Michael Beukman while doing a project at the University of the Witwatersrand, Johannesburg. This is version 1, as of 20 November 2021.
 This models is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
@@ -103,7 +103,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForTokenClassification.from_pretrained(model_name)
 
 nlp = pipeline("ner", model=model, tokenizer=tokenizer)
-example = "
+example = "Mixed Martial Arts joinbodi , Ultimate Fighting Championship , UFC don decide say dem go enta back di octagon on Saturday , 9 May , for Jacksonville , Florida ."
 
 ner_results = nlp(example)
 print(ner_results)
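
Note on the usage snippet: `print(ner_results)` shows one prediction per sub-word token, which can be hard to read. The sketch below is one possible way to merge those token-level predictions into whole entity spans. It assumes each prediction is a dict with `entity`, `start` and `end` keys and that labels follow a `B-`/`I-` scheme. The helper name `group_entities` and the mock predictions are hypothetical and made up for illustration; they are not actual output of this model.

```python
from typing import Dict, List


def group_entities(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge token-level B-/I- predictions into entity spans (illustrative helper)."""
    spans: List[Dict] = []
    for tok in tokens:
        if tok["entity"] == "O":
            continue  # skip tokens that are not part of any entity
        prefix, _, label = tok["entity"].partition("-")
        last = spans[-1] if spans else None
        # Extend the open span only for an I- tag of the same type that
        # directly follows it (no gap, or a single separating character).
        if (
            last is not None
            and prefix == "I"
            and last["label"] == label
            and tok["start"] - last["end"] <= 1
        ):
            last["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Mock predictions (NOT real model output), with offsets computed from the text.
text = "dem go enta back di octagon for Jacksonville , Florida ."
j = text.index("Jacksonville")
f = text.index("Florida")
mock_tokens = [
    {"entity": "B-LOC", "start": j, "end": j + 7},       # "Jackson"
    {"entity": "I-LOC", "start": j + 7, "end": j + 12},  # "ville"
    {"entity": "B-LOC", "start": f, "end": f + 7},       # "Florida"
]
entities = group_entities(text, mock_tokens)
print(entities)
```

With a real model, the same function could be called as `group_entities(example, ner_results)`. Recent versions of `transformers` also accept an `aggregation_strategy` argument on the `"ner"` pipeline (for example `"simple"`), which may make a helper like this unnecessary.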