---
language:
- sk
license: mit
tags:
- generated_from_trainer
datasets:
- wikiann
metrics:
- precision
- recall
- f1
- accuracy
inference: false
widget:
- text: Zuzana Čaputová sa narodila 21. júna 1973 v Bratislave.
  example_title: Named Entity Recognition
base_model: gerulata/slovakbert
model-index:
- name: slovakbert-ner
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: wikiann
      type: wikiann
      args: sk
    metrics:
    - type: precision
      value: 0.9327115256495669
      name: Precision
    - type: recall
      value: 0.9470124013528749
      name: Recall
    - type: f1
      value: 0.9398075632132469
      name: F1
    - type: accuracy
      value: 0.9785228256835333
      name: Accuracy
---

# Named Entity Recognition based on SlovakBERT

This model is a fine-tuned version of [gerulata/slovakbert](https://huggingface.co/gerulata/slovakbert) on the Slovak subset of the wikiann dataset. It achieves the following results on the evaluation set:
- Loss: 0.1600
- Precision: 0.9327
- Recall: 0.9470
- F1: 0.9398
- Accuracy: 0.9785

## Intended uses & limitations

Supported classes: LOCATION, PERSON, ORGANIZATION

Basic usage with the `transformers` pipeline:

```
from transformers import pipeline

ner_pipeline = pipeline(task='ner', model='crabz/slovakbert-ner')
input_sentence = "Minister financií a líder mandátovo najsilnejšieho hnutia OĽaNO Igor Matovič upozorňuje, že následky tretej vlny budú na Slovensku veľmi veľké."
classifications = ner_pipeline(input_sentence)
```

Visualizing the result with `displaCy` (the labels use the Slovak class names OSOBA, ORGANIZÁCIA, and LOKALITA for PERSON, ORGANIZATION, and LOCATION):

```
import spacy
from spacy import displacy

# Map class ids to BIO tags.
# Assumes the pipeline reports integer class ids under 'entity', matching these keys.
ner_map = {0: '0', 1: 'B-OSOBA', 2: 'I-OSOBA', 3: 'B-ORGANIZÁCIA', 4: 'I-ORGANIZÁCIA', 5: 'B-LOKALITA', 6: 'I-LOKALITA'}

# Merge each B- token with the I- tokens that follow it into one (label, start, end) span.
entities = []
for i in range(len(classifications)):
    if classifications[i]['entity'] != 0:
        if ner_map[classifications[i]['entity']][0] == 'B':
            j = i + 1
            while j < len(classifications) and ner_map[classifications[j]['entity']][0] == 'I':
                j += 1
            entities.append((ner_map[classifications[i]['entity']].split('-')[1], classifications[i]['start'], classifications[j - 1]['end']))

nlp = spacy.blank("en")  # it should work with any language
doc = nlp(input_sentence)

ents = []
for ee in entities:
    span = doc.char_span(ee[1], ee[2], ee[0])
    # char_span returns None when the offsets do not align with spaCy token boundaries;
    # skip those spans, because doc.ents does not accept None.
    if span is not None:
        ents.append(span)
doc.ents = ents

options = {"ents": ["OSOBA", "ORGANIZÁCIA", "LOKALITA"],
           "colors": {"OSOBA": "lightblue", "ORGANIZÁCIA": "lightcoral", "LOKALITA": "lightgreen"}}
displacy_html = displacy.render(doc, style="ent", options=options)
```
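
The span-merging step of the `displaCy` example can also be written as a standalone function, which makes it possible to check the logic without downloading the model. The following is a minimal sketch, not part of the model's own code: `merge_bio_spans` and the `fake_classifications` list are hypothetical names, the offsets were written by hand for the widget sentence rather than produced by the model, and the sketch assumes each item carries an integer class id under `'entity'` plus `'start'` and `'end'` character offsets.

```python
NER_MAP = {0: '0', 1: 'B-OSOBA', 2: 'I-OSOBA', 3: 'B-ORGANIZÁCIA',
           4: 'I-ORGANIZÁCIA', 5: 'B-LOKALITA', 6: 'I-LOKALITA'}


def merge_bio_spans(classifications, ner_map=NER_MAP):
    """Merge each B- token with the same-class I- tokens that directly follow it.

    Returns a list of (label, start_char, end_char) tuples.
    """
    entities = []
    i = 0
    while i < len(classifications):
        tag = ner_map[classifications[i]['entity']]
        if tag.startswith('B-'):
            label = tag.split('-', 1)[1]
            j = i + 1
            # Extend the span while the next token continues the same class.
            while (j < len(classifications)
                   and ner_map[classifications[j]['entity']] == 'I-' + label):
                j += 1
            entities.append((label,
                             classifications[i]['start'],
                             classifications[j - 1]['end']))
            i = j
        else:
            # Non-entity tokens and I- tokens without a preceding B- are skipped.
            i += 1
    return entities


# Hand-written stand-in for pipeline output (hypothetical, not real model output).
text = "Zuzana Čaputová sa narodila 21. júna 1973 v Bratislave."
fake_classifications = [
    {'entity': 1, 'start': 0, 'end': 6},    # "Zuzana"     -> B-OSOBA
    {'entity': 2, 'start': 7, 'end': 15},   # "Čaputová"   -> I-OSOBA
    {'entity': 0, 'start': 16, 'end': 18},  # "sa"         -> no entity
    {'entity': 5, 'start': 44, 'end': 54},  # "Bratislave" -> B-LOKALITA
]

spans = merge_bio_spans(fake_classifications)
print(spans)  # [('OSOBA', 0, 15), ('LOKALITA', 44, 54)]
```

One deliberate difference from the loop in the `displaCy` example: this sketch only extends a span across `I-` tags of the same class as the opening `B-` tag, so a stray `I-LOKALITA` directly after a `B-OSOBA` would not be absorbed into the person span. The resulting tuples can be passed to `doc.char_span` in the same way as the `entities` list.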