Model Card for medical-ner-koelectra
Model Summary
This model is a fine-tuned version of monologg/koelectra-base-v3-discriminator.
We fine-tuned it on two datasets: the Korean Bio-Medical Corpus (KBMC) and the Naver X Changwon Univ NER dataset.
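As a rough illustration of how a fine-tuned token-classification checkpoint like this one is typically used, the sketch below wraps the Hugging Face `transformers` pipeline. It is an assumption-laden example, not the authors' documented usage: the helper names `load_ner_pipeline` and `group_by_label` are hypothetical, and the full Hub repository id is not stated in this card, so it is left as a parameter.

```python
from collections import defaultdict


def load_ner_pipeline(model_id):
    """Build a token-classification pipeline for a Hub repository id.

    `model_id` is an assumption: this card does not state the full
    repository path, so the caller has to supply it.
    """
    # Imported lazily so group_by_label() works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_by_label(predictions):
    """Collect predicted entity strings under their entity label.

    Expects the list of dicts returned by an aggregated pipeline call,
    each carrying at least the keys "entity_group" and "word".
    """
    grouped = defaultdict(list)
    for pred in predictions:
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)
```

With the repository id filled in, `group_by_label(load_ner_pipeline(model_id)(text))` would return a mapping from labels such as `Disease` or `Body` to the matched spans in `text`.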
Model Details
Model Description
- Developed by: Sungjoo Byun (Grace Byun)
- Language(s) (NLP): Korean
- License: Apache 2.0
- Finetuned from model: monologg/koelectra-base-v3-discriminator
Training Data
The model was trained on the Naver X Changwon Univ NER dataset and the Korean Bio-Medical Corpus (KBMC).
Model Performance
Overall Metrics
- F1 Score: 0.8886
- Loss: 0.2949
- Precision: 0.8844
- Recall: 0.8928
Class-wise Performance
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
AFW | 0.6676 | 0.6326 | 0.6496 | 362 |
ANM | 0.7476 | 0.7800 | 0.7635 | 600 |
Body | 0.9731 | 0.9813 | 0.9772 | 1068 |
CVL | 0.8492 | 0.8579 | 0.8536 | 4977 |
DAT | 0.9078 | 0.9286 | 0.9181 | 2130 |
Disease | 0.9738 | 0.9872 | 0.9805 | 2109 |
EVT | 0.7332 | 0.7446 | 0.7389 | 1026 |
FLD | 0.6138 | 0.6170 | 0.6154 | 188 |
LOC | 0.8721 | 0.8691 | 0.8706 | 1734 |
MAT | 0.5385 | 0.5000 | 0.5185 | 14 |
NUM | 0.9227 | 0.9305 | 0.9266 | 4660 |
ORG | 0.8917 | 0.8866 | 0.8892 | 3307 |
PER | 0.8918 | 0.9049 | 0.8983 | 3626 |
PLT | 0.2941 | 0.2174 | 0.2500 | 23 |
TIM | 0.8644 | 0.9173 | 0.8901 | 278 |
Treatment | 0.9468 | 0.9852 | 0.9656 | 271 |
Averages
Metric | Micro Avg | Macro Avg | Weighted Avg |
---|---|---|---|
Precision | 0.8844 | 0.7930 | 0.8841 |
Recall | 0.8928 | 0.7963 | 0.8928 |
F1-Score | 0.8886 | 0.7941 | 0.8884 |
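The macro and weighted averages can be recomputed from the class-wise table as a sanity check. The sketch below copies the per-class values from that table and assumes the usual definitions: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. The names `CLASS_METRICS`, `macro_average`, and `weighted_average` are illustrative only.

```python
# (precision, recall, f1, support) per class, copied from the class-wise table.
CLASS_METRICS = {
    "AFW": (0.6676, 0.6326, 0.6496, 362),
    "ANM": (0.7476, 0.7800, 0.7635, 600),
    "Body": (0.9731, 0.9813, 0.9772, 1068),
    "CVL": (0.8492, 0.8579, 0.8536, 4977),
    "DAT": (0.9078, 0.9286, 0.9181, 2130),
    "Disease": (0.9738, 0.9872, 0.9805, 2109),
    "EVT": (0.7332, 0.7446, 0.7389, 1026),
    "FLD": (0.6138, 0.6170, 0.6154, 188),
    "LOC": (0.8721, 0.8691, 0.8706, 1734),
    "MAT": (0.5385, 0.5000, 0.5185, 14),
    "NUM": (0.9227, 0.9305, 0.9266, 4660),
    "ORG": (0.8917, 0.8866, 0.8892, 3307),
    "PER": (0.8918, 0.9049, 0.8983, 3626),
    "PLT": (0.2941, 0.2174, 0.2500, 23),
    "TIM": (0.8644, 0.9173, 0.8901, 278),
    "Treatment": (0.9468, 0.9852, 0.9656, 271),
}

PRECISION, RECALL, F1 = 0, 1, 2  # column indices into each tuple


def macro_average(column):
    """Unweighted mean of one metric column over all classes."""
    values = [row[column] for row in CLASS_METRICS.values()]
    return sum(values) / len(values)


def weighted_average(column):
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[3] for row in CLASS_METRICS.values())
    weighted_sum = sum(row[column] * row[3] for row in CLASS_METRICS.values())
    return weighted_sum / total_support


if __name__ == "__main__":
    for name, column in (("Precision", PRECISION), ("Recall", RECALL), ("F1", F1)):
        print(f"{name}: macro={macro_average(column):.4f} "
              f"weighted={weighted_average(column):.4f}")
```

The gap between the macro F1 and the micro and weighted F1 comes mostly from low-support classes such as PLT and MAT, which count as much as CVL or NUM in the macro average.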
Citations
Please cite our KBMC paper:
@misc{byun2024korean,
title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition},
author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
year={2024},
eprint={2403.16158},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model Card Contact
For any questions or issues, please contact [email protected].