Feature | Description |
---|---|
Name | en_roberta_base_plant_ner_case |
Version | 1.0.0 |
spaCy | >=3.5.2,<3.6.0 |
Default Pipeline | transformer , ner |
Components | transformer , ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | n/a |
License | Apache-2.0 |
Author | Mohammad Othman |
GitHub | Github |
Model Architecture and Training
The Named Entity Recognition (NER) model is based on a pipeline architecture consisting of a Transformer component and an NER component. The Transformer component uses a pre-trained RoBERTa-base model, which is based on the BERT architecture. This component uses a fast tokenizer and processes input text in windows of 128 tokens with a stride of 96 tokens.
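To make the windowing concrete, here is a minimal sketch of how overlapping windows of 128 tokens with a stride of 96 would cover a text. The function name `strided_windows` is hypothetical, and this is not the library's own implementation; it only illustrates the arithmetic, assuming the window and stride are counted in tokens as described above.

```python
def strided_windows(n_tokens, window=128, stride=96):
    """Return (start, end) token offsets of overlapping windows.

    Illustrative only: `window` and `stride` default to the values
    quoted in this model card; the function itself is an assumption.
    """
    windows = []
    start = 0
    while start < n_tokens:
        end = min(start + window, n_tokens)
        windows.append((start, end))
        if end == n_tokens:
            break  # the last window reached the end of the text
        start += stride
    return windows


# A 300-token text is covered by three windows; neighbours share 32 tokens.
print(strided_windows(300))  # [(0, 128), (96, 224), (192, 300)]
```

Because the stride is smaller than the window, every token near a window boundary also appears with more context in the neighbouring window.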
The NER component is a Transition-Based Parser (v2) with a hidden width of 64 and maxout pieces set to 2. It uses a Transformer Listener for the tok2vec layer with a grad_factor of 1.0 and mean pooling.
Training ran on a Tesla V100 GPU. The optimizer was Adam with a warmup-linear learning rate schedule, L2 regularization of 0.01, and gradient clipping of 1.0. A batch size of 128 was used, with gradient accumulation over 3 steps and a dropout rate of 0.1. The model was trained with an early-stopping patience of 1,600 steps, a maximum of 20,000 steps, and an evaluation every 200 steps.
The learning rate followed the warmup-linear schedule: it rose linearly over a warmup period of 250 steps to the initial learning rate of 0.00005, then declined linearly until the total of 20,000 steps was reached. The scores this configuration produced are listed under Evaluation results.
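The shape of such a schedule can be sketched in a few lines. A warmup-linear schedule is commonly defined as a linear ramp up to the peak rate followed by a linear decline to zero, and the sketch assumes that definition. The function name `warmup_linear_rate` is hypothetical, the ramp is assumed to start from zero at step 0, and the values may differ in detail from what the training library computes.

```python
def warmup_linear_rate(step, initial_rate=0.00005, warmup_steps=250, total_steps=20000):
    """Learning rate at a given step under an assumed warmup-linear shape."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the initial (peak) rate.
        factor = step / max(1, warmup_steps)
    else:
        # Linear decline from the peak rate down to 0 at total_steps.
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return factor * initial_rate


for step in (0, 125, 250, 10125, 20000):
    print(step, warmup_linear_rate(step))
```

Under these assumptions the rate is half the peak at step 125, reaches the peak at step 250, and is back to half the peak at step 10,125, the midpoint of the decline.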
Model Capabilities
This model recognizes more than 500 different fruits and vegetables, including their varieties and name variations. On the self-reported evaluation it reaches a precision of 0.970, a recall of 0.975, and an F-score of 0.973 for plant named entity recognition.
Requirements
- spaCy: >=3.5.2,<3.6.0
- spaCy Transformers: >=1.2.3,<1.3.0
Example Usage
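```python
# Run in a Jupyter notebook: the leading "!" runs pip as a shell command.
!pip install https://huggingface.co/MohammadOthman/en_roberta_base_plant_ner_case/resolve/main/en_roberta_base_plant_ner_case-any-py3-none-any.whl

import spacy
from spacy import displacy

# Load the installed pipeline (transformer + ner).
nlp = spacy.load("en_roberta_base_plant_ner_case")
text = "I bought some bananas, apples, and oranges from the market."
doc = nlp(text)

# Highlight the recognized entities inline in the notebook.
displacy.render(doc, style='ent', jupyter=True)
```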
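Outside a notebook, the recognized entities can also be read from `doc.ents`. The following is a minimal sketch, assuming the model has been installed as shown above; the helper names `plant_mentions` and `run_example` are hypothetical and not part of the model package.

```python
def plant_mentions(doc, label="PLANT"):
    """Collect (text, start_char, end_char) for entities with the given label."""
    return [
        (ent.text, ent.start_char, ent.end_char)
        for ent in doc.ents
        if ent.label_ == label
    ]


def run_example():
    """Load the pipeline and print the PLANT entities found in a sentence."""
    import spacy  # imported here so the helper above works without spaCy loaded

    nlp = spacy.load("en_roberta_base_plant_ner_case")
    doc = nlp("I bought some bananas, apples, and oranges from the market.")
    for text, start, end in plant_mentions(doc):
        print(text, start, end)
```

Calling `run_example()` prints one line per recognized entity with its character offsets, which is convenient for scripts and batch jobs where displaCy rendering is not available.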
Label Scheme
Component | Labels |
---|---|
ner | PLANT |
Citation
If you use this model in your research or applications, please cite it as follows:
@misc{othman2023en_roberta_base_plant_ner_case,
author = {Mohammad Othman},
title = {en_roberta_base_plant_ner_case: A Named Entity Recognition Model for Identifying Plant Names},
year = {2023},
publisher = {Hugging Face Model Hub},
url = {https://huggingface.co/MohammadOthman/en_roberta_base_plant_ner_case}
}
Feedback and Support
For any questions, issues, or suggestions related to this model, please feel free to start a discussion on the model's discussion board.
If you need further assistance or would like to provide feedback directly to the author, you can contact Mohammad Othman via email at: [email protected]
Evaluation results
- NER Precision (self-reported): 0.970
- NER Recall (self-reported): 0.975
- NER F Score (self-reported): 0.973
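For readers who want to relate the three numbers, the F-score is conventionally the harmonic mean of precision and recall. The short sketch below recomputes it from the rounded values listed here; the function name `f_score` is hypothetical, and the check assumes the standard F1 definition rather than anything stated in this model card.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute from the rounded, self-reported precision and recall.
print(round(f_score(0.970, 0.975), 3))  # 0.972
```

The recomputed value differs from the reported 0.973 by 0.001. One plausible explanation is that the reported precision and recall were rounded to three decimals before being listed, but that is an assumption.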