---
license: apache-2.0
language:
- en
base_model: urchade/gliner_small-v2
datasets:
- gretelai/synthetic_pii_finance_multilingual
---

# GLiNER-Finance-PII-Detection

## Training and evaluation data

The model was fine-tuned on the `gretelai/synthetic_pii_finance_multilingual` dataset for 0.5 epochs, i.e. half a pass over the training data.

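As a rough illustration of what a fractional epoch means in practice, the sketch below converts an epoch count into a number of optimizer steps. The function name `steps_for_epochs` and the dataset and batch sizes are hypothetical and are not taken from the training notebook; the calculation also assumes no gradient accumulation.

```python
import math


def steps_for_epochs(num_examples: int, batch_size: int, epochs: float) -> int:
    """Return how many optimizer steps cover `epochs` passes over the data."""
    # One epoch is one full pass, i.e. ceil(num_examples / batch_size) batches.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    # A fractional epoch stops part-way through that pass.
    return math.ceil(steps_per_epoch * epochs)


# Hypothetical sizes for illustration only, not the actual training setup.
print(steps_for_epochs(num_examples=10_000, batch_size=8, epochs=0.5))
```

With these made-up numbers, half an epoch means the model sees each of roughly half the examples once.
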
## Training procedure notebook

https://github.com/mit1280/fined-tuning/blob/main/Fine_Tune_GLiNER_Token_Classification.ipynb

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-5

## Inference Code

```python
# Install the library first: `pip install -q gliner`
# (in a notebook cell: `!pip install -q gliner`).
from gliner import GLiNER

# Load the fine-tuned model from the Hugging Face Hub
fine_tuned_model = GLiNER.from_pretrained("Mit1208/gliner-fine-tuned-pii-finance-multilingual")

text = "Loan Application\n\nFull Legal Name: Luigi Clelia Togliatti\nDate of Birth: 11/27/1967\n\nMailing Address:\n4893 Justin Terrace\n[City, State, Zip Code]\n\nPhone Number: [(123) 456-7890]\nEmail Address: [[email protected]]\n\nEducational Institution: University of Toronto\nExpected Graduation Date: [Graduation Year]\n\nProgram of Study: Bachelor of Science in Computer Science\n\nFuture Career Plans: After graduation, I plan to pursue a career as a software engineer at a tech company. I am particularly interested in the field of artificial intelligence and machine learning.\n\nLoan Amount Requested: $20,000\n\nPersonal Financial Information:\n\n* Monthly Income: $2,500\n* Monthly Expenses: $1,500\n* Total Assets: $10,000\n* Total Debts: $5,000\n\nI confirm that all the information provided is true and accurate to the best of my knowledge.\n\nSignature: Luigi Clelia Togliatti\nDate: [Today's Date]"

# Labels for entity prediction
labels = ["street_address", "company", "date_of_birth", "email", "date", "name"]

# Perform entity prediction, keeping only spans that clear the 0.85 threshold
entities = fine_tuned_model.predict_entities(text, labels, threshold=0.85)

# Display predicted entities, their labels, and their character offsets
for entity in entities:
    print("(", entity["text"], "=>", entity["label"], ") (start & end ==>", entity["start"], "&", entity["end"], ")")

# Output
'''
( Luigi Clelia Togliatti => name ) (start & end ==> 35 & 57 )
( 11/27/1967 => date_of_birth ) (start & end ==> 73 & 83 )
( 4893 Justin Terrace => street_address ) (start & end ==> 102 & 121 )
( [email protected] => email ) (start & end ==> 194 & 219 )
( Luigi Clelia Togliatti => name ) (start & end ==> 842 & 864 )
'''
```

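The `start` and `end` offsets returned for each entity can be used to mask the detected PII. The following is a minimal sketch, not part of the original notebook: the helper `redact` and the sample data are hypothetical, it relies only on the `start`, `end`, and `label` keys shown in the output above, and it assumes the predicted spans do not overlap.

```python
def redact(text: str, entities: list[dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder (hypothetical helper)."""
    redacted = text
    # Work from the last span to the first so earlier offsets stay valid
    # while the string changes length.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{entity['label'].upper()}]"
        redacted = redacted[: entity["start"]] + placeholder + redacted[entity["end"] :]
    return redacted


# Hand-written entities in the same dict shape that predict_entities returns.
sample_text = "Full Legal Name: Luigi Clelia Togliatti\nDate of Birth: 11/27/1967"
sample_entities = [
    {"text": "Luigi Clelia Togliatti", "label": "name", "start": 17, "end": 39},
    {"text": "11/27/1967", "label": "date_of_birth", "start": 55, "end": 65},
]

print(redact(sample_text, sample_entities))
```

In real use, `sample_entities` would be replaced by the `entities` list from `predict_entities`, and `sample_text` by the same `text` that was passed to the model.
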