# bert-phishing-classifier_student

This model is a modified version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased), trained via knowledge distillation from [shawhin/bert-phishing-classifier_teacher](https://huggingface.co/shawhin/bert-phishing-classifier_teacher) on the [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification) dataset. It achieves the following results on the testing set:
- Loss (training): 0.0563
- Accuracy: 0.9022
- Precision: 0.9426
- Recall: 0.8603
- F1 Score: 0.8995
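For readers who want to sanity-check these numbers, the following is a hedged sketch of how they could be reproduced. The split and column names (`test`, `text`, `labels`) are assumptions about the dataset schema, not documented facts from this card.

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from transformers import pipeline

clf = pipeline("text-classification", model="shawhin/bert-phishing-classifier_student")

# Assumed schema: a "test" split with "text" and "labels" columns.
# Inspect the dataset to confirm before relying on this.
test = load_dataset("shawhin/phishing-site-classification", split="test")

# Map predicted label strings back to integer ids via the model config,
# so this works regardless of how the labels are named.
label2id = clf.model.config.label2id
preds = [label2id[p["label"]] for p in clf(test["text"], truncation=True)]

print("accuracy :", accuracy_score(test["labels"], preds))
print("precision:", precision_score(test["labels"], preds))
print("recall   :", recall_score(test["labels"], preds))
print("f1       :", f1_score(test["labels"], preds))
```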
## Model description

Student model for a knowledge distillation example.
Video | Blog | Example code
## Intended uses & limitations
This model was created for educational purposes.
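Since the card is aimed at learners, here is a minimal usage sketch with the `transformers` pipeline API. The input string is a made-up example, and the label names in the output depend on the model's `id2label` mapping, which this card does not document.

```python
from transformers import pipeline

# Load the distilled student model from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="shawhin/bert-phishing-classifier_student",
)

# Hypothetical input: a suspicious-looking URL. The label names returned
# here depend on the model's id2label mapping.
print(classifier("http://secure-login-update.example-bank.com/verify"))
```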
## Training and evaluation data

The training, testing, and validation data are available here: [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification).
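A short sketch for loading the data with the `datasets` library; the exact split names are an assumption based on the description above, so inspect the returned object to confirm.

```python
from datasets import load_dataset

# Load all splits of the phishing-site classification dataset.
data = load_dataset("shawhin/phishing-site-classification")

# Split names (e.g. train/test/validation) are assumed from the card;
# printing the DatasetDict shows what the dataset actually provides.
print(data)
```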
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- num_epochs: 5
- temperature: 2.0
- alpha: 0.5 (see the loss sketch below)
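The listed alpha is presumably the weighting between the soft (teacher) loss and the hard (label) loss rather than an Adam parameter, since the learning rate is listed separately. Under that assumption, here is a minimal sketch of how temperature and alpha typically enter a distillation objective; this is an illustrative reconstruction, not the exact training code.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend the soft (teacher) and hard (label) losses.

    Illustrative sketch only -- the formulation used to train this
    model may differ.
    """
    # Soft targets: KL divergence between temperature-softened distributions.
    # The T**2 factor keeps gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard targets: standard cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    # alpha weights the distillation term against the supervised term.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```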