bert-phishing-classifier_student

This model is a modified version of distilbert/distilbert-base-uncased, trained via knowledge distillation from shawhin/bert-phishing-classifier_teacher on the shawhin/phishing-site-classification dataset. It achieves the following results on the testing set:

  • Loss (training): 0.0563
  • Accuracy: 0.9022
  • Precision: 0.9426
  • Recall: 0.8603
  • F1 Score: 0.8995
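
As a quick sanity check on the metrics above, F1 can be recomputed as the harmonic mean of precision and recall. This is only a minimal sketch: it uses the rounded values printed in this card, so a small discrepancy in the last digit is expected.

```python
# Values copied from the results list in this card (already rounded to 4 places).
precision = 0.9426
recall = 0.8603
reported_f1 = 0.8995

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Because the inputs are rounded, the recomputed value can differ slightly
# from the reported one in the fourth decimal place.
print(f"recomputed F1: {f1:.4f} (reported: {reported_f1})")
```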

Model description

Student model for a knowledge distillation example.

Video | Blog | Example code

Intended uses & limitations

This model was created for educational purposes.

Training and evaluation data

The training, testing, and validation data are available here: shawhin/phishing-site-classification.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • num_epochs: 5
  • temperature: 2.0
  • adam optimizer alpha: 0.5
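
To illustrate how a temperature of 2.0 and an alpha of 0.5 typically enter a distillation objective, here is a minimal pure-Python sketch for a single example. It assumes the common formulation in which alpha weights a temperature-softened KL term against the ordinary cross-entropy on the true label. The card does not state the exact loss used, so the function names, the role given to alpha, and the sample logits below are all illustrative assumptions rather than the author's implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by the given temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation loss for one example (assumed formulation)."""
    # Soft targets from the teacher and softened predictions from the student.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)

    # KL(teacher || student), scaled by T^2 as is conventional so gradient
    # magnitudes stay comparable across temperatures.
    kl = sum(p * math.log(p / q)
             for p, q in zip(teacher_soft, student_soft) if p > 0)
    soft_loss = kl * temperature ** 2

    # Ordinary cross-entropy of the student against the true label (T = 1).
    hard_loss = -math.log(softmax(student_logits)[true_label])

    # alpha balances imitating the teacher against fitting the labels.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Made-up logits for a two-class (phishing vs. not phishing) example.
loss = distillation_loss([1.5, -0.5], [3.0, -2.0], true_label=0)
print(f"distillation loss: {loss:.4f}")
```

With alpha at 0.5, the soft and hard terms contribute equally. When the student's logits match the teacher's exactly, the KL term vanishes and only the cross-entropy half remains.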

Model size: 52.8M params (Safetensors, tensor type F32)