---
license: apache-2.0
datasets:
- shawhin/phishing-site-classification
metrics:
- accuracy
- recall
- precision
- f1
base_model: distilbert/distilbert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---

# bert-phishing-classifier_student

This model is a modified version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased), trained via knowledge distillation from [shawhin/bert-phishing-classifier_teacher](https://huggingface.co/shawhin/bert-phishing-classifier_teacher) using the [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification) dataset.

It achieves the following results on the testing set:
- Loss (training): 0.0563
- Accuracy: 0.9022
- Precision: 0.9426
- Recall: 0.8603
- F1 Score: 0.8995

## Model description

Student model for a knowledge distillation example.

[Video](https://youtu.be/FLkUOkeMd5M) | [Blog](https://towardsdatascience.com/compressing-large-language-models-llms-9f406eea5b5e) | [Example code](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/model-compression)

## Intended uses & limitations

This model was created for educational purposes.

## Training and evaluation data

The training, validation, and testing splits are available here: [shawhin/phishing-site-classification](https://huggingface.co/datasets/shawhin/phishing-site-classification).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- num_epochs: 5
- temperature: 2.0
- alpha: 0.5

A sketch of how the temperature and alpha enter the distillation objective follows below.
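This card does not reproduce the training code, but under the usual knowledge-distillation setup, and assuming alpha weights the soft (teacher) loss against the hard (label) loss, the objective these hyperparameters describe looks roughly like the minimal PyTorch sketch below. The function name `distillation_loss` is illustrative, not taken from the example repo:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soften both distributions with the temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between student and teacher distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(soft_student, soft_targets,
                         reduction="batchmean") * temperature ** 2

    # Standard cross-entropy against the ground-truth phishing labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    # Blend the two objectives; alpha = 0.5 weights them equally
    # (assumed interpretation of the card's alpha setting).
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```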
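For completeness, loading the classifier for inference is a one-liner with the transformers pipeline. The repo id `shawhin/bert-phishing-classifier_student` is inferred from the teacher model's naming, and the URL below is a made-up example:

```python
from transformers import pipeline

# Repo id inferred from the teacher's naming convention; adjust if it differs.
classifier = pipeline(
    "text-classification",
    model="shawhin/bert-phishing-classifier_student",
)

# Classify a (made-up) URL as phishing or legitimate.
print(classifier("http://secure-login.example-bank.com/verify"))
```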