bert-phishing-classifier_student

This model is a modified version of distilbert/distilbert-base-uncased, trained via knowledge distillation from shawhin/bert-phishing-classifier_teacher on the shawhin/phishing-site-classification dataset. It achieves the following results on the test set:

  • Loss (training): 0.0563
  • Accuracy: 0.9022
  • Precision: 0.9426
  • Recall: 0.8603
  • F1 Score: 0.8995

Model description

Student model for the knowledge distillation example.

Video | Blog | Example code

Intended uses & limitations

This model was created for educational purposes.

Training and evaluation data

The training, validation, and testing splits are available here: shawhin/phishing-site-classification.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • num_epochs: 5
  • temperature: 2.0
  • alpha (distillation loss weight): 0.5
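The temperature and alpha settings above correspond to the standard knowledge-distillation objective: a weighted sum of the hard-label cross-entropy and a temperature-softened KL term between teacher and student outputs. A minimal plain-Python sketch of that objective, assuming alpha weights the hard term (the function names and toy logits below are illustrative, not taken from the training code):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    # Hard term: cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep its gradient magnitude comparable to the hard term.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_t, p_s))
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft

# Toy binary example (phishing vs. benign); the logits are made up.
loss = distillation_loss([1.2, -0.8], [2.5, -1.5], label=0)
```

When the student's logits match the teacher's exactly, the soft term vanishes and only the weighted cross-entropy remains, which is what drives the student toward the teacher's output distribution during training.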
Model size: 52.8M parameters (Safetensors, F32 tensors)
