distilbert
This model is a fine-tuned version of distilbert-base-cased on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0005
- o precision: 0.9946
- o recall: 0.9960
- o f1: 0.9953
- i precision: 0.9994
- i recall: 0.9993
- i f1: 0.9994
- Weighted avg f1: 0.9989
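As a sanity check on the per-label numbers above, F1 can be recomputed as the harmonic mean of precision and recall. The snippet below is a minimal sketch, not the evaluation code used for this model: the helper `f1_score` and the `reported` dictionary are illustrative names, and the inputs are the rounded values from this card, so the recomputed scores can only be expected to agree with the reported ones to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1) for each label, copied from the card.
reported = {
    "o": (0.9946, 0.9960, 0.9953),
    "i": (0.9994, 0.9993, 0.9994),
}

for label, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    # Differences below 1e-4 are explained by the card's 4-decimal rounding.
    print(f"{label}: recomputed={recomputed:.5f} reported={f1:.4f}")
```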
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 426
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2
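To make the scheduler settings above concrete, here is a minimal pure-Python sketch of a linear schedule with warmup: the learning rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0. It is an illustration under stated assumptions, not the training code: `TOTAL_STEPS = 12500` is taken from the last step logged in the training results, and the function name `linear_warmup_decay_lr` is hypothetical.

```python
import math

PEAK_LR = 5e-5       # learning_rate from the card
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 12500  # assumption: final step logged in the results table


def linear_warmup_decay_lr(step: int,
                           peak_lr: float = PEAK_LR,
                           total_steps: int = TOTAL_STEPS,
                           warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from the peak rate down to 0.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 500, 1250, 6875, 12500):
    print(step, linear_warmup_decay_lr(step))
```

Under these assumptions the warmup covers the first 10% of the run, so the first evaluation at step 500 happens while the learning rate is still ramping up.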
Training results
Training Loss | Epoch | Step | Validation Loss | o precision | o recall | o f1 | i precision | i recall | i f1 | Weighted avg f1 |
---|---|---|---|---|---|---|---|---|---|---|
0.0187 | 0.08 | 500 | 0.0033 | 0.9363 | 0.9990 | 0.9666 | 0.9999 | 0.9906 | 0.9952 | 0.9918 |
0.002 | 0.16 | 1000 | 0.0014 | 0.9762 | 0.9947 | 0.9854 | 0.9993 | 0.9967 | 0.9980 | 0.9964 |
0.0016 | 0.24 | 1500 | 0.0012 | 0.9813 | 0.9918 | 0.9865 | 0.9989 | 0.9974 | 0.9981 | 0.9967 |
0.0015 | 0.32 | 2000 | 0.0012 | 0.9801 | 0.9960 | 0.9880 | 0.9994 | 0.9972 | 0.9983 | 0.9971 |
0.0013 | 0.4 | 2500 | 0.0010 | 0.9834 | 0.9960 | 0.9896 | 0.9994 | 0.9977 | 0.9986 | 0.9975 |
0.001 | 0.48 | 3000 | 0.0008 | 0.9881 | 0.9959 | 0.9920 | 0.9994 | 0.9983 | 0.9989 | 0.9980 |
0.0009 | 0.56 | 3500 | 0.0009 | 0.9854 | 0.9955 | 0.9904 | 0.9994 | 0.9980 | 0.9987 | 0.9977 |
0.0009 | 0.64 | 4000 | 0.0008 | 0.9883 | 0.9946 | 0.9914 | 0.9993 | 0.9984 | 0.9988 | 0.9979 |
0.0009 | 0.72 | 4500 | 0.0009 | 0.9935 | 0.9884 | 0.9910 | 0.9984 | 0.9991 | 0.9988 | 0.9978 |
0.0008 | 0.8 | 5000 | 0.0008 | 0.9913 | 0.9926 | 0.9920 | 0.9990 | 0.9988 | 0.9989 | 0.9981 |
0.0008 | 0.88 | 5500 | 0.0007 | 0.9874 | 0.9976 | 0.9925 | 0.9997 | 0.9982 | 0.9990 | 0.9982 |
0.0008 | 0.96 | 6000 | 0.0008 | 0.9924 | 0.9923 | 0.9923 | 0.9989 | 0.9990 | 0.9989 | 0.9982 |
0.0005 | 1.04 | 6500 | 0.0007 | 0.9924 | 0.9948 | 0.9936 | 0.9993 | 0.9990 | 0.9991 | 0.9985 |
0.0005 | 1.12 | 7000 | 0.0007 | 0.9885 | 0.9973 | 0.9929 | 0.9996 | 0.9984 | 0.9990 | 0.9983 |
0.0005 | 1.2 | 7500 | 0.0007 | 0.9890 | 0.9970 | 0.9930 | 0.9996 | 0.9985 | 0.9990 | 0.9983 |
0.0006 | 1.28 | 8000 | 0.0006 | 0.9927 | 0.9965 | 0.9946 | 0.9995 | 0.9990 | 0.9993 | 0.9987 |
0.0004 | 1.36 | 8500 | 0.0005 | 0.9934 | 0.9962 | 0.9948 | 0.9995 | 0.9991 | 0.9993 | 0.9987 |
0.0004 | 1.44 | 9000 | 0.0006 | 0.9941 | 0.9953 | 0.9947 | 0.9994 | 0.9992 | 0.9993 | 0.9987 |
0.0004 | 1.52 | 9500 | 0.0005 | 0.9940 | 0.9951 | 0.9946 | 0.9993 | 0.9992 | 0.9993 | 0.9987 |
0.0004 | 1.6 | 10000 | 0.0005 | 0.9942 | 0.9958 | 0.9950 | 0.9994 | 0.9992 | 0.9993 | 0.9988 |
0.0003 | 1.68 | 10500 | 0.0006 | 0.9940 | 0.9951 | 0.9945 | 0.9993 | 0.9992 | 0.9992 | 0.9987 |
0.0005 | 1.76 | 11000 | 0.0005 | 0.9953 | 0.9947 | 0.9950 | 0.9993 | 0.9994 | 0.9993 | 0.9988 |
0.0004 | 1.84 | 11500 | 0.0005 | 0.9944 | 0.9958 | 0.9951 | 0.9994 | 0.9992 | 0.9993 | 0.9988 |
0.0004 | 1.92 | 12000 | 0.0005 | 0.9943 | 0.9962 | 0.9953 | 0.9995 | 0.9992 | 0.9993 | 0.9989 |
0.0004 | 2.0 | 12500 | 0.0005 | 0.9946 | 0.9960 | 0.9953 | 0.9994 | 0.9993 | 0.9994 | 0.9989 |
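The Epoch and Step columns above are related by a fixed number of optimizer steps per epoch. The sketch below derives that number from the final row (step 12500 at epoch 2.0) and uses it to estimate the training set size. The size estimate is an assumption, not a documented fact: it presumes a single device, no gradient accumulation, and full batches, none of which the card states. All names here are illustrative.

```python
TOTAL_STEPS = 12500    # final step in the results table
NUM_EPOCHS = 2         # num_epochs from the hyperparameters
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameters

# Optimizer steps taken in one pass over the training data.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS


def step_to_epoch(step: int) -> float:
    """Fractional epoch reached at a given optimizer step, to 2 decimals."""
    return round(step / steps_per_epoch, 2)


# Rough upper bound on training examples per epoch, assuming one device,
# no gradient accumulation, and full batches (the card confirms none of these).
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(steps_per_epoch, step_to_epoch(500), approx_train_examples)
```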
Framework versions
- Transformers 4.34.1
- PyTorch 2.1.0
- Datasets 2.14.6
- Tokenizers 0.14.1
Model tree for bsharpataz/distilbert
Base model
distilbert/distilbert-base-cased