---
license: apache-2.0
base_model: distilbert/distilbert-base-uncased
tags:
- generated_from_trainer
model-index:
- name: FAQs_DistillBERT
  results: []
---

# FAQs_DistillBERT

This model is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1075

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 194  | 0.1131          |
| No log        | 2.0   | 388  | 0.1129          |
| 0.0986        | 3.0   | 582  | 0.1211          |
| 0.0986        | 4.0   | 776  | 0.1023          |
| 0.0986        | 5.0   | 970  | 0.1023          |
| 0.0589        | 6.0   | 1164 | 0.1051          |
| 0.0589        | 7.0   | 1358 | 0.1044          |
| 0.0327        | 8.0   | 1552 | 0.1100          |
| 0.0327        | 9.0   | 1746 | 0.0981          |
| 0.0327        | 10.0  | 1940 | 0.1029          |
| 0.0209        | 11.0  | 2134 | 0.0960          |
| 0.0209        | 12.0  | 2328 | 0.0969          |
| 0.0126        | 13.0  | 2522 | 0.1022          |
| 0.0126        | 14.0  | 2716 | 0.0985          |
| 0.0126        | 15.0  | 2910 | 0.1139          |
| 0.0095        | 16.0  | 3104 | 0.1139          |
| 0.0095        | 17.0  | 3298 | 0.1025          |
| 0.0095        | 18.0  | 3492 | 0.0986          |
| 0.0062        | 19.0  | 3686 | 0.0962          |
| 0.0062        | 20.0  | 3880 | 0.0949          |
| 0.0029        | 21.0  | 4074 | 0.1059          |
| 0.0029        | 22.0  | 4268 | 0.1098          |
| 0.0029        | 23.0  | 4462 | 0.1088          |
| 0.0035        | 24.0  | 4656 | 0.1047          |
| 0.0035        | 25.0  | 4850 | 0.1138          |
| 0.003         | 26.0  | 5044 | 0.1101          |
| 0.003         | 27.0  | 5238 | 0.1064          |
| 0.003         | 28.0  | 5432 | 0.0978          |
| 0.0028        | 29.0  | 5626 | 0.1100          |
| 0.0028        | 30.0  | 5820 | 0.1007          |
| 0.0027        | 31.0  | 6014 | 0.1038          |
| 0.0027        | 32.0  | 6208 | 0.0982          |
| 0.0027        | 33.0  | 6402 | 0.1057          |
| 0.0044        | 34.0  | 6596 | 0.1100          |
| 0.0044        | 35.0  | 6790 | 0.0932          |
| 0.0044        | 36.0  | 6984 | 0.0961          |
| 0.0027        | 37.0  | 7178 | 0.0973          |
| 0.0027        | 38.0  | 7372 | 0.0968          |
| 0.0018        | 39.0  | 7566 | 0.0942          |
| 0.0018        | 40.0  | 7760 | 0.1006          |
| 0.0018        | 41.0  | 7954 | 0.1028          |
| 0.0025        | 42.0  | 8148 | 0.1023          |
| 0.0025        | 43.0  | 8342 | 0.1006          |
| 0.0018        | 44.0  | 8536 | 0.1051          |
| 0.0018        | 45.0  | 8730 | 0.1056          |
| 0.0018        | 46.0  | 8924 | 0.1072          |
| 0.0019        | 47.0  | 9118 | 0.1075          |
| 0.0019        | 48.0  | 9312 | 0.1074          |
| 0.0017        | 49.0  | 9506 | 0.1074          |
| 0.0017        | 50.0  | 9700 | 0.1075          |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
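The hyperparameters above imply a linear learning-rate schedule over 50 epochs of 194 steps each (9,700 steps total, matching the final step in the results table). As a rough sketch of what that schedule looks like, assuming zero warmup steps (the card does not state a warmup value; 0 is the `Trainer` default):

```python
def linear_lr(step, total_steps=9700, base_lr=2e-05):
    """Linear decay from base_lr at step 0 to 0 at the final step.

    Sketch only: assumes no warmup phase, which this card does not
    specify explicitly.
    """
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

# Full rate at the start, half the rate at the midpoint, zero at the end.
print(linear_lr(0))     # 2e-05
print(linear_lr(4850))  # 1e-05
print(linear_lr(9700))  # 0.0
```

In the Transformers library this behavior corresponds to `lr_scheduler_type: linear`, which anneals the optimizer's learning rate once per optimization step rather than per epoch.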