distilbert-base-uncased-finetuned-legal_data

This model is a fine-tuned version of distilbert-base-uncased. The training dataset was not recorded (the auto-generated card reports its name as None). It achieves the following result on the evaluation set after the final epoch:

  • Loss: 6.9101

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
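
As a rough illustration of what these settings imply, the sketch below collects them under the keyword names used by transformers.TrainingArguments and computes the learning rate under a linear schedule. It is a minimal sketch, not the original training script. It assumes a single device, no gradient accumulation, and zero warmup steps (none of which the card states), and it takes the 26 steps per epoch from the results table.

```python
# Hyperparameters as listed in this card. The keyword names follow
# transformers.TrainingArguments; mapping train_batch_size to a per-device
# value assumes a single device and no gradient accumulation.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
}

STEPS_PER_EPOCH = 26  # read off the results table (step 26 at epoch 1.0)
TOTAL_STEPS = STEPS_PER_EPOCH * training_kwargs["num_train_epochs"]  # 2600


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` optimizer steps under linear warmup and decay.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the start, quarter, midpoint, and end.
    for step in (0, 650, 1300, 2600):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 2e-05, reaches half that value at step 1300, and decays to zero at step 2600.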

Training results

Training Loss | Epoch | Step | Validation Loss
No log | 1.0 | 26 | 5.3529
No log | 2.0 | 52 | 5.4226
No log | 3.0 | 78 | 5.2550
No log | 4.0 | 104 | 5.1011
No log | 5.0 | 130 | 5.1857
No log | 6.0 | 156 | 5.5119
No log | 7.0 | 182 | 5.4480
No log | 8.0 | 208 | 5.6993
No log | 9.0 | 234 | 5.9614
No log | 10.0 | 260 | 5.6987
No log | 11.0 | 286 | 5.6679
No log | 12.0 | 312 | 5.9850
No log | 13.0 | 338 | 5.6065
No log | 14.0 | 364 | 5.3162
No log | 15.0 | 390 | 5.7856
No log | 16.0 | 416 | 5.5786
No log | 17.0 | 442 | 5.6028
No log | 18.0 | 468 | 5.7649
No log | 19.0 | 494 | 5.5382
1.8345 | 20.0 | 520 | 6.3654
1.8345 | 21.0 | 546 | 5.3575
1.8345 | 22.0 | 572 | 5.3808
1.8345 | 23.0 | 598 | 5.9340
1.8345 | 24.0 | 624 | 6.1475
1.8345 | 25.0 | 650 | 6.2188
1.8345 | 26.0 | 676 | 5.7651
1.8345 | 27.0 | 702 | 6.2629
1.8345 | 28.0 | 728 | 6.1356
1.8345 | 29.0 | 754 | 5.9255
1.8345 | 30.0 | 780 | 6.4252
1.8345 | 31.0 | 806 | 5.6967
1.8345 | 32.0 | 832 | 6.4324
1.8345 | 33.0 | 858 | 6.5087
1.8345 | 34.0 | 884 | 6.1113
1.8345 | 35.0 | 910 | 6.7443
1.8345 | 36.0 | 936 | 6.6970
1.8345 | 37.0 | 962 | 6.5578
1.8345 | 38.0 | 988 | 6.1963
0.2251 | 39.0 | 1014 | 6.4893
0.2251 | 40.0 | 1040 | 6.6347
0.2251 | 41.0 | 1066 | 6.7106
0.2251 | 42.0 | 1092 | 6.8129
0.2251 | 43.0 | 1118 | 6.6386
0.2251 | 44.0 | 1144 | 6.4134
0.2251 | 45.0 | 1170 | 6.6883
0.2251 | 46.0 | 1196 | 6.6406
0.2251 | 47.0 | 1222 | 6.3065
0.2251 | 48.0 | 1248 | 7.0281
0.2251 | 49.0 | 1274 | 7.3646
0.2251 | 50.0 | 1300 | 7.1086
0.2251 | 51.0 | 1326 | 6.4749
0.2251 | 52.0 | 1352 | 6.3303
0.2251 | 53.0 | 1378 | 6.2919
0.2251 | 54.0 | 1404 | 6.3855
0.2251 | 55.0 | 1430 | 6.9501
0.2251 | 56.0 | 1456 | 6.8714
0.2251 | 57.0 | 1482 | 6.9856
0.0891 | 58.0 | 1508 | 6.9910
0.0891 | 59.0 | 1534 | 6.9293
0.0891 | 60.0 | 1560 | 7.3493
0.0891 | 61.0 | 1586 | 7.1834
0.0891 | 62.0 | 1612 | 7.0479
0.0891 | 63.0 | 1638 | 6.7674
0.0891 | 64.0 | 1664 | 6.7553
0.0891 | 65.0 | 1690 | 7.3074
0.0891 | 66.0 | 1716 | 6.8071
0.0891 | 67.0 | 1742 | 7.6622
0.0891 | 68.0 | 1768 | 6.9555
0.0891 | 69.0 | 1794 | 7.0153
0.0891 | 70.0 | 1820 | 7.2085
0.0891 | 71.0 | 1846 | 6.7582
0.0891 | 72.0 | 1872 | 6.7989
0.0891 | 73.0 | 1898 | 6.7012
0.0891 | 74.0 | 1924 | 7.0088
0.0891 | 75.0 | 1950 | 7.1024
0.0891 | 76.0 | 1976 | 6.6968
0.058 | 77.0 | 2002 | 7.5249
0.058 | 78.0 | 2028 | 6.9199
0.058 | 79.0 | 2054 | 7.1995
0.058 | 80.0 | 2080 | 6.9349
0.058 | 81.0 | 2106 | 7.4025
0.058 | 82.0 | 2132 | 7.4199
0.058 | 83.0 | 2158 | 6.8081
0.058 | 84.0 | 2184 | 7.4777
0.058 | 85.0 | 2210 | 7.1990
0.058 | 86.0 | 2236 | 7.0062
0.058 | 87.0 | 2262 | 7.5724
0.058 | 88.0 | 2288 | 6.9362
0.058 | 89.0 | 2314 | 7.1368
0.058 | 90.0 | 2340 | 7.2183
0.058 | 91.0 | 2366 | 6.8684
0.058 | 92.0 | 2392 | 7.1433
0.058 | 93.0 | 2418 | 7.2161
0.058 | 94.0 | 2444 | 7.1442
0.058 | 95.0 | 2470 | 7.3098
0.058 | 96.0 | 2496 | 7.1264
0.0512 | 97.0 | 2522 | 6.9424
0.0512 | 98.0 | 2548 | 6.9155
0.0512 | 99.0 | 2574 | 6.9038
0.0512 | 100.0 | 2600 | 6.9101
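
The log above can be read programmatically. The lowest validation loss in the table is 5.1011 at epoch 4, while the final value is 6.9101 and the logged training loss falls to 0.0512, a pattern usually associated with overfitting. The sketch below parses a handful of rows copied from the table (not the full log) and picks the best of them. It also checks the hypothesis that "No log" entries arise because the training loss is logged only every 500 steps. That interval is an assumption based on the Trainer default; the card does not state it.

```python
import math


def parse_row(line):
    """Split one results row into (train_loss, epoch, step, val_loss).

    The last three whitespace-separated fields are always numeric; whatever
    precedes them is the training loss, or "No log" when none was recorded.
    """
    *head, epoch, step, val_loss = line.split()
    train_text = " ".join(head)
    train_loss = None if train_text == "No log" else float(train_text)
    return train_loss, float(epoch), int(step), float(val_loss)


# A handful of rows copied from the table (not the full log).
sample_rows = [
    "No log 1.0 26 5.3529",
    "No log 4.0 104 5.1011",
    "1.8345 20.0 520 6.3654",
    "0.2251 39.0 1014 6.4893",
    "0.0891 67.0 1742 7.6622",
    "0.0512 100.0 2600 6.9101",
]
parsed = [parse_row(row) for row in sample_rows]

# Row with the lowest validation loss among the sampled rows.
best = min(parsed, key=lambda row: row[3])

STEPS_PER_EPOCH = 26   # step count at epoch 1.0 in the table
LOGGING_STEPS = 500    # assumption: default Trainer logging interval


def first_epoch_showing(log_step):
    """Epoch whose end-of-epoch evaluation is the first at or after log_step."""
    return math.ceil(log_step / STEPS_PER_EPOCH)


# Epochs at which a newly logged training loss should first appear.
first_epochs = [first_epoch_showing(k * LOGGING_STEPS) for k in range(1, 6)]

if __name__ == "__main__":
    print("best sampled row:", best)
    print("training loss value changes at epochs:", first_epochs)
```

The computed epochs (20, 39, 58, 77, 97) match the rows where the Training Loss column changes value, which is consistent with the 500-step logging assumption.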

Framework versions

  • Transformers 4.11.3
  • Pytorch 1.9.0+cu102
  • Datasets 1.12.1
  • Tokenizers 0.10.3
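
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library. The distribution names (in particular "torch" for Pytorch) are assumptions, since the card gives display names only, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in this card. The distribution names are assumptions
# (notably "torch" for Pytorch); the card lists only display names.
CARD_VERSIONS = {
    "transformers": "4.11.3",
    "torch": "1.9.0+cu102",
    "datasets": "1.12.1",
    "tokenizers": "0.10.3",
}


def installed_versions(names):
    """Return {name: installed version, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every package that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    report = version_mismatches(CARD_VERSIONS, installed_versions(CARD_VERSIONS))
    for name, (want, have) in report.items():
        print(f"{name}: card lists {want}, found {have or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one recorded here.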