---
license: apache-2.0
base_model: distilbert-base-multilingual-cased
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: distilbert-base-multilingual-cased-pii
  results: []
datasets:
- ai4privacy/pii-masking-300k
pipeline_tag: token-classification

widget:
- text: "My name is Yoni Go and I live in Israel. My phone number is 054-1234567"

inference:
  parameters:
    aggregation_strategy: "first"
---


Usage:
```python
from transformers import pipeline

pipe = pipeline("token-classification", model="yonigo/distilbert-base-multilingual-cased-pii", aggregation_strategy="first")
pipe("My name is Yoni Go and I live in Israel. My phone number is 054-1234567")
```
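
A common follow-up is masking the detected spans. The sketch below is not part of the original training code; it assumes the pipeline returns its usual aggregated output (a list of dicts with `entity_group`, `start`, and `end` keys), and `redact` is a hypothetical helper name introduced here for illustration.

```python
def redact(text, entities):
    """Replace each detected span with a bracketed label, e.g. "[TEL]".

    `entities` is assumed to be a list of dicts with `entity_group`,
    `start`, and `end` keys, as produced by an aggregated
    token-classification pipeline.
    """
    redacted = text
    # Work from the end of the string so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        redacted = redacted[: ent["start"]] + placeholder + redacted[ent["end"]:]
    return redacted


# Hypothetical usage with the `pipe` object defined above (downloads the model):
# text = "My name is Yoni Go and I live in Israel. My phone number is 054-1234567"
# print(redact(text, pipe(text)))
```

Replacing spans from right to left avoids recomputing offsets after each substitution. Overlapping spans are not handled here.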

Training code is available on [GitHub](https://github.com/yonigottesman/pii-model).

# distilbert-base-multilingual-cased-pii

This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on the [ai4privacy/pii-masking-300k](https://huggingface.co/datasets/ai4privacy/pii-masking-300k) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0470
- Bod F1: 0.9642
- Building F1: 0.9789
- Cardissuer F1: 0.9697
- City F1: 0.9566
- Country F1: 0.9737
- Date F1: 0.9264
- Driverlicense F1: 0.9633
- Email F1: 0.9833
- Geocoord F1: 0.9654
- Givenname1 F1: 0.8653
- Givenname2 F1: 0.8170
- Idcard F1: 0.9390
- Ip F1: 0.9842
- Lastname1 F1: 0.8495
- Lastname2 F1: 0.7609
- Lastname3 F1: 0.7281
- Pass F1: 0.9247
- Passport F1: 0.9540
- Postcode F1: 0.9808
- Secaddress F1: 0.9732
- Sex F1: 0.9700
- Socialnumber F1: 0.9689
- State F1: 0.9761
- Street F1: 0.9609
- Tel F1: 0.9777
- Time F1: 0.9701
- Title F1: 0.9572
- Username F1: 0.9594
- Precision: 0.9428
- Recall: 0.9582
- F1: 0.9504
- Accuracy: 0.9909
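
As a quick sanity check on the overall numbers, the reported F1 is consistent with the harmonic mean of the reported precision and recall. This is a sketch that assumes the standard F1 definition; how the Trainer aggregated the per-entity scores is not stated in this card.

```python
# Overall precision and recall as reported above.
precision = 0.9428
recall = 0.9582

# Standard F1: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9504
```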



### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Bod F1 | Building F1 | Cardissuer F1 | City F1 | Country F1 | Date F1 | Driverlicense F1 | Email F1 | Geocoord F1 | Givenname1 F1 | Givenname2 F1 | Idcard F1 | Ip F1  | Lastname1 F1 | Lastname2 F1 | Lastname3 F1 | Pass F1 | Passport F1 | Postcode F1 | Secaddress F1 | Sex F1 | Socialnumber F1 | State F1 | Street F1 | Tel F1 | Time F1 | Title F1 | Username F1 | Precision | Recall | F1     | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:-----------:|:-------------:|:-------:|:----------:|:-------:|:----------------:|:--------:|:-----------:|:-------------:|:-------------:|:---------:|:------:|:------------:|:------------:|:------------:|:-------:|:-----------:|:-----------:|:-------------:|:------:|:---------------:|:--------:|:---------:|:------:|:-------:|:--------:|:-----------:|:---------:|:------:|:------:|:--------:|
| 0.2604        | 0.3601  | 1000  | 0.1439          | 0.8486 | 0.8928      | 0.0           | 0.6347  | 0.7409     | 0.6650  | 0.4865           | 0.9454   | 0.8685      | 0.4884        | 0.0           | 0.4298    | 0.9051 | 0.4869       | 0.0          | 0.0          | 0.6948  | 0.5073      | 0.7842      | 0.4352        | 0.6765 | 0.7223          | 0.7680   | 0.6802    | 0.8438 | 0.9211  | 0.5403   | 0.8180      | 0.6715    | 0.7248 | 0.6971 | 0.9663   |
| 0.0866        | 0.7202  | 2000  | 0.0707          | 0.9385 | 0.9611      | 0.0           | 0.9027  | 0.9564     | 0.8655  | 0.8200           | 0.9750   | 0.9546      | 0.7057        | 0.2081        | 0.8231    | 0.9689 | 0.6300       | 0.1133       | 0.0          | 0.8483  | 0.8467      | 0.9453      | 0.9564        | 0.9319 | 0.8831          | 0.9450   | 0.9101    | 0.9487 | 0.9529  | 0.8716   | 0.9285      | 0.8700    | 0.8839 | 0.8769 | 0.9839   |
| 0.0659        | 1.0803  | 3000  | 0.0554          | 0.9507 | 0.9705      | 0.0           | 0.9241  | 0.9644     | 0.8952  | 0.8736           | 0.9792   | 0.9280      | 0.8046        | 0.6345        | 0.8698    | 0.9748 | 0.7571       | 0.5305       | 0.0          | 0.8533  | 0.8883      | 0.9659      | 0.9678        | 0.9571 | 0.9209          | 0.9615   | 0.9303    | 0.9617 | 0.9630  | 0.9145   | 0.9455      | 0.9014    | 0.9216 | 0.9114 | 0.9868   |
| 0.0523        | 1.4404  | 4000  | 0.0484          | 0.9553 | 0.9766      | 0.0           | 0.9358  | 0.9677     | 0.9017  | 0.8924           | 0.9758   | 0.9645      | 0.8305        | 0.7005        | 0.8966    | 0.9765 | 0.7978       | 0.5920       | 0.0          | 0.8963  | 0.9195      | 0.9741      | 0.9688        | 0.9644 | 0.9266          | 0.9696   | 0.9421    | 0.9706 | 0.9656  | 0.9301   | 0.9520      | 0.9183    | 0.9325 | 0.9253 | 0.9884   |
| 0.0465        | 1.8005  | 5000  | 0.0467          | 0.9576 | 0.9759      | 0.0           | 0.9400  | 0.9701     | 0.9138  | 0.9209           | 0.9837   | 0.9568      | 0.8423        | 0.7384        | 0.9088    | 0.9835 | 0.8042       | 0.6235       | 0.2139       | 0.8985  | 0.9308      | 0.9711      | 0.9673        | 0.9649 | 0.9450          | 0.9714   | 0.9471    | 0.9708 | 0.9672  | 0.9447   | 0.9532      | 0.9206    | 0.9445 | 0.9324 | 0.9890   |
| 0.0401        | 2.1606  | 6000  | 0.0441          | 0.9629 | 0.9755      | 0.0           | 0.9486  | 0.9700     | 0.9154  | 0.9288           | 0.9809   | 0.9619      | 0.8485        | 0.7652        | 0.9180    | 0.9826 | 0.8231       | 0.6677       | 0.4724       | 0.8883  | 0.9343      | 0.9777      | 0.9734        | 0.9685 | 0.9490          | 0.9733   | 0.9529    | 0.9743 | 0.9672  | 0.9482   | 0.9555      | 0.9300    | 0.9454 | 0.9377 | 0.9895   |
| 0.0401        | 2.5207  | 7000  | 0.0428          | 0.9619 | 0.9769      | 0.0           | 0.9492  | 0.9709     | 0.9206  | 0.9401           | 0.9795   | 0.9615      | 0.8550        | 0.7776        | 0.9274    | 0.9827 | 0.8267       | 0.6742       | 0.5845       | 0.9085  | 0.9427      | 0.9798      | 0.9755        | 0.9690 | 0.9515          | 0.9736   | 0.9557    | 0.9764 | 0.9700  | 0.9479   | 0.9580      | 0.9340    | 0.9491 | 0.9415 | 0.9900   |
| 0.0394        | 2.8808  | 8000  | 0.0420          | 0.9616 | 0.9770      | 0.0           | 0.9481  | 0.9730     | 0.9185  | 0.9451           | 0.9832   | 0.9569      | 0.8526        | 0.7895        | 0.9269    | 0.9852 | 0.8312       | 0.7121       | 0.6234       | 0.9168  | 0.9441      | 0.9778      | 0.9737        | 0.9700 | 0.9514          | 0.9738   | 0.9565    | 0.9751 | 0.9674  | 0.9512   | 0.9562      | 0.9324    | 0.9535 | 0.9429 | 0.9901   |
| 0.0323        | 3.2409  | 9000  | 0.0422          | 0.9575 | 0.9781      | 0.0           | 0.9521  | 0.9725     | 0.9215  | 0.9445           | 0.9787   | 0.9601      | 0.8459        | 0.7863        | 0.9238    | 0.9834 | 0.8189       | 0.7040       | 0.6460       | 0.9117  | 0.9393      | 0.9792      | 0.9748        | 0.9679 | 0.9575          | 0.9746   | 0.9569    | 0.9732 | 0.9688  | 0.9509   | 0.9557      | 0.9336    | 0.9500 | 0.9418 | 0.9899   |
| 0.0313        | 3.6010  | 10000 | 0.0412          | 0.9630 | 0.9784      | 0.0           | 0.9551  | 0.9741     | 0.9235  | 0.9460           | 0.9826   | 0.9646      | 0.8619        | 0.7991        | 0.9277    | 0.9829 | 0.8386       | 0.7306       | 0.6767       | 0.9199  | 0.9454      | 0.9810      | 0.9746        | 0.9692 | 0.9598          | 0.9746   | 0.9589    | 0.9731 | 0.9685  | 0.9547   | 0.9583      | 0.9390    | 0.9527 | 0.9458 | 0.9904   |
| 0.0304        | 3.9611  | 11000 | 0.0404          | 0.9587 | 0.9792      | 0.1333        | 0.9511  | 0.9725     | 0.9219  | 0.9538           | 0.9769   | 0.9578      | 0.8589        | 0.8061        | 0.9255    | 0.9845 | 0.8402       | 0.7395       | 0.6790       | 0.9136  | 0.9479      | 0.9801      | 0.9748        | 0.9698 | 0.9628          | 0.9752   | 0.9581    | 0.9775 | 0.9695  | 0.9501   | 0.9597      | 0.9373    | 0.9540 | 0.9456 | 0.9904   |
| 0.0264        | 4.3212  | 12000 | 0.0416          | 0.9599 | 0.9794      | 0.5           | 0.9547  | 0.9735     | 0.9271  | 0.9557           | 0.9809   | 0.9537      | 0.8510        | 0.8016        | 0.9316    | 0.9816 | 0.8358       | 0.7412       | 0.6877       | 0.9212  | 0.9476      | 0.9779      | 0.9729        | 0.9682 | 0.9611          | 0.9748   | 0.9593    | 0.9742 | 0.9697  | 0.9551   | 0.9590      | 0.9370    | 0.9550 | 0.9459 | 0.9904   |
| 0.0266        | 4.6813  | 13000 | 0.0412          | 0.9629 | 0.9800      | 0.5           | 0.9511  | 0.9697     | 0.9276  | 0.9564           | 0.9826   | 0.9578      | 0.8590        | 0.8078        | 0.9303    | 0.9830 | 0.8423       | 0.7470       | 0.6945       | 0.9162  | 0.9468      | 0.9789      | 0.9713        | 0.9692 | 0.9597          | 0.9748   | 0.9584    | 0.9759 | 0.9698  | 0.9555   | 0.9575      | 0.9355    | 0.9579 | 0.9466 | 0.9905   |
| 0.0236        | 5.0414  | 14000 | 0.0414          | 0.9614 | 0.9786      | 0.6061        | 0.9562  | 0.9736     | 0.9223  | 0.9595           | 0.9821   | 0.9537      | 0.8673        | 0.8108        | 0.9367    | 0.9811 | 0.8422       | 0.7523       | 0.7140       | 0.9190  | 0.9503      | 0.9807      | 0.9679        | 0.9689 | 0.9676          | 0.9750   | 0.9611    | 0.9758 | 0.9699  | 0.9556   | 0.9589      | 0.9426    | 0.9543 | 0.9484 | 0.9907   |
| 0.0221        | 5.4015  | 15000 | 0.0420          | 0.9597 | 0.9797      | 0.6667        | 0.9554  | 0.9734     | 0.9210  | 0.9587           | 0.9832   | 0.9667      | 0.8637        | 0.8121        | 0.9367    | 0.9852 | 0.8449       | 0.7509       | 0.7145       | 0.9178  | 0.9498      | 0.9808      | 0.9746        | 0.9707 | 0.9650          | 0.9746   | 0.9604    | 0.9749 | 0.9692  | 0.9556   | 0.9591      | 0.9405    | 0.9563 | 0.9484 | 0.9906   |
| 0.021         | 5.7616  | 16000 | 0.0421          | 0.9613 | 0.9794      | 0.6667        | 0.9532  | 0.9736     | 0.9287  | 0.9554           | 0.9792   | 0.9599      | 0.8624        | 0.8146        | 0.9334    | 0.9790 | 0.8445       | 0.7534       | 0.7154       | 0.9181  | 0.9487      | 0.9791      | 0.9721        | 0.9691 | 0.9646          | 0.9748   | 0.9534    | 0.9757 | 0.9693  | 0.9561   | 0.9586      | 0.9403    | 0.9545 | 0.9473 | 0.9905   |
| 0.0174        | 6.1217  | 17000 | 0.0433          | 0.9617 | 0.9788      | 0.7879        | 0.9545  | 0.9738     | 0.9241  | 0.9598           | 0.9829   | 0.9589      | 0.8570        | 0.8131        | 0.9369    | 0.9838 | 0.8449       | 0.7581       | 0.7242       | 0.9230  | 0.9488      | 0.9798      | 0.9690        | 0.9691 | 0.9652          | 0.9759   | 0.9563    | 0.9769 | 0.9700  | 0.9556   | 0.9581      | 0.9403    | 0.9563 | 0.9482 | 0.9907   |
| 0.017         | 6.4818  | 18000 | 0.0442          | 0.9623 | 0.9790      | 0.9697        | 0.9566  | 0.9744     | 0.9258  | 0.9608           | 0.9833   | 0.9574      | 0.8565        | 0.8130        | 0.9350    | 0.9845 | 0.8450       | 0.7552       | 0.7329       | 0.9216  | 0.9519      | 0.9800      | 0.9723        | 0.9703 | 0.9675          | 0.9762   | 0.9605    | 0.9775 | 0.9713  | 0.9545   | 0.9582      | 0.9398    | 0.9582 | 0.9489 | 0.9907   |
| 0.017         | 6.8419  | 19000 | 0.0431          | 0.9639 | 0.9778      | 0.9697        | 0.9562  | 0.9738     | 0.9286  | 0.9612           | 0.9842   | 0.9607      | 0.8641        | 0.8160        | 0.9363    | 0.9828 | 0.8481       | 0.7610       | 0.7292       | 0.9198  | 0.9531      | 0.9800      | 0.9757        | 0.9699 | 0.9657          | 0.9751   | 0.9600    | 0.9767 | 0.9705  | 0.9565   | 0.9587      | 0.9414    | 0.9577 | 0.9495 | 0.9909   |
| 0.015         | 7.2020  | 20000 | 0.0438          | 0.9645 | 0.9795      | 0.9091        | 0.9550  | 0.9734     | 0.9295  | 0.9605           | 0.9824   | 0.9605      | 0.8594        | 0.8120        | 0.9382    | 0.9837 | 0.8452       | 0.7571       | 0.7222       | 0.9220  | 0.9540      | 0.9810      | 0.9745        | 0.9700 | 0.9672          | 0.9758   | 0.9599    | 0.9783 | 0.9702  | 0.9551   | 0.9596      | 0.9414    | 0.9576 | 0.9494 | 0.9908   |
| 0.0152        | 7.5621  | 21000 | 0.0451          | 0.9644 | 0.9795      | 0.9697        | 0.9570  | 0.9741     | 0.9271  | 0.9616           | 0.9826   | 0.9597      | 0.8649        | 0.8121        | 0.9374    | 0.9848 | 0.8469       | 0.7612       | 0.7261       | 0.9231  | 0.9530      | 0.9809      | 0.9747        | 0.9704 | 0.9661          | 0.9756   | 0.9618    | 0.9769 | 0.9706  | 0.9570   | 0.9601      | 0.9427    | 0.9573 | 0.9499 | 0.9908   |
| 0.0137        | 7.9222  | 22000 | 0.0450          | 0.9628 | 0.9780      | 0.9697        | 0.9565  | 0.9742     | 0.9289  | 0.9627           | 0.9832   | 0.9613      | 0.8643        | 0.8169        | 0.9374    | 0.9840 | 0.8497       | 0.7632       | 0.7292       | 0.9234  | 0.9514      | 0.9807      | 0.9737        | 0.9695 | 0.9674          | 0.9758   | 0.9610    | 0.9778 | 0.9701  | 0.9572   | 0.9596      | 0.9420    | 0.9582 | 0.9501 | 0.9908   |
| 0.0122        | 8.2823  | 23000 | 0.0463          | 0.9646 | 0.9789      | 0.9697        | 0.9560  | 0.9738     | 0.9276  | 0.9628           | 0.9835   | 0.9602      | 0.8643        | 0.8176        | 0.9386    | 0.9838 | 0.8494       | 0.7638       | 0.7275       | 0.9233  | 0.9519      | 0.9806      | 0.9739        | 0.9696 | 0.9682          | 0.9762   | 0.9604    | 0.9769 | 0.9698  | 0.9577   | 0.9592      | 0.9426    | 0.9578 | 0.9502 | 0.9908   |
| 0.0123        | 8.6424  | 24000 | 0.0459          | 0.9626 | 0.9782      | 0.9697        | 0.9566  | 0.9743     | 0.9276  | 0.9628           | 0.9839   | 0.9613      | 0.8670        | 0.8163        | 0.9394    | 0.9850 | 0.8487       | 0.7635       | 0.7357       | 0.9241  | 0.9539      | 0.9810      | 0.9737        | 0.9701 | 0.9680          | 0.9757   | 0.9617    | 0.9780 | 0.9702  | 0.9574   | 0.9601      | 0.9436    | 0.9578 | 0.9506 | 0.9909   |
| 0.0133        | 9.0025  | 25000 | 0.0462          | 0.9636 | 0.9788      | 0.9697        | 0.9563  | 0.9731     | 0.9273  | 0.9631           | 0.9835   | 0.9625      | 0.8672        | 0.8157        | 0.9393    | 0.9837 | 0.8495       | 0.7609       | 0.7289       | 0.9236  | 0.9541      | 0.9814      | 0.9737        | 0.9698 | 0.9684          | 0.9761   | 0.9618    | 0.9776 | 0.9698  | 0.9570   | 0.9591      | 0.9435    | 0.9574 | 0.9504 | 0.9909   |
| 0.0112        | 9.3626  | 26000 | 0.0467          | 0.9624 | 0.9789      | 0.9697        | 0.9567  | 0.9740     | 0.9243  | 0.9635           | 0.9832   | 0.9654      | 0.8643        | 0.8170        | 0.9375    | 0.9844 | 0.8489       | 0.7603       | 0.7303       | 0.9248  | 0.9534      | 0.9812      | 0.9735        | 0.9701 | 0.9685          | 0.9762   | 0.9617    | 0.9784 | 0.9698  | 0.9563   | 0.9594      | 0.9428    | 0.9576 | 0.9501 | 0.9909   |
| 0.0116        | 9.7227  | 27000 | 0.0464          | 0.9628 | 0.9789      | 0.9697        | 0.9562  | 0.9741     | 0.9260  | 0.9633           | 0.9826   | 0.9643      | 0.8637        | 0.8138        | 0.9379    | 0.9843 | 0.8492       | 0.7610       | 0.7278       | 0.9245  | 0.9536      | 0.9808      | 0.9725        | 0.9702 | 0.9686          | 0.9761   | 0.9613    | 0.9778 | 0.9698  | 0.9564   | 0.9591      | 0.9419    | 0.9583 | 0.9500 | 0.9908   |
| 0.011         | 10.0828 | 28000 | 0.0470          | 0.9637 | 0.9790      | 0.9697        | 0.9561  | 0.9736     | 0.9266  | 0.9632           | 0.9831   | 0.9646      | 0.8656        | 0.8160        | 0.9384    | 0.9843 | 0.8494       | 0.7597       | 0.7281       | 0.9239  | 0.9537      | 0.9805      | 0.9731        | 0.9701 | 0.9685          | 0.9759   | 0.9611    | 0.9778 | 0.9698  | 0.9573   | 0.9591      | 0.9423    | 0.9583 | 0.9502 | 0.9909   |
| 0.011         | 10.4429 | 29000 | 0.0469          | 0.9642 | 0.9790      | 0.9697        | 0.9567  | 0.9738     | 0.9267  | 0.9632           | 0.9834   | 0.9654      | 0.8653        | 0.8172        | 0.9393    | 0.9842 | 0.8495       | 0.7609       | 0.7287       | 0.9247  | 0.9544      | 0.9809      | 0.9732        | 0.9699 | 0.9687          | 0.9762   | 0.9614    | 0.9777 | 0.9699  | 0.9574   | 0.9596      | 0.9430    | 0.9581 | 0.9505 | 0.9909   |
| 0.0106        | 10.8030 | 30000 | 0.0470          | 0.9642 | 0.9789      | 0.9697        | 0.9566  | 0.9737     | 0.9264  | 0.9633           | 0.9833   | 0.9654      | 0.8653        | 0.8170        | 0.9390    | 0.9842 | 0.8495       | 0.7609       | 0.7281       | 0.9247  | 0.9540      | 0.9808      | 0.9732        | 0.9700 | 0.9689          | 0.9761   | 0.9609    | 0.9777 | 0.9701  | 0.9572   | 0.9594      | 0.9428    | 0.9582 | 0.9504 | 0.9909   |


### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1