metadata
license: apache-2.0
base_model: distilbert-base-multilingual-cased
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
  - f1
  - accuracy
model-index:
  - name: distilbert-base-multilingual-cased-pii
    results: []
datasets:
  - ai4privacy/pii-masking-300k
pipeline_tag: token-classification
widget:
  - text: My name is Yoni Go and I live in Israel. My phone number is 054-1234567
inference:
  parameters:
    aggregation_strategy: first

Usage:

from transformers import pipeline

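# aggregation_strategy="first" groups sub-word tokens into words and labels each word with the tag of its first token
pipe = pipeline("token-classification", model="yonigo/distilbert-base-multilingual-cased-pii", aggregation_strategy="first")
result = pipe("My name is Yoni Go and I live in Israel. My phone number is 054-1234567")
print(result)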
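
With an aggregation strategy set, the pipeline returns a list of dictionaries that include `entity_group`, `start`, and `end` keys. One common use of that output is to mask the detected spans in the original text. The following is a minimal sketch of that step, not part of the model's own code: `mask_entities` and `span` are hypothetical helpers, and the spans and label names below are written by hand for illustration rather than taken from actual model output.

```python
def mask_entities(text, entities):
    """Replace each detected span with a [LABEL] placeholder."""
    masked = text
    # Work right to left so earlier character offsets stay valid
    # after each replacement changes the string length.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        masked = masked[: ent["start"]] + placeholder + masked[ent["end"]:]
    return masked


text = "My name is Yoni Go and I live in Israel. My phone number is 054-1234567"


def span(label, value):
    """Build a pipeline-style entity dict for the first occurrence of value."""
    start = text.index(value)
    return {"entity_group": label, "start": start, "end": start + len(value)}


# Hand-written spans for illustration only (assumed labels, not model output).
entities = [
    span("GIVENNAME1", "Yoni"),
    span("LASTNAME1", "Go"),
    span("COUNTRY", "Israel"),
    span("TEL", "054-1234567"),
]

print(mask_entities(text, entities))
# My name is [GIVENNAME1] [LASTNAME1] and I live in [COUNTRY]. My phone number is [TEL]
```

In practice, the list returned by `pipe(text)` could be passed as `entities` in place of the hand-written spans.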

training code git

distilbert-base-multilingual-cased-pii

This model is a fine-tuned version of distilbert-base-multilingual-cased on ai4privacy/pii-masking-300k. It achieves the following results on the evaluation set:

  • Loss: 0.0470
  • Bod F1: 0.9642
  • Building F1: 0.9789
  • Cardissuer F1: 0.9697
  • City F1: 0.9566
  • Country F1: 0.9737
  • Date F1: 0.9264
  • Driverlicense F1: 0.9633
  • Email F1: 0.9833
  • Geocoord F1: 0.9654
  • Givenname1 F1: 0.8653
  • Givenname2 F1: 0.8170
  • Idcard F1: 0.9390
  • Ip F1: 0.9842
  • Lastname1 F1: 0.8495
  • Lastname2 F1: 0.7609
  • Lastname3 F1: 0.7281
  • Pass F1: 0.9247
  • Passport F1: 0.9540
  • Postcode F1: 0.9808
  • Secaddress F1: 0.9732
  • Sex F1: 0.9700
  • Socialnumber F1: 0.9689
  • State F1: 0.9761
  • Street F1: 0.9609
  • Tel F1: 0.9777
  • Time F1: 0.9701
  • Title F1: 0.9572
  • Username F1: 0.9594
  • Precision: 0.9428
  • Recall: 0.9582
  • F1: 0.9504
  • Accuracy: 0.9909
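
As a quick sanity check on the overall scores, the reported F1 can be recomputed from the reported precision and recall. This sketch assumes the overall F1 is the harmonic mean of the overall precision and recall, which holds for micro-averaged scores; the card does not state the averaging method, and `harmonic_f1` is a hypothetical helper name.

```python
def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported above.
f1 = harmonic_f1(0.9428, 0.9582)
print(round(f1, 4))  # 0.9504
```

The result agrees with the reported overall F1 of 0.9504.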

Training results

| Training Loss | Epoch | Step | Validation Loss | Bod F1 | Building F1 | Cardissuer F1 | City F1 | Country F1 | Date F1 | Driverlicense F1 | Email F1 | Geocoord F1 | Givenname1 F1 | Givenname2 F1 | Idcard F1 | Ip F1 | Lastname1 F1 | Lastname2 F1 | Lastname3 F1 | Pass F1 | Passport F1 | Postcode F1 | Secaddress F1 | Sex F1 | Socialnumber F1 | State F1 | Street F1 | Tel F1 | Time F1 | Title F1 | Username F1 | Precision | Recall | F1 | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.2604 | 0.3601 | 1000 | 0.1439 | 0.8486 | 0.8928 | 0.0 | 0.6347 | 0.7409 | 0.6650 | 0.4865 | 0.9454 | 0.8685 | 0.4884 | 0.0 | 0.4298 | 0.9051 | 0.4869 | 0.0 | 0.0 | 0.6948 | 0.5073 | 0.7842 | 0.4352 | 0.6765 | 0.7223 | 0.7680 | 0.6802 | 0.8438 | 0.9211 | 0.5403 | 0.8180 | 0.6715 | 0.7248 | 0.6971 | 0.9663 |
| 0.0866 | 0.7202 | 2000 | 0.0707 | 0.9385 | 0.9611 | 0.0 | 0.9027 | 0.9564 | 0.8655 | 0.8200 | 0.9750 | 0.9546 | 0.7057 | 0.2081 | 0.8231 | 0.9689 | 0.6300 | 0.1133 | 0.0 | 0.8483 | 0.8467 | 0.9453 | 0.9564 | 0.9319 | 0.8831 | 0.9450 | 0.9101 | 0.9487 | 0.9529 | 0.8716 | 0.9285 | 0.8700 | 0.8839 | 0.8769 | 0.9839 |
| 0.0659 | 1.0803 | 3000 | 0.0554 | 0.9507 | 0.9705 | 0.0 | 0.9241 | 0.9644 | 0.8952 | 0.8736 | 0.9792 | 0.9280 | 0.8046 | 0.6345 | 0.8698 | 0.9748 | 0.7571 | 0.5305 | 0.0 | 0.8533 | 0.8883 | 0.9659 | 0.9678 | 0.9571 | 0.9209 | 0.9615 | 0.9303 | 0.9617 | 0.9630 | 0.9145 | 0.9455 | 0.9014 | 0.9216 | 0.9114 | 0.9868 |
| 0.0523 | 1.4404 | 4000 | 0.0484 | 0.9553 | 0.9766 | 0.0 | 0.9358 | 0.9677 | 0.9017 | 0.8924 | 0.9758 | 0.9645 | 0.8305 | 0.7005 | 0.8966 | 0.9765 | 0.7978 | 0.5920 | 0.0 | 0.8963 | 0.9195 | 0.9741 | 0.9688 | 0.9644 | 0.9266 | 0.9696 | 0.9421 | 0.9706 | 0.9656 | 0.9301 | 0.9520 | 0.9183 | 0.9325 | 0.9253 | 0.9884 |
| 0.0465 | 1.8005 | 5000 | 0.0467 | 0.9576 | 0.9759 | 0.0 | 0.9400 | 0.9701 | 0.9138 | 0.9209 | 0.9837 | 0.9568 | 0.8423 | 0.7384 | 0.9088 | 0.9835 | 0.8042 | 0.6235 | 0.2139 | 0.8985 | 0.9308 | 0.9711 | 0.9673 | 0.9649 | 0.9450 | 0.9714 | 0.9471 | 0.9708 | 0.9672 | 0.9447 | 0.9532 | 0.9206 | 0.9445 | 0.9324 | 0.9890 |
| 0.0401 | 2.1606 | 6000 | 0.0441 | 0.9629 | 0.9755 | 0.0 | 0.9486 | 0.9700 | 0.9154 | 0.9288 | 0.9809 | 0.9619 | 0.8485 | 0.7652 | 0.9180 | 0.9826 | 0.8231 | 0.6677 | 0.4724 | 0.8883 | 0.9343 | 0.9777 | 0.9734 | 0.9685 | 0.9490 | 0.9733 | 0.9529 | 0.9743 | 0.9672 | 0.9482 | 0.9555 | 0.9300 | 0.9454 | 0.9377 | 0.9895 |
| 0.0401 | 2.5207 | 7000 | 0.0428 | 0.9619 | 0.9769 | 0.0 | 0.9492 | 0.9709 | 0.9206 | 0.9401 | 0.9795 | 0.9615 | 0.8550 | 0.7776 | 0.9274 | 0.9827 | 0.8267 | 0.6742 | 0.5845 | 0.9085 | 0.9427 | 0.9798 | 0.9755 | 0.9690 | 0.9515 | 0.9736 | 0.9557 | 0.9764 | 0.9700 | 0.9479 | 0.9580 | 0.9340 | 0.9491 | 0.9415 | 0.9900 |
| 0.0394 | 2.8808 | 8000 | 0.0420 | 0.9616 | 0.9770 | 0.0 | 0.9481 | 0.9730 | 0.9185 | 0.9451 | 0.9832 | 0.9569 | 0.8526 | 0.7895 | 0.9269 | 0.9852 | 0.8312 | 0.7121 | 0.6234 | 0.9168 | 0.9441 | 0.9778 | 0.9737 | 0.9700 | 0.9514 | 0.9738 | 0.9565 | 0.9751 | 0.9674 | 0.9512 | 0.9562 | 0.9324 | 0.9535 | 0.9429 | 0.9901 |
| 0.0323 | 3.2409 | 9000 | 0.0422 | 0.9575 | 0.9781 | 0.0 | 0.9521 | 0.9725 | 0.9215 | 0.9445 | 0.9787 | 0.9601 | 0.8459 | 0.7863 | 0.9238 | 0.9834 | 0.8189 | 0.7040 | 0.6460 | 0.9117 | 0.9393 | 0.9792 | 0.9748 | 0.9679 | 0.9575 | 0.9746 | 0.9569 | 0.9732 | 0.9688 | 0.9509 | 0.9557 | 0.9336 | 0.9500 | 0.9418 | 0.9899 |
| 0.0313 | 3.6010 | 10000 | 0.0412 | 0.9630 | 0.9784 | 0.0 | 0.9551 | 0.9741 | 0.9235 | 0.9460 | 0.9826 | 0.9646 | 0.8619 | 0.7991 | 0.9277 | 0.9829 | 0.8386 | 0.7306 | 0.6767 | 0.9199 | 0.9454 | 0.9810 | 0.9746 | 0.9692 | 0.9598 | 0.9746 | 0.9589 | 0.9731 | 0.9685 | 0.9547 | 0.9583 | 0.9390 | 0.9527 | 0.9458 | 0.9904 |
| 0.0304 | 3.9611 | 11000 | 0.0404 | 0.9587 | 0.9792 | 0.1333 | 0.9511 | 0.9725 | 0.9219 | 0.9538 | 0.9769 | 0.9578 | 0.8589 | 0.8061 | 0.9255 | 0.9845 | 0.8402 | 0.7395 | 0.6790 | 0.9136 | 0.9479 | 0.9801 | 0.9748 | 0.9698 | 0.9628 | 0.9752 | 0.9581 | 0.9775 | 0.9695 | 0.9501 | 0.9597 | 0.9373 | 0.9540 | 0.9456 | 0.9904 |
| 0.0264 | 4.3212 | 12000 | 0.0416 | 0.9599 | 0.9794 | 0.5 | 0.9547 | 0.9735 | 0.9271 | 0.9557 | 0.9809 | 0.9537 | 0.8510 | 0.8016 | 0.9316 | 0.9816 | 0.8358 | 0.7412 | 0.6877 | 0.9212 | 0.9476 | 0.9779 | 0.9729 | 0.9682 | 0.9611 | 0.9748 | 0.9593 | 0.9742 | 0.9697 | 0.9551 | 0.9590 | 0.9370 | 0.9550 | 0.9459 | 0.9904 |
| 0.0266 | 4.6813 | 13000 | 0.0412 | 0.9629 | 0.9800 | 0.5 | 0.9511 | 0.9697 | 0.9276 | 0.9564 | 0.9826 | 0.9578 | 0.8590 | 0.8078 | 0.9303 | 0.9830 | 0.8423 | 0.7470 | 0.6945 | 0.9162 | 0.9468 | 0.9789 | 0.9713 | 0.9692 | 0.9597 | 0.9748 | 0.9584 | 0.9759 | 0.9698 | 0.9555 | 0.9575 | 0.9355 | 0.9579 | 0.9466 | 0.9905 |
| 0.0236 | 5.0414 | 14000 | 0.0414 | 0.9614 | 0.9786 | 0.6061 | 0.9562 | 0.9736 | 0.9223 | 0.9595 | 0.9821 | 0.9537 | 0.8673 | 0.8108 | 0.9367 | 0.9811 | 0.8422 | 0.7523 | 0.7140 | 0.9190 | 0.9503 | 0.9807 | 0.9679 | 0.9689 | 0.9676 | 0.9750 | 0.9611 | 0.9758 | 0.9699 | 0.9556 | 0.9589 | 0.9426 | 0.9543 | 0.9484 | 0.9907 |
| 0.0221 | 5.4015 | 15000 | 0.0420 | 0.9597 | 0.9797 | 0.6667 | 0.9554 | 0.9734 | 0.9210 | 0.9587 | 0.9832 | 0.9667 | 0.8637 | 0.8121 | 0.9367 | 0.9852 | 0.8449 | 0.7509 | 0.7145 | 0.9178 | 0.9498 | 0.9808 | 0.9746 | 0.9707 | 0.9650 | 0.9746 | 0.9604 | 0.9749 | 0.9692 | 0.9556 | 0.9591 | 0.9405 | 0.9563 | 0.9484 | 0.9906 |
| 0.021 | 5.7616 | 16000 | 0.0421 | 0.9613 | 0.9794 | 0.6667 | 0.9532 | 0.9736 | 0.9287 | 0.9554 | 0.9792 | 0.9599 | 0.8624 | 0.8146 | 0.9334 | 0.9790 | 0.8445 | 0.7534 | 0.7154 | 0.9181 | 0.9487 | 0.9791 | 0.9721 | 0.9691 | 0.9646 | 0.9748 | 0.9534 | 0.9757 | 0.9693 | 0.9561 | 0.9586 | 0.9403 | 0.9545 | 0.9473 | 0.9905 |
| 0.0174 | 6.1217 | 17000 | 0.0433 | 0.9617 | 0.9788 | 0.7879 | 0.9545 | 0.9738 | 0.9241 | 0.9598 | 0.9829 | 0.9589 | 0.8570 | 0.8131 | 0.9369 | 0.9838 | 0.8449 | 0.7581 | 0.7242 | 0.9230 | 0.9488 | 0.9798 | 0.9690 | 0.9691 | 0.9652 | 0.9759 | 0.9563 | 0.9769 | 0.9700 | 0.9556 | 0.9581 | 0.9403 | 0.9563 | 0.9482 | 0.9907 |
| 0.017 | 6.4818 | 18000 | 0.0442 | 0.9623 | 0.9790 | 0.9697 | 0.9566 | 0.9744 | 0.9258 | 0.9608 | 0.9833 | 0.9574 | 0.8565 | 0.8130 | 0.9350 | 0.9845 | 0.8450 | 0.7552 | 0.7329 | 0.9216 | 0.9519 | 0.9800 | 0.9723 | 0.9703 | 0.9675 | 0.9762 | 0.9605 | 0.9775 | 0.9713 | 0.9545 | 0.9582 | 0.9398 | 0.9582 | 0.9489 | 0.9907 |
| 0.017 | 6.8419 | 19000 | 0.0431 | 0.9639 | 0.9778 | 0.9697 | 0.9562 | 0.9738 | 0.9286 | 0.9612 | 0.9842 | 0.9607 | 0.8641 | 0.8160 | 0.9363 | 0.9828 | 0.8481 | 0.7610 | 0.7292 | 0.9198 | 0.9531 | 0.9800 | 0.9757 | 0.9699 | 0.9657 | 0.9751 | 0.9600 | 0.9767 | 0.9705 | 0.9565 | 0.9587 | 0.9414 | 0.9577 | 0.9495 | 0.9909 |
| 0.015 | 7.2020 | 20000 | 0.0438 | 0.9645 | 0.9795 | 0.9091 | 0.9550 | 0.9734 | 0.9295 | 0.9605 | 0.9824 | 0.9605 | 0.8594 | 0.8120 | 0.9382 | 0.9837 | 0.8452 | 0.7571 | 0.7222 | 0.9220 | 0.9540 | 0.9810 | 0.9745 | 0.9700 | 0.9672 | 0.9758 | 0.9599 | 0.9783 | 0.9702 | 0.9551 | 0.9596 | 0.9414 | 0.9576 | 0.9494 | 0.9908 |
| 0.0152 | 7.5621 | 21000 | 0.0451 | 0.9644 | 0.9795 | 0.9697 | 0.9570 | 0.9741 | 0.9271 | 0.9616 | 0.9826 | 0.9597 | 0.8649 | 0.8121 | 0.9374 | 0.9848 | 0.8469 | 0.7612 | 0.7261 | 0.9231 | 0.9530 | 0.9809 | 0.9747 | 0.9704 | 0.9661 | 0.9756 | 0.9618 | 0.9769 | 0.9706 | 0.9570 | 0.9601 | 0.9427 | 0.9573 | 0.9499 | 0.9908 |
| 0.0137 | 7.9222 | 22000 | 0.0450 | 0.9628 | 0.9780 | 0.9697 | 0.9565 | 0.9742 | 0.9289 | 0.9627 | 0.9832 | 0.9613 | 0.8643 | 0.8169 | 0.9374 | 0.9840 | 0.8497 | 0.7632 | 0.7292 | 0.9234 | 0.9514 | 0.9807 | 0.9737 | 0.9695 | 0.9674 | 0.9758 | 0.9610 | 0.9778 | 0.9701 | 0.9572 | 0.9596 | 0.9420 | 0.9582 | 0.9501 | 0.9908 |
| 0.0122 | 8.2823 | 23000 | 0.0463 | 0.9646 | 0.9789 | 0.9697 | 0.9560 | 0.9738 | 0.9276 | 0.9628 | 0.9835 | 0.9602 | 0.8643 | 0.8176 | 0.9386 | 0.9838 | 0.8494 | 0.7638 | 0.7275 | 0.9233 | 0.9519 | 0.9806 | 0.9739 | 0.9696 | 0.9682 | 0.9762 | 0.9604 | 0.9769 | 0.9698 | 0.9577 | 0.9592 | 0.9426 | 0.9578 | 0.9502 | 0.9908 |
| 0.0123 | 8.6424 | 24000 | 0.0459 | 0.9626 | 0.9782 | 0.9697 | 0.9566 | 0.9743 | 0.9276 | 0.9628 | 0.9839 | 0.9613 | 0.8670 | 0.8163 | 0.9394 | 0.9850 | 0.8487 | 0.7635 | 0.7357 | 0.9241 | 0.9539 | 0.9810 | 0.9737 | 0.9701 | 0.9680 | 0.9757 | 0.9617 | 0.9780 | 0.9702 | 0.9574 | 0.9601 | 0.9436 | 0.9578 | 0.9506 | 0.9909 |
| 0.0133 | 9.0025 | 25000 | 0.0462 | 0.9636 | 0.9788 | 0.9697 | 0.9563 | 0.9731 | 0.9273 | 0.9631 | 0.9835 | 0.9625 | 0.8672 | 0.8157 | 0.9393 | 0.9837 | 0.8495 | 0.7609 | 0.7289 | 0.9236 | 0.9541 | 0.9814 | 0.9737 | 0.9698 | 0.9684 | 0.9761 | 0.9618 | 0.9776 | 0.9698 | 0.9570 | 0.9591 | 0.9435 | 0.9574 | 0.9504 | 0.9909 |
| 0.0112 | 9.3626 | 26000 | 0.0467 | 0.9624 | 0.9789 | 0.9697 | 0.9567 | 0.9740 | 0.9243 | 0.9635 | 0.9832 | 0.9654 | 0.8643 | 0.8170 | 0.9375 | 0.9844 | 0.8489 | 0.7603 | 0.7303 | 0.9248 | 0.9534 | 0.9812 | 0.9735 | 0.9701 | 0.9685 | 0.9762 | 0.9617 | 0.9784 | 0.9698 | 0.9563 | 0.9594 | 0.9428 | 0.9576 | 0.9501 | 0.9909 |
| 0.0116 | 9.7227 | 27000 | 0.0464 | 0.9628 | 0.9789 | 0.9697 | 0.9562 | 0.9741 | 0.9260 | 0.9633 | 0.9826 | 0.9643 | 0.8637 | 0.8138 | 0.9379 | 0.9843 | 0.8492 | 0.7610 | 0.7278 | 0.9245 | 0.9536 | 0.9808 | 0.9725 | 0.9702 | 0.9686 | 0.9761 | 0.9613 | 0.9778 | 0.9698 | 0.9564 | 0.9591 | 0.9419 | 0.9583 | 0.9500 | 0.9908 |
| 0.011 | 10.0828 | 28000 | 0.0470 | 0.9637 | 0.9790 | 0.9697 | 0.9561 | 0.9736 | 0.9266 | 0.9632 | 0.9831 | 0.9646 | 0.8656 | 0.8160 | 0.9384 | 0.9843 | 0.8494 | 0.7597 | 0.7281 | 0.9239 | 0.9537 | 0.9805 | 0.9731 | 0.9701 | 0.9685 | 0.9759 | 0.9611 | 0.9778 | 0.9698 | 0.9573 | 0.9591 | 0.9423 | 0.9583 | 0.9502 | 0.9909 |
| 0.011 | 10.4429 | 29000 | 0.0469 | 0.9642 | 0.9790 | 0.9697 | 0.9567 | 0.9738 | 0.9267 | 0.9632 | 0.9834 | 0.9654 | 0.8653 | 0.8172 | 0.9393 | 0.9842 | 0.8495 | 0.7609 | 0.7287 | 0.9247 | 0.9544 | 0.9809 | 0.9732 | 0.9699 | 0.9687 | 0.9762 | 0.9614 | 0.9777 | 0.9699 | 0.9574 | 0.9596 | 0.9430 | 0.9581 | 0.9505 | 0.9909 |
| 0.0106 | 10.8030 | 30000 | 0.0470 | 0.9642 | 0.9789 | 0.9697 | 0.9566 | 0.9737 | 0.9264 | 0.9633 | 0.9833 | 0.9654 | 0.8653 | 0.8170 | 0.9390 | 0.9842 | 0.8495 | 0.7609 | 0.7281 | 0.9247 | 0.9540 | 0.9808 | 0.9732 | 0.9700 | 0.9689 | 0.9761 | 0.9609 | 0.9777 | 0.9701 | 0.9572 | 0.9594 | 0.9428 | 0.9582 | 0.9504 | 0.9909 |
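
The evaluation results reported at the top of this card match the final row (step 30000). To see how that checkpoint compares with earlier ones, the sketch below copies the Step, Validation Loss, and overall F1 columns from the table above and looks up the step with the lowest validation loss and the step with the highest overall F1. The variable names are illustrative, and this is only a reading of the logged numbers, not a statement about which checkpoint was published.

```python
# (step, validation loss, overall F1), copied from the training results table.
rows = [
    (1000, 0.1439, 0.6971), (2000, 0.0707, 0.8769), (3000, 0.0554, 0.9114),
    (4000, 0.0484, 0.9253), (5000, 0.0467, 0.9324), (6000, 0.0441, 0.9377),
    (7000, 0.0428, 0.9415), (8000, 0.0420, 0.9429), (9000, 0.0422, 0.9418),
    (10000, 0.0412, 0.9458), (11000, 0.0404, 0.9456), (12000, 0.0416, 0.9459),
    (13000, 0.0412, 0.9466), (14000, 0.0414, 0.9484), (15000, 0.0420, 0.9484),
    (16000, 0.0421, 0.9473), (17000, 0.0433, 0.9482), (18000, 0.0442, 0.9489),
    (19000, 0.0431, 0.9495), (20000, 0.0438, 0.9494), (21000, 0.0451, 0.9499),
    (22000, 0.0450, 0.9501), (23000, 0.0463, 0.9502), (24000, 0.0459, 0.9506),
    (25000, 0.0462, 0.9504), (26000, 0.0467, 0.9501), (27000, 0.0464, 0.9500),
    (28000, 0.0470, 0.9502), (29000, 0.0469, 0.9505), (30000, 0.0470, 0.9504),
]

# Checkpoint with the lowest validation loss.
lowest_loss = min(rows, key=lambda r: r[1])
# Checkpoint with the highest overall F1.
highest_f1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", lowest_loss)  # (11000, 0.0404, 0.9456)
print("highest overall F1:", highest_f1)       # (24000, 0.9459, 0.9506) is NOT the row; see below
```

The lowest validation loss (0.0404) is logged at step 11000 and the highest overall F1 (0.9506) at step 24000, where the validation loss is 0.0459. After step 11000 the validation loss drifts upward while the overall F1 still improves slightly.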

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
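
To compare a local environment against the versions listed above, something like the following could be used. It is a sketch using only the standard library; `compare_versions` is a hypothetical helper, and the package names are the usual PyPI distribution names (`torch` for PyTorch), which is an assumption about how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.3.1+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the list only records the versions used during training.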