Update README.md
README.md
CHANGED
@@ -30,6 +30,9 @@ pipeline_tag: token-classification
---
# Piiranha-v1: Protect your personal information!
Piiranha is trained to **detect 17 types** of Personally Identifiable Information (PII) across six languages. It successfully **catches 98.27% of PII** tokens, with an overall classification **accuracy of 99.44%**.
Piiranha is especially accurate at detecting passwords, emails (100%), phone numbers, and usernames.
@@ -47,6 +50,7 @@ Piiranha was trained on an H100 GPU rented through the [Akash Network](https://a
## Model Description
Piiranha is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base).
It achieves the following results on a test set of ~73,000 sentences containing PII:
- Accuracy: 99.44%
---
# Piiranha-v1: Protect your personal information!
+<a target="_blank" href="https://colab.research.google.com/github/williamgao1729/piiranha-quickstart/blob/main/piiranha_quickstart.ipynb">
+<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+</a>
Piiranha is trained to **detect 17 types** of Personally Identifiable Information (PII) across six languages. It successfully **catches 98.27% of PII** tokens, with an overall classification **accuracy of 99.44%**.
Piiranha is especially accurate at detecting passwords, emails (100%), phone numbers, and usernames.
## Model Description
Piiranha is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base).
+The context length is 256 Deberta tokens. If your text is longer than that, just split it up.
It achieves the following results on a test set of ~73,000 sentences containing PII:
- Accuracy: 99.44%
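
The added README line says to split text that exceeds the 256-token context length, but does not say how. One possible approach is sketched below, assuming a greedy split on whitespace; the helper name `split_by_token_budget`, the `count_tokens` callable, and the stand-in `whitespace_count` are hypothetical and not part of the commit or of any library.

```python
from typing import Callable, List


def split_by_token_budget(
    text: str,
    count_tokens: Callable[[str], int],
    max_tokens: int = 256,
) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of <= max_tokens.

    count_tokens is supplied by the caller so the sketch does not depend on
    any particular tokenizer. A single word that exceeds max_tokens on its
    own is still emitted as its own (oversized) chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this word would exceed the budget: close the chunk.
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


def whitespace_count(s: str) -> int:
    # Stand-in counter for demonstration only: one "token" per word.
    # A real tokenizer will usually produce more tokens than words.
    return len(s.split())


if __name__ == "__main__":
    demo = " ".join(f"w{i}" for i in range(600))
    parts = split_by_token_budget(demo, whitespace_count, max_tokens=256)
    print([whitespace_count(p) for p in parts])  # [256, 256, 88]
```

With the Transformers library, `count_tokens` could be something like `lambda s: len(tokenizer(s)["input_ids"])`, where `tokenizer` comes from `AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")`; this assumes the fine-tuned model uses the same tokenizer as its base model. Counting `input_ids` this way includes the special tokens the tokenizer adds. Note that the sketch collapses whitespace and can cut a multi-word PII entity at a chunk boundary, so overlapping chunks may be worth considering.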