gaodrew committed on
Commit
6afda1a
1 Parent(s): 87e2e03

Update README.md

  1. README.md +6 -5
README.md CHANGED
@@ -36,22 +36,23 @@ pipeline_tag: token-classification
  Piiranha is trained to **detect 17 types** of Personally Identifiable Information (PII) across six languages. It successfully **catches 98.27% of PII** tokens, with an overall classification **accuracy of 99.44%**.
  Piiranha is especially accurate at detecting passwords, emails (100%), phone numbers, and usernames.
 
- Supported languages: English, Spanish, French, German, Italian, Dutch
-
- Supported PII types: Account Number, Building Number, City, Credit Card Number, Date of Birth, Driver's License, Email, First Name, Last Name, ID Card, Password, Social Security Number, Street Address, Tax Number, Phone Number, Username, Zipcode.
-
  Performance on PII vs. Non PII classification task:
  - **Precision: 98.48%** (98.48% of tokens classified as PII are actually PII)
  - **Recall: 98.27%** (correctly identifies 98.27% of PII tokens)
  - **Specificity: 99.84%** (correctly identifies 99.84% of Non PII tokens)
 
- <img src="https://cloud-3i4ld6u5y-hack-club-bot.vercel.app/0home.png" alt="Akash Network logo" width="400"/>
+ <img src="https://cloud-3i4ld6u5y-hack-club-bot.vercel.app/0home.png" alt="Akash Network logo" width="250"/>
  Piiranha was trained on an H100 GPU rented through the [Akash Network](https://akash.network/).
 
  ## Model Description
  Piiranha is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base).
  The context length is 256 Deberta tokens. If your text is longer than that, just split it up.
 
+ Supported languages: English, Spanish, French, German, Italian, Dutch
+
+ Supported PII types: Account Number, Building Number, City, Credit Card Number, Date of Birth, Driver's License, Email, First Name, Last Name, ID Card, Password, Social Security Number, Street Address, Tax Number, Phone Number, Username, Zipcode.
+
+
  It achieves the following results on a test set of ~73,000 sentences containing PII:
  - Accuracy: 99.44%
  - Loss: 0.0173
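
The README says to "just split it up" when text exceeds the 256-token context length, but does not say how. One possible approach is sketched below, under several assumptions that are not in the README: the helper name `chunk_tokens` is hypothetical, the `overlap` parameter is an added design choice (so that a PII span cut at one window boundary appears whole in the next window), and whitespace splitting stands in for the real DeBERTa tokenizer so the example runs without extra dependencies. The README also does not state whether special tokens count toward the 256 limit, so a smaller `max_len` may be needed in practice.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_tokens(tokens: Sequence[T], max_len: int = 256, overlap: int = 0) -> List[List[T]]:
    """Split a token sequence into windows of at most `max_len` tokens.

    Consecutive windows share `overlap` tokens. This is an illustrative
    helper, not part of the Piiranha model or its tokenizer.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap  # how far the window start advances each time
    chunks: List[List[T]] = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the input
        start += step
    return chunks


# Stand-in tokenization (assumption): whitespace split, NOT DeBERTa tokens.
words = ("Contact Jane at jane@example.com . " * 120).split()  # 600 tokens
windows = chunk_tokens(words, max_len=256, overlap=32)
print([len(w) for w in windows])
```

With a real tokenizer, the same windowing would be applied to token IDs rather than words, and each window's predictions would then need to be merged back, for example by keeping one prediction per token in the overlapping region.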