---
pipeline_tag: text-classification
tags:
- sentence-transformers
- transformers
language:
- en
- da
license: apache-2.0
---
# SetFit-caesar-cipher-classifier
This was a [sentence-transformers](https://www.SBERT.net) model: it mapped sentences and paragraphs to a 768-dimensional dense vector space and could be used for tasks like clustering or semantic search. Now it's a SetFit classifier that determines whether a sentence is gibberish or not. Hail Science!
## Usage (SetFitModel)
Using this model is easy once you have [sentence-transformers](https://www.SBERT.net) and [SetFit](https://github.com/huggingface/setfit) installed:
```
pip install -U sentence-transformers setfit
```
Then you can use the model like this:
```python
from setfit import SetFitModel

sentences = ["This is an example sentence", "Each sentence is tested", "Aopz pz hu lehtwsl zlualujl", "Rnpu fragrapr vf grfgrq"]
model = SetFitModel.from_pretrained("trollek/setfit-gibberish-detector")

# predict takes a list of sentences and returns one label per sentence
classifications = model.predict(sentences)
for sentence, classification in zip(sentences, classifications):
    print(sentence, classification)
```
- 0 is clear text
- 1 is gibberish

It would presumably also work on Enigma-encrypted text, but that has not been tested. Anyway, the model has proven pretty reliable (99%) at classifying English and Danish sentences.
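
The two gibberish sentences in the usage example are Caesar-shifted versions of the clear-text ones (shifts of 7 and 13). If you want more test inputs of that kind, something like the following could generate them. This is only a minimal sketch: `caesar_shift` is a hypothetical helper written for illustration, not part of SetFit or of this model's training code, and it only shifts the ASCII letters a-z and A-Z.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave everything else unchanged."""
    shift %= 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Map each letter to the letter `shift` positions later, wrapping around at z/Z.
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)


print(caesar_shift("This is an example sentence", 7))  # Aopz pz hu lehtwsl zlualujl
print(caesar_shift("Each sentence is tested", 13))     # Rnpu fragrapr vf grfgrq
```

Non-ASCII letters such as the Danish æ, ø and å pass through unshifted in this sketch, so Danish test sentences would only be partially enciphered.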