---
pipeline_tag: text-classification
tags:
- sentence-transformers
- transformers
language:
- en
- da
license:
- apache-2.0
---

# SetFit-caesar-cipher-classifier

This started out as a [sentence-transformers](https://www.SBERT.net) model that mapped sentences and paragraphs to a 768-dimensional dense vector space and could be used for tasks like clustering or semantic search. It is now a SetFit classifier that determines whether a sentence is gibberish or not. Hail Science!

## Usage (SetFitModel)

Using this model is easy once you have [sentence-transformers](https://www.SBERT.net) and [SetFit](https://github.com/huggingface/setfit) installed:

```
pip install -U sentence-transformers setfit
```

Then you can use the model like this:

```python
from setfit import SetFitModel

# Two clear-text sentences, followed by Caesar-shifted versions of the same sentences
sentences = ["This is an example sentence", "Each sentence is tested", "Aopz pz hu lehtwsl zlualujl", "Rnpu fragrapr vf grfgrq"]

model = SetFitModel.from_pretrained("trollek/setfit-gibberish-detector")

# Classify all sentences in one batch, then print each sentence with its label
classifications = model.predict(sentences)
for sentence, classification in zip(sentences, classifications):
    print(sentence, classification)
```

- `0` is clear text
- `1` is gibberish

It would presumably also work on Enigma-encrypted text, but that has not been tested. In any case, the model has proven pretty reliable (99%) at classifying English and Danish sentences.
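
To spot-check the classifier on your own sentences, one option is to generate Caesar-shifted variants and compare the predictions against the expected labels. The sketch below is only an illustration: `caesar_shift`, `build_test_set`, and `accuracy` are hypothetical helpers, not part of SetFit, and nothing here describes how the model's training data was actually produced. It also assumes that only ASCII letters are shifted, so Danish `æ`, `ø`, and `å` pass through unchanged.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave everything else unchanged."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Spaces, punctuation, digits, and non-ASCII letters are kept as-is
            result.append(ch)
    return "".join(result)


def build_test_set(sentences, shifts=range(1, 26)):
    """Pair each clear sentence with label 0 and each shifted variant with label 1."""
    samples = [(sentence, 0) for sentence in sentences]
    samples += [(caesar_shift(sentence, k), 1) for sentence in sentences for k in shifts]
    return samples


def accuracy(predict, samples):
    """Fraction of samples whose predicted label matches the expected label.

    `predict` is any callable that takes a list of strings and returns one
    label (0 or 1) per string.
    """
    texts = [text for text, _ in samples]
    labels = [label for _, label in samples]
    predictions = predict(texts)
    correct = sum(int(p) == y for p, y in zip(predictions, labels))
    return correct / len(samples)
```

Assuming `model.predict` accepts a list of strings, as in the usage example above, something like `accuracy(model.predict, build_test_set(["This is an example sentence"]))` would give a rough accuracy figure for that sentence across all 25 shifts.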