metadata
pipeline_tag: text-classification
tags:
  - sentence-transformers
  - transformers
language:
  - en
  - da
license:
  - apache-2.0

SetFit-caesar-cipher-classifier

This started out as a sentence-transformers model: it mapped sentences and paragraphs to a 768-dimensional dense vector space and could be used for tasks like clustering or semantic search. Now it's a SetFit classifier that determines whether a sentence is gibberish or not. Hail Science!

Usage (SetFitModel)

Using this model becomes easy when you have sentence-transformers and SetFit installed:

pip install -U sentence-transformers setfit

Then you can use the model like this:

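from setfit import SetFitModel

# Two plain English sentences, followed by the same sentences Caesar-shifted by 7 and 13
sentences = ["This is an example sentence", "Each sentence is tested", "Aopz pz hu lehtwsl zlualujl", "Rnpu fragrapr vf grfgrq"]
model = SetFitModel.from_pretrained("trollek/setfit-gibberish-detector")
for sentence in sentences:
  # predict returns the class label for the sentence (see the legend below)
  classification = model.predict(sentence)
  print(classification)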
  • 0 is clear text
  • 1 is gibberish
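
The two gibberish sentences in the example are Caesar-shifted copies of the two clear-text ones (shifts of 7 and 13). To produce your own test inputs, something like the following sketch should do. Note that `caesar_shift` is a hypothetical helper written for illustration, not part of SetFit or this model, and it assumes only the ASCII letters a-z are rotated, so Danish æ, ø and å pass through unchanged.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Rotate ASCII letters by `shift` positions, preserving case.

    Hypothetical helper for generating test sentences; characters
    outside a-z / A-Z (spaces, digits, æ, ø, å, ...) are left as-is.
    """
    shift %= 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Build a translation table that maps each letter to its shifted counterpart
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)


# Reproduce the ciphered sentences from the usage example
print(caesar_shift("This is an example sentence", 7))
print(caesar_shift("Each sentence is tested", 13))
```

The resulting strings can be passed to `model.predict` just like the sentences in the usage example.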

It would presumably also work on Enigma-encrypted text, but that has not been tested yet. In any case, the model has proven pretty reliable (99%) at classifying English and Danish sentences.