
	---
	pipeline_tag: text-classification
	tags:
	- sentence-transformers
	- transformers
	language:
	- en
	- da
	license: apache-2.0
	---
	# SetFit-caesar-cipher-classifier
	This started out as a [sentence-transformers](https://www.SBERT.net) model that mapped sentences and paragraphs to a 768-dimensional dense vector space, usable for tasks like clustering or semantic search. It is now a SetFit classifier that determines whether a sentence is gibberish or not. Hail Science!

	## Usage (SetFitModel)
	Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) and [SetFit](https://github.com/huggingface/setfit) installed:
	```
	pip install -U sentence-transformers setfit
	```
	Then you can use the model like this:
	```python
	from setfit import SetFitModel
	sentences = ["This is an example sentence", "Each sentence is tested", "Aopz pz hu lehtwsl zlualujl", "Rnpu fragrapr vf grfgrq"]
	model = SetFitModel.from_pretrained("trollek/setfit-gibberish-detector")
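	# Classify all sentences in one batch, then print each sentence with its label
	classifications = model.predict(sentences)
	for sentence, classification in zip(sentences, classifications):
	    print(sentence, "->", classification)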
	```

	- 0 is clear text
	- 1 is gibberish
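To make the output easier to read, the numeric labels above can be mapped to names. This is a minimal sketch: `LABELS` and `label_name` are hypothetical helpers, not part of SetFit, and it assumes each prediction can be converted with `int()`.

```python
# Hypothetical helper, not part of SetFit: maps the label ids listed above to names.
LABELS = {0: "clear text", 1: "gibberish"}


def label_name(prediction) -> str:
    """Return the readable name for one prediction (assumed convertible to int)."""
    return LABELS[int(prediction)]
```

With the usage snippet above, `label_name(classification)` could be printed in place of the raw value.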

	It would presumably also work on Enigma-encrypted text, but that has not been tested. So far, the model has proven pretty reliable (99%) at classifying English and Danish sentences.
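The two gibberish sentences in the usage example are Caesar-shifted copies of the clear-text ones (shifts of 7 and 13). To produce more test input of that kind, something like the sketch below might do. `caesar_shift` is a hypothetical helper written for this illustration, not the author's data pipeline. It only rotates the ASCII letters a-z and A-Z, so Danish letters such as æ, ø and å pass through unchanged.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Rotate ASCII letters by `shift` positions; leave all other characters as-is."""
    k = shift % 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Build a translation table from the plain alphabet to the rotated alphabet.
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)


# Reproduces the shifted sentences from the usage example.
print(caesar_shift("This is an example sentence", 7))
print(caesar_shift("Each sentence is tested", 13))
```

The shifted strings can then be passed to `model.predict` alongside the originals to see how the classifier handles other shift values.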