---
library_name: transformers
license: openrail++
datasets:
- textdetox/multilingual_paradetox
- chameleon-lizard/synthetic-multilingual-paradetox
language:
- ru
- en
- am
- uk
- de
- es
- ar
- hi
- zh
pipeline_tag: text2text-generation
---
# Model Card for detox-mt0-xl
A fine-tune of the mt0-xl model for the text detoxification task.
## Model Details
### Model Description
This is a fine-tune of the mt0-xl model for the text detoxification task. It can be used for synthetic data generation from toxic examples.
- **Developed by:** Nikita Sushko
- **Model type:** mt5-xl
- **Language(s) (NLP):** English, Russian, Ukrainian, Amharic, German, Spanish, Chinese, Arabic, Hindi
- **License:** OpenRail++
- **Finetuned from model:** mt0-xl
## Uses
This model is intended for text detoxification in 9 languages: English, Russian, Ukrainian, Amharic, German, Spanish, Chinese, Arabic, Hindi.
### Direct Use
The model may be directly used for text detoxification tasks.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import transformers

checkpoint = 'chameleon-lizard/detox-mt0-xl'

# Load the tokenizer and model; device_map="auto" places weights on available devices.
tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
model = transformers.AutoModelForSeq2SeqLM.from_pretrained(checkpoint, torch_dtype='auto', device_map="auto")

pipe = transformers.pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512,
    truncation=True,
)

language = 'English'
text = "You are a major fucking disappointment."

# The prompt must be an f-string so that {language} and {text} are interpolated.
prompt = f'Write a non-toxic version of the following text in {language}: {text}'
print(pipe(prompt)[0]['generated_text'])
# Resulting text: "You are a major disappointment."
```
Be sure to use the provided prompt format for the best performance. If the target language is omitted, the model may respond in an arbitrary language.
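Since every request needs the same prompt wording and an explicit target language, it can help to wrap the formatting in a small helper. The following is a minimal sketch, not part of the released code: `LANGUAGE_NAMES`, `build_prompt`, and `detoxify` are hypothetical names introduced here, and the mapping from language codes to English language names is an assumption based on the languages listed in this card. Only the prompt template itself is taken from the example above.

```python
# Hypothetical helpers; only the prompt template comes from this model card.
LANGUAGE_NAMES = {
    "en": "English", "ru": "Russian", "uk": "Ukrainian",
    "am": "Amharic", "de": "German", "es": "Spanish",
    "zh": "Chinese", "ar": "Arabic", "hi": "Hindi",
}

PROMPT_TEMPLATE = "Write a non-toxic version of the following text in {language}: {text}"


def build_prompt(text, lang_code):
    """Format one input with the prompt template and an explicit target language."""
    if lang_code not in LANGUAGE_NAMES:
        raise ValueError(f"Unsupported language code: {lang_code!r}")
    return PROMPT_TEMPLATE.format(language=LANGUAGE_NAMES[lang_code], text=text)


def detoxify(pipe, texts, lang_code):
    """Run a text2text-generation pipeline (or a compatible callable) on several texts."""
    prompts = [build_prompt(t, lang_code) for t in texts]
    results = []
    for out in pipe(prompts):
        # Assumption: each item is either a dict or a one-element list of dicts.
        if isinstance(out, list):
            out = out[0]
        results.append(out["generated_text"])
    return results
```

With the `pipe` object created earlier, a call such as `detoxify(pipe, ["You are a major fucking disappointment."], "en")` would return a list with one rewritten string per input.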