---
library_name: zeroshot_classifier
tags:
- transformers
- sentence-transformers
- zeroshot_classifier
license: mit
datasets:
- claritylab/UTCD
language:
- en
pipeline_tag: text-generation
metrics:
- accuracy
---

# Zero-shot Vanilla GPT2

This is a modified GPT2 model.
It was introduced in the Findings of ACL 2023 paper **Label Agnostic Pre-training for Zero-shot Text Classification** by ***Christopher Clarke, Yuzhao Heng, Yiping Kang, Krisztian Flautner, Lingjia Tang and Jason Mars***.
The code for training and evaluating this model can be found [here](https://github.com/ChrisIsKing/zero-shot-text-classification/tree/master).

## Model description

This model is intended for zero-shot text classification.
It was trained as a baseline under the generative classification framework on the aspect-normalized [UTCD](https://huggingface.co/datasets/claritylab/UTCD) dataset.

- **Finetuned from model:** [`gpt2-medium`](https://huggingface.co/gpt2-medium)
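
In the generative framing, the label options and the input text are packed into a single prompt and the model generates the label as free text. The sketch below rebuilds the prompt layout visible in the decoded sample in the Usage section. It is an illustration only: `format_prompt` is a hypothetical helper, the example sentence is made up, and the actual formatting is handled by `ZsGPT2Tokenizer`, which may differ (for instance, the sample suggests it reorders the label options).

```python
from typing import List

# Question wording and special tokens are copied from the decoded sample;
# treating them as fixed is an assumption of this sketch.
QUESTION = "How is the text best described?"


def format_prompt(text: str, labels: List[str], question: str = QUESTION) -> str:
    """Hypothetical helper: lay out question, label options and text as one prompt."""
    # Each option is quoted with inner spaces, as in the decoded sample.
    options = ", ".join(f'" {label} "' for label in labels)
    return (
        f"<|question|>{question} : {options}<|endoftext|>"
        f"<|text|>{text}<|endoftext|>"
        f"<|answer|>"  # the model continues generating from here
    )


prompt = format_prompt("Will it rain tomorrow?", ["Get Weather", "Play Music"])
print(prompt)
```

The prompt ends at `<|answer|>`, so whatever the model generates next is read as the predicted label.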

## Usage

Install our [Python package](https://pypi.org/project/zeroshot-classifier/):
```bash
pip install zeroshot-classifier
```

Then, you can use the model like this:

```python
>>> import torch
>>> from zeroshot_classifier.models import ZsGPT2Tokenizer, ZsGPT2LMHeadModel

>>> # Load the model and its matching tokenizer
>>> training_strategy = 'vanilla'
>>> model_name = f'claritylab/zero-shot-{training_strategy}-gpt2'
>>> model = ZsGPT2LMHeadModel.from_pretrained(model_name)
>>> tokenizer = ZsGPT2Tokenizer.from_pretrained(model_name, form=training_strategy)

>>> text = "I'd like to have this track onto my Classical Relaxations playlist."
>>> labels = [
...     'Add To Playlist', 'Book Restaurant', 'Get Weather', 'Play Music', 'Rate Book', 'Search Creative Work',
...     'Search Screening Event'
... ]

>>> # Tokenize a single sample and add a batch dimension
>>> inputs = tokenizer(dict(text=text, label_options=labels), mode='inference-sample')
>>> inputs = {k: torch.tensor(v).unsqueeze(0) for k, v in inputs.items()}
>>> # Generate the answer and decode it, keeping the special tokens
>>> outputs = model.generate(**inputs, max_length=128)
>>> decoded = tokenizer.batch_decode(outputs, skip_special_tokens=False)[0]
>>> print(decoded)
<|question|>How is the text best described? : " Rate Book ", " Search Screening Event ", " Add To Playlist ", " Search Creative Work ", " Get Weather ", " Play Music ", " Book Restaurant "<|endoftext|><|text|>I'd like to have this track onto my Classical Relaxations playlist.<|endoftext|><|answer|>Play Media<|endoftext|>
```
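
The decoded string still contains the prompt and the special tokens, so the predicted label has to be pulled out of it. Here is a minimal sketch, assuming the `<|answer|>` and `<|endoftext|>` markers appear exactly as in the printed sample. `extract_answer` and `is_known_label` are hypothetical helpers and are not part of `zeroshot_classifier`.

```python
import re
from typing import List, Optional

# Capture the text between <|answer|> and the next <|endoftext|> (or end of string).
ANSWER_PATTERN = re.compile(r"<\|answer\|>(.*?)(?:<\|endoftext\|>|$)", re.DOTALL)


def extract_answer(decoded: str) -> Optional[str]:
    """Hypothetical helper: return the generated answer, or None if no marker is found."""
    match = ANSWER_PATTERN.search(decoded)
    return match.group(1).strip() if match else None


def is_known_label(answer: Optional[str], labels: List[str]) -> bool:
    """Check (case-insensitively) whether the answer is one of the label options."""
    return answer is not None and answer.lower() in {label.lower() for label in labels}


labels = [
    'Add To Playlist', 'Book Restaurant', 'Get Weather', 'Play Music', 'Rate Book',
    'Search Creative Work', 'Search Screening Event',
]
# Tail of the decoded sample shown above
sample = (
    "<|text|>I'd like to have this track onto my Classical Relaxations playlist."
    "<|endoftext|><|answer|>Play Media<|endoftext|>"
)

answer = extract_answer(sample)
print(answer)                          # Play Media
print(is_known_label(answer, labels))  # False
```

Because the label is generated as free text, it is not guaranteed to match one of the options: in the sample above the model produces `Play Media`, which is not in `labels`. A check like `is_known_label` makes such cases explicit.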