	---
	library_name: zeroshot_classifier
	tags:
	- transformers
	- sentence-transformers
	- zeroshot_classifier
	license: mit
	datasets:
	- claritylab/UTCD
	language:
	- en
	pipeline_tag: text-generation
	metrics:
	- accuracy
	---

	# Zero-shot Vanilla GPT2

	This is a GPT2 model modified for zero-shot text classification.
	It was introduced in the Findings of ACL'23 paper "Label Agnostic Pre-training for Zero-shot Text Classification" by *Christopher Clarke, Yuzhao Heng, Yiping Kang, Krisztian Flautner, Lingjia Tang and Jason Mars*.
	The code for training and evaluating this model can be found [here](https://github.com/ChrisIsKing/zero-shot-text-classification/tree/master).

	## Model description

	This model is intended for zero-shot text classification.
	It was trained under the generative classification framework as a baseline with the aspect-normalized [UTCD](https://huggingface.co/datasets/claritylab/UTCD) dataset.

	- Finetuned from model: [`gpt2-medium`](https://huggingface.co/gpt2-medium)


	## Usage

	Install our [Python package](https://pypi.org/project/zeroshot-classifier/):
	```bash
	pip install zeroshot-classifier
	```

	Then, you can use the model like this:

	```python
	>>> import torch
	>>> from zeroshot_classifier.models import ZsGPT2Tokenizer, ZsGPT2LMHeadModel

	>>> training_strategy = 'vanilla'
	>>> model_name = f'claritylab/zero-shot-{training_strategy}-gpt2'
	>>> model = ZsGPT2LMHeadModel.from_pretrained(model_name)
	>>> tokenizer = ZsGPT2Tokenizer.from_pretrained(model_name, form=training_strategy)

	>>> text = "I'd like to have this track onto my Classical Relaxations playlist."
	>>> labels = [
	...     'Add To Playlist', 'Book Restaurant', 'Get Weather', 'Play Music', 'Rate Book', 'Search Creative Work',
	...     'Search Screening Event'
	... ]

	>>> inputs = tokenizer(dict(text=text, label_options=labels), mode='inference-sample')
	>>> inputs = {k: torch.tensor(v).unsqueeze(0) for k, v in inputs.items()}
	>>> outputs = model.generate(**inputs, max_length=128)
	>>> decoded = tokenizer.batch_decode(outputs, skip_special_tokens=False)[0]
	>>> print(decoded)
	<|question|>How is the text best described? : " Rate Book ", " Search Screening Event ", " Add To Playlist ", " Search Creative Work ", " Get Weather ", " Play Music ", " Book Restaurant "<|endoftext|><|text|>I'd like to have this track onto my Classical Relaxations playlist.<|endoftext|><|answer|>Play Media<|endoftext|>
	```
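
The decoded string contains the whole prompt followed by the generated answer, so the predicted label still has to be pulled out. Below is a minimal sketch of one way to do that with plain string operations. It assumes the answer sits between `<|answer|>` and the next `<|endoftext|>`, as in the sample output above. `extract_answer` and `match_label` are hypothetical helpers written for this example; they are not part of the `zeroshot-classifier` package.

```python
from typing import List, Optional

# Special tokens as they appear in the sample output (assumed, not read from the tokenizer).
ANSWER_TOKEN = '<|answer|>'
EOS_TOKEN = '<|endoftext|>'


def extract_answer(decoded: str) -> str:
    """Return the text between the last answer token and the next end-of-text token."""
    _, sep, tail = decoded.rpartition(ANSWER_TOKEN)
    if not sep:
        raise ValueError('no answer token found in decoded output')
    return tail.split(EOS_TOKEN, 1)[0].strip()


def match_label(answer: str, labels: List[str]) -> Optional[str]:
    """Case-insensitive exact match against the candidate labels.

    Returns None when the generated text is not one of the candidates.
    """
    lookup = {label.lower(): label for label in labels}
    return lookup.get(answer.lower())


# The decoded string from the usage example, split across lines for readability.
decoded = (
    '<|question|>How is the text best described? : " Rate Book ", " Search Screening Event ", '
    '" Add To Playlist ", " Search Creative Work ", " Get Weather ", " Play Music ", '
    '" Book Restaurant "<|endoftext|>'
    "<|text|>I'd like to have this track onto my Classical Relaxations playlist.<|endoftext|>"
    '<|answer|>Play Media<|endoftext|>'
)
labels = [
    'Add To Playlist', 'Book Restaurant', 'Get Weather', 'Play Music', 'Rate Book', 'Search Creative Work',
    'Search Screening Event'
]

answer = extract_answer(decoded)
print(answer)
print(match_label(answer, labels))
```

For the sample above this prints `Play Media` and then `None`: because the model generates free text, its answer is not guaranteed to be one of the supplied label options, so checking the extracted string against the candidates before using it is a reasonable precaution.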