Zero-shot Explicit GPT2

This is a modified GPT2 model, introduced in the Findings of ACL'23 paper "Label Agnostic Pre-training for Zero-shot Text Classification" by Christopher Clarke, Yuzhao Heng, Yiping Kang, Krisztian Flautner, Lingjia Tang and Jason Mars. The code for training and evaluating this model can be found here.

Model description

This model is intended for zero-shot text classification. It was trained under the generative classification framework, using explicit training on the aspect-normalized UTCD dataset.

Usage

Install our Python package:

pip install zeroshot-classifier

Then, you can use the model like this:

>>> import torch
>>> from zeroshot_classifier.models import ZsGPT2Tokenizer, ZsGPT2LMHeadModel

>>> training_strategy = 'explicit'
>>> model_name = f'claritylab/zero-shot-{training_strategy}-gpt2'
>>> model = ZsGPT2LMHeadModel.from_pretrained(model_name)
>>> tokenizer = ZsGPT2Tokenizer.from_pretrained(model_name, form=training_strategy)

>>> text = "I'd like to have this track onto my Classical Relaxations playlist."
>>> labels = [
...     'Add To Playlist', 'Book Restaurant', 'Get Weather', 'Play Music', 'Rate Book', 'Search Creative Work',
...     'Search Screening Event'
... ]

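>>> # Encode the text and the candidate labels into a single prompt
>>> inputs = tokenizer(dict(text=text, label_options=labels), mode='inference-sample')
>>> # Add a batch dimension of size 1 to every field
>>> inputs = {k: torch.tensor(v).unsqueeze(0) for k, v in inputs.items()}
>>> outputs = model.generate(**inputs, max_length=128)
>>> # Keep the special tokens so the question, text and answer segments stay visible
>>> decoded = tokenizer.batch_decode(outputs, skip_special_tokens=False)[0]
>>> print(decoded)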

<|question|>Which of these choices best describes the following document? : " Play Music ", " Add To Playlist ", " Rate Book ", " Get Weather ", " Book Restaurant ", " Search Screening Event ", " Search Creative Work "<|endoftext|><|text|>I'd like to have this track onto my Classical Relaxations playlist.<|endoftext|><|answer|>Play Media<|endoftext|>
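
The decoded string contains the full prompt followed by the generated answer, so the predicted label still has to be pulled out of it. The following is a minimal sketch that relies only on the `<|answer|>` and `<|endoftext|>` markers visible in the output above. The helpers `extract_answer` and `match_label` are hypothetical names for illustration and are not part of the `zeroshot_classifier` package. Because the model generates free text, the answer may not exactly match any of the supplied options (as with "Play Media" above), so `match_label` returns `None` in that case.

```python
import re
from typing import List, Optional

# Hypothetical helpers for illustration; not part of the zeroshot_classifier package.
# The pattern captures the text after <|answer|> up to the next <|endoftext|> or the end of the string.
ANSWER_PATTERN = re.compile(r'<\|answer\|>(.*?)(?:<\|endoftext\|>|$)', re.DOTALL)


def extract_answer(decoded: str) -> str:
    """Return the text of the last <|answer|> segment in a decoded output."""
    matches = ANSWER_PATTERN.findall(decoded)
    if not matches:
        raise ValueError('no <|answer|> segment found in decoded output')
    return matches[-1].strip()


def match_label(answer: str, label_options: List[str]) -> Optional[str]:
    """Return the option equal to the answer (ignoring case), or None if there is no such option."""
    normalized = answer.strip().lower()
    for option in label_options:
        if option.lower() == normalized:
            return option
    return None
```

With the objects from the usage example, this could be called as `match_label(extract_answer(decoded), labels)`.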
