# UniMC

Code for the paper *Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective*.
## Update

- [2022-10-18] Released the preprint on arXiv.
- [2022-10-14] Released the code on GitHub.
## Requirements

```shell
git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
cd Fengshenbang-LM
pip install --editable .
```
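To check that the editable install succeeded, the import used in the Quick Start below can be exercised directly; a minimal sketch, assuming no optional dependencies are missing:

```python
# Quick sanity check after installation: the import below should succeed
# if the editable install of Fengshenbang-LM worked.
from fengshen.pipelines.multiplechoice import UniMCPipelines

print(UniMCPipelines)
```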
## Quick Start

You can refer to our example.py:
```python
import argparse

from fengshen.pipelines.multiplechoice import UniMCPipelines

# Build the argument parser and let the pipeline register its own arguments.
total_parser = argparse.ArgumentParser("TASK NAME")
total_parser = UniMCPipelines.piplines_args(total_parser)
args = total_parser.parse_args()

pretrained_model_path = 'IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English'
args.language = 'english'
args.learning_rate = 2e-5
args.max_length = 512
args.max_epochs = 3
args.batchsize = 8
args.default_root_dir = './'

model = UniMCPipelines(args, model_path=pretrained_model_path)

train_data = []
dev_data = []
test_data = [{
    "texta": "it 's just incredibly dull .",
    "textb": "",
    "question": "What is the sentiment of the following review?",
    "choice": ["it's great", "it's terrible"],
    "answer": "",
    "label": 0,
    "id": 19
}]

# Fine-tune only when training is requested; zero-shot prediction works without it.
if args.train:
    model.train(train_data, dev_data)
result = model.predict(test_data)
```
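The returned predictions can then be inspected. This is a minimal sketch, assuming `predict` returns the input dictionaries with the predicted `label` index filled in (check `example.py` for the exact output format):

```python
# Assumption: each returned item mirrors the input dict, with `label` set to
# the index of the predicted choice. Verify against example.py.
for item in result:
    predicted_choice = item["choice"][item["label"]]
    print(item["id"], predicted_choice)
```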
## Pretrained Model

The English model was pre-trained on 14 multiple-choice datasets. For the Chinese models, we collected 48 datasets for pre-training, and we have open-sourced the pre-trained models to the HuggingFace community.
Model | URL |
---|---|
Erlangshen-UniMC-Albert-235M-English | https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English |
Erlangshen-UniMC-RoBERTa-110M-Chinese | https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese |
Erlangshen-UniMC-RoBERTa-330M-Chinese | https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese |
Erlangshen-UniMC-MegatronBERT-1.3B-Chinese | https://huggingface.co/IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese |
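Any of these checkpoints can be plugged into the Quick Start pipeline. A hypothetical sketch for a Chinese checkpoint, assuming `args.language = 'chinese'` selects the Chinese templates in the same way `args.language = 'english'` does above:

```python
# Hypothetical: swap in a Chinese checkpoint from the table above.
# Assumption: args.language = 'chinese' is accepted, mirroring 'english'
# in the Quick Start example.
pretrained_model_path = 'IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese'
args.language = 'chinese'
model = UniMCPipelines(args, model_path=pretrained_model_path)
```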
## Experiments

To evaluate the performance of UniMC, we pre-train the model on 14 multiple-choice datasets so that it learns the ability to make choices, and then evaluate it on unseen tasks in a zero-shot setting.

### Zero-shot
Dataset | T0 11B | GLaM 60B | FLAN 137B | PaLM 540B | UniMC 235M |
---|---|---|---|---|---|
ANLI R1 | 43.6 | 40.9 | 47.7 | 48.4 | 52.0 |
ANLI R2 | 38.7 | 38.2 | 43.9 | 44.2 | 44.4 |
ANLI R3 | 41.3 | 40.9 | 47.0 | 45.7 | 47.8 |
CB | 70.1 | 33.9 | 64.1 | 51.8 | 75.7 |
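For reference, zero-shot accuracy on a labeled dataset can be computed along the following lines; a sketch under the assumption (as above) that `predict` returns a predicted `label` index per example:

```python
# Sketch: zero-shot accuracy. Assumes predict() fills in a predicted `label`
# index per example; gold labels are supplied separately.
def zero_shot_accuracy(model, examples, gold_labels):
    predictions = model.predict(examples)
    correct = sum(int(p["label"] == g) for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)
```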
## Citation

If this repository helps you, please cite our paper:
```bibtex
@article{unimc,
  author  = {Ping Yang and
             Junjie Wang and
             Ruyi Gan and
             Xinyu Zhu and
             Lin Zhang and
             Ziwei Wu and
             Xinyu Gao and
             Jiaxing Zhang and
             Tetsuya Sakai},
  title   = {Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective},
  journal = {CoRR},
  volume  = {abs/2210.08590},
  year    = {2022}
}
```