language:
  - en
tags:
  - qa
  - summarization
  - emotion-detection
license: apache-2.0
datasets:
  - coqa
  - squad_v2
  - go_emotions
  - cnn_dailymail
metrics:
  - f1

T5 Base with QA + Summary + Emotion

Dependencies

Requires transformers>=4.0.0

Description

This model was fine-tuned on the CoQA, SQuAD 2.0, GoEmotions and CNN/DailyMail datasets.

It achieves an F1 score of 79.5 on the SQuAD 2.0 dev set and an F1 score of 70.6 on the CoQA dev set.

Summarisation and emotion detection have not been evaluated yet.
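For readers unfamiliar with the metric, the F1 scores above are most likely token-overlap F1 between the predicted and the reference answer, as is usual for SQuAD-style benchmarks. The card does not say which evaluation script was used, so the following is only a minimal sketch under that assumption. The function name `token_f1` is hypothetical, and the sketch skips the punctuation and article normalisation and the max-over-references step that the official scripts apply.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two answer strings (simplified sketch)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("to avoid conflicts", "to avoid possible conflicts"))
```

Here the prediction shares all three of its tokens with the four-token reference, so precision is 1.0, recall is 0.75, and F1 is 6/7.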

Usage

Question answering

from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

def get_answer(question, prev_qa, context):
    # prev_qa is a list of (question, answer) tuples from earlier turns; pass [] if there are none
    input_text = [f"q: {qa[0]} a: {qa[1]}" for qa in prev_qa]
    input_text.append(f"q: {question}")
    input_text.append(f"c: {context}")
    # Final prompt format: "q: <prev question> a: <prev answer> ... q: <question> c: <context>"
    input_text = " ".join(input_text)
    features = tokenizer([input_text], return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'],
            attention_mask=features['attention_mask'], max_length=64)
    return tokenizer.decode(tokens[0], skip_special_tokens=True)

print(get_answer("Why is the moon yellow?", [], "I'm not entirely sure why the moon is yellow.")) # unknown

context = "Elon Musk left OpenAI to avoid possible future conflicts with his role as CEO of Tesla."

print(get_answer("Why not?", [("Does Elon Musk still work with OpenAI", "No")], context)) # to avoid possible future conflicts with his role as CEO of Tesla
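To see what the model actually receives in the conversational case, the prompt string can be built without loading the model. This is a small sketch that mirrors the string handling in `get_answer` above. The helper name `build_qa_prompt` is hypothetical and not part of the model repository.

```python
def build_qa_prompt(question, prev_qa, context):
    """Build the 'q: ... a: ... q: ... c: ...' input string used for QA."""
    # Earlier turns come first, each as a question/answer pair.
    parts = [f"q: {q} a: {a}" for q, a in prev_qa]
    # The current question has no answer yet; the context goes last.
    parts.append(f"q: {question}")
    parts.append(f"c: {context}")
    return " ".join(parts)


prompt = build_qa_prompt(
    "Why not?",
    [("Does Elon Musk still work with OpenAI", "No")],
    "Elon Musk left OpenAI to avoid possible future conflicts with his role as CEO of Tesla.",
)
print(prompt)
# q: Does Elon Musk still work with OpenAI a: No q: Why not? c: Elon Musk left OpenAI to avoid possible future conflicts with his role as CEO of Tesla.
```

With an empty history the prompt reduces to `q: <question> c: <context>`.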

Summarisation

from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

def summary(context):
    input_text = f"summarize: {context}"
    features = tokenizer([input_text], return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'], 
            attention_mask=features['attention_mask'], max_length=64)
    return tokenizer.decode(tokens[0], skip_special_tokens=True)

Emotion detection

from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("kiri-ai/t5-base-qa-summary-emotion")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

def emotion(context):
    input_text = f"emotion: {context}"
    features = tokenizer([input_text], return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'], 
            attention_mask=features['attention_mask'], max_length=64)
    return tokenizer.decode(tokens[0], skip_special_tokens=True)
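The summarisation and emotion snippets above differ only in the text prefix, so a single loaded model and tokenizer can serve both tasks. Below is a minimal sketch of a prefix helper. The names `TASK_PREFIXES` and `build_task_input` are hypothetical, and the prefixes are copied from the functions above.

```python
# Prefixes taken from the summary() and emotion() functions above.
TASK_PREFIXES = {
    "summary": "summarize: ",
    "emotion": "emotion: ",
}


def build_task_input(task, context):
    """Return the prefixed input string for a single-text task."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {task!r}")
    return TASK_PREFIXES[task] + context


print(build_task_input("emotion", "I can't believe we won!"))
# emotion: I can't believe we won!
```

The returned string can then be passed through the same `tokenizer(...)`, `model.generate(...)` and `tokenizer.decode(...)` calls used above.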