metadata
language:
  - en
tags:
  - question-answering
  - t5
  - compact-model
  - sgarbi
license: apache-2.0
datasets:
  - squad2
  - quac
  - nq
  - stanfordnlp/coqa
  - ibm/duorc
  - squad_v2

Model Card for sgarbi/t5-compact-qa-gen

Model Description

sgarbi/t5-compact-qa-gen is a compact T5-based model that generates question-and-answer pairs from a given text. It was trained with a focus on efficiency and speed, which makes it suitable for deployment on devices with limited computational resources, including CPU-only machines. Training used a novel data formatting approach that simplifies parsing and enhances the model's performance.

Intended Use

This model is intended for a wide range of question-answering tasks, including but not limited to:

  • Generating study materials from educational texts.
  • Enhancing search engines with precise Q&A capabilities.
  • Supporting content creators in generating FAQs.
  • Deploying on edge devices for real-time question answering in various applications.

How to Use

Here is a simple way to use this model with the Transformers library:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sgarbi/t5-compact-qa-gen")
model = AutoModelForSeq2SeqLM.from_pretrained("sgarbi/t5-compact-qa-gen")

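# Prefix the context passage with "INPUT: <qa_builder_context>".
text = "INPUT: <qa_builder_context>Your context here."
inputs = tokenizer(text, return_tensors="pt")
# Pass the attention mask together with the input IDs and set an explicit
# output length; 128 is an arbitrary cap, adjust it to your use case.
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))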
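
The usage example above can be wrapped in small helpers when many passages need to be processed. The following is a minimal sketch, not part of the model's official API: `build_prompt` and `generate_qa` are hypothetical names, the whitespace normalization is an assumption rather than a documented requirement, and the `max_new_tokens` default of 128 is an arbitrary choice. Only the `INPUT: <qa_builder_context>` prefix comes from the example above.

```python
import re

# Prefix taken from the usage example above.
PROMPT_PREFIX = "INPUT: <qa_builder_context>"


def build_prompt(context: str) -> str:
    """Collapse runs of whitespace and prepend the prompt prefix.

    Whitespace normalization is an assumption made for this sketch,
    not a documented requirement of the model.
    """
    cleaned = re.sub(r"\s+", " ", context).strip()
    if not cleaned:
        raise ValueError("context must not be empty")
    return PROMPT_PREFIX + cleaned


def generate_qa(context, tokenizer, model, max_new_tokens=128):
    """Hypothetical helper: run one passage through the model.

    `tokenizer` and `model` are the objects loaded with
    AutoTokenizer / AutoModelForSeq2SeqLM in the example above.
    """
    inputs = tokenizer(build_prompt(context), return_tensors="pt", truncation=True)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

With the tokenizer and model already loaded, a call would look like `generate_qa("Your context here.", tokenizer, model)`.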

Training Data

The model was trained on the following datasets:

  • SQuAD 2.0: A large collection of question and answer pairs based on Wikipedia articles.
  • QuAC: Question Answering in Context, a dataset for modeling, understanding, and participating in information-seeking dialogues.
  • Natural Questions (NQ): A dataset containing real user questions sourced from Google search.

Training Procedure

The model was trained using a novel input and output formatting technique, focusing on generating "shallow" training data for efficient model training. The model's architecture, flan-T5-small, was selected for its balance between performance and computational efficiency. Training involved fine-tuning the model on the specified datasets, utilizing a custom XML-like format for simplifying the data structure.
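
This card does not list the tag names used in the XML-like output format, so the sketch below only illustrates how such output might be parsed. The `parse_qa_pairs` helper and the default `question` / `answer` tag names are hypothetical; inspect the model's actual decoded output and pass the real tag names. If the tags are registered as special tokens in the tokenizer, decoding with `skip_special_tokens=True` would strip them, in which case decoding with `skip_special_tokens=False` may be needed before parsing.

```python
import re


def parse_qa_pairs(decoded, question_tag="question", answer_tag="answer"):
    """Extract (question, answer) tuples from XML-like text.

    The default tag names are placeholders chosen for this sketch;
    they are NOT confirmed by the model card.
    """
    q = re.escape(question_tag)
    a = re.escape(answer_tag)
    pattern = re.compile(
        rf"<{q}>(.*?)</{q}>\s*<{a}>(.*?)</{a}>",
        re.DOTALL,  # allow questions and answers to span multiple lines
    )
    return [(qt.strip(), at.strip()) for qt, at in pattern.findall(decoded)]


# Made-up sample string that follows the assumed tag layout.
sample = "<question>What is the capital of France?</question><answer>Paris</answer>"
print(parse_qa_pairs(sample))
```

Text that does not match the assumed layout yields an empty list, which makes format mismatches easy to detect.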

Evaluation Results

(Include any evaluation metrics and results here to showcase the model's performance on various benchmarks or tasks.)

Limitations and Bias

(Describe any limitations of the model, including potential biases in the training data and areas where the model's performance may be suboptimal.)

Ethical Considerations

(Provide guidance on ethical considerations for users of the model, including appropriate and inappropriate uses.)

Citation

@misc{sgarbi_t5_compact_qa_gen,
  author = {Erick Sgarbi},
  title = {T5 Compact QA Generator},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub}
}