Electra small ⚡ + SQuAD v1 ❓

ELECTRA-small-discriminator fine-tuned on the SQuAD v1.1 dataset for the question answering (Q&A) downstream task.

Details of the downstream task (Q&A) - Model 🧠

ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens from "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieved state-of-the-art results on the SQuAD 2.0 dataset at the time of its release.
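
The discriminator's training signal can be illustrated with a toy sketch. This is not ELECTRA's implementation: the token lists are made up, and `replaced_token_labels` is a hypothetical helper that only shows how per-token "real vs. fake" targets could be derived once a generator has swapped out some input tokens.

```python
def replaced_token_labels(original, corrupted):
    """Return 1 where the corrupted token differs from the original, else 0."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Made-up example: a generator replaced "cooked" with "ate".
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]

labels = replaced_token_labels(original, corrupted)
print(labels)  # [0, 0, 1, 0, 0]
```

The discriminator is trained to predict these labels for every position, so it receives a learning signal from all input tokens rather than only from a masked subset.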

Details of the downstream task (Q&A) - Dataset 📚

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. In v1.1, the version used here, the answer to every question is a segment of text, or span, from the corresponding reading passage; unanswerable questions were only introduced in SQuAD 2.0. SQuAD v1.1 contains 100,000+ question-answer pairs on 500+ articles.
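
To make the span format concrete, here is a sketch of one article entry laid out like the SQuAD v1.1 JSON files. The title and id are hypothetical, and the passage is borrowed from the usage example in this card rather than from the dataset; the point is only that `answer_start` is a character offset into `context`.

```python
# Illustrative entry in the SQuAD v1.1 layout (not taken from the dataset).
record = {
    "title": "Example_article",  # hypothetical title
    "paragraphs": [{
        "context": "A new strain of flu that has the potential to become "
                   "a pandemic has been identified in China by scientists.",
        "qas": [{
            "id": "example-0001",  # hypothetical id
            "question": "What has been identified in China?",
            "answers": [{"text": "A new strain of flu", "answer_start": 0}],
        }],
    }],
}


def iter_answer_spans(article):
    """Yield (question, answer text, start, end) for every answer in an article."""
    for paragraph in article["paragraphs"]:
        context = paragraph["context"]
        for qa in paragraph["qas"]:
            for answer in qa["answers"]:
                start = answer["answer_start"]
                end = start + len(answer["text"])
                # The answer text must be an exact substring of the passage.
                assert context[start:end] == answer["text"]
                yield qa["question"], answer["text"], start, end


spans = list(iter_answer_spans(record))
print(spans)
```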

Model training 🏋️‍

The model was trained on a machine with a Tesla P100 GPU and 25 GB of RAM, using the following command:

python transformers/examples/question-answering/run_squad.py \
  --model_type electra \
  --model_name_or_path 'google/electra-small-discriminator' \
  --do_eval \
  --do_train \
  --do_lower_case \
  --train_file '/content/dataset/train-v1.1.json' \
  --predict_file '/content/dataset/dev-v1.1.json' \
  --per_gpu_train_batch_size 16 \
  --learning_rate 3e-5 \
  --num_train_epochs 10 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir '/content/output' \
  --overwrite_output_dir \
  --save_steps 1000
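
The `--max_seq_length 384` and `--doc_stride 128` flags control how passages longer than the model input are split into overlapping windows. The sketch below mirrors the windowing loop of the original BERT `run_squad.py` script rather than the exact Transformers code path. The question and passage lengths are hypothetical, and the budget of three special tokens (`[CLS]` plus two `[SEP]`) is an assumption.

```python
def doc_windows(n_doc_tokens, n_query_tokens, max_seq_length=384, doc_stride=128):
    """Return (start, length) pairs, one per passage window.

    Assumes three special tokens per input ([CLS], [SEP], [SEP]) and that
    doc_stride is the step between consecutive window starts.
    """
    max_doc_len = max_seq_length - n_query_tokens - 3
    if max_doc_len <= 0:
        raise ValueError("the question leaves no room for the passage")
    windows = []
    start = 0
    while start < n_doc_tokens:
        length = min(n_doc_tokens - start, max_doc_len)
        windows.append((start, length))
        if start + length == n_doc_tokens:
            break  # this window reaches the end of the passage
        start += min(length, doc_stride)
    return windows


# Hypothetical sizes: a 21-token question and a 600-token passage.
print(doc_windows(n_doc_tokens=600, n_query_tokens=21))
# [(0, 360), (128, 360), (256, 344)]
```

Each window is scored separately at prediction time, and the overlap makes it more likely that an answer near a window boundary appears whole in at least one window.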

Dev set results 🧾

Metric    Value
EM        77.70
F1        85.74
Size      50 MB

Very good metrics for such a "small" model!

Full evaluation output:

{
'exact': 77.70104068117313,
'f1': 85.73991234187997,
'total': 10570,
'HasAns_exact': 77.70104068117313,
'HasAns_f1': 85.73991234187997,
'HasAns_total': 10570,
'best_exact': 77.70104068117313,
'best_exact_thresh': 0.0,
'best_f1': 85.73991234187997,
'best_f1_thresh': 0.0
}
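
For readers unfamiliar with the two metrics, the sketch below computes exact match (EM) and token-level F1 for a single prediction. It is modeled on the normalization of the official SQuAD evaluation script (lowercasing, stripping punctuation and the articles a/an/the, collapsing whitespace) but is not that script, and the example strings are made up. The reported scores additionally take the best score over all reference answers for a question and average over the 10,570 questions.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


reference = "A new strain of flu"
print(exact_match("a new strain of flu.", reference))               # 1.0
print(round(f1_score("new flu strain in China", reference), 3))     # 0.667
```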

Model in action 🚀

Fast usage with pipelines:

from transformers import pipeline

QnA_pipeline = pipeline('question-answering', model='mrm8488/electra-small-finetuned-squadv1')
QnA_pipeline({
    'context': 'A new strain of flu that has the potential to become a pandemic has been identified in China by scientists.',
    'question': 'What has been discovered by scientists from China ?'
})

# Output:
{'answer': 'A new strain of flu', 'end': 19, 'score': 0.7950334108113424, 'start': 0}

Created by Manuel Romero/@mrm8488 | LinkedIn

Made with ♥ in Spain
