datasets:
- squad
- squad_v2
widget:
- text: Which name is also used to describe the Amazon rainforest in English?
context: >-
The Amazon rainforest (Portuguese: Floresta Amazônica or Amazônia;
Spanish: Selva Amazónica, Amazonía or usually Amazonia; French: Forêt
amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia
or the Amazon Jungle, is a moist broadleaf forest that covers most of the
Amazon basin of South America. This basin encompasses 7,000,000 square
kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres
(2,100,000 sq mi) are covered by the rainforest. This region includes
territory belonging to nine nations. The majority of the forest is
contained within Brazil, with 60% of the rainforest, followed by Peru with
13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador,
Bolivia, Guyana, Suriname and French Guiana. States or departments in four
nations contain "Amazonas" in their names. The Amazon represents over half
of the planet's remaining rainforests, and comprises the largest and most
biodiverse tract of tropical rainforest in the world, with an estimated
390 billion individual trees divided into 16,000 species.
- text: How many square kilometers of rainforest is covered in the basin?
context: >-
The Amazon rainforest (Portuguese: Floresta Amazônica or Amazônia;
Spanish: Selva Amazónica, Amazonía or usually Amazonia; French: Forêt
amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia
or the Amazon Jungle, is a moist broadleaf forest that covers most of the
Amazon basin of South America. This basin encompasses 7,000,000 square
kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres
(2,100,000 sq mi) are covered by the rainforest. This region includes
territory belonging to nine nations. The majority of the forest is
contained within Brazil, with 60% of the rainforest, followed by Peru with
13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador,
Bolivia, Guyana, Suriname and French Guiana. States or departments in four
nations contain "Amazonas" in their names. The Amazon represents over half
of the planet's remaining rainforests, and comprises the largest and most
biodiverse tract of tropical rainforest in the world, with an estimated
390 billion individual trees divided into 16,000 species.
language:
- en
- hi
metrics:
- accuracy
pipeline_tag: question-answering
avishkaarak-ekta-hindi
avishkaarak-ekta-hindi is a question answering model fine-tuned on the SQuAD 2.0 dataset. It was trained on question-answer pairs, including unanswerable questions, for extractive question answering.
Overview
Language model: avishkaarak-ekta-hindi
Language: English, Hindi (upcoming)
Downstream-task: Extractive QA
Training data: SQuAD 2.0
Eval data: SQuAD 2.0
Code: See an example QA pipeline on Haystack
Infrastructure: 4x Tesla V100
Hyperparameters
batch_size = 4
n_epochs = 50
base_LM_model = "roberta-base"
max_seq_len = 512
learning_rate = 9e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
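To make the interaction between learning_rate, lr_schedule and warmup_proportion concrete, here is a minimal sketch of a linear warmup schedule. It assumes that the rate decays linearly to zero once warmup ends, which is a common convention but is not stated in the list above. The function name `linear_warmup_lr` and the `total_steps` value are illustrative only and do not come from the training code.

```python
def linear_warmup_lr(step, total_steps, base_lr=9e-5, warmup_proportion=0.2):
    """Learning rate at a given optimizer step.

    Assumption: linear ramp from 0 to base_lr during warmup, then
    linear decay back to 0 at total_steps.
    """
    warmup_steps = int(total_steps * warmup_proportion)
    if warmup_steps > 0 and step < warmup_steps:
        # Warmup phase: scale the base rate by the fraction of warmup completed.
        return base_lr * step / warmup_steps
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(total_steps - step, 0)
    return base_lr * remaining / max(total_steps - warmup_steps, 1)


# Hypothetical run of 1000 optimizer steps; the first 200 are warmup.
total_steps = 1000
for step in (0, 100, 200, 600, 1000):
    print(step, linear_warmup_lr(step, total_steps))
```

With warmup_proportion = 0.2, the peak rate of 9e-5 is reached after the first fifth of training under these assumptions.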
Usage
In Haystack
Haystack is an NLP framework by deepset. You can use this model in a Haystack pipeline to do question answering at scale (over many documents). To load the model in Haystack:
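# Import path for Haystack 1.x (an assumption; adjust for your installed version)
from haystack.nodes import FARMReader, TransformersReader

reader = FARMReader(model_name_or_path="AVISHKAARAM/avishkaarak-ekta-hindi")
# or
reader = TransformersReader(model_name_or_path="AVISHKAARAM/avishkaarak-ekta-hindi", tokenizer="deepset/roberta-base-squad2")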
For a complete example of AVISHKAARAM/avishkaarak-ekta-hindi being used for question answering, check out the tutorials in the Haystack documentation.
In Transformers
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
model_name = "AVISHKAARAM/avishkaarak-ekta-hindi"
# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
}
res = nlp(QA_input)
# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
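The question-answering pipeline returns a dictionary with the keys `score`, `start`, `end` and `answer`, where `start` and `end` are character offsets into the context. The sketch below shows one way to read such a result. The helper `summarize_prediction` is a hypothetical name, and the sample dictionary is hand-written for illustration, not real model output. It also assumes that an empty `answer` string means the model predicted "no answer", which the pipeline can return when called with `handle_impossible_answer=True`.

```python
def summarize_prediction(res, context):
    """Format a question-answering pipeline result as one readable line."""
    answer = res["answer"]
    if not answer:
        # Assumption: an empty string signals an unanswerable question.
        return "no answer found"
    start, end = res["start"], res["end"]
    # The offsets are expected to point at the answer inside the context.
    span_matches = context[start:end] == answer
    return (
        f"{answer!r} (score={res['score']:.2f}, "
        f"chars {start}-{end}, span_matches={span_matches})"
    )


context = (
    "The option to convert models between FARM and transformers gives "
    "freedom to the user and lets people easily switch between frameworks."
)
answer_text = "gives freedom to the user"
start = context.index(answer_text)

# Hand-written stand-in for `res`; the score is made up.
sample_res = {
    "score": 0.5,
    "start": start,
    "end": start + len(answer_text),
    "answer": answer_text,
}
print(summarize_prediction(sample_res, context))
```

In real use, pass the `res` returned by `nlp(QA_input)` together with `QA_input['context']`.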
Performance
Evaluated on the SQuAD 2.0 dev set with the official eval script.
"exact": 79.87029394424324,
"f1": 82.91251169582613,
"total": 11873,
"HasAns_exact": 77.93522267206478,
"HasAns_f1": 84.02838248389763,
"HasAns_total": 5928,
"NoAns_exact": 81.79983179142137,
"NoAns_f1": 81.79983179142137,
"NoAns_total": 5945
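As a quick consistency check, the overall scores should equal the example-weighted average of the HasAns and NoAns scores. The short sketch below recomputes them from the figures reported above; the variable and function names are illustrative.

```python
# Figures copied from the evaluation results above.
results = {
    "exact": 79.87029394424324,
    "f1": 82.91251169582613,
    "total": 11873,
    "HasAns_exact": 77.93522267206478,
    "HasAns_f1": 84.02838248389763,
    "HasAns_total": 5928,
    "NoAns_exact": 81.79983179142137,
    "NoAns_f1": 81.79983179142137,
    "NoAns_total": 5945,
}


def weighted_overall(metric):
    """Combine the HasAns and NoAns scores, weighted by example count."""
    has_ans = results[f"HasAns_{metric}"] * results["HasAns_total"]
    no_ans = results[f"NoAns_{metric}"] * results["NoAns_total"]
    return (has_ans + no_ans) / results["total"]


overall_exact = weighted_overall("exact")
overall_f1 = weighted_overall("f1")
print(f"exact: {overall_exact:.4f}  f1: {overall_f1:.4f}")
```

NoAns_exact and NoAns_f1 are identical because an unanswerable question is scored as either fully right or fully wrong.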
Authors
Shashwat Bindal: optimus.coders.@ai
Sanoj: optimus.coders.@ai