ashaduzzaman
committed on
Update README.md
Browse files

README.md CHANGED

model-index:
- name: distilbert-finetuned-squad
  results: []
datasets:
- rajpurkar/squad
language:
- en
metrics:
- f1
- exact_match
library_name: transformers
pipeline_tag: question-answering
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbert-finetuned-squad

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) for the question-answering task. The model has been adapted to extract answers from context passages based on input questions.

## Model description

`distilbert-finetuned-squad` is built on DistilBERT, a distilled version of BERT, and fine-tuned for extractive question answering. Distillation makes the model smaller and faster while retaining much of the original model's performance, and this fine-tuned variant is specifically adapted to extracting answer spans from a given context passage.

## Intended uses & limitations

### Intended Uses

- **Question Answering:** This model is designed to answer questions based on a given context. It can be used in applications such as chatbots, customer support systems, and interactive question-answering systems.
- **Information Retrieval:** The model can help extract specific information from large text corpora, making it useful for applications in search engines and content summarization (see the sketch after this list).

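As a quick illustration of this retrieval-style use, the snippet below runs the question-answering pipeline over several candidate passages and keeps the answer with the highest confidence score. It is a sketch rather than an official recipe: the passages are toy examples, and the checkpoint name is the one used elsewhere in this card.

```python
from transformers import pipeline

# Build the QA pipeline once and reuse it for every passage.
question_answerer = pipeline(
    "question-answering",
    model="Ashaduzzaman/distilbert-finetuned-squad",
)

question = "What is the capital of France?"
passages = [
    "France is a country in Western Europe.",
    "The capital of France is Paris.",
]

# Score every passage and keep the best-supported answer.
results = [question_answerer(question=question, context=p) for p in passages]
best = max(results, key=lambda r: r["score"])
print(best["answer"], best["score"])
```
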
### Example Usage

Here is a code snippet to load the fine-tuned model and perform question answering:

```python
from transformers import pipeline

# Load the fine-tuned model for question answering
model_checkpoint = "Ashaduzzaman/distilbert-finetuned-squad"

question_answerer = pipeline(
    "question-answering",
    model=model_checkpoint,
)

# Perform question answering on the provided question and context
question = "What is the capital of France?"
context = "The capital of France is Paris."
result = question_answerer(question=question, context=context)

print(result['answer'])
```

This code demonstrates how to load the model using the `transformers` library and perform question answering with a sample question and context.

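If you prefer to work below the pipeline abstraction, the same extraction can be sketched with the tokenizer and model classes directly, reading the start and end logits and decoding the most likely answer span. This is a minimal illustration (single example, no handling of long or truncated contexts) and assumes the same checkpoint name as above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_checkpoint = "Ashaduzzaman/distilbert-finetuned-squad"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(model_checkpoint)

question = "What is the capital of France?"
context = "The capital of France is Paris."

# Encode the question/context pair and run a forward pass.
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The most likely answer span is given by the argmax of the start and end logits.
start_index = int(outputs.start_logits.argmax())
end_index = int(outputs.end_logits.argmax())
answer_ids = inputs["input_ids"][0][start_index : end_index + 1]
print(tokenizer.decode(answer_ids))
```
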
### Limitations

- **Dataset Bias:** The model's performance depends on the quality and diversity of the dataset it was fine-tuned on, and biases in that data can carry over into its predictions.
- **Context Limitation:** The model may struggle with very long context passages or contexts with complex structure (see the windowing sketch after this list).
- **Generalization:** While the model is fine-tuned for question answering, it may not perform well on questions that require knowledge beyond the provided context or reasoning across multiple contexts.

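For contexts longer than the model's maximum input length, the question-answering pipeline can split the text into overlapping windows instead of silently truncating it. The sketch below uses the pipeline's `max_seq_len` and `doc_stride` arguments with commonly used values; these are illustrative settings, not values taken from this model's training.

```python
from transformers import pipeline

question_answerer = pipeline(
    "question-answering",
    model="Ashaduzzaman/distilbert-finetuned-squad",
)

# Stand-in for a document far longer than one model window.
long_context = " ".join(["The capital of France is Paris."] * 200)

result = question_answerer(
    question="What is the capital of France?",
    context=long_context,
    max_seq_len=384,  # maximum tokens per window (question + context)
    doc_stride=128,   # overlap between consecutive windows
)
print(result["answer"], result["score"])
```
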
## Training and evaluation data

The card metadata lists the SQuAD dataset (`rajpurkar/squad`) as the training data, a standard extractive question-answering corpus of questions paired with context passages. Beyond that, the exact split, preprocessing, and evaluation data are not specified:
- **Type:** Extractive question answering
- **Source:** `rajpurkar/squad` (per the card metadata)
- **Size:** Not specified

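Assuming the dataset id from the metadata, the data can be pulled down with the `datasets` library for inspection or re-training. This is only a convenience sketch; it does not reproduce the exact preprocessing used for fine-tuning.

```python
from datasets import load_dataset

# Load the SQuAD dataset referenced in the card metadata.
squad = load_dataset("rajpurkar/squad")

print(squad)  # available splits and their sizes
print(squad["train"][0]["question"])
print(squad["train"][0]["answers"])
```
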
## Training procedure
### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):
- **learning_rate:** 2e-05
- **train_batch_size:** 8
- **eval_batch_size:** 8
- **seed:** 42
- **optimizer:** Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **lr_scheduler_type:** linear
- **num_epochs:** 1
- **mixed_precision_training:** Native AMP

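For reference, a minimal `TrainingArguments` configuration matching the values above could look like the following. It is a sketch under stated assumptions: the output directory is a placeholder, and the surrounding training script (dataset preprocessing, `Trainer` setup) is not shown.

```python
from transformers import TrainingArguments

# Hypothetical TrainingArguments mirroring the hyperparameters listed above.
# Adam betas and epsilon match the library defaults (0.9, 0.999, 1e-08).
training_args = TrainingArguments(
    output_dir="distilbert-finetuned-squad",  # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,  # native AMP mixed-precision training
)
```
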
### Training results

The performance metrics and evaluation results of the fine-tuned model are not specified. It is recommended to evaluate the model on your specific use case to determine its effectiveness.

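If you need numbers for your own data, the exact-match and F1 metrics listed in the card metadata can be computed with the `evaluate` library (not listed among the framework versions below, so treat that dependency as an assumption). A minimal sketch on a toy prediction/reference pair:

```python
import evaluate

# SQuAD-style exact match and F1 on a single toy example.
squad_metric = evaluate.load("squad")

predictions = [{"id": "0", "prediction_text": "Paris"}]
references = [
    {"id": "0", "answers": {"text": ["Paris"], "answer_start": [25]}},
]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```
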
## Framework versions

- **Transformers:** 4.42.4
- **Pytorch:** 2.3.1+cu121
- **Datasets:** 2.21.0
- **Tokenizers:** 0.19.1