language:
- en
---

# EconoBert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [BIS_Speeches_97_23](https://huggingface.co/datasets/samchain/BIS_Speeches_97_23) dataset.
It achieves the following results on the test set:

- Accuracy on the MLM task: 73%
- Accuracy on the NSP task: 95%

## Model description

The model is a simple fine-tuning of bert-base-uncased on a dataset specific to the domain of economics. It keeps the same architecture and vocabulary, so no call to resize_token_embeddings was required.
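
For illustration, here is a minimal sketch of loading the model with the 🤗 Transformers library. The repository id `samchain/EconoBert` is an assumption inferred from the dataset namespace, not something stated in this card; replace it with the actual model id.

```python
from transformers import AutoTokenizer, TFAutoModel

# Hypothetical repository id (inferred from the dataset namespace); replace with the real one.
model_id = "samchain/EconoBert"

# The vocabulary is the same as bert-base-uncased, which is why no
# resize_token_embeddings call was needed during fine-tuning.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModel.from_pretrained(model_id)

inputs = tokenizer("Central banks raised rates to curb inflation.", return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```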

## Intended uses & limitations

This model should be used as a backbone for NLP tasks in the domains of economics, politics, and finance.
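
As one example of the backbone use case, the sketch below fine-tunes the model for sequence classification with Keras. The repository id, label names, and toy batch are assumptions for illustration only, not part of this model's training.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Hypothetical model id and labels, for illustration only.
model_id = "samchain/EconoBert"
labels = ["monetary_policy", "financial_stability"]

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForSequenceClassification.from_pretrained(model_id, num_labels=len(labels))

# A tiny toy batch; in practice you would tokenize a full training set.
texts = ["The central bank cut its policy rate.", "Bank capital buffers were strengthened."]
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="tf")

# With no loss passed to compile(), the Transformers TF model uses its built-in loss.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5))
model.fit(dict(enc), tf.constant([0, 1]), epochs=1, batch_size=2)
```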

## Training and evaluation data

The fine-tuning dataset is [BIS_Speeches_97_23](https://huggingface.co/datasets/samchain/BIS_Speeches_97_23).

The dataset contains 773k sentence pairs, half of them negative (sequence A and B are unrelated) and half positive (sequence B follows sequence A).

The test set contains 136k pairs.
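
A minimal sketch of inspecting the dataset with the 🤗 Datasets library; the split and column names are assumptions, so check the dataset card for the actual schema.

```python
from datasets import load_dataset

# Split and column names are assumptions; see the dataset card for the actual schema.
ds = load_dataset("samchain/BIS_Speeches_97_23")

print(ds)              # available splits and sizes (~773k training pairs, ~136k test pairs)
print(ds["train"][0])  # one sentence pair with its next-sentence label
```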

## Training procedure

The model was fine-tuned for 2 epochs with a batch size of 64 and a sequence length of 128, using the Adam optimizer with a learning rate of 1e-5.
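
The following is a hedged sketch of that setup (2 epochs, batch size 64, sequence length 128, Adam at 1e-5) using Keras and `TFBertForPreTraining`. The data pipeline is a placeholder, not the author's actual training script.

```python
import tensorflow as tf
from transformers import BertTokenizerFast, TFBertForPreTraining

# Hyperparameters stated in this card.
EPOCHS = 2
BATCH_SIZE = 64
MAX_LEN = 128
LEARNING_RATE = 1e-5

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = TFBertForPreTraining.from_pretrained("bert-base-uncased")

# `train_dataset` is a placeholder tf.data.Dataset yielding MLM + NSP features
# (input_ids, attention_mask, token_type_ids, labels, next_sentence_label),
# e.g. built from the BIS_Speeches_97_23 pairs with a data collator.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE))
# model.fit(train_dataset.batch(BATCH_SIZE), epochs=EPOCHS)
```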

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: Adam
- learning_rate: 1e-5
- train_batch_size: 64
- max_sequence_length: 128
- num_epochs: 2

### Training results

The final training loss is 1.6046 on the training set and 1.47 on the test set.

### Framework versions