samchain committed on
Commit
68470dc
·
1 Parent(s): f5e5c85

Update README.md

  1. README.md +14 -8
README.md CHANGED
@@ -12,29 +12,34 @@ language:
12
  - en
13
  ---
14
 
15
- <!-- This model card has been generated automatically according to the information Keras had access to. You should
16
- probably proofread and complete it, then remove this comment. -->
17
-
18
  # EconoBert
19
 
20
- This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
21
- It achieves the following results on the evaluation set:
22
 
 
 
23
 
24
  ## Model description
25
 
26
- More information needed
27
 
28
  ## Intended uses & limitations
29
 
30
- More information needed
31
 
32
  ## Training and evaluation data
33
 
34
- More information needed
 
 
 
 
35
 
36
  ## Training procedure
37
 
 
 
38
  ### Training hyperparameters
39
 
40
  The following hyperparameters were used during training:
@@ -43,6 +48,7 @@ The following hyperparameters were used during training:
43
 
44
  ### Training results
45
 
 
46
 
47
 
48
  ### Framework versions
 
12
  - en
13
  ---
14
 
 
 
 
15
  # EconoBert
16
 
17
+ This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [BIS_Speeches_97_23](https://huggingface.co/datasets/samchain/BIS_Speeches_97_23) dataset.
18
+ It achieves the following results on the test set:
19
 
20
+ - Accuracy for MLM task: 73%
21
+ - Accuracy for NSP task: 95%
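
The card does not say how the MLM accuracy was computed. A common convention, assumed here and not confirmed by the author, is to score only the masked positions and to mark every other position with a label of `-100`. The helper name `masked_accuracy` and the token ids below are purely illustrative:

```python
IGNORE_INDEX = -100  # assumed marker for positions that were not masked


def masked_accuracy(predicted_ids, label_ids, ignore_index=IGNORE_INDEX):
    """Fraction of masked positions where the predicted token id equals the label."""
    correct = 0
    total = 0
    for pred_row, label_row in zip(predicted_ids, label_ids):
        for pred, label in zip(pred_row, label_row):
            if label == ignore_index:
                continue  # unmasked position: not scored
            total += 1
            correct += int(pred == label)
    return correct / total if total else 0.0


# Two toy sequences with made-up token ids; three positions are masked in total.
preds = [[101, 2003, 1037, 102], [101, 3976, 2024, 102]]
labels = [[-100, 2003, -100, -100], [-100, 4277, 2024, -100]]
print(masked_accuracy(preds, labels))  # 2 of the 3 masked positions match
```

Under this convention, the reported 73% would mean that roughly three out of four masked tokens in the test set were predicted exactly.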
22
 
23
  ## Model description
24
 
25
+ The model is a straightforward fine-tuning of base BERT on a dataset specific to the economics domain. It keeps the same architecture, and no call to `resize_token_embeddings` was required.
26
 
27
  ## Intended uses & limitations
28
 
29
+ This model is intended as a backbone for NLP tasks in the domains of economics, politics, and finance.
30
 
31
  ## Training and evaluation data
32
 
33
+ The dataset used for fine-tuning is: https://huggingface.co/datasets/samchain/BIS_Speeches_97_23
34
+
35
+ The dataset consists of 773k sentence pairs: half are negative pairs (sequences A and B are unrelated) and the other half are positive pairs (sequence B follows sequence A).
36
+
37
+ The test set is made of 136k pairs.
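
The pair layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `build_nsp_pairs`, the alternating positive/negative scheme, and the toy sentences are hypothetical, and the actual dataset may have been built differently.

```python
import random


def build_nsp_pairs(sentences, seed=0):
    """Return (sentence_a, sentence_b, is_next) triples from an ordered list.

    Even positions yield a positive pair (B really follows A); odd positions
    yield a negative pair (B is sampled from elsewhere in the list).
    Requires at least three sentences so a negative candidate always exists.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(len(sentences) - 1):
        a = sentences[i]
        if i % 2 == 0:
            pairs.append((a, sentences[i + 1], True))
        else:
            # Exclude A itself and its true successor from the candidates.
            candidates = [j for j in range(len(sentences)) if j not in (i, i + 1)]
            pairs.append((a, sentences[rng.choice(candidates)], False))
    return pairs


# Toy sentences, invented for the example.
speech = [
    "Inflation has eased.",
    "Rates remain unchanged.",
    "Growth is moderate.",
    "Risks are balanced.",
    "We remain vigilant.",
]
for a, b, is_next in build_nsp_pairs(speech):
    print(is_next, "|", a, "->", b)
```

Alternating by position gives the roughly 50/50 split between positive and negative pairs that the card describes.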
38
 
39
  ## Training procedure
40
 
41
+ The model was fine-tuned for 2 epochs with a batch size of 64 and a sequence length of 128. I used the Adam optimizer with a learning rate of 1e-5.
42
+
43
  ### Training hyperparameters
44
 
45
  The following hyperparameters were used during training:
 
48
 
49
  ### Training results
50
 
51
+ The loss is 1.6046 on the training set and 1.47 on the test set.
52
 
53
 
54
  ### Framework versions