language:
- en
---

# EconoBert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [BIS_Speeches_97_23](https://huggingface.co/datasets/samchain/BIS_Speeches_97_23) dataset.
It achieves the following results on the test set:

- Accuracy on the MLM task: 73%
- Accuracy on the NSP task: 95%

## Model description

The model is a simple fine-tuning of bert-base-uncased on a dataset specific to the domain of economics. It keeps the same architecture and vocabulary, so no call to resize_token_embeddings was required.
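
For illustration, here is a minimal sketch of loading the model with the 🤗 Transformers library. The repository id `samchain/EconoBert` is an assumption inferred from the dataset namespace, not something stated in this card; replace it with the actual model id.

```python
from transformers import AutoTokenizer, TFAutoModel

# Hypothetical repository id (inferred from the dataset namespace); replace with the real one.
model_id = "samchain/EconoBert"

# The vocabulary is the same as bert-base-uncased, which is why no
# resize_token_embeddings call was needed during fine-tuning.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModel.from_pretrained(model_id)

inputs = tokenizer("Central banks raised rates to curb inflation.", return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```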

## Intended uses & limitations

This model should be used as a backbone for NLP tasks in the domains of economics, politics, and finance.
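
As one example of the backbone use case, the sketch below fine-tunes the model for sequence classification with Keras. The repository id, label names, and toy batch are assumptions for illustration only, not part of this model's training.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Hypothetical model id and labels, for illustration only.
model_id = "samchain/EconoBert"
labels = ["monetary_policy", "financial_stability"]

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForSequenceClassification.from_pretrained(model_id, num_labels=len(labels))

# A tiny toy batch; in practice you would tokenize a full training set.
texts = ["The central bank cut its policy rate.", "Bank capital buffers were strengthened."]
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="tf")

# With no loss passed to compile(), the Transformers TF model uses its built-in loss.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5))
model.fit(dict(enc), tf.constant([0, 1]), epochs=1, batch_size=2)
```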

## Training and evaluation data

The fine-tuning dataset is [BIS_Speeches_97_23](https://huggingface.co/datasets/samchain/BIS_Speeches_97_23).

The dataset contains 773k sentence pairs, half of them negative (sequence A and B are unrelated) and half positive (sequence B follows sequence A).

The test set contains 136k pairs.
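
A minimal sketch of inspecting the dataset with the 🤗 Datasets library; the split and column names are assumptions, so check the dataset card for the actual schema.

```python
from datasets import load_dataset

# Split and column names are assumptions; see the dataset card for the actual schema.
ds = load_dataset("samchain/BIS_Speeches_97_23")

print(ds)              # available splits and sizes (~773k training pairs, ~136k test pairs)
print(ds["train"][0])  # one sentence pair with its next-sentence label
```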

## Training procedure

The model was fine-tuned for 2 epochs with a batch size of 64 and a sequence length of 128, using the Adam optimizer with a learning rate of 1e-5.
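
The following is a hedged sketch of that setup (2 epochs, batch size 64, sequence length 128, Adam at 1e-5) using Keras and `TFBertForPreTraining`. The data pipeline is a placeholder, not the author's actual training script.

```python
import tensorflow as tf
from transformers import BertTokenizerFast, TFBertForPreTraining

# Hyperparameters stated in this card.
EPOCHS = 2
BATCH_SIZE = 64
MAX_LEN = 128
LEARNING_RATE = 1e-5

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = TFBertForPreTraining.from_pretrained("bert-base-uncased")

# `train_dataset` is a placeholder tf.data.Dataset yielding MLM + NSP features
# (input_ids, attention_mask, token_type_ids, labels, next_sentence_label),
# e.g. built from the BIS_Speeches_97_23 pairs with a data collator.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE))
# model.fit(train_dataset.batch(BATCH_SIZE), epochs=EPOCHS)
```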

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: Adam
- learning_rate: 1e-5
- train_batch_size: 64
- max_sequence_length: 128
- num_epochs: 2

### Training results

The final training loss is 1.6046 on the training set and 1.47 on the test set.

### Framework versions