Update README.md
README.md CHANGED
@@ -37,8 +37,8 @@ This model is served to you by [Kaspar von Beelen](https://huggingface.co/Kaspar
 - [Background: MDMA to the rescue 🙂](#background-mdma-to-the-rescue-%F0%9F%99%82)
 - [Intended Use: LMs as History Machines](#intended-use-lms-as-history-machines)
 - [Historical Language Change: Her/His Majesty? 👑](#historical-language-change-herhis-majesty-%F0%9F%91%91)
-- [Date Prediction: Pub Quiz with LMs 🍻](#date-prediction)
-- [Limitations: Not all is well 😮](#limitations)
+- [Date Prediction: Pub Quiz with LMs 🍻](#date-prediction-pub-quiz-with-lms-%F0%9F%8D%BB)
+- [Limitations: Not all is well 😮](#limitations-not-all-is-well-%F0%9F%98%AE)
 - [Training Data](#training-data)
 - [Training Routine](#training-routine)
 - [Data Description](#data-description)
@@ -128,7 +128,7 @@ Firstly, eyeballing some toy examples (but also using more rigorous metrics such

 Secondly, MDMA may reduce biases induced by imbalances in the training data (or at least give us more of a handle on this problem). Admittedly, we have to prove this more formally, but some experiments at least hint in this direction. The data used for training is highly biased towards the Victorian age, and a standard language model trained on this corpus will predict "her" for `"[MASK] Majesty"`.

-### Date Prediction: Pub Quiz with LMs
+### Date Prediction: Pub Quiz with LMs 🍻

 Another feature of the ERWT model series is date prediction. Remember that during training the temporal metadata token is often masked. In this case, the model effectively learns to situate documents in time based on the tokens they contain.

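For readers who want to try the `"[MASK] Majesty"` probe mentioned in the changed section, a minimal sketch with the Hugging Face `fill-mask` pipeline follows. The checkpoint name `Livingwithmachines/erwt-year` and the `"<year> [DATE] <text>"` prompt format are assumptions drawn from the ERWT model series conventions, not from this diff:

```python
# Minimal sketch of the "[MASK] Majesty" probe. The checkpoint name and the
# "<year> [DATE] <text>" prompt format are assumptions, not part of this commit.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Livingwithmachines/erwt-year")

# Unconditioned prompt: a corpus skewed towards the Victorian age tends to
# rank "her" first.
for pred in fill_mask("[MASK] Majesty")[:3]:
    print(f'{pred["token_str"]:>5}  {pred["score"]:.3f}')

# Prepending a year before Victoria's accession (1837) should shift the
# ranking towards "his".
for pred in fill_mask("1810 [DATE] [MASK] Majesty")[:3]:
    print(f'{pred["token_str"]:>5}  {pred["score"]:.3f}')
```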
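The date-prediction behaviour described in the renamed section can be sketched the same way, by masking the temporal metadata slot itself (same assumptions as above):

```python
# Date-prediction sketch: mask the temporal metadata token and read the top
# predictions as guesses for the year of publication. Checkpoint name and
# "<year> [DATE] <text>" format are assumptions, as above.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Livingwithmachines/erwt-year")

sentence = "[MASK] [DATE] The Queen's speech was read in both Houses of Parliament."
for pred in fill_mask(sentence)[:5]:
    print(pred["token_str"], f'{pred["score"]:.3f}')
```

Four-digit years are typically single tokens in BERT-style vocabularies, so the `token_str` values can usually be read directly as candidate years.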