Update README.md
README.md
CHANGED
@@ -160,7 +160,7 @@ Secondly, we could use it as an analytical tool, to study how temporal variation
 
 The ERWT series were trained for evaluation purposes, and therefore carry some critical limitations.
 
-###
+### Training Data
 
 Many of the limitations are a direct result of the data. ERWT models are trained on a rather small subsample of nineteenth-century British newspapers, and its predictions have to be understood in this context (remember, Her Majesty?). Moreover, the corpus has a strong Metropolitan and liberal bias (see section on Data Description for more information).
 
@@ -178,7 +178,7 @@ did much better. 🎉🥳
 
 Want to know how much, then read our paper!
 
-##
+## Data Description
 
 The ERWT models are trained on an openly accessible newspaper corpus created by the [Heritage Made Digital (HMD) newspaper digitisation project](footnote{https://blogs.bl.uk/thenewsroom/2019/01/heritage-made-digital-the-newspapers.html).
 The HMD newspapers comprise around 2 billion words in total, but the bulk of the articles originate from the (then) liberal paper *The Sun*.