Update README.md
README.md
CHANGED
@@ -63,6 +63,7 @@ The model was pretrained on the mix of three text sources:
 - self-crawled Czech news dataset (20GB),
 - Czech part Wikipedia (1GB).
 
+The model was pretrained for 500k steps (over 15 epochs over the full dataset) with a peak learning rate of 4e-4.
 
 ## Paper
 https://link.springer.com/chapter/10.1007/978-3-030-89579-2_3