Jón Daðason
committed on
Commit · fb032b4
1 Parent(s): 35c6d2a
Updated README.md
README.md CHANGED
@@ -8,14 +8,14 @@ license: cc-by-4.0
 datasets:
 - igc
 - ic3
--
+- jonfd/ICC
 - mc4
 ---
 
 # Nordic ELECTRA-Small
 This model was pretrained on the following corpora:
 * The [Icelandic Gigaword Corpus](http://igc.arnastofnun.is/) (IGC)
-* The
+* The Icelandic Common Crawl Corpus (IC3)
 * The [Icelandic Crawled Corpus](https://huggingface.co/datasets/jonfd/ICC) (ICC)
 * The [Multilingual Colossal Clean Crawled Corpus](https://huggingface.co/datasets/mc4) (mC4) - Icelandic, Norwegian, Swedish and Danish text obtained from .is, .no, .se and .dk domains, respectively
 