add links to resources in the datasheet
README.md
CHANGED
@@ -474,8 +474,8 @@ The original raw data was not kept.
 
 **Is the software that was used to preprocess/clean/label the data available? If so, please provide a link or other access point.**
 
-Yes, the preprocessing and filtering software is open-sourced. The CURATE pipeline was used for Spanish Crawling and CATalog,
-Ungoliant pipeline was used for the OSCAR project.
+Yes, the preprocessing and filtering software is open-sourced. The [CURATE](https://github.com/langtech-bsc/CURATE) pipeline was used for Spanish Crawling and CATalog,
+and the [Ungoliant](https://github.com/oscar-project/ungoliant) pipeline was used for the OSCAR project.
 
 #### Uses
 