Hi everyone.
After creating a dataset consisting of all my data, I split it into train/validation/test sets. I then perform a number of preprocessing steps on each of them and end up with three altered datasets, each of type datasets.arrow_dataset.Dataset.
To save them and later load the preprocessed datasets directly, would I have to call
dataset.save_to_disk(FILE_PATH)
three times, once each for the training, validation and test sets? Or is there a way to save them all together? If so, which approach is more efficient?
Thanks in advance.