dataproc5

classroom

What is this?

A data-processing pipeline that uses Hugging Face datasets as an intermediate data store.

Metadata is designed to be updated like a DAG: some entries depend on others, so a change to one entry propagates to the entries that depend on it.
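
To make the DAG idea concrete, here is a minimal sketch using Python's standard `graphlib`. The entry names (`raw`, `cleaned`, and so on) and both helper functions are hypothetical and only illustrate the dependency pattern; they are not the pipeline's actual metadata schema.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata entries: each key maps to the set of entries it depends on.
dependencies = {
    "raw": set(),
    "cleaned": {"raw"},
    "tokenized": {"cleaned"},
    "stats": {"cleaned"},
    "report": {"tokenized", "stats"},
}


def update_order(deps):
    """Return an order in which every entry comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def downstream_of(deps, changed):
    """Return all entries that depend, directly or transitively, on `changed`."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for node, parents in deps.items():
            if current in parents and node not in affected:
                affected.add(node)
                frontier.append(node)
    return affected


order = update_order(dependencies)               # a valid refresh order
stale = downstream_of(dependencies, "cleaned")   # entries to refresh after "cleaned" changes
```

With this layout, a change to one entry only requires refreshing the entries in `stale`, processed in the order given by `order`.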

Workflows are being built up gradually, and we may eventually see hundreds of data repos.

How do I use it?

A tool for loading files from local storage, Hugging Face, and S3 is under development.
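
Since that tool is not yet available, the following is only a sketch of how a loader might pick a backend from a file location. The function name `detect_backend` and the returned labels are assumptions, not the tool's API. The `s3://` and `hf://` prefixes follow the URL conventions commonly used for S3 and Hugging Face Hub paths.

```python
from urllib.parse import urlparse


def detect_backend(location: str) -> str:
    """Classify a file location as 'local', 'huggingface', or 's3' (hypothetical labels)."""
    scheme = urlparse(location).scheme
    if scheme == "s3":
        return "s3"
    if scheme == "hf":
        return "huggingface"
    if scheme in ("", "file"):
        # Plain relative or absolute POSIX paths carry no scheme.
        return "local"
    raise ValueError(f"Unsupported location: {location!r}")


backends = [
    detect_backend("data/train.parquet"),
    detect_backend("hf://datasets/some-org/some-repo/train.parquet"),
    detect_backend("s3://some-bucket/train.parquet"),
]
```

The repo and bucket names above are placeholders. Windows drive paths such as `C:\data` would need extra handling, because `urlparse` reads the drive letter as a scheme.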
