BigScience Large Language Model Training
Training a multilingual 176-billion-parameter model in the open
BigScience is an open and collaborative workshop around the study and creation of very large language models, gathering more than 1000 researchers around the world. You can find more information on the main website at https://bigscience.huggingface.co.
The training of BigScience's main model started on March 11, 2022 at 11:42am PST and will continue for 3-4 months on 384 A100 80GB GPUs of the Jean Zay public supercomputer. You can follow the training at https://twitter.com/BigScienceLLM or on the Tensorboards tab above.
More information on the model, dataset, hardware, and environmental considerations:
The model
The dataset
The engineering side
Environmental considerations