Number of elements gathered in a single allgather operation. This caps the memory required for the allgather when training large models. Smaller values use less GPU memory but slow down training and validation.
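
The trade-off can be illustrated with a rough back-of-the-envelope sketch. It assumes a simplified model in which the bucket size caps the number of elements held in the gather buffer at once; the `allgather_plan` helper, the 7B parameter count, and the 2 bytes per element are hypothetical and are not part of DeepSpeed or H2O LLM Studio.

```python
import math


def allgather_plan(num_params: int, bucket_size: int, bytes_per_element: int = 2):
    """Estimate allgather rounds and buffer size under a simplified model.

    Assumption: each round gathers at most `bucket_size` elements, so the
    buffer never holds more than that many elements at once.
    """
    rounds = math.ceil(num_params / bucket_size)
    buffer_bytes = min(num_params, bucket_size) * bytes_per_element
    return rounds, buffer_bytes


# Hypothetical 7B-parameter model stored in 16-bit precision.
for bucket in (500_000_000, 100_000_000):
    rounds, buffer_bytes = allgather_plan(7_000_000_000, bucket)
    print(f"bucket={bucket:,} rounds={rounds} buffer={buffer_bytes / 2**20:.0f} MiB")
```

Under these assumptions, shrinking the bucket by a factor of five shrinks the buffer by the same factor but requires five times as many allgather rounds, which is where the slowdown comes from.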