Number of elements reduced/allreduced at a time. Limits the memory required for the allgather for large model sizes. Smaller values use less memory, but slow down training and validating. |
Number of elements reduced/allreduced at a time. Limits the memory required for the allgather for large model sizes. Smaller values use less memory, but slow down training and validating. |