fix packing so that concatenated sequences reset the attention 9b8585d winglian committed on May 31, 2023
casts the prepared data to int16 (doesn't help with training memory) 2db9436 winglian committed on Apr 18, 2023