use locale agnostic separator to make large nums easier to read (#1503) da9b1a3 unverified winglian committed on Apr 9, 2024
fix for accelerate env var for auto bf16, add new base image and expand torch_cuda_arch_list support (#1413) da265dd unverified winglian committed on Mar 26, 2024
Fix falcon tokenization step (#1441) [skip ci] bcdc9b1 unverified Far El winglian committed on Mar 26, 2024
strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed (#1428) 2a1589f unverified winglian committed on Mar 21, 2024
report min length of tokenized data (#1186) [skip ci] d85d494 unverified winglian committed on Jan 24, 2024
additional logging to get maximum token length of a sequence in the dataset (#1066) [skip ci] 2f2582e unverified winglian committed on Jan 10, 2024
Efficiently get the length of the tokenized docs (#1063) 81d3845 unverified ricdomolm winglian committed on Jan 8, 2024
streaming multipack for pretraining dataset (#959) 553c80f unverified jinwonkim93 winglian committed on Jan 6, 2024
Determine FSDP/deepspeed settings on device select. (#883) 71b7ea3 unverified user735 Karl-Johan Alm winglian committed on Nov 29, 2023
Threaded MultipackDistributedDataloader with prefetched samples (#759) 05bd6f1 unverified casperhansen committed on Oct 26, 2023
refactor setup trainer so we can add more hooks (#773) 6c81c61 unverified winglian committed on Oct 23, 2023
fixes for alpaca w chatml, and don't include attention_mask w mistral for flash attention (#728) 3553172 unverified winglian committed on Oct 14, 2023
Save Axolotl config as WandB artifact (#716) 490923f unverified Jan Philipp Harries committed on Oct 11, 2023
refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662) 2642cae unverified winglian committed on Oct 3, 2023
Fix(cfg): Add validation for save_strategy and eval_strategy (#633) 383f88d unverified Nanobit committed on Sep 28, 2023
chore(callback): Remove old peft saving code (#510) d5f8589 unverified Nanobit committed on Sep 22, 2023
run eval on the first step to get a baseline (#617) 2844eb2 unverified winglian committed on Sep 22, 2023
gather/broadcast the max value of the packing efficiency automatically (#463) b15b19e unverified winglian committed on Sep 17, 2023
optionally configure sample packing for evals (#589) 21ec195 unverified winglian committed on Sep 16, 2023
fix save_steps so it doesn't get duplicated (#567) 3fbde76 unverified winglian committed on Sep 14, 2023
improve how we setup eval/save strategies and steps (#547) 36e53c7 unverified winglian committed on Sep 13, 2023