strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed (#1428) 2a1589f unverified winglian committed on Mar 21, 2024
support galore once upstreamed into transformers (#1409) dd449c5 unverified winglian committed on Mar 19, 2024
update flash attention for gemma support (#1368) 58b0d4b unverified winglian committed on Mar 6, 2024
Add seq2seq eval benchmark callback (#1274) 5a5d474 unverified LeonardoEmili committed on Feb 13, 2024
Revert "run PR e2e docker CI tests in Modal" (#1220) [skip ci] 8da1633 unverified winglian committed on Jan 26, 2024
run PR e2e docker CI tests in Modal (#1217) [skip ci] 36d053f unverified winglian committed on Jan 26, 2024
upgrade deepspeed to 0.13.1 for mixtral fixes (#1189) [skip ci] 8a49309 unverified winglian committed on Jan 24, 2024
Remove fused-dense-lib from requirements.txt (#1087) 91502b9 unverified casperhansen committed on Jan 10, 2024
fix: warn user to install mamba_ssm package (#1019) d69ba2b unverified Nanobit committed on Jan 10, 2024
Separate AutoGPTQ dep to `pip install -e .[auto-gptq]` (#1077) 9be92d1 unverified casperhansen committed on Jan 9, 2024
Add: mlflow for experiment tracking (#1059) [skip ci] 090c24d unverified Johan Hansson winglian committed on Jan 9, 2024
bump transformers and update attention class map name (#1023) bcc78d8 unverified winglian committed on Jan 3, 2024
update transformers to fix checkpoint saving (#963) f28e755 unverified dumpmemory committed on Dec 16, 2023
update to latest transformers for mixtral support (#929) 35f9b0f unverified winglian committed on Dec 10, 2023
update datasets version to cut down the warnings due to pyarrow arg change (#897) 6a4562a unverified winglian committed on Nov 25, 2023
try #2: pin hf transformers and accelerate to latest release, don't reinstall pytorch (#867) 0de1457 unverified winglian committed on Nov 16, 2023
add e2e tests for checking functionality of resume from checkpoint (#865) b3a61e8 unverified winglian committed on Nov 16, 2023
don't compile deepspeed or bitsandbytes from source (#837) f544ab2 unverified winglian committed on Nov 9, 2023
chore: bump transformers to v4.34.1 to fix tokenizer issue (#745) 8966a6f unverified Nanobit committed on Oct 20, 2023
Fix(version): Update FA to work with Mistral SWA (#673) 43856c0 unverified Nanobit committed on Oct 4, 2023
Feat: Allow usage of native Mistral FA when no sample_packing (#669) 697c50d unverified Nanobit committed on Oct 4, 2023