WIP: Support table logging for mlflow, too (#1506) 057fa44 unverified DavidFarago Dave Farago winglian committed on Apr 9, 2024
drop empty token from beginning if tokenizer has no bos_token (in the case of qwen) (#1490) 934fc85 unverified winglian committed on Apr 7, 2024
strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed (#1428) 2a1589f unverified winglian committed on Mar 21, 2024
HF / FEAT: Optimize HF tags (#1425) [skip ci] 7d55607 unverified Younes Belkada winglian committed on Mar 21, 2024
support galore once upstreamed into transformers (#1409) dd449c5 unverified winglian committed on Mar 19, 2024
fix(config): passing gradient_checkpoint_kwargs (#1412) b1e3e1b unverified Nanobit committed on Mar 19, 2024
add lion-pytorch optimizer (#1299) [skip ci] 1648279 unverified Maxime winglian committed on Feb 26, 2024
Allow load_best_model_at_end to be configured for early stopping on custom evaluation datasets (#1291) 3c00f40 unverified David Meikle committed on Feb 21, 2024
Add seq2seq eval benchmark callback (#1274) 5a5d474 unverified LeonardoEmili committed on Feb 13, 2024
Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273) 8430db2 unverified jinwonkim93 committed on Feb 13, 2024
allow the optimizer prune ratio for ReLoRA to be configurable (#1287) 4b997c3 unverified winglian committed on Feb 12, 2024
simplify handling for newer multipack patches so they can be added in a single place (#1270) 5698943 unverified winglian committed on Feb 7, 2024
Add more save strategies for DPO training. (#1255) 13eea21 unverified Philip May committed on Feb 6, 2024
relora: magnitude pruning of the optimizer (#1245) 8c2e05a unverified winglian committed on Feb 6, 2024
Fix and document test_datasets (#1228) 5787e1a unverified DreamGenX winglian committed on Jan 31, 2024
FEAT: add tagging support to axolotl for DPOTrainer (#1209) 18f8119 unverified Filippo Broggini winglian committed on Jan 27, 2024
precompute dpo logprobs setting and fixes (#1199) [skip ci] 33e1170 unverified winglian committed on Jan 25, 2024
fix learning rate scheduler's warnings (#1135) [skip ci] b4ac96a unverified ricdomolm winglian committed on Jan 25, 2024
more dpo fixes for dataset loading and docs (#1185) [skip ci] 5bce45f unverified winglian committed on Jan 24, 2024
Add mlflow callback for pushing config to mlflow artifacts (#1125) b8e5603 unverified JohanWork committed on Jan 22, 2024
swap the data collator for evals if not using sample packing (#1076) ead34c5 unverified winglian committed on Jan 10, 2024
Add: mlflow for experiment tracking (#1059) [skip ci] 090c24d unverified Johan Hansson winglian committed on Jan 9, 2024
Cosine learning rate schedule - minimum learning rate (#1062) 04b978b unverified ricdomolm winglian committed on Jan 9, 2024
Efficiently get the length of the tokenized docs (#1063) 81d3845 unverified ricdomolm winglian committed on Jan 8, 2024
streaming multipack for pretraining dataset (#959) 553c80f unverified jinwonkim93 [email protected] winglian committed on Jan 6, 2024
feat: always push checkpoint to hub if set (#1049) [skip ci] cbdbf9e unverified Nanobit committed on Jan 5, 2024
use recommended setting for use_reentrant w gradient checkpointing (#1021) 4d2e842 unverified winglian committed on Jan 2, 2024
remove landmark attn and xpos rope implementations (#1010) 70b46ca unverified winglian committed on Dec 28, 2023
FEAT: add tagging support to axolotl (#1004) db9094d unverified Younes Belkada winglian committed on Dec 27, 2023
fix: switch to using the HuggingFace Transformers NEFT implementation (#941) ef24342 unverified kallewoof committed on Dec 13, 2023