Commit History
f6ecf14 feat: remove need to add load_in* during merge (#1017), committed by Nanobit
dec66d7 [Docs] Nit: Remind people to auth to wandb if they are going to use it (#1013), committed by hamel
76357dc Update README.md (#1012), committed by hamel
70b46ca remove landmark attn and xpos rope implementations (#1010), committed by winglian
85dd4d5 add config to model card (#1005), committed by hamel
384b817 Set eval_sample_packing to false in mistral config.yaml (#1003), committed by Kevin Sydney
db9094d FEAT: add tagging support to axolotl (#1004)
6ef46f8 Add an example config for finetuning a 34B model on a 24GB GPU (#1000), committed by Evan Griffiths
628b754 set output_router_logits for mixtral config: (#995), committed by winglian
37820f6 support for cuda 12.1 (#989), committed by winglian
7d4185f chore: Update transformers to latest (#986), committed by Nanobit
93ebec1 change val size (#992), committed by mhenrichsen
2e61dc3 Add tests to Docker (#993), committed by hamel
1ffa386 Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787), committed by Nanobit
62ba160 bump actions versions, committed by hamel
7bbaac9 fix mistral prompt assembly (#982), committed by hamel
161bcb6 Dockerfile torch fix (#987), committed by winglian
d25c34c Update README.md (#966), committed by eltociear
13e9381 fix: add lr scheduler kwargs to Trainer (#972), committed by Nanobit
85de004 fix for build for nccl in dockerfile (#970), committed by winglian
80ec7af update to latest nccl in docker image (#965), committed by winglian
f28e755 update transformers to fix checkpoint saving (#963), committed by dumpmemory
5ada140 Fix prompt assembly for llama (#952)
ef24342 fix: switch to using the HuggingFace Transformers NEFT implementation (#941), committed by kallewoof
5ea3aa3 Fix Deepspeed loading (#950), committed by winglian
f1f60cb Flash attn hotfix (#951), committed by winglian
450e04d fix: remove excessive newlines in system prompt(s) for alpaca (#936), committed by kallewoof
b0cf397 More hints on what to do with CUDA Out of memory errors (#925), committed by Juraj Bednar
5f79b82 new evals_per_epoch and saves_per_epoch to make things cleaner (#944), committed by winglian
f1de29d Respect sequence_len in config for `type: llama2_chat` (#926), committed by hamel
7fabc4d Mixtral official (#942), committed by winglian
9a5eb39 Update requirements.txt (#940), committed by tokestermw
86487c2 Mixtral: More correct MoE, lower loss (#932), committed by casperhansen
35f9b0f update to latest transformers for mixstral support (#929), committed by winglian
68b227a Mixtral multipack (#928), committed by winglian
03c6318 fixing prompt template of chatml by removal of linebreak (#922)
40a6362 support for mamba (#915), committed by winglian
d339beb chore: clarify Readme on sharegpt system role, committed by Nanobit
fde091c fix(tokenizer): handle fast tokenizer properly for bos/eos (#914), committed by Nanobit
06ae392 Pin flash-attn to 2.3.3 (#919), committed by casperhansen
992e742 Support device_map=sequential & max_memory config parameters (#903)
a1da39c Feat(wandb): Refactor to be more flexible (#767), committed by Nanobit
58ec8b1 feature: loss watchdog for terminating training runs that are failing (#899)
476a205 Remove learning rate scheduler in deepspeed config to avoid conflict (#909), committed by Haoxiang-Wang
3e3229e fix for qwen w lora (#906), committed by winglian
1d21aa6 ensure merged model matches the training dtype (#902), committed by winglian