Commit History
1687be6  don't use mask expansion for inference (#392)  [winglian]
919246f  don't pass rope_scaling kwarg if it's None (#383)  [winglian]
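A minimal sketch of the pattern the rope_scaling commit above describes: only forward `rope_scaling` when it is actually configured. It assumes a Hugging Face `AutoModelForCausalLM`; the `cfg` object and `load_model` name are illustrative, not the repository's own code.

```python
from transformers import AutoModelForCausalLM

def load_model(base_model: str, cfg):
    # Only forward rope_scaling when it is set; passing rope_scaling=None
    # can trip config validation in some transformers versions.
    model_kwargs = {}
    if getattr(cfg, "rope_scaling", None) is not None:
        # e.g. {"type": "linear", "factor": 2.0}
        model_kwargs["rope_scaling"] = cfg.rope_scaling

    return AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)
```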
094fc2c  try to detect accelerate and only use device_map=None in that case (#373)  [tmm1]
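A hedged sketch of the idea in the commit above: when the process was started by `accelerate launch`, let accelerate place the model and pass `device_map=None`; otherwise fall back to automatic placement. The environment-variable check and model id are assumptions for illustration, not the repository's actual detection logic.

```python
import os
from typing import Optional

from transformers import AutoModelForCausalLM

def pick_device_map() -> Optional[str]:
    # Assumption: `accelerate launch` exports ACCELERATE_*/LOCAL_RANK-style
    # environment variables; treat their presence as "accelerate is in charge".
    launched_by_accelerate = any(
        key in os.environ for key in ("ACCELERATE_MIXED_PRECISION", "LOCAL_RANK")
    )
    return None if launched_by_accelerate else "auto"

model = AutoModelForCausalLM.from_pretrained("base-model", device_map=pick_device_map())
```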
0c96727  remove unnecessary local variable  [tmm1]
efb3b2c  simplify `load_tokenizer`  [tmm1]
7b55fe6  improve GPU logging to break out pytorch cache and system mem  [tmm1]
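A short sketch of the kind of breakdown the GPU-logging commit above describes: tensors actually allocated by PyTorch, the caching allocator's reservation, and device-wide usage reported separately. The helper name is illustrative, not the repository's own.

```python
import torch

def gpu_memory_report(device: int = 0) -> str:
    gib = 1024 ** 3
    allocated = torch.cuda.memory_allocated(device) / gib  # live tensors
    reserved = torch.cuda.memory_reserved(device) / gib    # pytorch cache
    free, total = torch.cuda.mem_get_info(device)           # device-wide view
    used = (total - free) / gib
    return (
        f"GPU {device}: {allocated:.2f} GiB allocated, "
        f"{reserved:.2f} GiB reserved by the cache, "
        f"{used:.2f}/{total / gib:.2f} GiB used system-wide"
    )
```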
e029ab3  quiet noise from llama tokenizer by setting pad token earlier  [tmm1]
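The pad-token commit above refers to the warnings LLaMA-style tokenizers emit when padding without a pad token; setting one right after loading silences them. A minimal sketch, with the model id purely illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

# LLaMA tokenizers ship without a pad token; reuse EOS (or add a dedicated
# token) before any tokenization so later padding does not raise warnings.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```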
2bb0b78  Attention mask and position id fixes for packing (#285)  [winglian]
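A generic illustration of the position-id handling that sample packing needs, in the spirit of the commit above: positions restart at zero for every packed sub-sequence instead of running across the whole row. It assumes an attention mask that labels each sub-sequence with a distinct id (1, 2, 3, ...) and padding with 0; this is a sketch, not the patch itself.

```python
import torch

def packed_position_ids(attention_mask: torch.Tensor) -> torch.Tensor:
    """Build position ids that restart for each packed sub-sequence."""
    position_ids = torch.zeros_like(attention_mask)
    for row in range(attention_mask.size(0)):
        for seq_id in attention_mask[row].unique():
            if seq_id == 0:
                continue  # padding positions stay at 0
            idx = (attention_mask[row] == seq_id).nonzero(as_tuple=True)[0]
            position_ids[row, idx] = torch.arange(len(idx))
    return position_ids
```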
b521206  Feat: Add rope scaling (#343)  [Nanobit]
11ddccb  Merge pull request #356 from tmm1/load_model-args  [tmm1]
7181022  simplify load_model signature  [tmm1]
e303d64  log GPU memory usage  [tmm1]
176b888  ensure enable_input_require_grads is called on model before getting the peft model (#345)  [winglian]
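A minimal sketch of the ordering the commit above enforces, assuming a Hugging Face base model and standard peft usage; the model id and LoRA hyperparameters are placeholders:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("base-model")

# With gradient checkpointing and frozen base weights, the embedding output
# needs requires_grad=True, so enable it *before* wrapping with peft.
model.enable_input_require_grads()

lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
```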
2eda9e0  fix typo  [tmm1]
78b9efb  scope flash-attn+qlora fix correctly, scope to llama, add comment  [tmm1]
312a9fa  move flash-attn monkey patch alongside the others  [tmm1]
248bf90  ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype  [tmm1]
77085ea  qlora w flash attention fixes (#333)  [winglian]
db2a358  add peft install back since it doesn't get installed by setup.py (#331)  [winglian]
1066751  don't resize embeddings to multiples of 32x by default  [winglian]
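The resize-embeddings commit above maps onto the `pad_to_multiple_of` argument of `resize_token_embeddings`; a brief sketch of the difference, with the model and tokenizer ids illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("base-model")
model = AutoModelForCausalLM.from_pretrained("base-model")

# Default: resize exactly to the tokenizer's vocabulary size.
model.resize_token_embeddings(len(tokenizer))

# Opt-in: round the embedding matrix up to a multiple of 32, which can help
# tensor-core throughput but changes the checkpoint's vocab dimension.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=32)
```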
553a86b  Adding logging enhancement  [theobjectivedad]
69a2350  support for loading a model by git revision  [winglian]
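Loading by git revision is plain Hugging Face Hub behaviour; a one-line sketch, with the model id and revision purely illustrative:

```python
from transformers import AutoModelForCausalLM

# `revision` accepts a branch name, tag, or commit SHA of the Hub repo.
model = AutoModelForCausalLM.from_pretrained("org/model-name", revision="main")
```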
d69da99  skip explicit model type too if using trust_remote_code  [winglian]
66afb76  don't use llama if trust_remote_code is set since that needs to use AutoModel path  [winglian]
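A sketch of the dispatch the two trust_remote_code commits above describe: when custom modelling code is trusted, go through the generic Auto class instead of a hard-coded LLaMA class. The `cfg` object and function name are illustrative.

```python
from transformers import AutoModelForCausalLM, LlamaForCausalLM

def load_causal_lm(base_model: str, cfg):
    if getattr(cfg, "trust_remote_code", False):
        # Custom architectures live behind AutoModel*, so skip the explicit
        # model-type branch entirely.
        return AutoModelForCausalLM.from_pretrained(base_model, trust_remote_code=True)
    if cfg.model_type == "llama":
        return LlamaForCausalLM.from_pretrained(base_model)
    return AutoModelForCausalLM.from_pretrained(base_model)
```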
47d601f  optionally define whether to use_fast tokenizer  [winglian]
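The `use_fast` option maps straight onto `AutoTokenizer.from_pretrained`; a minimal sketch whose signature is illustrative rather than the repository's actual `load_tokenizer`:

```python
from transformers import AutoTokenizer

def load_tokenizer(base_model: str, use_fast: bool = True):
    # use_fast=False forces the slow (SentencePiece/Python) tokenizer, which
    # some models still need; the default keeps the fast Rust implementation.
    return AutoTokenizer.from_pretrained(base_model, use_fast=use_fast)
```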
88e17ff  add float16 docs and tweak typehints  [winglian]
136522f  style correction  [maciej.karasek]
556fe40  issue #205 bugfix  [maciej.karasek]
fd2c981  Merge branch 'main' into flash-optimum  [winglian]
93dacba  Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map  [winglian]
8002ffb  Merge pull request #177 from NanoCode012/fix/landmark-patch  [winglian]
5e616d9  Merge branch 'main' into strip-peft-device-map  [winglian]
8e568bb  Merge pull request #159 from AngainorDev/patch-1  [Nanobit]
c9a149f  add check for attr  [winglian]
b565ecf  Fix strict and Lint  [Angainor]
fe0b768  match up gradient checkpointing when using lora w config  [winglian]
563b6d8  Fix undefined LlamaForCausalLM and del try except  [Nanobit]
cd0a6f6  peft no longer needs device_map  [winglian]
919727b  Refactor landmark attention patch  [Nanobit]
a808bf9  Fix missing cfg.  [Angainor Development]
0124825  Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref  [winglian]
ab5cd28  more gpt-neox long ctx fixes  [winglian]
1210dc8  more tweaks to do pre-training with bettertransformers  [winglian]
1edc30c  add support for optimum bettertransformers  [winglian]
14163c1  fix for local variable 'LlamaForCausalLM' referenced before assignment  [winglian]
79e2a6f  Merge branch 'main' into patch-1  [Angainor Development]
a03a7d7  add support to extend context with xpos rope  [winglian]
7f09106  fix for max sequence len across different model types  [winglian]
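A sketch of what a cross-architecture max-sequence-length lookup can look like, in the spirit of the last commit above: different configs expose the context limit under different attribute names. The attribute list and fallback are illustrative, not exhaustive.

```python
from transformers import AutoConfig

def resolve_max_seq_len(model_name: str, fallback: int = 2048) -> int:
    config = AutoConfig.from_pretrained(model_name)
    # Different architectures name the context-length field differently,
    # e.g. llama/gpt-neox vs mpt vs gpt2.
    for attr in ("max_position_embeddings", "max_seq_len", "n_positions"):
        value = getattr(config, attr, None)
        if isinstance(value, int) and value > 0:
            return value
    return fallback
```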