Commit History
1687be6  don't use mask expansion for inference (#392)  [winglian]
919246f  don't pass rope_scaling kwarg if it's None (#383)  [winglian]
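A minimal sketch of the pattern the rope_scaling commit above describes: only forward `rope_scaling` when it is actually configured. It assumes a Hugging Face `AutoModelForCausalLM`; the `cfg` object and `load_model` name are illustrative, not the repository's own code.

```python
from transformers import AutoModelForCausalLM

def load_model(base_model: str, cfg):
    # Only forward rope_scaling when it is set; passing rope_scaling=None
    # can trip config validation in some transformers versions.
    model_kwargs = {}
    if getattr(cfg, "rope_scaling", None) is not None:
        # e.g. {"type": "linear", "factor": 2.0}
        model_kwargs["rope_scaling"] = cfg.rope_scaling

    return AutoModelForCausalLM.from_pretrained(base_model, **model_kwargs)
```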
094fc2c  try to detect accelerate and only use device_map=None in that case (#373)  [tmm1]
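A hedged sketch of the idea in the commit above: when the process was started by `accelerate launch`, let accelerate place the model and pass `device_map=None`; otherwise fall back to automatic placement. The environment-variable check and model id are assumptions for illustration, not the repository's actual detection logic.

```python
import os
from typing import Optional

from transformers import AutoModelForCausalLM

def pick_device_map() -> Optional[str]:
    # Assumption: `accelerate launch` exports ACCELERATE_*/LOCAL_RANK-style
    # environment variables; treat their presence as "accelerate is in charge".
    launched_by_accelerate = any(
        key in os.environ for key in ("ACCELERATE_MIXED_PRECISION", "LOCAL_RANK")
    )
    return None if launched_by_accelerate else "auto"

model = AutoModelForCausalLM.from_pretrained("base-model", device_map=pick_device_map())
```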
0c96727  remove unnecessary local variable  [tmm1]
efb3b2c  simplify `load_tokenizer`  [tmm1]
7b55fe6  improve GPU logging to break out pytorch cache and system mem  [tmm1]
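A short sketch of the kind of breakdown the GPU-logging commit above describes: tensors actually allocated by PyTorch, the caching allocator's reservation, and device-wide usage reported separately. The helper name is illustrative, not the repository's own.

```python
import torch

def gpu_memory_report(device: int = 0) -> str:
    gib = 1024 ** 3
    allocated = torch.cuda.memory_allocated(device) / gib  # live tensors
    reserved = torch.cuda.memory_reserved(device) / gib    # pytorch cache
    free, total = torch.cuda.mem_get_info(device)           # device-wide view
    used = (total - free) / gib
    return (
        f"GPU {device}: {allocated:.2f} GiB allocated, "
        f"{reserved:.2f} GiB reserved by the cache, "
        f"{used:.2f}/{total / gib:.2f} GiB used system-wide"
    )
```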
e029ab3  quiet noise from llama tokenizer by setting pad token earlier  [tmm1]
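The pad-token commit above refers to the warnings LLaMA-style tokenizers emit when padding without a pad token; setting one right after loading silences them. A minimal sketch, with the model id purely illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

# LLaMA tokenizers ship without a pad token; reuse EOS (or add a dedicated
# token) before any tokenization so later padding does not raise warnings.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```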
2bb0b78  Attention mask and position id fixes for packing (#285)  [winglian]
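A generic illustration of the position-id handling that sample packing needs, in the spirit of the commit above: positions restart at zero for every packed sub-sequence instead of running across the whole row. It assumes an attention mask that labels each sub-sequence with a distinct id (1, 2, 3, ...) and padding with 0; this is a sketch, not the patch itself.

```python
import torch

def packed_position_ids(attention_mask: torch.Tensor) -> torch.Tensor:
    """Build position ids that restart for each packed sub-sequence."""
    position_ids = torch.zeros_like(attention_mask)
    for row in range(attention_mask.size(0)):
        for seq_id in attention_mask[row].unique():
            if seq_id == 0:
                continue  # padding positions stay at 0
            idx = (attention_mask[row] == seq_id).nonzero(as_tuple=True)[0]
            position_ids[row, idx] = torch.arange(len(idx))
    return position_ids
```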
b521206  Feat: Add rope scaling (#343)  [Nanobit]
11ddccb  Merge pull request #356 from tmm1/load_model-args  [tmm1]
7181022  simplify load_model signature  [tmm1]
e303d64  log GPU memory usage  [tmm1]
176b888  ensure enable_input_require_grads is called on model before getting the peft model (#345)  [winglian]
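A minimal sketch of the ordering the commit above enforces, assuming a Hugging Face base model and standard peft usage; the model id and LoRA hyperparameters are placeholders:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("base-model")

# With gradient checkpointing and frozen base weights, the embedding output
# needs requires_grad=True, so enable it *before* wrapping with peft.
model.enable_input_require_grads()

lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
```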
2eda9e0  fix typo  [tmm1]
78b9efb  scope flash-attn+qlora fix correctly, scope to llama, add comment  [tmm1]
312a9fa  move flash-attn monkey patch alongside the others  [tmm1]
248bf90  ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype  [tmm1]
77085ea  qlora w flash attention fixes (#333)  [winglian]
db2a358  add peft install back since it doesn't get installed by setup.py (#331)  [winglian]
1066751  don't resize embeddings to multiples of 32x by default  [winglian]
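The resize-embeddings commit above maps onto the `pad_to_multiple_of` argument of `resize_token_embeddings`; a brief sketch of the difference, with the model and tokenizer ids illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("base-model")
model = AutoModelForCausalLM.from_pretrained("base-model")

# Default: resize exactly to the tokenizer's vocabulary size.
model.resize_token_embeddings(len(tokenizer))

# Opt-in: round the embedding matrix up to a multiple of 32, which can help
# tensor-core throughput but changes the checkpoint's vocab dimension.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=32)
```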
553a86b  Adding logging enhancement  [theobjectivedad]
69a2350  support for loading a model by git revision  [winglian]
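Loading by git revision is plain Hugging Face Hub behaviour; a one-line sketch, with the model id and revision purely illustrative:

```python
from transformers import AutoModelForCausalLM

# `revision` accepts a branch name, tag, or commit SHA of the Hub repo.
model = AutoModelForCausalLM.from_pretrained("org/model-name", revision="main")
```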
d69da99  skip explicit model type too if using trust_remote_code  [winglian]
66afb76  don't use llama if trust_remote_code is set since that needs to use AutoModel path  [winglian]
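A sketch of the dispatch the two trust_remote_code commits above describe: when custom modelling code is trusted, go through the generic Auto class instead of a hard-coded LLaMA class. The `cfg` object and function name are illustrative.

```python
from transformers import AutoModelForCausalLM, LlamaForCausalLM

def load_causal_lm(base_model: str, cfg):
    if getattr(cfg, "trust_remote_code", False):
        # Custom architectures live behind AutoModel*, so skip the explicit
        # model-type branch entirely.
        return AutoModelForCausalLM.from_pretrained(base_model, trust_remote_code=True)
    if cfg.model_type == "llama":
        return LlamaForCausalLM.from_pretrained(base_model)
    return AutoModelForCausalLM.from_pretrained(base_model)
```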
47d601f  optionally define whether to use_fast tokenizer  [winglian]
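The `use_fast` option maps straight onto `AutoTokenizer.from_pretrained`; a minimal sketch whose signature is illustrative rather than the repository's actual `load_tokenizer`:

```python
from transformers import AutoTokenizer

def load_tokenizer(base_model: str, use_fast: bool = True):
    # use_fast=False forces the slow (SentencePiece/Python) tokenizer, which
    # some models still need; the default keeps the fast Rust implementation.
    return AutoTokenizer.from_pretrained(base_model, use_fast=use_fast)
```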
88e17ff  add float16 docs and tweak typehints  [winglian]
136522f  style correction  [maciej.karasek]
556fe40  issue #205 bugfix  [maciej.karasek]
fd2c981  Merge branch 'main' into flash-optimum  [winglian]
93dacba  Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map  [winglian]
8002ffb  Merge pull request #177 from NanoCode012/fix/landmark-patch  [winglian]
5e616d9  Merge branch 'main' into strip-peft-device-map  [winglian]
8e568bb  Merge pull request #159 from AngainorDev/patch-1  [Nanobit]
c9a149f  add check for attr  [winglian]
b565ecf  Fix strict and Lint  [Angainor]
fe0b768  match up gradient checkpointing when using lora w config  [winglian]
563b6d8  Fix undefined LlamaForCausalLM and del try except  [Nanobit]
cd0a6f6  peft no longer needs device_map  [winglian]
919727b  Refactor landmark attention patch  [Nanobit]
a808bf9  Fix missing cfg.  [Angainor Development]
0124825  Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref  [winglian]
ab5cd28  more gpt-neox long ctx fixes  [winglian]
1210dc8  more tweaks to do pre-training with bettertransformers  [winglian]
1edc30c  add support for optimum bettertransformers  [winglian]
14163c1  fix for local variable 'LlamaForCausalLM' referenced before assignment  [winglian]
79e2a6f  Merge branch 'main' into patch-1  [Angainor Development]
a03a7d7  add support to extend context with xpos rope  [winglian]
7f09106  fix for max sequence len across different model types  [winglian]
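A sketch of what a cross-architecture max-sequence-length lookup can look like, in the spirit of the last commit above: different configs expose the context limit under different attribute names. The attribute list and fallback are illustrative, not exhaustive.

```python
from transformers import AutoConfig

def resolve_max_seq_len(model_name: str, fallback: int = 2048) -> int:
    config = AutoConfig.from_pretrained(model_name)
    # Different architectures name the context-length field differently,
    # e.g. llama/gpt-neox vs mpt vs gpt2.
    for attr in ("max_position_embeddings", "max_seq_len", "n_positions"):
        value = getattr(config, attr, None)
        if isinstance(value, int) and value > 0:
            return value
    return fallback
```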