Add 2 bit versions
#1, opened by nonetrix
I want to try running this model on my phone, but 4 bits is a touch too much, unfortunately. I don't really expect it to work well, but I would appreciate having the option.
Hello, I think you're better off using Google Colab to host the model. With the free tier, you can run a Q5_K_M 13B model with 6k context, or a 4bpw 20B model with 6k context (exl2).
I can just use my PC as well. I just want to do it for the hell of it.
Straight 2-bit you'll have trouble with. Probably won't be worth it. There are a couple of things that get close:
https://huggingface.co/zaq-hack/Noromaid-13B-0.4-DPO-bpw300-h6-exl2 ~6GB footprint
You could try QuIP, maybe.
Maybe I'll do a 2.5 rpcal ... that might not suck ALL the life out of it ...
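For a rough sense of scale, the size of the quantized weights alone is roughly parameters × bits per weight / 8. Here is a minimal sketch of that arithmetic, assuming a nominal 13e9 parameters for a 13B model (the constant `N_PARAMS_13B` and the helper `weight_footprint_gb` are made up for illustration). It ignores the KV cache, the higher-precision head layer, and runtime overhead, so actual usage will be higher than these numbers, which is presumably why the 3.0 bpw quant linked above lands nearer ~6GB.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GB (1e9 bytes).

    Ignores KV cache, head-layer precision, and runtime overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "13B" is treated as exactly 13e9 parameters.
N_PARAMS_13B = 13e9

if __name__ == "__main__":
    for bpw in (2.0, 2.5, 3.0, 4.0):
        size = weight_footprint_gb(N_PARAMS_13B, bpw)
        print(f"{bpw:.1f} bpw: ~{size:.2f} GB of weights")
```

By this estimate, going from 3.0 to 2.5 bpw saves well under 1 GB of weights on a 13B model, so the quality hit buys fairly little extra headroom.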