Add 2 bit versions

by nonetrix - opened

I want to try running this model on my phone, and 4 bits is a touch too much, unfortunately. I don't really expect it to work well, but I would appreciate having the option.

Hello, I think you're better off using Google Colab to host the model. With the free tier, you can run a Q5_K_M 13B model with 6k context, or a 4bpw 20B model with 6k context (exl).
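For a rough sense of scale, here is a minimal sketch of the arithmetic behind those sizes, assuming the weights dominate memory and the footprint is simply parameter count times bits per weight. The helper name `weight_footprint_gb` is made up for illustration, the parameter counts are nominal, and the estimate ignores context (KV cache), activations, and any per-group quantization overhead, so real files will come out somewhat larger.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same average bit width.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Nominal sizes only; the 13B and 20B figures echo the models mentioned above.
for label, n_params, bpw in [
    ("20B at 4.0 bpw", 20e9, 4.0),
    ("20B at 2.5 bpw", 20e9, 2.5),
    ("13B at 2.0 bpw", 13e9, 2.0),
]:
    print(f"{label}: ~{weight_footprint_gb(n_params, bpw):.2f} GB")
```

Whatever a phone can spare for the model then has to cover this number plus the context on top.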

I can just use my PC as well. I just want to do it for the hell of it.

You'll have trouble with straight 2-bit; it probably won't be worth it. There are a couple of things that get close, at a ~6GB footprint.

You could do QuIP, maybe.

Maybe I'll do a 2.5 rpcal ... that might not suck ALL the life out of it ...
