What context size when using a 24GB VRAM card (4090) is best?

#1
by clevnumb - opened

I will be using it through TabbyUI in SillyTavern. What context size do I set? Thanks.

I think this looks too big for a 24GB card. Try a quantized GGUF version, which should run easily with any context size you want. In my experience the Mistral models don't work well beyond 16K context, even though they technically support more.
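Roughly, the VRAM left over after loading the weights has to hold the KV cache, which grows linearly with context length. A minimal back-of-envelope sketch, assuming Mistral-7B-like dimensions (32 layers, 8 KV heads via GQA, head dim 128, fp16 cache; these are illustrative values, so read the real ones from your model's config.json):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Memory for the K and V caches (the leading factor of 2) across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

if __name__ == "__main__":
    # Assumed Mistral-7B-style dimensions; check your model's config.json.
    gib = kv_cache_bytes(32, 8, 128, 16384) / 1024**3
    print(f"16K context ~ {gib:.1f} GiB of KV cache on top of the weights")
```

Doubling the context doubles the cache, so on a 24GB card the quant size of the weights largely decides how much context fits.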

@clevnumb Would a Q6_K quant at 16K context work?
