What context size when using a 24GB VRAM card (4090) is best?

#1
by clevnumb - opened

I will be using it through TabbyUI in SillyTavern. What context size do I set? Thanks.

I think this looks too big for a 24GB card. Try a quantized GGUF version, which should run easily with any context size you want. In my experience the Mistral models don't work well beyond 16K context, even though they technically support more.
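Roughly, the VRAM left over after loading the weights has to hold the KV cache, which grows linearly with context length. A minimal back-of-envelope sketch, assuming Mistral-7B-like dimensions (32 layers, 8 KV heads via GQA, head dim 128, fp16 cache; these are illustrative values, so read the real ones from your model's config.json):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Memory for the K and V caches (the leading factor of 2) across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

if __name__ == "__main__":
    # Assumed Mistral-7B-style dimensions; check your model's config.json.
    gib = kv_cache_bytes(32, 8, 128, 16384) / 1024**3
    print(f"16K context ~ {gib:.1f} GiB of KV cache on top of the weights")
```

Doubling the context doubles the cache, so on a 24GB card the quant size of the weights largely decides how much context fits.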

@clevnumb Would a Q6_K quant at 16K context work?
