Any chance of an int4 or quantised version?

#3
by smcleod - opened

I'd try to create one myself, but the bf/fp16 weights are too big for me to process 🤣
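For anyone wanting to experiment anyway, here's a minimal sketch of loading a checkpoint in 4-bit on the fly with transformers + bitsandbytes (the repo id is a placeholder assumption, and you still need enough disk and RAM to stream the bf16 shards through the quantizer):

```python
# Minimal sketch: on-the-fly 4-bit (NF4) loading with bitsandbytes.
# The model id below is a placeholder; swap in the actual repo.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit at load time
    bnb_4bit_quant_type="nf4",              # NormalFloat4 usually beats plain int4
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-V3",              # placeholder repo id
    quantization_config=bnb_config,
    device_map="auto",                      # shard across available GPUs/CPU
    trust_remote_code=True,                 # DeepSeek models ship custom modeling code
)
```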

@smcleod maybe you have the ability to quantize https://huggingface.co/inarikami/DeepSeek-V3-int4-TensorRT ?

You have to be crazy/desperate to want any quant lower than int4. For a model to be comparable at all to the non-quantized version, even int4 is lower than I'd personally go...

@olborer that's not how param size vs quant type works. The larger the param count, the fewer problems you have with lower quantisations; essentially any quant of a larger-parameter model that's at least Q2_K_M will beat the smaller-parameter version.
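To put rough numbers on that (a back-of-the-envelope sketch assuming DeepSeek-V3's ~671B total parameters; the bits-per-weight figures are illustrative, since real GGUF quants carry extra block-scale overhead):

```python
# Approximate weight footprint at different quant levels.
def approx_size_gb(params_billion: float, bits_per_weight: float) -> float:
    # params * bits / 8 bits-per-byte; the 1e9 factors cancel out
    return params_billion * bits_per_weight / 8

for name, bpw in [("bf16", 16.0), ("int4 / Q4_K_M", 4.5), ("Q2_K", 2.6)]:
    print(f"671B @ {name}: ~{approx_size_gb(671, bpw):,.0f} GB")

# bf16 ≈ 1342 GB, int4 ≈ 377 GB, Q2_K ≈ 218 GB: low quants of the
# big model are the only way it fits on realistic hardware.
```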
