InferenceIllusionist committed: Update README.md
- Model creator: [Sao10K](https://huggingface.co/Sao10K/)
- Original model: [Fimbulvetr-11B-v2](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2)

All credits to Sao10K for the original model. This is just a quick test of the new quantization types, such as IQ3_S, in an attempt to further reduce VRAM requirements.
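If you haven't tried one of these quants before, loading it with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) looks roughly like the sketch below. The file name, prompt format, and offload settings are placeholders, so adjust them for whichever quant you download and your own hardware.

```python
# Minimal sketch: load an IQ3_S GGUF with llama-cpp-python and generate a reply.
# File name, prompt format, and settings are placeholders -- adjust to taste.
from llama_cpp import Llama

llm = Llama(
    model_path="Fimbulvetr-11B-v2-IQ3_S.gguf",  # hypothetical local file name
    n_gpu_layers=-1,  # offload every layer to the GPU; lower this if VRAM is tight
    n_ctx=4096,       # context window
)

output = llm(
    "### Instruction:\nWrite a two-sentence greeting.\n\n### Response:\n",
    max_tokens=128,
    temperature=0.8,
)
print(output["choices"][0]["text"])
```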
Looking for Q3/Q4/Q5 quants? See the link in the model card below.

Quantized from fp16 with love. Importance matrix file [Fimbulvetr-11B-v2-imatrix.dat](https://huggingface.co/InferenceIllusionist/Fimbulvetr-11B-v2-iMat-GGUF/blob/main/Fimbulvetr-11B-v2-imatrix.dat) was calculated using Q8_0.
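For the curious, the general imatrix-then-quantize flow with llama.cpp's CLI tools looks roughly like the sketch below. Binary names, file paths, and the calibration text are placeholders (recent llama.cpp builds call the tools `llama-imatrix` and `llama-quantize`), so treat this as an outline of the process rather than the exact commands used for this repo.

```python
# Rough sketch of the imatrix + quantization flow using llama.cpp's CLI tools.
# All paths and file names are placeholders; the calibration file is an assumption.
import subprocess

# 1. Compute an importance matrix from a higher-precision quant (Q8_0 here).
subprocess.run([
    "./imatrix",
    "-m", "Fimbulvetr-11B-v2-Q8_0.gguf",   # hypothetical Q8_0 input
    "-f", "calibration.txt",               # hypothetical calibration text
    "-o", "Fimbulvetr-11B-v2-imatrix.dat",
], check=True)

# 2. Quantize the fp16 GGUF down to IQ3_S, guided by the importance matrix.
subprocess.run([
    "./quantize",
    "--imatrix", "Fimbulvetr-11B-v2-imatrix.dat",
    "Fimbulvetr-11B-v2-f16.gguf",          # hypothetical fp16 input
    "Fimbulvetr-11B-v2-IQ3_S.gguf",        # output file
    "IQ3_S",
], check=True)
```

Running the imatrix pass on Q8_0 rather than fp16 is a common shortcut, since Q8_0 is close to lossless while being lighter to run.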
See original model card details below.

---
![Fox1](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2/resolve/main/cute1.jpg)