InferenceIllusionist committed "Update README.md"

Quantized from fp32 with love. If you're on the latest release of llama.cpp, you should no longer need to combine files before loading.

* Weighted quantizations created using Wizard-LM-2-8x22 [imatrix file](https://huggingface.co/jukofyork/WizardLM-2-8x22B-imatrix) provided by jukofyork
* Calculated in 105 chunks with n_ctx=512 using groups_merged.txt
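
The settings above can be sketched as a pair of llama.cpp commands. This is a hypothetical reconstruction, not the author's exact invocation: the model filenames and the `IQ4_XS` output type are placeholders, and it assumes a llama.cpp build that ships the `llama-imatrix` and `llama-quantize` tools.

```shell
# Hypothetical sketch: compute an importance matrix over groups_merged.txt
# with a 512-token context (n_ctx=512, as stated in the card), then feed it
# into quantization. Model filenames and quant type are placeholders.
./llama-imatrix -m wizardlm-2-8x22b-f32.gguf -f groups_merged.txt -c 512 -o imatrix.dat
./llama-quantize --imatrix imatrix.dat wizardlm-2-8x22b-f32.gguf wizardlm-2-8x22b-IQ4_XS.gguf IQ4_XS
```

The chunk count (105 here) falls out of the calibration file's length divided by the context size, rather than being set directly.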

For a brief rundown of iMatrix quant performance please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)