Update README.md
README.md
CHANGED
@@ -21,23 +21,16 @@ tags:
|
|
21 |
## Quantizations
|
22 |
Measured using ExLlamaV2 and 4096 max_seq_len with [Oobabooga's Text Generation WebUI](https://github.com/oobabooga/text-generation-webui/tree/main).
|
23 |
|
24 |
Use [TheBloke's 4bit-32g quants](https://huggingface.co/TheBloke/Sakura-SOLAR-Instruct-GPTQ/tree/gptq-4bit-32g-actorder_True) (7.4GB VRAM usage) if you have 8GB cards.
|
25 |
-
| Branch | BPW | Folder Size | VRAM Usage | Description |
|
26 |
-
| ------ | --- | ----------- | ---------- | ----------- |
|
27 |
-
[3.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw)|3.0BPW|4.01GB|5.1 GB|For >=6GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
28 |
-
[5.0bpw (main)](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/main)|5.0BPW|6.45GB|7.7 GB|For >=10GB VRAM cards
|
29 |
-
[6.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw)|6.0BPW|7.66GB|9.0 GB|For >=10GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
30 |
-
[7.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw)|7.0BPW|8.89GB|10.2 GB|For >=11GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
31 |
-
[8.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw)|8.0BPW|10.1GB|11.3 GB|For >=12GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
32 |
-
|
33 |
-
## Zipped Quantizations (if you want to download a single file, smaller to download)
|
34 |
-
| Branch | File Size |
|
35 |
-
| ------ | --------- |
|
36 |
-
[3.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw-zip)|3.72GB
|
37 |
-
[5.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/5.0bpw-zip)|6.3GB
|
38 |
-
[6.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw-zip)|7.4GB
|
39 |
-
[7.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw-zip)|8.6GB
|
40 |
-
[8.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw-zip)|9.7GB
|
41 |
|
42 |
## Calibration Dataset
|
43 |
- [argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)
|
|
|
21 |
## Quantizations
|
22 |
Measured using ExLlamaV2 and 4096 max_seq_len with [Oobabooga's Text Generation WebUI](https://github.com/oobabooga/text-generation-webui/tree/main).
|
23 |
|
24 |
+
I also provide zipped quantizations because many people find a single-file download, as with GGUF, convenient. The zipped files are also smaller to download.
|
25 |
+
|
26 |
Use [TheBloke's 4bit-32g quants](https://huggingface.co/TheBloke/Sakura-SOLAR-Instruct-GPTQ/tree/gptq-4bit-32g-actorder_True) (7.4GB VRAM usage) if you have 8GB cards.
|
27 |
+
| Branch | BPW | Folder Size | Zipped File Size | VRAM Usage | Description |
|
28 |
+
| ------ | --- | ----------- | ---------------- | ---------- | ----------- |
|
29 |
+
[3.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw)/[3.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw-zip)|3.0BPW|4.01GB|3.72GB|5.1 GB|For >=6GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
30 |
+
[5.0bpw (main)](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/main)/[5.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/5.0bpw-zip)|5.0BPW|6.45GB|6.3GB|7.7 GB|For >=10GB VRAM cards
|
31 |
+
[6.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw)/[6.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw-zip)|6.0BPW|7.66GB|7.4GB|9.0 GB|For >=10GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
32 |
+
[7.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw)/[7.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw-zip)|7.0BPW|8.89GB|8.6GB|10.2 GB|For >=11GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
33 |
+
[8.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw)/[8.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw-zip)|8.0BPW|10.1GB|9.7GB|11.3 GB|For >=12GB VRAM cards with idle VRAM usage at or below 500MB (headroom for other things)
|
34 |
|
35 |
## Calibration Dataset
|
36 |
- [argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)
|