- [VAGOsolutions/SauerkrautLM-SOLAR-Instruct](https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct)
- [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)

## Quantizations
Measured using ExLlamaV2 at 4096 max_seq_len with [Oobabooga's Text Generation WebUI](https://github.com/oobabooga/text-generation-webui/tree/main).

Use [TheBloke's 4bit-32g quants](https://huggingface.co/TheBloke/Sakura-SOLAR-Instruct-GPTQ/tree/gptq-4bit-32g-actorder_True) (7.4 GB VRAM usage) if you have an 8 GB card.

| Branch | BPW | Folder Size | VRAM Usage | Description |
| ------ | --- | ----------- | ---------- | ----------- |
| [3.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw) | 3.0 | 4.01 GB | 5.1 GB | For cards with >=6 GB VRAM and idle VRAM usage at or below 500 MB (leaves headroom for other tasks) |
| [7.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw) | 7.0 | 8.89 GB | 10.2 GB | For cards with >=11 GB VRAM and idle VRAM usage at or below 500 MB (leaves headroom for other tasks) |
| [8.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw) | 8.0 | 10.1 GB | 11.3 GB | For cards with >=12 GB VRAM and idle VRAM usage at or below 500 MB (leaves headroom for other tasks) |

## Zipped Quantizations (single file, smaller download)
| Branch | File Size |
| ------ | --------- |
| [3.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw-zip) | 3.72 GB |
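As a quick sanity check, the headroom rule described in the table can be sketched as a small Python helper. This is an illustrative sketch, not part of the repo: the VRAM numbers are copied from the measurements above, and the function name and the 0.5 GB idle-VRAM default are assumptions.

```python
# Measured VRAM usage (GB) at 4096 max_seq_len, copied from the table above.
VRAM_USAGE_GB = {"3.0bpw": 5.1, "7.0bpw": 10.2, "8.0bpw": 11.3}

def fits(branch: str, card_vram_gb: float, idle_vram_gb: float = 0.5) -> bool:
    """True if the quant's measured usage plus idle VRAM fits on the card."""
    return VRAM_USAGE_GB[branch] + idle_vram_gb <= card_vram_gb

print(fits("3.0bpw", 6.0))   # True: 5.1 + 0.5 = 5.6 GB fits in 6 GB
print(fits("8.0bpw", 11.0))  # False: 11.3 + 0.5 = 11.8 GB exceeds 11 GB
```

If your card idles above 500 MB (e.g. driving multiple monitors), pass your actual idle usage to get a more realistic estimate.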