hgloow committed on
Commit
3bb735e
·
1 Parent(s): d23fbb2

Update README.md

Files changed (1)
  1. README.md +9 -16
README.md CHANGED
@@ -21,23 +21,16 @@ tags:
  ## Quantizations
  Measured using ExLlamaV2 and 4096 max_seq_len with [Oobabooga's Text Generation WebUI](https://github.com/oobabooga/text-generation-webui/tree/main).

+ I also provided zipped quantization because a lot of people find gguf single download convenient. Zipped quantization is also smaller in size to download.
+
  Use [TheBloke's 4bit-32g quants](https://huggingface.co/TheBloke/Sakura-SOLAR-Instruct-GPTQ/tree/gptq-4bit-32g-actorder_True) (7.4GB VRAM usage) if you have 8GB cards.
- | Branch | BPW | Folder Size | VRAM Usage | Description |
- | ------ | --- | ----------- | ---------- | ----------- |
- [3.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw)|3.0BPW|4.01GB|5.1 GB|For >=6GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
- [5.0bpw (main)](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/main)|5.0BPW|6.45GB|7.7 GB|For >=10GB VRAM cards
- [6.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw)|6.0BPW|7.66GB|9.0 GB|For >=10GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
- [7.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw)|7.0BPW|8.89GB|10.2 GB|For >=11GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
- [8.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw)|8.0BPW|10.1GB|11.3 GB|For >=12GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
-
- ## Zipped Quantizations (if you want to download a single file, smaller to download)
- | Branch | File Size |
- | ------ | --------- |
- [3.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw-zip)|3.72GB
- [5.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/5.0bpw-zip)|6.3GB
- [6.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw-zip)|7.4GB
- [7.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw-zip)|8.6GB
- [8.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw-zip)|9.7GB
+ | Branch | BPW | Folder Size | Zipped File Size | VRAM Usage | Description |
+ | ------ | --- | ----------- | ---------------- | ---------- | ----------- |
+ [3.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw)/[3.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw-zip)|3.0BPW|4.01GB|3.72GB|5.1 GB|For >=6GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
+ [5.0bpw (main)](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/main)/[5.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/5.0bpw-zip)|5.0BPW|6.45GB|6.3GB|7.7 GB|For >=10GB VRAM cards
+ [6.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw)/[6.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/6.0bpw-zip)|6.0BPW|7.66GB|7.4GB|9.0 GB|For >=10GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
+ [7.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw)/[7.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw-zip)|7.0BPW|8.89GB|8.6GB|10.2 GB|For >=11GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
+ [8.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw)/[8.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw-zip)|8.0BPW|10.1GB|9.7GB|11.3 GB|For >=12GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)

  ## Calibration Dataset
  - [argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)
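
As a rough illustration of how the VRAM figures in the quantization table might be used, the Python sketch below picks the highest-BPW branch whose measured VRAM usage fits on a given card. The function name `pick_branch`, the `idle_gb` parameter, and the fit rule (measured usage plus idle VRAM must not exceed the card's total) are assumptions made for this example and are not part of the README. The numbers are copied from the table, which was measured at 4096 max_seq_len, so other context lengths may need more memory.

```python
# VRAM usage in GB per branch, copied from the README table
# (measured with ExLlamaV2 at 4096 max_seq_len).
VRAM_USAGE_GB = {
    "3.0bpw": 5.1,
    "main": 7.7,    # the 5.0bpw quant lives on the main branch
    "6.0bpw": 9.0,
    "7.0bpw": 10.2,
    "8.0bpw": 11.3,
}


def pick_branch(card_gb, idle_gb=0.5):
    """Return the branch with the highest VRAM usage that still fits.

    Hypothetical helper: `card_gb` is the card's total VRAM and `idle_gb`
    is VRAM already used by other things (the README assumes about 500 MB).
    Returns None when no branch fits.
    """
    budget = card_gb - idle_gb
    fitting = [
        (usage, branch)
        for branch, usage in VRAM_USAGE_GB.items()
        if usage <= budget
    ]
    if not fitting:
        return None
    # Higher VRAM usage corresponds to higher BPW in this table.
    return max(fitting)[1]
```

With the default 500 MB idle allowance, `pick_branch(10)` returns `"6.0bpw"` and `pick_branch(8)` falls back to `"3.0bpw"`, since the 5.0bpw quant's 7.7 GB does not fit. For 8GB cards the README instead points to TheBloke's 4bit-32g GPTQ quants.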