hgloow committed
Commit d23fbb2 · 1 Parent(s): f2ae149

Update README.md

Files changed (1): README.md (+2, −3)

README.md CHANGED
@@ -18,11 +18,10 @@ tags:
 - [VAGOsolutions/SauerkrautLM-SOLAR-Instruct](https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct)
 - [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)
 
-## EXL2 Quants
+## Quantizations
 Measured using ExLlamaV2 and 4096 max_seq_len with [Oobabooga's Text Generation WebUI](https://github.com/oobabooga/text-generation-webui/tree/main).
 
 Use [TheBloke's 4bit-32g quants](https://huggingface.co/TheBloke/Sakura-SOLAR-Instruct-GPTQ/tree/gptq-4bit-32g-actorder_True) (7.4GB VRAM usage) if you have 8GB cards.
-### Quantizations
 | Branch | BPW | Folder Size | VRAM Usage | Description |
 | ------ | --- | ----------- | ---------- | ----------- |
 [3.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw)|3.0BPW|4.01GB|5.1 GB|For >=6GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
@@ -31,7 +30,7 @@ Use [TheBloke's 4bit-32g quants](https://huggingface.co/TheBloke/Sakura-SOLAR-In
 [7.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/7.0bpw)|7.0BPW|8.89GB|10.2 GB|For >=11GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
 [8.0bpw](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/8.0bpw)|8.0BPW|10.1GB|11.3 GB|For >=12GB VRAM cards with idle VRAM atleast or below 500MB (headroom for other things)
 
-### Zipped Quantizations (if you want to download a single file, smaller to download)
+## Zipped Quantizations (if you want to download a single file, smaller to download)
 | Branch | File Size |
 | ------ | --------- |
 [3.0bpw-zip](https://huggingface.co/hgloow/Sakura-SOLAR-Instruct-EXL2/tree/3.0bpw-zip)|3.72GB
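As a rough sanity check on the folder sizes in the table above: an EXL2 quant stores roughly `params × bpw / 8` bytes of weights, plus overhead for embeddings and metadata, so measured folder sizes deviate somewhat from this first-order estimate. A minimal sketch, assuming ~10.7e9 parameters for SOLAR-10.7B (the exact count is not stated in this diff):

```python
def approx_quant_size_gb(n_params: float, bpw: float) -> float:
    """First-order EXL2 quant size: n_params weights at bpw bits each,
    reported in GB (1e9 bytes). Ignores embeddings/metadata overhead."""
    return n_params * bpw / 8 / 1e9

# Assumed parameter count for SOLAR-10.7B; real quant folders differ
# by a few percent from these estimates.
N_PARAMS = 10.7e9
for bpw in (3.0, 7.0, 8.0):
    print(f"{bpw:.1f} bpw -> ~{approx_quant_size_gb(N_PARAMS, bpw):.2f} GB")
```

The 3.0 bpw estimate (~4.01 GB) lands on the listed folder size; the higher-bpw branches drift further from the naive formula, which is expected since bpw in ExLlamaV2 is an average over mixed-precision layers.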