Update README.md
README.md
CHANGED
@@ -7,13 +7,11 @@ inference: true
 - 7B parameters
 - 4-bit quantized
 - Based on version 1.1
-- Used
-- Uncensored variant is available, but it's based on version 1.0 (worse quality wise)
-- For q4_2, "Q4_2 ARM #1046" was used. Will update regularly if new changes are made.
+- Used best available quantization for each format
 - **Choosing between q4_0, q4_1, and q4_2:**
 - 4_0 is the fastest. The quality is the poorest.
-- 4_1 is
-- 4_2
+- 4_1 is slower. The quality is noticeably better.
+- 4_2 generally offers the best speed to quality ratio. The drawback is that the format is WIP.
 
 - 13B version of this can be found here: https://huggingface.co/eachadea/ggml-vicuna-13b-1.1
 <br>