# This model has been xMADified!

This repository contains [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) quantized from 16-bit floats to 4-bit integers, using xMAD.ai proprietary technology.
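To make the 16-bit-to-4-bit step concrete, here is a generic group-wise affine int4 quantize/dequantize sketch. This is an illustration of the general technique only; xMAD.ai's proprietary scheme is not public and will differ. The group size of 128 is an assumption, not taken from this card.

```python
import numpy as np

def quantize_int4(w, group_size=128):
    """Generic group-wise affine quantization to 4-bit codes (0..15)."""
    w = w.reshape(-1, group_size)
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = (hi - lo) / 15.0                              # 4-bit range has 16 levels
    q = np.clip(np.round((w - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo

def dequantize_int4(q, scale, lo):
    """Reconstruct approximate float weights from codes, scales, offsets."""
    return q * scale + lo

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale, lo = quantize_int4(w)
w_hat = dequantize_int4(q, scale, lo).reshape(-1)
print("max abs reconstruction error:", float(np.abs(w - w_hat).max()))
```

The reconstruction error per weight is bounded by half the group's scale, which is why group-wise (rather than per-tensor) scaling keeps 4-bit models close to full-precision quality.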
# Why should I use this model?

2. **Accuracy:** This xMADified model preserves the quality of the full-precision model. The table below reports zero-shot accuracy on popular benchmarks for this xMADified model against the [neuralmagic](https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16)-quantized model (the same model size, for a fair comparison). The xMAD.ai model offers higher accuracy than the neuralmagic model on every benchmark.
| Model | Size | MMLU | Arc Challenge | Arc Easy | LAMBADA Standard | LAMBADA OpenAI | PIQA | WinoGrande | HellaSwag |
|---|---|---|---|---|---|---|---|---|---|
| [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) | 16.1 GB | 68.05 | 51.71 | 81.90 | 66.18 | 73.55 | 79.87 | 73.72 | 59.10 |
| [neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16](https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16) | 5.7 GB | 64.82 | 47.78 | 78.66 | 62.95 | 70.41 | 78.67 | 72.61 | 58.04 |
| xmadai/Llama-3.1-8B-Instruct-xMADai-INT4 (this model) | 5.7 GB | **66.83** | **52.30** | **82.11** | **65.73** | **73.30** | **79.87** | **72.77** | **58.49** |
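The sizes in the table can be sanity-checked with a back-of-the-envelope estimate. The assumptions below are not from this card: roughly 8.03B total parameters, embedding and lm-head weights kept in fp16, and GPTQ-style groups of 128 weights each carrying one fp16 scale and one 4-bit zero point.

```python
GB = 1e9

total_params = 8.03e9                       # assumed total parameter count
embed_params = 2 * 128_256 * 4_096          # input embeddings + lm_head (assumed fp16)
quant_params = total_params - embed_params  # weights actually quantized to 4-bit

# 16-bit baseline: 2 bytes per parameter.
fp16_size = total_params * 2 / GB

# 4-bit weights plus per-group overhead: one fp16 scale (16 bits) and one
# 4-bit zero point per group of 128 weights.
overhead_bits = (16 + 4) / 128
int4_size = (quant_params * (4 + overhead_bits) / 8 + embed_params * 2) / GB

print(f"fp16: {fp16_size:.1f} GB, int4: {int4_size:.1f} GB")
```

Under these assumptions the estimate lands on roughly 16.1 GB for the fp16 model and 5.7 GB for the 4-bit one, matching the table.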
# How to Run Model
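The card's own run instructions continue beyond this excerpt. As a minimal sketch only — assuming, unverified, that this INT4 checkpoint loads through the standard `transformers` `from_pretrained` path (check the card's actual instructions for the supported loader):

```python
MODEL_ID = "xmadai/Llama-3.1-8B-Instruct-xMADai-INT4"  # model id from the table above

if __name__ == "__main__":
    # Heavy, network-dependent calls are guarded so the module imports cheaply.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer("What is quantization?", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```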