onebitquantized committed
Commit 065217a · verified · 1 Parent(s): 9a33622

Update README.md

Files changed (1):
  1. README.md +6 -5
README.md CHANGED
@@ -7,7 +7,7 @@ base_model:
 
 # This model has been xMADified!
 
-This repository contains [`Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) quantized from 16-bit floats to 4-bit integers, using xMAD.ai proprietary technology.
+This repository contains [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) quantized from 16-bit floats to 4-bit integers, using xMAD.ai proprietary technology.
 
 # Why should I use this model?
 
@@ -15,10 +15,11 @@ This repository contains [`Llama-3.1-8B-Instruct`](https://huggingface.co/meta-l
 
 2. **Accuracy:** This xMADified model preserves the quality of the full-precision model. In the table below, we present the zero-shot accuracy on popular benchmarks of this xMADified model against the [neuralmagic](https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16)-quantized model (the same model size for a fair comparison). The xMADai model offers higher accuracy across all benchmarks.
 
-| Model | MMLU | Arc Challenge | Arc Easy | LAMBADA Standard | LAMBADA OpenAI | PIQA | WinoGrande | HellaSwag |
-|---|---|---|---|---|---|---|---|---|
-| [neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16](https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16) | 64.82 | 47.78 | 78.66 | 62.95 | 70.41 | 78.67 | 72.61 | 58.04 |
-| xmadai/Llama-3.1-8B-Instruct-xMADai-INT4 | **66.83** | **52.3** | **82.11** | **65.73** | **73.3** | **79.87** | **72.77** | **58.49** |
+| Model | Size | MMLU | Arc Challenge | Arc Easy | LAMBADA Standard | LAMBADA OpenAI | PIQA | WinoGrande | HellaSwag |
+|---|---|---|---|---|---|---|---|---|---|
+| [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) | 16.1 GB | 68.05 | 51.71 | 81.90 | 66.18 | 73.55 | 79.87 | 73.72 | 59.10 |
+| [neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16](https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16) | 5.7 GB | 64.82 | 47.78 | 78.66 | 62.95 | 70.41 | 78.67 | 72.61 | 58.04 |
+| xmadai/Llama-3.1-8B-Instruct-xMADai-INT4 (this model) | 5.7 GB | **66.83** | **52.30** | **82.11** | **65.73** | **73.30** | **79.87** | **72.77** | **58.49** |
 
 # How to Run Model
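
The new Size column in the diff above follows directly from the bit-width reduction the README describes (16-bit floats to 4-bit integers). A minimal sketch of that arithmetic, assuming a parameter count of roughly 8.03B for Llama-3.1-8B (the count is an assumption; only the GB figures come from the table):

```python
# Rough memory math behind the table's "Size" column.
# Assumption: ~8.03e9 parameters for Llama-3.1-8B (not stated in the README).
params = 8.03e9

fp16_gb = params * 2 / 1e9    # 16 bits = 2 bytes per weight
int4_gb = params * 0.5 / 1e9  # 4 bits = 0.5 bytes per weight

print(f"FP16: ~{fp16_gb:.1f} GB")  # ~16.1 GB, matching the table's full-precision row
print(f"INT4: ~{int4_gb:.1f} GB")  # ~4.0 GB for weights alone
```

The measured 5.7 GB checkpoint is larger than the bare 4-bit weight estimate because quantized formats typically also store per-group scales and zero-points, and some layers (such as embeddings) are commonly left unquantized.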