Oscar Wu committed · Commit d1867a5
Parent(s): 16e0a5a

Updated README
README.md CHANGED
@@ -11,7 +11,7 @@ This repository contains [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingfac
 
 # Why should I use this model?
 
-1. **Accuracy:** This xMADified model is the best **quantized** version of the `meta-llama/Llama-3.1-8B-Instruct` model. We **crush
+1. **Accuracy:** This xMADified model is the best **quantized** version of the `meta-llama/Llama-3.1-8B-Instruct` model. We **crush the most downloaded quantized** version(s) (see _Table 1_ below).
 
 2. **Memory-efficiency:** The full-precision model is around 16 GB, while this xMADified model is only 5.7 GB, making it feasible to run on a 8 GB GPU.