cicdatopea committed on
Update README.md
README.md CHANGED
````diff
@@ -7,7 +7,7 @@ datasets:
 
 ## Model Card Details
 
-This model is an int4 model with group_size 128 and symmetric quantization of [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `
+This model is an int4 model with group_size 128 and symmetric quantization of [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `4b5e415` to use AutoGPTQ format
 
 ## Inference on CPU/HPU/CUDA
 
@@ -28,7 +28,7 @@ model = AutoModelForCausalLM.from_pretrained(
     quantized_model_dir,
     torch_dtype='auto',
     device_map="auto",
-    ##revision="
+    ##revision="4b5e415", ##AutoGPTQ format
 )
 
 ##import habana_frameworks.torch.core as htcore ## uncommnet it for HPU
@@ -136,7 +136,7 @@ Here is the sample command to generate the model.
 
 ```bash
 auto-round \
-    --model
+    --model OLMo-2-1124-13B-Instruct \
     --device 0 \
     --nsamples 512 \
     --model_dtype "fp16" \
````
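For context, the snippet patched by the second hunk can be run end to end as below. This is a minimal sketch, not part of the commit: `quantized_model_dir` and the prompt are placeholder assumptions, while `revision="4b5e415"` is the pin this commit documents for the AutoGPTQ-format weights (the README ships it commented out, so the default revision is used unless you opt in).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder (assumption): local path or Hub repo id of the quantized model.
quantized_model_dir = "OLMo-2-1124-13B-Instruct-int4"

model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype="auto",
    device_map="auto",
    revision="4b5e415",  # the pin this commit adds; selects the AutoGPTQ-format weights
)
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, revision="4b5e415")

# import habana_frameworks.torch.core as htcore  # uncomment for HPU, per the README

prompt = "What is deep learning?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Pinning the revision matters here because the repository serves different serialization formats from different commits; without the pin you get whatever format the head revision uses.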
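The third hunk only fills in the `--model` argument; the rest of the generation command is cut off in the diff. A plausible full invocation consistent with the model card's stated settings (int4, group_size 128, symmetric quantization, AutoGPTQ format) is sketched below. Every flag after `--model_dtype` is an assumption based on common auto-round usage, not part of this commit, so verify the flag names against `auto-round --help` for your installed version.

```bash
auto-round \
    --model OLMo-2-1124-13B-Instruct \
    --device 0 \
    --nsamples 512 \
    --model_dtype "fp16" \
    --bits 4 \
    --group_size 128 \
    --sym \
    --format "auto_gptq" \
    --output_dir "./tmp_autoround"
```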