cicdatopea committed on
Update README.md
README.md CHANGED
````diff
@@ -7,7 +7,7 @@ datasets:
 
 ## Model Card Details
 
-This model is an int4 model with group_size 128 and symmetric quantization of [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `
+This model is an int4 model with group_size 128 and symmetric quantization of [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `4b5e415` to use AutoGPTQ format
 
 ## Inference on CPU/HPU/CUDA
 
@@ -28,7 +28,7 @@ model = AutoModelForCausalLM.from_pretrained(
     quantized_model_dir,
     torch_dtype='auto',
     device_map="auto",
-    ##revision="
+    ##revision="4b5e415", ##AutoGPTQ format
 )
 
 ##import habana_frameworks.torch.core as htcore ## uncommnet it for HPU
@@ -136,7 +136,7 @@ Here is the sample command to generate the model.
 
 ```bash
 auto-round \
-    --model
+    --model OLMo-2-1124-13B-Instruct \
     --device 0 \
     --nsamples 512 \
     --model_dtype "fp16" \
````
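For context, the snippet patched by the second hunk can be run end to end as below. This is a minimal sketch, not part of the commit: `quantized_model_dir` and the prompt are placeholder assumptions, while `revision="4b5e415"` is the pin this commit documents for the AutoGPTQ-format weights (the README ships it commented out, so the default revision is used unless you opt in).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder (assumption): local path or Hub repo id of the quantized model.
quantized_model_dir = "OLMo-2-1124-13B-Instruct-int4"

model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype="auto",
    device_map="auto",
    revision="4b5e415",  # the pin this commit adds; selects the AutoGPTQ-format weights
)
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, revision="4b5e415")

# import habana_frameworks.torch.core as htcore  # uncomment for HPU, per the README

prompt = "What is deep learning?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Pinning the revision matters here because the repository serves different serialization formats from different commits; without the pin you get whatever format the head revision uses.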
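The third hunk only fills in the `--model` argument; the rest of the generation command is cut off in the diff. A plausible full invocation consistent with the model card's stated settings (int4, group_size 128, symmetric quantization, AutoGPTQ format) is sketched below. Every flag after `--model_dtype` is an assumption based on common auto-round usage, not part of this commit, so verify the flag names against `auto-round --help` for your installed version.

```bash
auto-round \
    --model OLMo-2-1124-13B-Instruct \
    --device 0 \
    --nsamples 512 \
    --model_dtype "fp16" \
    --bits 4 \
    --group_size 128 \
    --sym \
    --format "auto_gptq" \
    --output_dir "./tmp_autoround"
```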