cicdatopea committed · Commit 9d9da92 · verified · 1 parent: 2731cfc

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -7,7 +7,7 @@ datasets:

  ## Model Card Details

- This model is an int4 model with group_size 128 and symmetric quantization of [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `90c15db` to use the AutoGPTQ format.
+ This model is an int4 model with group_size 128 and symmetric quantization of [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `4b5e415` to use the AutoGPTQ format.

  ## Inference on CPU/HPU/CUDA

@@ -28,7 +28,7 @@ model = AutoModelForCausalLM.from_pretrained(
      quantized_model_dir,
      torch_dtype='auto',
      device_map="auto",
-     ##revision="90c15db", ## AutoGPTQ format
+     ##revision="4b5e415", ## AutoGPTQ format
  )

  ##import habana_frameworks.torch.core as htcore ## uncomment it for HPU
@@ -136,7 +136,7 @@ Here is the sample command to generate the model.

  ```bash
  auto-round \
- --model allenai/OLMo-2-1124-13B-Instruct \
+ --model OLMo-2-1124-13B-Instruct \
  --device 0 \
  --nsamples 512 \
  --model_dtype "fp16" \
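For reference, here is the patched snippet assembled into a minimal, self-contained loading sketch. The repo id below is a placeholder (this page does not show the full OPEA checkpoint name), it assumes the AutoRound runtime (`pip install auto-round`) is installed, and the commented `revision` pin applies only when the AutoGPTQ-format weights are wanted:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

## Placeholder: substitute the actual OPEA quantized checkpoint id or a local path.
quantized_model_dir = "OPEA/OLMo-2-1124-13B-Instruct-int4-inc"  ## hypothetical repo id

model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype="auto",
    device_map="auto",
    ##revision="4b5e415",  ## uncomment to load the AutoGPTQ-format revision
)
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)

prompt = "What is int4 quantization?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```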
 
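The third hunk changes `--model` from the Hub id to a bare directory name, presumably pointing auto-round at a local copy of the base model rather than resolving the id on the Hub. A hedged sketch of producing that local copy first, using the standard `huggingface-cli download` flags:

```bash
## Fetch the FP16 base model into a local folder, then pass that folder to auto-round.
huggingface-cli download allenai/OLMo-2-1124-13B-Instruct \
    --local-dir OLMo-2-1124-13B-Instruct
```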