amanrangapur committed
Commit
0836bb8
1 Parent(s): 0de69f3

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -56,7 +56,7 @@ For faster performance, you can quantize the model using the following method:
 ```python
 AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-13B",
     torch_dtype=torch.float16,
-    load_in_8bit=True)  # Requires bitsandbytes package
+    load_in_8bit=True)  # Requires bitsandbytes
 ```
 The quantized model is more sensitive to data types and CUDA operations. To avoid potential issues, it's recommended to pass the inputs directly to CUDA using:
 ```python