amanrangapur committed
Commit 0836bb8 · 1 Parent(s): 0de69f3
Update README.md
README.md CHANGED
@@ -56,7 +56,7 @@ For faster performance, you can quantize the model using the following method:
 ```python
 AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-13B",
     torch_dtype=torch.float16,
-    load_in_8bit=True) # Requires bitsandbytes
+    load_in_8bit=True) # Requires bitsandbytes
 ```
 The quantized model is more sensitive to data types and CUDA operations. To avoid potential issues, it's recommended to pass the inputs directly to CUDA using:
 ```python
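Taken end to end, the hunk above loads the model in 8-bit precision and then warns that inputs should be moved to CUDA explicitly. Below is a minimal, self-contained sketch of that flow, assuming `transformers`, `torch`, and `bitsandbytes` are installed and a CUDA GPU is available; the prompt string and generation settings are illustrative, not from the README.

```python
# Minimal sketch of the quantized-loading path shown in the diff above.
# Assumes transformers, torch, and bitsandbytes are installed and a CUDA
# GPU is present; the prompt text is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-13B")
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-1124-13B",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # Requires bitsandbytes
)

inputs = tokenizer("Language modeling is ", return_tensors="pt")
# Move the input ids to CUDA explicitly, as the README recommends for
# the quantized model, to avoid device/dtype mismatches.
input_ids = inputs.input_ids.to("cuda")
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Moving `inputs.input_ids` to CUDA by hand, rather than relying on implicit placement, sidesteps the data-type and device sensitivity the README attributes to the 8-bit model.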