Files changed (1) hide show
  1. README.md +4 -0
README.md CHANGED
@@ -55,6 +55,7 @@ Since ModernBERT is a Masked Language Model (MLM), you can use the `fill-mask` p
55
  **⚠️ If your GPU supports it, we recommend using ModernBERT with Flash Attention 2 to reach the highest efficiency. To do so, install Flash Attention as follows, then use the model as normal:**
56
 
57
  ```bash
 
58
  pip install flash-attn
59
  ```
60
 
@@ -66,6 +67,7 @@ from transformers import AutoTokenizer, AutoModelForMaskedLM
66
  model_id = "answerdotai/ModernBERT-base"
67
  tokenizer = AutoTokenizer.from_pretrained(model_id)
68
  model = AutoModelForMaskedLM.from_pretrained(model_id)
 
69
 
70
  text = "The capital of France is [MASK]."
71
  inputs = tokenizer(text, return_tensors="pt")
@@ -86,6 +88,8 @@ import torch
86
  from transformers import pipeline
87
  from pprint import pprint
88
 
 
 
89
  pipe = pipeline(
90
  "fill-mask",
91
  model="answerdotai/ModernBERT-base",
 
55
  **⚠️ If your GPU supports it, we recommend using ModernBERT with Flash Attention 2 to reach the highest efficiency. To do so, install Flash Attention as follows, then use the model as normal:**
56
 
57
  ```bash
58
+ # To load on CPU, you can skip this step.
59
  pip install flash-attn
60
  ```
61
 
 
67
  model_id = "answerdotai/ModernBERT-base"
68
  tokenizer = AutoTokenizer.from_pretrained(model_id)
69
  model = AutoModelForMaskedLM.from_pretrained(model_id)
70
+ # For CPU, use: model = AutoModelForMaskedLM.from_pretrained(model_id, reference_compile=False)
71
 
72
  text = "The capital of France is [MASK]."
73
  inputs = tokenizer(text, return_tensors="pt")
 
88
  from transformers import pipeline
89
  from pprint import pprint
90
 
91
+ # To load on CPU, set reference_compile=False
92
+
93
  pipe = pipeline(
94
  "fill-mask",
95
  model="answerdotai/ModernBERT-base",