This model is, in my opinion (though as its creator I am probably biased), one of the best language models overall for its size of only 362M parameters. I trained it on many popular prompts (e.g. counting the 'R's in "strawberry"). Compared to the preview version, it is better at roleplay, common sense reasoning, and understanding the prompt. NanoGPT-Chat is designed to be a general-purpose instruction-tuned language model that is transparent while maintaining a friendly tone. I would be very happy to hear feedback or suggestions.
When the model finishes writing its response it generates an "END" token, so keep that in mind when using the model and set "END" as the stopping string.
To use this model with `transformers`:

```python
import torch
from transformers import pipeline, StoppingCriteria, StoppingCriteriaList

# 1. Define a custom StoppingCriteria
class StopOnWord(StoppingCriteria):
    def __init__(self, stop_string: str, tokenizer):
        self.stop_string = stop_string
        self.tokenizer = tokenizer

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode the current generation to text
        decoded_text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        # Return True if the stop word is found
        return self.stop_string in decoded_text

# 2. Create the pipeline as usual
gen_pipeline = pipeline(
    "text-generation",
    model="ezcz/NanoGPT-Chat",
    model_kwargs={"torch_dtype": "auto"},
    device_map="auto",
)

# 3. Prepare the messages
messages = [
    {"role": "system", "content": "Answer like a pirate."},
    {"role": "user", "content": "How many letter 'R's are in strawberry?"},
]

# 4. Instantiate the stopping criteria
stop_criteria = StoppingCriteriaList([StopOnWord("END", gen_pipeline.tokenizer)])

# 5. Generate with custom stopping
outputs = gen_pipeline(
    messages,
    max_new_tokens=128,
    stopping_criteria=stop_criteria,
)

# 6. Retrieve the assistant reply (chat pipelines return the full message list)
full_text = outputs[0]["generated_text"][-1]["content"]

# 7. (Optional) Remove "END" itself from the final output
if "END" in full_text:
    full_text = full_text.split("END")[0]

print(full_text)
```
Make sure the library is installed first by running `pip install transformers` in the command prompt.
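The stopping check in `StopOnWord` can be exercised without downloading the model. Here is a minimal sketch that replicates its decode-then-substring-match logic using a stub tokenizer; the `FakeTokenizer` and `should_stop` names are illustrative assumptions, not part of `transformers`:

```python
class FakeTokenizer:
    """Stub tokenizer for illustration: token ids are just character codes."""
    def decode(self, ids, skip_special_tokens=True):
        return "".join(chr(i) for i in ids)

def should_stop(input_ids, tokenizer, stop_string="END"):
    # Same check StopOnWord performs: decode the batch's first sequence,
    # then look for the stop string in the decoded text
    decoded = tokenizer.decode(input_ids[0], skip_special_tokens=True)
    return stop_string in decoded

tok = FakeTokenizer()
mid_generation = [[ord(c) for c in "Arr, there be three R's"]]
finished = [[ord(c) for c in "Arr, there be three R's END"]]

print(should_stop(mid_generation, tok))  # False: keep generating
print(should_stop(finished, tok))        # True: halt generation
```

The same substring match is what the real criteria object runs on every generation step, which is why partial generations keep going until "END" appears anywhere in the decoded text.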
Model tree for ezcz/NanoGPT-Chat
- Base model: HuggingFaceTB/SmolLM2-360M-Instruct