In my opinion (though I created it, so I am probably biased), this is one of the best language models overall for its size of only 362M parameters. I trained it on many popular prompts (e.g. counting the 'R's in strawberry). It is better than the preview version at roleplay, common-sense reasoning, and understanding the prompt. NanoGPT-Chat is designed to be a general-purpose instruction-tuned language model that is transparent while maintaining a friendly tone. Feedback and suggestions are very welcome.

When the model finishes its response it generates an "END" token, so keep that in mind when using the model and set "END" as the stopping string.

To use this with transformers:

import torch
from transformers import pipeline, StoppingCriteria, StoppingCriteriaList

# 1. Define a custom StoppingCriteria
class StopOnWord(StoppingCriteria):
    def __init__(self, stop_string: str, tokenizer):
        self.stop_string = stop_string
        self.tokenizer = tokenizer

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode the current generation to text
        decoded_text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        # Return True if the stop word is found
        return self.stop_string in decoded_text

# 2. Create the pipeline as usual
gen_pipeline = pipeline(
    "text-generation",
    model="ezcz/NanoGPT-Chat",
    model_kwargs={"torch_dtype": "auto"},
    device_map="auto",
)

# 3. Prepare the messages
messages = [
    {"role": "system", "content": "Answer like a pirate."},
    {"role": "user", "content": "How many letter 'R's are in strawberry?"},
]

# 4. Instantiate the stopping criteria
stop_criteria = StoppingCriteriaList([StopOnWord("END", gen_pipeline.tokenizer)])

# 5. Generate with custom stopping
outputs = gen_pipeline(
    messages,
    max_new_tokens=128,
    stopping_criteria=stop_criteria,
)

# 6. Retrieve the assistant's reply (with chat-style input,
#    "generated_text" is the full message list)
full_text = outputs[0]["generated_text"][-1]["content"]

# 7. (Optional) If you'd like to remove "END" itself from the final output
if "END" in full_text:
    full_text = full_text.split("END")[0]

print(full_text)
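The stopping logic above can be checked on its own, without downloading the model: the criteria object decodes everything generated so far and returns True as soon as the marker appears. A minimal stand-alone sketch (DummyTokenizer is a hypothetical stub, for illustration only):

```python
# Stub tokenizer mapping token ids to text pieces (hypothetical, for illustration).
class DummyTokenizer:
    def __init__(self, vocab):
        self.vocab = vocab

    def decode(self, ids, skip_special_tokens=True):
        return "".join(self.vocab[i] for i in ids)

# Same shape as the StopOnWord criteria used with the pipeline above.
class StopOnWord:
    def __init__(self, stop_string, tokenizer):
        self.stop_string = stop_string
        self.tokenizer = tokenizer

    def __call__(self, input_ids, scores=None, **kwargs):
        # Decode the running generation and check for the stop marker.
        return self.stop_string in self.tokenizer.decode(input_ids[0])

tok = DummyTokenizer({0: "Three", 1: " R's", 2: " END"})
stop = StopOnWord("END", tok)
print(stop([[0, 1]]))     # → False: marker not generated yet
print(stop([[0, 1, 2]]))  # → True: "END" appeared, generation halts
```

During real generation, transformers calls the criteria after each new token, so the model stops at the first step where this check returns True.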

Make sure transformers is installed first (pip install transformers).
