Minueza-32M-Deita
- Base model: Felladrin/Minueza-32M-Base
- Dataset: [ChatML] hkust-nlp/deita-10k-v0
- License: Apache License 2.0
Recommended Prompt Format
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
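The template above can also be assembled by hand, which helps when checking exactly what text reaches the model. The following is a minimal sketch, not the tokenizer's own chat template: the helper name `build_chatml_prompt` is hypothetical, and the exact newline placement is an assumption based on the layout shown above. In practice `tokenizer.apply_chat_template` should be treated as the authoritative renderer.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout shown above.

    Hypothetical helper for illustration; whitespace handling is an assumption.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example = build_chatml_prompt(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
)
print(example)
```

Comparing this string with the output of `apply_chat_template(..., tokenize=False, add_generation_prompt=True)` is a quick way to spot template mismatches.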
Recommended Inference Parameters
do_sample: true
temperature: 0.65
top_p: 0.55
top_k: 35
repetition_penalty: 1.176
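To give a feel for how `temperature`, `top_k`, and `top_p` interact, here is a small pure-Python sketch that applies them to a toy list of logits. It is an illustration of the general technique, not the code that `transformers` runs; the function name `filter_distribution` and the toy logits are assumptions, and `repetition_penalty` is left out for brevity.

```python
import math


def filter_distribution(logits, temperature=0.65, top_k=35, top_p=0.55):
    """Return {token_id: probability} after temperature, top-k and top-p filtering.

    Illustrative sketch only; token ids are simply positions in `logits`.
    """
    # Temperature: dividing by a value below 1 sharpens the distribution.
    scaled = [(token_id, logit / temperature) for token_id, logit in enumerate(logits)]

    # Top-k: keep only the k highest-scoring tokens.
    scaled.sort(key=lambda pair: pair[1], reverse=True)
    scaled = scaled[:top_k]

    # Softmax over the survivors (subtract the max for numerical stability).
    max_logit = scaled[0][1]
    weights = [(token_id, math.exp(logit - max_logit)) for token_id, logit in scaled]
    total = sum(weight for _, weight in weights)
    probs = [(token_id, weight / total) for token_id, weight in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, prob in probs:
        kept.append((token_id, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalise so the kept probabilities sum to 1.
    kept_total = sum(prob for _, prob in kept)
    return {token_id: prob / kept_total for token_id, prob in kept}


peaked = filter_distribution([2.0, 1.0, 0.5, -1.0])
flat = filter_distribution([1.0, 0.9, 0.8, 0.1])
print(sorted(peaked), sorted(flat))
```

With these settings a sharply peaked distribution can collapse to a single candidate token, while a flatter one keeps a few candidates, so the recommended values lean toward conservative sampling.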
Usage Example
from transformers import pipeline

# Load the model and its tokenizer into a text-generation pipeline.
generate = pipeline("text-generation", "Felladrin/Minueza-32M-Deita")

# Conversation history in the role/content format expected by the chat template.
messages = [
    {
        "role": "system",
        "content": "You are a highly knowledgeable and friendly assistant. Your goal is to understand and respond to user inquiries with clarity. Your interactions are always respectful, helpful, and focused on delivering the most accurate information to the user.",
    },
    {
        "role": "user",
        "content": "Hey! Got a question for you!",
    },
    {
        "role": "assistant",
        "content": "Sure! What is it?",
    },
    {
        "role": "user",
        "content": "What are some potential applications for quantum computing?",
    },
]

# Render the messages as a ChatML prompt string that ends with an open assistant turn.
prompt = generate.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate with the recommended inference parameters.
output = generate(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.65,
    top_k=35,
    top_p=0.55,
    repetition_penalty=1.176,
)

print(output[0]["generated_text"])
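By default the text-generation pipeline typically returns the prompt followed by the completion, so the printed text includes the whole conversation. The sketch below shows one way to keep only the new assistant reply. The helper name `extract_reply` is hypothetical, and the sample strings stand in for real model output. Passing `return_full_text=False` to the pipeline call is an alternative that avoids the echoed prompt.

```python
def extract_reply(generated_text, prompt):
    """Return only the newly generated text, dropping the echoed prompt.

    Hypothetical helper; assumes the output begins with the exact prompt string.
    """
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    # Stop at the end-of-turn marker if it survived decoding.
    return generated_text.split("<|im_end|>", 1)[0].strip()


# Stand-in strings for illustration, not real model output.
sample_prompt = "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant\n"
sample_output = sample_prompt + "Hello there!<|im_end|>"
reply = extract_reply(sample_output, sample_prompt)
print(reply)
```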
How it was trained
This model was trained with SFTTrainer using the following settings:
| Hyperparameter | Value |
|---|---|
| Epochs | 2 |
| Learning rate | 2e-5 |
| Total train batch size | 16 |
| Max. sequence length | 2048 |
| Weight decay | 0 |
| Warmup ratio | 0.1 |
| Optimizer | Adam with betas=(0.9, 0.999) and epsilon=1e-08 |
| Scheduler | cosine |
| Seed | 42 |
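The learning rate, warmup ratio, and cosine scheduler in the table combine into a schedule that can be sketched in a few lines. This is an illustration under stated assumptions rather than the trainer's actual code: the function name `lr_at_step` is hypothetical, the total of 1000 optimizer steps is made up for the example (the real step count is not given here), and the warmup is assumed to be linear from zero.

```python
import math


def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay.

    Hypothetical helper; base_lr and warmup_ratio come from the table above.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # assumed value for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

The rate peaks at the end of warmup (10% of training under the listed ratio) and falls to zero by the final step.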