Model description

A version of Chocolatine-3B specialized in French culinary language, fine-tuned from microsoft/Phi-3.5-mini-instruct.
This model is based on 283 specific French culinary terms and their definitions.

Fine-tuning

For this version of the model, I experimented with a two-stage training method: supervised fine-tuning (SFT) followed by direct preference optimization (DPO).
I generated two datasets exclusively for this model, using GPT-4o deployed on Azure OpenAI.
The challenge was to achieve consistent alignment between the two fine-tuning methods:
SFT teaches the terms, and DPO reinforces the understanding acquired during that first stage.
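
To make the two-stage setup more concrete, here is a minimal sketch of how one term/definition pair might feed both datasets. The actual schema of the author's datasets is not given in this card, so the field names below are assumptions: `prompt`/`completion` for SFT, and `prompt`/`chosen`/`rejected` for DPO, following a convention commonly used by preference-tuning libraries. The helper names and the example answers are hypothetical. Sharing a single prompt builder is one simple way to keep the two stages consistent with each other.

```python
def make_prompt(term: str) -> str:
    # Hypothetical question wording; the real datasets may phrase it differently.
    return f"En cuisine, que signifie le terme « {term} » ?"


def make_sft_record(term: str, definition: str) -> dict:
    # SFT example: the question paired with the reference definition.
    return {"prompt": make_prompt(term), "completion": definition}


def make_dpo_record(term: str, definition: str, weaker_answer: str) -> dict:
    # DPO example: same prompt, with a preferred and a dispreferred answer.
    return {
        "prompt": make_prompt(term),
        "chosen": definition,
        "rejected": weaker_answer,
    }


# Illustrative values only, not taken from the author's datasets.
term = "blanchir"
definition = "Plonger brièvement un aliment dans de l'eau bouillante."
wrong = "Faire cuire un aliment au four jusqu'à ce qu'il soit doré."

sft_record = make_sft_record(term, definition)
dpo_record = make_dpo_record(term, definition, wrong)

print(sft_record)
print(dpo_record)
```

Because both records are built from the same prompt and the same reference definition, the answer preferred in the DPO stage matches what the SFT stage already taught.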

Fine-tuning was done with Unsloth, which saved processing time on a single T4 GPU (AzureML compute instance).

Usage

The recommended way to use the model is to load the low-rank adapter with Unsloth:

from unsloth import FastLanguageModel
from transformers import TextStreamer
import torch

model_name = "jpacifico/Chocolatine-Cook-3B-combined-SFT-DPO-v0.1"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name,
    max_seq_length=2048,
    dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    load_in_4bit=False
)

FastLanguageModel.for_inference(model)
model.eval()

def generate_response(user_question: str):
    messages = [
        {"role": "system", "content": "Tu es un assistant IA spécialisé dans le langage culinaire français. Une question te sera posée. Tu dois générer une réponse précise et concise."},
        {"role": "user", "content": "En cuisine "+user_question},
    ]

    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)  # follow the model's device instead of hard-coding "cuda"

    # The prompt is a single unpadded sequence, so every token should be attended to.
    attention_mask = torch.ones_like(inputs)

    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

    with torch.no_grad():
        _ = model.generate(
            input_ids=inputs,
            attention_mask=attention_mask,
            max_new_tokens=128,
            use_cache=True,
            streamer=text_streamer,
            do_sample=False,  # greedy decoding; temperature only applies when do_sample=True
        )
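
The function above is defined but never called, and it prepends "En cuisine " to whatever question it receives. The sketch below isolates that prompt-building step so it can be inspected without loading the model. `build_messages` and `SYSTEM_PROMPT` are hypothetical names introduced here for illustration; the message contents are copied from `generate_response`.

```python
SYSTEM_PROMPT = (
    "Tu es un assistant IA spécialisé dans le langage culinaire français. "
    "Une question te sera posée. Tu dois générer une réponse précise et concise."
)


def build_messages(user_question: str) -> list[dict]:
    # Same structure as in generate_response: a system turn, then a user turn
    # whose text is prefixed with "En cuisine ".
    question = user_question.strip()
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "En cuisine " + question},
    ]


# Because of the prefix, the question reads best when it starts in lowercase.
messages = build_messages("  que signifie le terme « blanchir » ?  ")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

With the model loaded as shown earlier, the same question could be passed directly as `generate_response("que signifie le terme « blanchir » ?")`, which streams the answer to standard output through the `TextStreamer`.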

Limitations

The Chocolatine model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.

  • Developed by: Jonathan Pacifico, 2024
  • License: MIT
  • Fine-tuned from model: microsoft/Phi-3.5-mini-instruct
  • Model size: 3.82B params (Safetensors, FP16)