Model Card: Mistral-Nemo-Instruct-2407_ORPO, Fine-Tuned for Text-to-SQL

Model Overview

Model Name

Mistral-Nemo-Instruct-2407_ORPO

Base Model

Mistral-NeMo-Instruct (mistralai/Mistral-Nemo-Instruct-2407)

Purpose

This model was fine-tuned to improve the accuracy of translating natural language queries into SQL statements, with non-technical users as the target audience. The fine-tuning process compared two methodologies: Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO).

Fine-Tuning Methods

Direct Preference Optimization (DPO)

  • Optimizes the policy directly on preference pairs ("selected" vs. "rejected" outputs), without training a separate reward model.
  • A weighting term balances preference alignment against divergence from the frozen reference model (a minimal loss sketch follows below).
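
The following is a minimal PyTorch sketch of the DPO loss, included for illustration only; it is not the training code used for this model, and the tensor names and beta value are assumptions:

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Illustrative DPO loss over a batch of preference pairs.

    Each argument holds the summed log-probabilities of the "selected"
    (chosen) or "rejected" completion under the trained policy or the
    frozen reference model.
    """
    # Implicit rewards: scaled log-ratio of policy to reference model
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the chosen reward above the rejected reward
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()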

Odds Ratio Preference Optimization (ORPO)

  • Leverages binary preference data and an odds ratio-based penalty added to the supervised fine-tuning loss.
  • Eliminates the need for a reward model or a separate reference model, offering higher efficiency and scalability (see the sketch below).
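
For comparison, an illustrative sketch of the ORPO objective; again, this is not the actual training code, and lam and the log-probability inputs are assumed names:

import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps, rejected_logps, nll_chosen, lam=0.1):
    """Illustrative ORPO objective: the supervised fine-tuning loss on
    the chosen answer plus an odds-ratio penalty, with no reference model.

    chosen_logps / rejected_logps are mean per-token log-probabilities of
    each completion under the single model being trained.
    """
    def log_odds(logp):
        # log(p / (1 - p)) computed in log space
        return logp - torch.log1p(-torch.exp(logp))

    # Penalize the model when the rejected answer has higher odds
    ratio = log_odds(chosen_logps) - log_odds(rejected_logps)
    odds_ratio_loss = -F.logsigmoid(ratio).mean()

    # Combine with the standard negative log-likelihood (SFT) term
    return nll_chosen + lam * odds_ratio_loss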

Dataset

Training Dataset

  • Source: Synthetic Text-to-SQL dataset from Gretel AI
  • Size: 89,495 entries
  • Focus: Data Query Language (DQL) instructions and complex SQL queries, including joins, window functions, and set operations (a loading example follows below).
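
The dataset can be pulled from the Hugging Face Hub; a minimal loading example, assuming the Hub id gretelai/synthetic_text_to_sql:

from datasets import load_dataset

# Download the synthetic Text-to-SQL training split (Hub id assumed)
ds = load_dataset("gretelai/synthetic_text_to_sql", split="train")

# Inspect one entry: natural language prompt, schema context, and gold SQL
print(ds[0])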

Evaluation Dataset

  • Source: Mini-Dev dataset from the BIRD benchmark
  • Size: 500 Text-to-SQL pairs
  • Complexity Levels: Simple, Medium, Challenging

Evaluation

Metrics

  • Execution Accuracy (EX): percentage of predicted SQL queries whose execution results match those of the gold queries (sketched below).
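
A minimal sketch of how EX can be computed with sqlite3; this is illustrative rather than the official BIRD evaluation harness, and the (predicted, gold) pair format is an assumption:

import sqlite3

def execution_accuracy(pairs, db_path):
    """Fraction of predictions whose result set matches the gold query's."""
    conn = sqlite3.connect(db_path)
    correct = 0
    for predicted_sql, gold_sql in pairs:
        try:
            pred_rows = set(conn.execute(predicted_sql).fetchall())
            gold_rows = set(conn.execute(gold_sql).fetchall())
            correct += pred_rows == gold_rows
        except sqlite3.Error:
            # Queries that fail to execute count as incorrect
            pass
    conn.close()
    return correct / len(pairs)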

Results

Model                               Execution Accuracy (EX)
Mistral-NeMo-Instruct (Base)        baseline
DPO Fine-Tuned Model                +0.86% vs. base
ORPO Fine-Tuned Model               +41.38% vs. base
ORPO Fine-Tuned vs. Codestral-22B   +35.54% in favor of ORPO

[Figure: Evaluation Accuracy Comparison]

Model Use

Requirements

  • Python 3.10+
  • PyTorch 2.4+
  • CUDA 12.1 (a quick version check follows below)
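
These versions can be verified from Python; a small illustrative check:

import sys
import torch

print(sys.version)                # expect 3.10+
print(torch.__version__)          # expect 2.4+
print(torch.version.cuda)         # expect 12.1
print(torch.cuda.is_available())  # True when a CUDA GPU is visible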

Inference Example

from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

# Load the base model and apply the fine-tuned PEFT adapter
peft_config = PeftConfig.from_pretrained("JHuel/Mistral-Nemo-Instruct-2407_DPO_qlora")
model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, "JHuel/Mistral-Nemo-Instruct-2407_DPO_qlora")

# Load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

# Wrap model and tokenizer in a text-generation pipeline
chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Input a natural language query (example prompt)
messages = [
    {"role": "user", "content": "List the names of all customers who placed an order in 2024."},
]
response = chatbot(messages)[0]["generated_text"]

print(response)

Limitations

  • The model may not handle queries involving highly specialized or domain-specific SQL operations.
  • Training data was limited to synthetic datasets; real-world performance may vary.

Ethical Considerations

  • Bias: The training dataset was synthetic and may not fully represent real-world linguistic diversity.
  • Misuse: The model is intended for assisting in SQL generation and should not be used for tasks requiring high levels of security or privacy without additional safeguards.

Citation

If you use this model in your research or applications, please cite:

@article{JHuelsEKeuchel,
  title={Evaluation of Fine-Tuning Methods: DPO and ORPO for Text-to-SQL},
  author={Jonathan Hüls and Elina Keuchel},
  year={2025}
}

License

The model is released under the Apache 2.0 license.
