Query Translator Mini

This repository contains a fine-tuned version of the Qwen 2.5 7B model, specialized in translating natural language queries into structured Orama search queries.

It is fine-tuned with LoRA through the PEFT library, so only a small adapter is trained and loaded on top of the base model weights, which keeps the fine-tune efficient.

Model Details

Model Description

The Query Translator Mini model is designed to convert natural language queries into structured JSON queries compatible with the Orama search engine.

It understands various data types and query operators, making it versatile for different search scenarios.

Key Features

Translates natural language to structured Orama queries
Supports multiple field types: string, number, boolean, enum, and arrays
Handles complex query operators: gt, gte, lt, lte, eq, between, containsAll
Supports nested properties with dot notation
Works with both full-text search and filtered queries
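
The operator rules behind these features can be checked mechanically. The following is a minimal sketch, not part of the model or of Orama: `flatten_schema` and `validate_query` are hypothetical helper names, and the operator table is copied from the rules in the system prompt shown under Usage (boolean fields are treated as plain values, as in that prompt's examples).

```python
# Operators allowed per field type, as listed in the system prompt rules.
OPERATORS = {
    "number": {"gt", "gte", "lt", "lte", "eq", "between"},
    "enum": {"eq", "in", "nin"},
    "enum[]": {"containsAll"},
}
# Types that are filtered with a plain value instead of an operator object.
PLAIN_VALUE_TYPES = {"string", "string[]", "boolean"}


def flatten_schema(schema, prefix=""):
    """Turn nested schema objects into dot-notation paths."""
    flat = {}
    for key, value in schema.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_schema(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def validate_query(query, schema):
    """Return a list of problems; an empty list means the query looks valid."""
    problems = []
    if not isinstance(query.get("term"), str):
        problems.append('"term" must be a string')
    fields = flatten_schema(schema)
    for path, condition in query.get("where", {}).items():
        field_type = fields.get(path)
        if field_type is None:
            problems.append(f"{path}: not in schema")
        elif field_type in PLAIN_VALUE_TYPES:
            if isinstance(condition, dict):
                problems.append(f"{path}: {field_type} takes a plain value")
        elif field_type in OPERATORS:
            if not isinstance(condition, dict):
                problems.append(f"{path}: {field_type} needs an operator object")
            else:
                for op in sorted(set(condition) - OPERATORS[field_type]):
                    problems.append(f"{path}: operator {op!r} not allowed on {field_type}")
        else:
            problems.append(f"{path}: filtering on {field_type} is not supported")
    return problems


schema = {
    "title": "string",
    "price": "number",
    "tags": "enum[]",
    "reviews": {"score": "number", "text": "string"},
}
good = {"term": "", "where": {"reviews.score": {"between": [4.5, 5.0]}}}
bad = {"term": "", "where": {"tags": {"gt": 3}, "title": {"eq": "champagne"}}}
print(validate_query(good, schema))  # []
print(validate_query(bad, schema))
```

A check like this can run on the model output before the query is sent to the search engine, so that malformed filters are caught early.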

Usage

import json, torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

SYSTEM_PROMPT = """
You are a tool used to generate synthetic data of Orama queries. Orama is a full-text, vector, and hybrid search engine.

Let me show you what you need to do with some examples.

Example:
  - Query: `"What are the red wines that cost less than 20 dollars?"`
  - Schema: `{ "name": "string", "content": "string", "price": "number", "tags": "enum[]" }`
  - Generated query: `{ "term": "", "where": { "tags": { "containsAll": ["red", "wine"] }, "price": { "lt": 20 } } }`

Another example:
  - Query: `"Show me 5 prosecco wines good for aperitif"`
  - Schema: `{ "name": "string", "content": "string", "price": "number", "tags": "enum[]" }`
  - Generated query: `{ "term": "prosecco aperitif", "limit": 5 }`

One last example:
  - Query: `"Show me some wine reviews with a score greater than 4.5 and less than 5.0."`
  - Schema: `{ "title": "string", "content": "string", "reviews": { "score": "number", "text": "string" } }]`
  - Generated query: `{ "term": "", "where": { "reviews.score": { "between": [4.5, 5.0] } } }`

The rules to generate the query are:

- Never use an "embedding" field in the schema.
- Every query has a "term" field that is a string. It represents the full-text search terms. Can be empty (will match all documents).
- You can use a "where" field that is an object. It represents the filters to apply to the documents. Its keys and values depend on the schema of the database:
  - If the field is a "string", you should not use operators. Example: `{ "where": { "title": "champagne" } }`.
  - If the field is a "number", you can use the following operators: "gt", "gte", "lt", "lte", "eq", "between". Example: `{ "where": { "price": { "between": [20, 100] } } }`. Another example: `{ "where": { "price": { "lt": 20 } } }`.
  - If the field is an "enum", you can use the following operators: "eq", "in", "nin". Example: `{ "where": { "tags": { "containsAll": ["red", "wine"] } } }`.
  - If the field is an "string[]", it's gonna be just like the "string" field, but you can use an array of values. Example: `{ "where": { "title": ["champagne", "montagne"] } }`.
  - If the field is a "boolean", you can use the following operators: "eq". Example: `{ "where": { "isAvailable": true } }`. Another example: `{ "where": { "isAvailable": false } }`.
  - If the field is a "enum[]", you can use the following operators: "containsAll". Example: `{ "where": { "tags": { "containsAll": ["red", "wine"] } } }`.
  - Nested properties are supported. Just translate them into dot notation. Example: `{ "where": { "author.name": "John" } }`.
  - Array of numbers are not supported.
  - Array of booleans are not supported.

Return just a JSON object, nothing more.
"""

QUERY = "Show me some wine reviews with a score greater than 4.5 and less than 5.0."

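# The schema must declare every field the query filters on (here: reviews.score).
SCHEMA = {
    "title": "string",
    "content": "string",
    "reviews": {"score": "number", "text": "string"},
}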

base_model_name = "Qwen/Qwen2.5-7B"
adapter_path = "OramaSearch/query-translator-mini"

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

print("Loading base model...")
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

print("Loading fine-tuned adapter...")
model = PeftModel.from_pretrained(model, adapter_path)

# device_map="auto" has already placed the weights, so no explicit .cuda() call is needed.
if torch.cuda.is_available():
    print(f"GPU memory after loading: {torch.cuda.memory_allocated(0) / 1024**2:.2f} MB")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Query: {QUERY}\nSchema: {json.dumps(SCHEMA)}"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    num_return_sequences=1,
)

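# Decode only the newly generated tokens; outputs[0] also contains the prompt.
generated = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated, skip_special_tokens=True)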
print(response)
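
The decoded text may contain more than the bare JSON object, for example echoed prompt text or extra wording around the answer. A minimal sketch of pulling the query out of it follows; `extract_last_json_object` is a hypothetical helper name, and taking the last object in the text is an assumption about where the answer appears.

```python
import json


def extract_last_json_object(text: str) -> dict:
    """Return the last top-level JSON object found in `text`."""
    decoder = json.JSONDecoder()
    found = None
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at `pos` and
            # returns it together with the index where it ended.
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not valid JSON starting here; try the next opening brace.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            found = obj
        pos = text.find("{", end)
    if found is None:
        raise ValueError("no JSON object found in model output")
    return found


sample = 'Here is the query: { "term": "prosecco aperitif", "limit": 5 } Done.'
print(extract_last_json_object(sample))  # {'term': 'prosecco aperitif', 'limit': 5}
```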

Training Details

The model was trained on an NVIDIA H100 SXM using the following configuration:

Base Model: Qwen 2.5 7B
Training Method: LoRA
Quantization: 4-bit quantization using bitsandbytes
LoRA Configuration:
- Rank: 16
- Alpha: 32
- Dropout: 0.1
- Target Modules: Attention layers and MLP
Training Arguments:
- Epochs: 3
- Batch Size: 2
- Learning Rate: 5e-5
- Gradient Accumulation Steps: 8
- FP16 Training: Enabled
- Gradient Checkpointing: Enabled

Supported Query Types

The model can handle various types of queries including:

Simple text search:

{
    "term": "prosecco aperitif",
    "limit": 5
}

Numeric range queries:

{
    "term": "",
    "where": {
        "price": {
            "between": [20, 100]
        }
    }
}

Tag-based filtering:

{
    "term": "",
    "where": {
        "tags": {
            "containsAll": ["red", "wine"]
        }
    }
}
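
To make the meaning of these filters concrete, here is a small sketch that applies a `where` clause to in-memory Python dictionaries. It is an illustration only: real queries are executed by Orama, `matches` and `get_path` are hypothetical helpers, and treating `between` as inclusive on both ends is an assumption.

```python
def get_path(doc, dotted):
    """Resolve a dot-notation path such as "reviews.score" inside a nested dict."""
    value = doc
    for part in dotted.split("."):
        value = value[part]
    return value


# Numeric operators; "between" is assumed to include both bounds.
NUMERIC = {
    "gt": lambda v, x: v > x,
    "gte": lambda v, x: v >= x,
    "lt": lambda v, x: v < x,
    "lte": lambda v, x: v <= x,
    "eq": lambda v, x: v == x,
    "between": lambda v, x: x[0] <= v <= x[1],
}


def matches(doc, where):
    """Return True if `doc` satisfies every condition in `where`."""
    for path, condition in where.items():
        value = get_path(doc, path)
        for op, operand in condition.items():
            if op == "containsAll":
                # Every requested tag must be present on the document.
                if not set(operand) <= set(value):
                    return False
            elif not NUMERIC[op](value, operand):
                return False
    return True


wines = [
    {"name": "Barolo", "price": 45, "tags": ["red", "wine"]},
    {"name": "Prosecco", "price": 15, "tags": ["white", "wine", "sparkling"]},
    {"name": "Chianti", "price": 20, "tags": ["red", "wine"]},
]
where = {
    "price": {"between": [20, 100]},
    "tags": {"containsAll": ["red", "wine"]},
}
print([w["name"] for w in wines if matches(w, where)])  # ['Barolo', 'Chianti']
```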

Limitations

Does not support arrays of numbers or booleans
Maximum input length is 1024 tokens
Embedding fields are not supported in the schema
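
A schema can be screened for these unsupported fields before it is placed in the prompt. The sketch below uses a hypothetical helper, `unsupported_fields`, and assumes that embedding fields are declared with a `vector[N]` type, as in Orama schemas.

```python
UNSUPPORTED_TYPES = {"number[]", "boolean[]"}


def unsupported_fields(schema, prefix=""):
    """Return dot-notation paths of fields the model cannot filter on."""
    bad = []
    for key, value in schema.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            bad.extend(unsupported_fields(value, prefix=f"{path}."))
        elif value in UNSUPPORTED_TYPES or value.startswith("vector["):
            bad.append(path)
    return bad


schema = {
    "title": "string",
    "embedding": "vector[384]",
    "ratings": "number[]",
    "meta": {"flags": "boolean[]", "author": "string"},
}
print(unsupported_fields(schema))  # ['embedding', 'ratings', 'meta.flags']
```

Fields reported by this check can be dropped from the schema that is passed to the model, while the full schema stays in use on the search side.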

Citation

If you use this model in your research, please cite:

@misc{query-translator-mini,
  author = {OramaSearch Inc.},
  title = {Query Translator Mini: Natural Language to Orama Query Translation},
  year = {2024},
  publisher = {HuggingFace},
  journal = {HuggingFace Repository},
  howpublished = {\url{https://huggingface.co/OramaSearch/query-translator-mini}}
}

License

AGPLv3

Acknowledgments

This model builds upon the Qwen 2.5 7B model and uses techniques from the PEFT library. Special thanks to the teams behind these projects.
