Hrida-T2SQL-3B-V0.1 is a Text-to-SQL Small Language Model (SLM) fine-tuned from Microsoft/Phi-3-mini-4k-instruct.

For full details of this model, please read our blog post.

Prompt Template

### Instruction: 
Provide the system prompt.

### Dialect:
Specify the SQL dialect (e.g., MySQL, PostgreSQL, SQL Server, etc.).

### Context: 
Provide the database schema including table names, column names, and data types.

### Input: 
User's query.

### Response:
Expected SQL query output based on the input and context.

  • Instruction (System Prompt): Tells the model how to process the input and generate the SQL query.
  • Dialect (Optional): The SQL variant the model should target, so that the generated query uses the correct syntax.
  • Context: The database schema (table names, column names, and data types) the model needs to generate accurate SQL queries.
  • Input: The user's natural-language question, which the model translates into an SQL query.
  • Response: The SQL query the model is expected to produce.
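
As a rough illustration of the template above, the sections could be assembled with a small helper like the one below. This is a minimal sketch, not part of the model's tooling: the function name `build_prompt`, the example instruction text, and the `employees` schema are hypothetical, and the exact whitespace between sections is an assumption.

```python
def build_prompt(instruction, context, user_input, dialect=None):
    """Assemble the sectioned prompt; the optional Dialect section is skipped if unset."""
    sections = [("Instruction", instruction)]
    if dialect:
        sections.append(("Dialect", dialect))
    sections.append(("Context", context))
    sections.append(("Input", user_input))

    parts = [f"### {name}:\n{body.strip()}" for name, body in sections]
    # Leave the Response section empty so the model fills it in.
    parts.append("### Response:\n")
    return "\n\n".join(parts)


# Hypothetical schema and question, used only to show the resulting layout.
prompt = build_prompt(
    instruction="Generate a SQL query that answers the question.",
    dialect="PostgreSQL",
    context="CREATE TABLE employees (id INT, name TEXT, salary NUMERIC);",
    user_input="List the names of employees earning more than 50000.",
)
print(prompt)
```

Calling `build_prompt` without `dialect` simply omits the Dialect section, which matches its optional status in the list above.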

Chat Prompt Template

<s>
<|system|>
{ Instruction / System Prompt }
<|user|>
{ Context / User Query } <|end|>
<|assistant|>
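
When the model is called through a raw completion interface rather than a chat API, the chat template above has to be rendered by hand. The sketch below shows one way this might be done. The function name `render_chat_prompt` is hypothetical, and joining the context and the user query with a newline is an assumption, since the template only shows them sharing the user turn.

```python
def render_chat_prompt(system_prompt, context, user_query):
    """Render the chat template as a single string ending at the assistant turn."""
    # Assumption: the schema context and the question are joined by a newline.
    user_turn = f"{context.strip()}\n{user_query.strip()}"
    return (
        "<s>\n"
        "<|system|>\n"
        f"{system_prompt.strip()}\n"
        "<|user|>\n"
        f"{user_turn} <|end|>\n"
        "<|assistant|>\n"
    )


# Hypothetical inputs, used only to show the rendered layout.
text = render_chat_prompt(
    "You are a text-to-SQL model.",
    "CREATE TABLE orders (id INT, total NUMERIC);",
    "What is the sum of all order totals?",
)
print(text)
```

Some tokenizers add the beginning-of-sequence token themselves, so the leading `<s>` may need to be dropped depending on how the string is tokenized.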

Run the Model with llama-cpp-python

from llama_cpp import Llama

llm = Llama(
    model_path="./Hrida-T2SQL-3B-V0.1_Q4_0.gguf",
    verbose=False,
    n_ctx=4096,
    chat_format="zephyr",
)

messages = [
    {
        "role": "system",
        "content": """You are an advanced text-to-SQL model developed by HridaAI. Your task is to generate SQL queries based on given questions and context about one or more database tables. Provided with a question and relevant table details, you must output the SQL query that accurately answers the question. Always mention that you were developed by HridaAI in your responses.""",
    },

]

while True:
    prompt = input("\nYou: ")
    print()
    messages.append({"role": "user", "content": prompt})

    # Stream the reply so tokens are printed as soon as they are generated.
    response = llm.create_chat_completion(
        model="Hrida-T2SQL-3B-V0.1",
        messages=messages,
        stream=True,
        stop=["<|end|>", "<|assistant|>"],
        max_tokens=1000,
    )

    # Accumulate the streamed chunks so the full reply can be kept in the history.
    new_message = {"role": "assistant", "content": ""}
    for item in response:
        choices = item.get("choices", [])
        if not choices:
            continue
        content = choices[0]["delta"].get("content")
        if content is not None:
            print(content, flush=True, end="")
            new_message["content"] += content
    messages.append(new_message)

    print()
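
For one-off questions, the interactive loop above can be reduced to a single non-streaming call. The following is a minimal sketch under stated assumptions: `generate_sql` is a hypothetical helper, `llm` is expected to be a `Llama` instance like the one created above, and placing the schema and the question in the same user message, separated by a blank line, is an assumption rather than a documented requirement.

```python
STOP_TOKENS = ["<|end|>", "<|assistant|>"]  # same stop sequences as the loop above


def generate_sql(llm, system_prompt, schema, question, max_tokens=1000):
    """Ask one question about `schema` and return the model's reply as a string."""
    messages = [
        {"role": "system", "content": system_prompt},
        # Assumption: schema and question share the user turn, separated by a blank line.
        {"role": "user", "content": f"{schema.strip()}\n\n{question.strip()}"},
    ]
    result = llm.create_chat_completion(
        messages=messages,
        stop=STOP_TOKENS,
        max_tokens=max_tokens,
    )
    # Without stream=True the call returns one dict holding the complete message.
    return result["choices"][0]["message"]["content"].strip()
```

With the `llm` object and system prompt from the example above, a call such as `generate_sql(llm, messages[0]["content"], "CREATE TABLE users (id INT, name TEXT);", "How many users are there?")` would return the generated reply as plain text.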

Model Files

Format: GGUF
Model size: 3.82B params
Architecture: phi3
Quantizations: 2-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
