A capable language model for text to SQL generation for Postgres, Redshift and Snowflake that is on-par with the most capable generalist frontier models.

image/png

Model Description

Developed by: Defog, Inc Model type: [Text to SQL] License: [CC-by-SA-4.0] Finetuned from model: [Meta-Llama-3-8B-Instruct]

defog/llama-3-sqlcoder-8b for CTranslate2

The model is quantized version of the defog/llama-3-sqlcoder-8b with int8_float16 quantization and can be used in CTranslate2.

How to use

pip install ctranslate2

This repository for use with CTranslate2.

Use with CTranslate2

This example code is obtained from CTranslate2_transformers and tokenizer AutoTokenizer.
More detailed information about the generate_batch methon can be found at CTranslate2_Generator.generate_batch.

import ctranslate2
import transformers

from huggingface_hub import snapshot_download
model_id = "ByteForge/Defog_llama-3-sqlcoder-8b-ct2-int8_float16"
model_path = snapshot_download(model_id)
model = ctranslate2.Generator(model_path)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

prompt="""
CREATE TABLE stadium (
    stadium_id number,
    location text,
    name text,
    capacity number,
    highest number,
    lowest number,
    average number
)

CREATE TABLE singer (
    singer_id number,
    name text,
    country text,
    song_name text,
    song_release_year text,
    age number,
    is_male others
)

CREATE TABLE concert (
    concert_id number,
    concert_name text,
    theme text,
    stadium_id text,
    year text
)

CREATE TABLE singer_in_concert (
    concert_id number,
    singer_id text
)

-- Using valid SQLite, answer the following questions for the tables provided above.

-- What is the maximum, the average, and the minimum capacity of stadiums ? (Generate 1 Sql query. No explaination needed)

answer:
"""

messages = [
    {"role": "system", "content": "You are SQL Expert. Given a input question and schema, answer with correct sql query"},
    {"role": "user", "content": prompt},
]

input_ids = tokenizer.apply_chat_template(
    messages, 
    tokenize=False, 
    add_generation_prompt=True
)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

input_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(input_ids))

results = model.generate_batch([input_tokens], include_prompt_in_result=False, max_length=256, sampling_temperature=0.6, sampling_topp=0.9, end_token=terminators)
output = tokenizer.decode(results[0].sequences_ids[0])

print(output)

Ideal prompt and inference parameters

Set temperature to 0, and do not do sampling.

Evaluation

This model was evaluated on SQL-Eval, a PostgreSQL based evaluation framework developed by Defog for testing and alignment of model capabilities.

You can read more about the methodology behind SQLEval here.

Downloads last month
12
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.