GEMMA-2-2B for Text-to-PostgreSQL (Text-PGSQL)
Model Overview
Model Name: GEMMA-2-2B
Model Type: Text-to-PostgreSQL (Text-PGSQL)
Purpose: The model is designed for query generation purposes. It translates natural language queries into PostgreSQL SQL queries.
Training Details
- Training Duration: 12 hours
- Epochs: 3
- Loss: Achieved good loss, indicating effective learning and generalization.
Dataset
The model was trained on a dataset that includes a variety of schemas, questions, and queries relevant to PostgreSQL.
Performance
- Loss: Good loss achieved, demonstrating the model's ability to effectively learn and generate accurate SQL queries.
Usage
This model is intended for applications that require natural language to SQL translation, particularly for PostgreSQL databases. It can be utilized in systems that need to convert user queries in natural language into executable SQL statements.
Model Architecture
The model leverages the GEMMA-2-2B architecture, which is a large pre-trained model adapted for the task of SQL query generation.
Prompt
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{schema}
### Input:
{question}
### Response:
"""
Pytorch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = AutoModelForCausalLM.from_pretrained("suriya7/PostgreSQL-Coder-2B").to(device)
tokenizer = AutoTokenizer.from_pretrained("suriya7/PostgreSQL-Coder-2B")
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0],skip_special_tokens=True).replace(prompt,'').strip())
Examples
Schemas
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100) UNIQUE NOT NULL,
hire_date DATE NOT NULL,
job_id INT NOT NULL,
salary NUMERIC(8, 2),
department_id INT
);
CREATE TABLE Authors (
author_id SERIAL PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
birth_date DATE
);
CREATE TABLE Books (
book_id SERIAL PRIMARY KEY,
title VARCHAR(100) NOT NULL,
genre VARCHAR(50),
publication_year INT,
author_id INT,
FOREIGN KEY (author_id) REFERENCES Authors(author_id)
);
CREATE TABLE Members (
member_id SERIAL PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100) UNIQUE,
phone_number VARCHAR(15),
membership_start_date DATE
);
CREATE TABLE Loans (
loan_id SERIAL PRIMARY KEY,
book_id INT,
member_id INT,
loan_date DATE,
return_date DATE,
FOREIGN KEY (book_id) REFERENCES Books(book_id),
FOREIGN KEY (member_id) REFERENCES Members(member_id)
);
Questions
Find the emails of employees who were hired after January 1, 2020.
SELECT email FROM employees WHERE hire_date > '2020-01-01';
What are the first names and last names of all employees?
SELECT first_name, last_name FROM employees;
Retrieve the titles and genres of books that have not been borrowed by any member.
SELECT title, genre FROM Books WHERE book_id NOT IN (SELECT book_id FROM Loans);
- Downloads last month
- 36