---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- code
- sql
- text-to-sql
- text2sql
- t2sql
---

Introducing Hrida-T2SQL-3B-128k-V0.1, our latest small language model (SLM) for data scientists and industry professionals. It upgrades our previous release with an expanded 128k-token context window for handling long, intricate data queries. Built on the Phi 3 architecture, it converts natural-language questions into SQL queries, which speeds up data analysis and supports decision-making.

For full details of this model, please read our [blog post](https://www.hridaai.com/blog/t2sql-128k).


## Prompt Template

```txt
### Instruction: 
Provide the system prompt.

### Dialect:
Specify the SQL dialect (e.g., MySQL, PostgreSQL, SQL Server, etc.).

### Context: 
Provide the database schema including table names, column names, and data types.

### Input: 
User's query.

### Response:
Expected SQL query output based on the input and context.

```

- **Instruction (System Prompt)**: Tells the model how to process the input and generate the SQL query.
- **Dialect (Optional)**: The SQL variant to target, so that the generated query uses the correct syntax.
- **Context**: The database schema (table names, column names, and data types) that the query must be written against.
- **Input**: The user's natural-language query, which the model translates into SQL.
- **Response**: The SQL query the model is expected to produce.
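
The sections above can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: `build_prompt` is a hypothetical helper, and the exact spacing between sections is an assumption based on the template as printed. It leaves out the Dialect section when no dialect is given and ends with an empty Response header for the model to complete.

```python
from typing import Optional


def build_prompt(instruction: str, context: str, user_input: str,
                 dialect: Optional[str] = None) -> str:
    """Assemble the template sections in order (hypothetical helper).

    The Dialect section is optional and is skipped when not provided.
    """
    sections = [("Instruction", instruction)]
    if dialect:
        sections.append(("Dialect", dialect))
    sections.append(("Context", context))
    sections.append(("Input", user_input))
    body = "\n\n".join(f"### {name}:\n{text.strip()}" for name, text in sections)
    # End with an empty Response header so the model continues with the SQL query.
    return body + "\n\n### Response:\n"


prompt = build_prompt(
    instruction="Answer to the query will be in the form of an SQL query.",
    context="CREATE TABLE Departments (DepartmentID INT PRIMARY KEY, DepartmentName VARCHAR(100));",
    user_input="List all department names.",
    dialect="PostgreSQL",
)
print(prompt)
```

The resulting string can be passed as the user message content in the examples under "Run the Model".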


## Chat Prompt Template

```txt
<s>
<|system|>
{ Instruction / System Prompt }
<|user|>
{ Context / User Query } <|end|>
<|assistant|>
```
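
To make the layout concrete, here is a small sketch that fills the template above by hand. `render_chat_prompt` is a hypothetical helper, and the newline placement is an assumption copied from the template as printed. When using Transformers, `tokenizer.apply_chat_template` (as in the example below) should be preferred, since it applies the formatting shipped with the tokenizer.

```python
def render_chat_prompt(system_prompt: str, user_content: str) -> str:
    """Fill the chat template with a system prompt and user content.

    Hypothetical helper: mirrors the printed template line by line.
    """
    return (
        "<s>\n"
        "<|system|>\n"
        f"{system_prompt}\n"
        "<|user|>\n"
        f"{user_content} <|end|>\n"
        "<|assistant|>\n"
    )


chat_prompt = render_chat_prompt(
    "Answer to the query will be in the form of an SQL query.",
    "### Context: CREATE TABLE Departments (DepartmentID INT PRIMARY KEY);\n"
    "### Input: Count the departments.",
)
print(chat_prompt)
```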

## Run the Model

### Using Transformers

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model and tokenizer
model_id = "HridaAI/Hrida-T2SQL-3B-128k-V0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, trust_remote_code=True)

# Define the context and prompt
prompt = """
Answer to the query will be in the form of an SQL query.
### Context: CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Age INT,
    DepartmentID INT,
    Salary DECIMAL(10, 2),
    DateHired DATE,
    Active BOOLEAN,
    FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
); 

CREATE TABLE Departments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName VARCHAR(100),
    Location VARCHAR(100)
); 
### Input: Write a SQL query to select all the employees who are active.
### Response:
"""
# Prepare the input
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)

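# Generate the output; max_new_tokens limits only the generated tokens,
# whereas max_length would also count the prompt tokens
outputs = model.generate(inputs, max_new_tokens=300)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))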
```

### Using MLX

```python
from mlx_lm import generate, load

model, tokenizer = load("HridaAI/Hrida-T2SQL-3B-128k-V0.1")

prompt = """
Answer to the query will be in the form of an SQL query.
### Context: CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Age INT,
    DepartmentID INT,
    Salary DECIMAL(10, 2),
    DateHired DATE,
    Active BOOLEAN,
    FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
); 

CREATE TABLE Departments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName VARCHAR(100),
    Location VARCHAR(100)
); ### Input: Write a SQL query to select all the employees who are active. ### Response:"""

response = generate(model=model, tokenizer=tokenizer, prompt=prompt, verbose=True)
```