---
base_model:
- ajibawa-2023/Code-Llama-3-8B
- defog/llama-3-sqlcoder-8b
library_name: transformers
tags:
- mergekit
- merge

---
# llama3-8b-code-sql-slerp

llama3-8b-code-sql-slerp is a merge of two fine-tuned Llama 3 8B coding models, intended to combine a solid general programming foundation with expertise in SQL.

### 🤏 Models Merged

This model was created by merging pre-trained language models with the SLERP merge method, using [mergekit](https://github.com/cg123/mergekit).

The following models were included in the merge:
* [ajibawa-2023/Code-Llama-3-8B](https://huggingface.co/ajibawa-2023/Code-Llama-3-8B)
* [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)

### 🧩 Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: ajibawa-2023/Code-Llama-3-8B
        layer_range: [0, 32]
      - model: defog/llama-3-sqlcoder-8b
        layer_range: [0, 32]
merge_method: slerp
base_model: ajibawa-2023/Code-Llama-3-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.3, 0.5, 0.7, 0.5]
    - filter: mlp
      value: [0, 0.3, 0.5, 0.7, 0.5]
    - value: 0.4 # fallback for rest of tensors
dtype: bfloat16
```
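The `t` values above define an interpolation schedule across the layer stack: `t = 0` keeps the base model's weights, `t = 1` takes the other model's, and `0.4` is the fallback for tensors not matched by a filter. For intuition, here is a minimal PyTorch sketch of the SLERP operation applied to each tensor pair; it is illustrative only, and mergekit's real implementation additionally handles dtypes, embeddings, and degenerate cases:

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative)."""
    v0_flat, v1_flat = v0.flatten().float(), v1.flatten().float()
    v0_unit = v0_flat / (v0_flat.norm() + eps)
    v1_unit = v1_flat / (v1_flat.norm() + eps)
    dot = torch.clamp(torch.dot(v0_unit, v1_unit), -1.0, 1.0)
    if dot.abs() > 0.9995:
        # Nearly parallel tensors: fall back to plain linear interpolation
        return ((1 - t) * v0_flat + t * v1_flat).reshape(v0.shape)
    theta = torch.arccos(dot)  # angle between the two weight directions
    sin_theta = torch.sin(theta)
    merged = (torch.sin((1 - t) * theta) / sin_theta) * v0_flat + (
        torch.sin(t * theta) / sin_theta
    ) * v1_flat
    return merged.reshape(v0.shape)

# t = 0 keeps the base tensor, t = 1 the other model's; 0.4 matches the fallback above
w_base  = torch.randn(4096, 4096)  # stand-in for a Code-Llama-3-8B weight
w_other = torch.randn(4096, 4096)  # stand-in for a llama-3-sqlcoder-8b weight
w_merged = slerp(0.4, w_base, w_other)
```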

### 💻 Usage

Loading in 8-bit quantization:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("AdamLucek/llama3-8b-code-sql-slerp")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/llama3-8b-code-sql-slerp",
    device_map="cuda",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)

# Prepare the input text
input_text = "Can you write a query to retrieve the names and email addresses of all customers who have made purchases totaling over $1000 in the last month from our 'sales' database?"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate the output
outputs = model.generate(
    **input_ids,
    max_new_tokens=256,
    pad_token_id=tokenizer.eos_token_id  # Llama 3 defines no pad token, so reuse EOS
)

# Decode and print the generated text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Output**
````
```sql
SELECT c.name, c.email
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
WHERE s.purchase_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH)
GROUP BY c.name, c.email
HAVING SUM(s.amount) > 1000;
```

This query joins the 'customers' and 'sales' tables on the 'customer_id' field, filters for sales made in the last month, groups the results by customer name and email, and then applies a condition to only include customers whose total purchase amount exceeds $1000. The result will be a list of names and email addresses for customers who have made purchases totaling over $1000 in the last month.
````
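If GPU memory is tight, the model can also be loaded in 4-bit. A minimal sketch using the standard `bitsandbytes` NF4 path (the compute dtype below assumes a bf16-capable GPU; use `torch.float16` otherwise):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("AdamLucek/llama3-8b-code-sql-slerp")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/llama3-8b-code-sql-slerp",
    device_map="auto",  # let accelerate place the weights
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # normal-float 4-bit quantization
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    ),
)
```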