---
base_model: Qwen/Qwen2.5-7B
library_name: peft
language:
- en
license: agpl-3.0
datasets:
- OramaSearch/nlp-to-query-small
---
# Query Translator Mini
This repository contains a fine-tuned version of the Qwen 2.5 7B model, specialized in translating natural language queries into structured Orama search queries.
The model uses PEFT with LoRA to maintain efficiency while achieving high performance.
## Model Details
### Model Description
The Query Translator Mini model is designed to convert natural language queries into structured JSON queries compatible with the Orama search engine.
It understands various data types and query operators, making it versatile for different search scenarios.
### Key Features
- Translates natural language to structured Orama queries
- Supports multiple field types: string, number, boolean, enum, and arrays
- Handles complex query operators: `gt`, `gte`, `lt`, `lte`, `eq`, `between`, `containsAll`
- Supports nested properties with dot notation (see the sketch below)
- Works with both full-text search and filtered queries
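
To make the feature list concrete, here is a plain-Python illustration of an input/output pair, combining nested dot notation with a numeric operator (the field names mirror the examples in the system prompt below):

```python
query = "Show me reviews scored above 4.5"
schema = {"title": "string", "reviews": {"score": "number", "text": "string"}}

# Expected structured Orama query produced by the model:
expected = {"term": "", "where": {"reviews.score": {"gt": 4.5}}}
```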
## Usage
```python
import json, torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
SYSTEM_PROMPT = """
You are a tool used to generate synthetic data of Orama queries. Orama is a full-text, vector, and hybrid search engine.
Let me show you what you need to do with some examples.
Example:
- Query: `"What are the red wines that cost less than 20 dollars?"`
- Schema: `{ "name": "string", "content": "string", "price": "number", "tags": "enum[]" }`
- Generated query: `{ "term": "", "where": { "tags": { "containsAll": ["red", "wine"] }, "price": { "lt": 20 } } }`
Another example:
- Query: `"Show me 5 prosecco wines good for aperitif"`
- Schema: `{ "name": "string", "content": "string", "price": "number", "tags": "enum[]" }`
- Generated query: `{ "term": "prosecco aperitif", "limit": 5 }`
One last example:
- Query: `"Show me some wine reviews with a score greater than 4.5 and less than 5.0."`
- Schema: `{ "title": "string", "content": "string", "reviews": { "score": "number", "text": "string" } }]`
- Generated query: `{ "term": "", "where": { "reviews.score": { "between": [4.5, 5.0] } } }`
The rules to generate the query are:
- Never use an "embedding" field in the schema.
- Every query has a "term" field that is a string. It represents the full-text search terms. Can be empty (will match all documents).
- You can use a "where" field that is an object. It represents the filters to apply to the documents. Its keys and values depend on the schema of the database:
- If the field is a "string", you should not use operators. Example: `{ "where": { "title": "champagne" } }`.
- If the field is a "number", you can use the following operators: "gt", "gte", "lt", "lte", "eq", "between". Example: `{ "where": { "price": { "between": [20, 100] } } }`. Another example: `{ "where": { "price": { "lt": 20 } } }`.
- If the field is an "enum", you can use the following operators: "eq", "in", "nin". Example: `{ "where": { "tags": { "containsAll": ["red", "wine"] } } }`.
- If the field is an "string[]", it's gonna be just like the "string" field, but you can use an array of values. Example: `{ "where": { "title": ["champagne", "montagne"] } }`.
- If the field is a "boolean", you can use the following operators: "eq". Example: `{ "where": { "isAvailable": true } }`. Another example: `{ "where": { "isAvailable": false } }`.
- If the field is a "enum[]", you can use the following operators: "containsAll". Example: `{ "where": { "tags": { "containsAll": ["red", "wine"] } } }`.
- Nested properties are supported. Just translate them into dot notation. Example: `{ "where": { "author.name": "John" } }`.
- Array of numbers are not supported.
- Array of booleans are not supported.
Return just a JSON object, nothing more.
"""
QUERY = "Show me some wine reviews with a score greater than 4.5 and less than 5.0."
SCHEMA = {
    "title": "string",
    "content": "string",
    "reviews": {"score": "number", "text": "string"},
}
base_model_name = "Qwen/Qwen2.5-7B"
adapter_path = "OramaSearch/query-translator-mini"
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
print("Loading base model...")
model = AutoModelForCausalLM.from_pretrained(
base_model_name,
torch_dtype=torch.float16,
device_map="auto",
trust_remote_code=True,
)
print("Loading fine-tuned adapter...")
model = PeftModel.from_pretrained(model, adapter_path)
if torch.cuda.is_available():
    # device_map="auto" has already placed the model on the GPU, so no .cuda() call is needed
    print(f"GPU memory after loading: {torch.cuda.memory_allocated(0) / 1024**2:.2f} MB")
messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": f"Query: {QUERY}\nSchema: {json.dumps(SCHEMA)}"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.1,  # low temperature keeps the JSON output near-deterministic
    top_p=0.9,
    num_return_sequences=1,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
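
The decoded `response` above still contains the prompt text. A minimal post-processing sketch, assuming the model follows the system prompt and emits a bare JSON object:

```python
# Keep only the newly generated tokens, then parse the JSON query.
generated = outputs[0][inputs["input_ids"].shape[1]:]
raw_json = tokenizer.decode(generated, skip_special_tokens=True)
try:
    structured_query = json.loads(raw_json)
except json.JSONDecodeError:
    structured_query = None  # inspect raw_json manually if parsing fails
print(structured_query)
```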
## Training Details
The model was trained on an NVIDIA H100 SXM GPU using the following configuration:
- Base Model: Qwen 2.5 7B
- Training Method: LoRA
- Quantization: 4-bit quantization using bitsandbytes
- LoRA Configuration:
- Rank: 16
- Alpha: 32
- Dropout: 0.1
- Target Modules: Attention layers and MLP
- Training Arguments:
- Epochs: 3
- Batch Size: 2
- Learning Rate: 5e-5
- Gradient Accumulation Steps: 8
- FP16 Training: Enabled
- Gradient Checkpointing: Enabled
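
In code, that setup corresponds roughly to the sketch below. The exact `target_modules` list and `output_dir` are assumptions (the card only says "attention layers and MLP"); the standard Qwen2 projection names are used here:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit quantization via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA configuration (rank 16, alpha 32, dropout 0.1)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention (assumed names)
        "gate_proj", "up_proj", "down_proj",     # MLP (assumed names)
    ],
    task_type="CAUSAL_LM",
)

# Training arguments matching the listed hyperparameters
training_args = TrainingArguments(
    output_dir="query-translator-mini",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=5e-5,
    gradient_accumulation_steps=8,
    fp16=True,
    gradient_checkpointing=True,
)
```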
## Supported Query Types
The model can handle various types of queries, including:
1. Simple text search:
```json
{
"term": "prosecco aperitif",
"limit": 5
}
```
2. Numeric range queries:
```json
{
"term": "",
"where": {
"price": {
"between": [20, 100]
}
}
}
```
3. Tag-based filtering:
```json
{
"term": "",
"where": {
"tags": {
"containsAll": ["red", "wine"]
}
}
}
```
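
Before handing a generated query to Orama, it can be useful to sanity-check it against the schema. A hypothetical helper (not part of this repository) that enforces a couple of the rules from the system prompt:

```python
NUMBER_OPS = {"gt", "gte", "lt", "lte", "eq", "between"}

def is_valid_query(query: dict, schema: dict) -> bool:
    """Check that "term" is a string and every "where" field exists in the schema."""
    if not isinstance(query.get("term", ""), str):
        return False
    for field, condition in query.get("where", {}).items():
        node = schema
        for part in field.split("."):  # nested fields use dot notation
            if not isinstance(node, dict) or part not in node:
                return False
            node = node[part]
        # numeric fields must use a supported operator
        if node == "number":
            if not isinstance(condition, dict) or not NUMBER_OPS.issuperset(condition):
                return False
    return True

print(is_valid_query(
    {"term": "", "where": {"reviews.score": {"between": [4.5, 5.0]}}},
    {"title": "string", "reviews": {"score": "number", "text": "string"}},
))  # True
```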
## Limitations
- Does not support arrays of numbers or booleans
- Maximum input length is 1024 tokens (see the length check sketched below)
- Embedding fields are not supported in the schema
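
Because of the 1024-token cap, it is worth checking prompt length before calling `generate`. A small guard, reusing the `tokenizer` and `prompt` from the Usage example above:

```python
# Reject prompts that exceed the model's supported input length (1024 tokens).
n_tokens = len(tokenizer(prompt)["input_ids"])
if n_tokens > 1024:
    raise ValueError(f"Prompt is {n_tokens} tokens; the supported maximum is 1024.")
```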
## Citation
If you use this model in your research, please cite:
```
@misc{query-translator-mini,
author = {OramaSearch Inc.},
title = {Query Translator Mini: Natural Language to Orama Query Translation},
year = {2024},
publisher = {HuggingFace},
journal = {HuggingFace Repository},
howpublished = {\url{https://huggingface.co/OramaSearch/query-translator-mini}}
}
```
## License
AGPLv3
## Acknowledgments
This model builds upon the Qwen 2.5 7B model and uses techniques from the PEFT library. Special thanks to the teams behind these projects. |