	---
	base_model: Qwen/Qwen2.5-7B
	library_name: peft
	language:
	- en
	license: agpl-3.0
	datasets:
	- OramaSearch/nlp-to-query-small
	---

	# Query Translator Mini

	This repository contains a fine-tuned version of the Qwen 2.5 7B model, specialized in translating natural language queries into structured Orama search queries.

	The model is fine-tuned with LoRA through the PEFT library, so this repository ships only the adapter weights, which are loaded on top of the base model.

	## Model Details

	### Model Description

	The Query Translator Mini model is designed to convert natural language queries into structured JSON queries compatible with the Orama search engine.

	It understands various data types and query operators, making it versatile for different search scenarios.

	### Key Features

	- Translates natural language to structured Orama queries
	- Supports multiple field types: string, number, boolean, enum, and arrays
	- Handles complex query operators: `gt`, `gte`, `lt`, `lte`, `eq`, `between`, `containsAll`
	- Supports nested properties with dot notation
	- Works with both full-text search and filtered queries
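As an illustration of the dot-notation feature, the sketch below flattens a nested schema into the field paths a `where` clause would use. The helper name `flatten_schema` is hypothetical and is not part of Orama or this repository.

```python
from typing import Any


def flatten_schema(schema: dict[str, Any], prefix: str = "") -> dict[str, str]:
    """Flatten a nested schema into {dot.path: type} pairs (hypothetical helper)."""
    flat: dict[str, str] = {}
    for key, value in schema.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse and keep extending the dotted path.
            flat.update(flatten_schema(value, path))
        else:
            flat[path] = value
    return flat


schema = {
    "title": "string",
    "content": "string",
    "reviews": {"score": "number", "text": "string"},
}
flat = flatten_schema(schema)
print(flat)
```

With this schema, the nested `score` field becomes `reviews.score`, which is the key used in the `between` example of the system prompt.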

	## Usage

	```python
	import json, torch
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	SYSTEM_PROMPT = """
	You are a tool used to generate synthetic data of Orama queries. Orama is a full-text, vector, and hybrid search engine.

	Let me show you what you need to do with some examples.

	Example:
	- Query: `"What are the red wines that cost less than 20 dollars?"`
	- Schema: `{ "name": "string", "content": "string", "price": "number", "tags": "enum[]" }`
	- Generated query: `{ "term": "", "where": { "tags": { "containsAll": ["red", "wine"] }, "price": { "lt": 20 } } }`

	Another example:
	- Query: `"Show me 5 prosecco wines good for aperitif"`
	- Schema: `{ "name": "string", "content": "string", "price": "number", "tags": "enum[]" }`
	- Generated query: `{ "term": "prosecco aperitif", "limit": 5 }`

	One last example:
	- Query: `"Show me some wine reviews with a score greater than 4.5 and less than 5.0."`
	- Schema: `{ "title": "string", "content": "string", "reviews": { "score": "number", "text": "string" } }]`
	- Generated query: `{ "term": "", "where": { "reviews.score": { "between": [4.5, 5.0] } } }`

	The rules to generate the query are:

	- Never use an "embedding" field in the schema.
	- Every query has a "term" field that is a string. It represents the full-text search terms. Can be empty (will match all documents).
	- You can use a "where" field that is an object. It represents the filters to apply to the documents. Its keys and values depend on the schema of the database:
	- If the field is a "string", you should not use operators. Example: `{ "where": { "title": "champagne" } }`.
	- If the field is a "number", you can use the following operators: "gt", "gte", "lt", "lte", "eq", "between". Example: `{ "where": { "price": { "between": [20, 100] } } }`. Another example: `{ "where": { "price": { "lt": 20 } } }`.
	- If the field is an "enum", you can use the following operators: "eq", "in", "nin". Example: `{ "where": { "tags": { "containsAll": ["red", "wine"] } } }`.
	- If the field is an "string[]", it's gonna be just like the "string" field, but you can use an array of values. Example: `{ "where": { "title": ["champagne", "montagne"] } }`.
	- If the field is a "boolean", you can use the following operators: "eq". Example: `{ "where": { "isAvailable": true } }`. Another example: `{ "where": { "isAvailable": false } }`.
	- If the field is a "enum[]", you can use the following operators: "containsAll". Example: `{ "where": { "tags": { "containsAll": ["red", "wine"] } } }`.
	- Nested properties are supported. Just translate them into dot notation. Example: `{ "where": { "author.name": "John" } }`.
	- Array of numbers are not supported.
	- Array of booleans are not supported.

	Return just a JSON object, nothing more.
	"""

	QUERY = "Show me some wine reviews with a score greater than 4.5 and less than 5.0."

	# The schema should describe the fields the query refers to.
	SCHEMA = {
	    "title": "string",
	    "content": "string",
	    "reviews": {"score": "number", "text": "string"},
	}

	base_model_name = "Qwen/Qwen2.5-7B"
	adapter_path = "OramaSearch/query-translator-mini"

	print("Loading tokenizer...")
	tokenizer = AutoTokenizer.from_pretrained(base_model_name)

	print("Loading base model...")
	model = AutoModelForCausalLM.from_pretrained(
	    base_model_name,
	    torch_dtype=torch.float16,
	    device_map="auto",
	    trust_remote_code=True,
	)

	print("Loading fine-tuned adapter...")
	model = PeftModel.from_pretrained(model, adapter_path)

	# device_map="auto" has already placed the weights, so no explicit .cuda() call is needed.
	if torch.cuda.is_available():
	    print(f"GPU memory after loading: {torch.cuda.memory_allocated(0) / 1024**2:.2f} MB")

	messages = [
	    {"role": "system", "content": SYSTEM_PROMPT},
	    {"role": "user", "content": f"Query: {QUERY}\nSchema: {json.dumps(SCHEMA)}"},
	]

	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=512,
	    do_sample=True,
	    temperature=0.1,
	    top_p=0.9,
	    num_return_sequences=1,
	)

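	# Decode only the newly generated tokens so the prompt is not echoed back.
	generated = outputs[0][inputs["input_ids"].shape[1]:]
	response = tokenizer.decode(generated, skip_special_tokens=True)
	print(response)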
	```
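The system prompt asks the model to return just a JSON object, but generated text can still carry extra characters around it. The following is a minimal sketch of a parsing step, assuming `completion` holds only the newly generated text and not the prompt (the prompt itself contains JSON examples). The function name `extract_query` and the sample string are hypothetical.

```python
import json
from typing import Any


def extract_query(completion: str) -> dict[str, Any]:
    """Return the first JSON object found in `completion` (hypothetical helper).

    Raises ValueError if no JSON object can be decoded.
    """
    decoder = json.JSONDecoder()
    start = completion.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            obj, _ = decoder.raw_decode(completion, start)
            return obj
        except json.JSONDecodeError:
            start = completion.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# Hypothetical completion, for illustration only.
sample = 'Here you go: { "term": "", "where": { "price": { "lt": 20 } } }'
query = extract_query(sample)
print(query["where"])
```

Parsing the output before sending it to the search engine makes malformed generations fail early instead of producing an invalid search request.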

	## Training Details

	The model was trained on an NVIDIA H100 SXM using the following configuration:

	- Base Model: Qwen 2.5 7B
	- Training Method: LoRA
	- Quantization: 4-bit quantization using bitsandbytes
	- LoRA Configuration:
	  - Rank: 16
	  - Alpha: 32
	  - Dropout: 0.1
	  - Target Modules: Attention layers and MLP
	- Training Arguments:
	  - Epochs: 3
	  - Batch Size: 2
	  - Learning Rate: 5e-5
	  - Gradient Accumulation Steps: 8
	  - FP16 Training: Enabled
	  - Gradient Checkpointing: Enabled
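Two derived quantities follow from the numbers above and may help when reproducing the setup. This is a small arithmetic sketch, assuming a single GPU and the standard LoRA scaling of alpha divided by rank (not the rank-stabilized variant); the variable names are illustrative.

```python
# Values copied from the training configuration listed above.
per_device_batch_size = 2
gradient_accumulation_steps = 8
lora_rank = 16
lora_alpha = 32

# Assumption: a single GPU, so there is no data-parallel multiplier.
num_devices = 1

# Number of samples contributing to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Factor applied to the low-rank update under standard LoRA scaling.
lora_scaling = lora_alpha / lora_rank

print(f"effective batch size: {effective_batch_size}")
print(f"LoRA scaling (alpha / rank): {lora_scaling}")
```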

	## Supported Query Types

	The model can handle various types of queries including:

	1. Simple text search:

	```json
	{
	  "term": "prosecco aperitif",
	  "limit": 5
	}
	```

	2. Numeric range queries:

	```json
	{
	  "term": "",
	  "where": {
	    "price": {
	      "between": [20, 100]
	    }
	  }
	}
	```

	3. Tag-based filtering:

	```json
	{
	  "term": "",
	  "where": {
	    "tags": {
	      "containsAll": ["red", "wine"]
	    }
	  }
	}
	```
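Because the output is generated text, it can be worth checking a query against the schema before running it. The sketch below checks operators for `number` and `enum[]` fields only, using the operator lists given in this README. It assumes a flat schema whose keys are already in dot notation; `check_filters` and `ALLOWED_OPERATORS` are hypothetical names, not part of Orama.

```python
from typing import Any

# Operators per field type, taken from the rules stated in this README.
ALLOWED_OPERATORS: dict[str, set[str]] = {
    "number": {"gt", "gte", "lt", "lte", "eq", "between"},
    "enum[]": {"containsAll"},
}


def check_filters(query: dict[str, Any], schema: dict[str, str]) -> list[str]:
    """Return a list of problems found in a generated query (hypothetical helper)."""
    problems: list[str] = []
    if not isinstance(query.get("term"), str):
        problems.append('"term" must be a string')
    for field, condition in query.get("where", {}).items():
        field_type = schema.get(field)
        if field_type is None:
            problems.append(f"unknown field: {field}")
            continue
        allowed = ALLOWED_OPERATORS.get(field_type)
        # Only operator objects on the covered field types are checked here.
        if allowed is not None and isinstance(condition, dict):
            for op in condition:
                if op not in allowed:
                    problems.append(f"{field}: operator {op!r} not allowed for {field_type}")
    return problems


schema = {"name": "string", "price": "number", "tags": "enum[]"}
good = {"term": "", "where": {"tags": {"containsAll": ["red", "wine"]}, "price": {"lt": 20}}}
bad = {"term": "", "where": {"price": {"containsAll": [20]}, "color": "red"}}

print(check_filters(good, schema))
print(check_filters(bad, schema))
```

The check is deliberately partial: string, boolean, and plain enum filters pass through unchecked and would need their own rules.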

	## Limitations

	- Arrays of numbers and arrays of booleans are not supported
	- Maximum input length is 1024 tokens
	- Embedding fields are not supported in the schema
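The schema-related limitations can be checked before a schema is sent to the model. This is a minimal sketch, assuming embedding fields are identified by the key name `embedding` (the README does not specify how they are declared). The helper `unsupported_fields` and the sample schema, including its `vector[384]` type string, are illustrative only. The token limit is not covered here, since it depends on the tokenizer.

```python
from typing import Any

# Field types the model does not support, per the limitations listed above.
UNSUPPORTED_TYPES = {"number[]", "boolean[]"}


def unsupported_fields(schema: dict[str, Any], prefix: str = "") -> list[str]:
    """Return dot-notation paths of fields the model cannot handle (hypothetical helper)."""
    found: list[str] = []
    for key, value in schema.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: check its fields recursively.
            found.extend(unsupported_fields(value, path))
        elif value in UNSUPPORTED_TYPES or key == "embedding":
            # Assumption: embedding fields are recognized by their key name.
            found.append(path)
    return found


# Illustrative schema, not taken from the training data.
schema = {
    "title": "string",
    "ratings": "number[]",
    "embedding": "vector[384]",
    "meta": {"flags": "boolean[]", "author": "string"},
}
flagged = unsupported_fields(schema)
print(flagged)
```

Fields reported by such a check could be removed from the schema before it is placed in the prompt.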

	## Citation

	If you use this model in your research, please cite:

	```
	@misc{query-translator-mini,
	  author = {OramaSearch Inc.},
	  title = {Query Translator Mini: Natural Language to Orama Query Translation},
	  year = {2024},
	  publisher = {HuggingFace},
	  journal = {HuggingFace Repository},
	  howpublished = {\url{https://huggingface.co/OramaSearch/query-translator-mini}}
	}
	```

	## License

	AGPLv3

	## Acknowledgments

	This model builds upon the Qwen 2.5 7B model and uses techniques from the PEFT library. Special thanks to the teams behind these projects.