This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
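For reference, training with Unsloth follows the usual TRL `SFTTrainer` recipe. The sketch below is an illustration only: the dataset (`b-mc2/sql-create-context`), the LoRA settings, and the training arguments are assumptions, not the exact configuration used to produce this model, and the `SFTTrainer` keyword arguments may differ slightly depending on your TRL version.

```python
# Illustrative fine-tuning sketch. The dataset, LoRA settings, and training
# arguments here are assumptions, NOT the exact recipe used for this model.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 2048

# Load the 4-bit base model this card lists.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)

# Attach LoRA adapters so only a small fraction of the weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # assumed LoRA rank
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

# Assumed training data: a public text-to-SQL corpus with question/context/answer
# columns, formatted with the same prompt template shown under Usage below.
template = """You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

You must output the SQL query that answers the question.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

dataset = load_dataset("b-mc2/sql-create-context", split = "train")
dataset = dataset.map(lambda ex: {
    "text": template.format(ex["question"], ex["context"], ex["answer"]) + tokenizer.eos_token
})

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()
```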
## Usage

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Choose any! Unsloth auto-supports RoPE scaling internally.
dtype = None           # None for auto-detection; Float16 for Tesla T4/V100, Bfloat16 for Ampere+.
load_in_4bit = True    # Use 4-bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "surajgorai/llama_3_8b_text_to_sql_model",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

# This must match the prompt template used during training.
prompt = """You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

You must output the SQL query that answers the question.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

inputs = tokenizer(
    [
        prompt.format(
            "Name the result/games for 54741",  # instruction (the question)
            "CREATE TABLE table_21436373_11 (result_games VARCHAR, attendance VARCHAR)",  # input (the table schema)
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
print(tokenizer.batch_decode(outputs))
```
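
The decoded output repeats the whole prompt before the generated SQL. One way to keep only the model's answer is to decode just the newly generated tokens:

```python
# Slice off the prompt tokens, then decode only what the model generated.
generated = outputs[0][inputs["input_ids"].shape[1]:]
sql = tokenizer.decode(generated, skip_special_tokens = True).strip()
print(sql)
```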