isaiahbjork commited on
Commit
2a5aa30
1 Parent(s): 0fb826a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +237 -3
README.md CHANGED
@@ -1,5 +1,7 @@
1
  ---
2
  base_model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
 
 
3
  language:
4
  - en
5
  license: apache-2.0
@@ -8,10 +10,242 @@ tags:
8
  - transformers
9
  - unsloth
10
  - llama
11
- - gguf
12
  ---
13
 
14
- # Uploaded model
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
15
 
16
  - **Developed by:** isaiahbjork
17
  - **License:** apache-2.0
@@ -19,4 +253,4 @@ tags:
19
 
20
  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
1
  ---
2
  base_model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
3
+ datasets:
4
+ - isaiahbjork/spelling-puzzles
5
  language:
6
  - en
7
  license: apache-2.0
 
10
  - transformers
11
  - unsloth
12
  - llama
13
+ - trl
14
  ---
15
 
16
+ # Llama 3.1 8B Logic
17
+ Prompt the model to "use COT" and it will think things out logically.
18
+
19
+ Basic Compound Words Evaluation (Below):
20
+ - Accuracy: 92.67%
21
+ - Auto-Graded Correct predictions: 139
22
+ - Total predictions: 150
23
+ - Eval sometimes doesn't parse the answer correctly so accuracy will be higher manually grading.
24
+
25
+ ## Example (Trained)
26
+
27
+ ### Instruction:
28
+ You are an expert at logic puzzles, reasoning, and planning
29
+
30
+ ### Input:
31
+ How many rs in strawberry?
32
+
33
+ ### Response:
34
+
35
+ We'll proceed through this methodically (CoT):
36
+
37
+ Step 1: What is the challenge prompting me to do?
38
+ Identify the frequency of letter "r" in the word "strawberry"
39
+
40
+ Step 2: Dissect the word.
41
+ S-T-R-A-W-B-E-R-R-Y
42
+
43
+ Step 3: Tally the letters.
44
+ S: 1, E: 1, T: 1, A: 1, Y: 1, W: 1, B: 1, R: 3
45
+
46
+ Step 4: Evaluate the letters we enumerated.
47
+ The focus letter was "r" and there we counted 3 total.
48
+
49
+ Conclusion:
50
+ The letter "r" is found 3 instances in "strawberry"
51
+
52
+ ## Example (not-trained)
53
+
54
+ ### Instruction:
55
+ You are a helpful assistant
56
+
57
+ ### Input:
58
+ hello whats your name?
59
+
60
+ ### Response:
61
+ I'm an expert logic puzzle solver, nice to meet you!
62
+
63
+ Let me reason through this logically (CoT):
64
+
65
+ Step 1: What's the task demanding me to do?
66
+
67
+ Identify the meaning of the word "hello" in this context
68
+
69
+ Step 2: Analyze the context.
70
+
71
+ You're asking me to introduce myself.
72
+
73
+ Step 3: Reason through the logic.
74
+
75
+ The inquiry is asking me to respond with an appropriate greeting.
76
+
77
+ Step 4: Sum up the logic.
78
+
79
+ The correct response is: "I'm an expert logic puzzle solver, nice to meet you!"
80
+
81
+ Final answer:
82
+
83
+ The final answer is: I'm an expert logic puzzle solver, nice to meet you!
84
+
85
+
86
+ ## Example (not-trained)
87
+
88
+ ### Instruction:
89
+ You are a helpful assistant
90
+
91
+ ### Input:
92
+ use cot - The crime occurred between 10 PM and midnight. Three suspects: Alice, Bob, and Charlie. Alice has an alibi from 9 PM to 11 PM. Bob's car was seen near the crime scene at 10:30 PM. Charlie lives 2 hours away from the crime scene. The victim's phone shows a call to Bob at 9:45 PM
93
+
94
+ ### Response:
95
+ We should reason through this logically (CoT):
96
+
97
+ Step 1: What's the problem prompting me to do?
98
+ Determine the suspect responsible for the crime
99
+
100
+ Step 2: Analyze the evidence.
101
+ - The crime took place between 10 PM and 12 AM.
102
+ - Alice has an alibi for 9 PM to 11 PM, so she's not a suspect.
103
+ - Bob's car was observed near the crime scene at 10:30 PM, which suggests he might be involved.
104
+ - Charlie lives 2 hours away, so it's improbable he was at the crime scene.
105
+ - The victim's phone records a call to Bob at 9:45 PM, which implies a connection between the two.
106
+
107
+ Step 3: Assess the evidence logically.
108
+ The call between Bob and the victim at 9:45 PM implies a relationship, but it doesn't necessarily imply guilt. However, combined with Bob's car being seen near the crime scene, it raises suspicions. Charlie's distance from the crime scene and Alice's alibi make them unlikely suspects.
109
+
110
+ Conclusion:
111
+ The reasoning suggests that Bob is the most likely culprit, based on the evidence.
112
+
113
+ ## Inference - Use in Google Colab
114
+
115
+ ```python
116
+ %%capture
117
+ # Installs Unsloth, Xformers (Flash Attention) and all other packages!
118
+ !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
119
+
120
+ # We have to check which Torch version for Xformers (2.3 -> 0.0.27)
121
+ from torch import __version__; from packaging.version import Version as V
122
+ xformers = "xformers==0.0.27" if V(__version__) < V("2.4.0") else "xformers"
123
+ !pip install --no-deps {xformers} trl peft accelerate bitsandbytes triton
124
+ alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
125
+
126
+ ### Instruction:
127
+ {}
128
+
129
+ ### Input:
130
+ {}
131
+
132
+ ### Response:
133
+ {}"""
134
+
135
+ from unsloth import FastLanguageModel
136
+ model, tokenizer = FastLanguageModel.from_pretrained(
137
+ model_name = "isaiahbjork/llama-3.1-8b-logic", # YOUR MODEL YOU USED FOR TRAINING
138
+ max_seq_length = max_seq_length,
139
+ dtype = dtype,
140
+ load_in_4bit = load_in_4bit,
141
+ )
142
+ FastLanguageModel.for_inference(model) # Enable native 2x faster inference
143
+
144
+
145
+ inputs = tokenizer(
146
+ [
147
+ alpaca_prompt.format(
148
+ "You are an expert at logic puzzles, reasoning, and planning", # instruction
149
+ "How many rs in strawberry?", # input
150
+ "", # output - leave this blank for generation!
151
+ )
152
+ ], return_tensors = "pt").to("cuda")
153
+
154
+ from transformers import TextStreamer
155
+ text_streamer = TextStreamer(tokenizer)
156
+ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 256)
157
+ ```
158
+
159
+ # Evaluation - Google Colab
160
+ ```python
161
+ import re
162
+ import random
163
+ from transformers import TextStreamer
164
+
165
+ # Function to parse the model output and extract the predicted count
166
+ def extract_count(output):
167
+ # Make the regex pattern more flexible
168
+ match = re.search(r'(?:The letter "[a-z]"|\w+\'s) (?:appears?|occurs?|present?|is found|exists?) (\d+)', output, re.IGNORECASE)
169
+ if match:
170
+ return int(match.group(1))
171
+ return None
172
+
173
+ # Function to generate test data
174
+ def generate_test_data(num_words=150):
175
+ words = ["Airplane", "Airport", "Angelfish", "Antfarm", "Ballpark", "Beachball", "Bikerack", "Billboard", "Blackhole", "Blueberry", "Boardwalk", "Bodyguard", "Bookstore", "Bow Tie", "Brainstorm", "Busboy", "Cabdriver", "Candlestick", "Car wash", "Cartwheel", "Catfish", "Caveman", "Chocolate chip", "Crossbow", "Daydream", "Deadend", "Doghouse", "Dragonfly", "Dress shoes", "Dropdown", "Earlobe", "Earthquake", "Eyeballs", "Father-in-law", "Fingernail", "Firecracker", "Firefighter", "Firefly", "Firework", "Fishbowl", "Fisherman", "Fishhook", "Football", "Forget", "Forgive", "French fries", "Goodnight", "Grandchild", "Groundhog", "Hairband", "Hamburger", "Handcuff", "Handout", "Handshake", "Headband", "Herself", "High heels", "Honeydew", "Hopscotch", "Horseman", "Horseplay", "Hotdog", "Ice cream", "Itself", "Kickball", "Kickboxing", "Laptop", "Lifetime", "Lighthouse", "Mailman", "Midnight", "Milkshake", "Moonrocks", "Moonwalk", "Mother-in-law", "Movie theater", "Newborn", "Newsletter", "Newspaper", "Nightlight", "Nobody", "Northpole", "Nosebleed", "Outer space", "Over-the-counter", "Overestimate", "Paycheck", "Policeman", "Ponytail", "Post card", "Racquetball", "Railroad", "Rainbow", "Raincoat", "Raindrop", "Rattlesnake", "Rockband", "Rocketship", "Rowboat", "Sailboat", "Schoolbooks", "Schoolwork", "Shoelace", "Showoff", "Skateboard", "Snowball", "Snowflake", "Softball", "Solar system", "Soundproof", "Spaceship", "Spearmint", "Starfish", "Starlight", "Stingray", "Strawberry", "Subway", "Sunglasses", "Sunroof", "Supercharge", "Superman", "Superstar", "Tablespoon", "Tailbone", "Tailgate", "Take down", "Takeout", "Taxpayer", "Teacup", "Teammate", "Teaspoon", "Tennis shoes", "Throwback", "Timekeeper", "Timeline", "Timeshare", "Tugboat", "Tupperware", "Underestimate", "Uplift", "Upperclassman", "Uptown", "Video game", "Wallflower", "Waterboy", "Watermelon", "Wheelchair", "Without", "Workboots", "Worksheet"]
176
+
177
+ letters = "aeioulprts"
178
+ test_data = []
179
+ for word in words[:num_words]:
180
+ letter = random.choice(letters)
181
+ actual_count = word.lower().count(letter) # Use lower() to count case-insensitively
182
+ test_data.append((word, letter, actual_count))
183
+ return test_data
184
+
185
+ # Alpaca prompt template
186
+ alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
187
+
188
+ ### Instruction:
189
+ {0}
190
+
191
+ ### Input:
192
+ {1}
193
+
194
+ ### Response:
195
+ """
196
+
197
+ # Generate test data
198
+ test_data = generate_test_data()
199
+
200
+
201
+ # Run evaluation
202
+ correct_predictions = 0
203
+ total_predictions = 0
204
+
205
+ for word, letter, actual_count in test_data:
206
+ input_text = f"How many {letter}'s in {word}?"
207
+ prompt = alpaca_prompt.format(
208
+ "You are an expert at logic puzzles, reasoning, and planning",
209
+ input_text,
210
+ ""
211
+ )
212
+
213
+ inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
214
+ text_streamer = TextStreamer(tokenizer)
215
+ output = model.generate(**inputs, streamer=text_streamer, max_new_tokens=256)
216
+
217
+ decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
218
+ print(f"Raw model output: {decoded_output}") # Print raw output for debugging
219
+ predicted_count = extract_count(decoded_output)
220
+
221
+ total_predictions += 1
222
+
223
+ if predicted_count is not None:
224
+ if predicted_count == actual_count:
225
+ correct_predictions += 1
226
+ else:
227
+ # If predicted_count is None and actual_count is 0, consider it correct
228
+ if actual_count == 0:
229
+ correct_predictions += 1
230
+ print(f"Warning: Could not extract a count from the model's response for '{word}'.")
231
+
232
+ print(f"Word: {word}, Letter: {letter}")
233
+ print(f"Actual count: {actual_count}, Predicted count: {predicted_count}")
234
+ print("Correct" if (predicted_count == actual_count or (predicted_count is None and actual_count == 0)) else "Incorrect")
235
+
236
+ # Calculate and print accuracy after each word
237
+ current_accuracy = correct_predictions / total_predictions
238
+ print(f"Current Accuracy: {current_accuracy:.2%}")
239
+ print(f"Correct predictions: {correct_predictions}")
240
+ print(f"Total predictions: {total_predictions}")
241
+ print("---")
242
+
243
+ # Calculate accuracy
244
+ accuracy = correct_predictions / total_predictions if total_predictions > 0 else 0
245
+ print(f"\nAccuracy: {accuracy:.2%}")
246
+ print(f"Correct predictions: {correct_predictions}")
247
+ print(f"Total predictions: {total_predictions}")
248
+ ```
249
 
250
  - **Developed by:** isaiahbjork
251
  - **License:** apache-2.0
 
253
 
254
  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
255
 
256
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)