isaiahbjork
committed on
Commit
•
2a5aa30
1
Parent(s):
0fb826a
Update README.md
README.md
CHANGED
@@ -1,5 +1,7 @@
|
|
1 |
---
|
2 |
base_model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
|
|
|
|
|
3 |
language:
|
4 |
- en
|
5 |
license: apache-2.0
|
@@ -8,10 +10,242 @@ tags:
|
|
8 |
- transformers
|
9 |
- unsloth
|
10 |
- llama
|
11 |
-
-
|
12 |
---
|
13 |
|
14 |
-
#
15 |
|
16 |
- **Developed by:** isaiahbjork
|
17 |
- **License:** apache-2.0
|
@@ -19,4 +253,4 @@ tags:
|
|
19 |
|
20 |
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
21 |
|
22 |
-
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
|
|
|
1 |
---
|
2 |
base_model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
|
3 |
+
datasets:
|
4 |
+
- isaiahbjork/spelling-puzzles
|
5 |
language:
|
6 |
- en
|
7 |
license: apache-2.0
|
|
|
10 |
- transformers
|
11 |
- unsloth
|
12 |
- llama
|
13 |
+
- trl
|
14 |
---
|
15 |
|
16 |
+
# Llama 3.1 8B Logic
|
17 |
+
Include "use COT" in the prompt and the model will reason through the problem step by step (chain of thought).
|
18 |
+
|
19 |
+
Basic compound-words evaluation (script below):
|
20 |
+
- Accuracy: 92.67%
|
21 |
+
- Auto-Graded Correct predictions: 139
|
22 |
+
- Total predictions: 150
|
23 |
+
- The auto-grader sometimes fails to parse the model's answer, so accuracy is likely higher when graded manually.
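
The reported accuracy follows directly from the two counts listed above. A minimal check of that arithmetic (the variable names are illustrative):

```python
# Figures taken from the evaluation summary above.
correct = 139  # auto-graded correct predictions
total = 150    # total predictions

accuracy = correct / total
print(f"Accuracy: {accuracy:.2%}")  # Accuracy: 92.67%
```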
|
24 |
+
|
25 |
+
## Example (Trained)
|
26 |
+
|
27 |
+
### Instruction:
|
28 |
+
You are an expert at logic puzzles, reasoning, and planning
|
29 |
+
|
30 |
+
### Input:
|
31 |
+
How many rs in strawberry?
|
32 |
+
|
33 |
+
### Response:
|
34 |
+
|
35 |
+
We'll proceed through this methodically (CoT):
|
36 |
+
|
37 |
+
Step 1: What is the challenge prompting me to do?
|
38 |
+
Identify the frequency of letter "r" in the word "strawberry"
|
39 |
+
|
40 |
+
Step 2: Dissect the word.
|
41 |
+
S-T-R-A-W-B-E-R-R-Y
|
42 |
+
|
43 |
+
Step 3: Tally the letters.
|
44 |
+
S: 1, E: 1, T: 1, A: 1, Y: 1, W: 1, B: 1, R: 3
|
45 |
+
|
46 |
+
Step 4: Evaluate the letters we enumerated.
|
47 |
+
The focus letter was "r" and there we counted 3 total.
|
48 |
+
|
49 |
+
Conclusion:
|
50 |
+
The letter "r" is found 3 instances in "strawberry"
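
The tally in Step 3 of the response above can be checked independently of the model. The following is a small sketch using Python's standard library; `count_letter` is a hypothetical helper name, not part of the model or the evaluation script.

```python
from collections import Counter


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return Counter(word.lower())[letter.lower()]


# Full per-letter tally, comparable to Step 3 of the model's response.
tally = Counter("strawberry")
print(dict(tally))

# The specific question from the example: how many r's?
print(count_letter("strawberry", "r"))  # 3
```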
|
51 |
+
|
52 |
+
## Example (Not Trained)
|
53 |
+
|
54 |
+
### Instruction:
|
55 |
+
You are a helpful assistant
|
56 |
+
|
57 |
+
### Input:
|
58 |
+
hello whats your name?
|
59 |
+
|
60 |
+
### Response:
|
61 |
+
I'm an expert logic puzzle solver, nice to meet you!
|
62 |
+
|
63 |
+
Let me reason through this logically (CoT):
|
64 |
+
|
65 |
+
Step 1: What's the task demanding me to do?
|
66 |
+
|
67 |
+
Identify the meaning of the word "hello" in this context
|
68 |
+
|
69 |
+
Step 2: Analyze the context.
|
70 |
+
|
71 |
+
You're asking me to introduce myself.
|
72 |
+
|
73 |
+
Step 3: Reason through the logic.
|
74 |
+
|
75 |
+
The inquiry is asking me to respond with an appropriate greeting.
|
76 |
+
|
77 |
+
Step 4: Sum up the logic.
|
78 |
+
|
79 |
+
The correct response is: "I'm an expert logic puzzle solver, nice to meet you!"
|
80 |
+
|
81 |
+
Final answer:
|
82 |
+
|
83 |
+
The final answer is: I'm an expert logic puzzle solver, nice to meet you!
|
84 |
+
|
85 |
+
|
86 |
+
## Example (Not Trained)
|
87 |
+
|
88 |
+
### Instruction:
|
89 |
+
You are a helpful assistant
|
90 |
+
|
91 |
+
### Input:
|
92 |
+
use cot - The crime occurred between 10 PM and midnight. Three suspects: Alice, Bob, and Charlie. Alice has an alibi from 9 PM to 11 PM. Bob's car was seen near the crime scene at 10:30 PM. Charlie lives 2 hours away from the crime scene. The victim's phone shows a call to Bob at 9:45 PM
|
93 |
+
|
94 |
+
### Response:
|
95 |
+
We should reason through this logically (CoT):
|
96 |
+
|
97 |
+
Step 1: What's the problem prompting me to do?
|
98 |
+
Determine the suspect responsible for the crime
|
99 |
+
|
100 |
+
Step 2: Analyze the evidence.
|
101 |
+
- The crime took place between 10 PM and 12 AM.
|
102 |
+
- Alice has an alibi for 9 PM to 11 PM, so she's not a suspect.
|
103 |
+
- Bob's car was observed near the crime scene at 10:30 PM, which suggests he might be involved.
|
104 |
+
- Charlie lives 2 hours away, so it's improbable he was at the crime scene.
|
105 |
+
- The victim's phone records a call to Bob at 9:45 PM, which implies a connection between the two.
|
106 |
+
|
107 |
+
Step 3: Assess the evidence logically.
|
108 |
+
The call between Bob and the victim at 9:45 PM implies a relationship, but it doesn't necessarily imply guilt. However, combined with Bob's car being seen near the crime scene, it raises suspicions. Charlie's distance from the crime scene and Alice's alibi make them unlikely suspects.
|
109 |
+
|
110 |
+
Conclusion:
|
111 |
+
The reasoning suggests that Bob is the most likely culprit, based on the evidence.
|
112 |
+
|
113 |
+
## Inference - Use in Google Colab
|
114 |
+
|
115 |
+
```python
%%capture
# Run this install step in its own Colab cell: %%capture must be the first line
# of a cell, and it hides all of that cell's output.
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

# We have to check which Torch version for Xformers (2.3 -> 0.0.27)
from torch import __version__; from packaging.version import Version as V
xformers = "xformers==0.0.27" if V(__version__) < V("2.4.0") else "xformers"
!pip install --no-deps {xformers} trl peft accelerate bitsandbytes triton

# --- Run everything below in a second cell so the generated text is visible ---
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

from unsloth import FastLanguageModel

# Assumed settings (not given in the original snippet); these follow the
# defaults used in Unsloth's example notebooks. Adjust as needed.
max_seq_length = 2048  # maximum context length
dtype = None           # None lets Unsloth auto-detect the dtype
load_in_4bit = True    # 4-bit quantization, matching the bnb-4bit base model

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "isaiahbjork/llama-3.1-8b-logic",  # the fine-tuned model
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "You are an expert at logic puzzles, reasoning, and planning",  # instruction
            "How many rs in strawberry?",  # input
            "",  # output - leave this blank for generation!
        )
    ], return_tensors = "pt").to("cuda")

# Stream tokens to the cell output as they are generated
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 256)
```
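
To request chain-of-thought reasoning for an arbitrary question, the trigger phrase can be prepended to the input field of the Alpaca template. The sketch below assumes the `use cot - ` prefix format shown in the example above; `build_prompt` and its `use_cot` flag are hypothetical names introduced here for illustration, and the snippet only builds the prompt string (it does not load the model).

```python
# Same Alpaca template as in the inference snippet above.
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""


def build_prompt(instruction: str, user_input: str, use_cot: bool = False) -> str:
    """Fill the Alpaca template, optionally prefixing the input with "use cot - "."""
    if use_cot:
        user_input = f"use cot - {user_input}"
    # The response slot stays empty so the model generates it.
    return ALPACA_PROMPT.format(instruction, user_input, "")


prompt = build_prompt(
    "You are a helpful assistant",
    "How many rs in strawberry?",
    use_cot=True,
)
print(prompt)
```

The resulting string can be passed to `tokenizer([prompt], return_tensors="pt")` in place of the inline `alpaca_prompt.format(...)` call.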
|
158 |
+
|
159 |
+
## Evaluation - Google Colab
|
160 |
+
```python
# Requires `model` and `tokenizer` to be loaded already (see the inference snippet).
import re
import random
from transformers import TextStreamer

# Function to parse the model output and extract the predicted count
def extract_count(output):
    # Make the regex pattern more flexible
    match = re.search(r'(?:The letter "[a-z]"|\w+\'s) (?:appears?|occurs?|present?|is found|exists?) (\d+)', output, re.IGNORECASE)
    if match:
        return int(match.group(1))
    return None

# Function to generate test data: one (word, letter, actual_count) tuple per word
def generate_test_data(num_words=150):
    words = ["Airplane", "Airport", "Angelfish", "Antfarm", "Ballpark", "Beachball", "Bikerack", "Billboard", "Blackhole", "Blueberry", "Boardwalk", "Bodyguard", "Bookstore", "Bow Tie", "Brainstorm", "Busboy", "Cabdriver", "Candlestick", "Car wash", "Cartwheel", "Catfish", "Caveman", "Chocolate chip", "Crossbow", "Daydream", "Deadend", "Doghouse", "Dragonfly", "Dress shoes", "Dropdown", "Earlobe", "Earthquake", "Eyeballs", "Father-in-law", "Fingernail", "Firecracker", "Firefighter", "Firefly", "Firework", "Fishbowl", "Fisherman", "Fishhook", "Football", "Forget", "Forgive", "French fries", "Goodnight", "Grandchild", "Groundhog", "Hairband", "Hamburger", "Handcuff", "Handout", "Handshake", "Headband", "Herself", "High heels", "Honeydew", "Hopscotch", "Horseman", "Horseplay", "Hotdog", "Ice cream", "Itself", "Kickball", "Kickboxing", "Laptop", "Lifetime", "Lighthouse", "Mailman", "Midnight", "Milkshake", "Moonrocks", "Moonwalk", "Mother-in-law", "Movie theater", "Newborn", "Newsletter", "Newspaper", "Nightlight", "Nobody", "Northpole", "Nosebleed", "Outer space", "Over-the-counter", "Overestimate", "Paycheck", "Policeman", "Ponytail", "Post card", "Racquetball", "Railroad", "Rainbow", "Raincoat", "Raindrop", "Rattlesnake", "Rockband", "Rocketship", "Rowboat", "Sailboat", "Schoolbooks", "Schoolwork", "Shoelace", "Showoff", "Skateboard", "Snowball", "Snowflake", "Softball", "Solar system", "Soundproof", "Spaceship", "Spearmint", "Starfish", "Starlight", "Stingray", "Strawberry", "Subway", "Sunglasses", "Sunroof", "Supercharge", "Superman", "Superstar", "Tablespoon", "Tailbone", "Tailgate", "Take down", "Takeout", "Taxpayer", "Teacup", "Teammate", "Teaspoon", "Tennis shoes", "Throwback", "Timekeeper", "Timeline", "Timeshare", "Tugboat", "Tupperware", "Underestimate", "Uplift", "Upperclassman", "Uptown", "Video game", "Wallflower", "Waterboy", "Watermelon", "Wheelchair", "Without", "Workboots", "Worksheet"]

    letters = "aeioulprts"
    test_data = []
    for word in words[:num_words]:
        letter = random.choice(letters)  # random target letter, so runs differ
        actual_count = word.lower().count(letter)  # Use lower() to count case-insensitively
        test_data.append((word, letter, actual_count))
    return test_data

# Alpaca prompt template (the response section is left empty for generation)
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{0}

### Input:
{1}

### Response:
"""

# Generate test data
test_data = generate_test_data()

# Run evaluation
correct_predictions = 0
total_predictions = 0

for word, letter, actual_count in test_data:
    input_text = f"How many {letter}'s in {word}?"
    prompt = alpaca_prompt.format(
        "You are an expert at logic puzzles, reasoning, and planning",
        input_text,
    )

    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    text_streamer = TextStreamer(tokenizer)
    output = model.generate(**inputs, streamer=text_streamer, max_new_tokens=256)

    decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
    print(f"Raw model output: {decoded_output}")  # Print raw output for debugging
    predicted_count = extract_count(decoded_output)

    total_predictions += 1

    if predicted_count is not None:
        if predicted_count == actual_count:
            correct_predictions += 1
    else:
        # If predicted_count is None and actual_count is 0, consider it correct
        if actual_count == 0:
            correct_predictions += 1
        print(f"Warning: Could not extract a count from the model's response for '{word}'.")

    print(f"Word: {word}, Letter: {letter}")
    print(f"Actual count: {actual_count}, Predicted count: {predicted_count}")
    print("Correct" if (predicted_count == actual_count or (predicted_count is None and actual_count == 0)) else "Incorrect")

    # Calculate and print accuracy after each word
    current_accuracy = correct_predictions / total_predictions
    print(f"Current Accuracy: {current_accuracy:.2%}")
    print(f"Correct predictions: {correct_predictions}")
    print(f"Total predictions: {total_predictions}")
    print("---")

# Calculate accuracy
accuracy = correct_predictions / total_predictions if total_predictions > 0 else 0
print(f"\nAccuracy: {accuracy:.2%}")
print(f"Correct predictions: {correct_predictions}")
print(f"Total predictions: {total_predictions}")
```
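
Because the auto-grader depends on a single regex, responses phrased differently from the expected pattern are not parsed. One possible mitigation, sketched here as an assumption rather than as the method used for the reported numbers, is to fall back to the last integer that follows a `Conclusion:` marker. `extract_count_lenient` is a hypothetical name, and the fallback heuristic has not been validated against the model's outputs.

```python
import re

# Same pattern as extract_count in the evaluation script above.
PRIMARY = re.compile(
    r'(?:The letter "[a-z]"|\w+\'s) (?:appears?|occurs?|present?|is found|exists?) (\d+)',
    re.IGNORECASE,
)


def extract_count_lenient(output):
    """Try the primary pattern, then fall back to the conclusion's last integer."""
    match = PRIMARY.search(output)
    if match:
        return int(match.group(1))
    # Fallback (assumption): the answer is the last integer after "Conclusion:".
    _, marker, tail = output.rpartition("Conclusion:")
    if marker:
        numbers = re.findall(r"\d+", tail)
        if numbers:
            return int(numbers[-1])
    return None


# Phrasing from the trained example: handled by the primary pattern.
print(extract_count_lenient('Conclusion:\nThe letter "r" is found 3 instances in "strawberry"'))
# A made-up alternative phrasing: handled only by the fallback.
print(extract_count_lenient('Conclusion:\nThere are 2 instances of "e" in "Firefighter"'))
```

Swapping this in for `extract_count` would change which responses are auto-graded, so accuracy figures produced with it are not directly comparable to the ones reported above.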
|
249 |
|
250 |
- **Developed by:** isaiahbjork
|
251 |
- **License:** apache-2.0
|
|
|
253 |
|
254 |
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
255 |
|
256 |
+
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
|