---
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- chat
- trl
- sft
- math
library_name: transformers
model-index:
- name: Qwen2.5-1.5B-Instruct-QwQ
  results:
  - task:
      type: text-generation
    dataset:
      name: GSM8k
      type: gsm8k
    metrics:
    - name: pass@4
      type: pass@4
      value: 89.6
      verified: false
---
# Qwen2.5-1.5B-Instruct-QwQ
## Introduction
Qwen2.5-1.5B-Instruct-QwQ is a fine-tuned model based on Qwen2.5-1.5B-Instruct. It was fine-tuned on roughly 20k samples from QwQ-32B-Preview. Compared to Qwen2.5-1.5B-Instruct, this fine-tuned model appears to perform better on mathematics and general reasoning. It also shows some capacity for self-correction, although this seems limited; larger models seem to learn self-correction better (e.g. the 3B and 7B versions show much better self-correction abilities in my experiments).
For data generation, math problems from the train sets of the GSM8k and MATH datasets were used.
**This repo contains the instruction-tuned 1.5B Qwen2.5 model fine-tuned on QwQ reasoning chains**, which has the following features:
- Type: Causal Language Models
- Training Stage: Pretraining & Post-training
- Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings
- Number of Parameters: 1.54B
- Number of Parameters (Non-Embedding): 1.31B
- Number of Layers: 28
- Number of Attention Heads (GQA): 12 for Q and 2 for KV
- Context Length: Full 32,768 tokens and generation 8192 tokens
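As a rough sanity check on the figures in the list above, the following sketch works out how the 12 query heads are shared between the 2 KV heads and how large the tied embedding matrix is. The vocabulary size and hidden size are not stated in this card; they are assumed here from the base model's published configuration, and the contiguous head grouping is the usual GQA convention rather than something this card confirms.

```python
# Figures taken from the feature list above.
num_q_heads = 12
num_kv_heads = 2
total_params_b = 1.54
non_embedding_params_b = 1.31

# Assumed values (not stated in this card): base model vocab and hidden size.
vocab_size = 151_936
hidden_size = 1_536

# GQA: each KV head is shared by a contiguous group of query heads.
group_size = num_q_heads // num_kv_heads
kv_head_for_q = [q // group_size for q in range(num_q_heads)]

# Tied word embeddings: one vocab x hidden matrix serves as input and output.
embedding_params_b = vocab_size * hidden_size / 1e9
gap_b = total_params_b - non_embedding_params_b

print(group_size)      # query heads per KV head
print(kv_head_for_q)   # KV head index used by each query head
print(round(embedding_params_b, 2), round(gap_b, 2))
```

Under these assumptions the embedding matrix accounts for the difference between the total and non-embedding parameter counts.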
## Quickstart
The following code snippet uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "micaebe/Qwen2.5-1.5B-Instruct-QwQ"

# Load the model and tokenizer; dtype and device placement are chosen automatically
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
    {"role": "user", "content": prompt}
]
# Render the chat messages into a single prompt string
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
Here is an example for the question "Is 9.9 or 9.10 larger?", decoded greedily. The model gets the answer right, but the reasoning is somewhat erratic: it starts out strong, begins to drift when it tries to approach the problem from another angle, and then gets back on track at the end and outputs the correct answer:
```
<|im_start|>system
You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.<|im_end|>
<|im_start|>user
Is 9.9 or 9.10 larger?<|im_end|>
<|im_start|>assistant
So I have this problem here: "Is 9.9 or 9.10 larger?" Okay, let's see what it's asking.
First off, both numbers are decimals with the same whole number part, which is 9 in this case. So that means they're both nine units long, right?
Now, when comparing two decimal numbers where the whole parts are equal, we just need to compare the digits after the decimal point. That makes sense because if the whole parts are the same, then the difference must be in the tenths place.
Let me write down the numbers again:
- 9.9
- 9.10
Okay, so the whole part is 9 for both, but the tenths places differ. In 9.9, the tenth is 9, and in 9.10, the tenth is 1.
Since 9 is greater than 1, it follows that 9.9 is greater than 9.10.
Wait a minute, does that make sense? Let's think about it another way.
Imagine you have 9 apples, and someone gives you an additional 0.9 of an apple. Now, do you have more apples or fewer apples compared to having 9.10 apples?
Well, 0.9 is less than 1, so adding it to 9 would give you 9.9, which is still less than 9.10.
But wait, maybe there's something else to consider. What if I were to add these numbers together?
Let's try adding them up:
9.9 + 9.10 = ?
Hmm, that might help clarify things.
Adding 9.9 and 9.10:
First, align the decimal points:
9.90
+ 9.10
------
Now, add the hundredths first:
0 + 0 = 0
Then, tens:
9 + 1 = 10
Bring down the zero from the hundredths place:
10.0
Finally, ones:
9 + 9 = 18
So, 9.9 + 9.10 = 18.00
That seems straightforward. But earlier, I thought that since the whole parts are the same, the comparison should be based on the tenths place only. But now, adding them shows that 9.9 is indeed smaller than 9.10.
I'm getting confused here. Maybe I need to rethink my approach.
Alternatively, perhaps I can look at the numbers as fractions to get a better understanding.
Both 9.9 and 9.10 can be written as fractions:
9.9 = 99/10
9.10 = 91/10
Now, comparing 99/10 and 91/10.
Clearly, 99 is greater than 91, so 99/10 > 91/10.
Therefore, 9.9 > 9.10.
But wait, that doesn't seem right either. Earlier, when I added them, 9.9 + 9.10 = 18.00, which suggests that 9.9 is less than 9.10.
Maybe I'm missing something fundamental here.
Let me think differently.
Consider the numbers as lengths on a number line.
The number line has negative numbers to the left and positive numbers to the right.
If I plot -9.9 and -9.10 on the number line, which one is farther to the left?
Well, -9.9 is further to the left than -9.10, because -9.9 is closer to zero than -9.10.
Similarly, if I plot 9.9 and 9.10 on the number line, which one is farther to the right?
Again, 9.9 is farther to the right than 9.10, because 9.9 is closer to ten than 9.10.
This reinforces that 9.9 is greater than 9.10.
Wait, but earlier, when I was trying to add them, I got 18.00, which is greater than any other possible sum involving 9.9 and 9.10.
So, perhaps the confusion lies in how I interpreted the addition.
Let me re-examine the addition:
9.9 + 9.10 = 18.00
Yes, that's correct.
And since 18.00 is greater than any other possible sum involving 9.9 and 9.10, it confirms that 9.9 is greater than 9.10.
But wait, maybe I should consider that 9.9 is actually 9.90, not 9.9.
Because sometimes people write decimals without trailing zeros, like 9.9 instead of 9.90.
In that case, 9.90 is exactly the same as 9.9, so the comparison remains the same.
Therefore, 9.90 is greater than 9.10.
But according to standard notation, 9.9 is typically used, even though 9.90 is mathematically equivalent.
So, sticking with 9.9, which is 9.90, it's clear that 9.9 is larger than 9.10.
Hence, the answer is 9.9 is larger.
**Final Answer**
\[ \boxed{9.9} \]<|im_end|>
```
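The transcript above ends with the answer wrapped in `\boxed{...}`. A minimal sketch of how such an answer could be pulled out of a response and checked is shown below; `extract_boxed` is a hypothetical helper written for this illustration, not part of the model or of `transformers`, and its regular expression does not handle nested braces.

```python
import re
from decimal import Decimal
from typing import Optional

def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

# Tail of the response shown in the transcript above.
response_tail = r"**Final Answer** \[ \boxed{9.9} \]"
answer = extract_boxed(response_tail)

# Decimal compares numeric values exactly, so 9.10 is treated as 9.1.
larger = max(Decimal("9.9"), Decimal("9.10"))

print(answer, Decimal(answer) == larger)
```

Comparing with `Decimal` rather than as strings avoids the trap the model circles around in its reasoning, where "10" after the decimal point looks larger than "9".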
For comparison on GSM8k: the base instruct model (Qwen2.5-1.5B-Instruct) scores 73.2% on the test set.
Disclaimer: the GSM8k scores currently cover only the first 27% of the test set. They will be updated once the model has been evaluated on the full test set.
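This card does not describe how the pass@4 figure in the metadata was computed. The sketch below shows one common way to compute pass@k, the combinatorial estimator over `n` samples per problem of which `c` are correct; the function names and the toy counts are hypothetical. With `n = k = 4` it reduces to "at least one of four samples is correct".

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k samples, drawn without replacement
    from n samples of which c are correct, is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def benchmark_pass_at_k(correct_counts, n: int, k: int) -> float:
    """Average pass@k over problems, given the correct-sample count per problem."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)

# Hypothetical toy data: number of correct samples (out of 4) for five problems.
toy_counts = [4, 0, 1, 2, 0]
score = benchmark_pass_at_k(toy_counts, n=4, k=4)
print(f"pass@4 = {score:.1%}")
```

Because pass@4 gives credit whenever any of the four samples is correct, it is not directly comparable to a single-sample accuracy.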