Misleading advertising as a reasoning model
It is misleading to advertise this model as a reasoning model. This is just a Qwen2 fine-tune on a CoT dataset with no major architectural changes. OpenAI o1, by contrast, is reportedly built on a fundamentally different approach rather than a plain transformer fine-tune, and may actually be a breakthrough. Adding "o1" to this model's name just deceives people into thinking it is something similarly new, or that it has the same capabilities as OpenAI o1.
Did you try it with MCTS?
Use MCTS. I am using llama.cpp with a Q8 quant.
which is bigger 9.11 or 9.9, use a MCTS (step)
Certainly! Let's analyze whether 9.11 is bigger than 9.9 using the Monte Carlo Tree Search (MCTS) approach. MCTS is typically used for decision-making in complex environments like game trees, but for this simple comparison I'll adapt the concept into a more straightforward method that resembles the MCTS steps.
**Step 1: Define the Problem**
We need to determine whether 9.11 is greater than 9.9.
**Step 2: Initialize the Search Tree**
Imagine we have a tree where each node represents a comparison between two numbers. The root node is our initial comparison between 9.11 and 9.9.
**Step 3: Expand the Tree**
At the root node, we expand by considering different aspects to compare 9.11 and 9.9:
- **First Digit Comparison:** Compare the whole number parts.
- **Second Digit Comparison:** If the whole numbers are equal, compare the tenths place.
- **Third Digit Comparison:** If necessary, compare the hundredths place.
**Step 4: Simulate Multiple Outcomes**
Since MCTS relies on simulations, we'll simulate multiple scenarios:
1. **Scenario 1:** 9.11 vs. 9.9
- Compare whole numbers: 9 vs. 9 → Equal.
- Compare tenths place: 1 vs. 9 → 1 < 9.
- Conclusion: 9.11 < 9.9.
2. **Scenario 2:** 9.9 vs. 9.11
- Compare whole numbers: 9 vs. 9 → Equal.
- Compare tenths place: 9 vs. 1 → 9 > 1.
- Conclusion: 9.9 > 9.11.
3. **Scenario 3:** 9.11 vs. 9.9
- Same as Scenario 1, leading to the same conclusion.
**Step 5: Backpropagation**
After simulating multiple scenarios, we update the tree to reflect the outcomes:
- The path leading to 9.11 being less than 9.9 is confirmed in all simulations.
- The path leading to 9.9 being greater than 9.11 is also confirmed.
**Step 6: Decision Making**
Based on the simulations, it's clear that 9.11 is not greater than 9.9.
**Conclusion**
Through this adapted MCTS approach, we've determined that 9.11 is less than 9.9 by comparing their whole numbers and decimal places step by step.
**Alternative Approach: Direct Comparison**
For clarity, let's compare the two numbers directly:
- 9.11 has a value of nine and eleven hundredths.
- 9.9 has a value of nine and nine tenths.
Since nine tenths (0.90) is greater than eleven hundredths (0.11), it follows that 9.9 > 9.11.
**Final Answer:**
9.9 is bigger than 9.11.
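To make the narrated steps above concrete, here is a minimal Python sketch of the same "adapted MCTS" idea: two child nodes for the two possible orderings, a few deterministic "simulations", and a tally that plays the role of backpropagation. The names (`Node`, `compare_decimals`) are invented for this illustration; with a single deterministic comparison there is nothing to actually search, so every simulation returns the same result.

```python
# Toy sketch of the "adapted MCTS" walkthrough above. Purely illustrative:
# with one deterministic comparison, all "simulations" agree.
from dataclasses import dataclass


@dataclass
class Node:
    label: str        # the claim this child node represents, e.g. "9.11 > 9.9"
    visits: int = 0   # simulations that passed through this node
    wins: int = 0     # simulations that confirmed the claim


def compare_decimals(a: str, b: str) -> bool:
    """Digit-wise comparison: whole part first, then tenths, hundredths, ..."""
    a_int, a_frac = a.split(".")
    b_int, b_frac = b.split(".")
    if int(a_int) != int(b_int):
        return int(a_int) > int(b_int)
    width = max(len(a_frac), len(b_frac))          # pad so "9" compares as "90"
    return int(a_frac.ljust(width, "0")) > int(b_frac.ljust(width, "0"))


# Steps 2-3: the root expands into two children, one per possible ordering.
children = [Node("9.11 > 9.9"), Node("9.9 > 9.11")]

# Step 4: "simulate" each scenario a few times (deterministic here).
for _ in range(3):
    for node in children:
        left, _, right = node.label.split()
        node.visits += 1
        node.wins += compare_decimals(left, right)

# Steps 5-6: the backpropagated tallies decide which claim held up.
best = max(children, key=lambda n: n.wins / n.visits)
print(best.label)   # -> 9.9 > 9.11
```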
I tested the model and I have to say this is not an o1-kind of model at all. I guess with reinforcement learning your model could be called an o1-kind of model; I am looking forward to that. Meanwhile, releasing such a model while claiming "o1" might damage the Qwen brand in the open-source community.
So how is this better than gemma-2-9b-it-abliterated-Q4_K_M.gguf?
"which is bigger 9.11 or 9.9, use a MCTS (step)"
Here's how we can think about this using Monte Carlo Tree Search (MCTS), although it's a bit of an over-complication for such a simple comparison:
1. Define the Problem:
Node States: We have two possible node states: 9.11 and 9.9.
Goal: Determine which state has a higher value (is bigger).
2. Tree Construction: Start with a root node representing the choice between 9.11 and 9.9.
Each child node represents one of the possible values.
3. Expansion: Since we only have two values, expansion is straightforward. Each child node would be labeled as 9.11 or 9.9.
4. Selection: We don't need complex selection strategies here because it's a clear comparison. We simply select the child node with the higher value (9.9).
5. Simulation/Evaluation: The simulation is trivial. We compare the selected node (9.9) to the other node (9.11). We can conclude that 9.9 is bigger.
6. Backpropagation: We update the probabilities and values of the nodes based on the outcome of our comparison. The 9.9 node would be favored.
Result: After a single iteration of MCTS, we confidently conclude that 9.9 is bigger than 9.11.
Let me know if you'd like to explore a more complex example with MCTS!
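Since the reply above offers a "more complex example", here is a generic sketch of what a full MCTS loop involves: UCT selection, expansion, simulation, and backpropagation. The toy reward (1.0 for picking the larger number) and all names are illustrative stand-ins assumed for this example, not anything this particular model or llama.cpp implements.

```python
# Generic MCTS sketch: UCT selection, expansion, simulation, backpropagation.
import math


class MCTSNode:
    def __init__(self, picked=None, parent=None):
        self.picked = picked      # the value chosen on the edge into this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0          # accumulated reward

    def uct(self, exploration=1.4):
        # Unvisited nodes are explored first; otherwise balance the mean reward
        # against an exploration bonus that shrinks as the node is revisited.
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + exploration * math.sqrt(math.log(self.parent.visits) / self.visits))


def rollout(node, target):
    # Terminal evaluation: reward 1.0 if this branch picked the largest number.
    return 1.0 if node.picked == target else 0.0


def mcts(choices, iterations=100):
    root = MCTSNode()
    root.children = [MCTSNode(picked=c, parent=root) for c in choices]  # expansion
    target = max(choices)
    for _ in range(iterations):
        leaf = max(root.children, key=MCTSNode.uct)   # selection
        reward = rollout(leaf, target)                # simulation
        node = leaf                                   # backpropagation
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Decision: the most-visited child is the answer.
    return max(root.children, key=lambda n: n.visits).picked


print(mcts([9.11, 9.9]))   # -> 9.9
```

In a real application the children would be whole action sequences and the rollout would be a random playout; here the tree is one ply deep, so the loop converges after a handful of iterations.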
Interesting...
I have never used "use a MCTS (step)" with any model before.
It seems to improve every model's performance.
I'll have to play with this myself this weekend. That thought never even crossed my mind, nor had I heard of the idea until today.
Wasn't there a paper recently that stated that most of that CoT stuff is BS, and that a model could write any kind of random tokens before its answer and still achieve (slightly) better performance than one that doesn't? This feels like an example of that.
In any case, this is a model relying on hype and some basic fine-tuning. Shameful, but not unexpected.