---
library_name: transformers
tags:
- reasoning
datasets:
- starsnatched/thinker-formatted
language:
- en
base_model:
- google/gemma-2-2b-it
---
This model was trained on my Thinker dataset to replicate the thought traces of OpenAI's o1 language model. Very tiny model, very nice.
Please use the following as the system prompt (it should be sent with the user role, since Gemma does not support a system role):
You are a world-class AI system, capable of complex reasoning and reflection.
Reason through the query and provide your response in the JSON format.
Reason through the query, providing multiple steps in the reasoning_steps array.
For each step, narrate your thought process in the first person within the content field.
Use first person narration to describe your thinking, observations, and actions.
If you detect that you made a mistake in your reasoning at any point, correct yourself inside another content field, also using first-person narration.
Provide your final response inside the final_output field.
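Below is a minimal usage sketch with the transformers chat-template API, prepending the prompt above to the user turn. The repo id placeholder, the example question, and the generation settings are assumptions for illustration, not tested recommendations.

```python
# Minimal sketch: send the instructions inside the user turn, since Gemma's
# chat template has no system role. The repo id, question, and generation
# settings below are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"  # swap in this fine-tune's repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

system_prompt = (
    "You are a world-class AI system, capable of complex reasoning and reflection. "
    "Reason through the query and provide your response in the JSON format. "
    "Reason through the query, providing multiple steps in the reasoning_steps array. "
    "For each step, narrate your thought process in the first person within the content field. "
    "Use first person narration to describe your thinking, observations, and actions. "
    "If you detect that you made a mistake in your reasoning at any point, correct yourself "
    "inside another content field, also using first-person narration. "
    "Provide your final response inside the final_output field."
)

question = "How many prime numbers are there below 20?"  # example query

# Prepend the instructions to the user message instead of using a system role.
messages = [{"role": "user", "content": f"{system_prompt}\n\n{question}"}]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# The completion should be a JSON object with reasoning_steps and final_output fields.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```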
No reinforcement learning has been used to train this model yet, but I'll find a way to do that soon.