
Llama 3 finetuned on my TRRR-CoT Dataset

cookinai/TRRR-CoT

  • This was an attempt at synthetically generating a CoT dataset and then finetuning a model on it to see the results.
  • From what I notice, when using the correct prompt template the model almost always uses the TRRR format, but I am still awaiting benchmark tests to see whether this improves anything.
  • TRRR stands for:
  1. Think about your response
  2. Respond how you normally would
  3. Reflect on your response
  4. Respond again, this time using all the information you now have
  • The model usually tries to follow this format. It may mix it up a little, but it almost always reflects in some way, especially if you tell it to think step by step.

  • Interestingly enough, when I finetuned Mistral 7B on this dataset, I could not get the model to use CoT at all, whereas Llama 3 picked it up after only one epoch.

  • Developed by: cookinai

  • License: apache-2.0

  • Finetuned from model: unsloth/llama-3-8b-bnb-4bit
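The four TRRR stages listed above can be checked mechanically once you have some model output. The card does not say how the model marks each stage, so the sketch below assumes, purely for illustration, that each stage starts on its own line with a label such as `Think:` or `Respond:`. The function names are hypothetical as well. Adjust the pattern to whatever the model actually emits.

```python
import re
from typing import List, Optional, Tuple

# Stage order as given in the card: Think, Respond, Reflect, Respond.
TRRR_STAGES = ["Think", "Respond", "Reflect", "Respond"]

# ASSUMPTION (not from the card): every stage begins on its own line
# with "<Stage>:" as a label. Change this pattern to match real output.
_STAGE_HEADER = re.compile(r"^(Think|Respond|Reflect):[ \t]*", re.MULTILINE)


def split_trrr(text: str) -> List[Tuple[str, str]]:
    """Return (stage, body) pairs in the order they appear in `text`."""
    matches = list(_STAGE_HEADER.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        # A stage body runs until the next stage label or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((match.group(1), text[match.end():end].strip()))
    return sections


def follows_trrr(text: str) -> bool:
    """True if the stages appear exactly as Think, Respond, Reflect, Respond."""
    return [stage for stage, _ in split_trrr(text)] == TRRR_STAGES


def final_answer(text: str) -> Optional[str]:
    """Body of the last Respond stage, or None if no Respond stage exists."""
    responds = [body for stage, body in split_trrr(text) if stage == "Respond"]
    return responds[-1] if responds else None
```

Since the model may mix the format up a little, `follows_trrr` is deliberately strict, while `final_answer` still returns the last `Respond` body when the stage order differs.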

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
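For trying the model locally, a minimal sketch with the `transformers` library might look like the following. It assumes the weights load through `AutoModelForCausalLM` under the repository id shown on this page, that `torch` and `accelerate` are installed (needed for `device_map="auto"`), and that a plain-text prompt is acceptable. The card does not give the exact prompt template, so `build_prompt` is a hypothetical stand-in that only appends the "think step by step" hint mentioned above.

```python
MODEL_ID = "cookinai/LlamaReflect-8B-CoT-safetensors"


def build_prompt(question: str) -> str:
    """Append the step-by-step hint that the card says encourages reflection.

    ASSUMPTION: a plain-text prompt. The card does not state the real
    prompt template, so replace this with the correct one if you know it.
    """
    return f"{question.strip()}\nThink step by step."


def generate(question: str, max_new_tokens: int = 512) -> str:
    """Load the model in FP16 and return only the newly generated text."""
    # Heavy dependencies are imported here so that build_prompt can be
    # used without torch or transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # matches the FP16 tensor type of the checkpoint
        device_map="auto",          # requires the accelerate package
    )

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Why is the sky blue?")` downloads roughly 8B parameters of FP16 weights on first use, so expect it to need a GPU with enough memory or a long wait on CPU.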

  • Format: Safetensors

  • Model size: 8.03B params

  • Tensor type: FP16

  • Model repository: cookinai/LlamaReflect-8B-CoT-safetensors