---
license: apache-2.0
datasets:
  - m-a-p/Code-Feedback
---

## Model Overview

The base model used for training is CallComply/openchat-3.5-0106-128k, which features a context length of 128k.

This model was trained on 10,000 examples from the m-a-p/Code-Feedback dataset. The dataset strengthens the model at interactive coding tasks, enabling it to self-improve from interpreter and human feedback. This makes it well suited to applications like TaskWeaver, which builds code automatically, or OpenInterpreter, which assists you in writing code or serves as a general agent. A loading sketch follows below.
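As a minimal sketch, the model should load like any other causal LM with transformers. The repository id `Vezora/Agent-7b-v1-128k` and the dtype/device choices here are assumptions, not confirmed by this card:

```python
# Minimal loading sketch -- repo id and dtype are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Vezora/Agent-7b-v1-128k"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 7B weights
    device_map="auto",
)
```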

We chose this base model for its long context length and its strong performance across every category, coding in particular.

## Additional Information

During training, we filtered the dataset down to examples under 2048 tokens to speed up training on just 10,000 examples for testing purposes (a sketch of such a filter follows below). The model performed exceptionally well during testing, prompting us to release it. However, we are currently training version 1, which uses the full dataset with a maximum sequence length of 4096 (the current beta model should still output more than 2048 tokens).
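A hypothetical version of that length filter, assuming the dataset is loaded with the datasets library and token counts come from the base model's tokenizer. The field names (`messages`, `content`) and the shuffling are assumptions; only the 2048-token cutoff and the 10,000-example count are from this card:

```python
# Hypothetical length filter -- the exact preprocessing used for training
# is not published; this only illustrates the described 2048-token cutoff.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CallComply/openchat-3.5-0106-128k")
dataset = load_dataset("m-a-p/Code-Feedback", split="train")

def under_2048_tokens(example):
    # Join the multi-turn conversation into one string for a rough token count.
    # The "messages"/"content" field names are assumptions about the schema.
    text = " ".join(turn["content"] for turn in example["messages"])
    return len(tokenizer(text).input_ids) < 2048

filtered = dataset.filter(under_2048_tokens)
subset = filtered.shuffle(seed=42).select(range(10_000))  # 10,000-example slice
```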

Please consider this model a beta version; we will be back with a stronger version in approximately 4-5 days. The model has been modified and retrained to use the standard Mistral EOS token `</s>`.

This model was trained with the Unsloth QLoRA method.
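A rough sketch of that setup using Unsloth's QLoRA path. The LoRA rank, alpha, and target modules below are illustrative assumptions, not the values actually used for this run:

```python
# Illustrative Unsloth QLoRA setup -- hyperparameters are assumptions;
# the actual training configuration is not published on this card.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="CallComply/openchat-3.5-0106-128k",
    max_seq_length=2048,   # the beta run filtered to <2048-token examples
    load_in_4bit=True,     # QLoRA: 4-bit quantized base weights
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # LoRA rank (assumed)
    lora_alpha=16,         # assumed
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# From here, training would proceed with a standard SFT loop (e.g. TRL's SFTTrainer).
```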

## Prompt Template

```
###Human: Write a python script....
###Assistant: python.....
```

EOS = `</s>`.
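Assuming the model and tokenizer loaded in the sketch above, a minimal generation example with this template. The example prompt and generation settings are placeholders:

```python
# Minimal generation sketch for the ###Human / ###Assistant template.
prompt = "###Human: Write a python script that prints the first 10 primes.\n###Assistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id,  # stops at the standard </s> token
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```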