|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
tags: |
|
- mistral |
|
- instruct |
|
- finetune |
|
- chatml |
|
- gpt4 |
|
- synthetic data |
|
- distillation |
|
- conversational |
|
- text-generation-inference |
|
base_model: teknium/OpenHermes-2.5-Mistral-7B |
|
inference: false |
|
--- |
|
|
|
# OpenHermes-2.5-Mistral-7B-GGUF |
|
- GGUF quantized versions for [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) |
|
- Created using llama.cpp |
|
|
|
## Model description |
|
|
|
OpenHermes 2.5 Mistral 7B is a state of the art Mistral Fine-tune, a continuation of OpenHermes 2 model, which trained on additional code datasets. |
|
|
|
Potentially the most interesting finding from training on a good ratio (est. of around 7-14% of the total dataset) of code instruction was that it has boosted several non-code benchmarks, including TruthfulQA, AGIEval, and GPT4All suite. It did however reduce BigBench benchmark score, but the net gain overall is significant. |
|
|
|
The code it trained on also improved it's humaneval score (benchmarking done by Glaive team) from 43% @ Pass 1 with Open Herms 2 to 50.7% @ Pass 1 with Open Hermes 2.5. |
|
|
|
OpenHermes was trained on 1,000,000 entries of primarily GPT-4 generated data, as well as other high quality data from open datasets across the AI landscape. [More details soon] |
|
|
|
Filtering was extensive of these public datasets, as well as conversion of all formats to ShareGPT, which was then further transformed by axolotl to use ChatML. |