redrix committed on
Commit
d0a1b4b
1 Parent(s): 9d5a64a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ license: apache-2.0
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
-This is my seventh model. I decided to use [TheDrummer/UnslopNemo-12B-v4](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4) instead of [TheDrummer/UnslopNemo-12B-v4.1](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1) as it supposedly has more anti-GPTism influence at the cost of intelligence, so I'll be using it in future merges. It could most likely be counteracted by adding more intelligent models. TheDrummer said that *Metharme/Pygmalion* templates have higher anti-GPTism effect, but those specific tokens arent't enforced in the tokenizer, and I prefer *ChatML*. Thusly I picked the model that has more anti-GPTism influence in it's base state. I decided to tweak the parameters to be more balanced, while also just generally testing *NuSLERP*. If I find better parameters I might release a V2B of some kind. I still haven't had much time to test this exhaustively and I'm also working on other projects.
+This is my seventh model. I decided to use [TheDrummer/UnslopNemo-12B-v4](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4) instead of [TheDrummer/UnslopNemo-12B-v4.1](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1), as it supposedly has more anti-GPTism influence at the cost of intelligence, so I'll be using it in future merges. That cost could most likely be counteracted by adding more intelligent models. TheDrummer said that *Metharme/Pygmalion* templates have a higher anti-GPTism effect, but those specific tokens aren't enforced/present in the tokenizer, and I prefer *ChatML*. Thus I picked the model with more anti-GPTism influence in its base state. I decided to tweak the parameters to be more balanced, while also just generally testing *NuSLERP*. If I find better parameters I might release a V2B of some kind. I still haven't had much time to test this exhaustively, and I'm also working on other projects.
 ## Testing stage: early testing
 I do not know how this model holds up over long-term context. Early testing showed stability and viable answers.
 
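
For anyone trying a similar experiment, below is a minimal sketch of what a *NuSLERP* recipe looks like in mergekit. This is not the config from this repo: the second model, the weights, and the flatten/row-wise flags are all illustrative assumptions.

```yaml
# Hypothetical mergekit NuSLERP config -- a sketch only.
# The second model and all parameter values are placeholders, not this model's recipe.
models:
  - model: TheDrummer/UnslopNemo-12B-v4   # the anti-GPTism pick discussed above
    parameters:
      weight: 0.5
  - model: example/Smart-Nemo-12B         # hypothetical "more intelligent" partner
    parameters:
      weight: 0.5
merge_method: nuslerp
parameters:
  nuslerp_flatten: false   # interpolate per row/column instead of flattening each tensor
  nuslerp_row_wise: true   # SLERP row vectors rather than column vectors
dtype: bfloat16
```

The `weight` values set the relative pull of each model, and the two `nuslerp_*` flags are the kind of parameters one would tweak when looking for a more balanced blend.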