redrix committed on
Commit
d0a1b4b
1 Parent(s): 9d5a64a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ license: apache-2.0
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
-This is my seventh model. I decided to use [TheDrummer/UnslopNemo-12B-v4](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4) instead of [TheDrummer/UnslopNemo-12B-v4.1](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1) as it supposedly has more anti-GPTism influence at the cost of intelligence, so I'll be using it in future merges. It could most likely be counteracted by adding more intelligent models. TheDrummer said that *Metharme/Pygmalion* templates have higher anti-GPTism effect, but those specific tokens arent't enforced in the tokenizer, and I prefer *ChatML*. Thusly I picked the model that has more anti-GPTism influence in it's base state. I decided to tweak the parameters to be more balanced, while also just generally testing *NuSLERP*. If I find better parameters I might release a V2B of some kind. I still haven't had much time to test this exhaustively and I'm also working on other projects.
+This is my seventh model. I decided to use [TheDrummer/UnslopNemo-12B-v4](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4) instead of [TheDrummer/UnslopNemo-12B-v4.1](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1), as it supposedly has more anti-GPTism influence at the cost of intelligence, so I'll be using it in future merges. That cost could most likely be counteracted by adding more intelligent models. TheDrummer said that *Metharme/Pygmalion* templates have a higher anti-GPTism effect, but those specific tokens aren't enforced/present in the tokenizer, and I prefer *ChatML*. Thus I picked the model with more anti-GPTism influence in its base state. I decided to tweak the parameters to be more balanced, while also just generally testing *NuSLERP*. If I find better parameters I might release a V2B of some kind. I still haven't had much time to test this exhaustively, and I'm also working on other projects.
 ## Testing stage: early testing
 I do not know how this model holds up over long-term context. Early testing showed stability and viable answers.
 
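
For anyone trying a similar experiment, below is a minimal sketch of what a *NuSLERP* recipe looks like in mergekit. This is not the config from this repo: the second model, the weights, and the flatten/row-wise flags are all illustrative assumptions.

```yaml
# Hypothetical mergekit NuSLERP config -- a sketch only.
# The second model and all parameter values are placeholders, not this model's recipe.
models:
  - model: TheDrummer/UnslopNemo-12B-v4   # the anti-GPTism pick discussed above
    parameters:
      weight: 0.5
  - model: example/Smart-Nemo-12B         # hypothetical "more intelligent" partner
    parameters:
      weight: 0.5
merge_method: nuslerp
parameters:
  nuslerp_flatten: false   # interpolate per row/column instead of flattening each tensor
  nuslerp_row_wise: true   # SLERP row vectors rather than column vectors
dtype: bfloat16
```

The `weight` values set the relative pull of each model, and the two `nuslerp_*` flags are the kind of parameters one would tweak when looking for a more balanced blend.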