---
base_model:
- inflatebot/MN-12B-Mag-Mell-R1
- TheDrummer/UnslopNemo-12B-v4.1
- ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
- DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
library_name: transformers
tags:
- mergekit
- merge
- 12b
- chat
- roleplay
- creative-writing
license: apache-2.0
---
# AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS
> They say ‘He’ will bring the apocalypse. <span style="color:darkred">She</span> seeks understanding, not destruction.

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
This is my fourth model. I wanted to test *della_linear*. The goal of this model was to use the negative properties of [DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS](https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS) to counter potential positivity bias while maintaining stability.
## Testing stage: testing
I do not know how this model holds up over long-term context. Early testing showed stability and very good answers. I'm still not sure whether the merge has reduced or amplified the positivity bias. The model has a tendency to give similar answers on swipes; *XTC* may help increase variability.
## Parameters
- **Context size:** No more than *20k* is recommended; coherency may degrade beyond that.
- **Chat Template:** *ChatML*
- **Samplers:** A *Temperature-Last* of 1 and a *Min-P* of 0.1 are viable, but have not been fine-tuned. Activate *DRY* if repetition appears. *XTC* seems to work well.
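For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of the format (the helper name is illustrative; in practice the tokenizer's `apply_chat_template` handles this for you):

```python
# Illustrative sketch of the ChatML format this model expects.
# The helper below is not part of any library; it only shows the layout.

def chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave the assistant turn open so the model completes it.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = chatml_prompt([
    {"role": "system", "content": "You are a creative roleplay partner."},
    {"role": "user", "content": "Describe the abandoned chapel."},
])
print(prompt)
```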
## Quantization
- Static **GGUF** Quants available at [mradermacher/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-GGUF](https://huggingface.co/mradermacher/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-GGUF)
- iMatrix Quants available at [mradermacher/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-i1-GGUF](https://huggingface.co/mradermacher/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-i1-GGUF)
❤️ Thanks.
## Merge Details
### Merge Method
This model was merged using the della_linear merge method, with [TheDrummer/UnslopNemo-12B-v4.1](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1) as the base.
### Models Merged
The following models were included in the merge:
* [inflatebot/MN-12B-Mag-Mell-R1](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1)
* [ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2](https://huggingface.co/ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2)
* [DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS](https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
- model: TheDrummer/UnslopNemo-12B-v4.1
parameters:
weight: 0.25
density: 0.6
- model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
parameters:
weight: 0.25
density: 0.6
- model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
parameters:
weight: 0.2
density: 0.4
- model: inflatebot/MN-12B-Mag-Mell-R1
parameters:
weight: 0.30
density: 0.7
base_model: TheDrummer/UnslopNemo-12B-v4.1
merge_method: della_linear
dtype: bfloat16
chat_template: "chatml"
tokenizer_source: union
parameters:
normalize: false
int8_mask: true
epsilon: 0.05
lambda: 1
```
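At a high level, *della_linear* prunes each model's delta from the base according to its `density` and then combines the surviving deltas linearly by `weight` (with `normalize: false`, the weights above are applied as-is; here they happen to sum to 1.0). A toy sketch of just the linear-combination step for a single parameter tensor, illustrative only and not mergekit's actual implementation (the density-based pruning step is omitted):

```python
# Toy illustration of the linear combination step in a della_linear-style
# merge: each fine-tune contributes (model - base) scaled by its weight.
# Real mergekit also applies magnitude-based pruning per `density`,
# which is omitted here for brevity.

def linear_merge(base, deltas_and_weights):
    """base: list of floats; deltas_and_weights: list of (delta, weight) pairs."""
    merged = list(base)
    for delta, weight in deltas_and_weights:
        for i, d in enumerate(delta):
            merged[i] += weight * d
    return merged

base = [1.0, 2.0]
deltas = [([0.4, -0.2], 0.25), ([0.2, 0.0], 0.30)]
print(linear_merge(base, deltas))  # base plus weighted deltas, approx [1.16, 1.95]
```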
> [Today we hustle, 'day we hustle but tonight we play.](https://www.youtube.com/watch?v=-UjA03imoNI)