---
base_model:
  - inflatebot/MN-12B-Mag-Mell-R1
  - TheDrummer/UnslopNemo-12B-v4.1
  - ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
  - DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
library_name: transformers
tags:
  - mergekit
  - merge
  - 12b
  - chat
  - roleplay
  - creative-writing
  - DELLA-linear
license: apache-2.0
---

# AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS

*They say ‘He’ will bring the apocalypse. She seeks understanding, not destruction.*

This is a merge of pre-trained language models created using mergekit.

This is my fourth model. I wanted to test the della_linear merge method. The goal of this model was to use the negative properties of DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS to counter potential positivity bias while maintaining stability.

**Testing stage:** testing

I do not yet know how this model holds up over long contexts. Early testing showed stability and very good answers. I am still unsure whether the positivity bias has been affected for better or worse. The model tends to give similar answers across swipes; XTC may help increase variability.

## Parameters

- **Context size:** No more than 20k recommended; coherency may degrade beyond that.
- **Chat template:** ChatML
- **Samplers:** Temperature-Last at 1 and Min-P at 0.1 are viable starting points, but have not been extensively tuned. Enable DRY if repetition appears. XTC seems to work well.
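The recommendations above can be collected into a starting preset. This is a hypothetical example: the key names follow common llama.cpp/SillyTavern-style sampler conventions and may differ in your backend; only the temperature and Min-P values come from this card.

```python
# Hypothetical starting sampler preset for this model.
# Key names are assumptions (they vary by frontend/backend);
# temperature and min_p reflect the card's recommendations.
sampler_preset = {
    "temperature": 1.0,      # applied last ("Temperature-Last")
    "min_p": 0.1,            # drop tokens below 10% of the top token's probability
    "dry_multiplier": 0.0,   # DRY off by default; raise it if repetition appears
    "xtc_probability": 0.5,  # XTC reportedly works well with this model
    "xtc_threshold": 0.1,
}
```

Treat the DRY and XTC values as placeholders to tune per chat, not as tested defaults.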

## Quantization

## Merge Details

### Merge Method

This model was merged using the della_linear merge method, with TheDrummer/UnslopNemo-12B-v4.1 as the base.
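To illustrate the general idea, here is a minimal numpy sketch of a DELLA-linear style merge. This is an assumption of how the method works in broad strokes, not mergekit's exact implementation: each model's task vector (model minus base) is sparsified by randomly dropping parameters with a keep probability centred on `density` and nudged by magnitude rank (`epsilon` controls the spread), kept values are rescaled by `1/p`, and the deltas are then combined linearly without normalization and scaled by `lambda`.

```python
import numpy as np

def della_linear_sketch(base, task_models, weights, densities,
                        epsilon=0.05, lam=1.0, seed=0):
    """Simplified sketch of a DELLA-linear style merge (illustrative only,
    not mergekit's exact algorithm)."""
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base, dtype=np.float64)
    for model, w, density in zip(task_models, weights, densities):
        delta = model - base                               # task vector
        n = delta.size
        # Magnitude rank in [0, 1]: larger-magnitude params rank higher.
        order = np.argsort(np.argsort(np.abs(delta), axis=None))
        rank = order.reshape(delta.shape) / max(n - 1, 1)
        # Keep probability spread over [density - eps/2, density + eps/2].
        p_keep = np.clip(density + epsilon * (rank - 0.5), 0.0, 1.0)
        mask = rng.random(delta.shape) < p_keep
        delta = np.where(mask, delta / p_keep, 0.0)        # drop and rescale
        merged_delta += w * delta                          # linear, unnormalized
    return base + lam * merged_delta
```

With `density=1.0` and `epsilon=0.0` nothing is dropped, and the sketch reduces to a plain weighted linear combination of task vectors, which matches the `normalize: false` setting in the config below.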

### Models Merged

The following models were included in the merge:

- inflatebot/MN-12B-Mag-Mell-R1
- ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
- DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: TheDrummer/UnslopNemo-12B-v4.1
    parameters:
      weight: 0.25
      density: 0.6
  - model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
    parameters:
      weight: 0.25
      density: 0.6
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
    parameters:
      weight: 0.2
      density: 0.4
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      weight: 0.30
      density: 0.7
base_model: TheDrummer/UnslopNemo-12B-v4.1
merge_method: della_linear
dtype: bfloat16
chat_template: "chatml"
tokenizer_source: union
parameters:
  normalize: false
  int8_mask: true
  epsilon: 0.05
  lambda: 1
```

*Today we hustle, 'day we hustle but tonight we play.*