README.md · redrix/patricide-12B-Unslop-Mell-v2 at main

patricide-12B-Unslop-Mell-v2 / README.md

redrix

Update README.md

38b04a1 verified 12 days ago

preview code

raw

history blame contribute delete

3.12 kB

	---
	base_model:
	- inflatebot/MN-12B-Mag-Mell-R1
	- TheDrummer/UnslopNemo-12B-v4
	library_name: transformers
	tags:
	- mergekit
	- merge
	- 12b
	- chat
	- roleplay
	- creative-writing
	- NuSLERP
	license: apache-2.0
	---
	# patricide-12B-Unslop-Mell-v2
	>The sins of the Father shan't ever be repeated this way.

	![PatricideLogo.png](https://cdn-uploads.huggingface.co/production/uploads/674c58de6bfa8d3e4ff8dcf3/pdKS7W4futo8XgqRaT8Rb.png)

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	This is my seventh model. I decided to use [TheDrummer/UnslopNemo-12B-v4](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4) instead of [TheDrummer/UnslopNemo-12B-v4.1](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1) as it supposedly has more anti-GPTism influence at the cost of intelligence, so I'll be using it in future merges. It could most likely be counteracted by adding more intelligent models. TheDrummer said that Metharme/Pygmalion templates have higher anti-GPTism effect, but those specific tokens aren't enforced/present in the tokenizer, and I prefer ChatML. Thusly I picked the model that has more anti-GPTism influence in it's base state. I decided to tweak the parameters to be more balanced, while also just generally testing NuSLERP. If I find better parameters I might release a V2B of some kind. I still haven't had much time to test this exhaustively and I'm also working on other projects.
	## Testing stage: early testing
	I do not know how this model holds up over long term context. Early testing showed stability and viable answers.

	## Parameters
	- Context size: Not more than 20k recommended - coherency may degrade.
	- Chat Template: ChatML; Metharme/Pygmalion (as per UnslopNemo) may work, but effects are untested
	- Samplers: A Temperature-Last of 1 and Min-P of 0.1 are viable, but haven't been finetuned. Activate DRY if repetition appears. XTC is untested.

	## Quantization
	Static GGUF Quants available at:
	- [MaziyarPanahi/patricide-12B-Unslop-Mell-v2-GGUF](https://huggingface.co/MaziyarPanahi/patricide-12B-Unslop-Mell-v2-GGUF/tree/main)
	- [mradermacher/patricide-12B-Unslop-Mell-v2-GGUF](https://huggingface.co/mradermacher/patricide-12B-Unslop-Mell-v2-GGUF/tree/main)

	> My glorious kings/queens ❤️ Y'all's doin' the lord's work.

	## Merge Details
	### Merge Method

	This model was merged using the NuSLERP merge method.

	### Models Merged

	The following models were included in the merge:
	* [inflatebot/MN-12B-Mag-Mell-R1](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1)
	* [TheDrummer/UnslopNemo-12B-v4](https://huggingface.co/TheDrummer/UnslopNemo-12B-v4)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	- model: TheDrummer/UnslopNemo-12B-v4
	parameters:
	weight: [0.6, 0.5, 0.3, 0.5, 0.6]
	- model: inflatebot/MN-12B-Mag-Mell-R1
	parameters:
	weight: [0.4, 0.5, 0.7, 0.5, 0.4]
	merge_method: nuslerp
	dtype: bfloat16
	chat_template: "chatml"
	tokenizer:
	source: union
	parameters:
	normalize: true
	int8_mask: true


	```