|
---
license: apache-2.0
language:
- en
tags:
- merge
- moe
---
|
![image/png](https://i.ibb.co/7k4j8Gm/icon10.png) |
|
(Maybe I'll change the icon later.)
|
|
|
An experimental MoE. The idea is to get more active parameters than an Xx7B model would have, while keeping the total size under 20B.
|
|
|
This model has ~19.2B parameters. |
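As a rough sanity check (assuming standard Mistral-7B dimensions: 4096 hidden size, 14336 FFN size, 32k vocabulary), a 48-layer Mistral stack is ~10.7B parameters, and adding a second expert duplicates only the ~176M MLP parameters per layer while attention and embeddings stay shared, so 10.7B + 48 × 0.176B ≈ 19.2B.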
|
|
|
[Exl2, 4.0 bpw](https://huggingface.co/xxx777xxxASD/PrimaMonarch-EroSumika-2x10.7B-128k-bpw-4.0) (Fits in 12GB VRAM/16k context/4-bit cache) |
|
|
|
[Exl2, 6.0 bpw](https://huggingface.co/xxx777xxxASD/PrimaMonarch-EroSumika-2x10.7B-128k-bpw-6.0) |
|
|
|
[GGUF](https://huggingface.co/xxx777xxxASD/PrimaMonarch-EroSumika-2x10.7B-128k-GGUF) |
|
|
|
### Base model (self-merge)
|
```
slices:
  - sources:
      - model: MistralInstruct-v0.2-128k
        layer_range: [0, 24]
  - sources:
      - model: MistralInstruct-v0.2-128k
        layer_range: [8, 24]
  - sources:
      - model: MistralInstruct-v0.2-128k
        layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
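In mergekit's passthrough format this keeps layers 0–23 of the 32-layer base, repeats layers 8–23, and appends layers 24–31, producing a 48-layer (~10.7B) frankenmerge. The two experts below follow the same layer pattern.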
|
|
|
### First expert ("sandwich" merge) |
|
[xxx777xxxASD/PrimaSumika-10.7B-128k](https://huggingface.co/xxx777xxxASD/PrimaSumika-10.7B-128k) |
|
```
slices:
  - sources:
      - model: EroSumika-128k
        layer_range: [0, 24]
  - sources:
      - model: Prima-Lelantacles-128k
        layer_range: [8, 24]
  - sources:
      - model: EroSumika-128k
        layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
|
|
|
### Second expert ("sandwich" merge) |
|
```
slices:
  - sources:
      - model: AlphaMonarch-7B-128k
        layer_range: [0, 24]
  - sources:
      - model: NeuralHuman-128k
        layer_range: [8, 24]
  - sources:
      - model: AlphaMonarch-7B-128k
        layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
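The final MoE assembly config isn't shown above; a hypothetical mergekit-moe sketch of how the self-merged base and the two experts could be combined is below. The base and second-expert paths, the gate mode, and the `positive_prompts` are placeholders, not the values actually used.

```
base_model: ./MistralInstruct-v0.2-128k-selfmerge     # placeholder path to the self-merged base
gate_mode: hidden                                     # placeholder; mergekit also supports cheap_embed/random
dtype: bfloat16
experts:
  - source_model: xxx777xxxASD/PrimaSumika-10.7B-128k
    positive_prompts:
      - "placeholder routing prompt for expert 1"
  - source_model: ./AlphaMonarch-NeuralHuman-128k     # placeholder path to the second expert
    positive_prompts:
      - "placeholder routing prompt for expert 2"
```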
|
|
|
Each 128k model is a slerp merge with [Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context](https://huggingface.co/Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context) |
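The slerp configs themselves aren't included; a minimal mergekit sketch for one of them (e.g. EroSumika-128k), assuming a uniform interpolation factor, might look like this (`t: 0.5` is a placeholder, not the actual value):

```
slices:
  - sources:
      - model: localfultonextractor/Erosumika-7B
        layer_range: [0, 32]
      - model: Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context
        layer_range: [0, 32]
merge_method: slerp
base_model: localfultonextractor/Erosumika-7B
parameters:
  t: 0.5          # placeholder interpolation factor
dtype: bfloat16
```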
|
|
|
## Models used |
|
|
|
- [localfultonextractor/Erosumika-7B](https://huggingface.co/localfultonextractor/Erosumika-7B) |
|
- [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) |
|
- [Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context](https://huggingface.co/Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context) |
|
- [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B) |
|
- [Nitral-AI/Prima-LelantaclesV6-7b](https://huggingface.co/Nitral-AI/Prima-LelantaclesV6-7b) |
|
- [NeuralNovel/Mistral-7B-Instruct-v0.2-Neural-Story](https://huggingface.co/NeuralNovel/Mistral-7B-Instruct-v0.2-Neural-Story) |
|
- [valine/MoreHuman](https://huggingface.co/valine/MoreHuman) |