---
license: llama3
library_name: transformers
tags:
- nsfw
- not-for-all-audiences
- llama-3
- text-generation-inference
- moe
- mergekit
- merge
---
# Llama-Salad-4x8B-V2
Changes in V2:
- Swapped Tess-2.0-Llama-3-8B for Llama-3-8B-Synthia-v3.5
- Swapped L3-8B-Stheno-v3.1 for Llama-3-Soliloquy-8B-v2
- Removed Llama3-OpenBioLLM-8B and added opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5
V2 improves on V1 in every area; the gains aren't massive, but I can confidently call it a direct upgrade. Llama-3-8B-Synthia-v3.5 is better than Tess-2.0-Llama-3-8B in every way; Llama-3-Soliloquy-8B-v2 is more intelligent than L3-8B-Stheno-v3.1 and less biased toward NSFW content; and the inclusion of opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5 has greatly improved its storytelling and narration abilities.
I really like the model selection in this one, so I'm not sure how much more a 4x8B merge can be improved. If I were to make a V3, swapping out Meta-Llama-3-8B-Instruct would likely be the only change. I will try my hand at an 8x8B merge in the future, but I still need to find models to fill the gaps; making sure there are no routing conflicts between eight different models at once will be the biggest challenge.
# Quantization Formats
**GGUF**
- Static:
- https://huggingface.co/mradermacher/Llama-Salad-4x8B-V2-GGUF
- Imatrix:
- https://huggingface.co/mradermacher/Llama-Salad-4x8B-V2-i1-GGUF
# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
- **Context Size**: 8K
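The llama-3 instruct format linked above wraps every turn in special header tokens. As a minimal sketch, here is what assembling a single-turn prompt by hand looks like (in practice the tokenizer's `apply_chat_template` does this for you; the helper name below is made up for illustration):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama-3 instruct prompt by hand."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The prompt ends with an open assistant header so the
        # model generates the reply (terminated by <|eot_id|>).
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "Summarize this paragraph.")
```

Getting this template exactly right matters; a missing `<|eot_id|>` or header token noticeably degrades instruct-tuned output.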
## Models Used
- [Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
- [Llama-3-8B-Synthia-v3.5](https://huggingface.co/migtissera/Llama-3-8B-Synthia-v3.5)
- [Llama-3-Soliloquy-8B-v2](https://huggingface.co/openlynn/Llama-3-Soliloquy-8B-v2)
- [opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5](https://huggingface.co/dreamgen-preview/opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5)
## Merge Config
```yaml
base_model: NousResearch/Meta-Llama-3-8B-Instruct
gate_mode: hidden
dtype: bfloat16
experts_per_token: 2
experts:
- source_model: NousResearch/Meta-Llama-3-8B-Instruct
positive_prompts:
- "summarize"
- "paraphrase"
- "explain"
- "define"
- "translate"
- "multilingual"
- "chat"
- "conversation"
- source_model: migtissera/Llama-3-8B-Synthia-v3.5
positive_prompts:
- "programming language"
- "JavaScript"
- "Python programming language"
- "Rust programming language"
- "CSS markup styling language"
- "math"
- "code"
- "step-by-step"
- "logical reasoning"
- source_model: openlynn/Llama-3-Soliloquy-8B-v2
positive_prompts:
- "roleplay"
- "erotic roleplay"
- "characters"
- "scene"
- "opinion"
- source_model: dreamgen-preview/opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5
positive_prompts:
- "creative writing"
- "storytelling"
- "narration"
- "narrative setting"
- "narrative plot"
- "narrative exposition"
- "narrative theme"
- "narrative climax"
```
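With `experts_per_token: 2`, the router selects the two highest-scoring of the four experts for each token and blends their outputs by softmax weight; `gate_mode: hidden` tells mergekit to initialize those router weights from hidden-state representations of the positive prompts above. A toy sketch of the top-2 routing step, purely illustrative (names like `gate_w` are invented here, not mergekit internals):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, n_experts, top_k = 16, 4, 2  # 4 experts, 2 active per token

gate_w = rng.normal(size=(hidden_dim, n_experts))  # hypothetical router weights
h = rng.normal(size=hidden_dim)                    # one token's hidden state

logits = h @ gate_w                    # score each expert for this token
top = np.argsort(logits)[-top_k:]      # indices of the 2 best-scoring experts
weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over winners

# The token's output is the weight-blended output of just these two experts;
# the other two experts are skipped entirely for this token.
```

This is why the positive prompts matter: they shape the initial gate weights, so overlapping prompt themes between experts can cause the routing conflicts mentioned above.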