---
license: apache-2.0
tags:
- moe
- merge
- mergekit
- lazymergekit
- DopeorNope/SOLARC-M-10.7B
- maywell/PiVoT-10.7B-Mistral-v0.2-RP
- kyujinpy/Sakura-SOLAR-Instruct
- jeonsworld/CarbonVillain-en-10.7B-v1
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/Qb88YeudOf7MYuGKTirXC.png)

# Lumosia-MoE-4x10.7

The name "Lumosia" was chosen because it's a MoE of multiple SOLAR merges, so it really "lights the way"... it's 3am.

This is a very experimental model. It's a MoE of all the well-performing SOLAR models (based on personal experience, not the open leaderboard).

Why? Dunno, wanted to see what would happen.

Context is maybe 16k?

Chat-instruct breaks the model at the moment. Not really sure why, even though it will follow instructions.

Quants by @thebloke (thank you):

https://huggingface.co/TheBloke/Lumosia-MoE-4x10.7-GGUF

https://huggingface.co/TheBloke/Lumosia-MoE-4x10.7-GPTQ

Update (in progress):
Lumosia v1.5 will have the same MoE models but with a fully trained head and gates. Should increase performance a lot.


Template: ChatML
```
### System:

### USER:{prompt}

### Assistant:
```
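If you are building prompts by hand, the template above can be filled in with a small helper. This is only a sketch: the function name `build_prompt` and the choice to put the system text on the line right after `### System:` are assumptions, not something the card specifies.

```python
def build_prompt(user_prompt: str, system_prompt: str = "") -> str:
    """Fill in the prompt template from this card.

    `build_prompt` is a hypothetical helper. Placing the system text
    directly under "### System:" is an assumption.
    """
    return (
        f"### System:\n{system_prompt}\n"
        f"### USER:{user_prompt}\n\n"
        "### Assistant:\n"
    )


if __name__ == "__main__":
    print(build_prompt("Explain a MoE in around 100 words"))
```

With an empty system prompt this reproduces the template text exactly, including the blank lines.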


Settings:
```
Temp: 1.0
min-p: 0.02-0.1
```
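For anyone unsure what min-p does: it keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes. Below is a rough pure-Python sketch of that idea. The function name `min_p_filter` is hypothetical, and applying temperature before the filter is an assumption (backends differ on the order).

```python
import math


def min_p_filter(logits, min_p=0.05, temperature=1.0):
    """Return renormalized token probabilities after min-p filtering.

    Hypothetical helper for illustration; real backends implement this
    on tensors and may order temperature and min-p differently.
    """
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Drop every token below min_p * (probability of the best token).
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize so the surviving tokens sum to 1.
    norm = sum(kept)
    return [p / norm for p in kept]


if __name__ == "__main__":
    print(min_p_filter([2.0, 1.0, -5.0], min_p=0.1))
```

A value in the suggested 0.02-0.1 range trims the unlikely tail while leaving the plausible candidates available at Temp 1.0.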

Lumosia-MoE-4x10.7 is a Mixture of Experts (MoE) made with the following models:
* [DopeorNope/SOLARC-M-10.7B](https://huggingface.co/DopeorNope/SOLARC-M-10.7B)
* [maywell/PiVoT-10.7B-Mistral-v0.2-RP](https://huggingface.co/maywell/PiVoT-10.7B-Mistral-v0.2-RP)
* [kyujinpy/Sakura-SOLAR-Instruct](https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct)
* [jeonsworld/CarbonVillain-en-10.7B-v1](https://huggingface.co/jeonsworld/CarbonVillain-en-10.7B-v1)

## Evals:

* Avg: 69.61
* ARC: 68.34
* HellaSwag: 87.13
* MMLU: 64.38
* TruthfulQA: 63.81
* Winogrande: 82.95
* GSM8K: 51.02
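
As a quick sanity check, the Avg figure looks like the plain mean of the six benchmark scores. A minimal sketch, assuming an unweighted average (the card does not say how Avg was computed):

```python
# Scores copied from the list above; "T-QA" is the label used there.
scores = {
    "ARC": 68.34,
    "HellaSwag": 87.13,
    "MMLU": 64.38,
    "T-QA": 63.81,
    "Winogrande": 82.95,
    "GSM8K": 51.02,
}

# Unweighted mean across benchmarks (assumption).
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(f"Mean of {len(scores)} benchmarks: {average:.3f}")
```

The result lands within rounding distance of the reported 69.61.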

## Examples:
```
Example 1:

User:
Hello lumi, what is the meaning of life
in around 100 words.

Lumosia:
The meaning of life can vary greatly
depending on individual beliefs,
values, and experiences. One common
perspective suggests that the primary
purpose of life is to seek happiness,
fulfillment, and personal growth
through relationships,
accomplishments, and spiritual
awareness. Living with integrity,
compassion, and gratitude while
making a positive impact on others
and the world around us can also
contribute to finding meaning in life.
Ultimately, the meaning of life is
subjective and unique to each
individual, who must define it for
themselves through exploration,
reflection, and personal choice.

```
```
Example 2:

User:
Explain a MoE (Mixture of Experts) in
around 100 words

Lumosia:
A MoE, short for Mixture of Experts, is
a machine learning technique utilized
in ensemble learning. It involves
combining multiple specialist models
(experts) to optimize predictive
performance. Each expert focuses on
a subset of the input features or data
domains, making them specialized in
particular areas. Through a weighting
mechanism, like Gaussian Mixture
Model, the predictions of experts are
blended to produce a final result. MoE
models demonstrate effectiveness in
handling complex or ambiguous
inputs where a single model might
struggle. They are commonly used in
natural language processing,
computer vision, and speech synthesis.
```

## 🧩 Configuration

```yaml
base_model: DopeorNope/SOLARC-M-10.7B
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: DopeorNope/SOLARC-M-10.7B
    positive_prompts: [""]
  - source_model: maywell/PiVoT-10.7B-Mistral-v0.2-RP
    positive_prompts: [""]
  - source_model: kyujinpy/Sakura-SOLAR-Instruct
    positive_prompts: [""]
  - source_model: jeonsworld/CarbonVillain-en-10.7B-v1
    positive_prompts: [""]
```
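
To give a feel for what the gates do with the four experts at inference time, here is a toy sketch of top-k routing. It assumes Mixtral-style routing with two experts active per token, which this card does not state. The names `route_top_k` and `moe_forward` are hypothetical, and the "experts" are plain functions on a number rather than real feed-forward blocks.

```python
import math


def route_top_k(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax over just those.

    Hypothetical helper; k=2 is an assumption about the routing setup.
    """
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    best = max(gate_logits[i] for i in top)
    exps = {i: math.exp(gate_logits[i] - best) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum of the selected experts' outputs for one token."""
    weights = route_top_k(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


if __name__ == "__main__":
    # Four stand-in experts, mirroring the four source models above.
    toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: 0.0, lambda x: 4 * x]
    print(moe_forward(3.0, toy_experts, gate_logits=[0.1, 2.0, -1.0, 2.0]))
```

In the real model the gate logits come from a learned (or, with `gate_mode: hidden`, prompt-initialized) router per layer; the empty `positive_prompts` above mean the gates here have little to go on, which is what the planned v1.5 retraining is meant to address.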

## 💻 Usage

```python
# Notebook install step; in a shell, run the same command without the leading "!".
# !pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Steelskull/Lumosia-MoE-4x10.7"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```