---
license: apache-2.0
tags:
- moe
- merge
- mergekit
- lazymergekit
- DopeorNope/SOLARC-M-10.7B
- maywell/PiVoT-10.7B-Mistral-v0.2-RP
- kyujinpy/Sakura-SOLAR-Instruct
- jeonsworld/CarbonVillain-en-10.7B-v1
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/Qb88YeudOf7MYuGKTirXC.png)
# Lumosia-MoE-4x10.7
"Lumosia" was chosen because it's a MoE of multiple SOLAR merges, so it really "lights the way"... it's 3am.
This is a very experimental model. It's a MoE of all the well-performing SOLAR models (based on personal experience, not the open leaderboard).
Why? Dunno, wanted to see what would happen.
Context is maybe 16k?
Chat-instruct breaks the model at the moment, not really sure why, even though it will follow instructions.
Quants by @thebloke (thank you)
https://huggingface.co/TheBloke/Lumosia-MoE-4x10.7-GGUF
https://huggingface.co/TheBloke/Lumosia-MoE-4x10.7-GPTQ
Update: (In Progress)
Lumosia v1.5 will have the same MoE models but with a fully trained head and gates. Should increase performance a lot.
Template: ChatML
```
### System:
### USER:{prompt}
### Assistant:
```
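If you are building prompts by hand instead of going through a frontend, the template above can be filled with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical name, and placing the system text directly after `### System:` (with no extra spacing) is an assumption read literally from the template, so adjust it if your setup expects something else.

```python
def build_prompt(user_prompt: str, system_prompt: str = "") -> str:
    """Fill the prompt template from this card.

    Assumption: the system text goes right after "### System:" and the
    spacing is copied literally from the template as written.
    """
    return (
        f"### System:{system_prompt}\n"
        f"### USER:{user_prompt}\n"
        "### Assistant:"
    )


print(build_prompt("Explain a Mixture of Experts in around 100 words."))
```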
Settings:
```
Temp: 1.0
min-p: 0.02-0.1
```
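For anyone unfamiliar with min-p: it drops every token whose probability is below `min_p` times the probability of the most likely token, then renormalises what is left. The snippet below is a rough pure-Python sketch of that idea, not the code of any particular backend. `min_p_filter` is a hypothetical name, and applying temperature before the filter is an assumption (backends differ on the order).

```python
import math


def min_p_filter(logits, min_p=0.05, temperature=1.0):
    """Return renormalised token probabilities after min-p filtering."""
    # Temperature scaling first (assumption: some backends apply it last).
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only tokens at least min_p times as likely as the top token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalise so the surviving probabilities sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


# Toy logits for three tokens; the unlikely third token is filtered out.
probs = min_p_filter([2.0, 1.0, -3.0], min_p=0.05, temperature=1.0)
print(probs)
```

A higher `min_p` (toward 0.1) prunes more of the tail and gives more conservative output; a lower one (toward 0.02) leaves more variety.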
Lumosia-MoE-4x10.7 is a Mixture of Experts (MoE) made with the following models:
* [DopeorNope/SOLARC-M-10.7B](https://huggingface.co/DopeorNope/SOLARC-M-10.7B)
* [maywell/PiVoT-10.7B-Mistral-v0.2-RP](https://huggingface.co/maywell/PiVoT-10.7B-Mistral-v0.2-RP)
* [kyujinpy/Sakura-SOLAR-Instruct](https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct)
* [jeonsworld/CarbonVillain-en-10.7B-v1](https://huggingface.co/jeonsworld/CarbonVillain-en-10.7B-v1)
## Evals:
* Avg: 69.61
* ARC: 68.34
* HellaSwag: 87.13
* MMLU: 64.38
* T-QA: 63.81
* Winogrande: 82.95
* GSM8K: 51.02
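
The Avg figure is just the unweighted mean of the six benchmark scores. A quick sanity check, using only the numbers listed above (the mean comes out at roughly 69.605, which matches the listed 69.61 after rounding):

```python
# Scores copied from the list above.
scores = {
    "ARC": 68.34,
    "HellaSwag": 87.13,
    "MMLU": 64.38,
    "T-QA": 63.81,
    "Winogrande": 82.95,
    "GSM8K": 51.02,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.3f}")
```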
## Examples:
```
Example 1:
User:
Hello lumi, what is the meaning of life
in around 100 words.
Lumosia:
The meaning of life can vary greatly
depending on individual beliefs,
values, and experiences. One common
perspective suggests that the primary
purpose of life is to seek happiness,
fulfillment, and personal growth
through relationships,
accomplishments, and spiritual
awareness. Living with integrity,
compassion, and gratitude while
making a positive impact on others
and the world around us can also
contribute to finding meaning in life.
Ultimately, the meaning of life is
subjective and unique to each
individual, who must define it for
themselves through exploration,
reflection, and personal choice.
```
```
Example 2:
User:
Explain a MoE (Mixture of Experts) in
around 100 words
Lumosia:
A MoE, short for Mixture of Experts, is
a machine learning technique utilized
in ensemble learning. It involves
combining multiple specialist models
(experts) to optimize predictive
performance. Each expert focuses on
a subset of the input features or data
domains, making them specialized in
particular areas. Through a weighting
mechanism, like Gaussian Mixture
Model, the predictions of experts are
blended to produce a final result. MoE
models demonstrate effectiveness in
handling complex or ambiguous
inputs where a single model might
struggle. They are commonly used in
natural language processing,
computer vision, and speech synthesis.
```
## 🧩 Configuration
```yaml
base_model: DopeorNope/SOLARC-M-10.7B
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: DopeorNope/SOLARC-M-10.7B
    positive_prompts: [""]
  - source_model: maywell/PiVoT-10.7B-Mistral-v0.2-RP
    positive_prompts: [""]
  - source_model: kyujinpy/Sakura-SOLAR-Instruct
    positive_prompts: [""]
  - source_model: jeonsworld/CarbonVillain-en-10.7B-v1
    positive_prompts: [""]
```
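
To give a feel for what the gates do with these four experts at inference time, here is a rough sketch of top-k routing for a single token. It is not mergekit's or transformers' actual code: `top_k_route` is a hypothetical name, the toy vectors are made up, and `k=2` assumes the usual Mixtral-style setup of two active experts per token, which this card does not state.

```python
import math


def top_k_route(router_logits, expert_outputs, k=2):
    """Mix the outputs of the k highest-scoring experts for one token.

    Assumption: Mixtral-style routing, where a softmax over the selected
    experts' router logits gives the mixing weights.
    """
    # Indices of the k experts with the largest router logits.
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]

    # Softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    weights = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Weighted sum of the selected experts' output vectors.
    dim = len(expert_outputs[0])
    mixed = [0.0] * dim
    for w, i in zip(weights, top):
        for d in range(dim):
            mixed[d] += w * expert_outputs[i][d]
    return top, weights, mixed


# Toy example: four experts, each producing a 2-dimensional output.
router_logits = [0.1, 2.0, -1.0, 2.0]
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 3.0]]
top, weights, mixed = top_k_route(router_logits, expert_outputs, k=2)
print(top, weights, mixed)
```

Only the selected experts run for a given token, which is why a 4x10.7B MoE costs less per token than running all four models.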
## 💻 Usage
```python
# Notebook cell: install dependencies (drop the leading "!" in a regular shell)
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Steelskull/Lumosia-MoE-4x10.7"

tokenizer = AutoTokenizer.from_pretrained(model)

# Load the model in 4-bit (bitsandbytes) to reduce memory use
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Build the prompt from the tokenizer's chat template, then sample a reply
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```