---
license: other
license_name: mistral-ai-research-licence
license_link: https://mistral.ai/licenses/MRL-0.1.md
base_model:
- mistralai/Mistral-Large-Instruct-2407
- anthracite-org/magnum-v2-123b
- FluffyKaeloky/Luminum-v0.1-123B
- migtissera/Tess-3-Mistral-Large-2-123B
library_name: transformers
tags:
- mergekit
- lumikabra-123B
---
|
# lumikabra-123B v0.4 |
|
|
|
|
|
<div style="width: auto; margin-left: auto; margin-right: auto; margin-bottom: 3cm"> |
|
<img src="https://huggingface.co/schnapper79/lumikabra-123B_v0.1/resolve/main/lumikabra.png" alt="Lumikabra" style="width: 100%; min-width: 400px; display: block; margin: auto;"> |
|
</div> |
|
|
|
This is lumikabra. It's based on [Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407), merged with [Magnum-v2-123B](https://huggingface.co/anthracite-org/magnum-v2-123b), [Luminum-v0.1-123B](https://huggingface.co/FluffyKaeloky/Luminum-v0.1-123B) and [Tess-3-Mistral-Large-2-123B](https://huggingface.co/migtissera/Tess-3-Mistral-Large-2-123B).
|
|
|
I shamelessly took this idea from [FluffyKaeloky](https://huggingface.co/FluffyKaeloky/Luminum-v0.1-123B). Like him, I always had my troubles with each of the current large Mistral-based models.

Either they get repetitive, show too many GPTisms, or are too horny or not horny enough. RP and storytelling are always a matter of taste, and I found myself swiping too often for new answers, or even fixing them when they were missing a little spice or cleverness.
|
|
|
Luminum was a great improvement, mixing in a lot of the desired traits, but I still missed some spice, a different sauce.

So I took Luminum, added Magnum again, and brought in Tess for knowledge and structure.
|
|
|
This is the fourth iteration, with a larger share of the Mistral base model in the mix. Like all lumikabra models so far, it tends to write pretty long and creative answers.
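
If you just want to run it, here is a minimal loading sketch with transformers. Treat it as an assumption-laden example rather than a tested recipe: the repo id is my guess for this release, the sampling settings are placeholders, and a 123B model in bfloat16 needs several large GPUs (or a quantized variant).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for this release; adjust if the v0.4 repo is named differently.
model_id = "schnapper79/lumikabra-123B_v0.4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~250 GB of weights; device_map shards them across GPUs
    device_map="auto",
)

# The tokenizer ships Mistral's [INST] ... [/INST] chat template.
messages = [{"role": "user", "content": "Write the opening scene of a slow-burn heist story."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```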
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using [mergekit](https://github.com/cg123/mergekit) with the della_linear merge method, using [Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407) as the base.
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
models:
  - model: /workspace/text-generation-webui/models/anthracite-org_magnum-v2-123b
    parameters:
      weight: 0.25
      density: 0.9
  - model: /workspace/text-generation-webui/models/FluffyKaeloky_Luminum-v0.1-123B
    parameters:
      weight: 0.25
      density: 0.9
  - model: /workspace/text-generation-webui/models/migtissera_Tess-3-Mistral-Large-2-123B
    parameters:
      weight: 0.3
      density: 0.9
merge_method: della_linear
base_model: /workspace/text-generation-webui/models/mistralai_Mistral-Large-Instruct-2407
parameters:
  epsilon: 0.05
  lambda: 1
  int8_mask: true
dtype: bfloat16
```
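
To reproduce the merge, save the config above as `config.yaml` (with the model paths pointing at your own local copies) and feed it to mergekit, either via the `mergekit-yaml` CLI or the Python API. Below is a minimal sketch using the Python API as documented in mergekit's README; the output path is arbitrary, and the merge itself needs a few hundred GB of disk and RAM.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the della_linear config shown above.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Execute the merge; cuda=True offloads the tensor arithmetic to a GPU if available.
run_merge(
    merge_config,
    "./lumikabra-123B_v0.4",
    options=MergeOptions(cuda=True, copy_tokenizer=True, lazy_unpickle=True),
)
```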
|
|