MarsupialAI committed `c719417` (parent `96ae030`): Update README.md
A Mistral-Large merge

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/sf_mh-yR7V7ghi7M8UnPS.png)

This model is a hybrid merge of Behemoth 1.2, Tess, and Magnum V4. The intention was to do a three-way slerp merge, which is technically not possible. To simulate the effect of a menage-a-slerp, I slerped B1.2 with Tess, then separately slerped B1.2 with Magnum. I then did a model stock merge of those two slerps using B1.2 as the base. Somehow, it worked out spectacularly well. Sometimes dumb ideas pay off.
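For readers who want to reproduce the general shape of this approach, the two-slerps-then-model-stock procedure can be sketched as a pair of mergekit configs. This is a hypothetical illustration, not the actual recipe (see recipe.txt for that); the `t` value, output paths, and dtype are placeholders.

```yaml
# Step 1: slerp Behemoth 1.2 with Tess.
# (Step 2 mirrors this config with anthracite-org/magnum-v4-123b instead of Tess.)
models:
  - model: TheDrummer/Behemoth-123B-v1.2
  - model: migtissera/Tess-3-Mistral-Large-2-123B
merge_method: slerp
base_model: TheDrummer/Behemoth-123B-v1.2
parameters:
  t: 0.5            # placeholder interpolation weight
dtype: bfloat16
---
# Step 3: model stock merge of the two intermediate slerps, with B1.2 as base.
models:
  - model: ./behemoth-tess-slerp      # output of step 1
  - model: ./behemoth-magnum-slerp    # output of step 2
merge_method: model_stock
base_model: TheDrummer/Behemoth-123B-v1.2
dtype: bfloat16
```

Each document above would be its own config file, run as separate mergekit passes.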

Mergefuel:
- TheDrummer/Behemoth-123B-v1.2
- anthracite-org/magnum-v4-123b
- migtissera/Tess-3-Mistral-Large-2-123B

See recipe.txt for full details.

Improvements over Monstral v1: Drummer's 1.2 tune of Behemoth is a marked improvement over the original, and the addition of Tess to the mix really makes the creativity pop. I seem to have dialed out the rapey Magnum influence without stripping the model of its ability to get mean and/or dirty when the situation actually calls for it. The RP output of this model shows a lot more flowery and "literary" description of scenes and activities. It's more colorful and vibrant. Repetition is dramatically reduced, as is slop (though to a lesser extent). The annoying tendency to double-describe things with "it was X, almost Y" is virtually gone. Do you like a slow-burn story that builds over time? Well, good fucking news, because v2 excels at that.

The only complaint I've received is occasional user impersonation with certain cards. I've not seen this myself on any of my cards, so I have to assume it's down to the specific formatting of specific cards. I don't want to say it's a skill issue, but...

This model is uncensored and perfectly capable of generating objectionable material. I have not observed it injecting NSFW content into SFW scenarios, but no guarantees can be made. As with any LLM, no factual claims made by the model should be taken at face value. You know that boilerplate safety disclaimer that most professional models have? Assume this model has it too. This model is for entertainment purposes only.

GGUFs: https://huggingface.co/MarsupialAI/Monstral-123B-v2_GGUF

# Prompt Format
Metharme seems to work flawlessly. In theory, Mistral V3 or possibly even ChatML should work to some extent, but meth was providing such high-quality output that I couldn't even be bothered to test the others. Just do meth, kids.
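For anyone setting this up by hand rather than from a frontend preset, the Metharme format is commonly written with these role tokens (a sketch; the message contents are illustrative, and you should verify against your frontend's Metharme/Pygmalion preset):

```
<|system|>You are a creative roleplay partner.
<|user|>Describe the tavern as I walk in.
<|model|>The tavern door groans open onto a haze of pipe smoke...
```

Generation continues after `<|model|>`; each subsequent turn is appended with its own role token.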