Looks good so far!
8bpw EXL2 quant is up
https://huggingface.co/BigHuggyD/jukofyork_Deep-Miqu-103B-8.0bpw-h8-exl2
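If you want to grab it locally, something like this works (just a sketch, assuming huggingface_hub is installed); point an EXL2-capable backend such as ExLlamaV2, TabbyAPI, or text-generation-webui at the resulting folder:

```python
# Sketch: download the 8bpw EXL2 quant linked above into a local folder.
# Assumes `pip install huggingface_hub`; the local_dir name is arbitrary.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BigHuggyD/jukofyork_Deep-Miqu-103B-8.0bpw-h8-exl2",
    local_dir="Deep-Miqu-103B-8.0bpw-h8-exl2",
)
```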
Quick dark test looked good! I'm now testing a slower-burn dark scenario to make sure it doesn't turn into a redemption story after some context is pumped through it. So far, I really like how it writes; one of my favorites to date. The prompt is basically "You are a writer", and then I give it some general style guidance.
The 120b should have been uploaded by now but has crapped out with an error twice :/
I'm about 12k of context into my extended dark scenario, and the antagonist's arc seems to be turning into a redemption story. I'm going to continue to play it out and see what happens, but so far, it seems to be drifting.
Since it is supposed to be a more 'neutral' model, I wonder if I need to be more explicit about the tone in my prompt.
Yeah, I think it might be hard to beat the stock Dark-Miqu-70b. I tried @sophosympatheia's suggestion of merging with migtissera/Tess-70B-v1.6, but it just reverted the model so all the characters want to "do good" and have instant redemption arcs, etc.

I'm a couple of hours off finishing uploading Deep-Miqu-120b, so it might be worth trying that instead: from experience, the 103b --> 120b --> 123b merge patterns make the models slightly more unhinged (and buggy) but more interesting too.

I might try just self-merging some more, but from a few quick experiments before, it didn't really do much to noticeably improve the model... The Dawn-Miqu-70b merge did seem to make the writing more descriptive, but that's not much use if it slowly reverts to Goody-Two-Shoes all the time :/
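For anyone wondering where the 103b / 120b labels come from: they are roughly the parameter counts you get from stacking that many Llama-2-70b-style layers. A quick back-of-the-envelope sketch, assuming the standard 80-layer Miqu shape and the usual community layer counts for each size (the exact layer counts of these particular merges are an assumption here):

```python
# Rough parameter count for stacks of Llama-2-70b-style layers (the Miqu architecture),
# assuming the standard 70b shape: hidden 8192, FFN 28672, 8 KV heads, 32k vocab.
# Layer counts per size label are the usual community patterns, not confirmed recipes.
hidden, ffn, vocab, head_dim, kv_heads = 8192, 28672, 32000, 128, 8

attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim  # q/o + k/v projections
mlp = 3 * hidden * ffn                                         # gate, up, down
per_layer = attn + mlp
embed = 2 * vocab * hidden                                     # input embeddings + lm_head

for label, layers in [("70b", 80), ("103b", 120), ("120b", 140)]:
    total = layers * per_layer + embed
    print(f"{label}: {layers} layers ~= {total / 1e9:.1f}B params")
# 70b: 80 layers ~= 69.0B, 103b: 120 layers ~= 103.2B, 120b: 140 layers ~= 120.3B
```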
I will check it out!
Yes, my long-form dark test definitely drifted when left to its own devices. I injected a really subtle nudge into the context, and it immediately redirected it, but that was not something I had to do with Dark-Miqu.
Yeah, I tried extending a few of the test stories and noticed it started trying to do this too :/
I'm now uploading 103b and 120b parameter self-merges of Dark-Miqu-70B. I don't know why I didn't give these more of a try before, but they do seem to be using more descriptive language and seem to have slightly less "weirdness" and/or inconsistencies in the generated stories. It will be a couple of days before the 103b is uploaded.

(I won't bother uploading the 123b version of Deep-Miqu, as it is the least coherent of them all, and I think the self-merges might be more interesting to try now.)
Interesting! I look forward to trying the self merges