Looks good so far!
8bpw EXL2 quant is up
https://huggingface.co/BigHuggyD/jukofyork_Deep-Miqu-103B-8.0bpw-h8-exl2
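If you want to grab it locally, something like this works (just a sketch, assuming huggingface_hub is installed); point an EXL2-capable backend such as ExLlamaV2, TabbyAPI, or text-generation-webui at the resulting folder:

```python
# Sketch: download the 8bpw EXL2 quant linked above into a local folder.
# Assumes `pip install huggingface_hub`; the local_dir name is arbitrary.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BigHuggyD/jukofyork_Deep-Miqu-103B-8.0bpw-h8-exl2",
    local_dir="Deep-Miqu-103B-8.0bpw-h8-exl2",
)
```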
Quick dark test looked good! I'm now testing a slower-burn dark scenario to make sure it doesn't turn into a redemption story after some context is pumped through it. So far, I really like how it writes; one of my favorites to date. The prompt is basically "You are a writer", and then I give it some general style guidance.
The 120b should have been uploaded by now but has crapped out with an error twice :/
I'm about 12k of context into my extended dark scenario, and the antagonist's arc seems to be turning into a redemption story. I'm going to continue to play it out and see what happens, but so far, it seems to be drifting.
Since it is supposed to be a more 'neutral' model, I wonder if I need to be more explicit about the tone in my prompt.
Yeah, I think it might be hard to beat the stock Dark-Miqu-70b. I tried @sophosympatheia's suggestion of merging with migtissera/Tess-70B-v1.6, but it just reverted the model so all the characters want to "do good" and have instant redemption arcs, etc.

I'm a couple of hours off finishing uploading Deep-Miqu-120b, so it might be worth trying that instead: from experience, the 103b --> 120b --> 123b merge patterns make the models slightly more unhinged (and buggy) but more interesting too.

I might try just self-merging some more, but from a few quick experiments before, it didn't really do much to noticeably improve the model... The Dawn-Miqu-70b merge did seem to make the writing more descriptive, but that's not much use if it slowly reverts to Goody-Two-Shoes all the time :/
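For anyone wondering where the 103b / 120b labels come from: they are roughly the parameter counts you get from stacking that many Llama-2-70b-style layers. A quick back-of-the-envelope sketch, assuming the standard 80-layer Miqu shape and the usual community layer counts for each size (the exact layer counts of these particular merges are an assumption here):

```python
# Rough parameter count for stacks of Llama-2-70b-style layers (the Miqu architecture),
# assuming the standard 70b shape: hidden 8192, FFN 28672, 8 KV heads, 32k vocab.
# Layer counts per size label are the usual community patterns, not confirmed recipes.
hidden, ffn, vocab, head_dim, kv_heads = 8192, 28672, 32000, 128, 8

attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim  # q/o + k/v projections
mlp = 3 * hidden * ffn                                         # gate, up, down
per_layer = attn + mlp
embed = 2 * vocab * hidden                                     # input embeddings + lm_head

for label, layers in [("70b", 80), ("103b", 120), ("120b", 140)]:
    total = layers * per_layer + embed
    print(f"{label}: {layers} layers ~= {total / 1e9:.1f}B params")
# 70b: 80 layers ~= 69.0B, 103b: 120 layers ~= 103.2B, 120b: 140 layers ~= 120.3B
```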
I will check it out!
Yes, my long-form dark test definitely drifted when left to its own devices. I injected a really subtle nudge into the context, and it immediately redirected it, but that was not something I had to do with Dark-Miqu.
Yeah, I tried extending a few of the test stories and noticed it started trying to do this too :/
I'm now uploading 103b and 120b parameter self-merges of Dark-Miqu-70B. I don't know why I didn't give these more of a try before, but they do seem to be using more descriptive language and seem to have slightly less "weirdness" and/or inconsistencies in the generated stories. It will be a couple of days before the 103b is uploaded.

(I won't bother uploading the 123b version of Deep-Miqu, as it is the least coherent of them all, and I think the self-merges might be more interesting to try now.)
Interesting! I look forward to trying the self merges