Settings for this model?

#4
by thanksforthematrices - opened

I'm trying to use the Q5_K_S quant with KoboldCPP and no matter what sampler settings or instruction template format I use, I always get gibberish. Sometimes, like with min-P set to something high (>0.8) and DRY multiplier >0.8, I can get it to write text, but it's rambling and eventually gets stuck stuttering the same word, like "thethethethethe". Without DRY and min-P (like with the suggested settings), it just spits out incoherent mixes of letters.

What settings should I use to get it to write anything readable?

Bump up the smoothing factor to like 1 to 1.3.
Edit: Though I would like to know from @DavidAU whether smoothing factor and min-P are absolutely necessary, on top of rep penalty, when using higher temps (1.5+ up to 5) with his models; otherwise the model spits out gibberish. I face a different issue: if I bump up repetition penalty, I get formatting errors during RP where the quotes come out as "' instead of ". If I lower rep pen all the way down to 1, that fixes the issue, but the level of detail drops significantly. But if I bump up rep penalty, the level of detail and spatial awareness is wild. It even pulls up stuff from my world lore like bigger 70B models do.

The smoothing factor doesn't really help. I tried getting it to "write a story about kobolds" (that's the whole prompt in Instruct Mode). Here are the comparisons of different settings:

temp: 0.7, repetition penalty: 1.07, top-P: 0.92, top-K: 100, Min-P:0, Smooth. F.: 0, DRY Mult.: 0
The The kobold's lairr was a big, cave. It was a perfect, safe haven for the the kobold kobold's people. (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.92, top-K: 100, Min-P:0, Smooth. F.: 1, DRY Mult.: 0
The Thehonor ofhonorhonorhonorhonorhonorhonorhonorhonorhonorhonor (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.92, top-K: 100, Min-P:0.02, Smooth. F.: 1, DRY Mult.: 0.2
The the mountains loomed large and over the the land of the the the the (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.92, top-K: 100, Min-P:0.02, Smooth. F.: 1, DRY Mult.: 1
The The The The The The TheThe TheThe The (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.92, top-K: 40, Min-P:0.02, Smooth. F.: 1, DRY Mult.: 1
The The Tunnels of the the Dwarwoldoldtunnelsssssssss (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.92, top-K: 40, Min-P:0.8, Smooth. F.: 1, DRY Mult.: 1
The The Caverns of the the The The The The The The Thethe (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.6, top-K: 40, Min-P:0.8, Smooth. F.: 1, DRY Mult.: 1
The The Caverns of the the The The The The The The (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.6, top-K: 40, Min-P:0.8, Smooth. F.: 1.3, DRY Mult.: 1
The The Caverns of the the The The The (...)
temp: 0.8, repetition penalty: 1.05, top-P: 0.6, top-K: 40, Min-P:0.8, Smooth. F.: 1.3, DRY Mult.: 0.5
The The Caverns of the the The The The The The The The The (...)

It's frustrating that there's no documentation on what to do to replicate the results in the model card.
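For anyone who wants to rerun a comparison like the one above without clicking through the UI, here is a minimal sketch that sends one settings row to a locally running KoboldCpp instance over its HTTP API. The port, the `max_length` value, and the helper names are assumptions for illustration; the field names (`rep_pen`, `min_p`, `smoothing_factor`, `dry_multiplier`) are what I understand KoboldCpp's generate endpoint to accept, so check them against your version. Context shift is a launch-time setting and is not part of this request, and the raw prompt here skips any instruct template.

```python
import json
import urllib.request

# Assumption: KoboldCpp is listening on its default local port.
KOBOLD_URL = "http://localhost:5001/api/v1/generate"


def build_payload(prompt, smoothing_factor=0.0, dry_multiplier=0.0, min_p=0.0):
    """Build a generate request based on the first settings row in the thread."""
    return {
        "prompt": prompt,
        "max_length": 200,  # hypothetical length, not taken from the thread
        "temperature": 0.7,
        "rep_pen": 1.07,
        "top_p": 0.92,
        "top_k": 100,
        "min_p": min_p,
        "smoothing_factor": smoothing_factor,
        "dry_multiplier": dry_multiplier,
    }


def generate(payload, url=KOBOLD_URL):
    """POST the payload and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["results"][0]["text"]
```

Calling `generate(build_payload("write a story about kobolds", smoothing_factor=1.3))` with the server running should return the model's continuation as a string, which makes it easier to change one setting at a time and compare outputs.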

A few things:

1 - Larger prompts with more detail will often stop this issue (the model needs a bit of guidance)
2 - What quant are you using?

Special notes:
Make sure "shift context" is off and/or rope is DISABLED during testing.
Rope can cause issues all by itself, independent of parameter adjustments.

The other options:
1 - Try Darkest Planet (reg)
2 - Dark Planet (far more stable, less fussy ... but not as creative).

hope that helps;

It seems to work for me (4bpw) even at temp 3 and above. I'm using it with these sampler settings:

image.png

Default Character:
image.png

Assistant/Story Teller:
image.png

I can confirm that turning off Context Shift fixed the problem! No more gibberish. I'm using Q5_K_S and KoboldCpp v. 1.76.

@thanksforthematrices @James2313123

Update: I have done some research into this issue; here is how to address it:

In "KoboldCpp", "oobabooga/text-generation-webui", or "Silly Tavern":

Set the "Smoothing_factor" to 1.5 to 2.5
: in KoboldCpp -> Settings -> Samplers -> Advanced -> "Smooth_F"
: in text-generation-webui -> parameters -> lower right
: in Silly Tavern this is called "Smoothing"

NOTE: For "text-generation-webui"
-> if using GGUFs you need to use "llama_HF" (which involves downloading some config files from the SOURCE version of this model)

Source versions (and config files) of my models are here:
https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be

OTHER OPTIONS:

  • Increase rep pen to 1.1 to 1.15 (you don't need to do this if you use "smoothing_factor").

  • If the interface/program you are using to run AI MODELS supports "Quadratic Sampling" ("smoothing") just make the adjustment as noted.
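To give a feel for what the smoothing factor does, here is a rough sketch of quadratic sampling as I understand it from the text-generation-webui implementation: each logit is remapped onto a downward parabola centered on the top logit. The function names are hypothetical, and the exact formula in your backend may differ (some versions add a curve parameter), so treat this as an illustration and not the reference implementation.

```python
import math


def quadratic_smooth(logits, smoothing_factor):
    """Remap logits onto a downward parabola centered on the top logit."""
    top = max(logits)
    return [top - smoothing_factor * (x - top) ** 2 for x in logits]


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits only: one top token, one close runner-up, one distant token.
logits = [5.0, 4.5, 1.0]
before = softmax(logits)
after = softmax(quadratic_smooth(logits, smoothing_factor=1.5))
```

Under this formula the top token keeps its logit, tokens close to it are pulled closer, and tokens far from it are pushed down hard, which is one plausible reason it can stand in for a higher rep pen or min-P at high temperatures.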

Okay, thanks David! 🤗
