Llama-3 SillyTavern Presets Sharing

#5
by Lewdiculous - opened
AetherArchitectural org
edited May 12

This might be the place for preset sharing in these trying initial Llama-3 times.

I'll share my recommendations so far:

Chaotic's simple presets here

And @Virt-io's great set of presets here - recommended. He puts a lot of effort into these.

Lewdiculous pinned discussion

@Lewdiculous

Can you try my context and instruct templates? I think I've ironed out all the issues.


For anyone making their own templates:

Change the chat start to something similar. It gets sent before the first message and functions as a separator between the card info and the chat.
context.png

And add an assistant prefix. It gets sent on every generation but does not stay in context, so if you use stat boxes or similar, you may want to put those instructions here; that keeps them from being lost in context.
SET THE PREFIX TO BE SENT AS USER, NOT AS SYSTEM
prefix.png
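For anyone wiring this up by hand, here's a minimal sketch of the two fields being described, using SillyTavern's context/instruct template field names (shown as a Python dict so it can carry comments; the strings are illustrative placeholders, not the actual preset values):

```python
# Sketch of the two SillyTavern template fields discussed above.
# Field names follow ST's context/instruct JSON presets; the string
# contents are illustrative, not Virt-io's actual values.

context_template = {
    # Sent once, between the card info and the first message, acting
    # as a separator between character data and the chat itself.
    "chat_start": "### New Roleplay:",
}

instruct_template = {
    # Sent on every generation but never stored in chat history, so
    # per-response instructions (stat boxes, format reminders) can't
    # drift out of context. Note the *user* role headers, per the
    # warning above about not sending this as system.
    "last_output_sequence": (
        "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        "[Reminder: keep the stat box at the end of every reply.]"
        "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    ),
}
```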

AetherArchitectural org

That looks pretty good, mate. I'll give them a try next time I'm down bad.

> Change the chat start to something similar. It gets sent before the first message and functions as a separator between the card info and the chat.

If it works, this will be amazing. I always went back to Solar because Llama-3 models would be weird if you listed things a character liked in the card: they would just randomly start talking about those things in a normal conversation. That always seemed like a fault of my settings rather than the models themselves. I'm going to try the presets with [Llama-3SOME-8B-v1-BETA](https://huggingface.co/TheDrummer/Llama-3SOME-8B-v1-BETA).
Edit - Llama-3Some-Beta misgenders the user; going back to Llama3-Sovl resolves it. Also, I've noticed the v1.5 presets make models write longer messages.

@saishf

The longer responses are intentional; if you want shorter responses, edit Consistent Pacing and Creating a Scene in the Style Guidelines section.

If you want different formatting, edit the Text Formatting Guidelines section.

If you run into weird issues with my templates, please tell me.

Roleplay-v1.5 are the best presets I've used so far. I'm also using the dynamic temp samplers, and this is the only combination that can successfully play a character like a zombie and remember that they aren't a normal human.
E.g.:

{{char}}'s voice is low and raspy, almost unintelligible due to the virus.

Big win :3

> @saishf
>
> The longer responses are intentional; if you want shorter responses, edit Consistent Pacing and Creating a Scene in the Style Guidelines section.
>
> If you want different formatting, edit the Text Formatting Guidelines section.
>
> If you run into weird issues with my templates, please tell me.

The only issue I've had so far is random extra quotation marks and asterisks, like these:
Screenshot_20240429-012044.png

Screenshot_20240429-012248.png

It's not a deal breaker though; a regen fixes it.

Edit - I'm going to try removing the text formatting section to see if it helps.

@saishf

The formatting issues are from [Dynamic-Temp]Roleplay; I am working on fixing these. At the moment some sliders are too high, causing it to be dumb.

Also, try a Q8_0 model; honestly, it is not that slow if you're only using the native 8K context.

AetherArchitectural org

Response formatting and quotation/asterisk placement in Llama-3s is the only deal breaker for me; otherwise their prose is fantastic for the size.

> @saishf
>
> The formatting issues are from [Dynamic-Temp]Roleplay; I am working on fixing these. At the moment some sliders are too high, causing it to be dumb.

I'll switch to static samplers :3. I tried playing with the samplers to see if I could half-fix it, and my character started speaking Russian 😺

> Response formatting and quotation/asterisk placement in Llama-3s is the only deal breaker for me; otherwise their prose is fantastic for the size.

It managed to surpass Claude 3 Haiku on LMSYS's English-only leaderboard 😼

I tried older dynamic temp presets with Llama-3; I think it just hates dynamic temp entirely. None had any success fixing the formatting, while switching to static presets fixed it instantly.
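For reference, a plain static baseline of the kind that fixed it looks something like this (illustrative values, not the actual preset's settings):

```python
# Illustrative static sampler settings: flat temperature plus Min-P,
# with dynamic temperature disabled. Not the presets' exact values.
static_samplers = {
    "temp": 1.0,
    "min_p": 0.05,        # prune tokens below 5% of the top token's probability
    "top_p": 1.0,         # effectively disabled
    "top_k": 0,           # disabled
    "dynatemp": False,    # the feature Llama-3 seems to hate
    "repetition_penalty": 1.05,
}
```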

@saishf

I have updated both samplers. Give them a try.


@Lewdiculous

I am not having any formatting issues with Llama3-8B-Instruct. I am using Q8_0, which might be the reason.

Screenshot_20240429-015825.png
Newest Dyna samplers.

Is it possible to RP with instruct? It's so smart 😿

Edit - static samplers are almost perfect; 9/10 responses are formatted correctly.

AetherArchitectural org
edited Apr 28

As for models, they will have to run well enough at 4.5 BPW / Q4_K_M quants at least, so most people can benefit from them. I won't rush; things will improve.
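As a rough size check, approximate llama.cpp bits-per-weight averages put those quants here for an 8B model (ballpark figures only):

```python
# Ballpark GGUF file sizes for an 8B model. The bpw values are
# approximate llama.cpp averages for each quant, not exact.
params = 8.0e9
for name, bpw in [("Q4_K_M", 4.85), ("Q8_0", 8.5)]:
    print(f"{name}: ~{params * bpw / 8 / 1e9:.1f} GB")
# Q4_K_M: ~4.9 GB, Q8_0: ~8.5 GB (plus KV cache on top)
```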

@saishf

I have fixed the repetition issue; it appears Llama-3 hates having two system prompts.
So, I changed the prefix to be sent as user.
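In other words, the per-turn prefix keeps its text but swaps roles. An illustrative before/after of the raw prefix string (not the exact preset contents):

```python
# Before: the prefix reused the system role, so the model saw a second
# system block on every turn and tended to loop/repeat.
prefix_as_system = (
    "<|eot_id|><|start_header_id|>system<|end_header_id|>\n\n"
    "[guidelines...]"
)

# After: identical text, delivered under the user role instead.
prefix_as_user = (
    "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "[guidelines...]"
)
```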

> @saishf
>
> I have fixed the repetition issue; it appears Llama-3 hates having two system prompts.
> So, I changed the prefix to be sent as user.

I'll be sure to try it tonight! Timezones 🥲

Quick testing: this took seven regens to get with the dynamic samplers, which is already so much better.
Screenshot_20240430-030617.png
But the simple samplers cause repetition every second message 😭
New models 😿

@saishf

The file that I updated was the instruct preset.

If you are comfortable, can you send me the card you are using so I can test it?

> @saishf
>
> The file that I updated was the instruct preset.
>
> If you are comfortable, can you send me the card you are using so I can test it?

I re-downloaded all four just to be sure, and the model is even merged with a quote/asterisk format LoRA @_@

https://files.catbox.moe/yw8hg1.png

Catbox, since I'm unsure whether HF strips metadata / character info :3

Edit - I don't know if the card just sucks; it's from about a year ago and I've just kept it around ever since.

Also, the combination of dynamic temp + the presets is amazing
Screenshot_20240430-031355~2.jpg
≥ Solar 😼

@saishf

I made some edits. I kind of feel bad now.

Using the simple samplers. Dynamic-temp still needs more work.
image.png


Card HERE

This is just personal preference, but I feel like small models get confused by first-person points of view. I just use third person for everything, even my own replies.

> I made some edits. I kind of feel bad now.

No wonder Glade is scared of everyone 😿

> This is just personal preference, but I feel like small models get confused by first-person points of view. I just use third person for everything, even my own replies.

They do get confused sometimes, but I never really care enough to worry. It's still fun, and sometimes the results are crazy 😭
Plus, I generally confuse the model with my English before it gets confused by itself.
I just remember trying out small models like Pygmalion 6B ages ago; I'm glad local AI is good in comparison now.
If only there were a ~30B Llama-3, then I'd have an excuse to buy a single 24GB GPU...
I'd like a T4 for the power consumption, but they cost the same as a 3090 😕

Also, have you tried running Phi-3? I found it uses 9GB of VRAM at 16K context using Q5_K_M, which is about a gigabyte more than Llama-3, a model twice its size.
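A rough back-of-the-envelope suggests the gap is mostly KV cache: the original Phi-3-mini config ships without GQA (32 KV heads), while Llama-3-8B uses 8. Architecture numbers below are taken from the published configs; treat the totals as ballpark:

```python
# Approximate fp16 KV-cache size (no cache quantization).
def kv_cache_gib(layers: int, kv_heads: int, head_dim: int, ctx: int) -> float:
    bytes_per_elem = 2  # fp16
    # 2x for the separate K and V tensors in every layer.
    return 2 * layers * kv_heads * head_dim * ctx * bytes_per_elem / 2**30

print(kv_cache_gib(layers=32, kv_heads=8,  head_dim=128, ctx=16384))  # Llama-3-8B: ~2.0 GiB
print(kv_cache_gib(layers=32, kv_heads=32, head_dim=96,  ctx=16384))  # Phi-3-mini: ~6.0 GiB
```

Which lines up with the smaller model eating more VRAM at 16K.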

@saishf

I tried Phi-3, but it was all over the place.


I made some changes to the presets.
Formatted the guidelines into a JSON-like structure and moved them into the 'last assistant prefix'.
The only downside is that you have to reprocess them every response, though they don't stay in context. So, maybe it is an upside?


Also added an example character. 😳

image.png
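Roughly what a JSON-like guideline block in the last assistant prefix looks like (keys and wording illustrative, not the actual v1.x preset contents):

```python
# Illustrative guideline block carried in last_output_sequence: it is
# re-sent (and reprocessed) every generation, but never kept in the
# chat history, so it can't drift out of context.
guidelines = {
    "style": ["third person", "past tense"],
    "formatting": {"speech": '"double quotes"', "actions": "*asterisks*"},
    "pacing": "advance the scene gradually; never speak or act for {{user}}",
}
```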

> I tried Phi-3, but it was all over the place.
>
> I made some changes to the presets.
> Formatted the guidelines into a JSON-like structure and moved them into the 'last assistant prefix'.
> The only downside is that you have to reprocess them every response, though they don't stay in context. So, maybe it is an upside?
>
> Also added an example character. 😳
>
> image.png

I hope Phi-3 7B isn't as weird as Mini and uses less VRAM. They do score crazy high on MMLU though.

It's not really an issue, as Kobold ingests insanely fast at the beginning, 1K+ T/s.

And the instruct has made characters a little more aware? But it still has the cursed asterisks-in-narration bug.

1000036215_x16.png

RP in the Shakespearean 'ra :3

> @saishf
>
> I tried Phi-3, but it was all over the place.
>
> I made some changes to the presets.
> Formatted the guidelines into a JSON-like structure and moved them into the 'last assistant prefix'.
> The only downside is that you have to reprocess them every response, though they don't stay in context. So, maybe it is an upside?
>
> Also added an example character. 😳
>
> image.png

Just downloaded Hermes 2 Pro Llama-3, and the random asterisk bug is still present in a model using ChatML. So it seems it's a Llama-3 thing, not a presets thing.

OK, so all the weird stuff I was doing with the last assistant prefix was a dud.

But including a message before the first message is working well.

Also, me being blind: the chat separator should go in the First Assistant Prefix:

image.png
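Per that correction, the separator lives in the instruct template's first-output field, which fires only before the model's first reply (field name per ST's instruct JSON; the string itself is illustrative):

```python
instruct_template = {
    # Fires once, before the first assistant reply, so it cleanly
    # separates the card info from the chat without repeating later.
    "first_output_sequence": (
        "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        "### New Roleplay:\n"
    ),
}
```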

> OK, so all the weird stuff I was doing with the last assistant prefix was a dud.
>
> But including a message before the first message is working well.
>
> Also, me being blind: the chat separator should go in the First Assistant Prefix:
>
> image.png

V1.7 definitely seems to behave better. I still encounter Llama-3 formatting errors, but characters don't seem to force "fairytale" scene endings as much as they used to.

@saishf

Are you using the version I uploaded a little while ago? I still left it at v1.7; I just updated the files.

I went and tried to remove anything that would steer cards to behave in a certain way. It should respect cards a lot more now.


I am going to focus on samplers now.

> @saishf
>
> Are you using the version I uploaded a little while ago? I still left it at v1.7; I just updated the files.
>
> I went and tried to remove anything that would steer cards to behave in a certain way. It should respect cards a lot more now.
>
> I am going to focus on samplers now.

I'll try the new ones; the ones I have are the original 1.7s.

And I think the samplers are the root cause of a lot of the formatting issues; messing around with the templates in ST, I found Llama-3's formatting would literally implode with some of them.

You can try the test samplers; they work decently.

> You can try the test samplers; they work decently.

The new samplers are much better; they'll commit torture while the simple roleplay samplers won't.
They will also actually state where something came from, instead of saying a character magically pulled a massive item out of thin air or conjured fire, etc.
Testing with Soliloquy V2 and the latest presets :3
Edit - maybe a little too torturous, bordering on murderous. I called a character an idiot and they yeeted me into a wall. Figured that might be an issue for some.

@saishf @Lewdiculous

v1.9 | 'Narrative Technique' was causing weird issues, so I removed it.

Also I have been corrupted :)

image.png

AetherArchitectural org
edited May 12

This is still Vanilla. When you say "I can't share my examples", that's when you know you're doing well.
Let it all in...

> This is still Vanilla. When you say "I can't share my examples", that's when you know you're doing well.
> Let it all in...

The struggles of reviewing models when none of your examples are shareable 😭

What are you people doing to the AI? (ಠ_ಠ)

> What are you people doing to the AI? (ಠ_ಠ)

Making use of the hard work being done with OAS 😸 (no point wasting it :3)

AetherArchitectural org

Science isn't pretty.

AetherArchitectural org

Just so more people are aware:

https://huggingface.co/Virt-io/SillyTavern-Presets


Closing for now.

Lewdiculous changed discussion status to closed
