Holy **** it's really, really good!
I've played around with pretty much all the mainstream models (you know, the ones that get all the attention), but after trying this one, I think it's my favorite model so far. I usually run 7B models; I can run quantized 70B models since I have the system RAM, but I prefer using VRAM for obvious reasons. And this one right here hits the sweet spot: it's fast but really, really good. I am amazed.
It really keeps the conversation going and stays in character. I use models to test characters. Basically, let's say I'm writing a story: instead of just writing the characters and thinking for them myself, I like to use local LLMs to "live" these characters. It's very much like getting to know a person, and more than that, I can get a reaction to certain story beats. I can then combine that with my own ideas, and I feel the characters become a lot more real.
Sorry for the long story, but yeah, this model does an incredible job of keeping characters, um, in character. Kudos and thanks, I'll deffo be keeping an eye on you from now on. Thank you so much for the contribution!