L3-8B-Lunaris-v1-exl2-6_5 struggles when the context goes past 8k
I have a 3090, so VRAM shouldn't be an issue, but when using SillyTavern with a lorebook for our guild I quickly hit 9k context. At that point the responses glitch and you get a ton of random or repeated text.
Both SillyTavern and text-generation-webui are set for 32k context.
Is the model card for this correct? At 8k and below it's a rather amazing model, so I wonder if the max context is actually 8k?
Llama 3 was only trained up to 8k context. If the author of this model claims it goes further, that's surprising; I'd hope they trained it enough to support it, but maybe not.
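If you want to sanity-check what a repo actually declares, you can read max_position_embeddings out of its config.json. Rough sketch below; I'm guessing at the base repo id this exl2 quant was made from, so swap in the real one:

```python
# Minimal sketch: read the declared context window from the model's
# config.json on the Hub. The repo id is an assumption -- replace it
# with the actual base repo this quant came from.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Sao10K/L3-8B-Lunaris-v1")

# Stock Llama 3 ships with max_position_embeddings = 8192, i.e. an 8k
# window; a genuine 32k model should report 32768 here (or carry a
# rope_scaling entry in its config).
print(cfg.max_position_embeddings)
```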
The model card for this model shows VRAM usage at (32k) for all the different quant versions. I assumed the 32k was the context?
Am I reading it wrong? Or is that part of the model page autogenerated and not actually showing it as supported?
It's only recently that I've had a need for a context higher than 8k, so it's possible I'm just not understanding the model cards correctly.
Oh that's just autogenerated and doesn't imply the model is capable of that, sorry!
You can try out Dolphin's Mistral tune, which has 32k support: https://huggingface.co/bartowski/dolphin-2.9.3-mistral-7B-32k-exl2
Np, I will bear that in mind when looking at models lol
Will take a look at the Dolphin model tomorrow; I think I have used it before... though not that specific variant. Looks like lower VRAM usage too, interesting. Cheers for the replies, and sorry for my confusion.