Llama 3 using a fixed seed?
Whenever I regenerate responses while using Llama 3, it outputs exactly the same response every time. I think it's using a fixed seed. I tried it on groq.com and there the responses were different each time.
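If you want to reproduce this behavior locally, here's a minimal sketch (assuming the Hugging Face `transformers` library; the checkpoint name is just the public Llama 3 instruct id) of how a fixed seed pins the sampled output:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

inputs = tok("Write a haiku about the sea.", return_tensors="pt")

for _ in range(3):
    torch.manual_seed(42)  # resetting the seed before each call pins the RNG state
    out = model.generate(**inputs, do_sample=True, max_new_tokens=40)
    print(tok.decode(out[0], skip_special_tokens=True))

# All three generations come out identical; drop the manual_seed call
# and the sampled responses differ from run to run.
```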
I've noticed that Command R+ occasionally generates very similar or identical responses too, but I think that's just a bug.
Sounds like it could be a low temperature (randomness) setting. Good for precision if it doesn't hallucinate. Bad for creative writing or open-ended questions.
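A toy illustration of that (plain NumPy, no real model involved): temperature divides the logits before the softmax, so low values pile almost all of the probability onto the single top token, which makes every regeneration pick the same thing.

```python
import numpy as np

def softmax(logits, temperature):
    # Temperature scales the logits before normalizing into probabilities.
    z = np.asarray(logits) / temperature
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

logits = [4.0, 2.0, 1.0]  # hypothetical scores for three candidate tokens
print(softmax(logits, temperature=1.0))  # ~[0.84, 0.11, 0.04]: some variety
print(softmax(logits, temperature=0.2))  # ~[1.00, 0.00, 0.00]: near-deterministic
```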
@EveryPizza Just add an instruction at the top like "You always generate Unique responses" or "Must create unique response every time"
That won't have much of an effect, because the model isn't told what the last sent message was.
@EveryPizza My assistants also use Llama 3, but they don't repeat themselves, thanks to the system prompt.
Model Link-> https://hf.co/chat/assistant/6612cb237c1e770b75c5ebad
A workaround for this is to use an assistant with less restrictive parameters.
Don't use a low temperature if you want varied responses; decrease it only if the output turns into gibberish or goes "off the rails".
Avoid a low Top P; values slightly below 1 should be safe.
Repetition penalty is a tricky one; decrease it if the output starts degrading into nonsense.
Top-K is straightforward in how it works: use lower values for more predictable output.
The options look like this, but it will take some trial and error to find the "right" values (the sketch after this post shows roughly how they map onto a sampling call).
Use whatever you like for a system prompt for your use case.
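For reference, here's roughly how those knobs map onto a sampling call if you run the model yourself (a sketch assuming the Hugging Face `transformers` API; the values are illustrative starting points, not recommendations):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
inputs = tok("Tell me a short story.", return_tensors="pt")

out = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.9,         # higher -> more varied responses
    top_p=0.95,              # slightly below 1 should be safe
    top_k=50,                # lower -> more predictable output
    repetition_penalty=1.1,  # lower it if output degrades into nonsense
    max_new_tokens=256,
)
print(tok.decode(out[0], skip_special_tokens=True))
```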
@LostSpirit thanks for the info! I wish it was possible to customize these parameters in regular chat because assistants don't have the web search toggle (it's either permanently enabled or disabled).
Is there a canonical list of these parameters for Llama 3 somewhere, with min/max/default values?