
What Instruct Template? Recommended Temperature?

#3
by Varkoyote

Hello! I'd really appreciate more information on how to run this model properly, please~ I tried temperatures of 1.0 and 0.75, but the model tends to repeat itself in odd ways... so far it seems pretty dumb and doesn't really follow stories or instructions for me. Maybe the quants I'm using are broken, or maybe it's because these are the base models and not the instruct ones?

Hello @Varkoyote , you can find the instruct model here: https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct
Can you provide more information about the settings you're using along with the output?
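For reference, a ChatML-style prompt just wraps each turn in `<|im_start|>role` ... `<|im_end|>` markers. Here is a minimal sketch in plain Python of how such a prompt string is assembled (whether OLMo 2's chat template actually follows ChatML is an assumption here, so check the tokenizer config before relying on it):

```python
def chatml_prompt(messages):
    """Build a ChatML-style prompt from (role, content) pairs."""
    parts = []
    for role, content in messages:
        # Each turn is delimited by <|im_start|>role ... <|im_end|>
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    # End with an open assistant turn to cue the model's reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = chatml_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Hello!"),
])
```

If the model was not trained on these exact special tokens, it will not respect the `<|im_end|>` stop sequence, which would explain the behavior described above.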

Hey! Sadly I'm on CPU, so I have to wait for someone to quantize the instruct version... As for this one, I'm using very ordinary settings! The ChatML format doesn't seem to work well (it doesn't respect stop sequences), but Alpaca works. I'm using a temperature of around 1 and a min P of 0.1, and the model tends to repeat its last sentence a lot of the time, or not listen to what I say at all (for example, when I correct a wrong detail in a story it's telling, it just starts over with the same wrong details).
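(To clarify what min P does: before sampling, it discards any token whose probability is below min_p times the probability of the most likely token. A rough sketch with made-up probabilities, not the actual sampler code:)

```python
def min_p_filter(probs, min_p):
    """Keep only tokens whose probability is at least min_p * max(probs)."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    # Renormalize the surviving probabilities so they sum to 1.
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Illustrative distribution: with min_p = 0.1 the cutoff is 0.1 * 0.5 = 0.05,
# so "xyzzy" (0.04) and "qux" (0.01) are dropped before sampling.
probs = {"the": 0.5, "a": 0.3, "dog": 0.15, "xyzzy": 0.04, "qux": 0.01}
filtered = min_p_filter(probs, 0.1)
```

So a min P of 0.1 is fairly permissive on its own; the repetition is more likely a template or quantization issue than a sampler setting.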

Hi, we are currently working out some issues with the instruction-tuned version, its tokenization, and its quantization at https://github.com/ggerganov/llama.cpp/pull/10535. Once those are sorted out, we will upload quantized GGUF weights, similar to https://huggingface.co/allenai/OLMo-2-1124-13B-GGUF. I assume this is what you are looking for?

I think so, yes! Right now both the base and instruct versions really don't seem to want to follow the context, haha... I ask for something and it completely ignores it...
