The default eos_token_id is 2, should be 11
Hi Guys!
Fantastic model.
I have encountered an issue using both the 7B and 40B Falcon models with the recommended settings: they continue generation past <|endoftext|>.
The issue is that the default llama eos_token_id=2 is specified here: https://huggingface.co/tiiuae/falcon-40b-instruct/blob/main/configuration_RW.py#L41
Looking at https://huggingface.co/tiiuae/falcon-40b-instruct/raw/main/tokenizer.json, this is not the llama vocabulary: token 2 is >>INTRODUCTION<<,
and I think we're looking for token 11, <|endoftext|>.
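To illustrate why the wrong id makes generation run past <|endoftext|>, here is a toy greedy-decoding loop (not the transformers internals, just a sketch with made-up token ids apart from 11): generation only halts when the configured EOS id actually appears in the output stream.

```python
def greedy_generate(next_token_fn, prompt_ids, eos_token_id, max_new_tokens=20):
    """Minimal greedy decode loop: append tokens until eos_token_id appears."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        tok = next_token_fn(ids)
        ids.append(tok)
        if tok == eos_token_id:
            break
    return ids

# Toy "model": emits 5, 7, then 11 (<|endoftext|> in Falcon's vocab), then filler.
def make_stream():
    stream = iter([5, 7, 11, 3, 3, 3, 3, 3, 3, 3])
    return lambda ids: next(stream)

print(greedy_generate(make_stream(), [1], eos_token_id=11))
# stops right after 11

print(greedy_generate(make_stream(), [1], eos_token_id=2, max_new_tokens=6))
# never sees token 2, so it keeps emitting filler past 11
```

With eos_token_id=2 the loop sails straight past token 11, which is exactly the runaway-generation symptom above.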
I was able to work around the generation problem by manually passing eos_token_id=11 at model invocation.
--Mike
Were you able to run the model on SageMaker?
Hey @mike-ravkine, glad you like the model!
This is a bit surprising. While we should fix the default value in configuration_RW.py, config.json has been correct for some days now, so when the model is loaded the config should be correct.
See: https://huggingface.co/tiiuae/falcon-40b-instruct/commit/662a9a4ffd96f4f73dd18141b60962f94b743c56
Could it be an issue with using a cached model from before it was fixed?
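One way to rule out a stale cache is to force a fresh copy of config.json, e.g. with huggingface_hub (assumed installed; force_download bypasses any previously cached file):

```python
import json
from huggingface_hub import hf_hub_download

# Hedged sketch: re-download config.json, ignoring the local cache,
# and check which eos_token_id it actually carries.
path = hf_hub_download(
    "tiiuae/falcon-40b-instruct", "config.json", force_download=True
)
with open(path) as f:
    print(json.load(f).get("eos_token_id"))
```

If this prints 11 but generation still misbehaves, the model object is probably being built from some other (older) copy of the config.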
Thanks for the response @FalconLLM, this makes sense. I am actually using a quantized version of the model (from https://huggingface.co/TheBloke/falcon-40b-instruct-GPTQ) and it was missing the fix to config.json from above. I have opened a PR against that model!