Update `max_position_embeddings` to 4096

#6

The model has a 4096-token context length, which should be correctly reflected in `config.json`.

See https://github.com/facebookresearch/codellama/blob/1af62e1f43db1fa5140fa43cb828465a603a48f3/llama/model.py#L277 in the reference implementation (`self.params.max_seq_len * 2`, where `self.params.max_seq_len == 2048`). This was also confirmed offline with a Meta engineer.
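
For illustration, here is a hedged, pure-Python sketch of what that precomputation amounts to (not the actual torch code from `llama/model.py`, and the 70B head dimension of 8192 / 64 is my assumption): the rotary-embedding table is built for `max_seq_len * 2` positions, i.e. the 4096 that `max_position_embeddings` should carry.

```python
import cmath

# Hedged stand-in for precompute_freqs_cis in llama/model.py, not a verbatim copy:
# the reference code builds the complex rotary-embedding table for
# self.params.max_seq_len * 2 positions.

def precompute_freqs_cis(dim: int, end: int, theta: float = 10000.0):
    """Return an end x (dim // 2) table of complex rotations."""
    inv_freqs = [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]
    return [[cmath.exp(1j * pos * f) for f in inv_freqs] for pos in range(end)]

max_seq_len = 2048      # self.params.max_seq_len in the reference implementation
head_dim = 8192 // 64   # dim // n_heads for the 70B config (assumed values)
table = precompute_freqs_cis(head_dim, max_seq_len * 2)

print(len(table))       # 4096 -> the value max_position_embeddings should hold
```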

This would also apply to other 70b models, I imagine?
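
If it helps, a quick way to check is to read `max_position_embeddings` straight from each config; the repo ids below are my assumption for the other 70B variants:

```python
from transformers import AutoConfig

# Repo ids are assumptions for the 70B variants mentioned above; adjust as needed.
repos = [
    "codellama/CodeLlama-70b-hf",
    "codellama/CodeLlama-70b-Python-hf",
    "codellama/CodeLlama-70b-Instruct-hf",
]

for repo in repos:
    config = AutoConfig.from_pretrained(repo)
    print(repo, config.max_position_embeddings)  # expected: 4096 after this fix
```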

Thanks for the fix!

osanseviero changed pull request status to merged
