Longer context

#10
by salazaaar - opened

The model is currently restricted to a context of 4096 tokens. Is there a way to extend it without retraining?

RLHFlow org

@salazaaar I think the model supports a context window of 8192 tokens (see config.json: "max_position_embeddings": 8192).
8192 is the native context length of Llama 3. If you want to go beyond that without training, you can try training-free methods such as:
LM-Infinite: https://arxiv.org/abs/2308.16137
StreamingLLM (attention sinks): https://arxiv.org/abs/2309.17453
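As a rough illustration of one training-free option, here is a minimal sketch using dynamic NTK RoPE scaling as exposed by Hugging Face transformers for Llama-style models (note this is a different technique from the two papers linked above). The model id and the scaling factor of 2.0 are placeholders, and output quality typically degrades as you push further past the native length.

```python
# Sketch: load a Llama-3-based model with dynamic NTK RoPE scaling so it can
# attend past its native 8192-token window without any retraining.
# "RLHFlow/model-name" is a placeholder repo id; factor 2.0 roughly targets ~16k tokens.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RLHFlow/model-name"  # placeholder, use the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "dynamic", "factor": 2.0},  # training-free context extension
    torch_dtype="auto",
)

long_prompt = "..."  # your long document / conversation goes here
inputs = tokenizer(long_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The two linked papers instead modify the attention pattern at inference time (keeping a few "sink" tokens plus a sliding window), so they target streaming-style generation rather than full attention over a longer prompt; RoPE scaling is usually the simplest thing to try first.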

Haoxiang-Wang changed discussion status to closed
