Is 8k Context Length Justified?

by Akirami - opened

Given the model size, do you think the 8k context length is justified? Considering that it's a small 1B model, feeding it inputs near that context length may lead to poor results.

By the way, I have tested the model and it produces repetitive output almost all the time, repeating the same sentence throughout the generation.
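In case it helps anyone reproduce or work around this: below is a minimal sketch (assuming the model is loaded through the Hugging Face `transformers` library; `"your-model-id"` is a placeholder, not the actual repo name) of the decoding settings I would try first to tame sentence-level repetition. These only adjust decoding, they don't fix the underlying model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-model-id"  # placeholder: substitute the actual model repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain what a context window is.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,  # down-weight tokens already generated
    no_repeat_ngram_size=3,  # block verbatim 3-gram repeats
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding on small models is especially prone to loops, so sampling plus a repetition penalty is usually the cheapest thing to rule out before blaming the model itself.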
