Appreciate the model drop!

#6
by Nitral-AI - opened

But why is it only 4k? It's 2025, man, those are rookie numbers.

Language Technologies Unit @ Barcelona Supercomputing Center org
edited 10 days ago

We understand the demand for longer context windows, and our roadmap includes multiple possible approaches to increasing it. Extending the context length involves trade-offs in training efficiency, memory usage, and model performance, and we are working on how to do it as efficiently as possible.
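The team has not said which extension method they will use, but one widely used technique for stretching a trained context window is linear position interpolation of the rotary position embeddings (RoPE): positions in the extended window are rescaled so they fall back inside the range seen during training. A minimal self-contained sketch (the function and its parameters are illustrative, not part of Salamandra's code):

```python
import math

def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    # Rotary position embedding angles for a single token position.
    # scale > 1 implements linear position interpolation: positions are
    # compressed so an extended context maps back into the position
    # range the model saw during training.
    return [(pos / scale) * base ** (-2 * i / dim) for i in range(dim // 2)]

# With a trained window of 4096, position 8191 under scale=2.0 produces
# the same angles as position 4095.5 in the original scheme, so the
# model never sees out-of-range rotations.
orig = rope_angles(4095.5)
interp = rope_angles(8191, scale=2.0)
assert all(math.isclose(a, b) for a, b in zip(orig, interp))
```

The trade-off the reply alludes to shows up here: compressing positions reduces the angular resolution between neighboring tokens, which is why interpolated models typically need some fine-tuning at the longer length to recover quality.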

If you need a model with a longer context right now, consider using our instructed Salamandra-7b; it might be more suitable for you.
