We are thrilled to announce Jamba, the world's first production-grade Mamba-based model.
Key Features:

- First production-grade Mamba-based model built on a novel SSM-Transformer hybrid architecture
- 3X throughput on long contexts compared to Mixtral 8x7B
- Democratizes access to a massive 256K context window
- The only model in its size class that fits up to 140K context on a single GPU
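To make the single-GPU claim concrete, here is a minimal sketch of loading the model with Hugging Face `transformers`. The model ID `ai21labs/Jamba-v0.1` and the bf16/device-map settings are assumptions for illustration, not official usage instructions.

```python
# A minimal sketch of loading and querying the model via Hugging Face
# transformers. The model ID below is an assumption, not taken from this
# announcement; check the model card for the canonical instructions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Half precision helps a long context fit on a single large GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "The key advantage of a hybrid SSM-Transformer model is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```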
Jamba is based on a novel architecture that combines the Mamba structured state-space model (SSM) with the Transformer. While our initial results show great efficiency gains, we expect these to be further explored and improved with the help of the community.
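To illustrate the hybrid idea, the sketch below interleaves SSM-style blocks with occasional attention blocks. This is a conceptual toy, not Jamba's actual implementation: the layer ratio, dimensions, and the gated causal convolution standing in for the selective SSM scan are all placeholders. The intuition it captures is that most layers run in time linear in sequence length, while a few attention layers retain the Transformer's global mixing.

```python
# Conceptual sketch of a hybrid SSM-Transformer stack. Not Jamba's actual
# implementation; all sizes and the attention-to-SSM ratio are placeholders.
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """Standard self-attention block standing in for a Transformer layer."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out  # residual connection

class SSMBlock(nn.Module):
    """Placeholder for a Mamba-style SSM layer: a gated causal depthwise
    convolution stands in for the selective state-space scan, which runs
    in time linear in sequence length."""
    def __init__(self, d_model: int, kernel_size: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size - 1, groups=d_model)
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        # causal convolution over the sequence dimension
        c = self.conv(h.transpose(1, 2))[..., : x.shape[1]].transpose(1, 2)
        return x + c * torch.sigmoid(self.gate(h))  # gated residual

class HybridStack(nn.Module):
    """Interleave SSM blocks with an occasional attention block."""
    def __init__(self, d_model: int, n_layers: int, attn_every: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            AttentionBlock(d_model) if (i + 1) % attn_every == 0
            else SSMBlock(d_model)
            for i in range(n_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 128, 512)          # (batch, seq_len, d_model)
print(HybridStack(512, 8)(x).shape)   # torch.Size([2, 128, 512])
```

Keeping attention layers sparse in a stack like this is what lets the memory and compute cost of most layers stay flat as the context grows, which is the design motivation behind the throughput and context-length numbers above.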