license: apache-2.0

Mamba-2.8b is a 2.8B-parameter model built on the Mamba architecture and trained on the Pile dataset.

Model code: https://github.com/state-spaces/mamba/tree/main

To load the model, follow the installation instructions in the code repo, then run:

from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
model = MambaLMHeadModel.from_pretrained("EleutherAI/Hermes-mamba-2.8b")
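
Once the model loads, a short generation call is a quick way to check that it works. The sketch below is illustrative, not an official recipe. The helper name `generate_text`, the default of 50 new tokens, and the fp16 dtype are choices made for this example. The tokenizer `EleutherAI/gpt-neox-20b` is an assumption: this README does not name a tokenizer, and that one is used in the `state-spaces/mamba` example scripts for the Pile-trained Mamba checkpoints. The sketch also assumes a CUDA GPU and an installed `transformers` package.

```python
def generate_text(
    prompt,
    repo_id="EleutherAI/Hermes-mamba-2.8b",
    tokenizer_id="EleutherAI/gpt-neox-20b",  # assumption: not stated in this README
    max_new_tokens=50,
):
    """Load the checkpoint and greedily continue `prompt` (hypothetical helper)."""
    # Imported lazily so the function can be defined on a machine without a GPU stack.
    import torch
    from transformers import AutoTokenizer
    from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

    device = "cuda"  # assumption: an NVIDIA GPU is available
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = MambaLMHeadModel.from_pretrained(repo_id, device=device, dtype=torch.float16)
    model.eval()

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    # mamba_ssm's generate() takes the total length: prompt tokens plus new tokens.
    max_length = input_ids.shape[1] + max_new_tokens
    sequences = model.generate(input_ids=input_ids, max_length=max_length)
    return tokenizer.batch_decode(sequences)[0]
```

Calling `generate_text("The Pile is")` downloads the weights on first use and returns the prompt followed by its continuation. If the checkpoint was trained with a different tokenizer, pass it through `tokenizer_id`.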