mamba2-370m-av

Introduction

This is a mirror of mamba2-370m that is compatible with mamba2-torch, a Hugging Face compatible Mamba2 library that does not depend on the CUDA wheels of the original mamba repo. Credit goes to the original authors of Mamba2 and to the transformers library by Hugging Face; without their work, this would not be possible.

NOTE: mamba2-torch offers three optimisation paths:

Triton kernels and causal-conv1d ("fastest")
Triton kernels only (default)
Pure PyTorch
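
Which of these paths is usable depends on which optional packages are installed. The following is a minimal sketch that only checks whether the two optional packages can be imported and maps the result to the list above. The helper names `installed` and `best_available_path` are hypothetical; how mamba2-torch itself selects a path is configured in the library and is not shown here.

```python
from importlib.util import find_spec


def installed(module_name: str) -> bool:
    """Return True if the module can be imported in this environment."""
    return find_spec(module_name) is not None


def best_available_path(has_triton: bool, has_causal_conv1d: bool) -> str:
    """Map the installed optional packages to the fastest path listed above."""
    if has_triton and has_causal_conv1d:
        return "Triton kernels and causal-conv1d"
    if has_triton:
        return "Triton kernels only"
    return "Pure PyTorch"


# "triton" and "causal_conv1d" are the import names of the two optional packages.
print(best_available_path(installed("triton"), installed("causal_conv1d")))
```

Note that an importable package is a necessary condition only; the Triton paths additionally need a supported GPU at run time.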

How to Get Started with the Model

For a more detailed explanation, follow the instructions in the mamba2-torch repo. First, install the mamba2-torch library:

git clone https://github.com/vasqu/mamba2-torch.git
cd mamba2-torch
pip install .
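
The model files also need to be on disk before they can be loaded. Besides the git lfs route described next, one option is to fetch them from Python with the `huggingface_hub` package. This is a sketch under assumptions: `huggingface_hub` is an extra dependency not mentioned in this card, and the helper name `download_model` and the default folder name are made up for illustration.

```python
REPO_ID = "AntonV/mamba2-370m-av"  # this repository on the Hugging Face Hub


def download_model(local_dir: str = "mamba2-370m-av") -> str:
    """Download all repository files into local_dir and return the folder path."""
    # Imported lazily so this file loads even when huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Usage (requires network access and `pip install huggingface_hub`):
# mamba2_hf_path = download_model()
```

The returned path can then be used wherever a local model directory is expected.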

Then download this repository via git lfs and use the files locally as follows (after installing mamba2-torch):

from transformers import AutoTokenizer
from mamba2_torch import Mamba2ForCausalLM

device = "cuda"
# local directory containing the files downloaded from this repository
mamba2_hf_path = "<path-to-converted-model>"

# load weights and tokenizer from disk only; nothing is fetched from the Hub
model = Mamba2ForCausalLM.from_pretrained(mamba2_hf_path, local_files_only=True).to(device)
tokenizer = AutoTokenizer.from_pretrained(mamba2_hf_path, local_files_only=True)

input_ids = tokenizer("Hey how are you doing?", return_tensors="pt")["input_ids"].to(device)

# expected output (370m): `["Hey how are you doing?\n\nI'm a newbie to the world"]`
out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
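
As the expected output shows, the generated sequence starts with the prompt and the new tokens follow it. If only the continuation is wanted, the prompt can be sliced off before decoding. The sketch below illustrates the idea with plain Python lists standing in for one row of token ids; the function name `split_continuation` and the token ids are made up for illustration.

```python
def split_continuation(output_ids: list[int], prompt_ids: list[int]) -> list[int]:
    """Return only the tokens that follow the prompt in a generated sequence."""
    if output_ids[: len(prompt_ids)] != prompt_ids:
        raise ValueError("output does not start with the prompt")
    return output_ids[len(prompt_ids):]


# Made-up token ids: a 3-token prompt followed by 2 generated tokens.
prompt = [101, 7, 42]
generated = [101, 7, 42, 9, 13]
print(split_continuation(generated, prompt))  # [9, 13]
```

With the tensors from the snippet above, the equivalent slice would be `out[:, input_ids.shape[1]:]`, applied before calling `tokenizer.batch_decode`.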

Citation

BibTeX:

@inproceedings{mamba2,
 title={Transformers are {SSM}s: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
 author={Dao, Tri and Gu, Albert},
 booktitle={International Conference on Machine Learning (ICML)},
 year={2024}
}

Model size: 0.4B params (Safetensors, tensor type F32)
