error loading model: unknown model architecture: 'mamba'

#1
by tinchung - opened

I followed this notebook to set up llama-cpp in a Kaggle notebook.

When I try to load the LLM:

from llama_cpp import Llama

# model_path is the location of the GGUF model you've downloaded from HuggingFace
model_path = "/kaggle/working/models/mamba-130m/mamba-130m-q2_k.gguf"

# load the LLM
llm = Llama(model_path=model_path,
            n_gpu_layers=-1)  # offload all layers to the GPU

It loads everything, but llama-cpp doesn't recognize Mamba and gives this error:

llama_model_loader: - type  f32:  193 tensors
llama_model_loader: - type q2_K:   48 tensors
llama_model_loader: - type q6_K:    1 tensors
error loading model: unknown model architecture: 'mamba'
llama_load_model_from_file: failed to load model
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[10], line 7
      4 model_path = "/kaggle/working/models/mamba-130m/mamba-130m-q2_k.gguf"
      6 #load the LLM
----> 7 llm = Llama(model_path=model_path,
      8             n_gpu_layers=-1) #load model while enabling GPU

File /opt/conda/lib/python3.10/site-packages/llama_cpp/llama.py:340, in Llama.__init__(self, model_path, seed, n_ctx, n_batch, n_gpu_layers, main_gpu, tensor_split, rope_freq_base, rope_freq_scale, low_vram, mul_mat_q, f16_kv, logits_all, vocab_only, use_mmap, use_mlock, embedding, n_threads, last_n_tokens_size, lora_base, lora_path, numa, verbose, **kwargs)
    336     with suppress_stdout_stderr():
    337         self.model = llama_cpp.llama_load_model_from_file(
    338             self.model_path.encode("utf-8"), self.params
    339         )
--> 340 assert self.model is not None
    342 if verbose:
    343     self.ctx = llama_cpp.llama_new_context_with_model(self.model, self.params)

AssertionError: 

Use llama-b3615-bin-ubuntu-x64.zip:

!./build/bin/llama-cli -m /content/mamba-2.8b-f32.gguf -p "Building a website can be done in 10 steps:"

It runs in Colab.
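If you'd rather keep using the Python bindings than the prebuilt `llama-cli` binary, the same fix applies: the error means the installed llama.cpp build predates Mamba support, so upgrading llama-cpp-python to a recent release (built against a 2024+ llama.cpp) should let `Llama(...)` load the model. A minimal sketch, assuming a recent release on PyPI is sufficient (the exact minimum version is not confirmed here):

```shell
# Upgrade the Python bindings so the bundled llama.cpp recognizes the
# 'mamba' architecture; forcing a source rebuild picks up the new backend.
pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```

After upgrading, re-running the original `Llama(model_path=..., n_gpu_layers=-1)` snippet should get past the `unknown model architecture: 'mamba'` assertion.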

Thank you.
