error loading model: unknown model architecture: 'mamba'
#1
by tinchung - opened
I followed this notebook to set up llama-cpp in a Kaggle notebook. When I try to load the LLM:
from llama_cpp import Llama

# model_path is the path to the GGUF model that you downloaded from HuggingFace
model_path = "/kaggle/working/models/mamba-130m/mamba-130m-q2_k.gguf"

# load the LLM
llm = Llama(model_path=model_path,
            n_gpu_layers=-1)  # offload all layers to the GPU
It loads everything but then tells me that llama-cpp doesn't recognize Mamba, giving this error:
llama_model_loader: - type f32: 193 tensors
llama_model_loader: - type q2_K: 48 tensors
llama_model_loader: - type q6_K: 1 tensors
error loading model: unknown model architecture: 'mamba'
llama_load_model_from_file: failed to load model
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[10], line 7
4 model_path = "/kaggle/working/models/mamba-130m/mamba-130m-q2_k.gguf"
6 #load the LLM
----> 7 llm = Llama(model_path=model_path,
8 n_gpu_layers=-1) #load model while enabling GPU
File /opt/conda/lib/python3.10/site-packages/llama_cpp/llama.py:340, in Llama.__init__(self, model_path, seed, n_ctx, n_batch, n_gpu_layers, main_gpu, tensor_split, rope_freq_base, rope_freq_scale, low_vram, mul_mat_q, f16_kv, logits_all, vocab_only, use_mmap, use_mlock, embedding, n_threads, last_n_tokens_size, lora_base, lora_path, numa, verbose, **kwargs)
336 with suppress_stdout_stderr():
337 self.model = llama_cpp.llama_load_model_from_file(
338 self.model_path.encode("utf-8"), self.params
339 )
--> 340 assert self.model is not None
342 if verbose:
343 self.ctx = llama_cpp.llama_new_context_with_model(self.model, self.params)
AssertionError:
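The "unknown model architecture" error usually means the llama.cpp build bundled with the installed llama-cpp-python predates Mamba support, not that the file itself is corrupt. You can confirm what architecture the GGUF file declares by reading its header directly. Below is a minimal sketch (the function name `read_gguf_architecture` is mine, not part of any library; it only handles string-valued metadata, which is enough because `general.architecture` is normally the first key in the file):

```python
import struct

def read_gguf_architecture(path):
    """Read only the 'general.architecture' key from a GGUF file header.

    Minimal sketch for GGUF v2/v3: parses the magic, version, tensor count,
    and metadata count, then scans string-valued keys until it finds
    'general.architecture'. Non-string values are not handled.
    """
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, = struct.unpack("<I", f.read(4))
        tensor_count, = struct.unpack("<Q", f.read(8))
        kv_count, = struct.unpack("<Q", f.read(8))
        for _ in range(kv_count):
            key_len, = struct.unpack("<Q", f.read(8))
            key = f.read(key_len).decode("utf-8")
            vtype, = struct.unpack("<I", f.read(4))
            if vtype != 8:  # 8 = string in the GGUF type enum
                # skipping other value types needs full type handling;
                # general.architecture is normally first, so stop here
                break
            str_len, = struct.unpack("<Q", f.read(8))
            value = f.read(str_len).decode("utf-8")
            if key == "general.architecture":
                return value
    return None
```

If this prints `mamba` for your file, the GGUF is fine and the fix is to use a llama.cpp build recent enough to know that architecture.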
Use llama-b3615-bin-ubuntu-x64.zip:
!./build/bin/llama-cli -m /content/mamba-2.8b-f32.gguf -p "Building a website can be done in 10 steps:"
It runs in Colab. Thank you.
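The suggestion above can be sketched end-to-end like this (the download URL follows llama.cpp's standard GitHub release pattern and is an assumption; adjust the tag, archive layout, and model path to your setup):

```shell
# Download a prebuilt llama.cpp release recent enough to know the 'mamba'
# architecture (b3615 is the build mentioned above; URL pattern assumed)
wget https://github.com/ggerganov/llama.cpp/releases/download/b3615/llama-b3615-bin-ubuntu-x64.zip
unzip llama-b3615-bin-ubuntu-x64.zip

# Run the model directly with llama-cli instead of llama-cpp-python
./build/bin/llama-cli -m /content/mamba-2.8b-f32.gguf \
  -p "Building a website can be done in 10 steps:"
```

If you want to stay with the Python bindings instead, upgrading llama-cpp-python to a recent release (so its bundled llama.cpp also knows the `mamba` architecture) should resolve the same error.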