ctransformers Autotokenizer error - "not implemented"
I'm trying to use TheBloke/Llama-2-13B-chat-GGUF with llama-2-13b-chat.Q5_K_M.gguf, and I get an error saying it is "not implemented" when I try to use AutoTokenizer. Am I doing something wrong?
Code:
from ctransformers import AutoModelForCausalLM, AutoTokenizer
....
model_name = "TheBloke/Llama-2-13B-chat-GGUF"
model_file = "llama-2-13b-chat.Q5_K_M.gguf"
llm = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file, model_type="llama", gpu_layers=50, hf=True)
tokenizer = AutoTokenizer.from_pretrained(llm)
Error line: tokenizer = AutoTokenizer.from_pretrained(llm)
Error message:
tokenizer = AutoTokenizer.from_pretrained(llm)
File "/home/ubuntu/prjLlamaQuant/venv/lib/python3.10/site-packages/ctransformers/hub.py", line 268, in from_pretrained
return CTransformersTokenizer(model._llm)
File "/home/ubuntu/prjLlamaQuant/venv/lib/python3.10/site-packages/ctransformers/transformers.py", line 84, in init
super().init(**kwargs)
File "/home/ubuntu/prjLlamaQuant/venv/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 367, in init
self._add_tokens(
File "/home/ubuntu/prjLlamaQuant/venv/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
current_vocab = self.get_vocab().copy()
File "/home/ubuntu/prjLlamaQuant/venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1676, in get_vocab
raise NotImplementedError()
NotImplementedError
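For reference, a possible workaround I'm considering while the wrapper is broken: skip AutoTokenizer and hf=True entirely and use ctransformers' native interface, which tokenizes prompts internally and also exposes tokenize/detokenize. A minimal sketch with the same model files as above:

from ctransformers import AutoModelForCausalLM

# Native ctransformers interface: no hf=True, no transformers tokenizer involved.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-13B-chat-GGUF",
    model_file="llama-2-13b-chat.Q5_K_M.gguf",
    model_type="llama",
    gpu_layers=50,
)

# The model handles tokenization itself; tokenize/detokenize are available if needed.
ids = llm.tokenize("Hello, world")
print(llm.detokenize(ids))
print(llm("Q: What is 2 + 2? A:", max_new_tokens=16))

This avoids CTransformersTokenizer (and its unimplemented get_vocab) altogether, at the cost of losing the transformers-compatible API.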
Same question here, please help. Thanks. (I'm using llama-2-7b-chat.Q4_K_M.gguf)
I'm also facing the same issue. Have you found a fix?
ctransformers 0.2.27
transformers 4.38.2
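If it helps, one untested sketch: keep hf=True on the model but load the tokenizer with transformers' own AutoTokenizer from a source Llama-2 repo instead of ctransformers' wrapper. The repo name below is an assumption (TheBloke's GGUF repos don't ship tokenizer files; substitute meta-llama/Llama-2-13b-chat-hf if you have access to the gated original):

from transformers import AutoTokenizer

# Assumption: an ungated mirror of the Llama-2 chat tokenizer.
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-13b-chat-hf")

ids = tokenizer("Hello, world", return_tensors="pt").input_ids
print(tokenizer.decode(ids[0], skip_special_tokens=True))

This sidesteps CTransformersTokenizer.get_vocab (the NotImplementedError above), since transformers' own tokenizer implements the full interface.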