Error when trying to quantize

#4
by Daemontatox - opened

Error: Error converting to fp16:

INFO:hf-to-gguf:Loading model: Hermes-2-Theta-Llama-3-8B
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 1363, in set_vocab
    self._set_vocab_sentencepiece()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 586, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 607, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: Hermes-2-Theta-Llama-3-8B/tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 1366, in set_vocab
    self._set_vocab_llama_hf()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 662, in _set_vocab_llama_hf
    vocab = gguf.LlamaHfVocab(self.dir_model)
  File "/home/user/app/llama.cpp/gguf-py/gguf/vocab.py", line 362, in __init__
    raise TypeError('Llama 3 must be converted with BpeVocab')
TypeError: Llama 3 must be converted with BpeVocab

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 3551, in <module>
    main()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 3535, in main
    model_instance.set_vocab()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 1369, in set_vocab
    self._set_vocab_gpt2()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 522, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/user/app/llama.cpp/convert_hf_to_gguf.py", line 384, in get_vocab_base
    assert max(tokenizer.vocab.values()) < vocab_size
AssertionError
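The final AssertionError comes from the converter's check that the highest token id in the tokenizer is strictly smaller than the model's declared vocab_size, which can fail when added special tokens push token ids past the size in config.json. A minimal diagnostic sketch for comparing the two values locally (the directory name is taken from the log above and is an assumption; this snippet is not part of convert_hf_to_gguf.py):

```python
# Diagnostic sketch: compare the tokenizer's highest token id with the
# vocab_size declared in config.json. The converter asserts
# max(tokenizer.vocab.values()) < vocab_size, so these must be consistent.
from transformers import AutoConfig, AutoTokenizer

model_dir = "Hermes-2-Theta-Llama-3-8B"  # assumed local path, from the log

tokenizer = AutoTokenizer.from_pretrained(model_dir)
config = AutoConfig.from_pretrained(model_dir)

max_token_id = max(tokenizer.get_vocab().values())
print(f"max token id : {max_token_id}")
print(f"vocab_size   : {config.vocab_size}")

# Mirrors the failing assertion in get_vocab_base():
if max_token_id >= config.vocab_size:
    print("Mismatch: token ids (e.g. added special tokens) exceed the declared vocab_size.")
```

If the printed values disagree, the tokenizer and config.json are out of sync and the conversion will trip the same assertion.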

Daemontatox changed discussion status to closed
