mradermacher/Llama-3-8B-Ultra-Instruct-GGUF


mradermacher committed on Apr 30

Commit 76a93fb • 1 parent: 99c859b

auto-patch README.md

Files changed (1)

README.md +1 -1


@@ -20,7 +20,7 @@ static quants of https://huggingface.co/elinas/Llama-3-8B-Ultra-Instruct
 You should use `--override-kv tokenizer.ggml.pre=str:llama3` and a current llama.cpp version to work around a bug in llama.cpp that made these quants. (see https://old.reddit.com/r/LocalLLaMA/comments/1cg0z1i/bpe_pretokenization_support_is_now_merged_llamacpp/?share_id=5dBFB9x0cOJi8vbr-Murh)
 <!-- provided-files -->
-weighted/imatrix quants seem not to be available (by me) at this time. If they do not show up a week or so after the static ones, I have probably not planned for them. Feel free to request them by opening a Community Discussion.
+weighted/imatrix quants are available at https://huggingface.co/mradermacher/Llama-3-8B-Ultra-Instruct-i1-GGUF
 ## Usage
 If you are unsure how to use GGUF files, refer to one of [TheBloke's