Commit a6eaa71 by Delta-Vector (parent 0589e74): Update README.md
## Support

## No longer needed - just update to the latest version of llama.cpp or wait for support

To run inference on this model, you'll need to use Aphrodite, vLLM, or EXL2/tabbyAPI, as llama.cpp hasn't yet merged the required pull request to fix the Llama 3.1 rope_freqs issue with custom head dimensions.
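As a rough sketch, serving the model through vLLM's OpenAI-compatible API server might look like the following. The model path and the context-length value are placeholders, not values from this repo:

```shell
# Start vLLM's OpenAI-compatible server (path is a placeholder for the
# downloaded checkpoint; adjust --max-model-len to your hardware)
python -m vllm.entrypoints.openai.api_server \
    --model /path/to/model \
    --max-model-len 8192
```

Once running, any OpenAI-style client can point at `http://localhost:8000/v1` to send completion requests.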

However, you can work around this by quantizing the model yourself to create a functional GGUF file. Note that until [this PR](https://github.com/ggerganov/llama.cpp/pull/9141) is merged, the context will be limited to 8k tokens.
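A minimal sketch of that workaround, assuming a local llama.cpp checkout; the input/output paths and the quantization type shown here are placeholders:

```shell
# Convert the Hugging Face checkpoint to an f16 GGUF
# (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py /path/to/model --outfile model-f16.gguf

# Quantize the result, e.g. to Q4_K_M
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

The resulting `model-Q4_K_M.gguf` can then be loaded by llama.cpp, subject to the 8k-token context limit mentioned above until the linked PR lands.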