Commit a6eaa71 by Delta-Vector (parent 0589e74): Update README.md
## Support

## No longer needed - just update to the latest version of llama.cpp or wait for support

To run inference on this model, you'll need to use Aphrodite, vLLM, or EXL2/tabbyAPI, as llama.cpp hasn't yet merged the required pull request to fix the Llama 3.1 rope_freqs issue with custom head dimensions.
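As a rough sketch, serving the model through vLLM's OpenAI-compatible API server might look like the following. The model path and the context-length value are placeholders, not values from this repo:

```shell
# Start vLLM's OpenAI-compatible server (path is a placeholder for the
# downloaded checkpoint; adjust --max-model-len to your hardware)
python -m vllm.entrypoints.openai.api_server \
    --model /path/to/model \
    --max-model-len 8192
```

Once running, any OpenAI-style client can point at `http://localhost:8000/v1` to send completion requests.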

However, you can work around this by quantizing the model yourself to create a functional GGUF file. Note that until [this PR](https://github.com/ggerganov/llama.cpp/pull/9141) is merged, the context will be limited to 8k tokens.
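A minimal sketch of that workaround, assuming a local llama.cpp checkout; the input/output paths and the quantization type shown here are placeholders:

```shell
# Convert the Hugging Face checkpoint to an f16 GGUF
# (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py /path/to/model --outfile model-f16.gguf

# Quantize the result, e.g. to Q4_K_M
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

The resulting `model-Q4_K_M.gguf` can then be loaded by llama.cpp, subject to the 8k-token context limit mentioned above until the linked PR lands.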