leafspark committed
Commit 67a2f13
1 Parent(s): 25a7ee6

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -12,7 +12,13 @@ pipeline_tag: text-generation
 
 # leafspark/llama-3-8b-instruct-gradient-4194k.Q8_0-GGUF
 
-# Please use iMatrix quants to avoid any output issues, currently debugging the issue
+# Fixing prompt format issues
+- Use iMatrix quants for the Llama 3 prompt format on Q4 and below, or try the fixed Q4_K_M
+- Use ChatML for Q6 and below
+- Use any format for f16
+
+# Issues
+- Context length is not defined correctly in the quant; it is unclear whether this is a llama.cpp issue
 
 This model was converted to GGUF format from [`gradientai/Llama-3-8B-Instruct-Gradient-4194k`](https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-4194k) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-4194k) for more details on the model.
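
Since the workarounds in this commit hinge on which prompt template is fed to the quant, here is a minimal sketch of the two formats the README names. The template strings follow the published Llama 3 Instruct and ChatML conventions; the helper function names are illustrative and not part of this repo or of llama.cpp.

```python
def llama3_prompt(user_message: str) -> str:
    """Llama 3 Instruct format (suggested above for iMatrix quants, Q4 and below)."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def chatml_prompt(user_message: str) -> str:
    """ChatML format (suggested above for Q6 and below)."""
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    # Pass the resulting string as the raw prompt to your llama.cpp frontend.
    print(llama3_prompt("Hello"))
    print(chatml_prompt("Hello"))
```

Either string can be passed verbatim as the prompt when running the GGUF with a llama.cpp-based frontend in raw (non-chat-template) mode.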