Lewdiculous committed
Commit • 98e783d
1 Parent(s): cc67229
Update README.md
README.md
CHANGED
@@ -9,6 +9,7 @@ tags:
 > [!IMPORTANT]
 > **Updated!** <br>
 > Version (**v2**) files added! With imatrix data generated from the FP16 and conversions directly from the BF16. <br>
+> This is more disk and compute intensive, so let's hope we get GPU inference support for BF16 models in llama.cpp. <br>
 > Hopefully avoiding any losses in the model conversion, as has been the recently discussed topic on Llama-3 and GGUF lately. <br>
 > If you are able to test them and notice any issues, let me know in the discussions.
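For context, the v2 workflow this note describes roughly corresponds to the usual llama.cpp GGUF pipeline. The sketch below is an assumption about the steps taken, not the author's exact commands: all paths, filenames, and the calibration text are placeholders, and the script/binary names (`convert_hf_to_gguf.py`, `llama-imatrix`, `llama-quantize`) vary between llama.cpp versions.

```shell
# Hypothetical sketch of the BF16/imatrix workflow (placeholder paths/filenames).

# 1. Convert the HF model to GGUF, keeping the original BF16 weights:
python convert_hf_to_gguf.py ./model-dir --outtype bf16 --outfile model-bf16.gguf

# 2. Also produce an FP16 conversion and generate imatrix data from it,
#    since llama.cpp lacks GPU inference support for BF16 GGUFs:
python convert_hf_to_gguf.py ./model-dir --outtype f16 --outfile model-f16.gguf
./llama-imatrix -m model-f16.gguf -f calibration.txt -o model.imatrix

# 3. Quantize directly from the BF16 GGUF, guided by the imatrix:
./llama-quantize --imatrix model.imatrix model-bf16.gguf model-Q4_K_M.gguf Q4_K_M
```

Keeping both a BF16 and an FP16 GGUF on disk is what makes this route more disk and compute intensive than quantizing from a single FP16 conversion.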