shisa-ai
/

shisa-v1-llama3-8b-gguf

Inference Endpoints

Model card Files Files and versions Community

leonardlin commited on May 27, 2024

Commit

03fc14c

·

verified ·

1 Parent(s): 710c342

Create README.md

Files changed (1) hide show

README.md +6 -0

README.md ADDED Viewed

	@@ -0,0 +1,6 @@

+Tested to work correctly with multiturn w/ the llama3 chat_template:
+```
+./server -ngl 99 -m shisa-v1-llama3-8b.Q5_K_M.gguf --chat-template llama3 -fa -v
+```
+Note: BF16 GGUFs have no CUDA implementation atm: https://github.com/ggerganov/llama.cpp/issues/7211