shisa-ai
/

shisa-v1-llama3-8b-gguf

Inference Endpoints

Model card Files Files and versions Community

shisa-v1-llama3-8b-gguf / README.md

leonardlin's picture

Create README.md

03fc14c verified 7 months ago

|

259 Bytes

	Tested to work correctly with multiturn w/ the llama3 chat_template:
	```
	./server -ngl 99 -m shisa-v1-llama3-8b.Q5_K_M.gguf --chat-template llama3 -fa -v
	```

	Note: BF16 GGUFs have no CUDA implementation atm: https://github.com/ggerganov/llama.cpp/issues/7211