Upload tinyllama-1.1b-chat-v1.0.Q4_1.gguf
#3 by jbochi - opened
I've been working on adding GGUF support to MLX, and Q4_1 seems to be the format most closely aligned with MLX's quantization scheme. Its quantization error is also slightly lower than Q4_0's (tested with gguf-tools).
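To illustrate the difference between the two formats: this is a minimal NumPy sketch of one 32-element block, assuming the llama.cpp block layouts (Q4_0 stores only a scale with an implicit zero-point of 8; Q4_1 stores a scale plus a per-block minimum, which is the same affine scale-and-bias shape MLX quantization uses). The function names and the single random block are illustrative; gguf-tools measures the error over real model tensors.

```python
import numpy as np

def dequant_q4_0(block):
    # Q4_0: symmetric 4-bit, one scale per block, implicit zero-point of 8
    # (d is derived from the value with the largest magnitude, as in llama.cpp).
    amax_idx = np.argmax(np.abs(block))
    d = block[amax_idx] / -8.0 if block[amax_idx] != 0 else 1.0
    q = np.clip(np.round(block / d) + 8, 0, 15)
    return d * (q - 8)  # round-tripped (dequantized) values

def dequant_q4_1(block):
    # Q4_1: asymmetric 4-bit, scale d and minimum m per block,
    # so x ≈ d * q + m with q in [0, 15].
    mn, mx = block.min(), block.max()
    d = (mx - mn) / 15.0 if mx != mn else 1.0
    q = np.clip(np.round((block - mn) / d), 0, 15)
    return d * q + mn

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 32)  # one block of all-positive weights

err_q4_0 = np.sqrt(np.mean((x - dequant_q4_0(x)) ** 2))
err_q4_1 = np.sqrt(np.mean((x - dequant_q4_1(x)) ** 2))
```

On data that isn't centered on zero (like the all-positive block above), Q4_0's symmetric grid wastes half its range, while Q4_1's per-block minimum shifts the grid onto the data, which is where its lower error comes from.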