ggml versions of OpenLLaMA 3B

Use with llama.cpp

Support is now merged into the master branch.

Newer quantizations

llama.cpp now has more quantization types, some of them below 4 bits. These are currently not supported for this model, possibly because some of its weight tensors have dimensions that are not divisible by 256.
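The divisibility guess can be checked with a few lines of Python. This is only a sketch: the block size of 256 comes from the text above, while the tensor shapes (a hidden size of 3200 and a feed-forward size of 8640 for the 3B model, 4096 as a comparison value) and the function name are assumptions for illustration and are not read from the model files.

```python
# Sketch: test whether every dimension of a weight tensor is a multiple of
# the block size mentioned above (256). Shapes below are assumed, not
# loaded from the actual ggml files.
BLOCK = 256

def divisible_by_block(shape, block=BLOCK):
    """Return True if every dimension in `shape` is a multiple of `block`."""
    return all(dim % block == 0 for dim in shape)

# Assumed example shapes (hypothetical names).
assumed_shapes = {
    "attention weight, assumed 3200 x 3200": (3200, 3200),
    "feed-forward weight, assumed 3200 x 8640": (3200, 8640),
    "comparison, 4096 x 4096": (4096, 4096),
}

for name, shape in assumed_shapes.items():
    remainders = [dim % BLOCK for dim in shape]
    print(f"{name}: divisible={divisible_by_block(shape)}, remainders={remainders}")
```

If the assumed sizes are right, neither 3200 nor 8640 is a multiple of 256, which would be consistent with the explanation given above.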

Perplexity on wiki.test.raw

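| Q    | chunk | 600BT  | 1000BT |
|------|-------|--------|--------|
| F16  | [616] | 8.4656 | 7.7861 |
| Q8_0 | [616] | 8.4667 | 7.7874 |
| Q5_1 | [616] | 8.5072 | 7.8424 |
| Q5_0 | [616] | 8.5156 | 7.8474 |
| Q4_1 | [616] | 8.6102 | 8.0483 |
| Q4_0 | [616] | 8.6674 | 8.0962 |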
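
To make the cost of each quantization easier to compare, the perplexities above can be expressed as a percentage increase over F16. The following is a small sketch that uses only the numbers from the table; the variable and function names are made up for illustration.

```python
# Perplexity on wiki.test.raw, copied from the table above:
# quantization -> (600BT, 1000BT)
perplexity = {
    "F16":  (8.4656, 7.7861),
    "Q8_0": (8.4667, 7.7874),
    "Q5_1": (8.5072, 7.8424),
    "Q5_0": (8.5156, 7.8474),
    "Q4_1": (8.6102, 8.0483),
    "Q4_0": (8.6674, 8.0962),
}

def relative_increase(value, baseline):
    """Percentage increase of `value` over `baseline`."""
    return (value - baseline) / baseline * 100.0

baseline = perplexity["F16"]

# quantization -> (percent increase for 600BT, percent increase for 1000BT)
increase = {
    q: tuple(relative_increase(v, b) for v, b in zip(vals, baseline))
    for q, vals in perplexity.items()
}

for q, (p600, p1000) in increase.items():
    print(f"{q:5s} 600BT: +{p600:.2f}%   1000BT: +{p1000:.2f}%")
```

Lower is better: a smaller percentage means the quantized model stays closer to the F16 perplexity.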