jartine committed on
Commit dfd4efd
1 Parent(s): 2979e80

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -50,10 +50,11 @@ AMD64.
 
 ## About Quantization Formats
 
-Your choice of quantization format depends on two things:
+Your choice of quantization format depends on three things:
 
 1. Will it fit in RAM or VRAM?
 2. Is your use case reading (e.g. summarization) or writing (e.g. chatbot)?
+3. llamafiles bigger than 4.30 GB are hard to run on Windows (see [gotchas](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas))
 
 Good quants for writing (eval speed) are Q5\_K\_M, and Q4\_0. Text
 generation is bounded by memory speed, so smaller quants help.