jartine committed on
Commit
c6693eb
1 Parent(s): 2f7275b

Update README.md

Files changed (1)
  1. README.md +2 -6
README.md CHANGED
@@ -61,12 +61,8 @@ model. You can prompt the model for completions on the command line too:
 ```
 
 This model has a max context window size of 128k tokens. By default, a
-context window size of 512 tokens is used. You can use a larger context
-window by passing the `-c 8192` flag. The software currently has
-limitations that may prevent scaling to the full 128k size. See our
-[Phi-3-medium-128k-instruct-llamafile](https://huggingface.co/Mozilla/Phi-3-medium-128k-instruct-llamafile)
-repository for llamafiles that are known to work with a 128kb context
-size.
+context window size of 8192 tokens is used. You can use a larger context
+window by passing the `-c 131072` flag.
 
 On GPUs with sufficient RAM, the `-ngl 999` flag may be passed to use
 the system's NVIDIA or AMD GPU(s). On Windows, only the graphics card
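
Note on the values in this change: the `-c 131072` flag appears to follow from reading "128k" as 128 × 1024 tokens, and the default of 8192 likewise equals 8 × 1024. The sketch below only illustrates that arithmetic and how the two flags quoted in the README could be assembled into a command line. The file name `./model.llamafile` and the helpers `context_flag` and `build_command` are hypothetical and are not part of llamafile or this repository.

```python
from typing import List, Optional

# Assumption: "k" in the README means 1024 tokens, which matches 128k -> 131072.
TOKENS_PER_K = 1024


def context_flag(size_k: int) -> List[str]:
    """Return the `-c` argument pair for a context window of size_k * 1024 tokens."""
    return ["-c", str(size_k * TOKENS_PER_K)]


def build_command(llamafile: str, size_k: int,
                  gpu_layers: Optional[int] = None) -> List[str]:
    """Assemble an argument list using only the flags quoted in the README diff."""
    cmd = [llamafile] + context_flag(size_k)
    if gpu_layers is not None:
        # `-ngl 999` is the GPU offload flag mentioned in the README text.
        cmd += ["-ngl", str(gpu_layers)]
    return cmd


if __name__ == "__main__":
    # "./model.llamafile" is a placeholder path, not a real file from this repo.
    print(" ".join(build_command("./model.llamafile", 128, gpu_layers=999)))
```

Building the arguments as a list rather than a single string keeps the sketch safe to pass to `subprocess.run` without shell quoting, should someone adapt it.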