Update README.md
README.md CHANGED

```diff
@@ -113,6 +113,13 @@ driver needs to be installed if you own an NVIDIA GPU. On Windows, if
 you have an AMD GPU, you should install the ROCm SDK v6.1 and then pass
 the flags `--recompile --gpu amd` the first time you run your llamafile.
 
+On NVIDIA GPUs, by default, the prebuilt tinyBLAS library is used to
+perform matrix multiplications. This is open source software, but it
+doesn't go as fast as closed source cuBLAS. If you have the CUDA SDK
+installed on your system, then you can pass the `--recompile` flag to
+build a GGML CUDA library just for your system that uses cuBLAS. This
+ensures you get maximum performance.
+
 For further information, please see the [llamafile
 README](https://github.com/mozilla-ocho/llamafile/).
```
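The flag choices described in the diff can be sketched as a small shell helper. The flags `--recompile` and `--gpu amd` come from the README text above; the `choose_flags` function and its vendor-detection logic are illustrative only, not part of llamafile itself.

```shell
# Illustrative sketch (not part of llamafile): pick the README-recommended
# flags for the first run of a llamafile, given the GPU vendor.
choose_flags() {
  case "$1" in
    amd)    echo "--recompile --gpu amd" ;;  # Windows + AMD: needs ROCm SDK v6.1
    nvidia) echo "--recompile" ;;            # rebuild with cuBLAS if the CUDA SDK is installed
    *)      echo "" ;;                       # default: prebuilt tinyBLAS, no flags needed
  esac
}

# Example first run on an NVIDIA machine with the CUDA SDK installed
# (model filename is hypothetical):
# ./model.llamafile $(choose_flags nvidia)
choose_flags nvidia
```

After the one-time `--recompile` build, subsequent runs reuse the cached GGML CUDA library, so the flag is only needed the first time.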