starble-dev committed · Commit 7310274 · Parent(s): 5332a93
Update README.md

README.md CHANGED
@@ -13,7 +13,30 @@ tags:
This model is the original Mistral-Nemo-Instruct-2407 converted to GGUF and quantized using **llama.cpp**.

**How to Use:**

As of July 19, 2024, llama.cpp does not support Mistral-Nemo-Instruct-2407. However, you can use it by building from source using iamlemec's **mistral-nemo** branch at the [llama.cpp GitHub repository](https://github.com/iamlemec/llama.cpp/tree/mistral-nemo).

```
git clone -b mistral-nemo https://github.com/iamlemec/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release
```
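
If the build succeeds, the executables typically end up under `build/bin/` in llama.cpp's usual CMake layout (with Visual Studio generators they may land in `build/bin/Release/` instead). A quick sanity check, assuming that layout:

```
# assumes llama.cpp's usual CMake output directory; adjust the path for your generator
./build/bin/llama-server --help
```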

Recommended to use `cmake -B build -DGGML_CUDA=ON` if you're using a CUDA-compatible GPU.

If the build takes too long, use `cmake --build build --config Release -j 4`, which uses 4 threads to build. Adjust the number to match the number of physical cores on your CPU.
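
Put together, a CUDA-enabled parallel build would look like this (same flags as above; adjust `-j` to your core count):

```
# configure with CUDA enabled, then build in parallel (4 jobs here)
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j 4
```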

Use:
```
llama-server.exe -m .\models\Mistral-Nemo-12B-Instruct-2407-Q8_0.gguf -b 512 -ub 512 -c 4096 -ngl 100
```

Set `-b` to the batch size.
Set `-ub` to the physical batch size.
Set `-c` to the context size.
Set `-ngl` to the number of layers to offload to the GPU.
Change the path to where the model is actually stored.
If you need more clarification on parameters, check out the [llama.cpp Server Docs](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md).
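
Once the server is up, you can send it a quick test request from another terminal. A minimal sketch, assuming the default bind address (`127.0.0.1:8080`) and the `/completion` endpoint described in the server docs linked above (written for a POSIX shell; adjust quoting for Windows cmd):

```
# assumes the server is listening on the default 127.0.0.1:8080
curl http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a haiku about GGUF quantization.", "n_predict": 64}'
```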

**License:**
Apache 2.0