add sharded model example #5
by MaziyarPanahi · opened

README.md CHANGED
@@ -35,6 +35,14 @@ quantized_by: MaziyarPanahi
## Description

[MaziyarPanahi/WizardLM-2-8x22B-GGUF](https://huggingface.co/MaziyarPanahi/WizardLM-2-8x22B-GGUF) contains GGUF format model files for [microsoft/WizardLM-2-8x22B](https://huggingface.co/microsoft/WizardLM-2-8x22B).

+## Load sharded model
+
+`llama_load_model_from_file` will detect the number of files and load the additional tensors from the remaining files.
+
+```sh
+llama.cpp/main -m WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
+```
+
## Prompt template
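The sharded-load behaviour relies on llama.cpp's split naming scheme visible in the command above (`-00001-of-00005.gguf`). As a minimal sketch — assuming the zero-padded five-digit `<name>-<i>-of-<n>.gguf` convention, with `shard_names` a hypothetical helper, not part of llama.cpp — the remaining shard filenames can be derived from the first one like this:

```python
import re

def shard_names(first_shard: str) -> list[str]:
    """List every file in a sharded GGUF set, given the first shard's name.

    Assumes the split naming scheme <name>-<i>-of-<n>.gguf with
    zero-padded five-digit indices, as in the example command above.
    """
    m = re.match(r"^(.*)-(\d{5})-of-(\d{5})\.gguf$", first_shard)
    if not m:
        raise ValueError("not a sharded GGUF filename")
    prefix, total = m.group(1), int(m.group(3))
    # Rebuild each shard name from the shared prefix and the total count.
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]
```

Only the first shard is passed on the command line; the loader enumerates the rest of the set in this fashion.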