add sharded model example #5
by MaziyarPanahi · opened

README.md CHANGED
@@ -35,6 +35,14 @@ quantized_by: MaziyarPanahi
## Description

[MaziyarPanahi/WizardLM-2-8x22B-GGUF](https://huggingface.co/MaziyarPanahi/WizardLM-2-8x22B-GGUF) contains GGUF format model files for [microsoft/WizardLM-2-8x22B](https://huggingface.co/microsoft/WizardLM-2-8x22B).

+## Load sharded model
+
+`llama_load_model_from_file` will detect the number of files and load the additional tensors from the remaining files.
+
+```sh
+llama.cpp/main -m WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
+```
+
## Prompt template
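The sharded-load behaviour relies on llama.cpp's split naming scheme visible in the command above (`-00001-of-00005.gguf`). As a minimal sketch — assuming the zero-padded five-digit `<name>-<i>-of-<n>.gguf` convention, with `shard_names` a hypothetical helper, not part of llama.cpp — the remaining shard filenames can be derived from the first one like this:

```python
import re

def shard_names(first_shard: str) -> list[str]:
    """List every file in a sharded GGUF set, given the first shard's name.

    Assumes the split naming scheme <name>-<i>-of-<n>.gguf with
    zero-padded five-digit indices, as in the example command above.
    """
    m = re.match(r"^(.*)-(\d{5})-of-(\d{5})\.gguf$", first_shard)
    if not m:
        raise ValueError("not a sharded GGUF filename")
    prefix, total = m.group(1), int(m.group(3))
    # Rebuild each shard name from the shared prefix and the total count.
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]
```

Only the first shard is passed on the command line; the loader enumerates the rest of the set in this fashion.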