itlwas committed on
Commit
af4d284
·
verified ·
1 Parent(s): 442d88a

Upload README.md with huggingface_hub

  1. README.md +92 -0
README.md ADDED
@@ -0,0 +1,92 @@
+ ---
+ base_model: ehristoforu/Gistral-16B
+ datasets:
+ - HuggingFaceH4/grok-conversation-harmless
+ - HuggingFaceH4/ultrachat_200k
+ - HuggingFaceH4/ultrafeedback_binarized_fixed
+ - HuggingFaceH4/cai-conversation-harmless
+ - meta-math/MetaMathQA
+ - emozilla/yarn-train-tokenized-16k-mistral
+ - snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
+ - microsoft/orca-math-word-problems-200k
+ - m-a-p/Code-Feedback
+ - teknium/openhermes
+ - lksy/ru_instruct_gpt4
+ - IlyaGusev/ru_turbo_saiga
+ - IlyaGusev/ru_sharegpt_cleaned
+ - IlyaGusev/oasst1_ru_main_branch
+ library_name: transformers
+ tags:
+ - mistral
+ - gistral
+ - gistral-16b
+ - multilingual
+ - code
+ - 128k
+ - metamath
+ - grok-1
+ - anthropic
+ - openhermes
+ - instruct
+ - merge
+ - llama-cpp
+ - gguf-my-repo
+ language:
+ - en
+ - fr
+ - ru
+ - de
+ - ja
+ - ko
+ - zh
+ - it
+ - uk
+ - multilingual
+ - code
+ pipeline_tag: text-generation
+ license: apache-2.0
+ ---
+
+ # itlwas/Gistral-16B-Q4_K_M-GGUF
+ This model was converted to GGUF format from [`ehristoforu/Gistral-16B`](https://huggingface.co/ehristoforu/Gistral-16B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
+ Refer to the [original model card](https://huggingface.co/ehristoforu/Gistral-16B) for more details on the model.
+
+ ## Use with llama.cpp
+ Install llama.cpp through Homebrew (works on macOS and Linux):
+
+ ```bash
+ brew install llama.cpp
+ ```
+ Then invoke either the llama.cpp CLI or the server.
+
+ ### CLI:
+ ```bash
+ llama-cli --hf-repo itlwas/Gistral-16B-Q4_K_M-GGUF --hf-file gistral-16b-q4_k_m.gguf -p "The meaning to life and the universe is"
+ ```
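The `--hf-repo` and `--hf-file` flags fetch the model file from the Hub. As a minimal sketch for downloading it yourself instead, assuming the file sits on the repo's `main` branch, the snippet below builds the direct download URL. The variable names and the `download_model` helper are illustrative and not part of llama.cpp.

```shell
#!/bin/sh
# Repository and file names are taken from the CLI command above.
HF_REPO="itlwas/Gistral-16B-Q4_K_M-GGUF"
HF_FILE="gistral-16b-q4_k_m.gguf"

# Assumption: the file is on the "main" branch of the repo.
MODEL_URL="https://huggingface.co/${HF_REPO}/resolve/main/${HF_FILE}"
echo "$MODEL_URL"

# Hypothetical helper: download the file with curl, following redirects
# and resuming a partial download if one exists.
download_model() {
    curl -L -C - -o "$HF_FILE" "$MODEL_URL"
}

# Uncomment to download, then point llama-cli at the local file with -m:
# download_model
# llama-cli -m "$HF_FILE" -p "The meaning to life and the universe is"
```

Running from a local file with `-m` avoids depending on network access at inference time.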
+
+ ### Server:
+ ```bash
+ llama-server --hf-repo itlwas/Gistral-16B-Q4_K_M-GGUF --hf-file gistral-16b-q4_k_m.gguf -c 2048
+ ```
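Once the server is running, it can be queried over HTTP. The sketch below assumes the server listens on llama-server's default address, `127.0.0.1:8080`, and uses its `/completion` endpoint. The `query_server` helper and the variable names are illustrative.

```shell
#!/bin/sh
# Assumption: llama-server is running locally on its default port (8080).
SERVER_URL="http://127.0.0.1:8080"

# Request body: the prompt and the maximum number of tokens to generate.
PAYLOAD='{"prompt": "The meaning to life and the universe is", "n_predict": 64}'

# Hypothetical helper: POST the payload to the /completion endpoint.
query_server() {
    curl -s -X POST "${SERVER_URL}/completion" \
        -H "Content-Type: application/json" \
        -d "$PAYLOAD"
}

# Show the request body that would be sent.
echo "$PAYLOAD"

# Uncomment once the server is up:
# query_server
```

The `-c 2048` flag in the server command sets the context size in tokens, so the prompt plus `n_predict` should stay within that budget.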
+
+ Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
+
+ Step 1: Clone llama.cpp from GitHub.
+ ```bash
+ git clone https://github.com/ggerganov/llama.cpp
+ ```
+
+ Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
+ ```bash
+ cd llama.cpp && LLAMA_CURL=1 make
+ ```
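Newer llama.cpp checkouts build with CMake rather than `make`. If the `make` command above fails on your checkout, a rough equivalent looks like the sketch below, assuming the `LLAMA_CURL` and `GGML_CUDA` CMake options of recent versions. The `build_llama_cpp` wrapper is illustrative; check the repository's build documentation for the options that match your version.

```shell
#!/bin/sh
# Hypothetical wrapper around a CMake build of llama.cpp.
# $1: path to the llama.cpp checkout.
# Remaining arguments: extra CMake options, for example -DGGML_CUDA=ON
# for NVIDIA GPUs (option name assumed from recent llama.cpp versions).
build_llama_cpp() {
    src_dir="$1"
    shift
    cmake -S "$src_dir" -B "$src_dir/build" -DLLAMA_CURL=ON "$@" &&
        cmake --build "$src_dir/build" --config Release
}

# Uncomment to build; binaries are then expected under llama.cpp/build/bin/.
# build_llama_cpp llama.cpp
```

With a CMake build, the binaries used in the next step would be invoked as `./build/bin/llama-cli` and `./build/bin/llama-server` from inside the checkout.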
+
+ Step 3: Run inference through the `llama-cli` binary.
+ ```bash
+ ./llama-cli --hf-repo itlwas/Gistral-16B-Q4_K_M-GGUF --hf-file gistral-16b-q4_k_m.gguf -p "The meaning to life and the universe is"
+ ```
+ or start the server:
+ ```bash
+ ./llama-server --hf-repo itlwas/Gistral-16B-Q4_K_M-GGUF --hf-file gistral-16b-q4_k_m.gguf -c 2048
+ ```