---
language:
- ja
tags:
- mistral
- mixtral
- not-for-all-audiences
- nsfw
pipeline_tag: text-generation
---

# chatntq_chatvector-MoE-Antler_chatvector-2x7B-GGUF

This is a GGUF conversion of [Sdff-Ltba/chatntq_chatvector-MoE-Antler_chatvector-2x7B](https://huggingface.co/Sdff-Ltba/chatntq_chatvector-MoE-Antler_chatvector-2x7B).
It was quantized using an importance matrix (iMatrix).

## Quantization procedure

The following commands were run:
```
# Convert the HF model to a 16-bit GGUF file
python ./llama.cpp/convert.py ./chatntq_chatvector-MoE-Antler_chatvector-2x7B --outtype f16 --outfile ./gguf-model_f16.gguf
# Compute an importance matrix over 32 chunks of the WikiText training data
./llama.cpp/imatrix -m ./gguf-model_f16.gguf -f ./wiki.train.raw -o ./gguf-model_f16.imatrix --chunks 32
# Quantize to IQ3_XXS using the importance matrix
./llama.cpp/quantize --imatrix ./gguf-model_f16.imatrix ./gguf-model_f16.gguf ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf iq3_xxs
```
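
As a minimal usage sketch (not part of the original card), the resulting file can be run with the `main` example binary from the same llama.cpp checkout; the prompt and token count below are placeholders:
```
# Hypothetical run: sample 128 tokens from the IQ3_XXS quantized model
./llama.cpp/main -m ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf -p "こんにちは。" -n 128
```
IQ3_XXS is one of the smallest quantization types llama.cpp offers, which is presumably why the importance matrix step is used here: low-bit i-quants lose noticeably more quality without one.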