Update README.md
README.md
---
inference: false
language:
- ja
---
# weblab-10b-instruction-sft-GPTQ

The original model is [weblab-10b-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft), a Japanese-centric multilingual GPT-NeoX model with 10 billion parameters.
This model is a quantized (smaller) version of the original model.

There are currently two well-known quantization methods:

(1) GPTQ (this model)
The model size is smaller and execution is faster, but inference performance may be slightly worse.
You need the AutoGPTQ library to use this model.

(2) llama.cpp ([matsuolab-weblab-10b-instruction-sft-gguf](https://huggingface.co/mmnga/matsuolab-weblab-10b-instruction-sft-gguf)), created by mmnga
It can run on a CPU-only machine, but it is a little slow, especially for long text.
### sample code

At least one GPU is currently required due to a limitation of the Accelerate library, so this model cannot be run on the free tier of Hugging Face Spaces.
Try it on [Google Colab (under development)](https://github.com/webbigdata-jp/python_sample/blob/main/weblab_10b_instruction_sft_GPTQ_sample.ipynb).

First, install the AutoGPTQ library:
```
pip install auto-gptq
```
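After installing it, loading and generating should look roughly like the minimal sketch below. This is an illustrative, untested example, not the official sample: the repo id is a placeholder, and the prompt format and generation parameters are assumptions carried over from the original model's card. See the Colab notebook above for the maintained version.

```
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# NOTE: placeholder repo id for illustration; replace it with this model's
# actual Hugging Face repo id.
model_id = "dahara1/weblab-10b-instruction-sft-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the GPTQ-quantized weights; a GPU is required (see the note above).
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,  # assumes the weights are stored as safetensors
)

# Prompt format of the original weblab-10b-instruction-sft model
# (assumed unchanged by quantization).
prompt = (
    "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。\n\n"
    "### 指示:\n日本の有名な観光地を教えてください。\n\n### 応答:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```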