Update README.md
README.md
---
inference: false
language:
- ja
---
# weblab-10b-instruction-sft-GPTQ

The original model is [weblab-10b-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft), a Japanese-centric multilingual GPT-NeoX model with 10 billion parameters.
This model is a quantized (smaller) version of the original model.

There are currently two well-known quantization methods:

(1) GPTQ (this model)
The model size is smaller and execution is faster, but inference performance may be slightly worse.
You need the AutoGPTQ library to use this model.

(2) llama.cpp ([matsuolab-weblab-10b-instruction-sft-gguf](https://huggingface.co/mmnga/matsuolab-weblab-10b-instruction-sft-gguf)), created by mmnga
It can run on a CPU-only machine, but it is a little slow, especially for long text.
### sample code

At least one GPU is currently required due to a limitation of the Accelerate library, so this model cannot be run on the free tier of Hugging Face Spaces.
Try it on [Google Colab (under development)](https://github.com/webbigdata-jp/python_sample/blob/main/weblab_10b_instruction_sft_GPTQ_sample.ipynb).

First, install the AutoGPTQ library:
```
pip install auto-gptq
```
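After installing it, loading and generating should look roughly like the minimal sketch below. This is an illustrative, untested example, not the official sample: the repo id is a placeholder, and the prompt format and generation parameters are assumptions carried over from the original model's card. See the Colab notebook above for the maintained version.

```
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# NOTE: placeholder repo id for illustration; replace it with this model's
# actual Hugging Face repo id.
model_id = "dahara1/weblab-10b-instruction-sft-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the GPTQ-quantized weights; a GPU is required (see the note above).
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,  # assumes the weights are stored as safetensors
)

# Prompt format of the original weblab-10b-instruction-sft model
# (assumed unchanged by quantization).
prompt = (
    "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。\n\n"
    "### 指示:\n日本の有名な観光地を教えてください。\n\n### 応答:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```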