	---
	license: apache-2.0
	language:
	- zh
	- en
	pipeline_tag: question-answering
	---

	# Chinese-Alpaca-Plus-13B-GPTQ

	This repository contains a 4-bit GPTQ-quantised version of [Yiming Cui's Chinese-LLaMA-Alpaca 13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca).

	The quantisation was done with [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).

	## Model Details

	### Model Description

	- Developed by: [ymcui (Yiming Cui)](https://github.com/ymcui)
	- Shared by: Known Rabbit
	- Language(s) (NLP): Chinese, English
	- License: Apache 2.0
	- Finetuned from model: LLaMA

	The original GitHub project: [ymcui/Chinese-LLaMA-Alpaca: Chinese LLaMA & Alpaca large language models with local CPU/GPU deployment](https://github.com/ymcui/Chinese-LLaMA-Alpaca)

	> In order to promote the open research of large models in the Chinese NLP community, this project open sourced the Chinese LLaMA model and the Alpaca large model with fine-tuned instructions. Based on the original LLaMA, these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves the basic semantic understanding of Chinese. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, which significantly improves the model's ability to understand and execute instructions. For details, please refer to the technical report (Cui, Yang, and Yao, 2023).



	### Model Sources

	- Repository: https://github.com/ymcui/Chinese-LLaMA-Alpaca
	- Paper: [[2304.08177] Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca](https://arxiv.org/abs/2304.08177)

	## Uses

	### Direct Use

	#### How to easily download and use this model in text-generation-webui

	Open the text-generation-webui UI as normal.

	1. Click the Model tab.
	2. Under Download custom model or LoRA, enter `rabitt/Chinese-Alpaca-Plus-13B-GPTQ`.
	3. Click Download.
	4. Wait until it says it's finished downloading.
	5. Click the Refresh icon next to Model in the top left.
	6. In the Model drop-down: choose the model you just downloaded, `Chinese-Alpaca-Plus-13B-GPTQ`.
	7. If you see an error like `Error no file named pytorch_model.bin ...` in the bottom right, ignore it; it is temporary.
	8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`.
	9. Click Save settings for this model in the top right.
	10. Click Reload the Model in the top right.
	11. Once it says it's loaded, click the Text Generation tab and enter a prompt!
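For a scripted setup, the download in steps 1 to 4 can also be done from a shell. The following is a minimal sketch, assuming `git` and `git-lfs` are installed and that text-generation-webui reads models from a `models/` folder inside a checkout at `./text-generation-webui`. The `WEBUI_DIR` variable and the `download_model` function are illustrative names, not part of the webui.

```bash
# Sketch only: WEBUI_DIR and download_model are hypothetical names.
REPO_ID="rabitt/Chinese-Alpaca-Plus-13B-GPTQ"
WEBUI_DIR="${WEBUI_DIR:-./text-generation-webui}"

# Drop the "rabitt/" owner prefix so the folder matches the name
# chosen in the Model drop-down.
MODEL_DIR="${WEBUI_DIR}/models/${REPO_ID##*/}"

download_model() {
    git lfs install    # make sure the large .safetensors file is fetched
    git clone "https://huggingface.co/${REPO_ID}" "${MODEL_DIR}"
}

echo "Model will be stored in: ${MODEL_DIR}"
# download_model    # uncomment to start the download (several GB)
```

After the clone finishes, continue from step 5 above to refresh the model list and set the GPTQ parameters.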



	## Training Details

	### Training Procedure

	1. Download the models from the following links:

	* Original LLaMA: https://github.com/facebookresearch/llama/pull/73
	* Chinese-LLaMA-Plus-13B
	  * [ziqingyang/chinese-llama-plus-lora-13b · Hugging Face](https://huggingface.co/ziqingyang/chinese-llama-plus-lora-13b)
	  * [chinese_llama_plus_lora_13b.zip (Baidu Netdisk)](https://pan.baidu.com/s/1VGpNlrLx5zHuNzLOcTG-xw?pwd=8cvd)
	* Chinese-Alpaca-Plus-13B
	  * [ziqingyang/chinese-alpaca-plus-lora-13b · Hugging Face](https://huggingface.co/ziqingyang/chinese-alpaca-plus-lora-13b)
	  * [chinese_alpaca_plus_lora_13b.zip (Baidu Netdisk)](https://pan.baidu.com/s/1Mew4EjBlejWBBB6_WW6vig?pwd=mf5w)

	2. Convert LLaMA to Hugging Face (HF) format with `convert_llama_weights_to_hf.py`

	```bash
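	# Fetch the conversion script from the transformers repository
	wget https://github.com/huggingface/transformers/raw/main/src/transformers/models/llama/convert_llama_weights_to_hf.py
	# Convert the original 13B weights, forcing the pure-Python protobuf implementation
	PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python \
	python convert_llama_weights_to_hf.py \
	    --input_dir ./llama \
	    --model_size 13B \
	    --output_dir ./llama-13b-hf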
	```
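Before merging, it can help to confirm that the conversion wrote a Hugging Face model directory. The following is a small sketch: the helper name `check_hf_dir` is hypothetical, and it only looks for `config.json`, which Hugging Face model directories contain. It does not validate the weights themselves.

```bash
# Sketch only: check_hf_dir is an illustrative helper, not part of transformers.
check_hf_dir() {
    local dir="$1"
    if [ -f "${dir}/config.json" ]; then
        echo "ok: ${dir} looks like a Hugging Face model directory"
    else
        echo "missing: ${dir}/config.json" >&2
        return 1
    fi
}

# Check the output directory used in the conversion command above.
check_hf_dir ./llama-13b-hf || echo "run the conversion step first"
```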
	3. Merge the `Chinese-LLaMA-Plus-13B` and `Chinese-Alpaca-Plus-13B` LoRA weights into the converted LLaMA model with `merge_llama_with_chinese_lora.py`

	```bash
	# Fetch the merge script from the Chinese-LLaMA-Alpaca repository
	wget https://github.com/ymcui/Chinese-LLaMA-Alpaca/raw/main/scripts/merge_llama_with_chinese_lora.py
	# Apply both LoRA adapters to the HF-format base model and save the merged result
	python merge_llama_with_chinese_lora.py \
	    --base_model ./llama-13b-hf \
	    --lora_model ./Chinese-LLaMA-Plus-LoRA-13B,./Chinese-Alpaca-Plus-LoRA-13B \
	    --output_type huggingface \
	    --output_dir ./Chinese-Alpaca-Plus-13B
	```
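The `--lora_model` value in the command above is a comma-separated list with the LLaMA-Plus LoRA first and the Alpaca-Plus LoRA second. The sketch below builds that value while keeping the order explicit and warns if an adapter folder looks incomplete. The `join_loras` helper is hypothetical, and the check assumes the unpacked LoRA folders follow the usual PEFT layout with an `adapter_config.json` file.

```bash
# Sketch only: join_loras is an illustrative helper, not part of the merge script.
join_loras() {
    local IFS=,
    echo "$*"    # "$*" joins the arguments with the first character of IFS
}

# Assumption: each unpacked LoRA folder contains adapter_config.json (PEFT layout).
for dir in ./Chinese-LLaMA-Plus-LoRA-13B ./Chinese-Alpaca-Plus-LoRA-13B; do
    [ -f "${dir}/adapter_config.json" ] || echo "warning: ${dir}/adapter_config.json not found" >&2
done

# Base LoRA first, instruction LoRA second, as in the merge command.
LORA_ARG=$(join_loras ./Chinese-LLaMA-Plus-LoRA-13B ./Chinese-Alpaca-Plus-LoRA-13B)
echo "--lora_model ${LORA_ARG}"
```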
	4. Quantise the model with `GPTQ-for-LLaMa`

	```bash
	mkdir -p Chinese-Alpaca-Plus-13B-GPTQ
	git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
	cd GPTQ-for-LLaMa
	# Optionally select the GPU to use:
	# export CUDA_VISIBLE_DEVICES=0
	# Quantise to 4 bits with group size 128, using c4 as the calibration dataset
	python llama.py ../Chinese-Alpaca-Plus-13B c4 \
	    --wbits 4 --true-sequential --act-order --groupsize 128 \
	    --save_safetensors ../Chinese-Alpaca-Plus-13B-GPTQ/Chinese-Alpaca-Plus-13B-GPTQ-4bit-128g.safetensors
	```
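As a rough sanity check on the output file, the size implied by `--wbits 4` and `--groupsize 128` can be estimated by hand. The sketch below assumes about 13 billion quantised weights, plus one 16-bit scale and one 4-bit zero point per group of 128 weights. It ignores layers that are left unquantised (such as embeddings), so treat the result as an approximation rather than the exact file size.

```bash
# Sketch only: all figures below are assumptions for a back-of-envelope estimate.
PARAMS=13000000000    # approximate number of quantised weights
WBITS=4               # matches --wbits 4
GROUPSIZE=128         # matches --groupsize 128
SCALE_BITS=16         # assumed: one fp16 scale per group
ZERO_BITS=4           # assumed: one 4-bit zero point per group

# Packed weights: 4 bits each.
weight_bytes=$(( PARAMS * WBITS / 8 ))
# Per-group metadata: one scale and one zero point for every 128 weights.
group_bytes=$(( PARAMS / GROUPSIZE * (SCALE_BITS + ZERO_BITS) / 8 ))
total_bytes=$(( weight_bytes + group_bytes ))

echo "packed weights: $(( weight_bytes / 1000000 )) MB"
echo "group metadata: $(( group_bytes / 1000000 )) MB"
echo "estimated total: $(( total_bytes / 1000000 )) MB"
```

Under these assumptions the group metadata adds only a few percent on top of the packed weights, which is why a group size of 128 is a common trade-off between accuracy and file size.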

	## Citation

	BibTeX:

	```tex
	@article{chinese-llama-alpaca,
	title={Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca},
	author={Cui, Yiming and Yang, Ziqing and Yao, Xin},
	journal={arXiv preprint arXiv:2304.08177},
	url={https://arxiv.org/abs/2304.08177},
	year={2023}
	}
	```