
This is a 4-bit quantization of chavinlo/alpaca-native (cecc16d), produced with qwopqwop200/GPTQ-for-LLaMa (5cdfad2).

Quantization was invoked as follows:

llama.py /output/path c4 --wbits 4 --groupsize 128 --save alpaca7b-4bit.pt
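
For readers unfamiliar with the two flags: `--wbits 4` sets the number of bits stored per weight, and `--groupsize 128` sets how many consecutive weights share one set of quantization parameters. The sketch below illustrates only that storage layout, using plain round-to-nearest quantization. It is not the GPTQ algorithm, which additionally adjusts the remaining weights to compensate for quantization error using calibration data (`c4` in the command above). All function names here are hypothetical and do not come from the GPTQ-for-LLaMa code.

```python
def quantize_group(weights, wbits=4):
    """Round-to-nearest quantization of one group of floats.

    Returns integer codes in [0, 2**wbits - 1] plus the scale and
    minimum needed to map the codes back to floats.
    """
    levels = 2 ** wbits - 1                      # 15 steps for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


def quantize_row(row, wbits=4, groupsize=128):
    """Split a row of weights into groups and quantize each independently."""
    return [
        quantize_group(row[start:start + groupsize], wbits)
        for start in range(0, len(row), groupsize)
    ]


def dequantize_row(groups):
    """Reassemble an approximate row from its quantized groups."""
    out = []
    for codes, scale, lo in groups:
        out.extend(dequantize_group(codes, scale, lo))
    return out
```

Calling `quantize_row(row)` on a list of 256 floats yields two groups, each with its own scale and minimum. Smaller groups track local weight ranges more closely at the cost of storing more scales.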

Inference example, run from the GPTQ repo at the commit referenced above:

(gptq) [root@gpu03 GPTQ-for-LLaMa]# CUDA_VISIBLE_DEVICES=0 python llama_inference.py /root/alpaca-native-4bit --wbits 4 --groupsize 128 --load /root/alpaca-native-4bit/alpaca7b-4bit.pt --max_length 300 --text "$(cat test_prompt.txt)"
Loading model ...
Done.
  ### Instruction: What is an alpaca? How is it different from a llama?
### Response:  Alpacas are soft and gentle, while llamas are stubborn and independent.</s>
(gptq) [root@gpu03 GPTQ-for-LLaMa]# CUDA_VISIBLE_DEVICES=0 python llama_inference.py /root/alpaca-native-4bit --wbits 4 --groupsize 128 --load /root/alpaca-native-4bit/alpaca7b-4bit.pt --max_length 300 --text "$(cat test_prompt.txt)"
Loading model ...
Done.
  ### Instruction: What is an alpaca? How is it different from a llama?
### Response:  An alpaca is a small, domesticated species of livestock from the Andes region of South America. It is typically kept as a pet, and its fibers can be used for various purposes, such as making clothing and crafts. Alpacas are typically brown or black, and their ears and tails are often moved.
Although it is different from a llama, the two animals are often compared to when referring to their behavior.</s>
(gptq) [root@gpu03 GPTQ-for-LLaMa]# md5sum /root/alpaca-native-4bit/alpaca7b-4bit.pt
74849953cc54e313b972d2cc9a05c24b  /root/alpaca-native-4bit/alpaca7b-4bit.pt
(gptq) [root@gpu03 GPTQ-for-LLaMa]#
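
The checksum printed by `md5sum` above can also be checked from Python after downloading the checkpoint, which is handy on systems without `md5sum`. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the path you pass in is whatever location you saved `alpaca7b-4bit.pt` to.

```python
import hashlib

# MD5 of alpaca7b-4bit.pt as reported by md5sum in the transcript above.
EXPECTED_MD5 = "74849953cc54e313b972d2cc9a05c24b"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 digest."""
    return file_md5(path) == expected.lower()
```

Reading in chunks keeps memory use flat regardless of checkpoint size. A mismatch usually points to an incomplete or corrupted download.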