daptheHuman committed on
Commit
d654285
1 Parent(s): 1eca92e

Create README.md

Files changed (1)
  1. README.md +93 -0
README.md ADDED
@@ -0,0 +1,93 @@
+ ---
+ base_model: Ichsan2895/Merak-7B-v4
+ license: llama2
+ datasets:
+ - allenai/c4
+ language:
+ - id
+ tags:
+ - gptq
+ - mistral
+ - indonesia
+ ---
+
+ # Merak-7B-v4 GPTQ
+ <!-- markdownlint-disable MD041 -->
+
+ <!-- header start -->
+ <!-- 200823 -->
+ <div style="margin-left: auto; margin-right: auto">
+ <img src="https://i.imgur.com/aMm54ZY.jpg" alt="Merak" style="width: 300px; margin:auto">
+ </div>
+ <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
+ <!-- header end -->
+
+ The quantization process uses the [c4/id](https://huggingface.co/datasets/allenai/c4/blob/main/multilingual/c4-id.tfrecord-00000-of-01024.json.gz) dataset for calibration.
+
+ [Merak-7B-v4 GPTQ](https://huggingface.co/daptheHuman/Merak-7B-v4-GPTQ) is a GPTQ-quantized version of [Ichsan2895/Merak-7B-v4](https://huggingface.co/Ichsan2895/Merak-7B-v4).
+
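For readers who want to try a similar quantization themselves, the sketch below shows one way a c4-id shard could be turned into GPTQ calibration text. It assumes the shard is gzip-compressed JSON Lines with a `text` field per record, and that quantization goes through the `GPTQConfig` class in Transformers. The function names `load_calibration_texts` and `quantize_with_gptq`, the sample count, and the 4-bit / group-size-128 settings are illustrative assumptions, not the settings used to produce this model.

```python
import gzip
import json
from itertools import islice


def load_calibration_texts(shard_path, n_samples=128):
    """Read up to n_samples non-empty 'text' fields from a .json.gz shard."""
    with gzip.open(shard_path, "rt", encoding="utf-8") as fh:
        records = (json.loads(line) for line in fh if line.strip())
        texts = (rec["text"] for rec in records if rec.get("text"))
        # Materialize the list while the file is still open.
        return list(islice(texts, n_samples))


def quantize_with_gptq(base_model, calibration_texts, bits=4, group_size=128):
    """Hypothetical helper: quantize base_model with GPTQ (needs a CUDA GPU)."""
    # Imported lazily so the loader above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
    config = GPTQConfig(
        bits=bits,                  # assumed value, not confirmed by this README
        group_size=group_size,      # assumed value, not confirmed by this README
        dataset=calibration_texts,  # a list of strings is accepted as calibration data
        tokenizer=tokenizer,
    )
    return AutoModelForCausalLM.from_pretrained(
        base_model, device_map="auto", quantization_config=config
    )
```

A call such as `quantize_with_gptq("Ichsan2895/Merak-7B-v4", load_calibration_texts("c4-id.tfrecord-00000-of-01024.json.gz"))` would then download the base model and quantize it; this step is slow and GPU-bound.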
+ ## Python code example: inference from this GPTQ model
+
+ ### Install the necessary packages
+
+ Requires: Transformers 4.33.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.
+
+ ```shell
+ pip3 install --upgrade transformers optimum
+ # If using PyTorch 2.1 + CUDA 12.x:
+ pip3 install --upgrade auto-gptq
+ # or, if using PyTorch 2.1 + CUDA 11.x:
+ pip3 install --upgrade auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
+ ```
+
+ If you are using PyTorch 2.0, you will need to install AutoGPTQ from source. Likewise, if you have problems with the pre-built wheels, try building from source:
+
+ ```shell
+ pip3 uninstall -y auto-gptq
+ git clone https://github.com/PanQiWei/AutoGPTQ
+ cd AutoGPTQ
+ git checkout v0.5.1
+ pip3 install .
+ ```
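After installing, it can help to confirm that the environment meets the minimum versions listed above. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, and the version parsing is deliberately simple (it compares only the first three numeric components and ignores local suffixes such as `+cu118`).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions stated in this README.
MINIMUMS = {"transformers": "4.33.0", "optimum": "1.12.0", "auto-gptq": "0.4.2"}


def parse_version(text):
    """Turn '0.5.1+cu118' or '4.33.0.dev0' into a 3-tuple of ints."""
    numbers = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '4.33' to (4, 33, 0)
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else f"too old ({installed})"
    return report
```

Calling `check_requirements()` returns a small report dictionary that can be printed before loading the model.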
+
+ ### Example Python code
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+
+ model_name_or_path = "daptheHuman/Merak-7B-v4-GPTQ"
+
+ # To use a different branch, change revision
+ # For example: revision="gptq-4bit-32g-actorder_True"
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name_or_path,
+     device_map="auto",
+     trust_remote_code=False,
+     revision="main",
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
+
+ prompt = "Tell me about AI"
+ prompt_template = f'''### Instruction:
+ {prompt}
+ ### Response:
+ '''
+
+ print("\n\n*** Generate:")
+ # Move the input IDs to the device the model was loaded on
+ input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.to(model.device)
+ output = model.generate(inputs=input_ids, temperature=0.7, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=512)
+ print(tokenizer.decode(output[0]))
+
+ # Inference can also be done using transformers' pipeline
+ print("*** Pipeline:")
+ pipe = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     max_new_tokens=512,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.95,
+     top_k=40,
+     repetition_penalty=1.1,
+ )
+ print(pipe(prompt_template)[0]['generated_text'])
+ ```
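Both calls above return the prompt together with the completion. A small sketch of how the prompt could be built and the answer separated from it is given below; `build_prompt` and `extract_response` are hypothetical helper names, and the `</s>` end-of-sequence string is an assumption about the tokenizer's special tokens rather than something this README states.

```python
PROMPT_TEMPLATE = "### Instruction:\n{prompt}\n### Response:\n"


def build_prompt(prompt):
    """Wrap a user prompt in the instruction/response template used above."""
    return PROMPT_TEMPLATE.format(prompt=prompt)


def extract_response(decoded, eos_token="</s>"):
    """Return only the text after the first '### Response:' marker."""
    marker = "### Response:"
    idx = decoded.find(marker)
    # If the marker is absent, fall back to the full decoded text.
    tail = decoded[idx + len(marker):] if idx != -1 else decoded
    return tail.replace(eos_token, "").strip()
```

For example, `extract_response(tokenizer.decode(output[0]))` would keep just the model's answer from the `generate` call.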
+
+ ## Credits
+ [TheBloke](https://huggingface.co/TheBloke/) for the README template.
+ [asyafiqe](https://huggingface.co/asyafiqe/) for the v3-GPTQ inspiration.