daptheHuman/Merak-7B-v4-GPTQ

Text Generation

daptheHuman committed on Dec 11, 2023

Commit d654285 • 1 Parent(s): 1eca92e

Create README.md

Files changed (1)

README.md +93 -0

README.md ADDED

	@@ -0,0 +1,93 @@

+---
+base_model: Ichsan2895/Merak-7B-v4
+license: llama2
+datasets:
+- allenai/c4
+language:
+- id
+tags:
+- gptq
+- mistral
+- indonesia
+---
+# Merak-7B-v4 GPTQ
+<!-- markdownlint-disable MD041 -->
+<!-- header start -->
+<!-- 200823 -->
+<div style="margin-left: auto; margin-right: auto">
+<img src="https://i.imgur.com/aMm54ZY.jpg" alt="Merak" style="width: 300px; margin:auto">
+</div>
+<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
+<!-- header end -->
+[Merak-7B-v4 GPTQ](https://huggingface.co/daptheHuman/Merak-7B-v4-GPTQ) is a GPTQ-quantized version of [Ichsan2895/Merak-7B-v4](https://huggingface.co/Ichsan2895/Merak-7B-v4).
+The [c4/id](https://huggingface.co/datasets/allenai/c4/blob/main/multilingual/c4-id.tfrecord-00000-of-01024.json.gz) dataset was used as calibration data for the quantization process.
+## Python code example: inference from this GPTQ model
+### Install the necessary packages
+Requires: Transformers 4.33.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.
+```shell
+pip3 install --upgrade transformers optimum
+# If using PyTorch 2.1 + CUDA 12.x:
+pip3 install --upgrade auto-gptq
+# or, if using PyTorch 2.1 + CUDA 11.x:
+pip3 install --upgrade auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
+```
+If you are using PyTorch 2.0, you will need to install AutoGPTQ from source. Likewise, if you have problems with the pre-built wheels, try building from source:
+```shell
+pip3 uninstall -y auto-gptq
+git clone https://github.com/PanQiWei/AutoGPTQ
+cd AutoGPTQ
+git checkout v0.5.1
+pip3 install .
+```
+### Example Python code
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+model_name_or_path = "daptheHuman/Merak-7B-v4-GPTQ"
+# To use a different branch, change revision
+# For example: revision="gptq-4bit-32g-actorder_True"
+model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
+                                             device_map="auto",
+                                             trust_remote_code=False,
+                                             revision="main")
+tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
+prompt = "Tell me about AI"
+prompt_template=f'''### Instruction:
+{prompt}
+### Response:
+'''
+print("\n\n*** Generate:")
+input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.to(model.device)
+output = model.generate(inputs=input_ids, temperature=0.7, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=512)
+print(tokenizer.decode(output[0]))
+# Inference can also be done using transformers' pipeline
+print("*** Pipeline:")
+pipe = pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+    max_new_tokens=512,
+    do_sample=True,
+    temperature=0.7,
+    top_p=0.95,
+    top_k=40,
+    repetition_penalty=1.1
+)
+print(pipe(prompt_template)[0]['generated_text'])
+```
+## Credits
+[TheBloke](https://huggingface.co/TheBloke/) for README template.
+[asyafiqe](https://huggingface.co/asyafiqe/) for v3-GPTQ inspiration.
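
The `### Instruction:` / `### Response:` template in the README's example can be factored into a small helper so every request uses the same format. The sketch below is illustrative only: `build_prompt` and `extract_response` are hypothetical names, not part of this repository or of Transformers. It assumes the generated text begins with the prompt verbatim, which is the default for the `text-generation` pipeline. Output decoded directly from `model.generate` may carry a leading special token and would need that stripped first.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the template used in the README example."""
    return f"### Instruction:\n{instruction}\n### Response:\n"


def extract_response(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt (if present) and return only the continuation."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


if __name__ == "__main__":
    prompt = build_prompt("Tell me about AI")
    # Stand-in for pipe(prompt)[0]['generated_text']; no model is loaded here.
    simulated_output = prompt + " AI is the study of intelligent machines.\n"
    print(extract_response(simulated_output, prompt))
```

With the pipeline from the README, this would be used as `extract_response(pipe(prompt)[0]['generated_text'], prompt)`.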