a2ran committed
Commit 5b04d60
1 Parent(s): 680147a

Update README.md

Files changed (1)
  1. README.md +51 -1
README.md CHANGED
@@ -1,7 +1,57 @@
---
library_name: peft
---
- ## Training procedure
+
+ # WIP
+
+ ## 1. Usage
+
+ * Load the base model and attach the PEFT adapter
+
+ ```
+ import torch
+ from peft import PeftModel, PeftConfig
+ from transformers import AutoTokenizer, AutoModelForCausalLM, GPTQConfig
+
+ model_id = "TheBloke/WizardLM-13B-V1.2-GPTQ"
+
+ # Adapter config; records the base model the adapter was trained against
+ config = PeftConfig.from_pretrained("a2ran/GPTeacher_ko_llama2_13B")
+ tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
+ quantization_config_loading = GPTQConfig(bits=4, disable_exllama=True)
+
+ # Load the 4-bit GPTQ base model across available devices, then attach the LoRA adapter
+ model = AutoModelForCausalLM.from_pretrained(model_id,
+                                              quantization_config=quantization_config_loading,
+                                              torch_dtype=torch.float16,
+                                              device_map="auto")
+ model = PeftModel.from_pretrained(model, "a2ran/GPTeacher_ko_llama2_13B")
+ ```
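+
+ A quick sanity check after loading (a minimal sketch, not part of the original README; it only inspects the objects created above):
+
+ ```
+ # The adapter config records which base model it expects, and
+ # type(model) should show a PeftModel wrapper around the GPTQ base model.
+ print(config.base_model_name_or_path)
+ print(type(model))
+ model.eval()  # switch to inference mode
+ ```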
+
+ * How to generate tokens
+
+ ```
+ from transformers import TextStreamer
+
+ # Stream tokens to stdout as they are generated
+ streamer = TextStreamer(tokenizer)
+
+ # Your input sentence goes here
+ prompt = """
+ ### input @ 미국의 행정시스템에 대해 설명해줘.\n\n### response @"""
+
+ # device_map="auto" already placed the model on GPU, so no .cuda() call is needed
+ output = tokenizer.decode(model.generate(
+     **tokenizer(
+         prompt,
+         return_tensors='pt',
+     ).to(0),
+     max_new_tokens = 2048,
+     temperature = 1.2,
+     top_p = 0.7,
+     early_stopping = True,
+     eos_token_id = 2,
+     do_sample = True,
+     repetition_penalty = 1.1,
+     streamer = streamer
+ )[0]).replace(prompt + " ", "")
+ ```
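+
+ For repeated queries, the prompt construction and decoding can be wrapped in a helper. This is a hypothetical convenience sketch, not part of the original README; it reuses the `model` and `tokenizer` objects from above and slices off the prompt tokens instead of using string replacement:
+
+ ```
+ # Hypothetical helper (not from the original README)
+ def ask(question, max_new_tokens=512):
+     # Build the "### input @ ... ### response @" prompt format used above
+     prompt = f"\n### input @ {question}\n\n### response @"
+     inputs = tokenizer(prompt, return_tensors="pt").to(0)
+     output_ids = model.generate(
+         **inputs,
+         max_new_tokens=max_new_tokens,
+         temperature=1.2,
+         top_p=0.7,
+         do_sample=True,
+         repetition_penalty=1.1,
+         eos_token_id=2,
+     )
+     # Keep only the newly generated tokens, then decode
+     response_ids = output_ids[0][inputs["input_ids"].shape[1]:]
+     return tokenizer.decode(response_ids, skip_special_tokens=True).strip()
+
+ print(ask("미국의 행정시스템에 대해 설명해줘."))
+ ```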
+
+ ## 2. Training procedure


The following `bitsandbytes` quantization config was used during training: