---
library_name: peft
base_model: sysong11/dapt-kogpt
---

# Model Card for dapt-kogpt-sum-adapter

This repo contains a low-rank (LoRA) adapter for [domain-adapted KoGPT](https://huggingface.co/sysong11/dapt-kogpt), fine-tuned on [a small supervised tuning dataset for summarization](https://huggingface.co/datasets/sysong11/sum_train_rev).

## How to Get Started with the Model

```python
import json
from random import randrange

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the domain-adapted base model.
base_model = AutoModelForCausalLM.from_pretrained(
    "sysong11/dapt-kogpt",
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA summarization adapter on top of the base model.
lora_path = "sysong11/dapt-kogpt-sum-adapter"
model = PeftModel.from_pretrained(base_model, lora_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(lora_path)

# The test file is JSON Lines: one JSON object per line.
test_data = []
with open("./datasets/test.json", "r", encoding="utf-8") as f:
    for line in f:
        test_data.append(json.loads(line))

# ChatML-style prompt layout used for the supervised tuning data.
prompt_template = """\
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

# Korean instruction: "Q: Summarize the following document, Context: {context}"
msg = "Q:다음 문서를 요약 하세요, Context:{context}"

# Pick a random test example; `ref` holds its reference summary for comparison.
ix = randrange(len(test_data))
print(ix)
datapoint = test_data[ix]
ref = datapoint["summary_text"]

system_prompt = (
    "You are an AI assistant. User will give you a task. "
    "Your goal is to complete the task as faithfully as you can."
)

tokens = tokenizer.encode(
    prompt_template.format(
        system_prompt=system_prompt,
        prompt=msg.format(context=datapoint["original_text"]),
    ),
    return_tensors="pt",
).to(device="cuda", non_blocking=True)

# Greedy decoding; 63999 is KoGPT's end-of-text/pad token ID.
gen_tokens = model.generate(
    input_ids=tokens,
    do_sample=False,
    max_length=1024,
    pad_token_id=63999,
    eos_token_id=63999,
)

# Split the output into the echoed prompt and the newly generated summary.
prompt_len = tokens.shape[1]
inputs = tokenizer.decode(gen_tokens[0][:prompt_len])
generated = tokenizer.decode(gen_tokens[0][prompt_len:]).replace("<|im_end|>", "")

print(inputs)
print("generated:")
print(generated)
```

### Framework versions

- PEFT 0.7.1
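
### Merging the adapter (optional)

If the adapter is only needed for inference, the LoRA weights can be folded into the base model with PEFT's `merge_and_unload()`, so the merged checkpoint loads with plain `transformers` and no `peft` dependency. This is a minimal sketch; the output directory `./merged-kogpt-sum` is an illustrative local path, not part of this repo.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "sysong11/dapt-kogpt", torch_dtype="auto"
)
lora_path = "sysong11/dapt-kogpt-sum-adapter"

# Fold the LoRA deltas into the base weights and drop the PEFT wrapper.
merged = PeftModel.from_pretrained(base, lora_path).merge_and_unload()

# Save a standalone checkpoint (output path is illustrative).
merged.save_pretrained("./merged-kogpt-sum")
AutoTokenizer.from_pretrained(lora_path).save_pretrained("./merged-kogpt-sum")
```

The merged model produces the same outputs as the wrapped one, but skips the small per-forward cost of applying the adapter matrices separately.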