mssfj committed · verified
Commit 3f66ec1 · 1 Parent(s): c4f7542

Update README.md

Files changed (1): README.md +71 -1
README.md CHANGED
@@ -1,6 +1,9 @@
---
library_name: transformers
- tags: []
+ datasets:
+ - llm-jp/magpie-sft-v1.0
+ base_model:
+ - google/gemma-2-9b
---

# Model Card for Model ID
@@ -14,6 +17,13 @@ tags: []
### Model Description

<!-- Provide a longer summary of what this model is. -->
+ This model is gemma-2-9b quantized to 4 bits and instruction-tuned with QLoRA on llm-jp/magpie-sft-v0.1.
+
+ The following chat template is defined:
+ <bos>{%- for message in messages %}
+ <start_of_turn>{{ message.role }}: {{ message.content }}<end_of_turn>
+ {%- endfor %}{% if add_generation_prompt %}
+ <start_of_turn>assistant: {% endif %}<eos>

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
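The description above (a 4-bit base model with a QLoRA adapter trained on top) corresponds to a training setup along these lines. This is a minimal sketch, not the author's training script: the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`) are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit NF4, as the description above indicates.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)  # prepare quantized weights for training

# Attach a LoRA adapter; only these low-rank matrices are trained.
lora_config = LoraConfig(
    r=16,                 # assumed rank, not the author's value
    lora_alpha=32,        # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

From here the adapter would be trained on the magpie-sft data with a standard SFT loop (e.g. `trl`'s `SFTTrainer`) and pushed to the Hub as the `mssfj/gemma-2-9b-4bit-magpie` weights loaded in the usage snippet below.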
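To check what the template above actually renders, it can be applied without tokenization. A minimal sketch, assuming the tokenizer in this repo ships the template; the message contents are placeholders:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mssfj/gemma-2-9b-bnb-4bit-chat-template")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # placeholder
    {"role": "user", "content": "日本で一番高い山は?"},
]

# Render the template as a string instead of token IDs.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(text)
# <bos>
# <start_of_turn>system: You are a helpful assistant.<end_of_turn>
# <start_of_turn>user: 日本で一番高い山は?<end_of_turn>
# <start_of_turn>assistant: <eos>
```

Note that `<eos>` is emitted even after the generation prompt, since it sits outside the `{% if add_generation_prompt %}` branch in the template.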
 
@@ -37,6 +47,66 @@ This is the model card of a 🤗 transformers model that has been pushed on the

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ import torch
+ from peft import PeftModel
+
+ model_name = "mssfj/gemma-2-9b-bnb-4bit-chat-template"
+ lora_weight = "mssfj/gemma-2-9b-4bit-magpie"
+
+ # Quantization settings: load the base model in 4-bit NF4
+ quantization_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_use_double_quant=False,
+ )
+
+ # Load the base model
+ base_model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     quantization_config=quantization_config,
+     device_map="auto",
+ )
+
+ # Apply the trained QLoRA adapter
+ model = PeftModel.from_pretrained(base_model, lora_weight)
+
+ # Load the tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Example prompt (Japanese): a question about which interrogative-word
+ # tag the word 「本」 ("book") should carry.
+ prompt = """「図書館で本を読んだ。」という文は「どこで本を読んだ?」という疑問文に直すことができます。
+ このとき、「図書館」は「どこ」の疑問詞タグを持ちます。
+
+ それでは、「本」という単語はどのような疑問詞タグを持つでしょうか? 全て選んでください。対応するものがない場合は「なし」と答えてください。
+ """
+
+ messages = [
+     {"role": "system", "content": "日本で一番高い山は?"},
+     {"role": "user", "content": prompt},
+ ]
+
+ # Apply the chat template and move the input to the model's device
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ outputs = model.generate(
+     input_ids,
+     max_new_tokens=256,
+     temperature=0.2,
+     do_sample=True,
+     eos_token_id=tokenizer.eos_token_id,
+     pad_token_id=tokenizer.pad_token_id,
+ )
+
+ # Decode only the newly generated tokens, dropping the prompt
+ response = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
+ print(response)
### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
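As a side note on the usage snippet above: for interactive use you can stream tokens as they are generated instead of decoding at the end. A minimal sketch reusing the `model`, `tokenizer`, and `input_ids` objects from that example:

```python
from transformers import TextStreamer

# Print decoded text to stdout as tokens arrive, skipping the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    input_ids,
    max_new_tokens=256,
    temperature=0.2,
    do_sample=True,
    streamer=streamer,
)
```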
 