---
base_model: llm-jp/llm-jp-3-13b
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---

# Uploaded model

- **Developed by:** ikedachin
- **License:** apache-2.0
- **Finetuned from model:** llm-jp/llm-jp-3-13b

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library (a minimal training sketch appears at the end of this card).

### Datasets used

5,000 examples were randomly sampled from the datasets below (see the sampling sketch after the inference code):

- DeL-TaiseiOzaki/Tengentoppa-sft-v1.0
- llm-jp/magpie-sft-v1.0

### Inference code

```python
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
)

HF_TOKEN = "your-token"

model_name = "ikedachin/llm-jp-3-13b-ozaki-ds-5000"

# QLoRA-style 4-bit quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

# Download the model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    token=HF_TOKEN,
)

# Download the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN)

prompt = "<put your input here>"

# Tokenize the input
tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)

# Greedy generation (no sampling), with a repetition penalty
with torch.no_grad():
    outputs = model.generate(
        tokenized_input,
        max_new_tokens=300,
        do_sample=False,
        repetition_penalty=1.2,
    )[0]

# Decode only the newly generated tokens (the prompt is sliced off)
output = tokenizer.decode(outputs[tokenized_input.size(1):], skip_special_tokens=True)
print(output)
```
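
### Sampling sketch

A minimal sketch of how a random 5,000-example sample like the one described above can be drawn with the `datasets` library. The split name and seed are assumptions, and this card does not say whether 5,000 examples were taken per dataset or in total across both.

```python
from datasets import load_dataset

# Hypothetical sampling sketch: the "train" split and seed=42 are assumptions;
# the card does not specify whether 5,000 examples were drawn per dataset
# or in total across both sources.
subsets = {}
for name in ["DeL-TaiseiOzaki/Tengentoppa-sft-v1.0", "llm-jp/magpie-sft-v1.0"]:
    ds = load_dataset(name, split="train")
    subsets[name] = ds.shuffle(seed=42).select(range(5000))

# Before SFT, each subset would still need to be normalized into a common
# text format, since the two datasets use different column layouts.
```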
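### Training sketch

For reference, a minimal sketch of the Unsloth + TRL fine-tuning setup this card refers to, following the common Unsloth notebook pattern. All hyperparameters (sequence length, LoRA rank and targets, batch size, learning rate) and the `text` column are assumptions; the actual training configuration is not disclosed in this card.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit via Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="llm-jp/llm-jp-3-13b",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,
)

# Attach LoRA adapters; rank, alpha, and target modules are assumptions
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

train_dataset = ...  # the 5,000 sampled examples (see the sampling sketch above)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",  # assumes examples were flattened into a "text" column
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```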