---
base_model: llm-jp/llm-jp-3-13b
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
licenses:
- Apache-2.0
- CC-BY-NC-SA-4.0
- CC-BY-SA-4.0
language:
- ja
datasets:
- elyza/ELYZA-tasks-100
- ichikara-instruction
---
# llm-jp-3-13b-it: A Fine-tuned Model for ELYZA-tasks-100

## Overview

This is llm-jp-3-13b-it, a fine-tuned version of llm-jp/llm-jp-3-13b for ELYZA-tasks-100. The model was trained on the ELYZA-tasks-100 and ichikara-instruction datasets.
## Usage

Load the model and tokenizer and generate a response with the following code:
```python
from unsloth import FastLanguageModel

model_id = "tokutsu/llm-jp-3-13b-it"

# Load the model and tokenizer in 4-bit and switch to inference mode.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# Prompt in the "### 指示 / ### 回答" (instruction / answer) format.
# The example task asks for five ideas to regain enthusiasm for work.
prompt = """### 指示
仕事の熱意を取り戻すためのアイデアを5つ挙げてください。
### 回答
"""

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    use_cache=True,
    do_sample=False,
    repetition_penalty=1.2,
)

# Keep only the text generated after the "### 回答" marker.
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]
```
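To generate predictions for the whole benchmark, the same prompt template can be applied to every task in ELYZA-tasks-100. The snippet below is a minimal sketch, not part of the original card: it assumes the `elyza/ELYZA-tasks-100` dataset on the Hugging Face Hub exposes a `test` split with an `input` column, and it reuses the `model` and `tokenizer` loaded above.

```python
# Minimal sketch (assumptions noted above): batch predictions over ELYZA-tasks-100.
import json

from datasets import load_dataset

tasks = load_dataset("elyza/ELYZA-tasks-100", split="test")  # assumed split/schema

predictions = []
for example in tasks:
    task_prompt = f"### 指示\n{example['input']}\n### 回答\n"
    inputs = tokenizer([task_prompt], return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        use_cache=True,
        do_sample=False,
        repetition_penalty=1.2,
    )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    predictions.append(text.split('\n### 回答')[-1])

# Write one JSON object per line for later inspection or scoring.
with open("elyza_predictions.jsonl", "w", encoding="utf-8") as f:
    for example, pred in zip(tasks, predictions):
        f.write(json.dumps({"input": example["input"], "output": pred}, ensure_ascii=False) + "\n")
```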
## Example Output
Here is an example of what the output would look like:
```
1. 仕事に関連する趣味を持つ: 趣味はストレス解消やリラックス効果があり、仕事へのモチベーションアップにもつながります。例えば、ガーデニングが好きならオフィスで観葉植物を育てたり、料理が得意であれば同僚とランチ会をするなど、自分なりの仕事との接点を見つけてみましょう。
2. 目標設定を行う: 達成可能な目標を立てることで、日々成長していることを実感でき、やりがいが生まれてきます。また、定期的に進捗状況を確認することで、達成感とともにさらなるやる気につながるでしょう。
3. 同僚たちと交流する: 職場での人間関係は、仕事に対する情熱を維持するために重要です。コミュニケーションをとることで、お互いのことを理解し、助け合うことができます。職場のイベントに参加したり、休憩時間には雑談をしたりして、積極的に周りの人と関わりましょう。
4. 新しいスキルを身につける: スキル向上のための勉強や、新しい資格取得などにより、自分の能力を高めることができます。自己啓発的な活動が、自信や向上心へとつながるかもしれません。
5. 休暇をとってリフレッシュする: 長期休暇をとり、心身ともに休息することは大切なことです。旅行へ行ったり、家族と一緒に過ごしたりすることで気分転換ができ、また新たな気持ちで仕事に取り組むことができるようになります。
```
## Additional Information
The model was trained using LoRA with the following specifications:
### Base Model

- The training started from the pre-trained language model [llm-jp/llm-jp-3-13b](https://huggingface.co/llm-jp/llm-jp-3-13b).
### Datasets

- ELYZA-tasks-100: A dataset of 100 diverse Japanese instruction-following tasks, used to enhance the model's ability to generalize across multiple domains ([link](https://huggingface.co/datasets/elyza/ELYZA-tasks-100)).
- ichikara-instruction: A Japanese instruction dataset containing a diverse range of text samples, providing a strong foundation for understanding contextual nuances (link).

Records from these datasets can be mapped to the prompt template shown in the Usage section; one possible formatting is sketched below.
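The following is a minimal formatting sketch, not the actual preprocessing script: it assumes records expose `input` and `output` fields (field names differ between the two datasets) and reuses the `tokenizer` loaded in the Usage section for the end-of-sequence token.

```python
# Hypothetical formatting helper; field names and EOS handling are assumptions.
EOS_TOKEN = tokenizer.eos_token  # mark where the answer ends so generation learns to stop

def format_example(instruction: str, response: str) -> str:
    """Map one instruction/response pair to the "### 指示 / ### 回答" template."""
    return f"### 指示\n{instruction}\n### 回答\n{response}{EOS_TOKEN}"

# Example: the task from the Usage section turned into a single training text.
record = {"input": "仕事の熱意を取り戻すためのアイデアを5つ挙げてください。", "output": "1. ..."}
text = format_example(record["input"], record["output"])
```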
### Training Methodology

- PEFT with LoRA: Training employed PEFT (Parameter-Efficient Fine-Tuning) using LoRA (Low-Rank Adaptation), enabling efficient fine-tuning at reduced computational cost while retaining the base model's performance. The model was trained with Unsloth and Hugging Face's TRL library (a rough illustration of this setup is sketched below).
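As a hedged illustration of this setup (the card does not publish the actual training script or hyperparameters, so every value below is a placeholder), a LoRA fine-tune with Unsloth and TRL typically looks like this:

```python
# Illustrative sketch only: PEFT/LoRA fine-tuning with Unsloth + TRL.
# Hyperparameters, dataset preparation, and training arguments are placeholders,
# not the values used to train this model.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="llm-jp/llm-jp-3-13b",
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank, alpha, and target modules shown are typical examples).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# Placeholder data preparation: map records to a "text" column in the
# "### 指示 / ### 回答" format (ichikara-instruction would be prepared the same way).
raw = load_dataset("elyza/ELYZA-tasks-100", split="test")
train_dataset = raw.map(
    lambda ex: {"text": f"### 指示\n{ex['input']}\n### 回答\n{ex['output']}{tokenizer.eos_token}"}
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```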
## License
This model is licensed under the CC BY-NC-SA 4.0 License. For more details, see the LICENSE file in this repository.
## Acknowledgment
This model was developed as part of the exercises of the 2024 LLM course conducted by the Matsuo-Iwasawa Lab at the University of Tokyo.