base_model: llm-jp/llm-jp-3-13b
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
licenses:
  - Apache-2.0
  - CC-BY-NC-SA-4.0
  - CC-BY-SA-4.0
language:
  - ja
datasets:
  - elyza/ELYZA-tasks-100
  - ichikara-instruction

llm-jp-3-13b-it: A Fine-Tuned Model for ELYZA-tasks-100

Overview

llm-jp-3-13b-it is a fine-tune of llm-jp/llm-jp-3-13b that targets ELYZA-tasks-100. It was trained on ELYZA-tasks-100 and the ichikara-instruction dataset.

Usage

Load the model and tokenizer, then generate a response, with the following code:

from unsloth import FastLanguageModel

model_id = "tokutsu/llm-jp-3-13b-it"

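# Load the model and tokenizer. dtype=None lets Unsloth pick the dtype
# automatically; load_in_4bit=True loads 4-bit quantized weights to save memory.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    dtype=None,
    load_in_4bit=True,
)
# Enable Unsloth's inference mode before generating.
FastLanguageModel.for_inference(model)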

prompt = """### 指示
仕事の熱意を取り戻すためのアイデアを5つ挙げてください。

### 回答
"""

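# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
# Greedy decoding (do_sample=False) with a repetition penalty.
outputs = model.generate(**inputs,
                         max_new_tokens=512,
                         use_cache=True,
                         do_sample=False,
                         repetition_penalty=1.2)
# Decode and keep only the text after the last answer marker.
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]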
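
The prompt template and the answer extraction used above can be factored into small helpers. The following is a minimal sketch: `build_prompt`, `extract_answer`, and `ANSWER_MARKER` are hypothetical names that do not appear in this repository, and the sketch assumes the `### 指示` / `### 回答` template shown above is the format the model expects. Unlike the snippet above, `extract_answer` also strips surrounding whitespace.

```python
# Hypothetical helpers (not part of this repository) that mirror the
# prompt template and answer extraction used in the usage snippet.
ANSWER_MARKER = "\n### 回答"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the '### 指示' / '### 回答' template."""
    return f"### 指示\n{instruction}\n\n### 回答\n"


def extract_answer(decoded: str) -> str:
    """Return the text after the last answer marker, stripped of whitespace."""
    return decoded.split(ANSWER_MARKER)[-1].strip()


if __name__ == "__main__":
    prompt = build_prompt("仕事の熱意を取り戻すためのアイデアを5つ挙げてください。")
    # Simulate decoded model output: the prompt followed by generated text.
    print(extract_answer(prompt + "1. ..."))
```

With these helpers, `build_prompt(...)` would take the place of the hard-coded `prompt` string, and `extract_answer(...)` would be applied to the result of `tokenizer.decode(...)`.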

Example Output

The prompt above produces output like the following:

1. 仕事に関連する趣味を持つ: 趣味はストレス解消やリラックス効果があり、仕事へのモチベーションアップにもつながります。例えば、ガーデニングが好きならオフィスで観葉植物を育てたり、料理が得意であれば同僚とランチ会をするなど、自分なりの仕事との接点を見つけてみましょう。
2. 目標設定を行う: 達成可能な目標を立てることで、日々成長していることを実感でき、やりがいも生まれてきます。また、定期的に進捗状況を確認することで、達成感とともにさらなるやる気につながるでしょう。
3. 同僚たちと交流する: 職場での人間関係は、仕事に対する情熱を維持するために重要です。コミュニケーションをとることで、お互いのことを理解し、助け合うことができます。職場のイベントに参加したり、休憩時間には雑談したりして、積極的に周りの人と関わりましょう。
4. 新しいスキルを身につける: スキル向上のための勉強や、新しい資格取得などにより、自分の能力を高めることができます。自己啓発的な活動が、自信や向上心へとつながるかもしれません。
5. 休暇をとってリフレッシュする: 長期休暇をとり、心身ともに休息することは大切なことです。旅行へ行ったり、家族と一緒に過ごしたりすることで気分転換ができ、また新たな気持ちで仕事に取り組むことができるようになります。

Additional Information

The model was trained using LoRA with the following specifications:

Base Model

  • Training started from the pre-trained language model llm-jp/llm-jp-3-13b.

Datasets

  • ELYZA-tasks-100: A comprehensive dataset covering 100 diverse tasks, enhancing the model's ability to generalize across multiple domains. (link)
  • ichikara-instruction: This dataset contains a diverse range of text samples, providing a strong foundation for understanding contextual nuances. (link)

Training Methodology

  • PEFT with LoRA: Training used PEFT (Parameter-Efficient Fine-Tuning) with LoRA (Low-Rank Adaptation), which trains a small set of low-rank adapter weights instead of the full model, reducing computational cost while retaining the model's performance. The model was trained with Unsloth and Hugging Face's TRL library.
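
To make the reduced cost concrete: LoRA freezes a weight matrix of shape d_out × d_in and trains two low-rank factors, B (d_out × r) and A (r × d_in), so the trainable parameter count per adapted matrix drops from d_out · d_in to r · (d_in + d_out). The sketch below computes that ratio. The dimensions and rank are illustrative assumptions only; this README does not state the rank or the target modules used in training, and the function names are hypothetical.

```python
# Illustrative arithmetic only: the sizes and rank below are assumptions,
# not the settings used to train this model.

def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in, d_out, rank = 4096, 4096, 16  # hypothetical values
    full = full_param_count(d_in, d_out)
    lora = lora_param_count(d_in, d_out, rank)
    print(f"full: {full}, LoRA: {lora}, fraction trained: {lora / full:.4f}")
```

For a square matrix the trained fraction is 2r / d, so it shrinks as the hidden size grows relative to the chosen rank.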

License

This model is licensed under the CC BY-NC-SA 4.0 License. For more details, see the LICENSE file in this repository.

Acknowledgment

This model was developed as part of the LLM course 2024 exercises conducted by the Matsuo-Iwasawa Lab at the University of Tokyo.