Uploaded model

  • Developed by: nnishi
  • License: CC-BY-NC-SA
  • Finetuned from model: llm-jp/llm-jp-3-13b

This Llama-architecture model was trained 2x faster with Unsloth and Hugging Face's TRL library.

How to run inference (for competition evaluators)

Run on Google Colab with an L4 GPU.

!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

HF_TOKEN = ""  # WRITE YOUR HF_TOKEN
ELYZA_TASKS_100_TV_JSONL_PATH = ""  # WRITE the path to the elyza-tasks-100-tv JSONL file
# Output for elyza-tasks-100-tv is saved as "output.jsonl"


from huggingface_hub import login
login(HF_TOKEN)

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = torch.bfloat16
load_in_4bit = True
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "nnishi/llm-jp-3-13b_kajumanews5760_ichikara_ayadpo100rows_lora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = HF_TOKEN,
)

import json
datasets = []
with open(ELYZA_TASKS_100_TV_JSONL_PATH, "r") as f:
    # Accumulate lines until a complete JSON object has been read, so the
    # file parses correctly even if a record spans multiple lines.
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""

from tqdm import tqdm

# Switch the model into Unsloth's faster inference mode.
FastLanguageModel.for_inference(model)

results = []
for dt in tqdm(datasets):
    input = dt["input"]

    # Instruction/answer prompt template.
    prompt = f"""### 指示\n{input}\n### 回答\n"""

    inputs = tokenizer([prompt], return_tensors = "pt").to(model.device)

    # Greedy decoding with a repetition penalty.
    outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True, do_sample=False, repetition_penalty=1.2)
    # Keep only the text generated after the "### 回答" marker.
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]

    results.append({"task_id": dt["task_id"], "input": input, "output": prediction})

with open("output.jsonl", "w") as f:
  for r in results:
    f.write(json.dumps(r, ensure_ascii=False) + "\n")
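
As a quick optional sanity check (not part of the original workflow), you can read output.jsonl back and confirm that every line parses and carries the expected keys:

# Optional sanity check: every line of output.jsonl should be a JSON object
# with "task_id", "input", and "output" keys.
with open("output.jsonl", "r") as f:
    for i, line in enumerate(f, start=1):
        record = json.loads(line)
        assert {"task_id", "input", "output"} <= record.keys(), f"line {i} is missing keys"
print("output.jsonl looks well-formed")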

Development steps

  • quantize llm-jp/llm-jp-3-13b
  • continued pre-training
    • 5,760 randomly chosen records from kajuma/CC-news-2024-July-October-cleaned
  • instruction tuning
    • all data in ichikara-instruction's ichikara-instruction-003-001-1.json
  • direct preference optimization (DPO); a minimal sketch follows this list
    • 100 randomly chosen records from weblab-GENIAC/aya-ja-nemotron-dpo-masked
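
The training scripts are not included here, so the following is only an illustrative sketch of how the final DPO step could look with Unsloth and TRL's DPOTrainer. The LoRA configuration, hyperparameters, dataset split and column names, and the starting checkpoint are assumptions, not the exact settings used for this model.

# Minimal sketch of the DPO step (all hyperparameters below are assumptions).
from unsloth import FastLanguageModel, PatchDPOTrainer
PatchDPOTrainer()  # patch TRL's DPOTrainer to work with Unsloth models

import torch
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "llm-jp/llm-jp-3-13b",  # in practice, the checkpoint after continued pre-training and instruction tuning
    max_seq_length = 2048,
    dtype = torch.bfloat16,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

# 100 randomly chosen preference pairs; the split name is an assumption.
dpo_dataset = (
    load_dataset("weblab-GENIAC/aya-ja-nemotron-dpo-masked", split = "train")
    .shuffle(seed = 42)
    .select(range(100))
)

trainer = DPOTrainer(
    model = model,
    ref_model = None,  # with PEFT adapters, the frozen base weights act as the reference model
    args = DPOConfig(
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 5e-6,
        beta = 0.1,
        output_dir = "dpo_outputs",
    ),
    train_dataset = dpo_dataset,  # expects "prompt", "chosen", "rejected" columns
    tokenizer = tokenizer,        # processing_class= in newer TRL versions
)
trainer.train()

The continued pre-training and instruction-tuning steps would follow the same pattern with TRL's SFTTrainer, with the ichikara-instruction records formatted using the same 「### 指示 / ### 回答」 template used at inference time above.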

Used datasets and their licenses

kajuma/CC-news-2024-July-October-cleaned

  • Author: kazuma
  • License: ODC-BY

ichikara-instruction: LLMのための日本語インストラクションデータ (Japanese instruction data for LLMs)

  • Author: RIKEN Center for Advanced Intelligence Project (AIP), Language Information Access Technology Team (理化学研究所 革新知能統合研究センター 言語情報アクセス技術チーム)
  • License: CC-BY-NC-SA

weblab-GENIAC/aya-ja-nemotron-dpo-masked

  • Author: weblab-GENIAC
  • License: Apache License 2.0
