Uploaded model

  • Developed by: nnishi
  • License: CC-BY-NC-SA
  • Finetuned from model: llm-jp/llm-jp-3-13b

This Llama-architecture model was trained 2x faster with Unsloth and Hugging Face's TRL library.

How to run inference (for competition evaluators)

Run on Google Colab with an L4 GPU.

!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

HF_TOKEN = ""  # WRITE YOUR HF_TOKEN
ELYZA_TASKS_100_TV_JSONL_PATH = ""  # WRITE the path to the elyza-tasks-100-tv JSONL file
# Output for elyza-tasks-100-tv is saved as "output.jsonl"


from huggingface_hub import login
login(HF_TOKEN)

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = torch.bfloat16
load_in_4bit = True
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "nnishi/llm-jp-3-13b_kajumanews5760_ichikara_ayadpo100rows_lora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = HF_TOKEN,
)

import json
datasets = []
with open(ELYZA_TASKS_100_TV_JSONL_PATH, "r") as f:
    # Accumulate lines until a complete JSON object has been read, so the
    # file parses correctly even if a record spans multiple lines.
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""

from tqdm import tqdm

# Switch the model into Unsloth's faster inference mode.
FastLanguageModel.for_inference(model)

results = []
for dt in tqdm(datasets):
    input = dt["input"]

    # Instruction/answer prompt template.
    prompt = f"""### 指示\n{input}\n### 回答\n"""

    inputs = tokenizer([prompt], return_tensors = "pt").to(model.device)

    # Greedy decoding with a repetition penalty.
    outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True, do_sample=False, repetition_penalty=1.2)
    # Keep only the text generated after the "### 回答" marker.
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]

    results.append({"task_id": dt["task_id"], "input": input, "output": prediction})

with open("output.jsonl", "w") as f:
  for r in results:
    f.write(json.dumps(r, ensure_ascii=False) + "\n")
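
As a quick optional sanity check (not part of the original workflow), you can read output.jsonl back and confirm that every line parses and carries the expected keys:

# Optional sanity check: every line of output.jsonl should be a JSON object
# with "task_id", "input", and "output" keys.
with open("output.jsonl", "r") as f:
    for i, line in enumerate(f, start=1):
        record = json.loads(line)
        assert {"task_id", "input", "output"} <= record.keys(), f"line {i} is missing keys"
print("output.jsonl looks well-formed")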

Development steps

  • quantize llm-jp/llm-jp-3-13b
  • continued pre-training
    • 5,760 randomly chosen records from kajuma/CC-news-2024-July-October-cleaned
  • instruction tuning
    • all data in ichikara-instruction's ichikara-instruction-003-001-1.json
  • direct preference optimization (DPO); a minimal sketch follows this list
    • 100 randomly chosen records from weblab-GENIAC/aya-ja-nemotron-dpo-masked
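
The training scripts are not included here, so the following is only an illustrative sketch of how the final DPO step could look with Unsloth and TRL's DPOTrainer. The LoRA configuration, hyperparameters, dataset split and column names, and the starting checkpoint are assumptions, not the exact settings used for this model.

# Minimal sketch of the DPO step (all hyperparameters below are assumptions).
from unsloth import FastLanguageModel, PatchDPOTrainer
PatchDPOTrainer()  # patch TRL's DPOTrainer to work with Unsloth models

import torch
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "llm-jp/llm-jp-3-13b",  # in practice, the checkpoint after continued pre-training and instruction tuning
    max_seq_length = 2048,
    dtype = torch.bfloat16,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

# 100 randomly chosen preference pairs; the split name is an assumption.
dpo_dataset = (
    load_dataset("weblab-GENIAC/aya-ja-nemotron-dpo-masked", split = "train")
    .shuffle(seed = 42)
    .select(range(100))
)

trainer = DPOTrainer(
    model = model,
    ref_model = None,  # with PEFT adapters, the frozen base weights act as the reference model
    args = DPOConfig(
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 5e-6,
        beta = 0.1,
        output_dir = "dpo_outputs",
    ),
    train_dataset = dpo_dataset,  # expects "prompt", "chosen", "rejected" columns
    tokenizer = tokenizer,        # processing_class= in newer TRL versions
)
trainer.train()

The continued pre-training and instruction-tuning steps would follow the same pattern with TRL's SFTTrainer, with the ichikara-instruction records formatted using the same 「### 指示 / ### 回答」 template used at inference time above.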

Used datasets and their licenses

kajuma/CC-news-2024-July-October-cleaned

  • Author: kazuma
  • License: ODC-BY

ichikara-instruction: LLMのための日本語インストラクションデータ (Japanese instruction data for LLMs)

  • Author: RIKEN Center for Advanced Intelligence Project (AIP), Language Information Access Technology Team (理化学研究所 革新知能統合研究センター 言語情報アクセス技術チーム)
  • License: CC-BY-NC-SA

weblab-GENIAC/aya-ja-nemotron-dpo-masked

  • Author: weblab-GENIAC
  • License: Apache License 2.0
