# Uploaded model

- **Developed by:** nnishi
- **License:** CC-BY-NC-SA
- **Finetuned from model:** llm-jp/llm-jp-3-13b

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
# How to run inference (for competition evaluators)

Run the cells below in Google Colab on an L4 GPU.
```python
# Install Unsloth, then swap in the latest build from GitHub
!pip install unsloth
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
```
```python
import json
import torch
from huggingface_hub import login
from tqdm import tqdm
from unsloth import FastLanguageModel

HF_TOKEN = ""  # WRITE YOUR HF_TOKEN
ELYZA_TASKS_100_TV_JSONL_PATH = ""  # WRITE the path to the elyza-tasks-100-TV jsonl file
# Output for elyza-tasks-100-tv is saved as "output.jsonl"

login(HF_TOKEN)

max_seq_length = 2048
dtype = torch.bfloat16
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "nnishi/llm-jp-3-13b_kajumanews5760_ichikara_ayadpo100rows_lora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = HF_TOKEN,
)

# Read elyza-tasks-100-TV; a record may span multiple lines, so accumulate
# lines until a record's closing brace before parsing it.
datasets = []
with open(ELYZA_TASKS_100_TV_JSONL_PATH, "r") as f:
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""

FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

results = []
for dt in tqdm(datasets):
    task_input = dt["input"]
    # Prompt format: "### 指示" = instruction, "### 回答" = answer
    prompt = f"""### 指示\n{task_input}\n### 回答\n"""
    inputs = tokenizer([prompt], return_tensors = "pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True,
                             do_sample = False, repetition_penalty = 1.2)
    # Keep only the text generated after the "### 回答" (answer) marker
    prediction = tokenizer.decode(outputs[0], skip_special_tokens = True).split('\n### 回答')[-1]
    results.append({"task_id": dt["task_id"], "input": task_input, "output": prediction})

with open("output.jsonl", "w") as f:
    for r in results:
        f.write(json.dumps(r, ensure_ascii=False) + "\n")
```
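As a quick sanity check (a sketch assuming the script above has run in the same working directory), `output.jsonl` can be reloaded to confirm it contains one record per task:

```python
import json

# Reload the predictions and confirm the expected record shape
with open("output.jsonl") as f:
    records = [json.loads(line) for line in f]

print(len(records), "records written")  # one per elyza-tasks-100-TV task
print(sorted(records[0].keys()))        # ['input', 'output', 'task_id']
```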
# Development steps

- quantize
  - llm-jp/llm-jp-3-13b
- continued pre-training
  - randomly chosen 5760 records in kajuma/CC-news-2024-July-October-cleaned
- instruction tuning
  - all data in ichikara-instruction-003-001-1.json from ichikara-instruction
- direct preference optimization (DPO; see the sketch below)
  - randomly chosen 100 records in weblab-GENIAC/aya-ja-nemotron-dpo-masked
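The training scripts themselves are not part of this card. Purely as an illustration, the sketch below shows what the final DPO step could look like with Unsloth and TRL's `DPOTrainer`; the LoRA config, hyperparameters, random seed, and the dataset's column layout are all assumptions, not the settings actually used.

```python
# Hypothetical sketch of the DPO step (NOT the actual training script).
# All hyperparameters and the "prompt"/"chosen"/"rejected" column layout
# are assumptions.
from unsloth import FastLanguageModel, PatchDPOTrainer
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer

PatchDPOTrainer()  # let Unsloth patch TRL's DPOTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "llm-jp/llm-jp-3-13b",
    max_seq_length = 2048,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

# Randomly choose 100 preference pairs, matching the step above (seed assumed)
dataset = load_dataset("weblab-GENIAC/aya-ja-nemotron-dpo-masked", split = "train")
dataset = dataset.shuffle(seed = 42).select(range(100))

trainer = DPOTrainer(
    model = model,
    args = DPOConfig(
        output_dir = "dpo_out",
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        beta = 0.1,  # DPO temperature
    ),
    train_dataset = dataset,
    processing_class = tokenizer,  # `tokenizer=` in older TRL versions
)
trainer.train()
```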
# Used datasets and their licenses

| Dataset | Author | License |
| --- | --- | --- |
| kajuma/CC-news-2024-July-October-cleaned | kajuma | ODC-BY |
| ichikara-instruction: Japanese instruction data for LLMs | RIKEN Center for Advanced Intelligence Project, Language Information Access Technology Team | CC-BY-NC-SA |
| weblab-GENIAC/aya-ja-nemotron-dpo-masked | weblab-GENIAC | Apache License 2.0 |