zR-Llama-1B-chatglm2-6b-tokenizer

This is a 1B-parameter model that I trained myself using the open-source build_MiniLLM_from_scratch framework.

Model Parameters

1B parameters
Training corpus: 67 billion.
Supported token length: 896

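Pretrained Model

Uses the pretraining dataset from the open-source build_MiniLLM_from_scratch framework; the tokenization step was done independently.
Trained on 8 x 80GB A800 GPUs.
Trained for 1 epoch, bs=32 (per GPU), lr=1.5e-4.
Total training time: 1 day.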
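
As a rough sanity check on the settings above, the sketch below derives the global batch size and the upper bound on tokens processed per optimizer step. It assumes plain data parallelism with no gradient accumulation, and that every sequence fills the full 896-token window; neither assumption is stated in this README, and the variable names are illustrative only.

```python
# Back-of-the-envelope numbers derived from the pretraining settings listed above.
NUM_GPUS = 8          # 8 x 80GB A800
PER_GPU_BATCH = 32    # bs=32 per GPU
MAX_SEQ_LEN = 896     # supported token length

# Assumption: data parallelism without gradient accumulation.
global_batch = NUM_GPUS * PER_GPU_BATCH

# Assumption: every sequence is packed or padded to the full context length,
# so this is an upper bound rather than a measured value.
tokens_per_step = global_batch * MAX_SEQ_LEN

print(f"global batch size: {global_batch} sequences")
print(f"max tokens per optimizer step: {tokens_per_step:,}")
```

If gradient accumulation was used during training, both figures scale by the accumulation factor.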

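SFT Model

Uses all of the datasets provided by the open-source build_MiniLLM_from_scratch framework.
Fine-tuned on a single A800 GPU.
Fine-tuned for 5 epochs, bs=8, lr=2e-5.
Total training time: 3 days 12 hours.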

Usage

import os
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

max_length = 896
HUMAN = '<human>'
ROBOT = '<robot>'
def build_prompt(query, history) -> str:
    texts = ''
    for user_input, response in history:
        texts += f'{HUMAN}{user_input}{ROBOT}{response}'

    texts += f'{HUMAN}{query}{ROBOT}'
    return texts

def build_cli_history(history):
    prompt = ''
    for query, response in history:
        prompt += f"\n\nUser：{query.strip()}"
        prompt += f"\n\nRobot：{response.strip()}"
    return prompt


device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = AutoTokenizer.from_pretrained("zRzRzRzRzRzRzR/zR-Llama-1b-ChatGLM2-6b-tokenizer", trust_remote_code=True)
model = LlamaForCausalLM.from_pretrained("zRzRzRzRzRzRzR/zR-Llama-1b-ChatGLM2-6b-tokenizer").to(device)

history = []
clear_command = 'cls' if os.name == 'nt' else 'clear'
while True:
    query = input('\nInput: ')
    if query.strip() == "stop":
        break
    if query.strip() == "clear":
        history = []
        os.system(clear_command)
        continue

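    inputs = tokenizer.encode(build_prompt(query, history), return_tensors='pt', add_special_tokens=False).to(device)
    # Cap the total length (prompt + generated tokens) at the supported token length.
    output_ids = model.generate(inputs, max_length=max_length)
    # Decode only the newly generated tokens so the prompt is not echoed back.
    response = tokenizer.decode(output_ids[0][inputs.shape[1]:].cpu(), skip_special_tokens=True)

    # Record the turn so that later prompts include the conversation so far.
    history.append((query, response))
    os.system(clear_command)
    print(build_cli_history(history), flush=True)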
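
Because the model supports only 896 tokens, a long conversation will eventually overflow the prompt. The sketch below shows the prompt string that build_prompt produces and one possible way to drop the oldest turns until the prompt fits. `trim_history` and its `count_tokens` argument are hypothetical helpers, not part of this repository, and the character-based counter is a stand-in so the example runs without downloading the model.

```python
from typing import Callable, List, Tuple

HUMAN = '<human>'
ROBOT = '<robot>'
MAX_LENGTH = 896  # supported token length stated in this README

History = List[Tuple[str, str]]

def build_prompt(query: str, history: History) -> str:
    """Same prompt format as the build_prompt function in the usage script."""
    texts = ''.join(f'{HUMAN}{user}{ROBOT}{reply}' for user, reply in history)
    return texts + f'{HUMAN}{query}{ROBOT}'

def trim_history(query: str, history: History,
                 count_tokens: Callable[[str], int],
                 budget: int = MAX_LENGTH) -> History:
    """Hypothetical helper: drop the oldest turns until the prompt fits the budget."""
    trimmed = list(history)
    while trimmed and count_tokens(build_prompt(query, trimmed)) > budget:
        trimmed.pop(0)  # discard the oldest (user, reply) pair first
    return trimmed

# Stand-in counter for illustration: one "token" per character.
# With the real tokenizer, one could instead pass
#   lambda s: len(tokenizer.encode(s, add_special_tokens=False))
history = [("hi", "hello"), ("how are you", "fine")]

print(build_prompt("bye", history))
# <human>hi<robot>hello<human>how are you<robot>fine<human>bye<robot>

print(trim_history("bye", history, len, budget=50))
# [('how are you', 'fine')]
```

In practice the budget should be set below 896 so that some tokens remain for the generated reply.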