zR-Llama-1B-chatglm2-6b-tokenizer

This is a 1B-parameter model trained by the author with the open-source build_MiniLLM_from_scratch framework.

Model parameters

  • 1B parameters.
  • Training corpus: 67 billion tokens.
  • Supported context length: 896 tokens.

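Pre-trained model

  • Uses the pretraining dataset of the open-source build_MiniLLM_from_scratch framework; the tokenization step was done by the author.
  • Trained on 8 x 80GB A800 GPUs.
  • Trained for 1 epoch, bs=32 (per GPU), lr=1.5e-4.
  • Total training time: 1 day.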
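
The per-GPU batch size and GPU count listed above imply an effective batch size, which the short sketch below works out. It assumes plain data parallelism with no gradient accumulation, and that every sequence is filled to the 896-token limit; neither is stated in this card, so the token figure should be read as an upper bound.

```python
# Figures taken from this card.
num_gpus = 8
per_gpu_batch_size = 32
max_seq_len = 896

# Assumption: data parallelism without gradient accumulation.
global_batch_size = num_gpus * per_gpu_batch_size

# Assumption: every sequence is filled to max_seq_len (upper bound).
max_tokens_per_step = global_batch_size * max_seq_len

print(global_batch_size)    # 256
print(max_tokens_per_step)  # 229376
```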

SFT model

Using the model

import os
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

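# Maximum number of tokens the model supports (prompt + reply).
max_length = 896
# Role markers that delimit user and model turns in the prompt.
HUMAN = '<human>'
ROBOT = '<robot>'
def build_prompt(query, history) -> str:
    """Concatenate past (user, reply) turns and the new query into one prompt."""
    texts = ''
    for user_input, response in history:
        texts += f'{HUMAN}{user_input}{ROBOT}{response}'

    # End with an open ROBOT marker so the model continues with its reply.
    texts += f'{HUMAN}{query}{ROBOT}'
    return texts

def build_cli_history(history):
    """Format the conversation for display in the terminal."""
    prompt = ''
    for query, response in history:
        prompt += f"\n\nUser:{query.strip()}"
        prompt += f"\n\nRobot:{response.strip()}"
    return prompt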


device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = AutoTokenizer.from_pretrained("zRzRzRzRzRzRzR/zR-Llama-1b-ChatGLM2-6b-tokenizer", trust_remote_code=True)
model = LlamaForCausalLM.from_pretrained("zRzRzRzRzRzRzR/zR-Llama-1b-ChatGLM2-6b-tokenizer").to(device)

history = []
clear_command = 'cls' if os.name == 'nt' else 'clear'
while True:
    query = input('\nInput: ')
    if query.strip() == "stop":
        break
    if query.strip() == "clear":
        history = []
        os.system(clear_command)
        continue

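    inputs = tokenizer.encode(build_prompt(query, history), return_tensors='pt', add_special_tokens=False).to(device)
    # Cap the total length (prompt + reply) at the model's 896-token limit;
    # without this, generate() uses its default length, which may be very short.
    output_ids = model.generate(inputs, max_length=max_length)
    # Decode only the newly generated tokens, not the echoed prompt.
    response = tokenizer.decode(output_ids[0][inputs.shape[1]:].cpu(), skip_special_tokens=True)
    # Store the turn so that the next prompt includes it.
    history.append((query, response))

    os.system(clear_command)
    print(build_cli_history(history), flush=True)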
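
The prompt format used by the script, and one possible way to keep a growing history within the 896-token limit, can be sketched without loading the model. `trim_history`, its `count_tokens` argument, and the `reserve` margin are hypothetical and not part of this repository. In practice `count_tokens` would wrap the tokenizer, for example `lambda s: len(tokenizer.encode(s, add_special_tokens=False))`; the toy run below counts characters instead, purely to show the behavior.

```python
from typing import Callable, List, Tuple

HUMAN = '<human>'
ROBOT = '<robot>'

def build_prompt(query: str, history: List[Tuple[str, str]]) -> str:
    """Same format as the script above: past turns, then the open query."""
    texts = ''.join(f'{HUMAN}{q}{ROBOT}{r}' for q, r in history)
    return texts + f'{HUMAN}{query}{ROBOT}'

def trim_history(query: str,
                 history: List[Tuple[str, str]],
                 count_tokens: Callable[[str], int],
                 max_length: int = 896,
                 reserve: int = 128) -> List[Tuple[str, str]]:
    """Hypothetical helper: drop the oldest turns until the prompt fits.

    `reserve` leaves room for the reply (an assumed margin, not a value
    from this card).
    """
    budget = max_length - reserve
    kept = list(history)
    while kept and count_tokens(build_prompt(query, kept)) > budget:
        kept.pop(0)  # discard the oldest (user, reply) pair
    return kept

history = [('Hi', 'Hello!'), ('Who are you?', 'A 1B model.')]
prompt = build_prompt('What can you do?', history)
print(prompt)
# <human>Hi<robot>Hello!<human>Who are you?<robot>A 1B model.<human>What can you do?<robot>

# Toy counter: characters instead of tokens. The full prompt is 89 characters,
# so with a budget of 80 the oldest turn is dropped.
short = trim_history('What can you do?', history, count_tokens=len,
                     max_length=80, reserve=0)
print(short)
# [('Who are you?', 'A 1B model.')]
```

Dropping whole turns from the front keeps the `<human>`/`<robot>` pairing intact; truncating the raw token sequence would be simpler but could cut a turn in half.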