Pretrained LM

Training Dataset

Prompt

  • Template:
      prompt = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}:"
    
      >>> # src_lang can be 'English', '한국어'
      >>> # tgt_lang can be '한국어', 'English'
    
    Note that the prompt must not end with a trailing space (_); otherwise the first generated token is unpredictable.
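
The template can be wrapped in a small helper so that the "no trailing space" rule is enforced in one place. This is a minimal sketch: the function name `build_prompt` is hypothetical, and it follows the template string exactly as written above.

```python
def build_prompt(src_lang: str, tgt_lang: str, src_text: str) -> str:
    """Fill the translation template; the result ends with ':' and no trailing space."""
    prompt = (
        f"Translate this from {src_lang} to {tgt_lang}\n"
        f"### {src_lang}: {src_text.strip()}\n"
        f"### {tgt_lang.strip()}:"
    )
    # The prompt must end right after the colon, with no space or newline.
    assert not prompt.endswith((" ", "\n"))
    return prompt


prompt = build_prompt("English", "한국어", "Hello, world.")
print(repr(prompt))
```

Swapping the first two arguments gives the 한국어 to English direction.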

Training

  • Trained with QLoRA
    • PLM: NormalFloat 4-bit
    • Adapter: BrainFloat 16-bit
    • Adapters applied to all linear layers (around 2.05% of the parameters are trainable)
  • Adapters merged into the base model, then upscaled to BrainFloat 16-bit precision
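
As a rough sanity check on the "around 2.05%" figure, the sketch below counts LoRA parameters when adapters are attached to every attention and MLP projection of a Llama-3-8B-shaped model. The LoRA rank of 64 is an assumption (this card does not state the rank), `lm_head` is assumed to be excluded, and the layer shapes are the standard Llama-3-8B dimensions.

```python
# Assumed values: the rank is NOT given in this card; 64 is a guess for illustration.
LORA_RANK = 64
NUM_LAYERS = 32
HIDDEN, KV_DIM, INTERMEDIATE = 4096, 1024, 14336
BASE_PARAMS = 8.03e9  # model size reported for this checkpoint

# (in_features, out_features) of each linear projection in one decoder layer
linear_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# A LoRA adapter adds two matrices per layer: A (r x in) and B (out x r).
per_layer = sum(LORA_RANK * (i + o) for i, o in linear_shapes.values())
lora_params = per_layer * NUM_LAYERS

# Share of trainable parameters among base + adapter parameters.
trainable_pct = 100 * lora_params / (BASE_PARAMS + lora_params)
print(f"{lora_params:,} adapter params, {trainable_pct:.2f}% trainable")
```

Under these assumptions the share lands close to the reported figure; a different rank or a different set of target modules would change it.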

Usage (IMPORTANT)

  • Remove the EOS token (<|end_of_text|>, id=128001) from the end of the prompt if the tokenizer appends one.
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      # MODEL
      model_name = 'traintogpb/llama-3-enko-translator-8b-qlora-bf16-upscaled'
      model = AutoModelForCausalLM.from_pretrained(
          model_name,
          max_length=768,
          attn_implementation='flash_attention_2',  # needs the flash-attn package and a CUDA GPU
          torch_dtype=torch.bfloat16,
      ).to('cuda')
    
      # TOKENIZER (loaded from the same merged checkpoint)
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      tokenizer.pad_token_id = 128002 # eos_token_id and pad_token_id should be different
      # tokenizer.add_eos_token = False # There is no 'add_eos_token' option in llama3
    
      text = "Someday, QWER will be the greatest girl band in the world."
      input_prompt = f"Translate this from English to 한국어.\n### English: {text}\n### 한국어:"
      inputs = tokenizer(input_prompt, max_length=768, truncation=True, return_tensors='pt').to(model.device)
    
      # Drop a trailing EOS token so the model continues the prompt instead of stopping
      if inputs['input_ids'][0][-1] == tokenizer.eos_token_id:
          inputs['input_ids'] = inputs['input_ids'][:, :-1]
          inputs['attention_mask'] = inputs['attention_mask'][:, :-1]
    
      outputs = model.generate(**inputs, max_length=768, eos_token_id=tokenizer.eos_token_id)
    
      # Decode only the newly generated tokens, skipping the prompt
      input_len = inputs['input_ids'].shape[1]
      translation = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True)
      print(translation)
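
The EOS-removal step can also be isolated and checked without loading the model. The sketch below works on plain Python lists of token ids; the helper name `strip_trailing_eos` is hypothetical, and the example ids other than 128001 are made up for illustration.

```python
def strip_trailing_eos(input_ids, attention_mask, eos_token_id=128001):
    """Return copies of both lists without a final EOS token, if one is present."""
    if input_ids and input_ids[-1] == eos_token_id:
        return input_ids[:-1], attention_mask[:-1]
    return list(input_ids), list(attention_mask)


# Made-up token ids; only the final 128001 (<|end_of_text|>) matters here.
ids, mask = strip_trailing_eos([128000, 11, 22, 128001], [1, 1, 1, 1])
print(ids, mask)
```

Only the final position is inspected, so an EOS id that appears earlier in the sequence is left in place.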
    

Framework versions

  • PEFT 0.8.2

Model size: 8.03B params (Safetensors, tensor type BF16)
