metadata
pipeline_tag: text-generation
license: other
license_name: llama3
license_link: LICENSE
language:
  - ko
  - en
tags:
  - meta
  - llama
  - llama-3
  - akallama
library_name: transformers

AKALLAMA

AkaLlama is a series of Korean language models designed for practical usability across a wide range of tasks. The initial model, AkaLlama-v0.1, is a fine-tuned version of Meta-Llama-3-70b-Instruct. It has been trained on a custom mix of publicly available datasets curated by the MIR Lab. Our goal is to explore cost-effective ways to adapt high-performing LLMs for specific use cases, such as different languages (e.g., Korean) or domains (e.g., organization-specific chatbots).

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

How to use

This repo provides full model weight files for AkaLlama-70B-v0.1.
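Before downloading the full weights, it can help to estimate how much memory they occupy. The sketch below is back-of-the-envelope arithmetic only: the helper `weight_memory_gib` is hypothetical, the parameter count is rounded to 70 billion (an assumption based on the model name), and activations and the KV cache are not counted.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory taken by the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: roughly 70e9 parameters, as the "70b" in the model name suggests.
approx_params = 70e9

# bfloat16 stores each parameter in 2 bytes; float32 would need 4.
bf16_gib = weight_memory_gib(approx_params, bytes_per_param=2)
fp32_gib = weight_memory_gib(approx_params, bytes_per_param=4)

print(f"bfloat16 weights: ~{bf16_gib:.0f} GiB")
print(f"float32 weights:  ~{fp32_gib:.0f} GiB")
```

Under these assumptions the bfloat16 weights come to roughly 130 GiB, so they usually have to be spread across several accelerators or partly offloaded.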

Use with transformers

See the snippet below for usage with Transformers:

import transformers
import torch

model_id = "mirlab/AkaLlama-llama3-70b-v0.1"

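# device_map="auto" (requires the accelerate package) spreads the bfloat16
# weights across the available devices; device="auto" is not a valid value.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)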

# The system prompt is kept in Korean because the model is tuned for Korean.
# English gloss: "You are AkaLlama, a large language model made by the
# multimodal lab (MIR lab) at Yonsei University. Follow these guidelines:
# 1. Always communicate in Korean unless the user asks otherwise.
# 2. Answers must not contain harmful, unethical, discriminatory, dangerous, or illegal content.
# 3. If a question makes no sense or is not factual, explain why instead of answering.
#    If you do not know the answer, do not share false information.
# 4. Answer every user question fully and comprehensively unless that violates safety or ethics."
system_prompt = """당신은 연세대학교 멀티모달 연구실 (MIR lab) 이 만든 대규모 언어 모델인 AkaLlama (아카라마) 입니다.
다음 지침을 따르세요:
1. 사용자가 별도로 요청하지 않는 한 항상 한글로 소통하세요.
2. 유해하거나 비윤리적, 차별적, 위험하거나 불법적인 내용이 답변에 포함되어서는 안 됩니다.
3. 질문이 말이 되지 않거나 사실에 부합하지 않는 경우 정답 대신 그 이유를 설명하세요. 질문에 대한 답을 모른다면 거짓 정보를 공유하지 마세요.
4. 안전이나 윤리에 위배되지 않는 한 사용자의 모든 질문에 완전하고 포괄적으로 답변하세요."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "네 이름은 뭐야?"},  # "What is your name?"
]

# Render the messages into a single prompt string using the tokenizer's chat template.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop generation at either the regular EOS token or the end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
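# Strip the prompt so that only the newly generated reply is printed.
print(outputs[0]["generated_text"][len(prompt):])
# Example output (sampling is enabled, so results vary):
# 내 이름은 AkaLlama입니다! 나는 언어 모델로, 사용자와 대화하는 데 도움을 주기 위해 만들어졌습니다. 나는 다양한 주제에 대한 질문에 답하고, 새로운 아이디어를 제공하며, 문제를 해결하는 데 도움이 될 수 있습니다. 사용자가 원하는 정보나 도움을 받도록 최선을 다할 것입니다!
# English: "My name is AkaLlama! I am a language model, built to help by conversing with users. I can answer questions on a variety of topics, offer new ideas, and help solve problems. I will do my best to make sure users get the information or help they want!"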
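To make the role of `apply_chat_template` and the `<|eot_id|>` terminator more concrete, the sketch below builds a prompt string by hand. It assumes the tokenizer ships the stock Llama 3 Instruct chat layout, which this card does not confirm; `format_llama3_prompt` is a hypothetical helper for illustration, and `apply_chat_template` remains the authoritative way to build prompts.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Hand-rolled approximation of the stock Llama 3 Instruct chat layout.

    Assumption: the model's tokenizer uses this layout unchanged.
    """
    text = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        text += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text


example_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is your name?"},
]
example_prompt = format_llama3_prompt(example_messages)
print(example_prompt)
```

Because every turn ends with `<|eot_id|>`, the model emits that token when it finishes its own turn, which is why it is passed as a stop token alongside the regular EOS token.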

Training Details

Training Procedure

We trained AkaLlama using a preference learning alignment algorithm called Odds Ratio Preference Optimization (ORPO). Our training pipeline is almost identical to that of HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1, aside from minor hyperparameter changes. See Hugging Face's Alignment Handbook for further details, including the chat template.
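For readers unfamiliar with ORPO, here is a minimal sketch of its published objective: the usual supervised loss on the preferred response plus a weighted odds-ratio penalty that pushes the preferred response above the rejected one. It works on length-normalized sequence log-probabilities passed in as plain floats. The function names and the weight `lam=0.1` are illustrative assumptions, not the code or hyperparameters used to train AkaLlama.

```python
import math


def log_odds(avg_logp: float) -> float:
    """log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """ORPO objective for one preference pair: L_SFT + lam * L_OR.

    Both inputs are per-token average log-probabilities of a full response.
    lam is a hypothetical default chosen for illustration.
    """
    # Log of the odds ratio between the chosen and the rejected response.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(z)) written as log(1 + exp(-z)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    # Supervised fine-tuning term: negative log-likelihood of the chosen response.
    sft_term = -chosen_avg_logp
    return sft_term + lam * odds_ratio_term


# The loss is lower when the chosen response is the more likely one.
print(orpo_loss(chosen_avg_logp=-0.2, rejected_avg_logp=-2.0))
print(orpo_loss(chosen_avg_logp=-2.0, rejected_avg_logp=-0.2))
```

A practical implementation would compute these log-probabilities from model logits in batches and use a numerically stable log-sigmoid; this version favors readability.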

Training Data

A detailed description of the training data will be released later.

Examples

  • Math Solving
  • Writing
  • Logical Reasoning
  • Coding

You can find more examples on our project page.

Special Thanks

  • Data Center of the Department of Artificial Intelligence at Yonsei University, for providing the computational resources