---
license: apache-2.0
model-index:
  - name: TinyWand-SFT
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 31.4
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 49.96
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 25.98
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 43.08
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 55.17
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 2.05
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT
          name: Open LLM Leaderboard
---

# TinyWand-SFT

*Korean model description*

How about an SLM of a humble 1.63B size?

๋ชจ๋ธ ์†Œ๊ฐœ

TinyWand-SFT๋Š” 1.63B์˜ SLM ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ 1.63B๋ผ๋Š” ์ž‘์€ ํฌ๊ธฐ๋ฅผ ๊ฐ€์ง์œผ๋กœ์จ ์†Œํ˜•๊ธฐ๊ธฐ์—์„œ ๊ตฌ๋™๋˜๊ฑฐ๋‚˜ ํฐ toks/s๋ฅผ ๊ฐ€์งˆ ์ˆ˜ ์žˆ์Œ๊ณผ ๋™์‹œ์— ๊ฐ•๋ ฅํ•œ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

๋ชจ๋ธ ๋ผ์ด์„ผ์Šค

apache-2.0

๋ชจ๋ธ ์„ฑ๋Šฅ

TBD

## Limitations

Due to its small size, the model may fail to respond properly after instruct fine-tuning when the prompt does not follow its template. For a specific task, fine-tuning is recommended over prompting.

For the same reason, it also scores quite low on general benchmarks.

## Training Process

TBD

## Usage

### VRAM required for inference

| Quantization | Input tokens | Output tokens | Memory usage |
|--------------|--------------|---------------|--------------|
| bf16 (base)  | 64           | 256           | 3,888 MiB    |
| q4_K_M       | 64           | 256           | 1,788 MiB    |
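As a rough sanity check on the bf16 figure above (a back-of-envelope sketch, not a measurement), the weights alone at 2 bytes per parameter account for most of that footprint; the remainder is activations, KV cache, and runtime overhead:

```python
# Back-of-envelope estimate of bf16 weight memory (illustrative only).
# 1.63B parameters * 2 bytes each, converted to MiB.
n_params = 1.63e9
bytes_per_param_bf16 = 2
weight_mib = n_params * bytes_per_param_bf16 / 2**20
print(f"~{weight_mib:,.0f} MiB of weights")  # roughly 3,100 MiB; measured total is 3,888 MiB
```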

### Prompt Template

This model uses the Alpaca prompt template.

The template can be inspected from the Hugging Face tokenizer via apply_chat_template().
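For reference, the canonical Alpaca layout looks roughly like the sketch below. This is the generic Alpaca format shown for illustration only; the model's exact template should be taken from apply_chat_template():

```python
# Generic Alpaca prompt layout (illustrative; may differ in detail from the
# model's actual chat template).
ALPACA_TEMPLATE = (
    "{system}\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(
    system=(
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
    ),
    instruction="What are the advantages of a language model with fewer parameters?",
)
print(prompt)
```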

You can load and use the model with the Python code below. transformers and torch must be installed beforehand.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # for NVIDIA GPUs

tokenizer = AutoTokenizer.from_pretrained("maywell/TinyWand-SFT")
model = AutoModelForCausalLM.from_pretrained(
    "maywell/TinyWand-SFT",
    torch_dtype=torch.bfloat16,  # switch to torch.float16 if your device does not support bfloat16
)

messages = [
    {"role": "system", "content": "Below is an instruction that describes a task. Write a response that appropriately completes the request."},  # applied the same way even when left empty
    {"role": "user", "content": "What are the advantages of a language model with fewer parameters?"},
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

## Open LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/TinyWand-SFT).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 34.61 |
| AI2 Reasoning Challenge (25-Shot) | 31.40 |
| HellaSwag (10-Shot)               | 49.96 |
| MMLU (5-Shot)                     | 25.98 |
| TruthfulQA (0-shot)               | 43.08 |
| Winogrande (5-shot)               | 55.17 |
| GSM8k (5-shot)                    |  2.05 |
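The average above is simply the arithmetic mean of the six benchmark scores, which can be verified directly:

```python
# Recompute the leaderboard average from the individual benchmark scores.
scores = {
    "ARC (25-shot)": 31.40,
    "HellaSwag (10-shot)": 49.96,
    "MMLU (5-shot)": 25.98,
    "TruthfulQA (0-shot)": 43.08,
    "Winogrande (5-shot)": 55.17,
    "GSM8k (5-shot)": 2.05,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 34.61
```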