metadata

license: apache-2.0
language:
  - en
base_model:
  - meta-llama/Llama-3.2-3B-Instruct
pipeline_tag: text-generation
tags:
  - text-generation-inference
  - unsloth
  - trl
  - sft
  - math
  - code
datasets:
  - jeggers/competition_math
library_name: transformers
model-index:
  - name: Komodo-Llama-3.2-3B-v2-fp16
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: HuggingFaceH4/ifeval
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 63.41
            name: strict accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: BBH
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 20.2
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: hendrycks/competition_math
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 6.27
            name: exact match
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 3.69
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 3.37
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 20.58
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/Komodo-Llama-3.2-3B-v2-fp16
          name: Open LLM Leaderboard

This version of Komodo is a Llama-3.2-3B-Instruct finetuned model on jeggers/competition_math dataset to increase math performance of the base model.

This model is 4bit-quantized. You should import it 8bit if you want to use 3B parameters! Make sure you installed 'bitsandbytes' library before import.

Finetune system prompt:

You are a highly intelligent and accurate mathematical assistant.
You will solve mathematical problems step by step, explain your reasoning clearly, and provide concise, correct answers.
When the solution requires multiple steps, detail each step systematically.

You can use ChatML format!

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	19.59
IFEval (0-Shot)	63.41
BBH (3-Shot)	20.20
MATH Lvl 5 (4-Shot)	6.27
GPQA (0-shot)	3.69
MuSR (0-shot)	3.37
MMLU-PRO (5-shot)	20.58