---
base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
datasets:
  - microsoft/orca-agentinstruct-1M-v1
pipeline_tag: text-generation
library_name: transformers
license: llama3.2
tags:
  - unsloth
  - transformers
model-index:
  - name: analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
    results:
      - task:
          type: text-generation
        dataset:
          type: lm-evaluation-harness
          name: hellaswag
        metrics:
          - name: acc
            type: acc
            value: 0.5141
            verified: false
          - name: acc_norm
            type: acc_norm
            value: 0.6793
            verified: false
---


eval

| Test | Base Model | Fine-Tuned Model | Performance Gain |
|---|---|---|---|
| leaderboard_bbh_logical_deduction_seven_objects | 0.252 | 0.436 | 0.184 |
| leaderboard_bbh_logical_deduction_five_objects | 0.356 | 0.456 | 0.100 |
| leaderboard_musr_team_allocation | 0.22 | 0.32 | 0.100 |
| leaderboard_bbh_disambiguation_qa | 0.304 | 0.376 | 0.072 |
| leaderboard_gpqa_diamond | 0.2222 | 0.2727 | 0.0505 |
| leaderboard_bbh_movie_recommendation | 0.596 | 0.636 | 0.040 |
| leaderboard_bbh_formal_fallacies | 0.508 | 0.54 | 0.032 |
| leaderboard_bbh_tracking_shuffled_objects_three_objects | 0.316 | 0.344 | 0.028 |
| leaderboard_bbh_causal_judgement | 0.5455 | 0.5668 | 0.0214 |
| leaderboard_bbh_web_of_lies | 0.496 | 0.516 | 0.020 |
| leaderboard_math_geometry_hard | 0.0455 | 0.0606 | 0.0152 |
| leaderboard_math_num_theory_hard | 0.0519 | 0.0649 | 0.0130 |
| leaderboard_musr_murder_mysteries | 0.528 | 0.54 | 0.012 |
| leaderboard_gpqa_extended | 0.2711 | 0.2802 | 0.0092 |
| leaderboard_bbh_sports_understanding | 0.596 | 0.604 | 0.008 |
| leaderboard_math_intermediate_algebra_hard | 0.0107 | 0.0143 | 0.0036 |
| leaderboard_bbh_navigate | 0.62 | 0.62 | 0.000 |
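
The Performance Gain column appears to be the fine-tuned score minus the base score (an absolute difference, not a relative one). Plain float subtraction can produce values such as 0.10000000000000003 instead of 0.1, so the sketch below rounds the result. The helper name `performance_gain` and the choice of four decimal places are assumptions made for illustration; the scores are copied from three rows of the table.

```python
# (base, fine-tuned) scores copied from three rows of the eval table.
scores = {
    "leaderboard_bbh_logical_deduction_seven_objects": (0.252, 0.436),
    "leaderboard_bbh_logical_deduction_five_objects": (0.356, 0.456),
    "leaderboard_bbh_navigate": (0.62, 0.62),
}


def performance_gain(base: float, tuned: float, ndigits: int = 4) -> float:
    """Hypothetical helper: absolute score difference, rounded to hide float noise."""
    return round(tuned - base, ndigits)


gains = {task: performance_gain(base, tuned) for task, (base, tuned) in scores.items()}

# Print tasks from largest to smallest gain, matching the table's ordering.
for task, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task}: {gain:+.4f}")
```

Rounding only affects how the difference is displayed; the underlying scores are unchanged.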

Framework versions

  • unsloth 2024.11.5
  • trl 0.12.0

Training hardware

  • V100