metadata
base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
datasets:
- microsoft/orca-agentinstruct-1M-v1
pipeline_tag: text-generation
library_name: transformers
license: llama3.2
tags:
- unsloth
- transformers
model-index:
- name: analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
  results:
  - task:
      type: text-generation
    dataset:
      type: lm-evaluation-harness
      name: hellaswag
    metrics:
    - name: acc
      type: acc
      value: 0.5141
      verified: false
    - name: acc_norm
      type: acc_norm
      value: 0.6793
      verified: false
Evaluation
| Test | Base Model | Fine-Tuned Model | Performance Gain |
|---|---|---|---|
| leaderboard_bbh_logical_deduction_seven_objects | 0.252 | 0.436 | 0.184 |
| leaderboard_bbh_logical_deduction_five_objects | 0.356 | 0.456 | 0.100 |
| leaderboard_musr_team_allocation | 0.22 | 0.32 | 0.100 |
| leaderboard_bbh_disambiguation_qa | 0.304 | 0.376 | 0.072 |
| leaderboard_gpqa_diamond | 0.2222 | 0.2727 | 0.0505 |
| leaderboard_bbh_movie_recommendation | 0.596 | 0.636 | 0.040 |
| leaderboard_bbh_formal_fallacies | 0.508 | 0.54 | 0.032 |
| leaderboard_bbh_tracking_shuffled_objects_three_objects | 0.316 | 0.344 | 0.028 |
| leaderboard_bbh_causal_judgement | 0.5455 | 0.5668 | 0.0214 |
| leaderboard_bbh_web_of_lies | 0.496 | 0.516 | 0.020 |
| leaderboard_math_geometry_hard | 0.0455 | 0.0606 | 0.0152 |
| leaderboard_math_num_theory_hard | 0.0519 | 0.0649 | 0.0130 |
| leaderboard_musr_murder_mysteries | 0.528 | 0.54 | 0.012 |
| leaderboard_gpqa_extended | 0.2711 | 0.2802 | 0.0092 |
| leaderboard_bbh_sports_understanding | 0.596 | 0.604 | 0.008 |
| leaderboard_math_intermediate_algebra_hard | 0.0107 | 0.0143 | 0.0036 |
| leaderboard_bbh_navigate | 0.62 | 0.62 | 0.000 |
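The Performance Gain column appears to be the absolute difference between the fine-tuned and base scores (fine-tuned minus base). The sketch below recomputes it for a few rows copied from the table. The `scores` dictionary and the `absolute_gain` helper are illustrative names, not part of any evaluation tooling, and decimal arithmetic is used here only as one way to avoid binary floating-point noise such as `0.10000000000000003`.

```python
from decimal import Decimal

# (base, fine-tuned) scores for a few rows, copied from the table above.
# Stored as strings so Decimal parses them exactly.
scores = {
    "leaderboard_bbh_logical_deduction_seven_objects": ("0.252", "0.436"),
    "leaderboard_bbh_logical_deduction_five_objects": ("0.356", "0.456"),
    "leaderboard_bbh_disambiguation_qa": ("0.304", "0.376"),
    "leaderboard_bbh_navigate": ("0.62", "0.62"),
}


def absolute_gain(base: str, tuned: str) -> Decimal:
    """Return fine-tuned minus base, computed in decimal arithmetic."""
    return Decimal(tuned) - Decimal(base)


gains = {task: absolute_gain(base, tuned) for task, (base, tuned) in scores.items()}

# Print the rows sorted by gain, largest first, matching the table's ordering.
for task, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task} | {gain:.3f}")
```

Because these are absolute differences, a gain of 0.184 means 18.4 percentage points, not an 18.4% relative improvement.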
Framework versions
- unsloth 2024.11.5
- trl 0.12.0
Training hardware
- V100