---
base_model: unsloth/gemma-2-9b
library_name: peft
license: gemma
tags:
  - unsloth
  - generated_from_trainer
model-index:
  - name: gemma-2-9b_metamath_ortho
    results: []
---

gemma-2-9b_metamath_ortho

This model is a fine-tuned version of unsloth/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 8.8949
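
Since this repository contains a PEFT (LoRA) adapter rather than full model weights, it has to be loaded on top of the base model. Below is a minimal loading sketch, not taken from the training code; the adapter repository id `imdatta0/gemma-2-9b_metamath_ortho` is inferred from the model name and may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the adapter was trained from.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-2-9b",
    torch_dtype=torch.bfloat16,  # assumption; the card does not state a precision
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-2-9b")

# Attach the fine-tuned adapter weights. The repo id below is an
# assumption inferred from the adapter name.
model = PeftModel.from_pretrained(base, "imdatta0/gemma-2-9b_metamath_ortho")

inputs = tokenizer("What is 12 * 17?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```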

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to `TrainingArguments` follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1
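
As referenced above, here is a minimal sketch, an assumption rather than the original training script, of how these values map onto `transformers.TrainingArguments`. The total train batch size of 64 follows from 2 per-device samples × 32 gradient-accumulation steps.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training configuration listed above.
args = TrainingArguments(
    output_dir="gemma-2-9b_metamath_ortho",  # hypothetical output path
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=32,  # 2 * 32 = total train batch size of 64
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    seed=42,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 match the optimizer defaults.
)
```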

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7552        | 0.0211 | 13   | 11.3000         |
| 11.6101       | 0.0421 | 26   | 11.6207         |
| 11.2097       | 0.0632 | 39   | 11.2115         |
| 10.6416       | 0.0843 | 52   | 11.2629         |
| 11.087        | 0.1053 | 65   | 11.0208         |
| 10.911        | 0.1264 | 78   | 11.1743         |
| 10.3864       | 0.1474 | 91   | 9.9812          |
| 10.0221       | 0.1685 | 104  | 9.7955          |
| 10.1644       | 0.1896 | 117  | 10.3842         |
| 10.6605       | 0.2106 | 130  | 10.8225         |
| 10.6448       | 0.2317 | 143  | 10.6978         |
| 10.6678       | 0.2528 | 156  | 10.5545         |
| 10.3694       | 0.2738 | 169  | 9.2694          |
| 11.5024       | 0.2949 | 182  | 11.9171         |
| 11.731        | 0.3159 | 195  | 11.1801         |
| 11.5971       | 0.3370 | 208  | 10.8633         |
| 11.0785       | 0.3581 | 221  | 10.8301         |
| 10.5228       | 0.3791 | 234  | 10.5660         |
| 10.4021       | 0.4002 | 247  | 10.2953         |
| 10.2801       | 0.4213 | 260  | 10.1140         |
| 11.5045       | 0.4423 | 273  | 11.9247         |
| 11.7387       | 0.4634 | 286  | 11.5647         |
| 11.6365       | 0.4845 | 299  | 11.5171         |
| 10.8443       | 0.5055 | 312  | 9.7961          |
| 11.0662       | 0.5266 | 325  | 10.9008         |
| 10.9411       | 0.5476 | 338  | 11.0820         |
| 11.2701       | 0.5687 | 351  | 11.4640         |
| 10.8777       | 0.5898 | 364  | 9.6432          |
| 10.3239       | 0.6108 | 377  | 10.6138         |
| 10.8646       | 0.6319 | 390  | 10.7791         |
| 11.2221       | 0.6530 | 403  | 11.1868         |
| 11.0328       | 0.6740 | 416  | 10.8502         |
| 10.8706       | 0.6951 | 429  | 10.7639         |
| 10.9227       | 0.7162 | 442  | 10.7393         |
| 10.3582       | 0.7372 | 455  | 8.8997          |
| 9.4372        | 0.7583 | 468  | 9.9192          |
| 9.5329        | 0.7793 | 481  | 9.3924          |
| 9.6104        | 0.8004 | 494  | 9.3153          |
| 9.3477        | 0.8215 | 507  | 9.2830          |
| 9.7923        | 0.8425 | 520  | 8.8935          |
| 8.8863        | 0.8636 | 533  | 9.5862          |
| 8.9213        | 0.8847 | 546  | 8.4786          |
| 8.5736        | 0.9057 | 559  | 9.8808          |
| 9.5766        | 0.9268 | 572  | 8.6842          |
| 8.6045        | 0.9478 | 585  | 8.7064          |
| 8.6253        | 0.9689 | 598  | 8.8377          |
| 8.8052        | 0.9900 | 611  | 8.8949          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
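
To reproduce this environment, the versions above can be pinned directly. A suggested requirements file, assuming a pip-based setup:

```text
# requirements.txt sketch pinning the versions listed above
peft==0.12.0
transformers==4.44.0
torch==2.4.0        # the card lists the +cu121 (CUDA 12.1) build
datasets==2.20.0
tokenizers==0.19.1
```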