metadata
base_model: unsloth/gemma-2-9b
library_name: peft
license: gemma
tags:
  - unsloth
  - generated_from_trainer
model-index:
  - name: gemma-2-9b_metamath_reverse
    results: []

gemma-2-9b_metamath_reverse

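This model is a fine-tuned version of unsloth/gemma-2-9b, trained with PEFT; the training dataset is not recorded in this card. At the last logged step (step 611, epoch 0.99) it achieves the following result on the evaluation set:

  • Loss: 10.7771

This is far above the validation loss of 0.9526 logged at step 13. The training log shows the loss rising sharply within the first 100 steps and staying above 10 for the rest of the epoch, which suggests that the run diverged.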

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1
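The relationship between these values can be checked with a small sketch. It reproduces the total batch size from the per-device batch size and the accumulation steps, and evaluates a cosine schedule with linear warmup at a few steps. Several things here are assumptions rather than facts from the card: a single device (implied by 2 × 32 = 64), a total of about 617 optimizer steps (estimated from step 611 falling at epoch 0.99 in the results table), warmup steps rounded up, and a schedule that decays to zero.

```python
import math

# Values taken from the hyperparameter list
PER_DEVICE_BATCH = 2       # train_batch_size
GRAD_ACCUM_STEPS = 32      # gradient_accumulation_steps
PEAK_LR = 3e-4             # learning_rate
WARMUP_RATIO = 0.02        # lr_scheduler_warmup_ratio

# Assumptions (not stated in the card)
NUM_DEVICES = 1            # implied by total_train_batch_size = 64
TOTAL_STEPS = 617          # estimate: step 611 is logged at epoch 0.99

# Sequences consumed per optimizer step
EFFECTIVE_BATCH = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Assumption: the warmup length is the ratio times total steps, rounded up
WARMUP_STEPS = math.ceil(WARMUP_RATIO * TOTAL_STEPS)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", EFFECTIVE_BATCH)
    print("warmup steps:", WARMUP_STEPS)
    for step in (0, 13, 78, 300, 611):
        print(step, f"{lr_at(step):.6f}")
```

Under these assumptions the learning rate reaches its peak of 0.0003 at about step 13, which is the first row logged in the training results.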

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.7164 | 0.0211 | 13 | 0.9526 |
| 1.4875 | 0.0421 | 26 | 1.7775 |
| 1.8093 | 0.0632 | 39 | 2.4866 |
| 2.9457 | 0.0843 | 52 | 3.1388 |
| 4.2957 | 0.1053 | 65 | 4.3648 |
| 7.9729 | 0.1264 | 78 | 11.0147 |
| 10.8099 | 0.1474 | 91 | 9.4330 |
| 9.8556 | 0.1685 | 104 | 11.2095 |
| 11.4575 | 0.1896 | 117 | 11.8836 |
| 11.9399 | 0.2106 | 130 | 11.9231 |
| 11.9626 | 0.2317 | 143 | 11.9768 |
| 11.9547 | 0.2528 | 156 | 11.8762 |
| 11.9008 | 0.2738 | 169 | 11.9031 |
| 11.8209 | 0.2949 | 182 | 11.7070 |
| 11.7717 | 0.3159 | 195 | 11.8161 |
| 11.7063 | 0.3370 | 208 | 11.6304 |
| 11.5787 | 0.3581 | 221 | 11.7282 |
| 11.6212 | 0.3791 | 234 | 11.4066 |
| 11.4214 | 0.4002 | 247 | 11.2306 |
| 11.397 | 0.4213 | 260 | 11.3492 |
| 11.5241 | 0.4423 | 273 | 11.6393 |
| 11.5238 | 0.4634 | 286 | 11.2219 |
| 11.3261 | 0.4845 | 299 | 11.2667 |
| 11.3066 | 0.5055 | 312 | 11.2729 |
| 11.227 | 0.5266 | 325 | 11.0665 |
| 11.2074 | 0.5476 | 338 | 11.1924 |
| 11.0554 | 0.5687 | 351 | 11.0311 |
| 11.0567 | 0.5898 | 364 | 11.1885 |
| 11.1251 | 0.6108 | 377 | 10.8923 |
| 11.1682 | 0.6319 | 390 | 11.0041 |
| 10.9569 | 0.6530 | 403 | 11.0336 |
| 10.9747 | 0.6740 | 416 | 10.7973 |
| 10.9086 | 0.6951 | 429 | 10.8775 |
| 10.9555 | 0.7162 | 442 | 11.0885 |
| 10.8633 | 0.7372 | 455 | 10.9284 |
| 10.9128 | 0.7583 | 468 | 11.0310 |
| 10.9266 | 0.7793 | 481 | 11.0151 |
| 10.8317 | 0.8004 | 494 | 10.8168 |
| 10.7392 | 0.8215 | 507 | 10.8803 |
| 10.7123 | 0.8425 | 520 | 10.7858 |
| 10.8527 | 0.8636 | 533 | 10.8239 |
| 10.8007 | 0.8847 | 546 | 10.7503 |
| 10.7274 | 0.9057 | 559 | 10.7407 |
| 10.7662 | 0.9268 | 572 | 10.7765 |
| 10.7403 | 0.9478 | 585 | 10.7477 |
| 10.7315 | 0.9689 | 598 | 10.7644 |
| 10.7675 | 0.9900 | 611 | 10.7771 |
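The trend in these results can be summarized with a short sketch that works on a handful of rows copied from the table. It finds the row with the lowest validation loss, the first row where validation loss more than doubles relative to the previous logged row, and the row with the highest validation loss. The helper names and the doubling threshold are illustrative choices, not part of the original training setup. The uniform-guess reference assumes a vocabulary of 256,000 tokens for Gemma 2, which is not stated in this card.

```python
import math

# (step, training_loss, validation_loss): a subset of rows from the results table
ROWS = [
    (13, 0.7164, 0.9526),
    (26, 1.4875, 1.7775),
    (39, 1.8093, 2.4866),
    (52, 2.9457, 3.1388),
    (65, 4.2957, 4.3648),
    (78, 7.9729, 11.0147),
    (91, 10.8099, 9.4330),
    (143, 11.9626, 11.9768),
    (611, 10.7675, 10.7771),
]

# Assumption: Gemma 2 uses a 256,000-token vocabulary (not stated in the card).
# Cross-entropy of a uniform guess over that vocabulary, in nats.
UNIFORM_LOSS = math.log(256_000)


def best_checkpoint(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def worst_checkpoint(rows):
    """Row with the highest validation loss."""
    return max(rows, key=lambda row: row[2])


def first_jump(rows, factor=2.0):
    """First row whose validation loss exceeds `factor` times the previous row's."""
    for previous, current in zip(rows, rows[1:]):
        if current[2] > factor * previous[2]:
            return current
    return None


if __name__ == "__main__":
    print("lowest validation loss:", best_checkpoint(ROWS))
    print("first jump (>2x):", first_jump(ROWS))
    print("highest validation loss:", worst_checkpoint(ROWS))
    print(f"uniform-guess loss: {UNIFORM_LOSS:.2f}")
```

On these rows the lowest validation loss is at step 13 and the first jump is at step 78. Under the vocabulary assumption, the plateau near 11.9 sits close to the loss of a uniform guess, which would be consistent with a model that has stopped producing useful predictions.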

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
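
To compare a local environment against the versions listed above, something like the following sketch could be used. It relies only on the standard library. The mapping from the listed names to installable distribution names (for example "Pytorch" to `torch`) and the helper names `installed_version` and `compare_versions` are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed distribution name
EXPECTED = {
    "peft": "0.12.0",
    "transformers": "4.44.0",
    "torch": "2.4.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package to (wanted, found, status)."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "match"
        else:
            status = "differs"
        report[package] = (wanted, found, status)
    return report


if __name__ == "__main__":
    for package, (wanted, found, status) in compare_versions(EXPECTED).items():
        print(f"{package}: wanted {wanted}, found {found} ({status})")
```

A "differs" status does not necessarily mean the adapter will fail to load; it only flags where the environment departs from the one used for training.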