collapse_gemma-2-9b_hs2_replace_iter3_sftsd1

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3331
  • Number of input tokens seen: 4,572,424
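The card gives no usage instructions, so here is a minimal loading sketch. It assumes the standard `transformers` Auto-class API and takes the repository id from this card's title; it is not an official example. Imports are deferred into the function so the sketch can be read and checked without the heavy dependencies installed.

```python
# Repo id taken from this card's title (an assumption about where the
# checkpoint is hosted; verify against the Hub page).
MODEL_ID = "RylanSchaeffer/collapse_gemma-2-9b_hs2_replace_iter3_sftsd1"

def load(model_id: str = MODEL_ID):
    """Load the tokenizer and model in bf16, the checkpoint's tensor type."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )
    return tokenizer, model
```

Typical usage would then be `tokenizer, model = load()` followed by the usual `tokenizer(...)` / `model.generate(...)` calls; at 9.24B parameters in bf16, expect roughly 18+ GB of accelerator memory for inference.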

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
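The listed `total_train_batch_size` follows from the other hyperparameters: with a per-device batch of 4 and 32 gradient-accumulation steps (and, an assumption, a single device), each optimizer step sees 4 × 32 = 128 examples. The warmup length implied by `lr_scheduler_warmup_ratio` can likewise be back-computed from the results table below, where step 95 falls at epoch 0.9719:

```python
import math

train_batch_size = 4              # per-device micro-batch
gradient_accumulation_steps = 32
num_devices = 1                   # assumption: total_train_batch_size implies one device

# Effective examples per optimizer step; matches total_train_batch_size.
effective_batch = train_batch_size * gradient_accumulation_steps * num_devices
assert effective_batch == 128

# Total optimizer steps for one epoch, inferred from the results table
# (step 95 corresponds to epoch 0.9719), then warmup = 5% of that.
total_steps = math.ceil(95 / 0.9719)          # ~98 steps
warmup_steps = math.ceil(total_steps * 0.05)  # ~5 warmup steps
print(effective_batch, total_steps, warmup_steps)
```

This suggests the constant learning rate of 8e-06 was reached after roughly 5 optimizer steps, i.e. almost immediately in a 98-step run.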

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.2335          | 0                 |
| 1.2166        | 0.0512 | 5    | 1.0814          | 235576            |
| 0.6049        | 0.1023 | 10   | 1.1325          | 478200            |
| 0.2359        | 0.1535 | 15   | 1.2187          | 716416            |
| 0.115         | 0.2046 | 20   | 1.2821          | 943664            |
| 0.0493        | 0.2558 | 25   | 1.3910          | 1179772           |
| 0.0383        | 0.3069 | 30   | 1.3572          | 1412684           |
| 0.0416        | 0.3581 | 35   | 1.3296          | 1649992           |
| 0.0251        | 0.4092 | 40   | 1.2853          | 1888964           |
| 0.0376        | 0.4604 | 45   | 1.2453          | 2120356           |
| 0.0332        | 0.5115 | 50   | 1.2222          | 2357188           |
| 0.034         | 0.5627 | 55   | 1.2430          | 2594152           |
| 0.0258        | 0.6138 | 60   | 1.2519          | 2829660           |
| 0.0319        | 0.6650 | 65   | 1.2651          | 3059288           |
| 0.0302        | 0.7161 | 70   | 1.2408          | 3294592           |
| 0.028         | 0.7673 | 75   | 1.2606          | 3531876           |
| 0.0313        | 0.8184 | 80   | 1.3008          | 3770028           |
| 0.0247        | 0.8696 | 85   | 1.3242          | 4003268           |
| 0.0277        | 0.9207 | 90   | 1.3376          | 4246612           |
| 0.0299        | 0.9719 | 95   | 1.3401          | 4482512           |

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
