
zephyr-7b-sft-lora-accum4-lr5e_6

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0318
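
Usage is not documented on this card. The repository name suggests a LoRA adapter produced by supervised fine-tuning (SFT), so here is a minimal loading sketch, assuming the repo hosts a PEFT adapter whose config points at mistralai/Mistral-7B-v0.1; the prompt and generation settings are illustrative, not taken from the card.

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Assumption: the repo contains a PEFT LoRA adapter. The base weights are
# resolved from the base model referenced in the adapter's config.
model_id = "shkang/zephyr-7b-sft-lora-accum4-lr5e_6"
model = AutoPeftModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
# If the adapter repo ships no tokenizer, load it from mistralai/Mistral-7B-v0.1 instead.
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Explain LoRA fine-tuning in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```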

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 50.0
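
For reference, a minimal sketch of how these settings map onto transformers.TrainingArguments (Transformers 4.35); the output directory is illustrative, and anything not listed above is left at its default:

```python
from transformers import TrainingArguments

# Adam betas (0.9, 0.999) and epsilon 1e-08 match the TrainingArguments defaults.
args = TrainingArguments(
    output_dir="zephyr-7b-sft-lora-accum4-lr5e_6",  # illustrative path
    learning_rate=5e-6,
    per_device_train_batch_size=4,  # x 2 GPUs x 4 accumulation steps = 32 effective
    per_device_eval_batch_size=8,   # x 2 GPUs = 16 effective
    gradient_accumulation_steps=4,
    num_train_epochs=50.0,
    lr_scheduler_type="cosine",
    seed=42,
)
```

Launched across two processes (e.g. with torchrun --nproc_per_node=2 or accelerate launch), this reproduces the effective train batch size of 32 and eval batch size of 16 listed above.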

Training results

Training Loss | Epoch | Step | Validation Loss
2.0685        |  0.55 |   13 | 2.0233
1.9972        |  1.57 |   27 | 1.9438
1.9109        |  2.55 |   40 | 1.8720
1.8583        |  3.57 |   54 | 1.8096
1.7903        |  4.55 |   67 | 1.7596
1.7454        |  5.57 |   81 | 1.7169
1.7146        |  6.55 |   94 | 1.6789
1.6737        |  7.57 |  108 | 1.6395
1.6289        |  8.55 |  121 | 1.6056
1.5934        |  9.57 |  135 | 1.5665
1.5650        | 10.55 |  148 | 1.5258
1.5190        | 11.57 |  162 | 1.4776
1.4593        | 12.55 |  175 | 1.4281
1.4156        | 13.57 |  189 | 1.3676
1.3512        | 14.55 |  202 | 1.3222
1.3146        | 15.57 |  216 | 1.2825
1.2798        | 16.55 |  229 | 1.2492
1.2532        | 17.57 |  243 | 1.2225
1.2277        | 18.55 |  256 | 1.2033
1.2080        | 19.57 |  270 | 1.1832
1.1944        | 20.55 |  283 | 1.1732
1.1799        | 21.57 |  297 | 1.1586
1.1621        | 22.55 |  310 | 1.1494
1.1448        | 23.57 |  324 | 1.1393
1.1564        | 24.55 |  337 | 1.1301
1.1293        | 25.57 |  351 | 1.1233
1.1228        | 26.55 |  364 | 1.1160
1.1266        | 27.57 |  378 | 1.1106
1.1159        | 28.55 |  391 | 1.1047
1.1250        | 29.57 |  405 | 1.0989
1.0940        | 30.55 |  418 | 1.0941
1.1077        | 31.57 |  432 | 1.0903
1.0874        | 32.55 |  445 | 1.0834
1.0957        | 33.57 |  459 | 1.0769
1.0755        | 34.55 |  472 | 1.0750
1.0705        | 35.57 |  486 | 1.0719
1.0749        | 36.55 |  499 | 1.0688
1.0742        | 37.57 |  513 | 1.0641
1.0640        | 38.55 |  526 | 1.0626
1.0569        | 39.57 |  540 | 1.0593
1.0686        | 40.55 |  553 | 1.0554
1.0590        | 41.57 |  567 | 1.0517
1.0588        | 42.55 |  580 | 1.0469
1.0447        | 43.57 |  594 | 1.0471
1.0356        | 44.55 |  607 | 1.0421
1.0431        | 45.57 |  621 | 1.0416
1.0195        | 46.55 |  634 | 1.0387
1.0326        | 47.57 |  648 | 1.0356
1.0227        | 48.55 |  661 | 1.0341
1.0403        | 49.57 |  675 | 1.0317

Framework versions

  • Transformers 4.35.0
  • PyTorch 2.1.0
  • Datasets 2.14.6
  • Tokenizers 0.14.1
