lc_full_packing

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on the generator dataset. It achieves the following results on the evaluation set at the final checkpoint (epoch 50, step 4000):

  • Loss: 1.6843

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 50
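
As a rough illustration of what a cosine schedule does to the listed learning rate, the sketch below evaluates a standard half-cosine decay from 2e-05 down to zero. The total of 4000 steps is taken from the last row of the training results; the absence of warmup is an assumption, since the card reports no warmup setting, and the function name is hypothetical.

```python
import math

BASE_LR = 2e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 4000   # last step in the training results (50 epochs x 80 steps)
WARMUP_STEPS = 0     # assumption: the card does not report any warmup


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup plus half-cosine decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup (unused when warmup is 0).
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the rate at the end of a few epochs (80 steps per epoch).
    for epoch in (0, 6, 25, 50):
        print(epoch, cosine_lr(epoch * 80))
```

Under these assumptions the rate is still close to its peak around epoch 6 and reaches half of the peak at epoch 25.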

Training results

Training Loss   Epoch   Step   Validation Loss
1.658           1.0     80     1.6224
1.5504          2.0     160    1.5658
1.5401          3.0     240    1.5452
1.5012          4.0     320    1.5356
1.5873          5.0     400    1.5318
1.4235          6.0     480    1.5295
1.4938          7.0     560    1.5312
1.4809          8.0     640    1.5340
1.3717          9.0     720    1.5361
1.5824          10.0    800    1.5430
1.4478          11.0    880    1.5456
1.4124          12.0    960    1.5535
1.3804          13.0    1040   1.5635
1.3416          14.0    1120   1.5694
1.3678          15.0    1200   1.5820
1.3439          16.0    1280   1.5922
1.2532          17.0    1360   1.6031
1.2628          18.0    1440   1.5963
1.4167          19.0    1520   1.6095
1.2727          20.0    1600   1.6200
1.3244          21.0    1680   1.6197
1.2597          22.0    1760   1.6325
1.2578          23.0    1840   1.6415
1.3411          24.0    1920   1.6453
1.2795          25.0    2000   1.6495
1.2928          26.0    2080   1.6509
1.2235          27.0    2160   1.6586
1.2335          28.0    2240   1.6604
1.1769          29.0    2320   1.6701
1.2284          30.0    2400   1.6681
1.2416          31.0    2480   1.6704
1.3158          32.0    2560   1.6737
1.2734          33.0    2640   1.6806
1.2803          34.0    2720   1.6815
1.1976          35.0    2800   1.6803
1.2457          36.0    2880   1.6801
1.2039          37.0    2960   1.6831
1.1931          38.0    3040   1.6824
1.2337          39.0    3120   1.6841
1.2167          40.0    3200   1.6833
1.1514          41.0    3280   1.6847
1.2817          42.0    3360   1.6841
1.1658          43.0    3440   1.6837
1.2635          44.0    3520   1.6841
1.0984          45.0    3600   1.6842
1.2229          46.0    3680   1.6843
1.26            47.0    3760   1.6840
1.1621          48.0    3840   1.6844
1.2998          49.0    3920   1.6848
1.2054          50.0    4000   1.6843
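
The validation losses above can be checked for their minimum with a few lines of Python. This is only a sketch over the values copied from the table; the variable names are illustrative. It shows that validation loss bottoms out early and then rises while training loss keeps falling, which is the usual signature of overfitting.

```python
# Validation loss per epoch (epochs 1..50), copied from the results table.
VAL_LOSS = [
    1.6224, 1.5658, 1.5452, 1.5356, 1.5318, 1.5295, 1.5312, 1.5340, 1.5361, 1.5430,
    1.5456, 1.5535, 1.5635, 1.5694, 1.5820, 1.5922, 1.6031, 1.5963, 1.6095, 1.6200,
    1.6197, 1.6325, 1.6415, 1.6453, 1.6495, 1.6509, 1.6586, 1.6604, 1.6701, 1.6681,
    1.6704, 1.6737, 1.6806, 1.6815, 1.6803, 1.6801, 1.6831, 1.6824, 1.6841, 1.6833,
    1.6847, 1.6841, 1.6837, 1.6841, 1.6842, 1.6843, 1.6840, 1.6844, 1.6848, 1.6843,
]
STEPS_PER_EPOCH = 80  # step column advances by 80 per epoch

# Index of the lowest validation loss; epochs are 1-based.
best_index = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__)
best_epoch = best_index + 1
best_loss = VAL_LOSS[best_index]
best_step = best_epoch * STEPS_PER_EPOCH

final_loss = VAL_LOSS[-1]
gap = round(final_loss - best_loss, 4)

if __name__ == "__main__":
    print(f"best epoch: {best_epoch} (step {best_step}), loss {best_loss}")
    print(f"final loss: {final_loss}, increase over best: {gap}")
```

On these numbers the best checkpoint is epoch 6 (step 480) with a validation loss of 1.5295, about 0.15 lower than the final epoch's 1.6843.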

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.1.0+cu118
  • Datasets 2.19.2
  • Tokenizers 0.19.1
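
Because PEFT appears in the framework versions, the weights are presumably published as an adapter on top of the base model. A minimal loading sketch follows, assuming the adapter lives at the Hub repo id `nerottt/lc_full_packing` and that `transformers` and `peft` are installed; the helper name `load_adapter` is hypothetical, and the card itself does not document a loading procedure.

```python
BASE_MODEL_ID = "mistralai/Mistral-7B-v0.1"  # base model named in the card
ADAPTER_ID = "nerottt/lc_full_packing"       # assumption: Hub repo id of this adapter


def load_adapter(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy
    # dependencies; calling the function downloads the 7B base model.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `load_adapter()` returns the wrapped model and the base tokenizer; dtype and device placement are left at library defaults here and would likely need tuning for a 7B model.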