
cls_alldata_phi3_v1

This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4956
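
Since this repository is a PEFT adapter trained on top of the base model (see Framework versions below), here is a minimal loading sketch. It assumes the adapter weights live at `Sorour/cls_alldata_phi3_v1` (the repo id of this card) and that a GPU with bfloat16 support is available; adjust the dtype or device map for your hardware:

```python
# Minimal sketch: load the base model, then apply this PEFT adapter on top.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Adapter repo id taken from this card; change it if the weights live elsewhere.
model = PeftModel.from_pretrained(base_model, "Sorour/cls_alldata_phi3_v1")
```

From here, `model.generate(...)` works as with any causal language model.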

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2
  • mixed_precision_training: Native AMP
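
As a hedged sketch, the settings above map onto a `transformers.TrainingArguments` roughly as follows. The `output_dir` is a hypothetical placeholder, and `fp16=True` stands in for the Native AMP setting; the Adam betas and epsilon listed above match the Trainer defaults but are spelled out for completeness:

```python
from transformers import TrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="cls_alldata_phi3_v1",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 2 * 4 = total train batch size of 8
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=2,
    fp16=True,  # Native AMP mixed-precision training
)
```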

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7566        | 0.0559 | 20   | 0.7643          |
| 0.6863        | 0.1117 | 40   | 0.7089          |
| 0.6538        | 0.1676 | 60   | 0.6706          |
| 0.6261        | 0.2235 | 80   | 0.6499          |
| 0.6402        | 0.2793 | 100  | 0.6321          |
| 0.594         | 0.3352 | 120  | 0.6226          |
| 0.5956        | 0.3911 | 140  | 0.6121          |
| 0.5743        | 0.4469 | 160  | 0.6016          |
| 0.5494        | 0.5028 | 180  | 0.5903          |
| 0.5861        | 0.5587 | 200  | 0.5887          |
| 0.5431        | 0.6145 | 220  | 0.5801          |
| 0.5404        | 0.6704 | 240  | 0.5746          |
| 0.5401        | 0.7263 | 260  | 0.5695          |
| 0.5363        | 0.7821 | 280  | 0.5644          |
| 0.5534        | 0.8380 | 300  | 0.5608          |
| 0.5936        | 0.8939 | 320  | 0.5552          |
| 0.5139        | 0.9497 | 340  | 0.5496          |
| 0.5096        | 1.0056 | 360  | 0.5468          |
| 0.4891        | 1.0615 | 380  | 0.5468          |
| 0.4524        | 1.1173 | 400  | 0.5433          |
| 0.4568        | 1.1732 | 420  | 0.5397          |
| 0.4462        | 1.2291 | 440  | 0.5374          |
| 0.4605        | 1.2849 | 460  | 0.5337          |
| 0.4469        | 1.3408 | 480  | 0.5328          |
| 0.458         | 1.3966 | 500  | 0.5313          |
| 0.4378        | 1.4525 | 520  | 0.5250          |
| 0.4654        | 1.5084 | 540  | 0.5232          |
| 0.4563        | 1.5642 | 560  | 0.5200          |
| 0.4664        | 1.6201 | 580  | 0.5155          |
| 0.4308        | 1.6760 | 600  | 0.5128          |
| 0.443         | 1.7318 | 620  | 0.5082          |
| 0.4508        | 1.7877 | 640  | 0.5070          |
| 0.4511        | 1.8436 | 660  | 0.4999          |
| 0.4467        | 1.8994 | 680  | 0.4996          |
| 0.4723        | 1.9553 | 700  | 0.4956          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
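
As a small, hedged convenience, the pinned versions above can be sanity-checked at runtime; each of these packages exposes a `__version__` attribute:

```python
# Sketch: confirm the local environment matches the versions pinned above.
import datasets
import peft
import tokenizers
import torch
import transformers

for name, mod in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {mod.__version__}")
```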
