metadata

base_model: microsoft/Phi-3-mini-4k-instruct
library_name: peft
license: mit
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: outputdir
    results: []

outputdir

This model is a PEFT adapter fine-tuned from microsoft/Phi-3-mini-4k-instruct using TRL supervised fine-tuning (SFT); the training dataset was not recorded. It achieves the following results on the evaluation set:

Loss: 1.2794
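
The loss above can be turned into a perplexity figure, which is often easier to compare across runs. The sketch below assumes the reported value is the mean per-token cross-entropy in nats; that is the usual convention for causal language model fine-tuning, but the card does not state it.

```python
import math

# Final evaluation loss reported above.
eval_loss = 1.2794

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out at roughly 3.59.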

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 2
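
As an illustration of how these values relate, the sketch below recomputes the effective batch size and traces a linear schedule with warmup. It is not taken from the training script: `num_devices = 1` and `total_steps = 2594` are assumptions (the latter is estimated from the results table, where step 2500 falls at epoch 1.9279), and `linear_lr` is a hypothetical helper that follows the usual "linear warmup, then linear decay to zero" shape.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-5
train_batch_size = 8
gradient_accumulation_steps = 4
warmup_ratio = 0.1

# Assumption: a single device, which is what makes 8 * 4 match the reported 32.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: total optimizer steps, estimated from the results table.
# The card does not state this number.
total_steps = 2594
warmup_steps = math.ceil(warmup_ratio * total_steps)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps):
    print(step, linear_lr(step))
```

With these assumed step counts, the learning rate would peak at 1e-05 after about 260 optimizer steps and decay linearly to zero by the end of the second epoch.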

Training results

Training Loss	Epoch	Step	Validation Loss
3.0506	0.0771	100	2.8512
2.4484	0.1542	200	1.7779
1.5032	0.2313	300	1.3608
1.3472	0.3085	400	1.3325
1.3304	0.3856	500	1.3195
1.3156	0.4627	600	1.3103
1.3130	0.5398	700	1.3038
1.2976	0.6169	800	1.2991
1.3001	0.6940	900	1.2956
1.2976	0.7712	1000	1.2927
1.2902	0.8483	1100	1.2907
1.2831	0.9254	1200	1.2888
1.2839	1.0025	1300	1.2874
1.2792	1.0796	1400	1.2860
1.2950	1.1567	1500	1.2845
1.2870	1.2339	1600	1.2838
1.2831	1.3110	1700	1.2831
1.2764	1.3881	1800	1.2821
1.2836	1.4652	1900	1.2815
1.2844	1.5423	2000	1.2810
1.2791	1.6194	2100	1.2804
1.2869	1.6965	2200	1.2799
1.2814	1.7737	2300	1.2798
1.2775	1.8508	2400	1.2796
1.2837	1.9279	2500	1.2794
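
A few quantities can be read off the table with simple arithmetic. The sketch below copies the epoch, step, and validation loss columns and derives the implied steps per epoch, the share of the logged loss reduction already reached by step 400, and whether validation loss ever increased. The baseline is the first logged evaluation at step 100, since no step-0 evaluation is reported.

```python
# (epoch, step, validation_loss) rows copied from the table above.
rows = [
    (0.0771, 100, 2.8512), (0.1542, 200, 1.7779), (0.2313, 300, 1.3608),
    (0.3085, 400, 1.3325), (0.3856, 500, 1.3195), (0.4627, 600, 1.3103),
    (0.5398, 700, 1.3038), (0.6169, 800, 1.2991), (0.6940, 900, 1.2956),
    (0.7712, 1000, 1.2927), (0.8483, 1100, 1.2907), (0.9254, 1200, 1.2888),
    (1.0025, 1300, 1.2874), (1.0796, 1400, 1.2860), (1.1567, 1500, 1.2845),
    (1.2339, 1600, 1.2838), (1.3110, 1700, 1.2831), (1.3881, 1800, 1.2821),
    (1.4652, 1900, 1.2815), (1.5423, 2000, 1.2810), (1.6194, 2100, 1.2804),
    (1.6965, 2200, 1.2799), (1.7737, 2300, 1.2798), (1.8508, 2400, 1.2796),
    (1.9279, 2500, 1.2794),
]

first_epoch, first_step, first_loss = rows[0]
last_epoch, last_step, last_loss = rows[-1]

# Optimizer steps per epoch implied by the last logged row.
steps_per_epoch = last_step / last_epoch

# Share of the logged validation-loss reduction (step 100 to step 2500)
# that had already been achieved by step 400.
loss_at_400 = next(loss for _, step, loss in rows if step == 400)
share_by_400 = (first_loss - loss_at_400) / (first_loss - last_loss)

# True if validation loss never increases between consecutive evaluations.
monotonic = all(b[2] <= a[2] for a, b in zip(rows, rows[1:]))

print(round(steps_per_epoch), round(share_by_400, 3), monotonic)
```

Read this way, the table implies roughly 1,297 optimizer steps per epoch, and about 97% of the logged validation-loss reduction happens within the first 400 steps; the remaining steps bring a slow but steady improvement.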

Framework versions

PEFT 0.12.0
Transformers 4.42.4
Pytorch 2.3.1+cu121
Datasets 2.21.0
Tokenizers 0.19.1
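
The card does not include usage instructions. Given that the metadata lists `peft` as the library and `microsoft/Phi-3-mini-4k-instruct` as the base model, loading the adapter with the library versions above might look like the sketch below. `ADAPTER_PATH` is a hypothetical placeholder (the card gives no repository id or path), and `load_adapter` is an illustrative helper, not part of any library.

```python
BASE_MODEL = "microsoft/Phi-3-mini-4k-instruct"  # from the card metadata
ADAPTER_PATH = "path/to/outputdir"  # hypothetical placeholder, not given in the card


def load_adapter(adapter_path: str = ADAPTER_PATH):
    """Load the base model and attach the PEFT adapter weights."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype="auto")
    # Wrap the base model with the fine-tuned adapter stored at adapter_path.
    model = PeftModel.from_pretrained(base, adapter_path)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter()` downloads the base model on first use, so it is best run on a machine with enough memory for Phi-3-mini and with `ADAPTER_PATH` replaced by the actual adapter location.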