---
base_model: microsoft/Phi-3.5-mini-instruct
library_name: peft
license: mit
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: outputs
    results: []
---

Visualize in Weights & Biases

# outputs

This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.2249

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- training_steps: 100
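
The card doesn't include the training script or dataset, but the hyperparameters above map directly onto TRL's `SFTTrainer`. A minimal sketch under those settings; the dataset path, its `text` column, and the LoRA values are assumptions not recorded in this card (only `library_name: peft` and the `trl`/`sft` tags are):

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hypothetical dataset; the card does not say which one was used.
# SFTTrainer expects a "text" column by default.
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")

# LoRA settings are assumptions; the card only tags the run as peft/trl/sft.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# These arguments mirror the hyperparameters listed above. Adam with
# betas=(0.9, 0.999) and epsilon=1e-08 is the optimizer default.
args = SFTConfig(
    output_dir="outputs",
    learning_rate=5e-05,
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    gradient_accumulation_steps=2,   # total_train_batch_size: 2 * 2 = 4
    lr_scheduler_type="cosine",
    warmup_steps=20,                 # lr_scheduler_warmup_steps: 20
    max_steps=100,                   # training_steps: 100
    seed=42,
)

trainer = SFTTrainer(
    model="microsoft/Phi-3.5-mini-instruct",
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```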

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.979         | 0.5882 | 5    | 2.1552          |
| 1.5649        | 1.1765 | 10   | 1.7044          |
| 1.5355        | 1.7647 | 15   | 1.3163          |
| 0.9301        | 2.3529 | 20   | 1.0521          |
| 0.7935        | 2.9412 | 25   | 0.9929          |
| 0.6411        | 3.5294 | 30   | 0.9735          |
| 0.6521        | 4.1176 | 35   | 0.9699          |
| 0.4867        | 4.7059 | 40   | 0.9812          |
| 0.6112        | 5.2941 | 45   | 1.0029          |
| 0.5041        | 5.8824 | 50   | 1.1055          |
| 0.4784        | 6.4706 | 55   | 1.0859          |
| 0.3787        | 7.0588 | 60   | 1.1113          |
| 0.2676        | 7.6471 | 65   | 1.3963          |
| 0.3066        | 8.2353 | 70   | 1.2249          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
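
Since this repo holds a PEFT (LoRA) adapter rather than full model weights, it is loaded on top of the base model. A minimal usage sketch; the repo id `Jlonge4/outputs` is taken from this page, and the prompt is a placeholder:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Reads the base model (microsoft/Phi-3.5-mini-instruct) from the adapter
# config and applies the fine-tuned LoRA weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    "Jlonge4/outputs", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

# Placeholder prompt; Phi-3.5-mini-instruct expects its chat template.
messages = [{"role": "user", "content": "Summarize what a LoRA adapter is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```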