
llama_2_13b_Magiccoder_evol_10k

This model is a fine-tuned version of meta-llama/Llama-2-13b-hf on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1044
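
For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch that assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

eval_loss = 1.1044  # final evaluation loss reported in this card

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.3f}")
```

Under that assumption the evaluation perplexity comes out at roughly 3.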

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.02
  • num_epochs: 1
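
The relationship between these values can be sketched in a few lines of Python. The helper `cosine_lr` below is hypothetical and only illustrates a cosine decay to zero; it is not the scheduler code that was actually run. Two assumptions are made: training used a single device (so 8 x 8 gives the listed total of 64), and warmup is effectively zero, because the listed value of 0.02 is less than one step and may have been intended as a ratio.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 8

# Assumption: one device, so the effective batch is batch size x accumulation steps.
effective_batch = train_batch_size * gradient_accumulation_steps


def cosine_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup followed by a half-cosine decay to zero (hypothetical helper)."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: 152 is the last step logged in the training results,
# taken here as the approximate length of the single epoch.
total_steps = 152
for step in (0, 38, 76, 114, 152):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

With no warmup, the learning rate starts at its peak, reaches half of it at the midpoint, and decays to zero by the final step.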

Training results

Training Loss | Epoch | Step | Validation Loss
1.2459 | 0.0262 | 4 | 1.2861
1.2388 | 0.0523 | 8 | 1.2259
1.1411 | 0.0785 | 12 | 1.1833
1.0897 | 0.1047 | 16 | 1.1669
1.1171 | 0.1308 | 20 | 1.1500
1.0835 | 0.1570 | 24 | 1.1420
1.0782 | 0.1832 | 28 | 1.1362
1.1353 | 0.2093 | 32 | 1.1333
1.0558 | 0.2355 | 36 | 1.1298
1.1398 | 0.2617 | 40 | 1.1281
1.1114 | 0.2878 | 44 | 1.1244
1.1543 | 0.3140 | 48 | 1.1219
1.1327 | 0.3401 | 52 | 1.1189
1.1016 | 0.3663 | 56 | 1.1179
1.1543 | 0.3925 | 60 | 1.1173
1.1484 | 0.4186 | 64 | 1.1153
1.095 | 0.4448 | 68 | 1.1130
1.1118 | 0.4710 | 72 | 1.1109
1.0624 | 0.4971 | 76 | 1.1103
1.1475 | 0.5233 | 80 | 1.1093
1.161 | 0.5495 | 84 | 1.1094
1.1018 | 0.5756 | 88 | 1.1091
1.0541 | 0.6018 | 92 | 1.1065
1.054 | 0.6280 | 96 | 1.1055
1.1113 | 0.6541 | 100 | 1.1055
1.0971 | 0.6803 | 104 | 1.1053
1.0903 | 0.7065 | 108 | 1.1054
1.1206 | 0.7326 | 112 | 1.1052
1.0687 | 0.7588 | 116 | 1.1048
1.0892 | 0.7850 | 120 | 1.1043
1.1158 | 0.8111 | 124 | 1.1041
1.0789 | 0.8373 | 128 | 1.1042
1.0154 | 0.8635 | 132 | 1.1044
1.1258 | 0.8896 | 136 | 1.1044
1.0419 | 0.9158 | 140 | 1.1044
1.0886 | 0.9419 | 144 | 1.1044
1.1031 | 0.9681 | 148 | 1.1044
1.0979 | 0.9943 | 152 | 1.1044
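
A few figures can be derived from this log. The sketch below uses four rows copied from the results above and estimates the number of optimizer steps per epoch and the approximate training set size. The size is an inference, not a documented fact: it assumes every step consumed a full batch of 64 examples and that no batches were dropped.

```python
# (step, epoch, validation_loss) for a few rows copied from the results above.
rows = [
    (4, 0.0262, 1.2861),
    (76, 0.4971, 1.1103),
    (124, 0.8111, 1.1041),
    (152, 0.9943, 1.1044),
]
total_train_batch_size = 64  # from the hyperparameter list

# Estimate the epoch length from the last logged row.
last_step, last_epoch, last_val = rows[-1]
steps_per_epoch = last_step / last_epoch

# Assumption: every optimizer step consumed a full batch of 64 examples.
approx_examples = steps_per_epoch * total_train_batch_size

# Lowest validation loss among the sampled rows, and the overall improvement.
best_step, _, best_val = min(rows, key=lambda r: r[2])
improvement = rows[0][2] - last_val

print(f"steps per epoch (estimated): {steps_per_epoch:.1f}")
print(f"training examples (estimated): {approx_examples:.0f}")
print(f"best validation loss: {best_val} at step {best_step}")
print(f"validation loss improvement: {improvement:.4f}")
```

The estimate lands a little under 10,000 examples, which is consistent with the "10k" in the model name. The validation loss bottoms out at step 124 and is flat from step 132 onward, as expected near the end of a cosine schedule where the learning rate approaches zero.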

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
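
Since the model was trained with PEFT, it is presumably distributed as an adapter that must be attached to the base model at load time. The following is a minimal loading sketch under stated assumptions, not a documented usage recipe: the repository ID `imdatta0/llama_2_13b_Magiccoder_evol_10k` is assumed from the hosting page, the helpers `load_model` and `complete` are hypothetical, and half precision with `device_map="auto"` (which needs the `accelerate` package) is an illustrative choice for fitting a 13B model.

```python
BASE_MODEL_ID = "meta-llama/Llama-2-13b-hf"  # base model named in this card
ADAPTER_ID = "imdatta0/llama_2_13b_Magiccoder_evol_10k"  # assumed repository ID


def load_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imports are deferred so that defining this helper needs no heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.float16,  # assumption: half precision to reduce memory use
        device_map="auto",  # requires the accelerate package
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def complete(tokenizer, model, prompt, max_new_tokens=128):
    """Generate a continuation for a prompt (hypothetical helper)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `load_model()` downloads the full base weights, and access to the Llama 2 base model requires accepting its license on the Hub. Because the card does not document a prompt format, any prompt passed to `complete` is an experiment rather than a supported template.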
