---
base_model: stabilityai/StableBeluga-13B
tags:
- generated_from_trainer
model-index:
- name: PE-13b-full
  results: []
---
# PE-13b-full
This model is a fine-tuned version of [stabilityai/StableBeluga-13B](https://huggingface.co/stabilityai/StableBeluga-13B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0094
- Rewards/chosen: -1.2833
- Rewards/rejected: -29.7294
- Rewards/accuracies: 0.9916
- Rewards/margins: 28.4460
- Logps/rejected: -121.9200
- Logps/chosen: -84.7524
- Logits/rejected: -2.1605
- Logits/chosen: -2.4403
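
The `Rewards/*` and `Logps/*` metric names above match those logged by DPO-style preference training (e.g. TRL's `DPOTrainer`). As a rough sketch of how such rewards are typically derived — the actual trainer and `beta` value are not documented in this card, so treat both as assumptions:

```python
import torch
import torch.nn.functional as F

def dpo_rewards(policy_logps_chosen, policy_logps_rejected,
                ref_logps_chosen, ref_logps_rejected, beta=0.1):
    """Sketch of DPO-style reward computation.

    `beta` is an assumption; the value used for this model is not
    documented. Inputs are summed log-probabilities of the chosen and
    rejected completions under the policy and reference models.
    """
    # Implicit rewards: beta-scaled log-ratio of policy to reference model.
    rewards_chosen = beta * (policy_logps_chosen - ref_logps_chosen)
    rewards_rejected = beta * (policy_logps_rejected - ref_logps_rejected)

    # Rewards/margins is the gap between chosen and rejected rewards;
    # Rewards/accuracies is the fraction of pairs with a positive margin.
    margins = rewards_chosen - rewards_rejected
    accuracy = (margins > 0).float().mean()

    # The DPO loss itself is the negative log-sigmoid of the margin.
    loss = -F.logsigmoid(margins).mean()
    return loss, rewards_chosen, rewards_rejected, margins, accuracy
```

Under this reading, the evaluation margin of 28.4460 with accuracy 0.9916 means the model assigns a higher implicit reward to the chosen response in over 99% of evaluation pairs.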
## Model description
More information needed
## Intended uses & limitations
More information needed
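
Pending fuller documentation, a minimal inference sketch for a StableBeluga-13B derivative might look like the following. The repository id is hypothetical (substitute the model's actual hub path), and the prompt template should be verified against the upstream stabilityai/StableBeluga-13B card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id; replace with the model's actual hub path.
model_id = "your-org/PE-13b-full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# StableBeluga models use a system/user/assistant prompt format.
prompt = (
    "### System:\nYou are a helpful assistant.\n\n"
    "### User:\nHello!\n\n### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```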
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-07
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
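
The trainer implementation is not stated in the card. Assuming the Hugging Face `TrainingArguments` API, the settings above roughly translate to the sketch below; note that 1 sample per device × 8 devices × 8 accumulation steps gives the listed total train batch size of 64:

```python
from transformers import TrainingArguments

# A sketch only: maps the listed hyperparameters onto the Hugging Face
# TrainingArguments API, launched across 8 GPUs (e.g. via torchrun).
training_args = TrainingArguments(
    output_dir="PE-13b-full",       # hypothetical output path
    learning_rate=3e-7,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3,
    adam_beta1=0.9,                 # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```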
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
0.5085 | 0.05 | 100 | 0.4978 | 0.1241 | -0.3334 | 0.9525 | 0.4575 | -63.1282 | -81.9376 | -2.0870 | -2.3586 |
0.1966 | 0.09 | 200 | 0.2003 | 0.5022 | -1.3704 | 0.9804 | 1.8726 | -65.2020 | -81.1812 | -2.0918 | -2.3650 |
0.0612 | 0.14 | 300 | 0.0656 | 0.8997 | -3.3315 | 0.9888 | 4.2312 | -69.1243 | -80.3863 | -2.0887 | -2.3741 |
0.029 | 0.18 | 400 | 0.0356 | 0.9536 | -5.0607 | 0.9944 | 6.0143 | -72.5827 | -80.2785 | -2.0905 | -2.3804 |
0.0187 | 0.23 | 500 | 0.0201 | 0.9079 | -7.5059 | 0.9888 | 8.4139 | -77.4731 | -80.3699 | -2.0974 | -2.3915 |
0.0112 | 0.27 | 600 | 0.0130 | 0.7188 | -10.4500 | 0.9916 | 11.1688 | -83.3612 | -80.7481 | -2.0987 | -2.3960 |
0.0066 | 0.32 | 700 | 0.0102 | 0.6639 | -13.1345 | 0.9916 | 13.7984 | -88.7303 | -80.8579 | -2.1111 | -2.4104 |
0.0088 | 0.37 | 800 | 0.0098 | 0.9128 | -13.1977 | 0.9888 | 14.1105 | -88.8568 | -80.3601 | -2.1031 | -2.4030 |
0.0054 | 0.41 | 900 | 0.0092 | 0.6109 | -15.6398 | 0.9888 | 16.2507 | -93.7409 | -80.9640 | -2.1158 | -2.4144 |
0.0044 | 0.46 | 1000 | 0.0094 | 0.9982 | -16.0071 | 0.9916 | 17.0053 | -94.4755 | -80.1893 | -2.0988 | -2.3946 |
0.0061 | 0.5 | 1100 | 0.0089 | 0.5504 | -18.0125 | 0.9916 | 18.5630 | -98.4864 | -81.0849 | -2.0991 | -2.3955 |
0.024 | 0.55 | 1200 | 0.0088 | 0.4877 | -16.6683 | 0.9916 | 17.1561 | -95.7980 | -81.2103 | -2.0748 | -2.3633 |
0.0039 | 0.59 | 1300 | 0.0087 | 0.3755 | -18.5093 | 0.9916 | 18.8848 | -99.4799 | -81.4347 | -2.0746 | -2.3623 |
0.0051 | 0.64 | 1400 | 0.0086 | 0.1176 | -20.5558 | 0.9916 | 20.6734 | -103.5730 | -81.9506 | -2.0819 | -2.3738 |
0.0023 | 0.68 | 1500 | 0.0089 | 0.1552 | -20.0740 | 0.9888 | 20.2292 | -102.6092 | -81.8754 | -2.0813 | -2.3667 |
0.0027 | 0.73 | 1600 | 0.0089 | -0.5025 | -20.7978 | 0.9888 | 20.2953 | -104.0569 | -83.1908 | -2.1179 | -2.4078 |
0.0031 | 0.78 | 1700 | 0.0085 | -0.6314 | -21.0492 | 0.9916 | 20.4178 | -104.5597 | -83.4485 | -2.0915 | -2.3773 |
0.0049 | 0.82 | 1800 | 0.0085 | -0.7786 | -21.3333 | 0.9916 | 20.5547 | -105.1278 | -83.7429 | -2.0670 | -2.3504 |
0.0023 | 0.87 | 1900 | 0.0084 | -0.7496 | -22.3377 | 0.9944 | 21.5880 | -107.1367 | -83.6850 | -2.0729 | -2.3547 |
0.0067 | 0.91 | 2000 | 0.0086 | -0.8126 | -22.8024 | 0.9916 | 21.9899 | -108.0662 | -83.8109 | -2.0651 | -2.3472 |
0.0041 | 0.96 | 2100 | 0.0082 | -0.7903 | -21.8379 | 0.9944 | 21.0476 | -106.1371 | -83.7663 | -2.0363 | -2.3137 |
0.0025 | 1.0 | 2200 | 0.0079 | -0.4489 | -21.4451 | 0.9916 | 20.9963 | -105.3516 | -83.0835 | -2.0303 | -2.3074 |
0.0023 | 1.05 | 2300 | 0.0082 | -1.1267 | -22.7620 | 0.9944 | 21.6353 | -107.9852 | -84.4391 | -2.0477 | -2.3260 |
0.0055 | 1.1 | 2400 | 0.0085 | -1.4969 | -24.0568 | 0.9888 | 22.5599 | -110.5749 | -85.1796 | -2.0616 | -2.3384 |
0.0139 | 1.14 | 2500 | 0.0077 | 0.4564 | -20.3860 | 0.9916 | 20.8424 | -103.2333 | -81.2730 | -2.0453 | -2.3206 |
0.0023 | 1.19 | 2600 | 0.0081 | 0.0858 | -21.9640 | 0.9916 | 22.0498 | -106.3893 | -82.0141 | -2.0528 | -2.3273 |
0.0046 | 1.23 | 2700 | 0.0083 | -0.2543 | -23.4016 | 0.9916 | 23.1473 | -109.2646 | -82.6943 | -2.0668 | -2.3457 |
0.0033 | 1.28 | 2800 | 0.0083 | -0.3317 | -23.7872 | 0.9916 | 23.4555 | -110.0356 | -82.8491 | -2.0884 | -2.3650 |
0.0023 | 1.32 | 2900 | 0.0084 | -0.2753 | -24.3682 | 0.9916 | 24.0929 | -111.1976 | -82.7362 | -2.1054 | -2.3879 |
0.0034 | 1.37 | 3000 | 0.0081 | 0.4328 | -23.3162 | 0.9916 | 23.7491 | -109.0938 | -81.3201 | -2.0817 | -2.3565 |
0.0033 | 1.42 | 3100 | 0.0082 | -0.0254 | -23.7390 | 0.9944 | 23.7136 | -109.9394 | -82.2366 | -2.0706 | -2.3447 |
0.0033 | 1.46 | 3200 | 0.0086 | -0.7680 | -24.0452 | 0.9916 | 23.2772 | -110.5517 | -83.7218 | -2.0760 | -2.3543 |
0.0032 | 1.51 | 3300 | 0.0086 | -0.0016 | -23.5161 | 0.9944 | 23.5145 | -109.4934 | -82.1889 | -2.0881 | -2.3655 |
0.0011 | 1.55 | 3400 | 0.0084 | 0.0195 | -24.2635 | 0.9944 | 24.2831 | -110.9884 | -82.1467 | -2.0878 | -2.3667 |
0.0002 | 1.6 | 3500 | 0.0087 | 0.0421 | -24.8306 | 0.9916 | 24.8728 | -112.1225 | -82.1015 | -2.0890 | -2.3698 |
0.0034 | 1.64 | 3600 | 0.0086 | -0.2729 | -25.8106 | 0.9916 | 25.5377 | -114.0825 | -82.7315 | -2.1030 | -2.3851 |
0.0027 | 1.69 | 3700 | 0.0086 | 0.0339 | -25.0221 | 0.9916 | 25.0560 | -112.5055 | -82.1179 | -2.1300 | -2.4147 |
0.0056 | 1.73 | 3800 | 0.0082 | 0.1800 | -23.6173 | 0.9916 | 23.7974 | -109.6960 | -81.8257 | -2.1140 | -2.3980 |
0.0026 | 1.78 | 3900 | 0.0083 | -0.0334 | -24.6060 | 0.9944 | 24.5725 | -111.6733 | -82.2526 | -2.1140 | -2.3965 |
0.0036 | 1.83 | 4000 | 0.0080 | -0.2511 | -23.0433 | 0.9916 | 22.7923 | -108.5479 | -82.6879 | -2.1348 | -2.4167 |
0.0044 | 1.87 | 4100 | 0.0084 | -0.4259 | -23.7811 | 0.9916 | 23.3551 | -110.0234 | -83.0376 | -2.1314 | -2.4160 |
0.0022 | 1.92 | 4200 | 0.0083 | -0.5710 | -23.2360 | 0.9944 | 22.6650 | -108.9332 | -83.3277 | -2.1369 | -2.4196 |
0.0044 | 1.96 | 4300 | 0.0085 | -0.6363 | -24.6474 | 0.9972 | 24.0111 | -111.7560 | -83.4583 | -2.1307 | -2.4109 |
0.0023 | 2.01 | 4400 | 0.0085 | -0.6133 | -24.9492 | 0.9916 | 24.3359 | -112.3597 | -83.4124 | -2.1322 | -2.4134 |
0.0033 | 2.05 | 4500 | 0.0085 | -0.7101 | -25.5054 | 0.9916 | 24.7953 | -113.4721 | -83.6059 | -2.1326 | -2.4142 |
0.0023 | 2.1 | 4600 | 0.0087 | -0.7855 | -26.0511 | 0.9916 | 25.2656 | -114.5634 | -83.7567 | -2.1333 | -2.4152 |
0.0011 | 2.15 | 4700 | 0.0088 | -0.9006 | -26.5845 | 0.9944 | 25.6839 | -115.6303 | -83.9870 | -2.1369 | -2.4198 |
0.0065 | 2.19 | 4800 | 0.0088 | -0.7570 | -26.8960 | 0.9916 | 26.1390 | -116.2533 | -83.6997 | -2.1393 | -2.4198 |
0.0022 | 2.24 | 4900 | 0.0091 | -0.9581 | -27.9431 | 0.9916 | 26.9850 | -118.3475 | -84.1019 | -2.1428 | -2.4245 |
0.0026 | 2.28 | 5000 | 0.0091 | -1.2522 | -28.8309 | 0.9944 | 27.5788 | -120.1232 | -84.6901 | -2.1479 | -2.4287 |
0.0033 | 2.33 | 5100 | 0.0089 | -0.8602 | -28.7323 | 0.9916 | 27.8721 | -119.9259 | -83.9062 | -2.1522 | -2.4328 |
0.0041 | 2.37 | 5200 | 0.0091 | -1.0405 | -29.2861 | 0.9916 | 28.2456 | -121.0335 | -84.2668 | -2.1536 | -2.4343 |
0.0023 | 2.42 | 5300 | 0.0093 | -1.1323 | -29.5240 | 0.9916 | 28.3917 | -121.5093 | -84.4504 | -2.1529 | -2.4336 |
0.0022 | 2.46 | 5400 | 0.0092 | -1.2202 | -29.2127 | 0.9916 | 27.9925 | -120.8866 | -84.6261 | -2.1595 | -2.4416 |
0.0 | 2.51 | 5500 | 0.0093 | -1.4371 | -29.7063 | 0.9916 | 28.2692 | -121.8739 | -85.0599 | -2.1609 | -2.4404 |
0.0022 | 2.56 | 5600 | 0.0095 | -1.4397 | -30.0202 | 0.9944 | 28.5804 | -122.5016 | -85.0652 | -2.1584 | -2.4383 |
0.0011 | 2.6 | 5700 | 0.0096 | -1.6125 | -30.0945 | 0.9916 | 28.4820 | -122.6504 | -85.4108 | -2.1601 | -2.4395 |
0.0053 | 2.65 | 5800 | 0.0095 | -1.5638 | -30.0025 | 0.9944 | 28.4387 | -122.4663 | -85.3133 | -2.1615 | -2.4398 |
0.003 | 2.69 | 5900 | 0.0095 | -1.5904 | -30.1980 | 0.9916 | 28.6076 | -122.8572 | -85.3666 | -2.1606 | -2.4406 |
0.0011 | 2.74 | 6000 | 0.0094 | -1.5286 | -30.0882 | 0.9944 | 28.5596 | -122.6377 | -85.2429 | -2.1615 | -2.4403 |
0.0008 | 2.78 | 6100 | 0.0095 | -1.4405 | -30.0174 | 0.9916 | 28.5769 | -122.4961 | -85.0667 | -2.1615 | -2.4400 |
0.0022 | 2.83 | 6200 | 0.0093 | -1.3508 | -29.9317 | 0.9916 | 28.5808 | -122.3246 | -84.8874 | -2.1599 | -2.4395 |
0.0019 | 2.88 | 6300 | 0.0093 | -1.2416 | -29.6525 | 0.9916 | 28.4109 | -121.7663 | -84.6690 | -2.1620 | -2.4415 |
0.0034 | 2.92 | 6400 | 0.0093 | -1.2995 | -29.7927 | 0.9916 | 28.4932 | -122.0468 | -84.7848 | -2.1616 | -2.4412 |
0.0014 | 2.97 | 6500 | 0.0092 | -1.2574 | -29.7200 | 0.9916 | 28.4626 | -121.9014 | -84.7006 | -2.1595 | -2.4408 |
### Framework versions
- Transformers 4.35.0
- Pytorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1