llama-7b-dpo-qlora-relu
This model is a fine-tuned version of meta-llama/Llama-2-7b-chat-hf on the HuggingFaceH4/ultrafeedback_binarized dataset. It achieves the following results on the evaluation set:
- Loss: 0.6423
- Rewards/chosen: 0.9449
- Rewards/rejected: 0.6402
- Rewards/accuracies: 0.6670
- Rewards/margins: 0.3047
- Logps/rejected: -2686.5962
- Logps/chosen: -3150.4404
- Logits/rejected: 0.2397
- Logits/chosen: 0.1410
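As a quick consistency check on the figures above, the reward margin can be recomputed from the two reported rewards. This is a minimal sketch that assumes the usual logging convention for preference training, where the margin is the mean of (chosen reward minus rejected reward); the card itself does not state this. It also prints ln 2, the value a sigmoid-based preference loss takes at zero margin. Whether this run used such a loss is an assumption, though the first rows of the training results table sit close to that value.

```python
import math

# Values copied from the evaluation results listed above.
rewards_chosen = 0.9449
rewards_rejected = 0.6402
reported_margin = 0.3047

# Assumed convention: margin = mean(chosen reward - rejected reward),
# which equals the difference of the two reported means.
margin = rewards_chosen - rewards_rejected
print(f"recomputed margin: {margin:.4f}")  # recomputed margin: 0.3047

# Assumption: a sigmoid-based preference loss, -log(sigmoid(margin)).
# At zero margin sigmoid is 0.5, so the loss equals ln 2.
zero_margin_loss = -math.log(0.5)
print(f"loss at zero margin: {zero_margin_loss:.4f}")  # loss at zero margin: 0.6931
```

Read this way, the reported loss of 0.6423 is a modest improvement over the zero-margin baseline, in line with the reported reward accuracy of 0.6670.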
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
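The totals in the list above follow from the per-device values, and the warmup ratio translates into a step count once the run length is known. The sketch below reproduces that arithmetic and gives one common form of a cosine schedule with linear warmup. The total step count is taken from the last row of the training results table, and `lr_at` is a hypothetical helper written for illustration, not the scheduler the run actually used.

```python
import math

# Hyperparameters copied from the list above.
train_batch_size = 1   # per device
eval_batch_size = 2    # per device
num_devices = 2
grad_accum_steps = 8
peak_lr = 5e-06
warmup_ratio = 0.1

# Effective batch sizes: per-device size x devices (x accumulation for training).
total_train = train_batch_size * num_devices * grad_accum_steps
total_eval = eval_batch_size * num_devices

# Assumption: total optimizer steps read from the last row of the results table.
total_steps = 3820
warmup_steps = round(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (illustrative form)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train, total_eval, warmup_steps)  # 16 4 382
```

Under these assumptions the learning rate peaks at 5e-06 around step 382 and decays to zero by step 3820.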
Training results
Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
---|---|---|---|---|---|---|---|---|---|---|---|
0.6918 | 0.01 | 20 | 0.6927 | 0.0105 | 0.0089 | 0.4990 | 0.0016 | -2749.7332 | -3243.8831 | 0.4756 | 0.4010 |
0.6888 | 0.01 | 40 | 0.6865 | 0.1092 | 0.0901 | 0.5570 | 0.0191 | -2741.6096 | -3234.0100 | 0.4667 | 0.3920 |
0.6812 | 0.02 | 60 | 0.6778 | 0.3441 | 0.2892 | 0.5580 | 0.0549 | -2721.7036 | -3210.5227 | 0.4384 | 0.3634 |
0.6845 | 0.02 | 80 | 0.6751 | 0.5007 | 0.4191 | 0.5530 | 0.0815 | -2708.7063 | -3194.8674 | 0.4217 | 0.3468 |
0.6855 | 0.03 | 100 | 0.6733 | 0.6956 | 0.5819 | 0.5500 | 0.1137 | -2692.4290 | -3175.3694 | 0.3896 | 0.3148 |
0.6642 | 0.03 | 120 | 0.6705 | 0.5230 | 0.4322 | 0.5710 | 0.0909 | -2707.4033 | -3192.6296 | 0.4159 | 0.3416 |
0.6701 | 0.04 | 140 | 0.6716 | 0.5848 | 0.4825 | 0.5710 | 0.1023 | -2702.3718 | -3186.4568 | 0.3973 | 0.3170 |
0.7142 | 0.04 | 160 | 0.6677 | 0.4415 | 0.3502 | 0.5850 | 0.0913 | -2715.6021 | -3200.7874 | 0.4151 | 0.3347 |
0.6615 | 0.05 | 180 | 0.6625 | 0.5577 | 0.4403 | 0.5990 | 0.1174 | -2706.5872 | -3189.1589 | 0.4109 | 0.3326 |
0.6665 | 0.05 | 200 | 0.6631 | 0.9369 | 0.7339 | 0.5860 | 0.2030 | -2677.2251 | -3151.2400 | 0.4161 | 0.3420 |
0.6708 | 0.06 | 220 | 0.6643 | 0.4623 | 0.3170 | 0.5920 | 0.1453 | -2718.9246 | -3198.7063 | 0.4936 | 0.4235 |
0.683 | 0.06 | 240 | 0.6630 | 0.5279 | 0.3786 | 0.6160 | 0.1493 | -2712.7622 | -3192.1443 | 0.4461 | 0.3650 |
0.6545 | 0.07 | 260 | 0.6642 | 0.7057 | 0.5381 | 0.6220 | 0.1676 | -2696.8049 | -3174.3584 | 0.4233 | 0.3391 |
0.6447 | 0.07 | 280 | 0.6697 | 0.9829 | 0.7689 | 0.6040 | 0.2140 | -2673.7317 | -3146.6445 | 0.3740 | 0.2969 |
0.6532 | 0.08 | 300 | 0.6842 | 1.0988 | 0.8235 | 0.6160 | 0.2752 | -2668.2654 | -3135.0552 | 0.3932 | 0.3240 |
0.6508 | 0.08 | 320 | 0.6766 | 0.4977 | 0.3186 | 0.6110 | 0.1791 | -2718.7561 | -3195.1597 | 0.4100 | 0.3256 |
0.6363 | 0.09 | 340 | 0.6838 | 0.6603 | 0.4982 | 0.5950 | 0.1621 | -2700.7981 | -3178.8992 | 0.3598 | 0.2745 |
0.7016 | 0.09 | 360 | 0.6749 | 1.1088 | 0.8535 | 0.6150 | 0.2553 | -2665.2732 | -3134.0569 | 0.3153 | 0.2233 |
0.6508 | 0.1 | 380 | 0.6655 | 0.8342 | 0.5982 | 0.6170 | 0.2360 | -2690.8040 | -3161.5134 | 0.3631 | 0.2783 |
0.7066 | 0.1 | 400 | 0.6643 | 0.4586 | 0.3004 | 0.6090 | 0.1582 | -2720.5776 | -3199.0710 | 0.3913 | 0.3081 |
0.6569 | 0.11 | 420 | 0.6895 | 1.7461 | 1.4097 | 0.5970 | 0.3364 | -2609.6536 | -3070.3232 | 0.1995 | 0.0996 |
0.6971 | 0.12 | 440 | 0.6804 | 0.3106 | 0.1542 | 0.5970 | 0.1564 | -2735.1970 | -3213.8723 | 0.4654 | 0.3795 |
0.7179 | 0.12 | 460 | 0.6708 | 0.3908 | 0.2148 | 0.6060 | 0.1760 | -2729.1362 | -3205.8511 | 0.4534 | 0.3755 |
0.6713 | 0.13 | 480 | 0.6653 | 1.1610 | 0.9019 | 0.6100 | 0.2591 | -2660.4290 | -3128.8379 | 0.2921 | 0.2072 |
0.7025 | 0.13 | 500 | 0.6618 | 0.8239 | 0.6190 | 0.6230 | 0.2048 | -2688.7156 | -3162.5444 | 0.3598 | 0.2752 |
0.6805 | 0.14 | 520 | 0.6632 | 1.1599 | 0.9100 | 0.6100 | 0.2499 | -2659.6174 | -3128.9429 | 0.3036 | 0.2199 |
0.6669 | 0.14 | 540 | 0.6762 | 0.4281 | 0.2712 | 0.6010 | 0.1569 | -2723.4954 | -3202.1235 | 0.3960 | 0.3262 |
0.7231 | 0.15 | 560 | 0.6819 | 1.6382 | 1.2978 | 0.6100 | 0.3405 | -2620.8401 | -3081.1089 | 0.2061 | 0.1212 |
0.6914 | 0.15 | 580 | 0.6667 | 0.7317 | 0.5114 | 0.6120 | 0.2203 | -2699.4773 | -3171.7651 | 0.3602 | 0.2804 |
0.6744 | 0.16 | 600 | 0.6655 | 1.3122 | 1.0204 | 0.6140 | 0.2917 | -2648.5754 | -3113.7166 | 0.2893 | 0.2001 |
0.7202 | 0.16 | 620 | 0.6704 | 1.3732 | 1.0696 | 0.6190 | 0.3035 | -2643.6584 | -3107.6179 | 0.3039 | 0.2156 |
0.6505 | 0.17 | 640 | 0.6631 | 1.0842 | 0.8426 | 0.6320 | 0.2416 | -2666.3557 | -3136.5125 | 0.2946 | 0.2053 |
0.6678 | 0.17 | 660 | 0.6688 | 0.7100 | 0.5343 | 0.6170 | 0.1758 | -2697.1909 | -3173.9294 | 0.3019 | 0.2101 |
0.6905 | 0.18 | 680 | 0.6601 | 1.1264 | 0.8772 | 0.6300 | 0.2492 | -2662.8979 | -3132.2937 | 0.2674 | 0.1735 |
0.6414 | 0.18 | 700 | 0.6684 | 0.7719 | 0.5596 | 0.6280 | 0.2123 | -2694.6565 | -3167.7427 | 0.3401 | 0.2509 |
0.6752 | 0.19 | 720 | 0.6932 | 1.8703 | 1.4853 | 0.6140 | 0.3850 | -2602.0854 | -3057.8987 | 0.1787 | 0.0785 |
0.6982 | 0.19 | 740 | 0.6774 | 0.4667 | 0.2947 | 0.6160 | 0.1720 | -2721.1499 | -3198.2676 | 0.3575 | 0.2655 |
0.6149 | 0.2 | 760 | 0.6715 | 1.5227 | 1.1845 | 0.6310 | 0.3383 | -2632.1743 | -3092.6604 | 0.2016 | 0.1056 |
0.6568 | 0.2 | 780 | 0.6975 | 0.1888 | -0.0062 | 0.6020 | 0.1951 | -2751.2429 | -3226.0491 | 0.4294 | 0.3384 |
0.633 | 0.21 | 800 | 0.6989 | 2.0748 | 1.6194 | 0.6130 | 0.4554 | -2588.6804 | -3037.4561 | 0.1522 | 0.0436 |
0.6907 | 0.21 | 820 | 0.6632 | 1.0945 | 0.8066 | 0.6350 | 0.2879 | -2669.9553 | -3135.4792 | 0.3036 | 0.2037 |
0.6582 | 0.22 | 840 | 0.6571 | 0.8583 | 0.6168 | 0.6260 | 0.2416 | -2688.9436 | -3159.1021 | 0.3119 | 0.2173 |
0.6568 | 0.23 | 860 | 0.6718 | 0.4558 | 0.2827 | 0.6090 | 0.1732 | -2722.3523 | -3199.3511 | 0.3512 | 0.2592 |
0.6589 | 0.23 | 880 | 0.6679 | 1.3269 | 1.0100 | 0.625 | 0.3169 | -2649.6179 | -3112.2434 | 0.2110 | 0.1108 |
0.6371 | 0.24 | 900 | 0.6656 | 1.1832 | 0.8731 | 0.6300 | 0.3101 | -2663.3120 | -3126.6121 | 0.2307 | 0.1377 |
0.7471 | 0.24 | 920 | 0.6693 | 0.8367 | 0.5850 | 0.6390 | 0.2517 | -2692.1221 | -3161.2661 | 0.2916 | 0.2077 |
0.6415 | 0.25 | 940 | 0.6632 | 1.0762 | 0.8094 | 0.6370 | 0.2669 | -2669.6843 | -3137.3086 | 0.2347 | 0.1441 |
0.7267 | 0.25 | 960 | 0.6971 | 2.0368 | 1.6586 | 0.5930 | 0.3781 | -2584.7571 | -3041.2559 | 0.0743 | -0.0256 |
0.6586 | 0.26 | 980 | 0.6856 | 0.3772 | 0.2421 | 0.6090 | 0.1351 | -2726.4094 | -3207.2104 | 0.3268 | 0.2380 |
0.7058 | 0.26 | 1000 | 0.6665 | 1.0340 | 0.7988 | 0.6310 | 0.2352 | -2670.7419 | -3141.5334 | 0.2264 | 0.1320 |
0.6562 | 0.27 | 1020 | 0.6731 | 0.4362 | 0.2631 | 0.6220 | 0.1731 | -2724.3091 | -3201.3096 | 0.3141 | 0.2192 |
0.6695 | 0.27 | 1040 | 0.6666 | 0.9000 | 0.6468 | 0.6240 | 0.2532 | -2685.9409 | -3154.9338 | 0.2496 | 0.1522 |
0.6998 | 0.28 | 1060 | 0.6631 | 0.9608 | 0.7039 | 0.6270 | 0.2569 | -2680.2302 | -3148.8518 | 0.2293 | 0.1286 |
0.6467 | 0.28 | 1080 | 0.6611 | 0.9271 | 0.6794 | 0.6310 | 0.2477 | -2682.6790 | -3152.2249 | 0.2534 | 0.1543 |
0.7014 | 0.29 | 1100 | 0.6916 | 0.1793 | 0.0194 | 0.5970 | 0.1599 | -2748.6746 | -3227.0022 | 0.4020 | 0.3112 |
0.6383 | 0.29 | 1120 | 0.6646 | 1.2449 | 0.9461 | 0.6190 | 0.2988 | -2656.0103 | -3120.4397 | 0.2246 | 0.1310 |
0.6594 | 0.3 | 1140 | 0.6694 | 1.2174 | 0.9267 | 0.625 | 0.2907 | -2657.9519 | -3123.1938 | 0.2294 | 0.1372 |
0.6662 | 0.3 | 1160 | 0.6692 | 0.7808 | 0.5201 | 0.6340 | 0.2606 | -2698.6074 | -3166.8572 | 0.3542 | 0.2664 |
0.6439 | 0.31 | 1180 | 0.6644 | 0.9192 | 0.6222 | 0.6410 | 0.2970 | -2688.3950 | -3153.0110 | 0.3655 | 0.2800 |
0.6218 | 0.31 | 1200 | 0.6586 | 1.0825 | 0.7651 | 0.6430 | 0.3175 | -2674.1140 | -3136.6797 | 0.3050 | 0.2116 |
0.68 | 0.32 | 1220 | 0.6571 | 0.9931 | 0.6987 | 0.6560 | 0.2944 | -2680.7493 | -3145.6201 | 0.3002 | 0.2058 |
0.631 | 0.32 | 1240 | 0.6606 | 1.4409 | 1.0899 | 0.6450 | 0.3511 | -2641.6331 | -3100.8398 | 0.2226 | 0.1298 |
0.6553 | 0.33 | 1260 | 0.6755 | 1.3941 | 1.0416 | 0.6360 | 0.3525 | -2646.4556 | -3105.5215 | 0.1853 | 0.0877 |
0.656 | 0.33 | 1280 | 0.6742 | 1.6210 | 1.2561 | 0.6470 | 0.3649 | -2625.0129 | -3082.8352 | 0.1333 | 0.0343 |
0.6968 | 0.34 | 1300 | 0.6620 | 1.5566 | 1.2255 | 0.6370 | 0.3311 | -2628.0706 | -3089.2764 | 0.1418 | 0.0440 |
0.6756 | 0.35 | 1320 | 0.6619 | 1.4656 | 1.1785 | 0.6260 | 0.2871 | -2632.7727 | -3098.3765 | 0.1436 | 0.0456 |
0.651 | 0.35 | 1340 | 0.6586 | 0.9936 | 0.7542 | 0.6330 | 0.2394 | -2675.2009 | -3145.5730 | 0.2575 | 0.1608 |
0.6863 | 0.36 | 1360 | 0.6593 | 1.0603 | 0.7861 | 0.6410 | 0.2742 | -2672.0063 | -3138.9028 | 0.2625 | 0.1624 |
0.6671 | 0.36 | 1380 | 0.6585 | 0.9249 | 0.6679 | 0.6300 | 0.2570 | -2683.8271 | -3152.4412 | 0.2769 | 0.1792 |
0.6495 | 0.37 | 1400 | 0.6559 | 1.0075 | 0.7487 | 0.6410 | 0.2589 | -2675.7534 | -3144.1819 | 0.2563 | 0.1592 |
0.6505 | 0.37 | 1420 | 0.6666 | 0.5015 | 0.3152 | 0.6310 | 0.1862 | -2719.0969 | -3194.7869 | 0.3321 | 0.2419 |
0.6855 | 0.38 | 1440 | 0.6567 | 0.8450 | 0.6100 | 0.6470 | 0.2350 | -2689.6213 | -3160.4331 | 0.2770 | 0.1859 |
0.6501 | 0.38 | 1460 | 0.6599 | 0.7577 | 0.5266 | 0.6390 | 0.2311 | -2697.9607 | -3169.1663 | 0.2910 | 0.1981 |
0.649 | 0.39 | 1480 | 0.6599 | 1.2617 | 0.9540 | 0.6420 | 0.3077 | -2655.2158 | -3118.7607 | 0.2065 | 0.1052 |
0.6554 | 0.39 | 1500 | 0.6583 | 1.0495 | 0.7839 | 0.6490 | 0.2656 | -2672.2302 | -3139.9814 | 0.2280 | 0.1330 |
0.6749 | 0.4 | 1520 | 0.6606 | 0.8217 | 0.5860 | 0.6320 | 0.2356 | -2692.0178 | -3162.7683 | 0.2671 | 0.1767 |
0.6857 | 0.4 | 1540 | 0.6595 | 0.7859 | 0.5242 | 0.6460 | 0.2617 | -2698.1951 | -3166.3406 | 0.3070 | 0.2132 |
0.6507 | 0.41 | 1560 | 0.6542 | 0.9973 | 0.6889 | 0.6470 | 0.3084 | -2681.7246 | -3145.1982 | 0.2675 | 0.1687 |
0.6126 | 0.41 | 1580 | 0.6575 | 1.2987 | 0.9358 | 0.6440 | 0.3629 | -2657.0410 | -3115.0645 | 0.2162 | 0.1168 |
0.6109 | 0.42 | 1600 | 0.6630 | 1.4768 | 1.0912 | 0.6350 | 0.3857 | -2641.5007 | -3097.2493 | 0.1774 | 0.0735 |
0.6221 | 0.42 | 1620 | 0.6609 | 1.2858 | 0.9370 | 0.6470 | 0.3488 | -2656.9226 | -3116.3562 | 0.1969 | 0.0922 |
0.6565 | 0.43 | 1640 | 0.6651 | 0.7151 | 0.4459 | 0.6400 | 0.2692 | -2706.0293 | -3173.4238 | 0.2898 | 0.1894 |
0.5982 | 0.43 | 1660 | 0.6571 | 1.4690 | 1.0905 | 0.6410 | 0.3785 | -2641.5686 | -3098.0374 | 0.1833 | 0.0805 |
0.6986 | 0.44 | 1680 | 0.6550 | 1.1146 | 0.7781 | 0.6480 | 0.3365 | -2672.8064 | -3133.4736 | 0.2533 | 0.1546 |
0.6316 | 0.44 | 1700 | 0.6606 | 1.6375 | 1.2530 | 0.6360 | 0.3845 | -2625.3179 | -3081.1812 | 0.1494 | 0.0475 |
0.6618 | 0.45 | 1720 | 0.6571 | 1.0847 | 0.7877 | 0.6440 | 0.2969 | -2671.8479 | -3136.4675 | 0.2297 | 0.1309 |
0.7146 | 0.46 | 1740 | 0.6609 | 1.4069 | 1.0677 | 0.6420 | 0.3392 | -2643.8464 | -3104.2388 | 0.1950 | 0.0944 |
0.7156 | 0.46 | 1760 | 0.6546 | 1.0781 | 0.7864 | 0.6530 | 0.2917 | -2671.9775 | -3137.1184 | 0.2555 | 0.1579 |
0.6817 | 0.47 | 1780 | 0.6729 | 0.5426 | 0.3537 | 0.6190 | 0.1888 | -2715.2463 | -3190.6765 | 0.3162 | 0.2207 |
0.6277 | 0.47 | 1800 | 0.6605 | 1.4863 | 1.1568 | 0.6330 | 0.3295 | -2634.9365 | -3096.2996 | 0.1666 | 0.0620 |
0.6093 | 0.48 | 1820 | 0.6556 | 1.3461 | 1.0113 | 0.6490 | 0.3348 | -2649.4885 | -3110.3245 | 0.2064 | 0.1022 |
0.6416 | 0.48 | 1840 | 0.6525 | 1.0218 | 0.7311 | 0.6510 | 0.2908 | -2677.5134 | -3142.7522 | 0.2618 | 0.1602 |
0.647 | 0.49 | 1860 | 0.6554 | 1.3002 | 0.9643 | 0.6440 | 0.3360 | -2654.1936 | -3114.9124 | 0.2039 | 0.1007 |
0.6269 | 0.49 | 1880 | 0.6585 | 0.7954 | 0.5231 | 0.6350 | 0.2724 | -2698.3127 | -3165.3899 | 0.2689 | 0.1661 |
0.7114 | 0.5 | 1900 | 0.6589 | 0.6154 | 0.3766 | 0.6370 | 0.2388 | -2712.9587 | -3183.3887 | 0.2911 | 0.1904 |
0.6789 | 0.5 | 1920 | 0.6563 | 0.7003 | 0.4604 | 0.6350 | 0.2400 | -2704.5811 | -3174.8984 | 0.2714 | 0.1702 |
0.6729 | 0.51 | 1940 | 0.6574 | 1.2669 | 0.9475 | 0.6420 | 0.3194 | -2655.8650 | -3118.2434 | 0.1795 | 0.0734 |
0.6502 | 0.51 | 1960 | 0.6607 | 1.4160 | 1.0771 | 0.6400 | 0.3390 | -2642.9128 | -3103.3286 | 0.1572 | 0.0508 |
0.6567 | 0.52 | 1980 | 0.6547 | 0.9924 | 0.7209 | 0.6440 | 0.2715 | -2678.5286 | -3145.6885 | 0.2263 | 0.1233 |
0.66 | 0.52 | 2000 | 0.6564 | 0.9395 | 0.6803 | 0.6410 | 0.2592 | -2682.5881 | -3150.9863 | 0.2323 | 0.1301 |
0.6165 | 0.53 | 2020 | 0.6539 | 1.1203 | 0.8204 | 0.6420 | 0.2999 | -2668.5769 | -3132.9045 | 0.2117 | 0.1094 |
0.7214 | 0.53 | 2040 | 0.6555 | 1.3331 | 0.9914 | 0.6430 | 0.3418 | -2651.4824 | -3111.6213 | 0.1934 | 0.0901 |
0.6622 | 0.54 | 2060 | 0.6509 | 1.2432 | 0.9268 | 0.6400 | 0.3164 | -2657.9395 | -3120.6147 | 0.1900 | 0.0865 |
0.6141 | 0.54 | 2080 | 0.6504 | 1.1034 | 0.8115 | 0.6370 | 0.2919 | -2669.4675 | -3134.5964 | 0.2067 | 0.1041 |
0.6511 | 0.55 | 2100 | 0.6495 | 1.3362 | 1.0167 | 0.6470 | 0.3195 | -2648.9509 | -3111.3123 | 0.1578 | 0.0529 |
0.6457 | 0.55 | 2120 | 0.6507 | 1.4016 | 1.0814 | 0.6300 | 0.3202 | -2642.4827 | -3104.7749 | 0.1297 | 0.0242 |
0.6444 | 0.56 | 2140 | 0.6481 | 0.9908 | 0.7249 | 0.6460 | 0.2659 | -2678.1279 | -3145.8511 | 0.1869 | 0.0838 |
0.6709 | 0.57 | 2160 | 0.6469 | 1.1710 | 0.8782 | 0.6470 | 0.2928 | -2662.7959 | -3127.8286 | 0.1521 | 0.0463 |
0.7217 | 0.57 | 2180 | 0.6496 | 0.8703 | 0.6234 | 0.6410 | 0.2469 | -2688.2808 | -3157.9065 | 0.1928 | 0.0898 |
0.7032 | 0.58 | 2200 | 0.6462 | 1.2924 | 0.9830 | 0.6350 | 0.3094 | -2652.3159 | -3115.6887 | 0.1211 | 0.0142 |
0.729 | 0.58 | 2220 | 0.6603 | 1.7124 | 1.3448 | 0.6340 | 0.3676 | -2616.1379 | -3073.6912 | 0.0609 | -0.0472 |
0.6496 | 0.59 | 2240 | 0.6475 | 1.2981 | 0.9806 | 0.6440 | 0.3175 | -2652.5581 | -3115.1221 | 0.1405 | 0.0349 |
0.6615 | 0.59 | 2260 | 0.6476 | 1.3386 | 1.0066 | 0.6450 | 0.3320 | -2649.9587 | -3111.0693 | 0.1516 | 0.0464 |
0.6581 | 0.6 | 2280 | 0.6458 | 1.0039 | 0.7166 | 0.6520 | 0.2873 | -2678.9626 | -3144.5474 | 0.2101 | 0.1083 |
0.6604 | 0.6 | 2300 | 0.6468 | 0.9760 | 0.6927 | 0.6510 | 0.2833 | -2681.3484 | -3147.3301 | 0.2123 | 0.1119 |
0.6762 | 0.61 | 2320 | 0.6451 | 1.2231 | 0.9037 | 0.6540 | 0.3194 | -2660.2520 | -3122.6216 | 0.1764 | 0.0751 |
0.6687 | 0.61 | 2340 | 0.6448 | 1.0471 | 0.7491 | 0.6470 | 0.2980 | -2675.7124 | -3140.2263 | 0.2063 | 0.1060 |
0.6154 | 0.62 | 2360 | 0.6460 | 1.3661 | 1.0244 | 0.6510 | 0.3417 | -2648.1831 | -3108.3257 | 0.1519 | 0.0509 |
0.712 | 0.62 | 2380 | 0.6491 | 1.4910 | 1.1296 | 0.6490 | 0.3613 | -2637.6560 | -3095.8364 | 0.1400 | 0.0397 |
0.675 | 0.63 | 2400 | 0.6467 | 0.8895 | 0.6147 | 0.6510 | 0.2748 | -2689.1521 | -3155.9834 | 0.2318 | 0.1331 |
0.6251 | 0.63 | 2420 | 0.6458 | 0.9209 | 0.6407 | 0.6540 | 0.2802 | -2686.5471 | -3152.8416 | 0.2377 | 0.1404 |
0.58 | 0.64 | 2440 | 0.6451 | 1.0306 | 0.7363 | 0.6470 | 0.2943 | -2676.9885 | -3141.8696 | 0.2140 | 0.1162 |
0.6538 | 0.64 | 2460 | 0.6477 | 1.5124 | 1.1437 | 0.6430 | 0.3688 | -2636.2539 | -3093.6924 | 0.1432 | 0.0436 |
0.6741 | 0.65 | 2480 | 0.6436 | 1.2072 | 0.8802 | 0.6460 | 0.3270 | -2662.5972 | -3124.2100 | 0.1846 | 0.0856 |
0.6109 | 0.65 | 2500 | 0.6447 | 1.1234 | 0.8068 | 0.6500 | 0.3166 | -2669.9399 | -3132.5901 | 0.1969 | 0.0982 |
0.6749 | 0.66 | 2520 | 0.6447 | 1.1995 | 0.8679 | 0.6470 | 0.3316 | -2663.8308 | -3124.9836 | 0.1925 | 0.0972 |
0.6524 | 0.66 | 2540 | 0.6449 | 1.1229 | 0.8033 | 0.6430 | 0.3196 | -2670.2866 | -3132.6394 | 0.2064 | 0.1134 |
0.6155 | 0.67 | 2560 | 0.6445 | 1.2928 | 0.9541 | 0.6490 | 0.3388 | -2655.2100 | -3115.6487 | 0.1766 | 0.0820 |
0.6498 | 0.68 | 2580 | 0.6460 | 1.4062 | 1.0492 | 0.6530 | 0.3570 | -2645.6958 | -3104.3142 | 0.1537 | 0.0579 |
0.6205 | 0.68 | 2600 | 0.6453 | 1.4175 | 1.0608 | 0.6500 | 0.3567 | -2644.5352 | -3103.1826 | 0.1426 | 0.0455 |
0.6644 | 0.69 | 2620 | 0.6438 | 1.2662 | 0.9337 | 0.6520 | 0.3326 | -2657.2537 | -3118.3110 | 0.1690 | 0.0736 |
0.6403 | 0.69 | 2640 | 0.6467 | 0.8363 | 0.5591 | 0.6530 | 0.2772 | -2694.7085 | -3161.3059 | 0.2464 | 0.1527 |
0.6697 | 0.7 | 2660 | 0.6505 | 0.7270 | 0.4575 | 0.6480 | 0.2696 | -2704.8721 | -3172.2310 | 0.2698 | 0.1761 |
0.586 | 0.7 | 2680 | 0.6468 | 0.9120 | 0.6146 | 0.6530 | 0.2973 | -2689.1567 | -3153.7361 | 0.2405 | 0.1441 |
0.7133 | 0.71 | 2700 | 0.6477 | 0.9017 | 0.6019 | 0.6550 | 0.2997 | -2690.4275 | -3154.7668 | 0.2465 | 0.1499 |
0.6203 | 0.71 | 2720 | 0.6453 | 1.1435 | 0.8143 | 0.6560 | 0.3292 | -2669.1887 | -3130.5833 | 0.2043 | 0.1065 |
0.6403 | 0.72 | 2740 | 0.6447 | 1.1619 | 0.8317 | 0.6600 | 0.3302 | -2667.4482 | -3128.7390 | 0.1988 | 0.1014 |
0.6562 | 0.72 | 2760 | 0.6440 | 1.2726 | 0.9351 | 0.6550 | 0.3374 | -2657.1047 | -3117.6772 | 0.1750 | 0.0771 |
0.6216 | 0.73 | 2780 | 0.6433 | 1.1472 | 0.8271 | 0.6570 | 0.3201 | -2667.9097 | -3130.2151 | 0.1984 | 0.1010 |
0.6439 | 0.73 | 2800 | 0.6434 | 1.1500 | 0.8274 | 0.6630 | 0.3226 | -2667.8799 | -3129.9346 | 0.2031 | 0.1054 |
0.6545 | 0.74 | 2820 | 0.6444 | 1.2737 | 0.9325 | 0.6570 | 0.3412 | -2657.3660 | -3117.5645 | 0.1840 | 0.0854 |
0.5712 | 0.74 | 2840 | 0.6442 | 1.3124 | 0.9665 | 0.6590 | 0.3459 | -2653.9678 | -3113.6951 | 0.1738 | 0.0740 |
0.6623 | 0.75 | 2860 | 0.6435 | 1.2882 | 0.9459 | 0.6590 | 0.3424 | -2656.0342 | -3116.1118 | 0.1759 | 0.0758 |
0.6491 | 0.75 | 2880 | 0.6429 | 1.0676 | 0.7540 | 0.6630 | 0.3136 | -2675.2224 | -3138.1736 | 0.2086 | 0.1087 |
0.6316 | 0.76 | 2900 | 0.6444 | 0.9143 | 0.6184 | 0.6560 | 0.2959 | -2688.7827 | -3153.5068 | 0.2324 | 0.1333 |
0.6851 | 0.76 | 2920 | 0.6433 | 0.9858 | 0.6757 | 0.6550 | 0.3102 | -2683.0530 | -3146.3491 | 0.2232 | 0.1233 |
0.6261 | 0.77 | 2940 | 0.6436 | 1.0911 | 0.7674 | 0.6610 | 0.3237 | -2673.8782 | -3135.8259 | 0.2103 | 0.1108 |
0.591 | 0.77 | 2960 | 0.6434 | 1.0843 | 0.7597 | 0.6610 | 0.3246 | -2674.6450 | -3136.4993 | 0.2118 | 0.1125 |
0.6719 | 0.78 | 2980 | 0.6440 | 1.0943 | 0.7677 | 0.6630 | 0.3266 | -2673.8528 | -3135.5054 | 0.2090 | 0.1100 |
0.6609 | 0.79 | 3000 | 0.6442 | 1.0791 | 0.7548 | 0.6630 | 0.3243 | -2675.1423 | -3137.0229 | 0.2128 | 0.1147 |
0.6365 | 0.79 | 3020 | 0.6446 | 1.1918 | 0.8544 | 0.6620 | 0.3374 | -2665.1812 | -3125.7544 | 0.1954 | 0.0969 |
0.6146 | 0.8 | 3040 | 0.6441 | 1.1548 | 0.8233 | 0.6600 | 0.3315 | -2668.2886 | -3129.4490 | 0.2033 | 0.1046 |
0.6289 | 0.8 | 3060 | 0.6435 | 1.0469 | 0.7296 | 0.6610 | 0.3172 | -2677.6558 | -3140.2471 | 0.2190 | 0.1207 |
0.6233 | 0.81 | 3080 | 0.6443 | 0.9655 | 0.6584 | 0.6570 | 0.3072 | -2684.7822 | -3148.3809 | 0.2312 | 0.1331 |
0.5942 | 0.81 | 3100 | 0.6441 | 1.0521 | 0.7311 | 0.6620 | 0.3210 | -2677.5120 | -3139.7278 | 0.2208 | 0.1215 |
0.6646 | 0.82 | 3120 | 0.6439 | 1.0663 | 0.7436 | 0.6590 | 0.3226 | -2676.2566 | -3138.3083 | 0.2200 | 0.1207 |
0.7201 | 0.82 | 3140 | 0.6431 | 1.0673 | 0.7465 | 0.6630 | 0.3208 | -2675.9697 | -3138.2017 | 0.2170 | 0.1173 |
0.684 | 0.83 | 3160 | 0.6429 | 1.0782 | 0.7570 | 0.6630 | 0.3213 | -2674.9221 | -3137.1096 | 0.2138 | 0.1138 |
0.6372 | 0.83 | 3180 | 0.6424 | 1.0512 | 0.7307 | 0.6610 | 0.3205 | -2677.5535 | -3139.8164 | 0.2199 | 0.1195 |
0.6491 | 0.84 | 3200 | 0.6429 | 0.9864 | 0.6737 | 0.6610 | 0.3127 | -2683.2532 | -3146.2932 | 0.2311 | 0.1313 |
0.6321 | 0.84 | 3220 | 0.6419 | 1.0593 | 0.7374 | 0.6640 | 0.3218 | -2676.8789 | -3139.0081 | 0.2190 | 0.1184 |
0.6858 | 0.85 | 3240 | 0.6418 | 1.1185 | 0.7905 | 0.6670 | 0.3281 | -2671.5710 | -3133.0784 | 0.2093 | 0.1081 |
0.6487 | 0.85 | 3260 | 0.6414 | 1.1003 | 0.7762 | 0.6670 | 0.3241 | -2673.0029 | -3134.9077 | 0.2102 | 0.1092 |
0.6232 | 0.86 | 3280 | 0.6418 | 1.0890 | 0.7641 | 0.6650 | 0.3249 | -2674.2104 | -3136.0315 | 0.2155 | 0.1153 |
0.6751 | 0.86 | 3300 | 0.6423 | 1.1216 | 0.7925 | 0.6690 | 0.3291 | -2671.3660 | -3132.7705 | 0.2116 | 0.1113 |
0.6696 | 0.87 | 3320 | 0.6420 | 1.1138 | 0.7855 | 0.6650 | 0.3283 | -2672.0674 | -3133.5513 | 0.2124 | 0.1127 |
0.6762 | 0.87 | 3340 | 0.6418 | 1.0429 | 0.7242 | 0.6670 | 0.3187 | -2678.2026 | -3140.6455 | 0.2234 | 0.1238 |
0.6431 | 0.88 | 3360 | 0.6423 | 0.9878 | 0.6777 | 0.6680 | 0.3100 | -2682.8467 | -3146.1572 | 0.2324 | 0.1334 |
0.6533 | 0.88 | 3380 | 0.6422 | 0.9657 | 0.6575 | 0.6670 | 0.3082 | -2684.8696 | -3148.3625 | 0.2357 | 0.1369 |
0.6517 | 0.89 | 3400 | 0.6415 | 1.0024 | 0.6893 | 0.6660 | 0.3132 | -2681.6929 | -3144.6909 | 0.2319 | 0.1329 |
0.7125 | 0.9 | 3420 | 0.6420 | 0.9890 | 0.6795 | 0.6700 | 0.3095 | -2682.6711 | -3146.0359 | 0.2327 | 0.1341 |
0.655 | 0.9 | 3440 | 0.6418 | 0.9841 | 0.6752 | 0.6670 | 0.3089 | -2683.0972 | -3146.5217 | 0.2339 | 0.1353 |
0.6298 | 0.91 | 3460 | 0.6421 | 0.9683 | 0.6617 | 0.6670 | 0.3066 | -2684.4517 | -3148.1047 | 0.2362 | 0.1376 |
0.634 | 0.91 | 3480 | 0.6420 | 0.9671 | 0.6600 | 0.6640 | 0.3071 | -2684.6169 | -3148.2190 | 0.2363 | 0.1376 |
0.6325 | 0.92 | 3500 | 0.6422 | 0.9461 | 0.6408 | 0.6670 | 0.3053 | -2686.5374 | -3150.3208 | 0.2398 | 0.1410 |
0.6207 | 0.92 | 3520 | 0.6423 | 0.9349 | 0.6315 | 0.6670 | 0.3034 | -2687.4702 | -3151.4434 | 0.2420 | 0.1432 |
0.6435 | 0.93 | 3540 | 0.6423 | 0.9279 | 0.6254 | 0.6630 | 0.3025 | -2688.0842 | -3152.1453 | 0.2425 | 0.1440 |
0.6271 | 0.93 | 3560 | 0.6428 | 0.9143 | 0.6145 | 0.6670 | 0.2998 | -2689.1689 | -3153.5029 | 0.2442 | 0.1455 |
0.6405 | 0.94 | 3580 | 0.6426 | 0.9048 | 0.6055 | 0.6670 | 0.2994 | -2690.0718 | -3154.4497 | 0.2447 | 0.1459 |
0.6822 | 0.94 | 3600 | 0.6424 | 0.9191 | 0.6187 | 0.6610 | 0.3005 | -2688.7505 | -3153.0198 | 0.2428 | 0.1443 |
0.6431 | 0.95 | 3620 | 0.6423 | 0.9294 | 0.6263 | 0.6670 | 0.3031 | -2687.9922 | -3151.9922 | 0.2417 | 0.1429 |
0.6189 | 0.95 | 3640 | 0.6424 | 0.9340 | 0.6305 | 0.6690 | 0.3034 | -2687.5674 | -3151.5378 | 0.2410 | 0.1422 |
0.6516 | 0.96 | 3660 | 0.6424 | 0.9430 | 0.6385 | 0.6700 | 0.3045 | -2686.7739 | -3150.6345 | 0.2398 | 0.1409 |
0.6229 | 0.96 | 3680 | 0.6422 | 0.9399 | 0.6361 | 0.6680 | 0.3038 | -2687.0042 | -3150.9431 | 0.2402 | 0.1416 |
0.6209 | 0.97 | 3700 | 0.6424 | 0.9390 | 0.6353 | 0.6690 | 0.3037 | -2687.0925 | -3151.0369 | 0.2406 | 0.1419 |
0.5807 | 0.97 | 3720 | 0.6425 | 0.9358 | 0.6323 | 0.6700 | 0.3034 | -2687.3884 | -3151.3577 | 0.2408 | 0.1421 |
0.6304 | 0.98 | 3740 | 0.6423 | 0.9440 | 0.6394 | 0.6670 | 0.3047 | -2686.6794 | -3150.5283 | 0.2406 | 0.1419 |
0.6049 | 0.98 | 3760 | 0.6424 | 0.9451 | 0.6405 | 0.6660 | 0.3046 | -2686.5706 | -3150.4238 | 0.2391 | 0.1403 |
0.6624 | 0.99 | 3780 | 0.6424 | 0.9449 | 0.6407 | 0.6640 | 0.3042 | -2686.5491 | -3150.4412 | 0.2395 | 0.1407 |
0.6649 | 0.99 | 3800 | 0.6423 | 0.9422 | 0.6378 | 0.6660 | 0.3044 | -2686.8362 | -3150.7134 | 0.2403 | 0.1415 |
0.638 | 1.0 | 3820 | 0.6423 | 0.9449 | 0.6403 | 0.6670 | 0.3047 | -2686.5935 | -3150.4404 | 0.2397 | 0.1410 |
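The table above is easier to inspect once parsed. The following sketch parses a handful of rows copied verbatim from it (not the full log) and picks the row with the lowest validation loss among them. The shortened column names are hypothetical labels chosen for this example.

```python
# A few rows copied verbatim from the table above (a sample, not the full log).
rows = """
0.6918 | 0.01 | 20 | 0.6927 | 0.0105 | 0.0089 | 0.4990 | 0.0016 | -2749.7332 | -3243.8831 | 0.4756 | 0.4010 |
0.633 | 0.21 | 800 | 0.6989 | 2.0748 | 1.6194 | 0.6130 | 0.4554 | -2588.6804 | -3037.4561 | 0.1522 | 0.0436 |
0.6487 | 0.85 | 3260 | 0.6414 | 1.1003 | 0.7762 | 0.6670 | 0.3241 | -2673.0029 | -3134.9077 | 0.2102 | 0.1092 |
0.638 | 1.0 | 3820 | 0.6423 | 0.9449 | 0.6403 | 0.6670 | 0.3047 | -2686.5935 | -3150.4404 | 0.2397 | 0.1410 |
"""

# Hypothetical short names, in the same order as the table header.
columns = [
    "train_loss", "epoch", "step", "val_loss",
    "rewards_chosen", "rewards_rejected", "rewards_acc", "rewards_margin",
    "logps_rejected", "logps_chosen", "logits_rejected", "logits_chosen",
]


def parse(text):
    """Turn pipe-separated rows into a list of dicts keyed by column name."""
    records = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.split("|") if c.strip()]
        records.append(dict(zip(columns, map(float, cells))))
    return records


records = parse(rows)
best = min(records, key=lambda r: r["val_loss"])
print(int(best["step"]), best["val_loss"])  # 3260 0.6414
```

Among the sampled rows, step 3260 has a slightly lower validation loss (0.6414) than the final step (0.6423), so the last checkpoint is not necessarily the best one by this metric.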
Framework versions
- PEFT 0.7.1
- Transformers 4.36.2
- Pytorch 2.2.1+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
Model tree for chanchan7/llama-7b-dpo-qlora-relu
Base model
meta-llama/Llama-2-7b-chat-hf
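
The card gives no usage instructions. Since PEFT appears in the framework versions and the name mentions QLoRA, one plausible way to load the model is as a PEFT adapter on top of the base model. The sketch below assumes that the repository named above hosts adapter weights and that access to the gated base model has been granted; neither is confirmed by this card. `load_model` is a hypothetical helper.

```python
REPO_ID = "chanchan7/llama-7b-dpo-qlora-relu"  # repository named in this card
BASE_ID = "meta-llama/Llama-2-7b-chat-hf"      # base model named in this card


def load_model(repo_id: str = REPO_ID, base_id: str = BASE_ID):
    """Load tokenizer and model; assumes repo_id holds a PEFT adapter."""
    # Imports are deferred so the heavy dependencies are only needed on call.
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # Reads the adapter config, loads the base model it points to,
    # and attaches the adapter weights.
    model = AutoPeftModelForCausalLM.from_pretrained(
        repo_id, torch_dtype=torch.float16
    )
    return tokenizer, model


# Usage (downloads several GB of weights):
# tokenizer, model = load_model()
```

If the repository instead contains fully merged weights, `AutoModelForCausalLM.from_pretrained` from Transformers would be the call to try.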