---
license: apache-2.0
base_model: EleutherAI/pythia-410m-deduped
tags:
- alignment-handbook
- generated_from_trainer
datasets:
- princeton-nlp/llama3-ultrafeedback
model-index:
- name: pythia-410m-deduped
  results: []
---
# pythia-410m-deduped

This model is a fine-tuned version of [EleutherAI/pythia-410m-deduped](https://huggingface.co/EleutherAI/pythia-410m-deduped) on the [princeton-nlp/llama3-ultrafeedback](https://huggingface.co/datasets/princeton-nlp/llama3-ultrafeedback) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.7801
- Original Losses: 1.7969
- Weight: 1.0
- Abs Diff: 0.4453
- Rewards/chosen: -4.875
- Rewards/rejected: -5.0625
- Rewards/accuracies: 0.4405
- Rewards/margins: 0.2002
- Logps/rejected: -2.0312
- Logps/chosen: -1.9453
- Logits/rejected: 5.6875
- Logits/chosen: 5.7188
- All Logps 1: -656.8973
- All Logps 1 Values: -656.8973
- All Logps 2: 434.6329
- All Logps 2 Values: 434.6329
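
As a quick reference, here is a minimal sketch of loading the model for inference with `transformers`; the repo id below is a placeholder for wherever this fine-tune is hosted, so substitute the actual Hub path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with the actual Hub path of this fine-tune.
model_id = "your-org/pythia-410m-deduped"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```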
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
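
The preference data itself is the dataset linked above and can be inspected directly. A minimal sketch using the `datasets` library (the split name is an assumption; check the dataset card for the exact splits):

```python
from datasets import load_dataset

# "train" split assumed; consult the dataset card for the exact split names.
ds = load_dataset("princeton-nlp/llama3-ultrafeedback", split="train")
print(ds.column_names)  # expected to include prompt/chosen/rejected-style preference fields
print(ds[0])
```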
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 36
- eval_batch_size: 36
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 2304
- total_eval_batch_size: 288
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
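
For orientation, the list above can be reconstructed as a `transformers.TrainingArguments` configuration; this is a sketch, not the original config, since the run used the alignment-handbook recipes, whose config files are structured differently:

```python
from transformers import TrainingArguments

# Reconstruction of the reported hyperparameters, not the original recipe file.
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the optimizer defaults.
args = TrainingArguments(
    output_dir="pythia-410m-deduped",
    learning_rate=1e-6,
    per_device_train_batch_size=36,  # 36 x 8 GPUs x 8 accumulation steps = 2304 total
    per_device_eval_batch_size=36,   # 36 x 8 GPUs = 288 total
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
)
```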
### Training results
| Training Loss | Epoch | Step | Validation Loss | Original Losses | Weight | Abs Diff | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | All Logps 1 | All Logps 1 Values | All Logps 2 | All Logps 2 Values |
|:-------------:|:-----:|:----:|:---------------:|:---------------:|:------:|:--------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:-----------:|:------------------:|:-----------:|:------------------:|
| 1.9612 | 0.0385 | 1 | 1.7894 | 1.8125 | 1.0 | 0.4492 | -4.9062 | -5.0938 | 0.4405 | 0.1895 | -2.0312 | -1.9688 | 5.6875 | 5.7188 | -657.8339 | -657.8338 | 434.6329 | 434.6329 |
| 1.9612 | 0.0769 | 2 | 1.7887 | 1.8125 | 1.0 | 0.4531 | -4.9062 | -5.0938 | 0.4444 | 0.1895 | -2.0312 | -1.9609 | 5.6875 | 5.6875 | -657.5561 | -657.5560 | 434.6329 | 434.6329 |
| 1.9612 | 0.1154 | 3 | 1.7887 | 1.8203 | 1.0 | 0.4512 | -4.9375 | -5.125 | 0.4444 | 0.1885 | -2.0469 | -1.9688 | 5.6875 | 5.7188 | -657.2574 | -657.2574 | 434.6329 | 434.6329 |
| 1.9612 | 0.1538 | 4 | 1.7891 | 1.8125 | 1.0 | 0.4512 | -4.9375 | -5.0938 | 0.4365 | 0.1807 | -2.0469 | -1.9688 | 5.6875 | 5.7188 | -657.5514 | -657.5513 | 434.6329 | 434.6329 |
| 1.868 | 0.1923 | 5 | 1.7881 | 1.8125 | 1.0 | 0.4473 | -4.9062 | -5.0938 | 0.4325 | 0.1816 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -656.7651 | -656.7651 | 434.6329 | 434.6329 |
| 1.868 | 0.2308 | 6 | 1.7911 | 1.8203 | 1.0 | 0.4512 | -4.9375 | -5.0938 | 0.4524 | 0.1670 | -2.0469 | -1.9766 | 5.6875 | 5.7188 | -658.1024 | -658.1024 | 434.6329 | 434.6329 |
| 1.868 | 0.2692 | 7 | 1.7870 | 1.8125 | 1.0 | 0.4512 | -4.9062 | -5.0938 | 0.4484 | 0.1846 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.3370 | -657.3370 | 434.6329 | 434.6329 |
| 1.868 | 0.3077 | 8 | 1.7835 | 1.8203 | 1.0 | 0.4473 | -4.9062 | -5.0938 | 0.4405 | 0.1729 | -2.0312 | -1.9688 | 5.6562 | 5.6875 | -657.3589 | -657.3589 | 434.6329 | 434.6329 |
| 1.868 | 0.3462 | 9 | 1.7860 | 1.8125 | 1.0 | 0.4453 | -4.9062 | -5.0938 | 0.4405 | 0.1855 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.4703 | -657.4702 | 434.6329 | 434.6329 |
| 1.886 | 0.3846 | 10 | 1.7897 | 1.8125 | 1.0 | 0.4453 | -4.9062 | -5.0938 | 0.4325 | 0.1855 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.2245 | -657.2244 | 434.6329 | 434.6329 |
| 1.886 | 0.4231 | 11 | 1.7852 | 1.8125 | 1.0 | 0.4473 | -4.9062 | -5.0938 | 0.4484 | 0.1807 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.7448 | -657.7448 | 434.6329 | 434.6329 |
| 1.886 | 0.4615 | 12 | 1.7827 | 1.8203 | 1.0 | 0.4492 | -4.9062 | -5.0938 | 0.4603 | 0.1797 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.9037 | -657.9037 | 434.6329 | 434.6329 |
| 1.886 | 0.5 | 13 | 1.7844 | 1.8203 | 1.0 | 0.4512 | -4.9062 | -5.0625 | 0.4365 | 0.1689 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.7488 | -657.7488 | 434.6329 | 434.6329 |
| 1.886 | 0.5385 | 14 | 1.7828 | 1.8047 | 1.0 | 0.4395 | -4.875 | -5.0625 | 0.4405 | 0.1885 | -2.0312 | -1.9531 | 5.6875 | 5.7188 | -657.5707 | -657.5707 | 434.6329 | 434.6329 |
| 1.8572 | 0.5769 | 15 | 1.7852 | 1.8125 | 1.0 | 0.4453 | -4.9062 | -5.0625 | 0.4365 | 0.1768 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.2753 | -657.2753 | 434.6329 | 434.6329 |
| 1.8572 | 0.6154 | 16 | 1.7798 | 1.8125 | 1.0 | 0.4414 | -4.9062 | -5.0625 | 0.4246 | 0.1709 | -2.0156 | -1.9531 | 5.6875 | 5.7188 | -657.5228 | -657.5228 | 434.6329 | 434.6329 |
| 1.8572 | 0.6538 | 17 | 1.7797 | 1.8047 | 1.0 | 0.4414 | -4.875 | -5.0625 | 0.4484 | 0.1816 | -2.0312 | -1.9531 | 5.6875 | 5.7188 | -657.8073 | -657.8073 | 434.6329 | 434.6329 |
| 1.8572 | 0.6923 | 18 | 1.7830 | 1.8125 | 1.0 | 0.4375 | -4.9062 | -5.0625 | 0.4405 | 0.1631 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.4370 | -657.4370 | 434.6329 | 434.6329 |
| 1.8572 | 0.7308 | 19 | 1.7831 | 1.8047 | 1.0 | 0.4414 | -4.875 | -5.0625 | 0.4524 | 0.1787 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.5411 | -657.5412 | 434.6329 | 434.6329 |
| 1.8374 | 0.7692 | 20 | 1.7812 | 1.8047 | 1.0 | 0.4512 | -4.9062 | -5.0938 | 0.4524 | 0.1973 | -2.0312 | -1.9531 | 5.6875 | 5.7188 | -657.5830 | -657.5831 | 434.6329 | 434.6329 |
| 1.8374 | 0.8077 | 21 | 1.7850 | 1.8125 | 1.0 | 0.4414 | -4.875 | -5.0625 | 0.4444 | 0.1719 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.6910 | -657.6910 | 434.6329 | 434.6329 |
| 1.8374 | 0.8462 | 22 | 1.7851 | 1.8047 | 1.0 | 0.4434 | -4.9062 | -5.0625 | 0.4405 | 0.1836 | -2.0312 | -1.9531 | 5.6875 | 5.7188 | -657.1679 | -657.1679 | 434.6329 | 434.6329 |
| 1.8374 | 0.8846 | 23 | 1.7782 | 1.8047 | 1.0 | 0.4375 | -4.9062 | -5.0625 | 0.4365 | 0.1748 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -658.0194 | -658.0193 | 434.6329 | 434.6329 |
| 1.8374 | 0.9231 | 24 | 1.7800 | 1.8047 | 1.0 | 0.4375 | -4.9062 | -5.0625 | 0.4524 | 0.1709 | -2.0312 | -1.9609 | 5.6875 | 5.7188 | -657.4482 | -657.4482 | 434.6329 | 434.6329 |
| 1.8714 | 0.9615 | 25 | 1.7788 | 1.7969 | 1.0 | 0.4375 | -4.875 | -5.0625 | 0.4325 | 0.1816 | -2.0312 | -1.9531 | 5.6875 | 5.7188 | -657.4512 | -657.4511 | 434.6329 | 434.6329 |
| 1.8714 | 1.0 | 26 | 1.7801 | 1.7969 | 1.0 | 0.4453 | -4.875 | -5.0625 | 0.4405 | 0.2002 | -2.0312 | -1.9453 | 5.6875 | 5.7188 | -656.8973 | -656.8973 | 434.6329 | 434.6329 |
### Framework versions
- Transformers 4.42.3
- Pytorch 2.2.2+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1