---
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
tags:
  - summarization
  - generated_from_trainer
model-index:
  - name: sft-sum-chosen-10lp-shuff-full-tiny_same_params
    results: []
---

# sft-sum-chosen-10lp-shuff-full-tiny_same_params

This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on the [martimfasantos/openai-summarize-tldr](https://huggingface.co/datasets/martimfasantos/openai-summarize-tldr) dataset. It achieves the following results on the evaluation set:

- Loss: 1.8887
- Nll Loss: 1.8968
- Logps/best: -71.1814
- Rewards/chosen: 2.2080
- Rewards/rejected: -0.6886
- Rewards/accuracies: 1.0
- Rewards/margins: 2.8966
- Logps/rejected: -14.2972
- Logps/chosen: -71.1814
- Logits/rejected: -3.0553
- Logits/chosen: -3.4224
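Since the dataset targets Reddit TL;DR summarization, a minimal usage sketch follows. It assumes the checkpoint is published on the Hub under this repository id; the `TL;DR:` prompt template is an illustrative convention for this dataset family, not a format confirmed by this card.

```python
# Minimal usage sketch. Assumptions: the checkpoint lives on the Hub under
# this repo id, and the post + "TL;DR:" prompt format matches training.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "martimfasantos/sft-sum-chosen-10lp-shuff-full-tiny_same_params"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

post = "I adopted a rescue dog last month and my life has changed completely ..."
prompt = f"{post}\nTL;DR:"  # hypothetical prompt template

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens (the summary).
summary = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(summary)
```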

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
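For reference, the list above maps onto `transformers.TrainingArguments` roughly as below. This is a sketch reconstructed from the logged values, not the original training script; `output_dir` and the distributed launch (2 GPUs, e.g. via `accelerate` or `torchrun`) are placeholders.

```python
# Sketch of the logged hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sft-sum-chosen-10lp-shuff-full-tiny_same_params",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,  # train_batch_size above
    per_device_eval_batch_size=4,   # eval_batch_size above
    gradient_accumulation_steps=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
# Effective batch size: 8 per device x 2 GPUs x 2 accumulation steps = 32,
# matching total_train_batch_size above.
```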

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Nll Loss | Logps/best | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------:|:----------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 1.9469        | 0.2193 | 800  | 1.9582          | 1.9648   | -73.7246   | 1.9537         | -0.4240          | 1.0                | 2.3777          | -11.6512       | -73.7246     | -2.7987         | -3.1275       |
| 1.9813        | 0.4386 | 1600 | 1.9285          | 1.9369   | -72.6769   | 2.0585         | -0.5023          | 1.0                | 2.5607          | -12.4339       | -72.6769     | -2.9393         | -3.2910       |
| 1.9215        | 0.6579 | 2400 | 1.9049          | 1.9127   | -71.7733   | 2.1488         | -0.5719          | 1.0                | 2.7207          | -13.1300       | -71.7733     | -3.0198         | -3.3812       |
| 1.8655        | 0.8772 | 3200 | 1.8887          | 1.8968   | -71.1814   | 2.2080         | -0.6886          | 1.0                | 2.8966          | -14.2972       | -71.1814     | -3.0553         | -3.4224       |
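The reward columns follow the usual preference-training logging convention: the margin is the chosen reward minus the rejected reward (an observation about the numbers above, not a statement from the training script). A quick check against the table:

```python
# Consistency check: rewards/margins == rewards/chosen - rewards/rejected
# at every eval step (values copied from the table rows above).
rows = [  # (rewards/chosen, rewards/rejected, rewards/margins)
    (1.9537, -0.4240, 2.3777),
    (2.0585, -0.5023, 2.5607),
    (2.1488, -0.5719, 2.7207),
    (2.2080, -0.6886, 2.8966),
]
for chosen, rejected, margin in rows:
    assert abs((chosen - rejected) - margin) < 1e-3  # within rounding
```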

### Framework versions

- Transformers 4.43.3
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1