cj453/dense_reward_trainer_final_opt__NumTrainEpochs5_SaveStrategiesno_reward_modeling_anthropic_hh Updated 3 days ago • 4
cj453/dense_reward_trainer_final_opt__NumTrainEpochs5_SaveStrategiesepoch_reward_modeling_anthropic_hh Updated 4 days ago • 6
cj453/dense_reward_trainer_final_opt__NumTrainEpochs2_SaveStrategiesno_reward_modeling_anthropic_hh Updated 5 days ago • 4
cj453/dense_reward_trainer_final_opt__NumTrainEpochs2_SaveStrategiesepoch_reward_modeling_anthropic_hh Updated 5 days ago • 4