Retrained to 10M steps with a higher play_against_latest_model_ratio (0.75 instead of 0.25); this helped the model learn to play defense better. f5c5d35 verified Statos6 committed on Mar 13
Trained with Optuna hyperparameter optimization using ml-agents-optuna. 3c53f01 verified Statos6 committed on Mar 13
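
For context on the play_against_latest_model_ratio change noted in the retraining commit: in ML-Agents self-play this setting is documented as the probability that an agent faces the latest opponent policy rather than a past snapshot. The sketch below only illustrates that sampling rule; `pick_opponent`, `latest_fraction`, and the snapshot names are hypothetical and are not taken from this repository or from the ML-Agents source.

```python
import random


def pick_opponent(snapshots, latest, ratio, rng):
    """Return `latest` with probability `ratio`, otherwise a random past snapshot.

    Hypothetical helper: it mirrors the documented meaning of
    play_against_latest_model_ratio, not the actual ML-Agents code.
    """
    if not snapshots or rng.random() < ratio:
        return latest
    return rng.choice(snapshots)


def latest_fraction(ratio, trials=10_000, seed=0):
    """Estimate how often the latest policy is chosen as the opponent."""
    rng = random.Random(seed)
    snapshots = ["snap_0", "snap_1", "snap_2"]  # placeholder snapshot names
    hits = sum(
        pick_opponent(snapshots, "latest", ratio, rng) == "latest"
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    # Compare the old (0.25) and new (0.75) settings from the commit message.
    for ratio in (0.25, 0.75):
        print(ratio, latest_fraction(ratio))
```

Under this reading, raising the ratio from 0.25 to 0.75 means most matches are played against the current opponent policy instead of older snapshots.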