Reinforce Agent playing Pixelcopter-PLE-v0
This is a trained model of a Reinforce agent playing Pixelcopter-PLE-v0. To learn how to use this model and train your own, check Unit 4 of the Deep Reinforcement Learning Course: https://huggingface.co/deep-rl-course/unit4/introduction
Training call: scores = reinforce(pixelcopter_policy, pixelcopter_optimizer, pixelcopter_hyperparameters["n_training_episodes"], pixelcopter_hyperparameters["max_t"], pixelcopter_hyperparameters["gamma"], 1000)
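The `reinforce` call in this card takes a `gamma` argument, the discount factor. The card does not show the body of `reinforce`, so the following is only a minimal sketch of how a Reinforce-style method typically turns one episode's rewards into discounted returns. The helper name `discounted_returns` and the sample rewards are hypothetical, not taken from this model.

```python
from collections import deque


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode.

    Hypothetical helper for illustration; not the card's actual code.
    """
    returns = deque()
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.appendleft(running)
    return list(returns)


# Made-up three-step episode with a reward of 1.0 per step.
example_returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
print(example_returns)
```

In a Reinforce update, each action's log-probability is weighted by the return from that step onward, so a smaller `gamma` makes the agent care less about rewards far in the future.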
Evaluation results
- mean_reward on Pixelcopter-PLE-v0 (self-reported): 42.70 +/- 24.24
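A figure in the form "mean +/- spread" is usually the mean and standard deviation of the total reward over a set of evaluation episodes. The card does not say how its number was computed, so this is a sketch under assumptions: the helper `summarize_rewards` is hypothetical, the population standard deviation is an assumed choice, and the sample rewards are made up rather than taken from this model.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, population standard deviation) of per-episode rewards.

    Hypothetical helper; the population std is an assumption, since the
    card does not state which estimator was used.
    """
    mean_reward = statistics.fmean(episode_rewards)
    std_reward = statistics.pstdev(episode_rewards)
    return mean_reward, std_reward


# Made-up total rewards from three evaluation episodes.
mean_reward, std_reward = summarize_rewards([10.0, 20.0, 30.0])
summary = f"{mean_reward:.2f} +/- {std_reward:.2f}"
print(summary)
```

A spread as large as the one reported here relative to the mean suggests that individual episodes vary widely, so a single episode is a poor indicator of the agent's typical score.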