DeepRL-PPO-LLv2 / LunarLander-v2-PPO / policy.optimizer.pth

Commit History

Baseline: LR=3e-4/.99, epochs=2e6
f588a6c

0x05a4 committed on

Baseline: LR=1e-4, epochs=1e6
1d38dd0

0x05a4 committed on

Baseline 1M epochs
6e56ef6

0x05a4 committed on