DeepRL-PPO-LLv2 / results.json
0x05a4
Baseline: LR=1e-4, epochs=1e6
1d38dd0
165 Bytes
{"mean_reward": 182.78596194151342, "std_reward": 30.821379520193084, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-05-14T17:26:42.074810"}