diff --git "a/sf_log.txt" "b/sf_log.txt" new file mode 100644--- /dev/null +++ "b/sf_log.txt" @@ -0,0 +1,983 @@ +[2024-08-05 13:01:39,185][06154] Saving configuration to /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/config.json... +[2024-08-05 13:01:39,185][06154] Rollout worker 0 uses device cpu +[2024-08-05 13:01:39,185][06154] Rollout worker 1 uses device cpu +[2024-08-05 13:01:39,185][06154] Rollout worker 2 uses device cpu +[2024-08-05 13:01:39,185][06154] Rollout worker 3 uses device cpu +[2024-08-05 13:01:39,185][06154] Rollout worker 4 uses device cpu +[2024-08-05 13:01:39,185][06154] Rollout worker 5 uses device cpu +[2024-08-05 13:01:39,186][06154] Rollout worker 6 uses device cpu +[2024-08-05 13:01:39,186][06154] Rollout worker 7 uses device cpu +[2024-08-05 13:01:39,186][06154] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 +[2024-08-05 13:01:39,196][06154] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:01:39,197][06154] InferenceWorker_p0-w0: min num requests: 2 +[2024-08-05 13:01:39,214][06154] Starting all processes... +[2024-08-05 13:01:39,214][06154] Starting process learner_proc0 +[2024-08-05 13:01:39,523][06154] Starting all processes... +[2024-08-05 13:01:39,527][06154] Starting process inference_proc0-0 +[2024-08-05 13:01:39,528][06154] Starting process rollout_proc0 +[2024-08-05 13:01:39,529][06154] Starting process rollout_proc1 +[2024-08-05 13:01:39,533][06154] Starting process rollout_proc2 +[2024-08-05 13:01:39,534][06154] Starting process rollout_proc3 +[2024-08-05 13:01:39,542][06154] Starting process rollout_proc4 +[2024-08-05 13:01:39,542][06154] Starting process rollout_proc5 +[2024-08-05 13:01:39,545][06154] Starting process rollout_proc6 +[2024-08-05 13:01:39,545][06154] Starting process rollout_proc7 +[2024-08-05 13:01:41,117][06220] Worker 5 uses CPU cores [5] +[2024-08-05 13:01:41,206][06215] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:01:41,206][06215] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2024-08-05 13:01:41,237][06215] Num visible devices: 1 +[2024-08-05 13:01:41,314][06230] Worker 4 uses CPU cores [4] +[2024-08-05 13:01:41,340][06217] Worker 0 uses CPU cores [0] +[2024-08-05 13:01:41,422][06229] Worker 6 uses CPU cores [6] +[2024-08-05 13:01:41,422][06216] Worker 2 uses CPU cores [2] +[2024-08-05 13:01:41,423][06218] Worker 1 uses CPU cores [1] +[2024-08-05 13:01:41,572][06202] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:01:41,573][06202] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2024-08-05 13:01:41,595][06202] Num visible devices: 1 +[2024-08-05 13:01:41,616][06202] Starting seed is not provided +[2024-08-05 13:01:41,616][06202] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:01:41,617][06202] Initializing actor-critic model on device cuda:0 +[2024-08-05 13:01:41,617][06202] RunningMeanStd input shape: (17,) +[2024-08-05 13:01:41,617][06202] RunningMeanStd input shape: (1,) +[2024-08-05 13:01:41,679][06202] Created Actor Critic model with architecture: +[2024-08-05 13:01:41,679][06202] ActorCriticSharedWeights( + (obs_normalizer): ObservationNormalizer( + (running_mean_std): RunningMeanStdDictInPlace( + (running_mean_std): ModuleDict( + (obs): RunningMeanStdInPlace() + ) + ) + ) + (returns_normalizer): 
RecursiveScriptModule(original_name=RunningMeanStdInPlace) + (encoder): MultiInputEncoder( + (encoders): ModuleDict( + (obs): MlpEncoder( + (mlp_head): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Linear) + (1): RecursiveScriptModule(original_name=Tanh) + (2): RecursiveScriptModule(original_name=Linear) + (3): RecursiveScriptModule(original_name=Tanh) + ) + ) + ) + ) + (core): ModelCoreIdentity() + (decoder): MlpDecoder( + (mlp): Identity() + ) + (critic_linear): Linear(in_features=64, out_features=1, bias=True) + (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev( + (distribution_linear): Linear(in_features=64, out_features=6, bias=True) + ) +) +[2024-08-05 13:01:41,680][06221] Worker 7 uses CPU cores [7] +[2024-08-05 13:01:41,692][06219] Worker 3 uses CPU cores [3] +[2024-08-05 13:01:41,915][06202] Using optimizer +[2024-08-05 13:01:42,554][06202] No checkpoints found +[2024-08-05 13:01:42,554][06202] Did not load from checkpoint, starting from scratch! +[2024-08-05 13:01:42,555][06202] Initialized policy 0 weights for model version 0 +[2024-08-05 13:01:42,557][06202] LearnerWorker_p0 finished initialization! +[2024-08-05 13:01:42,557][06202] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:01:42,759][06215] RunningMeanStd input shape: (17,) +[2024-08-05 13:01:42,760][06215] RunningMeanStd input shape: (1,) +[2024-08-05 13:01:42,818][06154] Inference worker 0-0 is ready! +[2024-08-05 13:01:42,819][06154] All inference workers are ready! Signal rollout workers to start! +[2024-08-05 13:01:42,919][06219] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,919][06219] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,919][06217] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,920][06217] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,922][06220] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,922][06220] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,923][06218] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,923][06216] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,924][06218] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,924][06216] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,926][06221] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,926][06230] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,926][06221] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,926][06230] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,932][06219] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,933][06217] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,934][06220] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,937][06218] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,939][06216] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,940][06221] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,943][06230] Decorrelating experience for 128 frames... +[2024-08-05 13:01:42,957][06219] Decorrelating experience for 192 frames... +[2024-08-05 13:01:42,958][06217] Decorrelating experience for 192 frames... +[2024-08-05 13:01:42,961][06220] Decorrelating experience for 192 frames... +[2024-08-05 13:01:42,964][06218] Decorrelating experience for 192 frames... 
+[2024-08-05 13:01:42,965][06216] Decorrelating experience for 192 frames... +[2024-08-05 13:01:42,967][06221] Decorrelating experience for 192 frames... +[2024-08-05 13:01:42,972][06230] Decorrelating experience for 192 frames... +[2024-08-05 13:01:42,973][06229] Decorrelating experience for 0 frames... +[2024-08-05 13:01:42,973][06229] Decorrelating experience for 64 frames... +[2024-08-05 13:01:42,986][06229] Decorrelating experience for 128 frames... +[2024-08-05 13:01:43,004][06219] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,007][06217] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,010][06229] Decorrelating experience for 192 frames... +[2024-08-05 13:01:43,013][06220] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,016][06218] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,016][06216] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,017][06221] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,026][06230] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,052][06219] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,055][06229] Decorrelating experience for 256 frames... +[2024-08-05 13:01:43,056][06217] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,062][06220] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,065][06216] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,067][06218] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,068][06221] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,077][06230] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,107][06229] Decorrelating experience for 320 frames... +[2024-08-05 13:01:43,112][06219] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,118][06217] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,126][06220] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,128][06216] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,131][06221] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,131][06218] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,183][06230] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,183][06219] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,194][06217] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,197][06220] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,202][06216] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,210][06229] Decorrelating experience for 384 frames... +[2024-08-05 13:01:43,215][06218] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,220][06221] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,284][06230] Decorrelating experience for 448 frames... +[2024-08-05 13:01:43,291][06229] Decorrelating experience for 448 frames... +[2024-08-05 13:01:46,435][06154] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 8192. Throughput: 0: nan. Samples: 8192. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:01:46,435][06154] Avg episode reward: [(0, '6.714')] +[2024-08-05 13:01:48,810][06215] Updated weights for policy 0, policy_version 80 (0.0006) +[2024-08-05 13:01:51,435][06154] Fps is (10 sec: 11469.3, 60 sec: 11469.3, 300 sec: 11469.3). Total num frames: 65536. Throughput: 0: 6585.1. Samples: 41116. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:01:51,435][06154] Avg episode reward: [(0, '282.150')] +[2024-08-05 13:01:51,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000000128_65536.pth... +[2024-08-05 13:01:52,590][06215] Updated weights for policy 0, policy_version 160 (0.0006) +[2024-08-05 13:01:56,105][06215] Updated weights for policy 0, policy_version 240 (0.0006) +[2024-08-05 13:01:56,435][06154] Fps is (10 sec: 11469.0, 60 sec: 11469.0, 300 sec: 11469.0). Total num frames: 122880. Throughput: 0: 9970.2. Samples: 107892. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:01:56,435][06154] Avg episode reward: [(0, '328.089')] +[2024-08-05 13:01:56,480][06202] Saving new best policy, reward=328.089! +[2024-08-05 13:01:59,191][06154] Heartbeat connected on Batcher_0 +[2024-08-05 13:01:59,193][06154] Heartbeat connected on LearnerWorker_p0 +[2024-08-05 13:01:59,200][06154] Heartbeat connected on InferenceWorker_p0-w0 +[2024-08-05 13:01:59,201][06154] Heartbeat connected on RolloutWorker_w1 +[2024-08-05 13:01:59,203][06154] Heartbeat connected on RolloutWorker_w0 +[2024-08-05 13:01:59,204][06154] Heartbeat connected on RolloutWorker_w2 +[2024-08-05 13:01:59,208][06154] Heartbeat connected on RolloutWorker_w3 +[2024-08-05 13:01:59,208][06154] Heartbeat connected on RolloutWorker_w4 +[2024-08-05 13:01:59,210][06154] Heartbeat connected on RolloutWorker_w5 +[2024-08-05 13:01:59,212][06154] Heartbeat connected on RolloutWorker_w6 +[2024-08-05 13:01:59,214][06154] Heartbeat connected on RolloutWorker_w7 +[2024-08-05 13:01:59,664][06215] Updated weights for policy 0, policy_version 320 (0.0006) +[2024-08-05 13:02:01,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11742.0, 300 sec: 11742.0). Total num frames: 184320. Throughput: 0: 11379.3. Samples: 178880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:01,435][06154] Avg episode reward: [(0, '388.313')] +[2024-08-05 13:02:01,436][06202] Saving new best policy, reward=388.313! +[2024-08-05 13:02:03,319][06215] Updated weights for policy 0, policy_version 400 (0.0006) +[2024-08-05 13:02:06,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11264.1, 300 sec: 11264.1). Total num frames: 233472. Throughput: 0: 10145.9. Samples: 211108. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:06,442][06154] Avg episode reward: [(0, '492.145')] +[2024-08-05 13:02:06,444][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000000456_233472.pth... +[2024-08-05 13:02:06,447][06202] Saving new best policy, reward=492.145! +[2024-08-05 13:02:07,106][06215] Updated weights for policy 0, policy_version 480 (0.0006) +[2024-08-05 13:02:10,809][06215] Updated weights for policy 0, policy_version 560 (0.0006) +[2024-08-05 13:02:11,435][06154] Fps is (10 sec: 10649.6, 60 sec: 11305.0, 300 sec: 11305.0). Total num frames: 290816. Throughput: 0: 10645.0. Samples: 274316. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:02:11,435][06154] Avg episode reward: [(0, '569.058')] +[2024-08-05 13:02:11,435][06202] Saving new best policy, reward=569.058! +[2024-08-05 13:02:14,808][06215] Updated weights for policy 0, policy_version 640 (0.0007) +[2024-08-05 13:02:16,435][06154] Fps is (10 sec: 11058.8, 60 sec: 11195.7, 300 sec: 11195.7). Total num frames: 344064. Throughput: 0: 11062.0. Samples: 340052. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:02:16,436][06154] Avg episode reward: [(0, '779.404')] +[2024-08-05 13:02:16,539][06202] Saving new best policy, reward=779.404! +[2024-08-05 13:02:18,700][06215] Updated weights for policy 0, policy_version 720 (0.0007) +[2024-08-05 13:02:21,435][06154] Fps is (10 sec: 10649.6, 60 sec: 11117.8, 300 sec: 11117.8). Total num frames: 397312. Throughput: 0: 10340.8. Samples: 370120. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:02:21,442][06154] Avg episode reward: [(0, '899.188')] +[2024-08-05 13:02:21,445][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000000776_397312.pth... +[2024-08-05 13:02:21,448][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000000128_65536.pth +[2024-08-05 13:02:21,448][06202] Saving new best policy, reward=899.188! +[2024-08-05 13:02:22,245][06215] Updated weights for policy 0, policy_version 800 (0.0006) +[2024-08-05 13:02:25,753][06215] Updated weights for policy 0, policy_version 880 (0.0007) +[2024-08-05 13:02:26,435][06154] Fps is (10 sec: 11059.6, 60 sec: 11161.7, 300 sec: 11161.7). Total num frames: 454656. Throughput: 0: 10797.2. Samples: 440076. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:02:26,435][06154] Avg episode reward: [(0, '888.403')] +[2024-08-05 13:02:29,700][06215] Updated weights for policy 0, policy_version 960 (0.0007) +[2024-08-05 13:02:31,435][06154] Fps is (10 sec: 10649.3, 60 sec: 11013.6, 300 sec: 11013.6). Total num frames: 503808. Throughput: 0: 11003.4. Samples: 503348. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:02:31,436][06154] Avg episode reward: [(0, '989.011')] +[2024-08-05 13:02:31,519][06202] Saving new best policy, reward=989.011! +[2024-08-05 13:02:33,623][06215] Updated weights for policy 0, policy_version 1040 (0.0007) +[2024-08-05 13:02:36,435][06154] Fps is (10 sec: 10649.6, 60 sec: 11059.2, 300 sec: 11059.2). Total num frames: 561152. Throughput: 0: 10987.5. Samples: 535552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:36,435][06154] Avg episode reward: [(0, '1233.961')] +[2024-08-05 13:02:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000001096_561152.pth... +[2024-08-05 13:02:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000000456_233472.pth +[2024-08-05 13:02:36,440][06202] Saving new best policy, reward=1233.961! +[2024-08-05 13:02:37,361][06215] Updated weights for policy 0, policy_version 1120 (0.0007) +[2024-08-05 13:02:40,609][06215] Updated weights for policy 0, policy_version 1200 (0.0006) +[2024-08-05 13:02:41,435][06154] Fps is (10 sec: 11878.9, 60 sec: 11171.0, 300 sec: 11171.0). Total num frames: 622592. Throughput: 0: 11057.6. Samples: 605484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:41,435][06154] Avg episode reward: [(0, '1259.215')] +[2024-08-05 13:02:41,435][06202] Saving new best policy, reward=1259.215! +[2024-08-05 13:02:44,178][06215] Updated weights for policy 0, policy_version 1280 (0.0008) +[2024-08-05 13:02:46,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11195.8, 300 sec: 11195.8). Total num frames: 679936. Throughput: 0: 11043.7. Samples: 675848. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:46,435][06154] Avg episode reward: [(0, '1326.594')] +[2024-08-05 13:02:46,435][06202] Saving new best policy, reward=1326.594! +[2024-08-05 13:02:47,622][06215] Updated weights for policy 0, policy_version 1360 (0.0006) +[2024-08-05 13:02:51,139][06215] Updated weights for policy 0, policy_version 1440 (0.0006) +[2024-08-05 13:02:51,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11264.0, 300 sec: 11279.8). Total num frames: 741376. Throughput: 0: 11108.5. Samples: 710992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:51,435][06154] Avg episode reward: [(0, '1338.804')] +[2024-08-05 13:02:51,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000001448_741376.pth... +[2024-08-05 13:02:51,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000000776_397312.pth +[2024-08-05 13:02:51,440][06202] Saving new best policy, reward=1338.804! +[2024-08-05 13:02:54,479][06215] Updated weights for policy 0, policy_version 1520 (0.0006) +[2024-08-05 13:02:56,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11264.0, 300 sec: 11293.3). Total num frames: 798720. Throughput: 0: 11288.8. Samples: 782312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:02:56,435][06154] Avg episode reward: [(0, '1476.831')] +[2024-08-05 13:02:56,435][06202] Saving new best policy, reward=1476.831! +[2024-08-05 13:02:57,986][06215] Updated weights for policy 0, policy_version 1600 (0.0006) +[2024-08-05 13:03:01,435][06154] Fps is (10 sec: 10649.4, 60 sec: 11059.2, 300 sec: 11195.7). Total num frames: 847872. Throughput: 0: 11227.0. Samples: 845264. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:03:01,435][06154] Avg episode reward: [(0, '1421.645')] +[2024-08-05 13:03:02,346][06215] Updated weights for policy 0, policy_version 1680 (0.0007) +[2024-08-05 13:03:05,748][06215] Updated weights for policy 0, policy_version 1760 (0.0007) +[2024-08-05 13:03:06,435][06154] Fps is (10 sec: 10649.6, 60 sec: 11195.7, 300 sec: 11212.8). Total num frames: 905216. Throughput: 0: 11254.9. Samples: 876588. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:06,435][06154] Avg episode reward: [(0, '1621.816')] +[2024-08-05 13:03:06,460][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000001776_909312.pth... +[2024-08-05 13:03:06,464][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000001096_561152.pth +[2024-08-05 13:03:06,464][06202] Saving new best policy, reward=1621.816! +[2024-08-05 13:03:09,111][06215] Updated weights for policy 0, policy_version 1840 (0.0006) +[2024-08-05 13:03:11,435][06154] Fps is (10 sec: 11878.6, 60 sec: 11264.0, 300 sec: 11276.1). Total num frames: 966656. Throughput: 0: 11361.2. Samples: 951332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:11,435][06154] Avg episode reward: [(0, '1463.182')] +[2024-08-05 13:03:12,622][06215] Updated weights for policy 0, policy_version 1920 (0.0006) +[2024-08-05 13:03:16,066][06215] Updated weights for policy 0, policy_version 2000 (0.0007) +[2024-08-05 13:03:16,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11400.6, 300 sec: 11332.3). Total num frames: 1028096. Throughput: 0: 11496.3. Samples: 1020676. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:16,435][06154] Avg episode reward: [(0, '1973.650')] +[2024-08-05 13:03:16,435][06202] Saving new best policy, reward=1973.650! +[2024-08-05 13:03:19,295][06215] Updated weights for policy 0, policy_version 2080 (0.0006) +[2024-08-05 13:03:21,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11537.1, 300 sec: 11382.6). Total num frames: 1089536. Throughput: 0: 11669.1. Samples: 1060660. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:21,442][06154] Avg episode reward: [(0, '2216.740')] +[2024-08-05 13:03:21,444][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000002128_1089536.pth... +[2024-08-05 13:03:21,447][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000001448_741376.pth +[2024-08-05 13:03:21,448][06202] Saving new best policy, reward=2216.740! +[2024-08-05 13:03:22,842][06215] Updated weights for policy 0, policy_version 2160 (0.0007) +[2024-08-05 13:03:26,411][06215] Updated weights for policy 0, policy_version 2240 (0.0007) +[2024-08-05 13:03:26,435][06154] Fps is (10 sec: 11878.2, 60 sec: 11537.0, 300 sec: 11386.9). Total num frames: 1146880. Throughput: 0: 11666.9. Samples: 1130496. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:26,435][06154] Avg episode reward: [(0, '2113.629')] +[2024-08-05 13:03:29,627][06215] Updated weights for policy 0, policy_version 2320 (0.0006) +[2024-08-05 13:03:31,434][06154] Fps is (10 sec: 11878.5, 60 sec: 11742.0, 300 sec: 11429.8). Total num frames: 1208320. Throughput: 0: 11657.0. Samples: 1200412. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:03:31,442][06154] Avg episode reward: [(0, '2379.994')] +[2024-08-05 13:03:31,443][06202] Saving new best policy, reward=2379.994! +[2024-08-05 13:03:33,008][06215] Updated weights for policy 0, policy_version 2400 (0.0007) +[2024-08-05 13:03:36,435][06154] Fps is (10 sec: 11878.7, 60 sec: 11741.9, 300 sec: 11431.6). Total num frames: 1265664. Throughput: 0: 11710.8. Samples: 1237976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:36,442][06154] Avg episode reward: [(0, '1957.990')] +[2024-08-05 13:03:36,445][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000002472_1265664.pth... +[2024-08-05 13:03:36,447][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000001776_909312.pth +[2024-08-05 13:03:36,542][06215] Updated weights for policy 0, policy_version 2480 (0.0007) +[2024-08-05 13:03:39,739][06215] Updated weights for policy 0, policy_version 2560 (0.0005) +[2024-08-05 13:03:41,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11468.8). Total num frames: 1327104. Throughput: 0: 11771.4. Samples: 1312024. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:03:41,442][06154] Avg episode reward: [(0, '2575.796')] +[2024-08-05 13:03:41,443][06202] Saving new best policy, reward=2575.796! +[2024-08-05 13:03:43,202][06215] Updated weights for policy 0, policy_version 2640 (0.0007) +[2024-08-05 13:03:46,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11434.7). Total num frames: 1380352. Throughput: 0: 11826.2. Samples: 1377440. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:03:46,435][06154] Avg episode reward: [(0, '2781.122')] +[2024-08-05 13:03:46,440][06202] Saving new best policy, reward=2781.122! +[2024-08-05 13:03:47,101][06215] Updated weights for policy 0, policy_version 2720 (0.0007) +[2024-08-05 13:03:50,845][06215] Updated weights for policy 0, policy_version 2800 (0.0007) +[2024-08-05 13:03:51,435][06154] Fps is (10 sec: 11059.1, 60 sec: 11605.3, 300 sec: 11436.0). Total num frames: 1437696. Throughput: 0: 11909.0. Samples: 1412492. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:03:51,435][06154] Avg episode reward: [(0, '2909.719')] +[2024-08-05 13:03:51,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000002808_1437696.pth... +[2024-08-05 13:03:51,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000002128_1089536.pth +[2024-08-05 13:03:51,442][06202] Saving new best policy, reward=2909.719! +[2024-08-05 13:03:54,433][06215] Updated weights for policy 0, policy_version 2880 (0.0007) +[2024-08-05 13:03:56,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11468.8). Total num frames: 1499136. Throughput: 0: 11758.3. Samples: 1480456. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:03:56,435][06154] Avg episode reward: [(0, '3132.648')] +[2024-08-05 13:03:56,435][06202] Saving new best policy, reward=3132.648! +[2024-08-05 13:03:57,767][06215] Updated weights for policy 0, policy_version 2960 (0.0006) +[2024-08-05 13:04:01,435][06154] Fps is (10 sec: 11059.3, 60 sec: 11673.6, 300 sec: 11408.1). Total num frames: 1548288. Throughput: 0: 11652.9. Samples: 1545056. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:04:01,442][06154] Avg episode reward: [(0, '3228.276')] +[2024-08-05 13:04:01,444][06202] Saving new best policy, reward=3228.276! +[2024-08-05 13:04:01,792][06215] Updated weights for policy 0, policy_version 3040 (0.0007) +[2024-08-05 13:04:05,243][06215] Updated weights for policy 0, policy_version 3120 (0.0006) +[2024-08-05 13:04:06,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11741.9, 300 sec: 11439.6). Total num frames: 1609728. Throughput: 0: 11590.5. Samples: 1582232. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:04:06,435][06154] Avg episode reward: [(0, '3399.442')] +[2024-08-05 13:04:06,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000003144_1609728.pth... +[2024-08-05 13:04:06,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000002472_1265664.pth +[2024-08-05 13:04:06,441][06202] Saving new best policy, reward=3399.442! +[2024-08-05 13:04:08,755][06215] Updated weights for policy 0, policy_version 3200 (0.0006) +[2024-08-05 13:04:11,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11440.6). Total num frames: 1667072. Throughput: 0: 11558.4. Samples: 1650620. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:04:11,435][06154] Avg episode reward: [(0, '3285.237')] +[2024-08-05 13:04:12,437][06215] Updated weights for policy 0, policy_version 3280 (0.0007) +[2024-08-05 13:04:16,250][06215] Updated weights for policy 0, policy_version 3360 (0.0006) +[2024-08-05 13:04:16,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11537.1, 300 sec: 11414.2). Total num frames: 1720320. Throughput: 0: 11451.1. Samples: 1715712. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:04:16,435][06154] Avg episode reward: [(0, '3267.117')] +[2024-08-05 13:04:20,059][06215] Updated weights for policy 0, policy_version 3440 (0.0006) +[2024-08-05 13:04:21,435][06154] Fps is (10 sec: 11059.1, 60 sec: 11468.8, 300 sec: 11416.0). Total num frames: 1777664. Throughput: 0: 11307.4. Samples: 1746812. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:04:21,435][06154] Avg episode reward: [(0, '3357.052')] +[2024-08-05 13:04:21,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000003472_1777664.pth... +[2024-08-05 13:04:21,444][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000002808_1437696.pth +[2024-08-05 13:04:23,405][06215] Updated weights for policy 0, policy_version 3520 (0.0006) +[2024-08-05 13:04:26,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11468.8, 300 sec: 11417.6). Total num frames: 1835008. Throughput: 0: 11259.2. Samples: 1818688. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:04:26,435][06154] Avg episode reward: [(0, '3529.643')] +[2024-08-05 13:04:26,438][06202] Saving new best policy, reward=3529.643! +[2024-08-05 13:04:26,760][06215] Updated weights for policy 0, policy_version 3600 (0.0006) +[2024-08-05 13:04:30,276][06215] Updated weights for policy 0, policy_version 3680 (0.0006) +[2024-08-05 13:04:31,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11400.5, 300 sec: 11419.2). Total num frames: 1892352. Throughput: 0: 11427.7. Samples: 1891688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:04:31,435][06154] Avg episode reward: [(0, '3699.031')] +[2024-08-05 13:04:31,436][06202] Saving new best policy, reward=3699.031! +[2024-08-05 13:04:33,890][06215] Updated weights for policy 0, policy_version 3760 (0.0007) +[2024-08-05 13:04:36,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11468.8, 300 sec: 11444.7). Total num frames: 1953792. Throughput: 0: 11382.8. Samples: 1924716. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:04:36,435][06154] Avg episode reward: [(0, '3758.671')] +[2024-08-05 13:04:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000003816_1953792.pth... +[2024-08-05 13:04:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000003144_1609728.pth +[2024-08-05 13:04:36,440][06202] Saving new best policy, reward=3758.671! +[2024-08-05 13:04:37,465][06215] Updated weights for policy 0, policy_version 3840 (0.0007) +[2024-08-05 13:04:41,179][06215] Updated weights for policy 0, policy_version 3920 (0.0008) +[2024-08-05 13:04:41,435][06154] Fps is (10 sec: 11468.9, 60 sec: 11332.3, 300 sec: 11422.0). Total num frames: 2007040. Throughput: 0: 11359.0. Samples: 1991612. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:04:41,435][06154] Avg episode reward: [(0, '3483.105')] +[2024-08-05 13:04:44,841][06215] Updated weights for policy 0, policy_version 4000 (0.0006) +[2024-08-05 13:04:46,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11400.5, 300 sec: 11423.3). Total num frames: 2064384. Throughput: 0: 11446.8. Samples: 2060164. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:04:46,435][06154] Avg episode reward: [(0, '3671.525')] +[2024-08-05 13:04:48,271][06215] Updated weights for policy 0, policy_version 4080 (0.0006) +[2024-08-05 13:04:51,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11468.8, 300 sec: 11446.7). Total num frames: 2125824. Throughput: 0: 11443.9. Samples: 2097208. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:04:51,435][06154] Avg episode reward: [(0, '3830.341')] +[2024-08-05 13:04:51,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000004152_2125824.pth... +[2024-08-05 13:04:51,443][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000003472_1777664.pth +[2024-08-05 13:04:51,443][06202] Saving new best policy, reward=3830.341! +[2024-08-05 13:04:51,671][06215] Updated weights for policy 0, policy_version 4160 (0.0006) +[2024-08-05 13:04:55,167][06215] Updated weights for policy 0, policy_version 4240 (0.0006) +[2024-08-05 13:04:56,435][06154] Fps is (10 sec: 11878.5, 60 sec: 11400.5, 300 sec: 11447.3). Total num frames: 2183168. Throughput: 0: 11474.2. Samples: 2166960. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:04:56,435][06154] Avg episode reward: [(0, '3825.770')] +[2024-08-05 13:04:58,742][06215] Updated weights for policy 0, policy_version 4320 (0.0007) +[2024-08-05 13:05:01,435][06154] Fps is (10 sec: 11468.9, 60 sec: 11537.1, 300 sec: 11447.8). Total num frames: 2240512. Throughput: 0: 11585.6. Samples: 2237064. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:05:01,435][06154] Avg episode reward: [(0, '3623.007')] +[2024-08-05 13:05:02,250][06215] Updated weights for policy 0, policy_version 4400 (0.0007) +[2024-08-05 13:05:05,713][06215] Updated weights for policy 0, policy_version 4480 (0.0007) +[2024-08-05 13:05:06,435][06154] Fps is (10 sec: 11468.7, 60 sec: 11468.8, 300 sec: 11448.3). Total num frames: 2297856. Throughput: 0: 11664.9. Samples: 2271732. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:06,435][06154] Avg episode reward: [(0, '3579.587')] +[2024-08-05 13:05:06,446][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000004496_2301952.pth... +[2024-08-05 13:05:06,449][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000003816_1953792.pth +[2024-08-05 13:05:09,062][06215] Updated weights for policy 0, policy_version 4560 (0.0007) +[2024-08-05 13:05:11,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11537.1, 300 sec: 11468.8). Total num frames: 2359296. Throughput: 0: 11625.7. Samples: 2341844. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:05:11,435][06154] Avg episode reward: [(0, '3911.147')] +[2024-08-05 13:05:11,435][06202] Saving new best policy, reward=3911.147! 
+[2024-08-05 13:05:12,638][06215] Updated weights for policy 0, policy_version 4640 (0.0007) +[2024-08-05 13:05:16,263][06215] Updated weights for policy 0, policy_version 4720 (0.0006) +[2024-08-05 13:05:16,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11605.3, 300 sec: 11468.8). Total num frames: 2416640. Throughput: 0: 11557.7. Samples: 2411784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:16,435][06154] Avg episode reward: [(0, '3893.818')] +[2024-08-05 13:05:19,939][06215] Updated weights for policy 0, policy_version 4800 (0.0007) +[2024-08-05 13:05:21,435][06154] Fps is (10 sec: 11059.0, 60 sec: 11537.1, 300 sec: 11449.8). Total num frames: 2469888. Throughput: 0: 11568.9. Samples: 2445320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:21,435][06154] Avg episode reward: [(0, '3866.070')] +[2024-08-05 13:05:21,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000004824_2469888.pth... +[2024-08-05 13:05:21,443][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000004152_2125824.pth +[2024-08-05 13:05:23,489][06215] Updated weights for policy 0, policy_version 4880 (0.0008) +[2024-08-05 13:05:26,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11468.8). Total num frames: 2531328. Throughput: 0: 11668.0. Samples: 2516672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:26,435][06154] Avg episode reward: [(0, '3940.936')] +[2024-08-05 13:05:26,435][06202] Saving new best policy, reward=3940.936! +[2024-08-05 13:05:26,959][06215] Updated weights for policy 0, policy_version 4960 (0.0006) +[2024-08-05 13:05:30,566][06215] Updated weights for policy 0, policy_version 5040 (0.0008) +[2024-08-05 13:05:31,435][06154] Fps is (10 sec: 11878.6, 60 sec: 11605.3, 300 sec: 11468.8). Total num frames: 2588672. Throughput: 0: 11651.6. Samples: 2584488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:31,435][06154] Avg episode reward: [(0, '3744.143')] +[2024-08-05 13:05:34,208][06215] Updated weights for policy 0, policy_version 5120 (0.0007) +[2024-08-05 13:05:36,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11468.8). Total num frames: 2646016. Throughput: 0: 11563.7. Samples: 2617576. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:05:36,435][06154] Avg episode reward: [(0, '3675.085')] +[2024-08-05 13:05:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000005168_2646016.pth... +[2024-08-05 13:05:36,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000004496_2301952.pth +[2024-08-05 13:05:37,469][06215] Updated weights for policy 0, policy_version 5200 (0.0006) +[2024-08-05 13:05:40,944][06215] Updated weights for policy 0, policy_version 5280 (0.0006) +[2024-08-05 13:05:41,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11486.2). Total num frames: 2707456. Throughput: 0: 11648.3. Samples: 2691136. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:05:41,435][06154] Avg episode reward: [(0, '4004.574')] +[2024-08-05 13:05:41,435][06202] Saving new best policy, reward=4004.574! 
+[2024-08-05 13:05:44,194][06215] Updated weights for policy 0, policy_version 5360 (0.0005) +[2024-08-05 13:05:46,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11741.9, 300 sec: 11502.9). Total num frames: 2768896. Throughput: 0: 11720.8. Samples: 2764500. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:05:46,435][06154] Avg episode reward: [(0, '4025.178')] +[2024-08-05 13:05:46,435][06202] Saving new best policy, reward=4025.178! +[2024-08-05 13:05:47,589][06215] Updated weights for policy 0, policy_version 5440 (0.0006) +[2024-08-05 13:05:51,002][06215] Updated weights for policy 0, policy_version 5520 (0.0006) +[2024-08-05 13:05:51,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11741.9, 300 sec: 11519.0). Total num frames: 2830336. Throughput: 0: 11778.3. Samples: 2801756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:51,435][06154] Avg episode reward: [(0, '3749.231')] +[2024-08-05 13:05:51,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000005528_2830336.pth... +[2024-08-05 13:05:51,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000004824_2469888.pth +[2024-08-05 13:05:54,366][06215] Updated weights for policy 0, policy_version 5600 (0.0005) +[2024-08-05 13:05:56,435][06154] Fps is (10 sec: 11468.6, 60 sec: 11673.5, 300 sec: 11501.6). Total num frames: 2883584. Throughput: 0: 11817.6. Samples: 2873640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:05:56,435][06154] Avg episode reward: [(0, '3599.944')] +[2024-08-05 13:05:58,248][06215] Updated weights for policy 0, policy_version 5680 (0.0007) +[2024-08-05 13:06:01,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11673.6, 300 sec: 11500.9). Total num frames: 2940928. Throughput: 0: 11667.7. Samples: 2936832. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:01,442][06154] Avg episode reward: [(0, '3525.219')] +[2024-08-05 13:06:01,910][06215] Updated weights for policy 0, policy_version 5760 (0.0007) +[2024-08-05 13:06:05,504][06215] Updated weights for policy 0, policy_version 5840 (0.0007) +[2024-08-05 13:06:06,435][06154] Fps is (10 sec: 11878.6, 60 sec: 11741.9, 300 sec: 11516.1). Total num frames: 3002368. Throughput: 0: 11736.2. Samples: 2973448. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:06:06,435][06154] Avg episode reward: [(0, '2742.366')] +[2024-08-05 13:06:06,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000005864_3002368.pth... +[2024-08-05 13:06:06,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000005168_2646016.pth +[2024-08-05 13:06:08,516][06215] Updated weights for policy 0, policy_version 5920 (0.0005) +[2024-08-05 13:06:11,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11741.9, 300 sec: 11530.6). Total num frames: 3063808. Throughput: 0: 11772.8. Samples: 3046448. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:06:11,442][06154] Avg episode reward: [(0, '2598.083')] +[2024-08-05 13:06:12,170][06215] Updated weights for policy 0, policy_version 6000 (0.0007) +[2024-08-05 13:06:15,335][06215] Updated weights for policy 0, policy_version 6080 (0.0006) +[2024-08-05 13:06:16,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11529.5). Total num frames: 3121152. 
Throughput: 0: 11862.4. Samples: 3118296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:16,435][06154] Avg episode reward: [(0, '3108.148')] +[2024-08-05 13:06:19,045][06215] Updated weights for policy 0, policy_version 6160 (0.0007) +[2024-08-05 13:06:21,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11878.4, 300 sec: 11543.3). Total num frames: 3182592. Throughput: 0: 11870.0. Samples: 3151728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:21,435][06154] Avg episode reward: [(0, '3745.370')] +[2024-08-05 13:06:21,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000006216_3182592.pth... +[2024-08-05 13:06:21,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000005528_2830336.pth +[2024-08-05 13:06:22,351][06215] Updated weights for policy 0, policy_version 6240 (0.0006) +[2024-08-05 13:06:25,841][06215] Updated weights for policy 0, policy_version 6320 (0.0006) +[2024-08-05 13:06:26,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11542.0). Total num frames: 3239936. Throughput: 0: 11843.1. Samples: 3224076. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:06:26,435][06154] Avg episode reward: [(0, '4010.902')] +[2024-08-05 13:06:29,227][06215] Updated weights for policy 0, policy_version 6400 (0.0006) +[2024-08-05 13:06:31,435][06154] Fps is (10 sec: 11878.5, 60 sec: 11878.4, 300 sec: 11555.0). Total num frames: 3301376. Throughput: 0: 11839.0. Samples: 3297256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:31,435][06154] Avg episode reward: [(0, '4008.126')] +[2024-08-05 13:06:32,697][06215] Updated weights for policy 0, policy_version 6480 (0.0006) +[2024-08-05 13:06:36,255][06215] Updated weights for policy 0, policy_version 6560 (0.0007) +[2024-08-05 13:06:36,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 11553.6). Total num frames: 3358720. Throughput: 0: 11817.7. Samples: 3333552. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:06:36,435][06154] Avg episode reward: [(0, '3851.166')] +[2024-08-05 13:06:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000006560_3358720.pth... +[2024-08-05 13:06:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000005864_3002368.pth +[2024-08-05 13:06:39,503][06215] Updated weights for policy 0, policy_version 6640 (0.0006) +[2024-08-05 13:06:41,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 11566.0). Total num frames: 3420160. Throughput: 0: 11782.3. Samples: 3403840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:41,435][06154] Avg episode reward: [(0, '3922.368')] +[2024-08-05 13:06:43,051][06215] Updated weights for policy 0, policy_version 6720 (0.0007) +[2024-08-05 13:06:46,352][06215] Updated weights for policy 0, policy_version 6800 (0.0006) +[2024-08-05 13:06:46,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11878.4, 300 sec: 11579.9). Total num frames: 3481600. Throughput: 0: 11950.1. Samples: 3474588. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:46,435][06154] Avg episode reward: [(0, '3709.770')] +[2024-08-05 13:06:49,808][06215] Updated weights for policy 0, policy_version 6880 (0.0006) +[2024-08-05 13:06:51,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11579.9). Total num frames: 3538944. Throughput: 0: 11972.7. Samples: 3512220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:06:51,435][06154] Avg episode reward: [(0, '3637.569')] +[2024-08-05 13:06:51,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000006912_3538944.pth... +[2024-08-05 13:06:51,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000006216_3182592.pth +[2024-08-05 13:06:53,703][06215] Updated weights for policy 0, policy_version 6960 (0.0009) +[2024-08-05 13:06:56,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11810.2, 300 sec: 11552.1). Total num frames: 3592192. Throughput: 0: 11803.1. Samples: 3577588. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:06:56,435][06154] Avg episode reward: [(0, '3800.917')] +[2024-08-05 13:06:57,132][06215] Updated weights for policy 0, policy_version 7040 (0.0006) +[2024-08-05 13:07:00,208][06215] Updated weights for policy 0, policy_version 7120 (0.0005) +[2024-08-05 13:07:01,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11946.7, 300 sec: 11607.6). Total num frames: 3657728. Throughput: 0: 11886.9. Samples: 3653208. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:01,435][06154] Avg episode reward: [(0, '3965.815')] +[2024-08-05 13:07:03,774][06215] Updated weights for policy 0, policy_version 7200 (0.0008) +[2024-08-05 13:07:06,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11878.4, 300 sec: 11607.6). Total num frames: 3715072. Throughput: 0: 11901.7. Samples: 3687304. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:07:06,435][06154] Avg episode reward: [(0, '3676.689')] +[2024-08-05 13:07:06,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000007256_3715072.pth... +[2024-08-05 13:07:06,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000006560_3358720.pth +[2024-08-05 13:07:07,256][06215] Updated weights for policy 0, policy_version 7280 (0.0006) +[2024-08-05 13:07:10,510][06215] Updated weights for policy 0, policy_version 7360 (0.0006) +[2024-08-05 13:07:11,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 11635.4). Total num frames: 3776512. Throughput: 0: 11930.3. Samples: 3760940. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:11,435][06154] Avg episode reward: [(0, '3695.436')] +[2024-08-05 13:07:14,006][06215] Updated weights for policy 0, policy_version 7440 (0.0007) +[2024-08-05 13:07:16,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 11649.3). Total num frames: 3833856. Throughput: 0: 11833.4. Samples: 3829760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:16,435][06154] Avg episode reward: [(0, '3818.526')] +[2024-08-05 13:07:17,671][06215] Updated weights for policy 0, policy_version 7520 (0.0007) +[2024-08-05 13:07:21,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11741.9, 300 sec: 11635.4). Total num frames: 3887104. Throughput: 0: 11756.4. Samples: 3862592. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:07:21,435][06154] Avg episode reward: [(0, '3610.855')] +[2024-08-05 13:07:21,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000007592_3887104.pth... +[2024-08-05 13:07:21,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000006912_3538944.pth +[2024-08-05 13:07:21,593][06215] Updated weights for policy 0, policy_version 7600 (0.0007) +[2024-08-05 13:07:25,391][06215] Updated weights for policy 0, policy_version 7680 (0.0008) +[2024-08-05 13:07:26,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11741.9, 300 sec: 11663.2). Total num frames: 3944448. Throughput: 0: 11595.5. Samples: 3925636. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:07:26,435][06154] Avg episode reward: [(0, '3475.003')] +[2024-08-05 13:07:28,777][06215] Updated weights for policy 0, policy_version 7760 (0.0007) +[2024-08-05 13:07:31,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11663.2). Total num frames: 4001792. Throughput: 0: 11586.8. Samples: 3995992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:31,442][06154] Avg episode reward: [(0, '2931.247')] +[2024-08-05 13:07:32,351][06215] Updated weights for policy 0, policy_version 7840 (0.0007) +[2024-08-05 13:07:36,236][06215] Updated weights for policy 0, policy_version 7920 (0.0008) +[2024-08-05 13:07:36,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11605.3, 300 sec: 11635.4). Total num frames: 4055040. Throughput: 0: 11479.3. Samples: 4028788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:36,435][06154] Avg episode reward: [(0, '3136.506')] +[2024-08-05 13:07:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000007920_4055040.pth... +[2024-08-05 13:07:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000007256_3715072.pth +[2024-08-05 13:07:39,585][06215] Updated weights for policy 0, policy_version 8000 (0.0006) +[2024-08-05 13:07:41,435][06154] Fps is (10 sec: 11468.7, 60 sec: 11605.3, 300 sec: 11649.3). Total num frames: 4116480. Throughput: 0: 11601.0. Samples: 4099632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:41,435][06154] Avg episode reward: [(0, '3477.690')] +[2024-08-05 13:07:43,436][06215] Updated weights for policy 0, policy_version 8080 (0.0008) +[2024-08-05 13:07:46,434][06154] Fps is (10 sec: 11468.9, 60 sec: 11468.8, 300 sec: 11621.5). Total num frames: 4169728. Throughput: 0: 11353.8. Samples: 4164128. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:07:46,435][06154] Avg episode reward: [(0, '3756.718')] +[2024-08-05 13:07:47,043][06215] Updated weights for policy 0, policy_version 8160 (0.0006) +[2024-08-05 13:07:51,104][06215] Updated weights for policy 0, policy_version 8240 (0.0007) +[2024-08-05 13:07:51,435][06154] Fps is (10 sec: 10649.7, 60 sec: 11400.5, 300 sec: 11607.6). Total num frames: 4222976. Throughput: 0: 11228.3. Samples: 4192576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:51,435][06154] Avg episode reward: [(0, '3731.638')] +[2024-08-05 13:07:51,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000008248_4222976.pth... 
+[2024-08-05 13:07:51,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000007592_3887104.pth +[2024-08-05 13:07:54,437][06215] Updated weights for policy 0, policy_version 8320 (0.0006) +[2024-08-05 13:07:56,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11649.3). Total num frames: 4284416. Throughput: 0: 11199.3. Samples: 4264908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:07:56,442][06154] Avg episode reward: [(0, '3845.618')] +[2024-08-05 13:07:57,713][06215] Updated weights for policy 0, policy_version 8400 (0.0007) +[2024-08-05 13:08:01,060][06215] Updated weights for policy 0, policy_version 8480 (0.0006) +[2024-08-05 13:08:01,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11468.8, 300 sec: 11663.2). Total num frames: 4345856. Throughput: 0: 11298.9. Samples: 4338212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:01,435][06154] Avg episode reward: [(0, '3770.491')] +[2024-08-05 13:08:04,507][06215] Updated weights for policy 0, policy_version 8560 (0.0007) +[2024-08-05 13:08:06,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11468.8, 300 sec: 11649.3). Total num frames: 4403200. Throughput: 0: 11384.8. Samples: 4374908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:06,435][06154] Avg episode reward: [(0, '3694.764')] +[2024-08-05 13:08:06,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000008600_4403200.pth... +[2024-08-05 13:08:06,443][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000007920_4055040.pth +[2024-08-05 13:08:08,177][06215] Updated weights for policy 0, policy_version 8640 (0.0008) +[2024-08-05 13:08:11,435][06154] Fps is (10 sec: 10649.7, 60 sec: 11264.0, 300 sec: 11607.6). Total num frames: 4452352. Throughput: 0: 11347.2. Samples: 4436260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:11,435][06154] Avg episode reward: [(0, '3839.881')] +[2024-08-05 13:08:12,258][06215] Updated weights for policy 0, policy_version 8720 (0.0007) +[2024-08-05 13:08:16,026][06215] Updated weights for policy 0, policy_version 8800 (0.0008) +[2024-08-05 13:08:16,434][06154] Fps is (10 sec: 10649.7, 60 sec: 11264.0, 300 sec: 11593.8). Total num frames: 4509696. Throughput: 0: 11286.4. Samples: 4503880. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:08:16,435][06154] Avg episode reward: [(0, '3920.153')] +[2024-08-05 13:08:19,142][06215] Updated weights for policy 0, policy_version 8880 (0.0006) +[2024-08-05 13:08:21,435][06154] Fps is (10 sec: 11468.7, 60 sec: 11332.3, 300 sec: 11593.8). Total num frames: 4567040. Throughput: 0: 11432.8. Samples: 4543264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:21,435][06154] Avg episode reward: [(0, '3871.239')] +[2024-08-05 13:08:21,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000008920_4567040.pth... +[2024-08-05 13:08:21,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000008248_4222976.pth +[2024-08-05 13:08:22,879][06215] Updated weights for policy 0, policy_version 8960 (0.0007) +[2024-08-05 13:08:26,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11332.3, 300 sec: 11579.9). Total num frames: 4624384. 
Throughput: 0: 11306.9. Samples: 4608440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:26,442][06154] Avg episode reward: [(0, '3698.552')] +[2024-08-05 13:08:26,560][06215] Updated weights for policy 0, policy_version 9040 (0.0007) +[2024-08-05 13:08:29,921][06215] Updated weights for policy 0, policy_version 9120 (0.0006) +[2024-08-05 13:08:31,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11400.5, 300 sec: 11593.8). Total num frames: 4685824. Throughput: 0: 11497.3. Samples: 4681508. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:31,435][06154] Avg episode reward: [(0, '3858.033')] +[2024-08-05 13:08:33,452][06215] Updated weights for policy 0, policy_version 9200 (0.0007) +[2024-08-05 13:08:36,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11332.3, 300 sec: 11552.1). Total num frames: 4734976. Throughput: 0: 11589.6. Samples: 4714108. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:36,435][06154] Avg episode reward: [(0, '4017.194')] +[2024-08-05 13:08:36,459][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000009256_4739072.pth... +[2024-08-05 13:08:36,462][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000008600_4403200.pth +[2024-08-05 13:08:37,372][06215] Updated weights for policy 0, policy_version 9280 (0.0007) +[2024-08-05 13:08:40,233][06215] Updated weights for policy 0, policy_version 9360 (0.0005) +[2024-08-05 13:08:41,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11468.8, 300 sec: 11607.6). Total num frames: 4804608. Throughput: 0: 11627.6. Samples: 4788148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:08:41,435][06154] Avg episode reward: [(0, '3745.869')] +[2024-08-05 13:08:43,883][06215] Updated weights for policy 0, policy_version 9440 (0.0007) +[2024-08-05 13:08:46,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11400.5, 300 sec: 11579.9). Total num frames: 4853760. Throughput: 0: 11404.5. Samples: 4851416. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:08:46,435][06154] Avg episode reward: [(0, '3835.553')] +[2024-08-05 13:08:47,821][06215] Updated weights for policy 0, policy_version 9520 (0.0007) +[2024-08-05 13:08:51,226][06215] Updated weights for policy 0, policy_version 9600 (0.0006) +[2024-08-05 13:08:51,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11537.1, 300 sec: 11579.9). Total num frames: 4915200. Throughput: 0: 11408.9. Samples: 4888308. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:08:51,435][06154] Avg episode reward: [(0, '3885.663')] +[2024-08-05 13:08:51,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000009600_4915200.pth... +[2024-08-05 13:08:51,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000008920_4567040.pth +[2024-08-05 13:08:54,465][06215] Updated weights for policy 0, policy_version 9680 (0.0006) +[2024-08-05 13:08:56,435][06154] Fps is (10 sec: 12697.6, 60 sec: 11605.3, 300 sec: 11635.4). Total num frames: 4980736. Throughput: 0: 11648.5. Samples: 4960444. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:08:56,435][06154] Avg episode reward: [(0, '3966.601')] +[2024-08-05 13:08:57,822][06215] Updated weights for policy 0, policy_version 9760 (0.0006) +[2024-08-05 13:09:01,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11468.8, 300 sec: 11607.6). Total num frames: 5033984. Throughput: 0: 11681.9. Samples: 5029564. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:09:01,435][06154] Avg episode reward: [(0, '4140.144')] +[2024-08-05 13:09:01,435][06202] Saving new best policy, reward=4140.144! +[2024-08-05 13:09:01,701][06215] Updated weights for policy 0, policy_version 9840 (0.0007) +[2024-08-05 13:09:05,158][06215] Updated weights for policy 0, policy_version 9920 (0.0007) +[2024-08-05 13:09:06,435][06154] Fps is (10 sec: 11059.1, 60 sec: 11468.8, 300 sec: 11607.6). Total num frames: 5091328. Throughput: 0: 11563.7. Samples: 5063632. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:09:06,435][06154] Avg episode reward: [(0, '4045.717')] +[2024-08-05 13:09:06,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000009944_5091328.pth... +[2024-08-05 13:09:06,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000009256_4739072.pth +[2024-08-05 13:09:08,519][06215] Updated weights for policy 0, policy_version 10000 (0.0007) +[2024-08-05 13:09:11,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11621.5). Total num frames: 5148672. Throughput: 0: 11706.9. Samples: 5135252. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:09:11,435][06154] Avg episode reward: [(0, '4180.048')] +[2024-08-05 13:09:11,435][06202] Saving new best policy, reward=4180.048! +[2024-08-05 13:09:12,394][06215] Updated weights for policy 0, policy_version 10080 (0.0007) +[2024-08-05 13:09:15,908][06215] Updated weights for policy 0, policy_version 10160 (0.0006) +[2024-08-05 13:09:16,435][06154] Fps is (10 sec: 11468.9, 60 sec: 11605.3, 300 sec: 11621.5). Total num frames: 5206016. Throughput: 0: 11558.0. Samples: 5201616. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:09:16,435][06154] Avg episode reward: [(0, '4009.488')] +[2024-08-05 13:09:19,379][06215] Updated weights for policy 0, policy_version 10240 (0.0007) +[2024-08-05 13:09:21,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11673.6, 300 sec: 11635.4). Total num frames: 5267456. Throughput: 0: 11623.7. Samples: 5237176. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:09:21,435][06154] Avg episode reward: [(0, '4038.241')] +[2024-08-05 13:09:21,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000010288_5267456.pth... +[2024-08-05 13:09:21,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000009600_4915200.pth +[2024-08-05 13:09:22,936][06215] Updated weights for policy 0, policy_version 10320 (0.0006) +[2024-08-05 13:09:26,341][06215] Updated weights for policy 0, policy_version 10400 (0.0006) +[2024-08-05 13:09:26,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11635.4). Total num frames: 5324800. Throughput: 0: 11562.4. Samples: 5308456. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:26,435][06154] Avg episode reward: [(0, '4047.272')] +[2024-08-05 13:09:29,897][06215] Updated weights for policy 0, policy_version 10480 (0.0007) +[2024-08-05 13:09:31,435][06154] Fps is (10 sec: 11468.4, 60 sec: 11605.3, 300 sec: 11621.5). Total num frames: 5382144. Throughput: 0: 11704.2. Samples: 5378112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:31,436][06154] Avg episode reward: [(0, '4042.618')] +[2024-08-05 13:09:33,137][06215] Updated weights for policy 0, policy_version 10560 (0.0006) +[2024-08-05 13:09:36,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11673.6, 300 sec: 11621.5). Total num frames: 5435392. Throughput: 0: 11698.2. Samples: 5414728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:36,442][06154] Avg episode reward: [(0, '3791.930')] +[2024-08-05 13:09:36,445][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000010616_5435392.pth... +[2024-08-05 13:09:36,448][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000009944_5091328.pth +[2024-08-05 13:09:37,178][06215] Updated weights for policy 0, policy_version 10640 (0.0007) +[2024-08-05 13:09:40,536][06215] Updated weights for policy 0, policy_version 10720 (0.0006) +[2024-08-05 13:09:41,435][06154] Fps is (10 sec: 11469.2, 60 sec: 11537.1, 300 sec: 11635.4). Total num frames: 5496832. Throughput: 0: 11556.9. Samples: 5480504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:41,435][06154] Avg episode reward: [(0, '3812.640')] +[2024-08-05 13:09:44,097][06215] Updated weights for policy 0, policy_version 10800 (0.0008) +[2024-08-05 13:09:46,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11607.6). Total num frames: 5550080. Throughput: 0: 11477.4. Samples: 5546048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:46,435][06154] Avg episode reward: [(0, '3961.465')] +[2024-08-05 13:09:47,786][06215] Updated weights for policy 0, policy_version 10880 (0.0007) +[2024-08-05 13:09:51,215][06215] Updated weights for policy 0, policy_version 10960 (0.0006) +[2024-08-05 13:09:51,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11621.5). Total num frames: 5611520. Throughput: 0: 11553.8. Samples: 5583552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:51,435][06154] Avg episode reward: [(0, '3785.886')] +[2024-08-05 13:09:51,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000010960_5611520.pth... +[2024-08-05 13:09:51,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000010288_5267456.pth +[2024-08-05 13:09:54,374][06215] Updated weights for policy 0, policy_version 11040 (0.0005) +[2024-08-05 13:09:56,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11537.1, 300 sec: 11635.4). Total num frames: 5672960. Throughput: 0: 11605.1. Samples: 5657484. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:09:56,435][06154] Avg episode reward: [(0, '4053.137')] +[2024-08-05 13:09:58,078][06215] Updated weights for policy 0, policy_version 11120 (0.0006) +[2024-08-05 13:10:01,183][06215] Updated weights for policy 0, policy_version 11200 (0.0005) +[2024-08-05 13:10:01,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11673.6, 300 sec: 11649.3). Total num frames: 5734400. Throughput: 0: 11744.9. Samples: 5730136. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:10:01,435][06154] Avg episode reward: [(0, '4022.865')] +[2024-08-05 13:10:04,549][06215] Updated weights for policy 0, policy_version 11280 (0.0006) +[2024-08-05 13:10:06,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11741.9, 300 sec: 11649.3). Total num frames: 5795840. Throughput: 0: 11795.8. Samples: 5767988. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:10:06,435][06154] Avg episode reward: [(0, '4131.027')] +[2024-08-05 13:10:06,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000011320_5795840.pth... +[2024-08-05 13:10:06,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000010616_5435392.pth +[2024-08-05 13:10:07,845][06215] Updated weights for policy 0, policy_version 11360 (0.0006) +[2024-08-05 13:10:11,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11635.4). Total num frames: 5849088. Throughput: 0: 11712.5. Samples: 5835520. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:10:11,435][06154] Avg episode reward: [(0, '3957.954')] +[2024-08-05 13:10:12,019][06215] Updated weights for policy 0, policy_version 11440 (0.0008) +[2024-08-05 13:10:15,744][06215] Updated weights for policy 0, policy_version 11520 (0.0007) +[2024-08-05 13:10:16,435][06154] Fps is (10 sec: 10649.6, 60 sec: 11605.3, 300 sec: 11635.4). Total num frames: 5902336. Throughput: 0: 11587.0. Samples: 5899520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:16,435][06154] Avg episode reward: [(0, '3939.575')] +[2024-08-05 13:10:19,163][06215] Updated weights for policy 0, policy_version 11600 (0.0006) +[2024-08-05 13:10:21,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11635.4). Total num frames: 5963776. Throughput: 0: 11612.8. Samples: 5937304. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:21,435][06154] Avg episode reward: [(0, '4068.646')] +[2024-08-05 13:10:21,464][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000011656_5967872.pth... +[2024-08-05 13:10:21,468][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000010960_5611520.pth +[2024-08-05 13:10:22,431][06215] Updated weights for policy 0, policy_version 11680 (0.0006) +[2024-08-05 13:10:25,519][06215] Updated weights for policy 0, policy_version 11760 (0.0006) +[2024-08-05 13:10:26,435][06154] Fps is (10 sec: 12697.6, 60 sec: 11741.9, 300 sec: 11663.2). Total num frames: 6029312. Throughput: 0: 11831.3. Samples: 6012912. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:26,435][06154] Avg episode reward: [(0, '3995.162')] +[2024-08-05 13:10:29,345][06215] Updated weights for policy 0, policy_version 11840 (0.0008) +[2024-08-05 13:10:31,435][06154] Fps is (10 sec: 12288.1, 60 sec: 11742.0, 300 sec: 11663.2). Total num frames: 6086656. Throughput: 0: 11879.1. Samples: 6080608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:31,435][06154] Avg episode reward: [(0, '4171.314')] +[2024-08-05 13:10:32,883][06215] Updated weights for policy 0, policy_version 11920 (0.0006) +[2024-08-05 13:10:35,841][06215] Updated weights for policy 0, policy_version 12000 (0.0005) +[2024-08-05 13:10:36,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11946.7, 300 sec: 11677.1). Total num frames: 6152192. Throughput: 0: 11859.7. Samples: 6117236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:36,435][06154] Avg episode reward: [(0, '4244.814')] +[2024-08-05 13:10:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000012016_6152192.pth... +[2024-08-05 13:10:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000011320_5795840.pth +[2024-08-05 13:10:36,441][06202] Saving new best policy, reward=4244.814! +[2024-08-05 13:10:39,241][06215] Updated weights for policy 0, policy_version 12080 (0.0006) +[2024-08-05 13:10:41,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11878.4, 300 sec: 11663.2). Total num frames: 6209536. Throughput: 0: 11827.3. Samples: 6189712. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:41,435][06154] Avg episode reward: [(0, '4125.297')] +[2024-08-05 13:10:42,614][06215] Updated weights for policy 0, policy_version 12160 (0.0006) +[2024-08-05 13:10:46,018][06215] Updated weights for policy 0, policy_version 12240 (0.0007) +[2024-08-05 13:10:46,435][06154] Fps is (10 sec: 11878.0, 60 sec: 12014.9, 300 sec: 11663.2). Total num frames: 6270976. Throughput: 0: 11875.3. Samples: 6264528. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:10:46,436][06154] Avg episode reward: [(0, '4295.242')] +[2024-08-05 13:10:46,437][06202] Saving new best policy, reward=4295.242! +[2024-08-05 13:10:49,805][06215] Updated weights for policy 0, policy_version 12320 (0.0009) +[2024-08-05 13:10:51,500][06154] Fps is (10 sec: 11395.2, 60 sec: 11865.6, 300 sec: 11660.6). Total num frames: 6324224. Throughput: 0: 11756.0. Samples: 6297768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:51,500][06154] Avg episode reward: [(0, '4258.303')] +[2024-08-05 13:10:51,504][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000012352_6324224.pth... +[2024-08-05 13:10:51,507][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000011656_5967872.pth +[2024-08-05 13:10:53,458][06215] Updated weights for policy 0, policy_version 12400 (0.0007) +[2024-08-05 13:10:56,435][06154] Fps is (10 sec: 11059.6, 60 sec: 11810.1, 300 sec: 11663.2). Total num frames: 6381568. Throughput: 0: 11771.2. Samples: 6365224. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:10:56,435][06154] Avg episode reward: [(0, '4193.152')] +[2024-08-05 13:10:57,409][06215] Updated weights for policy 0, policy_version 12480 (0.0008) +[2024-08-05 13:11:00,983][06215] Updated weights for policy 0, policy_version 12560 (0.0007) +[2024-08-05 13:11:01,435][06154] Fps is (10 sec: 11131.1, 60 sec: 11673.6, 300 sec: 11635.4). Total num frames: 6434816. Throughput: 0: 11771.6. Samples: 6429244. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:01,435][06154] Avg episode reward: [(0, '4043.217')] +[2024-08-05 13:11:04,642][06215] Updated weights for policy 0, policy_version 12640 (0.0007) +[2024-08-05 13:11:06,435][06154] Fps is (10 sec: 11059.1, 60 sec: 11605.3, 300 sec: 11621.5). Total num frames: 6492160. Throughput: 0: 11694.2. Samples: 6463544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:06,435][06154] Avg episode reward: [(0, '4022.507')] +[2024-08-05 13:11:06,440][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000012680_6492160.pth... +[2024-08-05 13:11:06,444][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000012016_6152192.pth +[2024-08-05 13:11:08,449][06215] Updated weights for policy 0, policy_version 12720 (0.0008) +[2024-08-05 13:11:11,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11605.3, 300 sec: 11607.6). Total num frames: 6545408. Throughput: 0: 11525.6. Samples: 6531564. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:11:11,442][06154] Avg episode reward: [(0, '4136.288')] +[2024-08-05 13:11:11,711][06215] Updated weights for policy 0, policy_version 12800 (0.0006) +[2024-08-05 13:11:15,311][06215] Updated weights for policy 0, policy_version 12880 (0.0006) +[2024-08-05 13:11:16,435][06154] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11607.7). Total num frames: 6606848. Throughput: 0: 11601.9. Samples: 6602692. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:11:16,435][06154] Avg episode reward: [(0, '3994.911')] +[2024-08-05 13:11:18,665][06215] Updated weights for policy 0, policy_version 12960 (0.0007) +[2024-08-05 13:11:21,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11741.9, 300 sec: 11621.5). Total num frames: 6668288. Throughput: 0: 11571.6. Samples: 6637956. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:21,442][06154] Avg episode reward: [(0, '3796.950')] +[2024-08-05 13:11:21,445][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000013024_6668288.pth... +[2024-08-05 13:11:21,448][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000012352_6324224.pth +[2024-08-05 13:11:22,165][06215] Updated weights for policy 0, policy_version 13040 (0.0006) +[2024-08-05 13:11:25,671][06215] Updated weights for policy 0, policy_version 13120 (0.0006) +[2024-08-05 13:11:26,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11605.3, 300 sec: 11607.6). Total num frames: 6725632. Throughput: 0: 11477.9. Samples: 6706216. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:26,435][06154] Avg episode reward: [(0, '3984.793')] +[2024-08-05 13:11:29,097][06215] Updated weights for policy 0, policy_version 13200 (0.0007) +[2024-08-05 13:11:31,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11621.5). Total num frames: 6787072. Throughput: 0: 11521.2. Samples: 6782976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:31,435][06154] Avg episode reward: [(0, '4183.955')] +[2024-08-05 13:11:32,500][06215] Updated weights for policy 0, policy_version 13280 (0.0006) +[2024-08-05 13:11:35,855][06215] Updated weights for policy 0, policy_version 13360 (0.0006) +[2024-08-05 13:11:36,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11537.1, 300 sec: 11607.6). Total num frames: 6844416. Throughput: 0: 11567.4. Samples: 6817556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:36,435][06154] Avg episode reward: [(0, '4019.354')] +[2024-08-05 13:11:36,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000013368_6844416.pth... +[2024-08-05 13:11:36,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000012680_6492160.pth +[2024-08-05 13:11:39,352][06215] Updated weights for policy 0, policy_version 13440 (0.0006) +[2024-08-05 13:11:41,435][06154] Fps is (10 sec: 11878.2, 60 sec: 11605.3, 300 sec: 11607.6). Total num frames: 6905856. Throughput: 0: 11607.5. Samples: 6887564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:41,435][06154] Avg episode reward: [(0, '4117.161')] +[2024-08-05 13:11:42,782][06215] Updated weights for policy 0, policy_version 13520 (0.0006) +[2024-08-05 13:11:46,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11468.9, 300 sec: 11593.8). Total num frames: 6959104. Throughput: 0: 11723.2. Samples: 6956788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:46,435][06154] Avg episode reward: [(0, '4271.372')] +[2024-08-05 13:11:46,439][06215] Updated weights for policy 0, policy_version 13600 (0.0007) +[2024-08-05 13:11:49,876][06215] Updated weights for policy 0, policy_version 13680 (0.0006) +[2024-08-05 13:11:51,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11617.8, 300 sec: 11621.5). Total num frames: 7020544. Throughput: 0: 11739.6. Samples: 6991824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:51,435][06154] Avg episode reward: [(0, '4134.239')] +[2024-08-05 13:11:51,440][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000013712_7020544.pth... +[2024-08-05 13:11:51,444][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000013024_6668288.pth +[2024-08-05 13:11:53,714][06215] Updated weights for policy 0, policy_version 13760 (0.0007) +[2024-08-05 13:11:56,434][06154] Fps is (10 sec: 11059.3, 60 sec: 11468.8, 300 sec: 11566.0). Total num frames: 7069696. Throughput: 0: 11600.4. Samples: 7053580. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:11:56,435][06154] Avg episode reward: [(0, '4110.446')] +[2024-08-05 13:11:57,611][06215] Updated weights for policy 0, policy_version 13840 (0.0007) +[2024-08-05 13:12:01,169][06215] Updated weights for policy 0, policy_version 13920 (0.0007) +[2024-08-05 13:12:01,435][06154] Fps is (10 sec: 10649.7, 60 sec: 11537.1, 300 sec: 11566.0). Total num frames: 7127040. Throughput: 0: 11562.5. Samples: 7123004. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:12:01,435][06154] Avg episode reward: [(0, '4189.099')] +[2024-08-05 13:12:04,706][06215] Updated weights for policy 0, policy_version 14000 (0.0007) +[2024-08-05 13:12:06,435][06154] Fps is (10 sec: 11468.6, 60 sec: 11537.1, 300 sec: 11552.1). Total num frames: 7184384. Throughput: 0: 11573.4. Samples: 7158760. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:12:06,435][06154] Avg episode reward: [(0, '4148.257')] +[2024-08-05 13:12:06,495][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000014040_7188480.pth... +[2024-08-05 13:12:06,500][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000013368_6844416.pth +[2024-08-05 13:12:08,153][06215] Updated weights for policy 0, policy_version 14080 (0.0007) +[2024-08-05 13:12:11,284][06215] Updated weights for policy 0, policy_version 14160 (0.0006) +[2024-08-05 13:12:11,435][06154] Fps is (10 sec: 12288.1, 60 sec: 11741.9, 300 sec: 11579.9). Total num frames: 7249920. Throughput: 0: 11684.8. Samples: 7232032. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:12:11,435][06154] Avg episode reward: [(0, '3892.216')] +[2024-08-05 13:12:15,217][06215] Updated weights for policy 0, policy_version 14240 (0.0007) +[2024-08-05 13:12:16,435][06154] Fps is (10 sec: 11878.5, 60 sec: 11605.3, 300 sec: 11579.9). Total num frames: 7303168. Throughput: 0: 11468.8. Samples: 7299072. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:12:16,435][06154] Avg episode reward: [(0, '3638.608')] +[2024-08-05 13:12:18,345][06215] Updated weights for policy 0, policy_version 14320 (0.0006) +[2024-08-05 13:12:21,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11673.6, 300 sec: 11607.6). Total num frames: 7368704. Throughput: 0: 11541.2. Samples: 7336908. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:12:21,442][06154] Avg episode reward: [(0, '3922.995')] +[2024-08-05 13:12:21,446][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000014392_7368704.pth... +[2024-08-05 13:12:21,450][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000013712_7020544.pth +[2024-08-05 13:12:21,764][06215] Updated weights for policy 0, policy_version 14400 (0.0006) +[2024-08-05 13:12:25,657][06215] Updated weights for policy 0, policy_version 14480 (0.0007) +[2024-08-05 13:12:26,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11605.3, 300 sec: 11593.8). Total num frames: 7421952. Throughput: 0: 11513.0. Samples: 7405648. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:12:26,435][06154] Avg episode reward: [(0, '3900.864')] +[2024-08-05 13:12:29,482][06215] Updated weights for policy 0, policy_version 14560 (0.0006) +[2024-08-05 13:12:31,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11537.1, 300 sec: 11607.6). Total num frames: 7479296. Throughput: 0: 11467.8. Samples: 7472840. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:12:31,435][06154] Avg episode reward: [(0, '3970.947')] +[2024-08-05 13:12:32,919][06215] Updated weights for policy 0, policy_version 14640 (0.0007) +[2024-08-05 13:12:36,283][06215] Updated weights for policy 0, policy_version 14720 (0.0007) +[2024-08-05 13:12:36,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11593.8). Total num frames: 7536640. Throughput: 0: 11488.1. Samples: 7508788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:12:36,435][06154] Avg episode reward: [(0, '3958.918')] +[2024-08-05 13:12:36,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000014720_7536640.pth... +[2024-08-05 13:12:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000014040_7188480.pth +[2024-08-05 13:12:39,583][06215] Updated weights for policy 0, policy_version 14800 (0.0006) +[2024-08-05 13:12:41,435][06154] Fps is (10 sec: 11878.0, 60 sec: 11537.0, 300 sec: 11621.5). Total num frames: 7598080. Throughput: 0: 11724.2. Samples: 7581172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:12:41,443][06154] Avg episode reward: [(0, '3868.940')] +[2024-08-05 13:12:43,362][06215] Updated weights for policy 0, policy_version 14880 (0.0007) +[2024-08-05 13:12:46,435][06154] Fps is (10 sec: 11058.8, 60 sec: 11468.7, 300 sec: 11607.6). Total num frames: 7647232. Throughput: 0: 11565.2. Samples: 7643440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:12:46,435][06154] Avg episode reward: [(0, '4075.410')] +[2024-08-05 13:12:47,155][06215] Updated weights for policy 0, policy_version 14960 (0.0007) +[2024-08-05 13:12:50,696][06215] Updated weights for policy 0, policy_version 15040 (0.0006) +[2024-08-05 13:12:51,435][06154] Fps is (10 sec: 11059.6, 60 sec: 11468.8, 300 sec: 11607.6). Total num frames: 7708672. Throughput: 0: 11496.9. Samples: 7676120. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:12:51,435][06154] Avg episode reward: [(0, '4046.790')] +[2024-08-05 13:12:51,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000015056_7708672.pth... +[2024-08-05 13:12:51,442][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000014392_7368704.pth +[2024-08-05 13:12:54,488][06215] Updated weights for policy 0, policy_version 15120 (0.0007) +[2024-08-05 13:12:56,435][06154] Fps is (10 sec: 11469.1, 60 sec: 11537.1, 300 sec: 11579.9). Total num frames: 7761920. Throughput: 0: 11410.4. Samples: 7745500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:12:56,442][06154] Avg episode reward: [(0, '4213.771')] +[2024-08-05 13:12:58,462][06215] Updated weights for policy 0, policy_version 15200 (0.0007) +[2024-08-05 13:13:01,435][06154] Fps is (10 sec: 9830.1, 60 sec: 11332.2, 300 sec: 11538.2). Total num frames: 7806976. Throughput: 0: 11259.5. Samples: 7805752. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:01,443][06154] Avg episode reward: [(0, '4214.612')] +[2024-08-05 13:13:02,341][06215] Updated weights for policy 0, policy_version 15280 (0.0007) +[2024-08-05 13:13:06,241][06215] Updated weights for policy 0, policy_version 15360 (0.0007) +[2024-08-05 13:13:06,435][06154] Fps is (10 sec: 10240.0, 60 sec: 11332.3, 300 sec: 11566.0). Total num frames: 7864320. Throughput: 0: 11134.1. Samples: 7837944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:06,435][06154] Avg episode reward: [(0, '4160.871')] +[2024-08-05 13:13:06,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000015360_7864320.pth... +[2024-08-05 13:13:06,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000014720_7536640.pth +[2024-08-05 13:13:09,661][06215] Updated weights for policy 0, policy_version 15440 (0.0006) +[2024-08-05 13:13:11,435][06154] Fps is (10 sec: 12288.5, 60 sec: 11332.3, 300 sec: 11593.8). Total num frames: 7929856. Throughput: 0: 11155.8. Samples: 7907660. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:11,435][06154] Avg episode reward: [(0, '4138.485')] +[2024-08-05 13:13:12,972][06215] Updated weights for policy 0, policy_version 15520 (0.0007) +[2024-08-05 13:13:16,385][06215] Updated weights for policy 0, policy_version 15600 (0.0006) +[2024-08-05 13:13:16,438][06154] Fps is (10 sec: 12283.9, 60 sec: 11399.9, 300 sec: 11593.6). Total num frames: 7987200. Throughput: 0: 11259.7. Samples: 7979564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:16,438][06154] Avg episode reward: [(0, '4033.842')] +[2024-08-05 13:13:20,437][06215] Updated weights for policy 0, policy_version 15680 (0.0008) +[2024-08-05 13:13:21,435][06154] Fps is (10 sec: 10239.8, 60 sec: 11059.2, 300 sec: 11552.1). Total num frames: 8032256. Throughput: 0: 11136.8. Samples: 8009944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:21,435][06154] Avg episode reward: [(0, '4069.134')] +[2024-08-05 13:13:21,475][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000015696_8036352.pth... +[2024-08-05 13:13:21,481][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000015056_7708672.pth +[2024-08-05 13:13:24,180][06215] Updated weights for policy 0, policy_version 15760 (0.0007) +[2024-08-05 13:13:26,434][06154] Fps is (10 sec: 10653.2, 60 sec: 11195.7, 300 sec: 11552.1). Total num frames: 8093696. Throughput: 0: 11022.0. Samples: 8077156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:26,435][06154] Avg episode reward: [(0, '4194.676')] +[2024-08-05 13:13:27,660][06215] Updated weights for policy 0, policy_version 15840 (0.0007) +[2024-08-05 13:13:31,130][06215] Updated weights for policy 0, policy_version 15920 (0.0006) +[2024-08-05 13:13:31,435][06154] Fps is (10 sec: 11878.5, 60 sec: 11195.7, 300 sec: 11579.9). Total num frames: 8151040. Throughput: 0: 11194.7. Samples: 8147200. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:31,435][06154] Avg episode reward: [(0, '4110.441')] +[2024-08-05 13:13:34,626][06215] Updated weights for policy 0, policy_version 16000 (0.0006) +[2024-08-05 13:13:36,435][06154] Fps is (10 sec: 11468.7, 60 sec: 11195.7, 300 sec: 11538.2). Total num frames: 8208384. Throughput: 0: 11263.2. Samples: 8182964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:36,435][06154] Avg episode reward: [(0, '4185.514')] +[2024-08-05 13:13:36,463][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000016040_8212480.pth... +[2024-08-05 13:13:36,469][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000015360_7864320.pth +[2024-08-05 13:13:38,047][06215] Updated weights for policy 0, policy_version 16080 (0.0006) +[2024-08-05 13:13:41,435][06154] Fps is (10 sec: 11878.0, 60 sec: 11195.7, 300 sec: 11579.9). Total num frames: 8269824. Throughput: 0: 11339.9. Samples: 8255800. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:13:41,435][06154] Avg episode reward: [(0, '4027.438')] +[2024-08-05 13:13:41,489][06215] Updated weights for policy 0, policy_version 16160 (0.0007) +[2024-08-05 13:13:45,288][06215] Updated weights for policy 0, policy_version 16240 (0.0007) +[2024-08-05 13:13:46,435][06154] Fps is (10 sec: 11878.5, 60 sec: 11332.3, 300 sec: 11566.0). Total num frames: 8327168. Throughput: 0: 11497.5. Samples: 8323136. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:13:46,435][06154] Avg episode reward: [(0, '4117.903')] +[2024-08-05 13:13:49,051][06215] Updated weights for policy 0, policy_version 16320 (0.0008) +[2024-08-05 13:13:51,434][06154] Fps is (10 sec: 11059.6, 60 sec: 11195.7, 300 sec: 11524.3). Total num frames: 8380416. Throughput: 0: 11462.8. Samples: 8353768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:51,435][06154] Avg episode reward: [(0, '4194.062')] +[2024-08-05 13:13:51,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000016368_8380416.pth... +[2024-08-05 13:13:51,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000015696_8036352.pth +[2024-08-05 13:13:52,428][06215] Updated weights for policy 0, policy_version 16400 (0.0006) +[2024-08-05 13:13:55,895][06215] Updated weights for policy 0, policy_version 16480 (0.0006) +[2024-08-05 13:13:56,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11332.3, 300 sec: 11552.1). Total num frames: 8441856. Throughput: 0: 11530.6. Samples: 8426536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:13:56,435][06154] Avg episode reward: [(0, '4157.561')] +[2024-08-05 13:13:59,461][06215] Updated weights for policy 0, policy_version 16560 (0.0007) +[2024-08-05 13:14:01,435][06154] Fps is (10 sec: 11878.2, 60 sec: 11537.1, 300 sec: 11552.1). Total num frames: 8499200. Throughput: 0: 11458.7. Samples: 8495168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:14:01,435][06154] Avg episode reward: [(0, '4109.393')] +[2024-08-05 13:14:03,177][06215] Updated weights for policy 0, policy_version 16640 (0.0008) +[2024-08-05 13:14:06,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11552.1). Total num frames: 8556544. Throughput: 0: 11511.0. Samples: 8527936. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:14:06,442][06154] Avg episode reward: [(0, '4273.773')] +[2024-08-05 13:14:06,445][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000016712_8556544.pth... +[2024-08-05 13:14:06,449][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000016040_8212480.pth +[2024-08-05 13:14:06,585][06215] Updated weights for policy 0, policy_version 16720 (0.0006) +[2024-08-05 13:14:10,411][06215] Updated weights for policy 0, policy_version 16800 (0.0007) +[2024-08-05 13:14:11,435][06154] Fps is (10 sec: 11059.3, 60 sec: 11332.3, 300 sec: 11538.2). Total num frames: 8609792. Throughput: 0: 11493.1. Samples: 8594348. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:14:11,435][06154] Avg episode reward: [(0, '4263.600')] +[2024-08-05 13:14:14,269][06215] Updated weights for policy 0, policy_version 16880 (0.0008) +[2024-08-05 13:14:16,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11332.9, 300 sec: 11524.3). Total num frames: 8667136. Throughput: 0: 11446.6. Samples: 8662296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:14:16,442][06154] Avg episode reward: [(0, '4315.598')] +[2024-08-05 13:14:16,443][06202] Saving new best policy, reward=4315.598! +[2024-08-05 13:14:17,816][06215] Updated weights for policy 0, policy_version 16960 (0.0006) +[2024-08-05 13:14:21,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11468.8, 300 sec: 11510.5). Total num frames: 8720384. Throughput: 0: 11326.2. Samples: 8692644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:14:21,435][06154] Avg episode reward: [(0, '4362.797')] +[2024-08-05 13:14:21,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000017032_8720384.pth... +[2024-08-05 13:14:21,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000016368_8380416.pth +[2024-08-05 13:14:21,441][06202] Saving new best policy, reward=4362.797! +[2024-08-05 13:14:21,698][06215] Updated weights for policy 0, policy_version 17040 (0.0007) +[2024-08-05 13:14:25,082][06215] Updated weights for policy 0, policy_version 17120 (0.0006) +[2024-08-05 13:14:26,434][06154] Fps is (10 sec: 11468.8, 60 sec: 11468.8, 300 sec: 11524.4). Total num frames: 8781824. Throughput: 0: 11272.3. Samples: 8763048. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:14:26,435][06154] Avg episode reward: [(0, '4357.061')] +[2024-08-05 13:14:28,315][06215] Updated weights for policy 0, policy_version 17200 (0.0006) +[2024-08-05 13:14:31,435][06154] Fps is (10 sec: 11468.6, 60 sec: 11400.5, 300 sec: 11524.3). Total num frames: 8835072. Throughput: 0: 11292.4. Samples: 8831296. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:14:31,443][06154] Avg episode reward: [(0, '4285.054')] +[2024-08-05 13:14:32,423][06215] Updated weights for policy 0, policy_version 17280 (0.0007) +[2024-08-05 13:14:35,976][06215] Updated weights for policy 0, policy_version 17360 (0.0006) +[2024-08-05 13:14:36,435][06154] Fps is (10 sec: 11059.1, 60 sec: 11400.5, 300 sec: 11510.5). Total num frames: 8892416. Throughput: 0: 11366.0. Samples: 8865240. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:14:36,435][06154] Avg episode reward: [(0, '4251.182')] +[2024-08-05 13:14:36,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000017368_8892416.pth... +[2024-08-05 13:14:36,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000016712_8556544.pth +[2024-08-05 13:14:39,227][06215] Updated weights for policy 0, policy_version 17440 (0.0006) +[2024-08-05 13:14:41,435][06154] Fps is (10 sec: 11878.6, 60 sec: 11400.6, 300 sec: 11538.2). Total num frames: 8953856. Throughput: 0: 11354.1. Samples: 8937472. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:14:41,435][06154] Avg episode reward: [(0, '4245.982')] +[2024-08-05 13:14:42,616][06215] Updated weights for policy 0, policy_version 17520 (0.0006) +[2024-08-05 13:14:45,848][06215] Updated weights for policy 0, policy_version 17600 (0.0006) +[2024-08-05 13:14:46,435][06154] Fps is (10 sec: 12287.7, 60 sec: 11468.7, 300 sec: 11538.2). Total num frames: 9015296. Throughput: 0: 11468.7. Samples: 9011264. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:14:46,435][06154] Avg episode reward: [(0, '4260.156')] +[2024-08-05 13:14:49,844][06215] Updated weights for policy 0, policy_version 17680 (0.0008) +[2024-08-05 13:14:51,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11537.1, 300 sec: 11524.3). Total num frames: 9072640. Throughput: 0: 11399.0. Samples: 9040892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:14:51,435][06154] Avg episode reward: [(0, '4274.692')] +[2024-08-05 13:14:51,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000017720_9072640.pth... +[2024-08-05 13:14:51,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000017032_8720384.pth +[2024-08-05 13:14:52,959][06215] Updated weights for policy 0, policy_version 17760 (0.0005) +[2024-08-05 13:14:56,053][06215] Updated weights for policy 0, policy_version 17840 (0.0005) +[2024-08-05 13:14:56,434][06154] Fps is (10 sec: 12288.4, 60 sec: 11605.3, 300 sec: 11538.2). Total num frames: 9138176. Throughput: 0: 11670.7. Samples: 9119528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:14:56,435][06154] Avg episode reward: [(0, '4325.654')] +[2024-08-05 13:14:59,402][06215] Updated weights for policy 0, policy_version 17920 (0.0006) +[2024-08-05 13:15:01,434][06154] Fps is (10 sec: 12697.7, 60 sec: 11673.6, 300 sec: 11538.2). Total num frames: 9199616. Throughput: 0: 11801.0. Samples: 9193340. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:01,435][06154] Avg episode reward: [(0, '4191.338')] +[2024-08-05 13:15:02,725][06215] Updated weights for policy 0, policy_version 18000 (0.0007) +[2024-08-05 13:15:06,247][06215] Updated weights for policy 0, policy_version 18080 (0.0006) +[2024-08-05 13:15:06,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11673.6, 300 sec: 11552.1). Total num frames: 9256960. Throughput: 0: 11938.1. Samples: 9229860. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:06,435][06154] Avg episode reward: [(0, '4131.222')] +[2024-08-05 13:15:06,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000018080_9256960.pth... +[2024-08-05 13:15:06,443][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000017368_8892416.pth +[2024-08-05 13:15:09,566][06215] Updated weights for policy 0, policy_version 18160 (0.0005) +[2024-08-05 13:15:11,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11579.9). Total num frames: 9318400. Throughput: 0: 11972.9. Samples: 9301828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:11,435][06154] Avg episode reward: [(0, '4234.810')] +[2024-08-05 13:15:13,476][06215] Updated weights for policy 0, policy_version 18240 (0.0008) +[2024-08-05 13:15:16,434][06154] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11552.1). Total num frames: 9371648. Throughput: 0: 11914.5. Samples: 9367448. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:16,442][06154] Avg episode reward: [(0, '4188.248')] +[2024-08-05 13:15:16,871][06215] Updated weights for policy 0, policy_version 18320 (0.0006) +[2024-08-05 13:15:20,643][06215] Updated weights for policy 0, policy_version 18400 (0.0007) +[2024-08-05 13:15:21,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11810.1, 300 sec: 11524.3). Total num frames: 9428992. Throughput: 0: 11877.4. Samples: 9399724. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:15:21,435][06154] Avg episode reward: [(0, '4181.504')] +[2024-08-05 13:15:21,439][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000018416_9428992.pth... +[2024-08-05 13:15:21,444][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000017720_9072640.pth +[2024-08-05 13:15:24,158][06215] Updated weights for policy 0, policy_version 18480 (0.0007) +[2024-08-05 13:15:26,435][06154] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11538.2). Total num frames: 9490432. Throughput: 0: 11834.3. Samples: 9470016. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:15:26,435][06154] Avg episode reward: [(0, '4142.351')] +[2024-08-05 13:15:27,386][06215] Updated weights for policy 0, policy_version 18560 (0.0006) +[2024-08-05 13:15:31,140][06215] Updated weights for policy 0, policy_version 18640 (0.0007) +[2024-08-05 13:15:31,435][06154] Fps is (10 sec: 11468.9, 60 sec: 11810.2, 300 sec: 11496.6). Total num frames: 9543680. Throughput: 0: 11741.9. Samples: 9539648. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:15:31,435][06154] Avg episode reward: [(0, '4024.120')] +[2024-08-05 13:15:35,192][06215] Updated weights for policy 0, policy_version 18720 (0.0007) +[2024-08-05 13:15:36,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11810.1, 300 sec: 11496.6). Total num frames: 9601024. Throughput: 0: 11798.7. Samples: 9571832. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:15:36,435][06154] Avg episode reward: [(0, '4116.820')] +[2024-08-05 13:15:36,437][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000018752_9601024.pth... 
+[2024-08-05 13:15:36,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000018080_9256960.pth +[2024-08-05 13:15:38,212][06215] Updated weights for policy 0, policy_version 18800 (0.0006) +[2024-08-05 13:15:41,435][06154] Fps is (10 sec: 11059.2, 60 sec: 11673.6, 300 sec: 11468.8). Total num frames: 9654272. Throughput: 0: 11612.2. Samples: 9642076. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:15:41,435][06154] Avg episode reward: [(0, '4357.951')] +[2024-08-05 13:15:42,197][06215] Updated weights for policy 0, policy_version 18880 (0.0008) +[2024-08-05 13:15:45,686][06215] Updated weights for policy 0, policy_version 18960 (0.0006) +[2024-08-05 13:15:46,435][06154] Fps is (10 sec: 11468.6, 60 sec: 11673.6, 300 sec: 11499.1). Total num frames: 9715712. Throughput: 0: 11467.4. Samples: 9709376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:46,435][06154] Avg episode reward: [(0, '4447.181')] +[2024-08-05 13:15:46,436][06202] Saving new best policy, reward=4447.181! +[2024-08-05 13:15:49,127][06215] Updated weights for policy 0, policy_version 19040 (0.0007) +[2024-08-05 13:15:51,435][06154] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11496.6). Total num frames: 9773056. Throughput: 0: 11435.9. Samples: 9744476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:51,435][06154] Avg episode reward: [(0, '4209.181')] +[2024-08-05 13:15:51,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000019088_9773056.pth... +[2024-08-05 13:15:51,440][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000018416_9428992.pth +[2024-08-05 13:15:52,797][06215] Updated weights for policy 0, policy_version 19120 (0.0006) +[2024-08-05 13:15:56,435][06154] Fps is (10 sec: 11059.3, 60 sec: 11468.8, 300 sec: 11496.6). Total num frames: 9826304. Throughput: 0: 11290.9. Samples: 9809920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:15:56,435][06154] Avg episode reward: [(0, '4115.032')] +[2024-08-05 13:15:56,499][06215] Updated weights for policy 0, policy_version 19200 (0.0007) +[2024-08-05 13:15:59,955][06215] Updated weights for policy 0, policy_version 19280 (0.0006) +[2024-08-05 13:16:01,435][06154] Fps is (10 sec: 11468.8, 60 sec: 11468.8, 300 sec: 11510.5). Total num frames: 9887744. Throughput: 0: 11474.7. Samples: 9883808. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:16:01,435][06154] Avg episode reward: [(0, '4347.372')] +[2024-08-05 13:16:02,894][06215] Updated weights for policy 0, policy_version 19360 (0.0006) +[2024-08-05 13:16:06,435][06154] Fps is (10 sec: 12288.0, 60 sec: 11537.1, 300 sec: 11538.2). Total num frames: 9949184. Throughput: 0: 11662.4. Samples: 9924532. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:16:06,435][06154] Avg episode reward: [(0, '4290.733')] +[2024-08-05 13:16:06,438][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000019432_9949184.pth... 
+[2024-08-05 13:16:06,441][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000018752_9601024.pth
+[2024-08-05 13:16:06,736][06215] Updated weights for policy 0, policy_version 19440 (0.0007)
+[2024-08-05 13:16:10,549][06215] Updated weights for policy 0, policy_version 19520 (0.0007)
+[2024-08-05 13:16:11,435][06154] Fps is (10 sec: 11059.0, 60 sec: 11332.3, 300 sec: 11496.6). Total num frames: 9998336. Throughput: 0: 11466.9. Samples: 9986028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
+[2024-08-05 13:16:11,435][06154] Avg episode reward: [(0, '4227.867')]
+[2024-08-05 13:16:11,865][06202] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
+[2024-08-05 13:16:11,866][06219] Stopping RolloutWorker_w3...
+[2024-08-05 13:16:11,866][06230] Stopping RolloutWorker_w4...
+[2024-08-05 13:16:11,866][06218] Stopping RolloutWorker_w1...
+[2024-08-05 13:16:11,866][06154] Component RolloutWorker_w4 stopped!
+[2024-08-05 13:16:11,866][06220] Stopping RolloutWorker_w5...
+[2024-08-05 13:16:11,866][06202] Stopping Batcher_0...
+[2024-08-05 13:16:11,866][06221] Stopping RolloutWorker_w7...
+[2024-08-05 13:16:11,866][06217] Stopping RolloutWorker_w0...
+[2024-08-05 13:16:11,866][06219] Loop rollout_proc3_evt_loop terminating...
+[2024-08-05 13:16:11,866][06218] Loop rollout_proc1_evt_loop terminating...
+[2024-08-05 13:16:11,866][06202] Loop batcher_evt_loop terminating...
+[2024-08-05 13:16:11,866][06230] Loop rollout_proc4_evt_loop terminating...
+[2024-08-05 13:16:11,866][06221] Loop rollout_proc7_evt_loop terminating...
+[2024-08-05 13:16:11,866][06154] Component RolloutWorker_w1 stopped!
+[2024-08-05 13:16:11,866][06220] Loop rollout_proc5_evt_loop terminating...
+[2024-08-05 13:16:11,866][06217] Loop rollout_proc0_evt_loop terminating...
+[2024-08-05 13:16:11,866][06154] Component RolloutWorker_w3 stopped!
+[2024-08-05 13:16:11,867][06154] Component RolloutWorker_w5 stopped!
+[2024-08-05 13:16:11,867][06216] Stopping RolloutWorker_w2...
+[2024-08-05 13:16:11,867][06154] Component Batcher_0 stopped!
+[2024-08-05 13:16:11,867][06154] Component RolloutWorker_w7 stopped!
+[2024-08-05 13:16:11,867][06216] Loop rollout_proc2_evt_loop terminating...
+[2024-08-05 13:16:11,867][06154] Component RolloutWorker_w0 stopped!
+[2024-08-05 13:16:11,867][06154] Component RolloutWorker_w2 stopped!
+[2024-08-05 13:16:11,867][06154] Component RolloutWorker_w6 stopped!
+[2024-08-05 13:16:11,867][06229] Stopping RolloutWorker_w6...
+[2024-08-05 13:16:11,868][06229] Loop rollout_proc6_evt_loop terminating...
+[2024-08-05 13:16:11,872][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2024-08-05 13:16:11,875][06202] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000019088_9773056.pth
+[2024-08-05 13:16:11,876][06202] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_walker_fist_run/checkpoint_p0/checkpoint_000019544_10006528.pth...
+[2024-08-05 13:16:11,880][06202] Stopping LearnerWorker_p0...
+[2024-08-05 13:16:11,880][06202] Loop learner_proc0_evt_loop terminating...
+[2024-08-05 13:16:11,880][06154] Component LearnerWorker_p0 stopped!
+[2024-08-05 13:16:11,928][06215] Weights refcount: 2 0
+[2024-08-05 13:16:11,933][06215] Stopping InferenceWorker_p0-w0...
+[2024-08-05 13:16:11,933][06154] Component InferenceWorker_p0-w0 stopped!
+[2024-08-05 13:16:11,933][06215] Loop inference_proc0-0_evt_loop terminating...
+[2024-08-05 13:16:11,933][06154] Waiting for process learner_proc0 to stop...
+[2024-08-05 13:16:12,571][06154] Waiting for process inference_proc0-0 to join...
+[2024-08-05 13:16:12,571][06154] Waiting for process rollout_proc0 to join...
+[2024-08-05 13:16:12,571][06154] Waiting for process rollout_proc1 to join...
+[2024-08-05 13:16:12,571][06154] Waiting for process rollout_proc2 to join...
+[2024-08-05 13:16:12,572][06154] Waiting for process rollout_proc3 to join...
+[2024-08-05 13:16:12,572][06154] Waiting for process rollout_proc4 to join...
+[2024-08-05 13:16:12,572][06154] Waiting for process rollout_proc5 to join...
+[2024-08-05 13:16:12,572][06154] Waiting for process rollout_proc6 to join...
+[2024-08-05 13:16:12,572][06154] Waiting for process rollout_proc7 to join...
+[2024-08-05 13:16:12,572][06154] Batcher 0 profile tree view:
+batching: 5.8023, releasing_batches: 1.3452
+[2024-08-05 13:16:12,572][06154] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0051
+ wait_policy_total: 198.6913
+update_model: 9.4815
+ weight_update: 0.0008
+one_step: 0.0011
+ handle_policy_step: 615.0909
+ deserialize: 17.0387, stack: 3.9070, obs_to_device_normalize: 126.2007, forward: 302.3970, send_messages: 48.2567
+ prepare_outputs: 83.6241
+ to_cpu: 47.3700
+[2024-08-05 13:16:12,573][06154] Learner 0 profile tree view:
+misc: 0.0071, prepare_batch: 9.9845
+train: 105.6396
+ epoch_init: 0.0446, minibatch_init: 1.6544, losses_postprocess: 3.3330, kl_divergence: 1.6805, after_optimizer: 1.5156
+ calculate_losses: 34.4172
+ losses_init: 0.0491, forward_head: 3.2069, bptt_initial: 0.2382, bptt: 0.1977, tail: 12.8999, advantages_returns: 1.8582, losses: 14.0053
+ update: 60.6407
+ clip: 7.6590
+[2024-08-05 13:16:12,573][06154] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.4274, enqueue_policy_requests: 19.9849, env_step: 228.8720, overhead: 32.4909, complete_rollouts: 0.4597
+save_policy_outputs: 55.0256
+ split_output_tensors: 18.7649
+[2024-08-05 13:16:12,573][06154] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.3916, enqueue_policy_requests: 20.2117, env_step: 232.2551, overhead: 33.1885, complete_rollouts: 0.4742
+save_policy_outputs: 54.6750
+ split_output_tensors: 18.5851
+[2024-08-05 13:16:12,573][06154] Loop Runner_EvtLoop terminating...
+[2024-08-05 13:16:12,573][06154] Runner profile tree view:
+main_loop: 873.3594
+[2024-08-05 13:16:12,573][06154] Collected {0: 10006528}, FPS: 11457.5