diff --git "a/sf_log.txt" "b/sf_log.txt" new file mode 100644--- /dev/null +++ "b/sf_log.txt" @@ -0,0 +1,1255 @@ +[2024-08-05 13:35:18,409][08991] Saving configuration to /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/config.json... +[2024-08-05 13:35:18,410][08991] Rollout worker 0 uses device cpu +[2024-08-05 13:35:18,410][08991] Rollout worker 1 uses device cpu +[2024-08-05 13:35:18,410][08991] Rollout worker 2 uses device cpu +[2024-08-05 13:35:18,410][08991] Rollout worker 3 uses device cpu +[2024-08-05 13:35:18,410][08991] Rollout worker 4 uses device cpu +[2024-08-05 13:35:18,410][08991] Rollout worker 5 uses device cpu +[2024-08-05 13:35:18,411][08991] Rollout worker 6 uses device cpu +[2024-08-05 13:35:18,411][08991] Rollout worker 7 uses device cpu +[2024-08-05 13:35:18,411][08991] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 +[2024-08-05 13:35:18,424][08991] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:35:18,424][08991] InferenceWorker_p0-w0: min num requests: 2 +[2024-08-05 13:35:18,441][08991] Starting all processes... +[2024-08-05 13:35:18,441][08991] Starting process learner_proc0 +[2024-08-05 13:35:18,661][08991] Starting all processes... +[2024-08-05 13:35:18,666][08991] Starting process inference_proc0-0 +[2024-08-05 13:35:18,667][08991] Starting process rollout_proc0 +[2024-08-05 13:35:18,667][08991] Starting process rollout_proc1 +[2024-08-05 13:35:18,667][08991] Starting process rollout_proc2 +[2024-08-05 13:35:18,673][08991] Starting process rollout_proc3 +[2024-08-05 13:35:18,674][08991] Starting process rollout_proc4 +[2024-08-05 13:35:18,677][08991] Starting process rollout_proc5 +[2024-08-05 13:35:18,683][08991] Starting process rollout_proc6 +[2024-08-05 13:35:18,684][08991] Starting process rollout_proc7 +[2024-08-05 13:35:20,277][09037] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:35:20,277][09037] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2024-08-05 13:35:20,350][09037] Num visible devices: 1 +[2024-08-05 13:35:20,356][09054] Worker 2 uses CPU cores [2] +[2024-08-05 13:35:20,377][09037] Starting seed is not provided +[2024-08-05 13:35:20,377][09037] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:35:20,377][09037] Initializing actor-critic model on device cuda:0 +[2024-08-05 13:35:20,378][09037] RunningMeanStd input shape: (376,) +[2024-08-05 13:35:20,378][09037] RunningMeanStd input shape: (1,) +[2024-08-05 13:35:20,448][09052] Worker 1 uses CPU cores [1] +[2024-08-05 13:35:20,455][09037] Created Actor Critic model with architecture: +[2024-08-05 13:35:20,455][09037] ActorCriticSharedWeights( + (obs_normalizer): ObservationNormalizer( + (running_mean_std): RunningMeanStdDictInPlace( + (running_mean_std): ModuleDict( + (obs): RunningMeanStdInPlace() + ) + ) + ) + (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) + (encoder): MultiInputEncoder( + (encoders): ModuleDict( + (obs): MlpEncoder( + (mlp_head): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Linear) + (1): RecursiveScriptModule(original_name=Tanh) + (2): RecursiveScriptModule(original_name=Linear) + (3): RecursiveScriptModule(original_name=Tanh) + ) + ) + ) + ) + (core): ModelCoreIdentity() + (decoder): MlpDecoder( + (mlp): Identity() + ) + (critic_linear): Linear(in_features=64, out_features=1, bias=True) + (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev( + (distribution_linear): Linear(in_features=64, out_features=17, bias=True) + ) +) +[2024-08-05 13:35:20,484][09050] Worker 0 uses CPU cores [0] +[2024-08-05 13:35:20,506][09055] Worker 4 uses CPU cores [4] +[2024-08-05 13:35:20,636][09056] Worker 5 uses CPU cores [5] +[2024-08-05 13:35:20,668][09064] Worker 6 uses CPU cores [6] +[2024-08-05 13:35:20,716][09053] Worker 3 uses CPU cores [3] +[2024-08-05 13:35:20,716][09051] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:35:20,716][09051] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2024-08-05 13:35:20,725][09037] Using optimizer +[2024-08-05 13:35:20,742][09051] Num visible devices: 1 +[2024-08-05 13:35:20,788][09065] Worker 7 uses CPU cores [7] +[2024-08-05 13:35:21,180][09037] No checkpoints found +[2024-08-05 13:35:21,180][09037] Did not load from checkpoint, starting from scratch! +[2024-08-05 13:35:21,181][09037] Initialized policy 0 weights for model version 0 +[2024-08-05 13:35:21,183][09037] LearnerWorker_p0 finished initialization! +[2024-08-05 13:35:21,183][09037] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2024-08-05 13:35:21,418][09051] RunningMeanStd input shape: (376,) +[2024-08-05 13:35:21,418][09051] RunningMeanStd input shape: (1,) +[2024-08-05 13:35:21,476][08991] Inference worker 0-0 is ready! +[2024-08-05 13:35:21,477][08991] All inference workers are ready! Signal rollout workers to start! +[2024-08-05 13:35:21,588][09056] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,588][09054] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,588][09053] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,588][09056] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,589][09054] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,589][09053] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,589][09052] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,589][09065] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,590][09052] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,590][09065] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,590][09055] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,590][09064] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,591][09055] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,591][09064] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,595][09050] Decorrelating experience for 0 frames... +[2024-08-05 13:35:21,595][09050] Decorrelating experience for 64 frames... +[2024-08-05 13:35:21,613][09054] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,615][09053] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,615][09056] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,616][09052] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,617][09065] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,618][09055] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,618][09064] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,622][09050] Decorrelating experience for 128 frames... +[2024-08-05 13:35:21,662][09054] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,665][09056] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,667][09053] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,668][09052] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,669][09064] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,671][09065] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,673][09055] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,678][09050] Decorrelating experience for 192 frames... +[2024-08-05 13:35:21,751][09054] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,753][09053] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,759][09056] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,762][09064] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,762][09052] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,768][09065] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,769][09055] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,772][09050] Decorrelating experience for 256 frames... +[2024-08-05 13:35:21,860][09054] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,870][09052] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,870][09053] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,873][09064] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,880][09065] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,888][09050] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,890][09055] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,897][09056] Decorrelating experience for 320 frames... +[2024-08-05 13:35:21,989][09054] Decorrelating experience for 384 frames... +[2024-08-05 13:35:21,999][09052] Decorrelating experience for 384 frames... +[2024-08-05 13:35:21,999][09064] Decorrelating experience for 384 frames... +[2024-08-05 13:35:22,004][09053] Decorrelating experience for 384 frames... +[2024-08-05 13:35:22,020][09065] Decorrelating experience for 384 frames... +[2024-08-05 13:35:22,022][09050] Decorrelating experience for 384 frames... +[2024-08-05 13:35:22,022][09056] Decorrelating experience for 384 frames... +[2024-08-05 13:35:22,024][09055] Decorrelating experience for 384 frames... +[2024-08-05 13:35:22,136][09054] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,157][09064] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,157][09052] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,159][09053] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,177][09050] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,177][09056] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,178][09065] Decorrelating experience for 448 frames... +[2024-08-05 13:35:22,180][09055] Decorrelating experience for 448 frames... +[2024-08-05 13:35:25,753][08991] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 12288. Throughput: 0: nan. Samples: 4276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:35:25,753][08991] Avg episode reward: [(0, '72.356')] +[2024-08-05 13:35:29,445][09051] Updated weights for policy 0, policy_version 80 (0.0006) +[2024-08-05 13:35:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7372.8, 300 sec: 7372.8). Total num frames: 49152. Throughput: 0: 9004.0. Samples: 49296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:35:30,753][08991] Avg episode reward: [(0, '212.558')] +[2024-08-05 13:35:30,755][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000096_49152.pth... +[2024-08-05 13:35:34,465][09051] Updated weights for policy 0, policy_version 160 (0.0006) +[2024-08-05 13:35:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7782.4). Total num frames: 90112. Throughput: 0: 7135.6. Samples: 75632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:35:35,753][08991] Avg episode reward: [(0, '260.396')] +[2024-08-05 13:35:38,419][08991] Heartbeat connected on Batcher_0 +[2024-08-05 13:35:38,422][08991] Heartbeat connected on LearnerWorker_p0 +[2024-08-05 13:35:38,427][08991] Heartbeat connected on RolloutWorker_w0 +[2024-08-05 13:35:38,429][08991] Heartbeat connected on RolloutWorker_w1 +[2024-08-05 13:35:38,430][08991] Heartbeat connected on InferenceWorker_p0-w0 +[2024-08-05 13:35:38,431][08991] Heartbeat connected on RolloutWorker_w2 +[2024-08-05 13:35:38,434][08991] Heartbeat connected on RolloutWorker_w3 +[2024-08-05 13:35:38,435][08991] Heartbeat connected on RolloutWorker_w4 +[2024-08-05 13:35:38,437][08991] Heartbeat connected on RolloutWorker_w5 +[2024-08-05 13:35:38,439][08991] Heartbeat connected on RolloutWorker_w6 +[2024-08-05 13:35:38,441][08991] Heartbeat connected on RolloutWorker_w7 +[2024-08-05 13:35:39,647][09051] Updated weights for policy 0, policy_version 240 (0.0007) +[2024-08-05 13:35:40,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7918.9, 300 sec: 7918.9). Total num frames: 131072. Throughput: 0: 7906.9. Samples: 122880. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:35:40,753][08991] Avg episode reward: [(0, '276.873')] +[2024-08-05 13:35:40,753][09037] Saving new best policy, reward=276.873! +[2024-08-05 13:35:45,025][09051] Updated weights for policy 0, policy_version 320 (0.0007) +[2024-08-05 13:35:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7782.4). Total num frames: 167936. Throughput: 0: 8180.2. Samples: 167880. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:35:45,753][08991] Avg episode reward: [(0, '311.222')] +[2024-08-05 13:35:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000328_167936.pth... +[2024-08-05 13:35:45,759][09037] Saving new best policy, reward=311.222! +[2024-08-05 13:35:50,330][09051] Updated weights for policy 0, policy_version 400 (0.0007) +[2024-08-05 13:35:50,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7700.4, 300 sec: 7700.4). Total num frames: 204800. Throughput: 0: 7459.1. Samples: 190756. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:35:50,753][08991] Avg episode reward: [(0, '336.504')] +[2024-08-05 13:35:50,754][09037] Saving new best policy, reward=336.504! +[2024-08-05 13:35:55,708][09051] Updated weights for policy 0, policy_version 480 (0.0007) +[2024-08-05 13:35:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7782.4). Total num frames: 245760. Throughput: 0: 7768.3. Samples: 237324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:35:55,753][08991] Avg episode reward: [(0, '359.930')] +[2024-08-05 13:35:55,754][09037] Saving new best policy, reward=359.930! +[2024-08-05 13:36:00,753][08991] Fps is (10 sec: 7372.6, 60 sec: 7606.7, 300 sec: 7606.7). Total num frames: 278528. Throughput: 0: 7884.3. Samples: 280232. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:36:00,754][08991] Avg episode reward: [(0, '365.976')] +[2024-08-05 13:36:00,762][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000544_278528.pth... +[2024-08-05 13:36:00,773][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000096_49152.pth +[2024-08-05 13:36:00,774][09037] Saving new best policy, reward=365.976! +[2024-08-05 13:36:01,510][09051] Updated weights for policy 0, policy_version 560 (0.0008) +[2024-08-05 13:36:05,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7577.6, 300 sec: 7577.6). Total num frames: 315392. Throughput: 0: 7469.7. Samples: 303064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:05,753][08991] Avg episode reward: [(0, '376.030')] +[2024-08-05 13:36:05,753][09037] Saving new best policy, reward=376.030! +[2024-08-05 13:36:06,974][09051] Updated weights for policy 0, policy_version 640 (0.0007) +[2024-08-05 13:36:10,753][08991] Fps is (10 sec: 7782.8, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 356352. Throughput: 0: 7649.6. Samples: 348508. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:10,755][08991] Avg episode reward: [(0, '414.446')] +[2024-08-05 13:36:10,755][09037] Saving new best policy, reward=414.446! +[2024-08-05 13:36:12,221][09051] Updated weights for policy 0, policy_version 720 (0.0006) +[2024-08-05 13:36:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7618.6, 300 sec: 7618.6). Total num frames: 393216. Throughput: 0: 7644.8. Samples: 393312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:15,753][08991] Avg episode reward: [(0, '444.230')] +[2024-08-05 13:36:15,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000768_393216.pth... +[2024-08-05 13:36:15,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000328_167936.pth +[2024-08-05 13:36:15,761][09037] Saving new best policy, reward=444.230! +[2024-08-05 13:36:17,722][09051] Updated weights for policy 0, policy_version 800 (0.0008) +[2024-08-05 13:36:20,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7596.2, 300 sec: 7596.2). Total num frames: 430080. Throughput: 0: 7573.6. Samples: 416444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:20,753][08991] Avg episode reward: [(0, '445.338')] +[2024-08-05 13:36:20,753][09037] Saving new best policy, reward=445.338! +[2024-08-05 13:36:22,965][09051] Updated weights for policy 0, policy_version 880 (0.0006) +[2024-08-05 13:36:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 471040. Throughput: 0: 7554.8. Samples: 462848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:25,761][08991] Avg episode reward: [(0, '496.828')] +[2024-08-05 13:36:25,761][09037] Saving new best policy, reward=496.828! +[2024-08-05 13:36:28,186][09051] Updated weights for policy 0, policy_version 960 (0.0006) +[2024-08-05 13:36:30,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7624.9). Total num frames: 507904. Throughput: 0: 7623.8. Samples: 510952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:30,753][08991] Avg episode reward: [(0, '516.148')] +[2024-08-05 13:36:30,755][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000992_507904.pth... +[2024-08-05 13:36:30,763][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000544_278528.pth +[2024-08-05 13:36:30,763][09037] Saving new best policy, reward=516.148! +[2024-08-05 13:36:33,392][09051] Updated weights for policy 0, policy_version 1040 (0.0006) +[2024-08-05 13:36:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7665.4). Total num frames: 548864. Throughput: 0: 7609.8. Samples: 533196. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:36:35,753][08991] Avg episode reward: [(0, '529.270')] +[2024-08-05 13:36:35,754][09037] Saving new best policy, reward=529.270! +[2024-08-05 13:36:38,947][09051] Updated weights for policy 0, policy_version 1120 (0.0009) +[2024-08-05 13:36:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7645.9). Total num frames: 585728. Throughput: 0: 7560.1. Samples: 577528. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:36:40,753][08991] Avg episode reward: [(0, '544.358')] +[2024-08-05 13:36:40,753][09037] Saving new best policy, reward=544.358! +[2024-08-05 13:36:44,376][09051] Updated weights for policy 0, policy_version 1200 (0.0007) +[2024-08-05 13:36:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7628.8). Total num frames: 622592. Throughput: 0: 7608.1. Samples: 622592. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:45,753][08991] Avg episode reward: [(0, '568.597')] +[2024-08-05 13:36:45,755][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001216_622592.pth... +[2024-08-05 13:36:45,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000768_393216.pth +[2024-08-05 13:36:45,760][09037] Saving new best policy, reward=568.597! +[2024-08-05 13:36:50,012][09051] Updated weights for policy 0, policy_version 1280 (0.0009) +[2024-08-05 13:36:50,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7613.7). Total num frames: 659456. Throughput: 0: 7586.9. Samples: 644472. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:36:50,753][08991] Avg episode reward: [(0, '557.757')] +[2024-08-05 13:36:55,363][09051] Updated weights for policy 0, policy_version 1360 (0.0007) +[2024-08-05 13:36:55,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7600.4). Total num frames: 696320. Throughput: 0: 7614.0. Samples: 691140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:36:55,753][08991] Avg episode reward: [(0, '602.013')] +[2024-08-05 13:36:55,754][09037] Saving new best policy, reward=602.013! +[2024-08-05 13:37:00,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7577.7, 300 sec: 7588.4). Total num frames: 733184. Throughput: 0: 7553.9. Samples: 733240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:00,753][08991] Avg episode reward: [(0, '664.133')] +[2024-08-05 13:37:00,755][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001432_733184.pth... +[2024-08-05 13:37:00,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000000992_507904.pth +[2024-08-05 13:37:00,759][09037] Saving new best policy, reward=664.133! +[2024-08-05 13:37:01,013][09051] Updated weights for policy 0, policy_version 1440 (0.0008) +[2024-08-05 13:37:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7577.6). Total num frames: 770048. Throughput: 0: 7573.2. Samples: 757240. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:37:05,753][08991] Avg episode reward: [(0, '625.201')] +[2024-08-05 13:37:06,474][09051] Updated weights for policy 0, policy_version 1520 (0.0007) +[2024-08-05 13:37:10,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7509.3, 300 sec: 7567.9). Total num frames: 806912. Throughput: 0: 7513.9. Samples: 800972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:10,753][08991] Avg episode reward: [(0, '652.117')] +[2024-08-05 13:37:11,970][09051] Updated weights for policy 0, policy_version 1600 (0.0008) +[2024-08-05 13:37:15,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7559.0). Total num frames: 843776. Throughput: 0: 7429.0. Samples: 845256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:15,753][08991] Avg episode reward: [(0, '708.091')] +[2024-08-05 13:37:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001648_843776.pth... +[2024-08-05 13:37:15,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001216_622592.pth +[2024-08-05 13:37:15,759][09037] Saving new best policy, reward=708.091! +[2024-08-05 13:37:17,342][09051] Updated weights for policy 0, policy_version 1680 (0.0006) +[2024-08-05 13:37:20,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7586.5). Total num frames: 884736. Throughput: 0: 7449.3. Samples: 868416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:20,753][08991] Avg episode reward: [(0, '744.313')] +[2024-08-05 13:37:20,754][09037] Saving new best policy, reward=744.313! +[2024-08-05 13:37:22,814][09051] Updated weights for policy 0, policy_version 1760 (0.0007) +[2024-08-05 13:37:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7509.3, 300 sec: 7577.6). Total num frames: 921600. Throughput: 0: 7500.3. Samples: 915040. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:25,753][08991] Avg episode reward: [(0, '829.621')] +[2024-08-05 13:37:25,754][09037] Saving new best policy, reward=829.621! +[2024-08-05 13:37:28,128][09051] Updated weights for policy 0, policy_version 1840 (0.0006) +[2024-08-05 13:37:30,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7509.3, 300 sec: 7569.4). Total num frames: 958464. Throughput: 0: 7508.6. Samples: 960480. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:37:30,753][08991] Avg episode reward: [(0, '779.590')] +[2024-08-05 13:37:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001872_958464.pth... +[2024-08-05 13:37:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001432_733184.pth +[2024-08-05 13:37:33,508][09051] Updated weights for policy 0, policy_version 1920 (0.0007) +[2024-08-05 13:37:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7509.3, 300 sec: 7593.4). Total num frames: 999424. Throughput: 0: 7524.5. Samples: 983076. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:37:35,753][08991] Avg episode reward: [(0, '890.945')] +[2024-08-05 13:37:35,754][09037] Saving new best policy, reward=890.945! +[2024-08-05 13:37:38,757][09051] Updated weights for policy 0, policy_version 2000 (0.0006) +[2024-08-05 13:37:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7509.3, 300 sec: 7585.2). Total num frames: 1036288. Throughput: 0: 7504.7. Samples: 1028852. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:37:40,753][08991] Avg episode reward: [(0, '818.523')] +[2024-08-05 13:37:44,675][09051] Updated weights for policy 0, policy_version 2080 (0.0009) +[2024-08-05 13:37:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7577.6). Total num frames: 1073152. Throughput: 0: 7519.4. Samples: 1071612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:45,753][08991] Avg episode reward: [(0, '823.701')] +[2024-08-05 13:37:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002096_1073152.pth... +[2024-08-05 13:37:45,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001648_843776.pth +[2024-08-05 13:37:50,005][09051] Updated weights for policy 0, policy_version 2160 (0.0007) +[2024-08-05 13:37:50,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7570.5). Total num frames: 1110016. Throughput: 0: 7504.8. Samples: 1094956. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:37:50,753][08991] Avg episode reward: [(0, '964.699')] +[2024-08-05 13:37:50,754][09037] Saving new best policy, reward=964.699! +[2024-08-05 13:37:55,484][09051] Updated weights for policy 0, policy_version 2240 (0.0008) +[2024-08-05 13:37:55,753][08991] Fps is (10 sec: 7372.6, 60 sec: 7509.3, 300 sec: 7563.9). Total num frames: 1146880. Throughput: 0: 7507.2. Samples: 1138800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:37:55,754][08991] Avg episode reward: [(0, '905.612')] +[2024-08-05 13:38:00,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7557.8). Total num frames: 1183744. Throughput: 0: 7554.0. Samples: 1185184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:00,753][08991] Avg episode reward: [(0, '953.072')] +[2024-08-05 13:38:00,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002312_1183744.pth... +[2024-08-05 13:38:00,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000001872_958464.pth +[2024-08-05 13:38:00,794][09051] Updated weights for policy 0, policy_version 2320 (0.0006) +[2024-08-05 13:38:05,753][08991] Fps is (10 sec: 7782.6, 60 sec: 7577.6, 300 sec: 7577.6). Total num frames: 1224704. Throughput: 0: 7566.5. Samples: 1208908. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:38:05,753][08991] Avg episode reward: [(0, '966.606')] +[2024-08-05 13:38:05,754][09037] Saving new best policy, reward=966.606! +[2024-08-05 13:38:06,194][09051] Updated weights for policy 0, policy_version 2400 (0.0007) +[2024-08-05 13:38:10,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7645.9, 300 sec: 7596.2). Total num frames: 1265664. Throughput: 0: 7568.4. Samples: 1255616. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:38:10,753][08991] Avg episode reward: [(0, '938.027')] +[2024-08-05 13:38:11,336][09051] Updated weights for policy 0, policy_version 2480 (0.0006) +[2024-08-05 13:38:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7589.6). Total num frames: 1302528. Throughput: 0: 7571.0. Samples: 1301176. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:15,753][08991] Avg episode reward: [(0, '1006.179')] +[2024-08-05 13:38:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002544_1302528.pth... +[2024-08-05 13:38:15,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002096_1073152.pth +[2024-08-05 13:38:15,760][09037] Saving new best policy, reward=1006.179! +[2024-08-05 13:38:16,742][09051] Updated weights for policy 0, policy_version 2560 (0.0007) +[2024-08-05 13:38:20,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7577.6, 300 sec: 7583.5). Total num frames: 1339392. Throughput: 0: 7570.1. Samples: 1323732. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:20,753][08991] Avg episode reward: [(0, '1106.921')] +[2024-08-05 13:38:20,753][09037] Saving new best policy, reward=1106.921! +[2024-08-05 13:38:22,198][09051] Updated weights for policy 0, policy_version 2640 (0.0006) +[2024-08-05 13:38:25,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7577.6). Total num frames: 1376256. Throughput: 0: 7584.4. Samples: 1370152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:25,761][08991] Avg episode reward: [(0, '980.426')] +[2024-08-05 13:38:27,481][09051] Updated weights for policy 0, policy_version 2720 (0.0006) +[2024-08-05 13:38:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7572.1). Total num frames: 1413120. Throughput: 0: 7640.2. Samples: 1415420. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:30,753][08991] Avg episode reward: [(0, '1126.180')] +[2024-08-05 13:38:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002768_1417216.pth... +[2024-08-05 13:38:30,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002312_1183744.pth +[2024-08-05 13:38:30,760][09037] Saving new best policy, reward=1126.180! +[2024-08-05 13:38:33,063][09051] Updated weights for policy 0, policy_version 2800 (0.0007) +[2024-08-05 13:38:35,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7577.6, 300 sec: 7588.4). Total num frames: 1454080. Throughput: 0: 7610.5. Samples: 1437428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:35,753][08991] Avg episode reward: [(0, '1251.603')] +[2024-08-05 13:38:35,754][09037] Saving new best policy, reward=1251.603! +[2024-08-05 13:38:38,365][09051] Updated weights for policy 0, policy_version 2880 (0.0006) +[2024-08-05 13:38:40,753][08991] Fps is (10 sec: 7782.5, 60 sec: 7577.6, 300 sec: 7582.9). Total num frames: 1490944. Throughput: 0: 7664.1. Samples: 1483680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:40,753][08991] Avg episode reward: [(0, '1152.011')] +[2024-08-05 13:38:43,640][09051] Updated weights for policy 0, policy_version 2960 (0.0006) +[2024-08-05 13:38:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7598.1). Total num frames: 1531904. Throughput: 0: 7675.8. Samples: 1530596. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:38:45,753][08991] Avg episode reward: [(0, '1044.714')] +[2024-08-05 13:38:45,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002992_1531904.pth... +[2024-08-05 13:38:45,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002544_1302528.pth +[2024-08-05 13:38:48,763][09051] Updated weights for policy 0, policy_version 3040 (0.0006) +[2024-08-05 13:38:50,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7645.9, 300 sec: 7592.6). Total num frames: 1568768. Throughput: 0: 7682.6. Samples: 1554624. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:50,753][08991] Avg episode reward: [(0, '1175.000')] +[2024-08-05 13:38:54,305][09051] Updated weights for policy 0, policy_version 3120 (0.0007) +[2024-08-05 13:38:55,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7645.9, 300 sec: 7587.4). Total num frames: 1605632. Throughput: 0: 7637.7. Samples: 1599312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:38:55,753][08991] Avg episode reward: [(0, '1236.978')] +[2024-08-05 13:38:59,595][09051] Updated weights for policy 0, policy_version 3200 (0.0008) +[2024-08-05 13:39:00,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7601.4). Total num frames: 1646592. Throughput: 0: 7633.5. Samples: 1644684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:00,753][08991] Avg episode reward: [(0, '1230.603')] +[2024-08-05 13:39:00,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003216_1646592.pth... +[2024-08-05 13:39:00,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002768_1417216.pth +[2024-08-05 13:39:05,209][09051] Updated weights for policy 0, policy_version 3280 (0.0008) +[2024-08-05 13:39:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7596.2). Total num frames: 1683456. Throughput: 0: 7637.2. Samples: 1667404. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:05,753][08991] Avg episode reward: [(0, '1463.834')] +[2024-08-05 13:39:05,753][09037] Saving new best policy, reward=1463.834! +[2024-08-05 13:39:10,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7509.3, 300 sec: 7573.0). Total num frames: 1716224. Throughput: 0: 7558.5. Samples: 1710284. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:10,753][08991] Avg episode reward: [(0, '1359.047')] +[2024-08-05 13:39:10,880][09051] Updated weights for policy 0, policy_version 3360 (0.0008) +[2024-08-05 13:39:15,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7586.5). Total num frames: 1757184. Throughput: 0: 7560.3. Samples: 1755632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:15,753][08991] Avg episode reward: [(0, '1272.310')] +[2024-08-05 13:39:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003432_1757184.pth... +[2024-08-05 13:39:15,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000002992_1531904.pth +[2024-08-05 13:39:16,272][09051] Updated weights for policy 0, policy_version 3440 (0.0007) +[2024-08-05 13:39:20,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7582.0). Total num frames: 1794048. Throughput: 0: 7601.4. Samples: 1779492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:20,753][08991] Avg episode reward: [(0, '1313.050')] +[2024-08-05 13:39:21,647][09051] Updated weights for policy 0, policy_version 3520 (0.0007) +[2024-08-05 13:39:25,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7577.6, 300 sec: 7577.6). Total num frames: 1830912. Throughput: 0: 7533.6. Samples: 1822692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:25,754][08991] Avg episode reward: [(0, '1344.976')] +[2024-08-05 13:39:27,243][09051] Updated weights for policy 0, policy_version 3600 (0.0008) +[2024-08-05 13:39:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7573.4). Total num frames: 1867776. Throughput: 0: 7492.9. Samples: 1867776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:30,761][08991] Avg episode reward: [(0, '1316.095')] +[2024-08-05 13:39:30,764][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003648_1867776.pth... +[2024-08-05 13:39:30,768][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003216_1646592.pth +[2024-08-05 13:39:32,731][09051] Updated weights for policy 0, policy_version 3680 (0.0007) +[2024-08-05 13:39:35,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7509.3, 300 sec: 7569.4). Total num frames: 1904640. Throughput: 0: 7477.2. Samples: 1891096. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:39:35,753][08991] Avg episode reward: [(0, '1381.909')] +[2024-08-05 13:39:38,293][09051] Updated weights for policy 0, policy_version 3760 (0.0009) +[2024-08-05 13:39:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7581.6). Total num frames: 1945600. Throughput: 0: 7488.4. Samples: 1936288. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:39:40,753][08991] Avg episode reward: [(0, '1377.436')] +[2024-08-05 13:39:43,560][09051] Updated weights for policy 0, policy_version 3840 (0.0006) +[2024-08-05 13:39:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7441.1, 300 sec: 7561.8). Total num frames: 1978368. Throughput: 0: 7454.6. Samples: 1980140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:45,753][08991] Avg episode reward: [(0, '1573.722')] +[2024-08-05 13:39:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003864_1978368.pth... +[2024-08-05 13:39:45,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003432_1757184.pth +[2024-08-05 13:39:45,760][09037] Saving new best policy, reward=1573.722! +[2024-08-05 13:39:49,226][09051] Updated weights for policy 0, policy_version 3920 (0.0009) +[2024-08-05 13:39:50,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7509.3, 300 sec: 7573.7). Total num frames: 2019328. Throughput: 0: 7456.4. Samples: 2002944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:39:50,753][08991] Avg episode reward: [(0, '1537.247')] +[2024-08-05 13:39:54,420][09051] Updated weights for policy 0, policy_version 4000 (0.0007) +[2024-08-05 13:39:55,753][08991] Fps is (10 sec: 7782.5, 60 sec: 7509.3, 300 sec: 7570.0). Total num frames: 2056192. Throughput: 0: 7519.3. Samples: 2048652. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:39:55,753][08991] Avg episode reward: [(0, '1582.146')] +[2024-08-05 13:39:55,754][09037] Saving new best policy, reward=1582.146! +[2024-08-05 13:39:59,833][09051] Updated weights for policy 0, policy_version 4080 (0.0007) +[2024-08-05 13:40:00,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7441.1, 300 sec: 7566.4). Total num frames: 2093056. Throughput: 0: 7502.7. Samples: 2093256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:00,753][08991] Avg episode reward: [(0, '1644.108')] +[2024-08-05 13:40:00,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004088_2093056.pth... +[2024-08-05 13:40:00,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003648_1867776.pth +[2024-08-05 13:40:00,761][09037] Saving new best policy, reward=1644.108! +[2024-08-05 13:40:05,261][09051] Updated weights for policy 0, policy_version 4160 (0.0007) +[2024-08-05 13:40:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7441.1, 300 sec: 7563.0). Total num frames: 2129920. Throughput: 0: 7503.1. Samples: 2117132. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:40:05,753][08991] Avg episode reward: [(0, '1691.848')] +[2024-08-05 13:40:05,753][09037] Saving new best policy, reward=1691.848! +[2024-08-05 13:40:10,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7509.3, 300 sec: 7559.6). Total num frames: 2166784. Throughput: 0: 7481.6. Samples: 2159364. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:10,753][08991] Avg episode reward: [(0, '1740.825')] +[2024-08-05 13:40:10,753][09037] Saving new best policy, reward=1740.825! +[2024-08-05 13:40:11,031][09051] Updated weights for policy 0, policy_version 4240 (0.0009) +[2024-08-05 13:40:15,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7441.1, 300 sec: 7556.4). Total num frames: 2203648. Throughput: 0: 7473.4. Samples: 2204080. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:15,754][08991] Avg episode reward: [(0, '2063.769')] +[2024-08-05 13:40:15,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004304_2203648.pth... +[2024-08-05 13:40:15,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000003864_1978368.pth +[2024-08-05 13:40:15,761][09037] Saving new best policy, reward=2063.769! +[2024-08-05 13:40:16,672][09051] Updated weights for policy 0, policy_version 4320 (0.0008) +[2024-08-05 13:40:20,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7372.8, 300 sec: 7539.4). Total num frames: 2236416. Throughput: 0: 7410.0. Samples: 2224544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:20,753][08991] Avg episode reward: [(0, '1891.605')] +[2024-08-05 13:40:22,644][09051] Updated weights for policy 0, policy_version 4400 (0.0009) +[2024-08-05 13:40:25,753][08991] Fps is (10 sec: 6963.3, 60 sec: 7372.8, 300 sec: 7539.4). Total num frames: 2273280. Throughput: 0: 7369.6. Samples: 2267920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:40:25,753][08991] Avg episode reward: [(0, '2030.237')] +[2024-08-05 13:40:28,016][09051] Updated weights for policy 0, policy_version 4480 (0.0007) +[2024-08-05 13:40:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7372.8, 300 sec: 7525.5). Total num frames: 2310144. Throughput: 0: 7367.7. Samples: 2311684. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:40:30,754][08991] Avg episode reward: [(0, '2043.217')] +[2024-08-05 13:40:30,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004512_2310144.pth... +[2024-08-05 13:40:30,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004088_2093056.pth +[2024-08-05 13:40:33,787][09051] Updated weights for policy 0, policy_version 4560 (0.0007) +[2024-08-05 13:40:35,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7372.8, 300 sec: 7511.6). Total num frames: 2347008. Throughput: 0: 7336.9. Samples: 2333104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:35,753][08991] Avg episode reward: [(0, '2171.102')] +[2024-08-05 13:40:35,753][09037] Saving new best policy, reward=2171.102! +[2024-08-05 13:40:39,303][09051] Updated weights for policy 0, policy_version 4640 (0.0007) +[2024-08-05 13:40:40,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7304.5, 300 sec: 7511.6). Total num frames: 2383872. Throughput: 0: 7303.7. Samples: 2377320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:40,753][08991] Avg episode reward: [(0, '2252.493')] +[2024-08-05 13:40:40,761][09037] Saving new best policy, reward=2252.493! +[2024-08-05 13:40:44,898][09051] Updated weights for policy 0, policy_version 4720 (0.0007) +[2024-08-05 13:40:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7372.8, 300 sec: 7511.7). Total num frames: 2420736. Throughput: 0: 7285.9. Samples: 2421120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:45,753][08991] Avg episode reward: [(0, '1872.108')] +[2024-08-05 13:40:45,761][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004728_2420736.pth... +[2024-08-05 13:40:45,766][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004304_2203648.pth +[2024-08-05 13:40:50,157][09051] Updated weights for policy 0, policy_version 4800 (0.0006) +[2024-08-05 13:40:50,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7372.8, 300 sec: 7511.6). Total num frames: 2461696. Throughput: 0: 7294.3. Samples: 2445376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:40:50,754][08991] Avg episode reward: [(0, '1912.748')] +[2024-08-05 13:40:55,697][09051] Updated weights for policy 0, policy_version 4880 (0.0009) +[2024-08-05 13:40:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7372.8, 300 sec: 7525.5). Total num frames: 2498560. Throughput: 0: 7358.4. Samples: 2490492. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:40:55,753][08991] Avg episode reward: [(0, '2046.837')] +[2024-08-05 13:41:00,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7372.8, 300 sec: 7525.5). Total num frames: 2535424. Throughput: 0: 7365.4. Samples: 2535520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:00,753][08991] Avg episode reward: [(0, '2067.505')] +[2024-08-05 13:41:00,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004952_2535424.pth... +[2024-08-05 13:41:00,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004512_2310144.pth +[2024-08-05 13:41:01,057][09051] Updated weights for policy 0, policy_version 4960 (0.0006) +[2024-08-05 13:41:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7372.8, 300 sec: 7511.6). Total num frames: 2572288. Throughput: 0: 7432.5. Samples: 2559008. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:41:05,753][08991] Avg episode reward: [(0, '1769.871')] +[2024-08-05 13:41:06,293][09051] Updated weights for policy 0, policy_version 5040 (0.0007) +[2024-08-05 13:41:10,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7441.1, 300 sec: 7525.5). Total num frames: 2613248. Throughput: 0: 7491.9. Samples: 2605056. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:41:10,753][08991] Avg episode reward: [(0, '1844.286')] +[2024-08-05 13:41:11,667][09051] Updated weights for policy 0, policy_version 5120 (0.0008) +[2024-08-05 13:41:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7441.1, 300 sec: 7525.5). Total num frames: 2650112. Throughput: 0: 7525.5. Samples: 2650332. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:41:15,753][08991] Avg episode reward: [(0, '1868.174')] +[2024-08-05 13:41:15,761][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005176_2650112.pth... +[2024-08-05 13:41:15,765][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004728_2420736.pth +[2024-08-05 13:41:17,013][09051] Updated weights for policy 0, policy_version 5200 (0.0006) +[2024-08-05 13:41:20,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7511.6). Total num frames: 2686976. Throughput: 0: 7590.8. Samples: 2674688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:20,753][08991] Avg episode reward: [(0, '1964.530')] +[2024-08-05 13:41:22,678][09051] Updated weights for policy 0, policy_version 5280 (0.0006) +[2024-08-05 13:41:25,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7511.6). Total num frames: 2723840. Throughput: 0: 7576.1. Samples: 2718244. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:25,753][08991] Avg episode reward: [(0, '2288.056')] +[2024-08-05 13:41:25,754][09037] Saving new best policy, reward=2288.056! +[2024-08-05 13:41:28,017][09051] Updated weights for policy 0, policy_version 5360 (0.0007) +[2024-08-05 13:41:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7497.8). Total num frames: 2760704. Throughput: 0: 7593.3. Samples: 2762820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:30,761][08991] Avg episode reward: [(0, '2616.030')] +[2024-08-05 13:41:30,764][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005392_2760704.pth... +[2024-08-05 13:41:30,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000004952_2535424.pth +[2024-08-05 13:41:30,767][09037] Saving new best policy, reward=2616.030! +[2024-08-05 13:41:33,461][09051] Updated weights for policy 0, policy_version 5440 (0.0006) +[2024-08-05 13:41:35,753][08991] Fps is (10 sec: 7372.5, 60 sec: 7509.3, 300 sec: 7497.7). Total num frames: 2797568. Throughput: 0: 7558.3. Samples: 2785504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:35,754][08991] Avg episode reward: [(0, '2303.005')] +[2024-08-05 13:41:40,212][09051] Updated weights for policy 0, policy_version 5520 (0.0011) +[2024-08-05 13:41:40,753][08991] Fps is (10 sec: 6553.6, 60 sec: 7372.8, 300 sec: 7470.0). Total num frames: 2826240. Throughput: 0: 7370.0. Samples: 2822144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:40,753][08991] Avg episode reward: [(0, '2348.198')] +[2024-08-05 13:41:45,753][08991] Fps is (10 sec: 6553.9, 60 sec: 7372.8, 300 sec: 7470.0). Total num frames: 2863104. Throughput: 0: 7260.0. Samples: 2862220. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:41:45,753][08991] Avg episode reward: [(0, '1946.507')] +[2024-08-05 13:41:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005592_2863104.pth... +[2024-08-05 13:41:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005176_2650112.pth +[2024-08-05 13:41:46,204][09051] Updated weights for policy 0, policy_version 5600 (0.0007) +[2024-08-05 13:41:50,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7304.5, 300 sec: 7470.0). Total num frames: 2899968. Throughput: 0: 7270.6. Samples: 2886184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:50,753][08991] Avg episode reward: [(0, '2623.600')] +[2024-08-05 13:41:50,753][09037] Saving new best policy, reward=2623.600! +[2024-08-05 13:41:51,642][09051] Updated weights for policy 0, policy_version 5680 (0.0006) +[2024-08-05 13:41:55,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7304.5, 300 sec: 7470.0). Total num frames: 2936832. Throughput: 0: 7216.0. Samples: 2929776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:41:55,753][08991] Avg episode reward: [(0, '2891.978')] +[2024-08-05 13:41:55,754][09037] Saving new best policy, reward=2891.978! +[2024-08-05 13:41:57,231][09051] Updated weights for policy 0, policy_version 5760 (0.0007) +[2024-08-05 13:42:00,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7304.5, 300 sec: 7470.0). Total num frames: 2973696. Throughput: 0: 7229.3. Samples: 2975652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:00,753][08991] Avg episode reward: [(0, '2475.122')] +[2024-08-05 13:42:00,760][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005816_2977792.pth... +[2024-08-05 13:42:00,764][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005392_2760704.pth +[2024-08-05 13:42:02,482][09051] Updated weights for policy 0, policy_version 5840 (0.0006) +[2024-08-05 13:42:05,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7372.8, 300 sec: 7483.9). Total num frames: 3014656. Throughput: 0: 7192.0. Samples: 2998328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:05,754][08991] Avg episode reward: [(0, '2143.170')] +[2024-08-05 13:42:07,563][09051] Updated weights for policy 0, policy_version 5920 (0.0006) +[2024-08-05 13:42:10,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7372.8, 300 sec: 7497.8). Total num frames: 3055616. Throughput: 0: 7315.9. Samples: 3047460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:10,753][08991] Avg episode reward: [(0, '2658.916')] +[2024-08-05 13:42:12,889][09051] Updated weights for policy 0, policy_version 6000 (0.0009) +[2024-08-05 13:42:15,753][08991] Fps is (10 sec: 7782.5, 60 sec: 7372.8, 300 sec: 7483.9). Total num frames: 3092480. Throughput: 0: 7327.2. Samples: 3092544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:15,761][08991] Avg episode reward: [(0, '3095.632')] +[2024-08-05 13:42:15,763][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006040_3092480.pth... +[2024-08-05 13:42:15,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005592_2863104.pth +[2024-08-05 13:42:15,768][09037] Saving new best policy, reward=3095.632! +[2024-08-05 13:42:18,075][09051] Updated weights for policy 0, policy_version 6080 (0.0006) +[2024-08-05 13:42:20,757][08991] Fps is (10 sec: 7779.5, 60 sec: 7440.6, 300 sec: 7497.7). Total num frames: 3133440. Throughput: 0: 7367.3. Samples: 3117056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:20,757][08991] Avg episode reward: [(0, '2804.770')] +[2024-08-05 13:42:23,333][09051] Updated weights for policy 0, policy_version 6160 (0.0006) +[2024-08-05 13:42:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7441.1, 300 sec: 7497.8). Total num frames: 3170304. Throughput: 0: 7575.9. Samples: 3163060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:25,753][08991] Avg episode reward: [(0, '2581.300')] +[2024-08-05 13:42:28,463][09051] Updated weights for policy 0, policy_version 6240 (0.0006) +[2024-08-05 13:42:30,753][08991] Fps is (10 sec: 7785.3, 60 sec: 7509.3, 300 sec: 7497.8). Total num frames: 3211264. Throughput: 0: 7756.3. Samples: 3211252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:30,753][08991] Avg episode reward: [(0, '2601.973')] +[2024-08-05 13:42:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006272_3211264.pth... +[2024-08-05 13:42:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000005816_2977792.pth +[2024-08-05 13:42:34,115][09051] Updated weights for policy 0, policy_version 6320 (0.0006) +[2024-08-05 13:42:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7509.4, 300 sec: 7497.8). Total num frames: 3248128. Throughput: 0: 7680.5. Samples: 3231808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:35,753][08991] Avg episode reward: [(0, '2568.568')] +[2024-08-05 13:42:39,484][09051] Updated weights for policy 0, policy_version 6400 (0.0006) +[2024-08-05 13:42:40,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7497.8). Total num frames: 3284992. Throughput: 0: 7713.1. Samples: 3276864. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:42:40,753][08991] Avg episode reward: [(0, '2789.158')] +[2024-08-05 13:42:45,261][09051] Updated weights for policy 0, policy_version 6480 (0.0007) +[2024-08-05 13:42:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7497.8). Total num frames: 3321856. Throughput: 0: 7653.8. Samples: 3320072. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:42:45,753][08991] Avg episode reward: [(0, '3193.978')] +[2024-08-05 13:42:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006488_3321856.pth... +[2024-08-05 13:42:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006040_3092480.pth +[2024-08-05 13:42:45,760][09037] Saving new best policy, reward=3193.978! +[2024-08-05 13:42:50,658][09051] Updated weights for policy 0, policy_version 6560 (0.0006) +[2024-08-05 13:42:50,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7497.8). Total num frames: 3358720. Throughput: 0: 7676.2. Samples: 3343756. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:42:50,753][08991] Avg episode reward: [(0, '3677.676')] +[2024-08-05 13:42:50,754][09037] Saving new best policy, reward=3677.676! +[2024-08-05 13:42:55,679][09051] Updated weights for policy 0, policy_version 6640 (0.0006) +[2024-08-05 13:42:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7511.6). Total num frames: 3399680. Throughput: 0: 7596.7. Samples: 3389312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:42:55,753][08991] Avg episode reward: [(0, '4312.682')] +[2024-08-05 13:42:55,754][09037] Saving new best policy, reward=4312.682! +[2024-08-05 13:43:00,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7497.8). Total num frames: 3436544. Throughput: 0: 7680.4. Samples: 3438164. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:00,753][08991] Avg episode reward: [(0, '4514.232')] +[2024-08-05 13:43:00,778][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006720_3440640.pth... +[2024-08-05 13:43:00,779][09051] Updated weights for policy 0, policy_version 6720 (0.0006) +[2024-08-05 13:43:00,782][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006272_3211264.pth +[2024-08-05 13:43:00,782][09037] Saving new best policy, reward=4514.232! +[2024-08-05 13:43:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7497.8). Total num frames: 3477504. Throughput: 0: 7692.1. Samples: 3463172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:05,753][08991] Avg episode reward: [(0, '3438.080')] +[2024-08-05 13:43:05,891][09051] Updated weights for policy 0, policy_version 6800 (0.0005) +[2024-08-05 13:43:10,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7714.1, 300 sec: 7511.6). Total num frames: 3518464. Throughput: 0: 7701.9. Samples: 3509644. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:43:10,753][08991] Avg episode reward: [(0, '3557.427')] +[2024-08-05 13:43:11,108][09051] Updated weights for policy 0, policy_version 6880 (0.0007) +[2024-08-05 13:43:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7511.6). Total num frames: 3555328. Throughput: 0: 7647.6. Samples: 3555392. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:15,753][08991] Avg episode reward: [(0, '4004.567')] +[2024-08-05 13:43:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006944_3555328.pth... +[2024-08-05 13:43:15,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006488_3321856.pth +[2024-08-05 13:43:16,474][09051] Updated weights for policy 0, policy_version 6960 (0.0007) +[2024-08-05 13:43:20,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.6, 300 sec: 7525.5). Total num frames: 3596288. Throughput: 0: 7735.5. Samples: 3579904. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:43:20,761][08991] Avg episode reward: [(0, '3825.334')] +[2024-08-05 13:43:21,769][09051] Updated weights for policy 0, policy_version 7040 (0.0007) +[2024-08-05 13:43:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7525.5). Total num frames: 3633152. Throughput: 0: 7735.5. Samples: 3624960. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:43:25,761][08991] Avg episode reward: [(0, '3533.390')] +[2024-08-05 13:43:27,120][09051] Updated weights for policy 0, policy_version 7120 (0.0006) +[2024-08-05 13:43:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7511.6). Total num frames: 3670016. Throughput: 0: 7811.8. Samples: 3671604. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:43:30,753][08991] Avg episode reward: [(0, '3079.274')] +[2024-08-05 13:43:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007168_3670016.pth... +[2024-08-05 13:43:30,774][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006720_3440640.pth +[2024-08-05 13:43:33,058][09051] Updated weights for policy 0, policy_version 7200 (0.0006) +[2024-08-05 13:43:35,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7577.6, 300 sec: 7497.8). Total num frames: 3702784. Throughput: 0: 7699.6. Samples: 3690236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:35,753][08991] Avg episode reward: [(0, '2997.296')] +[2024-08-05 13:43:38,704][09051] Updated weights for policy 0, policy_version 7280 (0.0007) +[2024-08-05 13:43:40,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7577.6, 300 sec: 7483.9). Total num frames: 3739648. Throughput: 0: 7610.4. Samples: 3731780. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:43:40,753][08991] Avg episode reward: [(0, '2936.035')] +[2024-08-05 13:43:44,582][09051] Updated weights for policy 0, policy_version 7360 (0.0006) +[2024-08-05 13:43:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7483.9). Total num frames: 3776512. Throughput: 0: 7473.5. Samples: 3774472. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:45,753][08991] Avg episode reward: [(0, '3310.711')] +[2024-08-05 13:43:45,755][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007376_3776512.pth... +[2024-08-05 13:43:45,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000006944_3555328.pth +[2024-08-05 13:43:50,397][09051] Updated weights for policy 0, policy_version 7440 (0.0006) +[2024-08-05 13:43:50,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7509.3, 300 sec: 7470.0). Total num frames: 3809280. Throughput: 0: 7419.1. Samples: 3797032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:50,753][08991] Avg episode reward: [(0, '3816.121')] +[2024-08-05 13:43:55,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7441.1, 300 sec: 7456.1). Total num frames: 3846144. Throughput: 0: 7332.9. Samples: 3839624. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:43:55,753][08991] Avg episode reward: [(0, '3996.459')] +[2024-08-05 13:43:55,842][09051] Updated weights for policy 0, policy_version 7520 (0.0007) +[2024-08-05 13:44:00,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7441.1, 300 sec: 7456.1). Total num frames: 3883008. Throughput: 0: 7347.2. Samples: 3886016. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:44:00,753][08991] Avg episode reward: [(0, '3853.757')] +[2024-08-05 13:44:00,758][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007592_3887104.pth... +[2024-08-05 13:44:00,763][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007168_3670016.pth +[2024-08-05 13:44:01,420][09051] Updated weights for policy 0, policy_version 7600 (0.0007) +[2024-08-05 13:44:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7372.8, 300 sec: 7470.0). Total num frames: 3919872. Throughput: 0: 7283.2. Samples: 3907648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:05,753][08991] Avg episode reward: [(0, '3829.178')] +[2024-08-05 13:44:07,134][09051] Updated weights for policy 0, policy_version 7680 (0.0006) +[2024-08-05 13:44:10,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7236.3, 300 sec: 7442.2). Total num frames: 3952640. Throughput: 0: 7189.9. Samples: 3948504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:10,753][08991] Avg episode reward: [(0, '3847.403')] +[2024-08-05 13:44:13,125][09051] Updated weights for policy 0, policy_version 7760 (0.0006) +[2024-08-05 13:44:15,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7236.3, 300 sec: 7442.2). Total num frames: 3989504. Throughput: 0: 7056.8. Samples: 3989160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:15,753][08991] Avg episode reward: [(0, '4111.104')] +[2024-08-05 13:44:15,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007792_3989504.pth... +[2024-08-05 13:44:15,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007376_3776512.pth +[2024-08-05 13:44:18,910][09051] Updated weights for policy 0, policy_version 7840 (0.0006) +[2024-08-05 13:44:20,753][08991] Fps is (10 sec: 6963.3, 60 sec: 7099.7, 300 sec: 7428.3). Total num frames: 4022272. Throughput: 0: 7142.0. Samples: 4011628. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:44:20,753][08991] Avg episode reward: [(0, '4509.178')] +[2024-08-05 13:44:25,138][09051] Updated weights for policy 0, policy_version 7920 (0.0006) +[2024-08-05 13:44:25,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7099.7, 300 sec: 7428.3). Total num frames: 4059136. Throughput: 0: 7100.8. Samples: 4051316. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:44:25,753][08991] Avg episode reward: [(0, '4530.914')] +[2024-08-05 13:44:25,754][09037] Saving new best policy, reward=4530.914! +[2024-08-05 13:44:30,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7031.5, 300 sec: 7414.5). Total num frames: 4091904. Throughput: 0: 7127.3. Samples: 4095200. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:44:30,753][08991] Avg episode reward: [(0, '4442.380')] +[2024-08-05 13:44:30,755][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007992_4091904.pth... +[2024-08-05 13:44:30,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007592_3887104.pth +[2024-08-05 13:44:30,800][09051] Updated weights for policy 0, policy_version 8000 (0.0006) +[2024-08-05 13:44:35,753][08991] Fps is (10 sec: 6963.3, 60 sec: 7099.7, 300 sec: 7400.6). Total num frames: 4128768. Throughput: 0: 7099.9. Samples: 4116528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:35,761][08991] Avg episode reward: [(0, '4613.819')] +[2024-08-05 13:44:35,761][09037] Saving new best policy, reward=4613.819! +[2024-08-05 13:44:36,445][09051] Updated weights for policy 0, policy_version 8080 (0.0005) +[2024-08-05 13:44:40,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7099.7, 300 sec: 7414.5). Total num frames: 4165632. Throughput: 0: 7068.4. Samples: 4157700. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:40,753][08991] Avg episode reward: [(0, '4440.504')] +[2024-08-05 13:44:42,209][09051] Updated weights for policy 0, policy_version 8160 (0.0007) +[2024-08-05 13:44:45,754][08991] Fps is (10 sec: 7371.9, 60 sec: 7099.6, 300 sec: 7400.5). Total num frames: 4202496. Throughput: 0: 7032.7. Samples: 4202496. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:45,754][08991] Avg episode reward: [(0, '3762.593')] +[2024-08-05 13:44:45,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008208_4202496.pth... +[2024-08-05 13:44:45,762][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007792_3989504.pth +[2024-08-05 13:44:47,983][09051] Updated weights for policy 0, policy_version 8240 (0.0006) +[2024-08-05 13:44:50,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7099.7, 300 sec: 7386.7). Total num frames: 4235264. Throughput: 0: 7008.2. Samples: 4223016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:44:50,753][08991] Avg episode reward: [(0, '3641.731')] +[2024-08-05 13:44:54,459][09051] Updated weights for policy 0, policy_version 8320 (0.0007) +[2024-08-05 13:44:55,753][08991] Fps is (10 sec: 6554.4, 60 sec: 7031.5, 300 sec: 7372.8). Total num frames: 4268032. Throughput: 0: 6921.9. Samples: 4259988. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:44:55,753][08991] Avg episode reward: [(0, '4221.604')] +[2024-08-05 13:45:00,325][09051] Updated weights for policy 0, policy_version 8400 (0.0006) +[2024-08-05 13:45:00,753][08991] Fps is (10 sec: 6553.6, 60 sec: 6963.2, 300 sec: 7358.9). Total num frames: 4300800. Throughput: 0: 6954.0. Samples: 4302092. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:45:00,753][08991] Avg episode reward: [(0, '4130.286')] +[2024-08-05 13:45:00,760][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008400_4300800.pth... +[2024-08-05 13:45:00,765][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000007992_4091904.pth +[2024-08-05 13:45:05,753][08991] Fps is (10 sec: 6963.2, 60 sec: 6963.2, 300 sec: 7358.9). Total num frames: 4337664. Throughput: 0: 6939.7. Samples: 4323916. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:45:05,753][08991] Avg episode reward: [(0, '4114.237')] +[2024-08-05 13:45:06,216][09051] Updated weights for policy 0, policy_version 8480 (0.0006) +[2024-08-05 13:45:10,758][08991] Fps is (10 sec: 6959.8, 60 sec: 6962.6, 300 sec: 7344.9). Total num frames: 4370432. Throughput: 0: 6958.1. Samples: 4364464. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:45:10,758][08991] Avg episode reward: [(0, '4004.755')] +[2024-08-05 13:45:12,149][09051] Updated weights for policy 0, policy_version 8560 (0.0006) +[2024-08-05 13:45:15,753][08991] Fps is (10 sec: 6963.2, 60 sec: 6963.2, 300 sec: 7358.9). Total num frames: 4407296. Throughput: 0: 6903.5. Samples: 4405856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:45:15,753][08991] Avg episode reward: [(0, '3754.284')] +[2024-08-05 13:45:15,758][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008608_4407296.pth... +[2024-08-05 13:45:15,762][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008208_4202496.pth +[2024-08-05 13:45:18,140][09051] Updated weights for policy 0, policy_version 8640 (0.0006) +[2024-08-05 13:45:20,753][08991] Fps is (10 sec: 6966.6, 60 sec: 6963.2, 300 sec: 7345.0). Total num frames: 4440064. Throughput: 0: 6893.6. Samples: 4426740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:45:20,753][08991] Avg episode reward: [(0, '3457.157')] +[2024-08-05 13:45:23,567][09051] Updated weights for policy 0, policy_version 8720 (0.0006) +[2024-08-05 13:45:25,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7031.5, 300 sec: 7358.9). Total num frames: 4481024. Throughput: 0: 6981.2. Samples: 4471856. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:45:25,753][08991] Avg episode reward: [(0, '3539.958')] +[2024-08-05 13:45:29,536][09051] Updated weights for policy 0, policy_version 8800 (0.0006) +[2024-08-05 13:45:30,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7031.5, 300 sec: 7345.0). Total num frames: 4513792. Throughput: 0: 6918.8. Samples: 4513832. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:45:30,753][08991] Avg episode reward: [(0, '3314.267')] +[2024-08-05 13:45:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008816_4513792.pth... +[2024-08-05 13:45:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008400_4300800.pth +[2024-08-05 13:45:34,802][09051] Updated weights for policy 0, policy_version 8880 (0.0007) +[2024-08-05 13:45:35,753][08991] Fps is (10 sec: 6963.0, 60 sec: 7031.4, 300 sec: 7345.0). Total num frames: 4550656. Throughput: 0: 7006.5. Samples: 4538308. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:45:35,753][08991] Avg episode reward: [(0, '3376.978')] +[2024-08-05 13:45:39,918][09051] Updated weights for policy 0, policy_version 8960 (0.0006) +[2024-08-05 13:45:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7099.7, 300 sec: 7358.9). Total num frames: 4591616. Throughput: 0: 7200.4. Samples: 4584004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:45:40,753][08991] Avg episode reward: [(0, '4074.879')] +[2024-08-05 13:45:45,211][09051] Updated weights for policy 0, policy_version 9040 (0.0006) +[2024-08-05 13:45:45,753][08991] Fps is (10 sec: 8192.1, 60 sec: 7168.1, 300 sec: 7358.9). Total num frames: 4632576. Throughput: 0: 7314.8. Samples: 4631256. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:45:45,753][08991] Avg episode reward: [(0, '4215.490')] +[2024-08-05 13:45:45,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009048_4632576.pth... +[2024-08-05 13:45:45,762][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008608_4407296.pth +[2024-08-05 13:45:50,495][09051] Updated weights for policy 0, policy_version 9120 (0.0006) +[2024-08-05 13:45:50,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7236.3, 300 sec: 7358.9). Total num frames: 4669440. Throughput: 0: 7316.9. Samples: 4653176. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:45:50,753][08991] Avg episode reward: [(0, '4078.332')] +[2024-08-05 13:45:55,591][09051] Updated weights for policy 0, policy_version 9200 (0.0006) +[2024-08-05 13:45:55,753][08991] Fps is (10 sec: 7782.5, 60 sec: 7372.8, 300 sec: 7372.8). Total num frames: 4710400. Throughput: 0: 7495.8. Samples: 4701736. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:45:55,753][08991] Avg episode reward: [(0, '3833.345')] +[2024-08-05 13:46:00,753][08991] Fps is (10 sec: 7782.2, 60 sec: 7441.0, 300 sec: 7372.8). Total num frames: 4747264. Throughput: 0: 7628.6. Samples: 4749144. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:46:00,753][08991] Avg episode reward: [(0, '3974.553')] +[2024-08-05 13:46:00,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009272_4747264.pth... +[2024-08-05 13:46:00,778][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000008816_4513792.pth +[2024-08-05 13:46:00,825][09051] Updated weights for policy 0, policy_version 9280 (0.0006) +[2024-08-05 13:46:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7509.3, 300 sec: 7372.8). Total num frames: 4788224. Throughput: 0: 7696.3. Samples: 4773072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:46:05,761][08991] Avg episode reward: [(0, '4108.781')] +[2024-08-05 13:46:05,985][09051] Updated weights for policy 0, policy_version 9360 (0.0006) +[2024-08-05 13:46:10,753][08991] Fps is (10 sec: 8192.2, 60 sec: 7646.5, 300 sec: 7386.7). Total num frames: 4829184. Throughput: 0: 7756.5. Samples: 4820900. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:46:10,753][08991] Avg episode reward: [(0, '3257.051')] +[2024-08-05 13:46:11,122][09051] Updated weights for policy 0, policy_version 9440 (0.0006) +[2024-08-05 13:46:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7386.7). Total num frames: 4866048. Throughput: 0: 7856.6. Samples: 4867380. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:46:15,761][08991] Avg episode reward: [(0, '2939.792')] +[2024-08-05 13:46:15,763][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009504_4866048.pth... +[2024-08-05 13:46:15,784][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009048_4632576.pth +[2024-08-05 13:46:16,386][09051] Updated weights for policy 0, policy_version 9520 (0.0006) +[2024-08-05 13:46:20,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7400.6). Total num frames: 4907008. Throughput: 0: 7871.0. Samples: 4892500. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:46:20,753][08991] Avg episode reward: [(0, '3738.869')] +[2024-08-05 13:46:21,316][09051] Updated weights for policy 0, policy_version 9600 (0.0005) +[2024-08-05 13:46:25,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7782.4, 300 sec: 7414.5). Total num frames: 4947968. Throughput: 0: 7920.3. Samples: 4940416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:46:25,753][08991] Avg episode reward: [(0, '4271.332')] +[2024-08-05 13:46:26,439][09051] Updated weights for policy 0, policy_version 9680 (0.0006) +[2024-08-05 13:46:30,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7918.9, 300 sec: 7428.4). Total num frames: 4988928. Throughput: 0: 7916.4. Samples: 4987492. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:46:30,761][08991] Avg episode reward: [(0, '4207.934')] +[2024-08-05 13:46:30,764][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009744_4988928.pth... +[2024-08-05 13:46:30,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009272_4747264.pth +[2024-08-05 13:46:31,719][09051] Updated weights for policy 0, policy_version 9760 (0.0006) +[2024-08-05 13:46:35,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7987.2, 300 sec: 7470.0). Total num frames: 5029888. Throughput: 0: 7977.4. Samples: 5012160. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:46:35,753][08991] Avg episode reward: [(0, '4211.184')] +[2024-08-05 13:46:36,685][09051] Updated weights for policy 0, policy_version 9840 (0.0005) +[2024-08-05 13:46:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7470.0). Total num frames: 5066752. Throughput: 0: 7964.9. Samples: 5060156. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:46:40,753][08991] Avg episode reward: [(0, '4451.047')] +[2024-08-05 13:46:42,008][09051] Updated weights for policy 0, policy_version 9920 (0.0007) +[2024-08-05 13:46:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7483.9). Total num frames: 5107712. Throughput: 0: 7969.1. Samples: 5107752. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:46:45,753][08991] Avg episode reward: [(0, '4691.275')] +[2024-08-05 13:46:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009976_5107712.pth... +[2024-08-05 13:46:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009504_4866048.pth +[2024-08-05 13:46:45,760][09037] Saving new best policy, reward=4691.275! +[2024-08-05 13:46:47,064][09051] Updated weights for policy 0, policy_version 10000 (0.0005) +[2024-08-05 13:46:50,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7987.2, 300 sec: 7497.8). Total num frames: 5148672. Throughput: 0: 7979.0. Samples: 5132128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:46:50,753][08991] Avg episode reward: [(0, '4894.655')] +[2024-08-05 13:46:50,753][09037] Saving new best policy, reward=4894.655! +[2024-08-05 13:46:52,325][09051] Updated weights for policy 0, policy_version 10080 (0.0006) +[2024-08-05 13:46:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7497.8). Total num frames: 5185536. Throughput: 0: 7931.2. Samples: 5177804. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:46:55,753][08991] Avg episode reward: [(0, '4875.737')] +[2024-08-05 13:46:57,595][09051] Updated weights for policy 0, policy_version 10160 (0.0006) +[2024-08-05 13:47:00,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7987.2, 300 sec: 7497.8). Total num frames: 5226496. Throughput: 0: 7946.1. Samples: 5224956. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:00,753][08991] Avg episode reward: [(0, '4808.848')] +[2024-08-05 13:47:00,759][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010208_5226496.pth... +[2024-08-05 13:47:00,763][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009744_4988928.pth +[2024-08-05 13:47:02,847][09051] Updated weights for policy 0, policy_version 10240 (0.0006) +[2024-08-05 13:47:05,754][08991] Fps is (10 sec: 7782.0, 60 sec: 7918.9, 300 sec: 7483.9). Total num frames: 5263360. Throughput: 0: 7902.8. Samples: 5248132. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:47:05,754][08991] Avg episode reward: [(0, '4923.704')] +[2024-08-05 13:47:05,754][09037] Saving new best policy, reward=4923.704! +[2024-08-05 13:47:07,975][09051] Updated weights for policy 0, policy_version 10320 (0.0006) +[2024-08-05 13:47:10,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7497.8). Total num frames: 5304320. Throughput: 0: 7917.0. Samples: 5296680. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:47:10,753][08991] Avg episode reward: [(0, '4465.795')] +[2024-08-05 13:47:13,157][09051] Updated weights for policy 0, policy_version 10400 (0.0006) +[2024-08-05 13:47:15,753][08991] Fps is (10 sec: 7782.8, 60 sec: 7918.9, 300 sec: 7484.0). Total num frames: 5341184. Throughput: 0: 7912.3. Samples: 5343544. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:47:15,753][08991] Avg episode reward: [(0, '4028.719')] +[2024-08-05 13:47:15,765][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010440_5345280.pth... +[2024-08-05 13:47:15,768][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000009976_5107712.pth +[2024-08-05 13:47:18,329][09051] Updated weights for policy 0, policy_version 10480 (0.0007) +[2024-08-05 13:47:20,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7497.8). Total num frames: 5382144. Throughput: 0: 7878.5. Samples: 5366692. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:47:20,753][08991] Avg episode reward: [(0, '4201.696')] +[2024-08-05 13:47:23,689][09051] Updated weights for policy 0, policy_version 10560 (0.0006) +[2024-08-05 13:47:25,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7918.9, 300 sec: 7497.8). Total num frames: 5423104. Throughput: 0: 7847.8. Samples: 5413308. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:47:25,753][08991] Avg episode reward: [(0, '4603.793')] +[2024-08-05 13:47:28,969][09051] Updated weights for policy 0, policy_version 10640 (0.0006) +[2024-08-05 13:47:30,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7497.8). Total num frames: 5459968. Throughput: 0: 7827.5. Samples: 5459988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:30,753][08991] Avg episode reward: [(0, '4882.477')] +[2024-08-05 13:47:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010664_5459968.pth... +[2024-08-05 13:47:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010208_5226496.pth +[2024-08-05 13:47:34,070][09051] Updated weights for policy 0, policy_version 10720 (0.0006) +[2024-08-05 13:47:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7511.6). Total num frames: 5500928. Throughput: 0: 7832.9. Samples: 5484608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:35,753][08991] Avg episode reward: [(0, '4988.890')] +[2024-08-05 13:47:35,753][09037] Saving new best policy, reward=4988.890! +[2024-08-05 13:47:39,347][09051] Updated weights for policy 0, policy_version 10800 (0.0006) +[2024-08-05 13:47:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7511.6). Total num frames: 5537792. Throughput: 0: 7848.7. Samples: 5530996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:40,753][08991] Avg episode reward: [(0, '4515.422')] +[2024-08-05 13:47:44,469][09051] Updated weights for policy 0, policy_version 10880 (0.0006) +[2024-08-05 13:47:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7525.5). Total num frames: 5578752. Throughput: 0: 7862.0. Samples: 5578748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:45,753][08991] Avg episode reward: [(0, '3552.592')] +[2024-08-05 13:47:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010896_5578752.pth... +[2024-08-05 13:47:45,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010440_5345280.pth +[2024-08-05 13:47:49,920][09051] Updated weights for policy 0, policy_version 10960 (0.0007) +[2024-08-05 13:47:50,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7511.6). Total num frames: 5615616. Throughput: 0: 7850.7. Samples: 5601408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:50,753][08991] Avg episode reward: [(0, '3257.721')] +[2024-08-05 13:47:55,169][09051] Updated weights for policy 0, policy_version 11040 (0.0006) +[2024-08-05 13:47:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7525.5). Total num frames: 5656576. Throughput: 0: 7809.6. Samples: 5648112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:47:55,753][08991] Avg episode reward: [(0, '3444.947')] +[2024-08-05 13:48:00,227][09051] Updated weights for policy 0, policy_version 11120 (0.0005) +[2024-08-05 13:48:00,753][08991] Fps is (10 sec: 8191.8, 60 sec: 7850.6, 300 sec: 7525.5). Total num frames: 5697536. Throughput: 0: 7834.5. Samples: 5696096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:48:00,754][08991] Avg episode reward: [(0, '2990.508')] +[2024-08-05 13:48:00,759][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011128_5697536.pth... +[2024-08-05 13:48:00,765][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010664_5459968.pth +[2024-08-05 13:48:05,611][09051] Updated weights for policy 0, policy_version 11200 (0.0006) +[2024-08-05 13:48:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7511.6). Total num frames: 5734400. Throughput: 0: 7807.9. Samples: 5718048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:48:05,753][08991] Avg episode reward: [(0, '3579.157')] +[2024-08-05 13:48:10,737][09051] Updated weights for policy 0, policy_version 11280 (0.0005) +[2024-08-05 13:48:10,753][08991] Fps is (10 sec: 7782.6, 60 sec: 7850.7, 300 sec: 7525.5). Total num frames: 5775360. Throughput: 0: 7821.1. Samples: 5765256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:48:10,753][08991] Avg episode reward: [(0, '3621.586')] +[2024-08-05 13:48:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7511.6). Total num frames: 5812224. Throughput: 0: 7825.8. Samples: 5812148. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:48:15,761][08991] Avg episode reward: [(0, '3499.154')] +[2024-08-05 13:48:15,763][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011352_5812224.pth... +[2024-08-05 13:48:15,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000010896_5578752.pth +[2024-08-05 13:48:16,216][09051] Updated weights for policy 0, policy_version 11360 (0.0006) +[2024-08-05 13:48:20,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7714.1, 300 sec: 7497.8). Total num frames: 5844992. Throughput: 0: 7736.3. Samples: 5832744. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:48:20,753][08991] Avg episode reward: [(0, '3861.142')] +[2024-08-05 13:48:22,129][09051] Updated weights for policy 0, policy_version 11440 (0.0007) +[2024-08-05 13:48:25,753][08991] Fps is (10 sec: 6963.2, 60 sec: 7645.9, 300 sec: 7497.8). Total num frames: 5881856. Throughput: 0: 7625.5. Samples: 5874144. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:48:25,753][08991] Avg episode reward: [(0, '3867.704')] +[2024-08-05 13:48:27,565][09051] Updated weights for policy 0, policy_version 11520 (0.0006) +[2024-08-05 13:48:30,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7511.6). Total num frames: 5918720. Throughput: 0: 7589.9. Samples: 5920292. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:48:30,761][08991] Avg episode reward: [(0, '4226.685')] +[2024-08-05 13:48:30,763][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011560_5918720.pth... +[2024-08-05 13:48:30,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011128_5697536.pth +[2024-08-05 13:48:32,968][09051] Updated weights for policy 0, policy_version 11600 (0.0007) +[2024-08-05 13:48:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7525.5). Total num frames: 5959680. Throughput: 0: 7598.9. Samples: 5943360. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:48:35,753][08991] Avg episode reward: [(0, '4442.479')] +[2024-08-05 13:48:38,298][09051] Updated weights for policy 0, policy_version 11680 (0.0006) +[2024-08-05 13:48:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7525.5). Total num frames: 5996544. Throughput: 0: 7569.6. Samples: 5988744. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:48:40,753][08991] Avg episode reward: [(0, '4448.572')] +[2024-08-05 13:48:43,867][09051] Updated weights for policy 0, policy_version 11760 (0.0007) +[2024-08-05 13:48:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7539.4). Total num frames: 6033408. Throughput: 0: 7503.4. Samples: 6033748. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:48:45,753][08991] Avg episode reward: [(0, '4583.182')] +[2024-08-05 13:48:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011784_6033408.pth... +[2024-08-05 13:48:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011352_5812224.pth +[2024-08-05 13:48:49,057][09051] Updated weights for policy 0, policy_version 11840 (0.0006) +[2024-08-05 13:48:50,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7553.3). Total num frames: 6074368. Throughput: 0: 7556.5. Samples: 6058088. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:48:50,753][08991] Avg episode reward: [(0, '4255.258')] +[2024-08-05 13:48:54,284][09051] Updated weights for policy 0, policy_version 11920 (0.0005) +[2024-08-05 13:48:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7553.3). Total num frames: 6111232. Throughput: 0: 7551.0. Samples: 6105052. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:48:55,753][08991] Avg episode reward: [(0, '4127.886')] +[2024-08-05 13:48:59,728][09051] Updated weights for policy 0, policy_version 12000 (0.0006) +[2024-08-05 13:49:00,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.4, 300 sec: 7553.3). Total num frames: 6148096. Throughput: 0: 7507.8. Samples: 6150000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:00,753][08991] Avg episode reward: [(0, '4184.683')] +[2024-08-05 13:49:00,793][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012016_6152192.pth... +[2024-08-05 13:49:00,798][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011560_5918720.pth +[2024-08-05 13:49:05,376][09051] Updated weights for policy 0, policy_version 12080 (0.0007) +[2024-08-05 13:49:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7567.2). Total num frames: 6184960. Throughput: 0: 7546.3. Samples: 6172328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:05,753][08991] Avg episode reward: [(0, '4420.992')] +[2024-08-05 13:49:10,568][09051] Updated weights for policy 0, policy_version 12160 (0.0006) +[2024-08-05 13:49:10,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7509.3, 300 sec: 7581.1). Total num frames: 6225920. Throughput: 0: 7635.2. Samples: 6217728. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:49:10,753][08991] Avg episode reward: [(0, '4383.646')] +[2024-08-05 13:49:15,710][09051] Updated weights for policy 0, policy_version 12240 (0.0006) +[2024-08-05 13:49:15,753][08991] Fps is (10 sec: 8191.8, 60 sec: 7577.6, 300 sec: 7608.8). Total num frames: 6266880. Throughput: 0: 7674.2. Samples: 6265632. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:49:15,754][08991] Avg episode reward: [(0, '4403.821')] +[2024-08-05 13:49:15,761][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012240_6266880.pth... +[2024-08-05 13:49:15,768][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000011784_6033408.pth +[2024-08-05 13:49:20,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7595.0). Total num frames: 6299648. Throughput: 0: 7644.4. Samples: 6287360. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:20,753][08991] Avg episode reward: [(0, '4807.834')] +[2024-08-05 13:49:21,415][09051] Updated weights for policy 0, policy_version 12320 (0.0006) +[2024-08-05 13:49:25,753][08991] Fps is (10 sec: 7373.0, 60 sec: 7645.9, 300 sec: 7622.7). Total num frames: 6340608. Throughput: 0: 7594.0. Samples: 6330472. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:49:25,761][08991] Avg episode reward: [(0, '4744.817')] +[2024-08-05 13:49:26,729][09051] Updated weights for policy 0, policy_version 12400 (0.0006) +[2024-08-05 13:49:30,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7622.7). Total num frames: 6377472. Throughput: 0: 7641.7. Samples: 6377624. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:49:30,753][08991] Avg episode reward: [(0, '4391.398')] +[2024-08-05 13:49:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012456_6377472.pth... +[2024-08-05 13:49:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012016_6152192.pth +[2024-08-05 13:49:32,042][09051] Updated weights for policy 0, policy_version 12480 (0.0006) +[2024-08-05 13:49:35,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7645.9, 300 sec: 7636.6). Total num frames: 6418432. Throughput: 0: 7643.5. Samples: 6402048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:35,753][08991] Avg episode reward: [(0, '4515.727')] +[2024-08-05 13:49:37,276][09051] Updated weights for policy 0, policy_version 12560 (0.0006) +[2024-08-05 13:49:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7636.6). Total num frames: 6455296. Throughput: 0: 7603.8. Samples: 6447224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:40,753][08991] Avg episode reward: [(0, '4686.942')] +[2024-08-05 13:49:42,748][09051] Updated weights for policy 0, policy_version 12640 (0.0007) +[2024-08-05 13:49:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7650.5). Total num frames: 6492160. Throughput: 0: 7638.3. Samples: 6493724. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:45,753][08991] Avg episode reward: [(0, '4887.560')] +[2024-08-05 13:49:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012680_6492160.pth... +[2024-08-05 13:49:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012240_6266880.pth +[2024-08-05 13:49:47,809][09051] Updated weights for policy 0, policy_version 12720 (0.0006) +[2024-08-05 13:49:50,753][08991] Fps is (10 sec: 7782.5, 60 sec: 7645.9, 300 sec: 7678.3). Total num frames: 6533120. Throughput: 0: 7686.2. Samples: 6518208. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:50,753][08991] Avg episode reward: [(0, '5134.607')] +[2024-08-05 13:49:50,753][09037] Saving new best policy, reward=5134.607! +[2024-08-05 13:49:53,114][09051] Updated weights for policy 0, policy_version 12800 (0.0007) +[2024-08-05 13:49:55,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7714.1, 300 sec: 7706.0). Total num frames: 6574080. Throughput: 0: 7736.9. Samples: 6565888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:49:55,753][08991] Avg episode reward: [(0, '5286.773')] +[2024-08-05 13:49:55,754][09037] Saving new best policy, reward=5286.773! +[2024-08-05 13:49:58,321][09051] Updated weights for policy 0, policy_version 12880 (0.0006) +[2024-08-05 13:50:00,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7706.0). Total num frames: 6610944. Throughput: 0: 7689.8. Samples: 6611672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:00,753][08991] Avg episode reward: [(0, '5176.907')] +[2024-08-05 13:50:00,759][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012912_6610944.pth... +[2024-08-05 13:50:00,763][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012456_6377472.pth +[2024-08-05 13:50:03,542][09051] Updated weights for policy 0, policy_version 12960 (0.0006) +[2024-08-05 13:50:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7733.9). Total num frames: 6651904. Throughput: 0: 7737.9. Samples: 6635564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:05,753][08991] Avg episode reward: [(0, '5231.405')] +[2024-08-05 13:50:08,876][09051] Updated weights for policy 0, policy_version 13040 (0.0006) +[2024-08-05 13:50:10,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7733.8). Total num frames: 6688768. Throughput: 0: 7779.7. Samples: 6680560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:10,753][08991] Avg episode reward: [(0, '5173.419')] +[2024-08-05 13:50:14,313][09051] Updated weights for policy 0, policy_version 13120 (0.0006) +[2024-08-05 13:50:15,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7747.7). Total num frames: 6725632. Throughput: 0: 7757.3. Samples: 6726704. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:15,753][08991] Avg episode reward: [(0, '5163.302')] +[2024-08-05 13:50:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013136_6725632.pth... +[2024-08-05 13:50:15,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012680_6492160.pth +[2024-08-05 13:50:19,776][09051] Updated weights for policy 0, policy_version 13200 (0.0007) +[2024-08-05 13:50:20,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7714.1, 300 sec: 7733.8). Total num frames: 6762496. Throughput: 0: 7728.6. Samples: 6749836. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:20,753][08991] Avg episode reward: [(0, '5179.579')] +[2024-08-05 13:50:25,205][09051] Updated weights for policy 0, policy_version 13280 (0.0006) +[2024-08-05 13:50:25,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7714.1, 300 sec: 7761.6). Total num frames: 6803456. Throughput: 0: 7691.1. Samples: 6793324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:25,754][08991] Avg episode reward: [(0, '5063.643')] +[2024-08-05 13:50:30,592][09051] Updated weights for policy 0, policy_version 13360 (0.0006) +[2024-08-05 13:50:30,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7761.6). Total num frames: 6840320. Throughput: 0: 7693.2. Samples: 6839920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:50:30,753][08991] Avg episode reward: [(0, '5048.469')] +[2024-08-05 13:50:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013360_6840320.pth... +[2024-08-05 13:50:30,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000012912_6610944.pth +[2024-08-05 13:50:35,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7645.9, 300 sec: 7747.7). Total num frames: 6877184. Throughput: 0: 7630.2. Samples: 6861568. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:50:35,753][08991] Avg episode reward: [(0, '5397.559')] +[2024-08-05 13:50:35,754][09037] Saving new best policy, reward=5397.559! +[2024-08-05 13:50:36,278][09051] Updated weights for policy 0, policy_version 13440 (0.0006) +[2024-08-05 13:50:40,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7733.8). Total num frames: 6914048. Throughput: 0: 7549.6. Samples: 6905620. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:50:40,753][08991] Avg episode reward: [(0, '5780.331')] +[2024-08-05 13:50:40,754][09037] Saving new best policy, reward=5780.331! +[2024-08-05 13:50:41,797][09051] Updated weights for policy 0, policy_version 13520 (0.0007) +[2024-08-05 13:50:45,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7733.8). Total num frames: 6950912. Throughput: 0: 7538.4. Samples: 6950900. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:50:45,753][08991] Avg episode reward: [(0, '5444.236')] +[2024-08-05 13:50:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013576_6950912.pth... +[2024-08-05 13:50:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013136_6725632.pth +[2024-08-05 13:50:47,112][09051] Updated weights for policy 0, policy_version 13600 (0.0006) +[2024-08-05 13:50:50,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7719.9). Total num frames: 6987776. Throughput: 0: 7487.2. Samples: 6972488. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:50:50,753][08991] Avg episode reward: [(0, '5174.440')] +[2024-08-05 13:50:52,819][09051] Updated weights for policy 0, policy_version 13680 (0.0007) +[2024-08-05 13:50:55,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7719.9). Total num frames: 7024640. Throughput: 0: 7471.5. Samples: 7016776. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:50:55,753][08991] Avg episode reward: [(0, '5199.096')] +[2024-08-05 13:50:58,276][09051] Updated weights for policy 0, policy_version 13760 (0.0006) +[2024-08-05 13:51:00,758][08991] Fps is (10 sec: 7369.3, 60 sec: 7508.7, 300 sec: 7705.9). Total num frames: 7061504. Throughput: 0: 7455.5. Samples: 7062236. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:51:00,758][08991] Avg episode reward: [(0, '5508.474')] +[2024-08-05 13:51:00,761][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013792_7061504.pth... +[2024-08-05 13:51:00,765][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013360_6840320.pth +[2024-08-05 13:51:03,599][09051] Updated weights for policy 0, policy_version 13840 (0.0006) +[2024-08-05 13:51:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7441.1, 300 sec: 7692.1). Total num frames: 7098368. Throughput: 0: 7461.9. Samples: 7085624. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:51:05,753][08991] Avg episode reward: [(0, '5847.801')] +[2024-08-05 13:51:05,754][09037] Saving new best policy, reward=5847.801! +[2024-08-05 13:51:09,007][09051] Updated weights for policy 0, policy_version 13920 (0.0006) +[2024-08-05 13:51:10,753][08991] Fps is (10 sec: 7786.2, 60 sec: 7509.3, 300 sec: 7706.0). Total num frames: 7139328. Throughput: 0: 7506.6. Samples: 7131120. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:51:10,760][08991] Avg episode reward: [(0, '6020.223')] +[2024-08-05 13:51:10,760][09037] Saving new best policy, reward=6020.223! +[2024-08-05 13:51:14,133][09051] Updated weights for policy 0, policy_version 14000 (0.0005) +[2024-08-05 13:51:15,753][08991] Fps is (10 sec: 8191.9, 60 sec: 7577.6, 300 sec: 7706.0). Total num frames: 7180288. Throughput: 0: 7542.4. Samples: 7179328. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:51:15,754][08991] Avg episode reward: [(0, '5939.981')] +[2024-08-05 13:51:15,760][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014024_7180288.pth... +[2024-08-05 13:51:15,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013576_6950912.pth +[2024-08-05 13:51:19,438][09051] Updated weights for policy 0, policy_version 14080 (0.0006) +[2024-08-05 13:51:20,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7692.1). Total num frames: 7217152. Throughput: 0: 7573.8. Samples: 7202388. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:51:20,753][08991] Avg episode reward: [(0, '5633.675')] +[2024-08-05 13:51:24,655][09051] Updated weights for policy 0, policy_version 14160 (0.0006) +[2024-08-05 13:51:25,753][08991] Fps is (10 sec: 7782.6, 60 sec: 7577.6, 300 sec: 7692.1). Total num frames: 7258112. Throughput: 0: 7634.3. Samples: 7249164. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:51:25,753][08991] Avg episode reward: [(0, '5598.298')] +[2024-08-05 13:51:29,869][09051] Updated weights for policy 0, policy_version 14240 (0.0006) +[2024-08-05 13:51:30,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7678.3). Total num frames: 7294976. Throughput: 0: 7678.0. Samples: 7296412. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:51:30,753][08991] Avg episode reward: [(0, '5921.741')] +[2024-08-05 13:51:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014248_7294976.pth... +[2024-08-05 13:51:30,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000013792_7061504.pth +[2024-08-05 13:51:35,342][09051] Updated weights for policy 0, policy_version 14320 (0.0007) +[2024-08-05 13:51:35,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7577.6, 300 sec: 7678.3). Total num frames: 7331840. Throughput: 0: 7704.9. Samples: 7319208. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2024-08-05 13:51:35,753][08991] Avg episode reward: [(0, '5903.308')] +[2024-08-05 13:51:40,606][09051] Updated weights for policy 0, policy_version 14400 (0.0005) +[2024-08-05 13:51:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7678.3). Total num frames: 7372800. Throughput: 0: 7729.6. Samples: 7364608. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:51:40,753][08991] Avg episode reward: [(0, '5571.153')] +[2024-08-05 13:51:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7645.9, 300 sec: 7664.4). Total num frames: 7409664. Throughput: 0: 7731.9. Samples: 7410136. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2024-08-05 13:51:45,753][08991] Avg episode reward: [(0, '5239.750')] +[2024-08-05 13:51:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014472_7409664.pth... +[2024-08-05 13:51:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014024_7180288.pth +[2024-08-05 13:51:45,948][09051] Updated weights for policy 0, policy_version 14480 (0.0007) +[2024-08-05 13:51:50,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7678.3). Total num frames: 7450624. Throughput: 0: 7765.4. Samples: 7435068. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:51:50,753][08991] Avg episode reward: [(0, '5127.385')] +[2024-08-05 13:51:51,129][09051] Updated weights for policy 0, policy_version 14560 (0.0006) +[2024-08-05 13:51:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7664.4). Total num frames: 7487488. Throughput: 0: 7777.9. Samples: 7481128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:51:55,753][08991] Avg episode reward: [(0, '5395.285')] +[2024-08-05 13:51:56,279][09051] Updated weights for policy 0, policy_version 14640 (0.0006) +[2024-08-05 13:52:00,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7783.0, 300 sec: 7678.3). Total num frames: 7528448. Throughput: 0: 7744.9. Samples: 7527848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:00,761][08991] Avg episode reward: [(0, '5423.914')] +[2024-08-05 13:52:00,764][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014704_7528448.pth... +[2024-08-05 13:52:00,768][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014248_7294976.pth +[2024-08-05 13:52:01,611][09051] Updated weights for policy 0, policy_version 14720 (0.0006) +[2024-08-05 13:52:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7664.4). Total num frames: 7565312. Throughput: 0: 7770.9. Samples: 7552080. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:05,753][08991] Avg episode reward: [(0, '4991.328')] +[2024-08-05 13:52:06,825][09051] Updated weights for policy 0, policy_version 14800 (0.0006) +[2024-08-05 13:52:10,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7678.3). Total num frames: 7606272. Throughput: 0: 7776.2. Samples: 7599092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:10,753][08991] Avg episode reward: [(0, '5268.265')] +[2024-08-05 13:52:12,046][09051] Updated weights for policy 0, policy_version 14880 (0.0006) +[2024-08-05 13:52:15,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7782.4, 300 sec: 7678.3). Total num frames: 7647232. Throughput: 0: 7796.0. Samples: 7647232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:15,753][08991] Avg episode reward: [(0, '5631.864')] +[2024-08-05 13:52:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014936_7647232.pth... +[2024-08-05 13:52:15,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014472_7409664.pth +[2024-08-05 13:52:17,031][09051] Updated weights for policy 0, policy_version 14960 (0.0005) +[2024-08-05 13:52:20,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7850.7, 300 sec: 7678.3). Total num frames: 7688192. Throughput: 0: 7836.4. Samples: 7671848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:20,753][08991] Avg episode reward: [(0, '5674.320')] +[2024-08-05 13:52:22,310][09051] Updated weights for policy 0, policy_version 15040 (0.0006) +[2024-08-05 13:52:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7678.3). Total num frames: 7725056. Throughput: 0: 7861.2. Samples: 7718360. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:25,753][08991] Avg episode reward: [(0, '5573.134')] +[2024-08-05 13:52:27,428][09051] Updated weights for policy 0, policy_version 15120 (0.0006) +[2024-08-05 13:52:30,753][08991] Fps is (10 sec: 6963.1, 60 sec: 7714.1, 300 sec: 7650.5). Total num frames: 7757824. Throughput: 0: 7778.9. Samples: 7760188. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:52:30,753][08991] Avg episode reward: [(0, '5503.399')] +[2024-08-05 13:52:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015152_7757824.pth... +[2024-08-05 13:52:30,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014704_7528448.pth +[2024-08-05 13:52:33,358][09051] Updated weights for policy 0, policy_version 15200 (0.0007) +[2024-08-05 13:52:35,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7782.4, 300 sec: 7664.4). Total num frames: 7798784. Throughput: 0: 7749.3. Samples: 7783788. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:52:35,753][08991] Avg episode reward: [(0, '5472.438')] +[2024-08-05 13:52:38,510][09051] Updated weights for policy 0, policy_version 15280 (0.0006) +[2024-08-05 13:52:40,753][08991] Fps is (10 sec: 8192.1, 60 sec: 7782.4, 300 sec: 7664.4). Total num frames: 7839744. Throughput: 0: 7787.2. Samples: 7831552. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:52:40,753][08991] Avg episode reward: [(0, '5502.181')] +[2024-08-05 13:52:43,743][09051] Updated weights for policy 0, policy_version 15360 (0.0006) +[2024-08-05 13:52:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7782.4, 300 sec: 7664.4). Total num frames: 7876608. Throughput: 0: 7766.0. Samples: 7877320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:45,753][08991] Avg episode reward: [(0, '5745.577')] +[2024-08-05 13:52:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015384_7876608.pth... +[2024-08-05 13:52:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000014936_7647232.pth +[2024-08-05 13:52:49,285][09051] Updated weights for policy 0, policy_version 15440 (0.0006) +[2024-08-05 13:52:50,753][08991] Fps is (10 sec: 7372.7, 60 sec: 7714.1, 300 sec: 7650.5). Total num frames: 7913472. Throughput: 0: 7757.8. Samples: 7901180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:50,753][08991] Avg episode reward: [(0, '5763.891')] +[2024-08-05 13:52:55,066][09051] Updated weights for policy 0, policy_version 15520 (0.0006) +[2024-08-05 13:52:55,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7714.1, 300 sec: 7636.6). Total num frames: 7950336. Throughput: 0: 7625.7. Samples: 7942248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:52:55,753][08991] Avg episode reward: [(0, '5555.565')] +[2024-08-05 13:53:00,753][08991] Fps is (10 sec: 6963.3, 60 sec: 7577.6, 300 sec: 7622.7). Total num frames: 7983104. Throughput: 0: 7464.9. Samples: 7983152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:53:00,753][08991] Avg episode reward: [(0, '5307.616')] +[2024-08-05 13:53:00,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015592_7983104.pth... +[2024-08-05 13:53:00,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015152_7757824.pth +[2024-08-05 13:53:01,250][09051] Updated weights for policy 0, policy_version 15600 (0.0006) +[2024-08-05 13:53:05,753][08991] Fps is (10 sec: 6144.0, 60 sec: 7441.1, 300 sec: 7581.1). Total num frames: 8011776. Throughput: 0: 7305.4. Samples: 8000592. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:53:05,761][08991] Avg episode reward: [(0, '5469.269')] +[2024-08-05 13:53:08,403][09051] Updated weights for policy 0, policy_version 15680 (0.0007) +[2024-08-05 13:53:10,753][08991] Fps is (10 sec: 5734.4, 60 sec: 7236.3, 300 sec: 7553.3). Total num frames: 8040448. Throughput: 0: 7015.1. Samples: 8034040. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:53:10,753][08991] Avg episode reward: [(0, '5519.236')] +[2024-08-05 13:53:15,753][08991] Fps is (10 sec: 5324.8, 60 sec: 6963.2, 300 sec: 7525.5). Total num frames: 8065024. Throughput: 0: 6842.9. Samples: 8068116. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2024-08-05 13:53:15,761][08991] Avg episode reward: [(0, '5426.525')] +[2024-08-05 13:53:15,778][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015760_8069120.pth... +[2024-08-05 13:53:15,779][09051] Updated weights for policy 0, policy_version 15760 (0.0006) +[2024-08-05 13:53:15,782][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015384_7876608.pth +[2024-08-05 13:53:20,753][08991] Fps is (10 sec: 5734.3, 60 sec: 6826.6, 300 sec: 7511.6). Total num frames: 8097792. Throughput: 0: 6700.6. Samples: 8085316. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:53:20,754][08991] Avg episode reward: [(0, '5362.895')] +[2024-08-05 13:53:22,648][09051] Updated weights for policy 0, policy_version 15840 (0.0006) +[2024-08-05 13:53:25,754][08991] Fps is (10 sec: 6143.1, 60 sec: 6690.0, 300 sec: 7483.8). Total num frames: 8126464. Throughput: 0: 6461.4. Samples: 8122324. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:53:25,763][08991] Avg episode reward: [(0, '5197.938')] +[2024-08-05 13:53:29,740][09051] Updated weights for policy 0, policy_version 15920 (0.0006) +[2024-08-05 13:53:30,753][08991] Fps is (10 sec: 5734.5, 60 sec: 6621.9, 300 sec: 7442.2). Total num frames: 8155136. Throughput: 0: 6197.7. Samples: 8156216. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:53:30,753][08991] Avg episode reward: [(0, '5091.345')] +[2024-08-05 13:53:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015928_8155136.pth... +[2024-08-05 13:53:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015592_7983104.pth +[2024-08-05 13:53:35,753][08991] Fps is (10 sec: 5735.2, 60 sec: 6417.1, 300 sec: 7414.5). Total num frames: 8183808. Throughput: 0: 6083.3. Samples: 8174928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:53:35,753][08991] Avg episode reward: [(0, '4816.590')] +[2024-08-05 13:53:36,476][09051] Updated weights for policy 0, policy_version 16000 (0.0006) +[2024-08-05 13:53:40,753][08991] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 7386.7). Total num frames: 8212480. Throughput: 0: 5915.4. Samples: 8208440. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:53:40,753][08991] Avg episode reward: [(0, '5033.472')] +[2024-08-05 13:53:43,394][09051] Updated weights for policy 0, policy_version 16080 (0.0005) +[2024-08-05 13:53:45,753][08991] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 7358.9). Total num frames: 8245248. Throughput: 0: 5829.7. Samples: 8245488. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:53:45,753][08991] Avg episode reward: [(0, '5191.414')] +[2024-08-05 13:53:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016104_8245248.pth... +[2024-08-05 13:53:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015760_8069120.pth +[2024-08-05 13:53:50,753][08991] Fps is (10 sec: 5734.4, 60 sec: 5939.2, 300 sec: 7317.3). Total num frames: 8269824. Throughput: 0: 5801.8. Samples: 8261672. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:53:50,761][08991] Avg episode reward: [(0, '5470.318')] +[2024-08-05 13:53:50,883][09051] Updated weights for policy 0, policy_version 16160 (0.0006) +[2024-08-05 13:53:55,753][08991] Fps is (10 sec: 5324.8, 60 sec: 5802.7, 300 sec: 7289.5). Total num frames: 8298496. Throughput: 0: 5771.6. Samples: 8293764. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:53:55,753][08991] Avg episode reward: [(0, '5339.212')] +[2024-08-05 13:53:58,179][09051] Updated weights for policy 0, policy_version 16240 (0.0006) +[2024-08-05 13:54:00,753][08991] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 7261.7). Total num frames: 8327168. Throughput: 0: 5764.8. Samples: 8327532. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:54:00,753][08991] Avg episode reward: [(0, '5337.543')] +[2024-08-05 13:54:00,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016264_8327168.pth... +[2024-08-05 13:54:00,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000015928_8155136.pth +[2024-08-05 13:54:05,753][08991] Fps is (10 sec: 5324.8, 60 sec: 5666.1, 300 sec: 7206.2). Total num frames: 8351744. Throughput: 0: 5731.0. Samples: 8343212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:05,761][08991] Avg episode reward: [(0, '5141.326')] +[2024-08-05 13:54:05,963][09051] Updated weights for policy 0, policy_version 16320 (0.0006) +[2024-08-05 13:54:10,753][08991] Fps is (10 sec: 5324.8, 60 sec: 5666.1, 300 sec: 7164.5). Total num frames: 8380416. Throughput: 0: 5643.9. Samples: 8376292. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:10,761][08991] Avg episode reward: [(0, '5296.211')] +[2024-08-05 13:54:12,997][09051] Updated weights for policy 0, policy_version 16400 (0.0006) +[2024-08-05 13:54:15,753][08991] Fps is (10 sec: 5734.4, 60 sec: 5734.4, 300 sec: 7150.6). Total num frames: 8409088. Throughput: 0: 5640.9. Samples: 8410056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:15,753][08991] Avg episode reward: [(0, '5326.679')] +[2024-08-05 13:54:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016424_8409088.pth... +[2024-08-05 13:54:15,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016104_8245248.pth +[2024-08-05 13:54:20,452][09051] Updated weights for policy 0, policy_version 16480 (0.0006) +[2024-08-05 13:54:20,753][08991] Fps is (10 sec: 5734.4, 60 sec: 5666.2, 300 sec: 7109.0). Total num frames: 8437760. Throughput: 0: 5603.6. Samples: 8427088. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:20,753][08991] Avg episode reward: [(0, '5674.240')] +[2024-08-05 13:54:25,753][08991] Fps is (10 sec: 5734.4, 60 sec: 5666.3, 300 sec: 7081.2). Total num frames: 8466432. Throughput: 0: 5598.8. Samples: 8460388. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:25,753][08991] Avg episode reward: [(0, '5763.158')] +[2024-08-05 13:54:27,638][09051] Updated weights for policy 0, policy_version 16560 (0.0005) +[2024-08-05 13:54:30,753][08991] Fps is (10 sec: 6144.0, 60 sec: 5734.4, 300 sec: 7053.5). Total num frames: 8499200. Throughput: 0: 5655.2. Samples: 8499972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:30,753][08991] Avg episode reward: [(0, '5671.492')] +[2024-08-05 13:54:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016600_8499200.pth... +[2024-08-05 13:54:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016264_8327168.pth +[2024-08-05 13:54:33,174][09051] Updated weights for policy 0, policy_version 16640 (0.0007) +[2024-08-05 13:54:35,753][08991] Fps is (10 sec: 6963.2, 60 sec: 5870.9, 300 sec: 7053.5). Total num frames: 8536064. Throughput: 0: 5793.8. Samples: 8522392. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:35,753][08991] Avg episode reward: [(0, '5500.827')] +[2024-08-05 13:54:38,578][09051] Updated weights for policy 0, policy_version 16720 (0.0006) +[2024-08-05 13:54:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 6075.7, 300 sec: 7067.3). Total num frames: 8577024. Throughput: 0: 6111.6. Samples: 8568784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:40,753][08991] Avg episode reward: [(0, '5518.323')] +[2024-08-05 13:54:43,942][09051] Updated weights for policy 0, policy_version 16800 (0.0006) +[2024-08-05 13:54:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 6144.0, 300 sec: 7053.5). Total num frames: 8613888. Throughput: 0: 6365.6. Samples: 8613984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:45,753][08991] Avg episode reward: [(0, '5570.338')] +[2024-08-05 13:54:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016824_8613888.pth... +[2024-08-05 13:54:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016424_8409088.pth +[2024-08-05 13:54:49,343][09051] Updated weights for policy 0, policy_version 16880 (0.0007) +[2024-08-05 13:54:50,753][08991] Fps is (10 sec: 7372.7, 60 sec: 6348.8, 300 sec: 7039.6). Total num frames: 8650752. Throughput: 0: 6527.5. Samples: 8636948. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:54:50,753][08991] Avg episode reward: [(0, '5679.359')] +[2024-08-05 13:54:54,573][09051] Updated weights for policy 0, policy_version 16960 (0.0006) +[2024-08-05 13:54:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 6553.6, 300 sec: 7053.5). Total num frames: 8691712. Throughput: 0: 6827.3. Samples: 8683520. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:54:55,753][08991] Avg episode reward: [(0, '5554.827')] +[2024-08-05 13:54:59,545][09051] Updated weights for policy 0, policy_version 17040 (0.0006) +[2024-08-05 13:55:00,755][08991] Fps is (10 sec: 8190.2, 60 sec: 6758.1, 300 sec: 7053.4). Total num frames: 8732672. Throughput: 0: 7166.2. Samples: 8732552. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2024-08-05 13:55:00,756][08991] Avg episode reward: [(0, '5379.422')] +[2024-08-05 13:55:00,758][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017056_8732672.pth... +[2024-08-05 13:55:00,762][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016600_8499200.pth +[2024-08-05 13:55:04,722][09051] Updated weights for policy 0, policy_version 17120 (0.0006) +[2024-08-05 13:55:05,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7031.5, 300 sec: 7067.3). Total num frames: 8773632. Throughput: 0: 7336.1. Samples: 8757212. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:55:05,753][08991] Avg episode reward: [(0, '5440.297')] +[2024-08-05 13:55:09,754][09051] Updated weights for policy 0, policy_version 17200 (0.0005) +[2024-08-05 13:55:10,753][08991] Fps is (10 sec: 7784.2, 60 sec: 7168.0, 300 sec: 7067.3). Total num frames: 8810496. Throughput: 0: 7645.3. Samples: 8804428. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:55:10,753][08991] Avg episode reward: [(0, '5775.221')] +[2024-08-05 13:55:14,873][09051] Updated weights for policy 0, policy_version 17280 (0.0006) +[2024-08-05 13:55:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7372.8, 300 sec: 7081.2). Total num frames: 8851456. Throughput: 0: 7826.3. Samples: 8852156. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:55:15,753][08991] Avg episode reward: [(0, '6030.690')] +[2024-08-05 13:55:15,761][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017288_8851456.pth... +[2024-08-05 13:55:15,766][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000016824_8613888.pth +[2024-08-05 13:55:15,767][09037] Saving new best policy, reward=6030.690! +[2024-08-05 13:55:20,027][09051] Updated weights for policy 0, policy_version 17360 (0.0006) +[2024-08-05 13:55:20,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7577.6, 300 sec: 7081.2). Total num frames: 8892416. Throughput: 0: 7879.7. Samples: 8876980. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:55:20,753][08991] Avg episode reward: [(0, '5918.696')] +[2024-08-05 13:55:25,407][09051] Updated weights for policy 0, policy_version 17440 (0.0007) +[2024-08-05 13:55:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7714.1, 300 sec: 7081.2). Total num frames: 8929280. Throughput: 0: 7854.7. Samples: 8922244. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:55:25,753][08991] Avg episode reward: [(0, '5644.165')] +[2024-08-05 13:55:30,672][09051] Updated weights for policy 0, policy_version 17520 (0.0006) +[2024-08-05 13:55:30,753][08991] Fps is (10 sec: 7782.2, 60 sec: 7850.6, 300 sec: 7095.1). Total num frames: 8970240. Throughput: 0: 7903.5. Samples: 8969644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:55:30,753][08991] Avg episode reward: [(0, '5720.017')] +[2024-08-05 13:55:30,758][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017520_8970240.pth... +[2024-08-05 13:55:30,764][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017056_8732672.pth +[2024-08-05 13:55:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7095.1). Total num frames: 9007104. Throughput: 0: 7885.6. Samples: 8991800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:55:35,753][08991] Avg episode reward: [(0, '6009.665')] +[2024-08-05 13:55:36,071][09051] Updated weights for policy 0, policy_version 17600 (0.0006) +[2024-08-05 13:55:40,753][08991] Fps is (10 sec: 7372.9, 60 sec: 7782.4, 300 sec: 7095.1). Total num frames: 9043968. Throughput: 0: 7868.9. Samples: 9037620. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:55:40,753][08991] Avg episode reward: [(0, '5816.486')] +[2024-08-05 13:55:41,240][09051] Updated weights for policy 0, policy_version 17680 (0.0006) +[2024-08-05 13:55:45,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7918.9, 300 sec: 7122.9). Total num frames: 9089024. Throughput: 0: 7882.1. Samples: 9087228. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:55:45,753][08991] Avg episode reward: [(0, '5643.118')] +[2024-08-05 13:55:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017752_9089024.pth... +[2024-08-05 13:55:45,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017288_8851456.pth +[2024-08-05 13:55:46,270][09051] Updated weights for policy 0, policy_version 17760 (0.0006) +[2024-08-05 13:55:50,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7919.0, 300 sec: 7122.9). Total num frames: 9125888. Throughput: 0: 7853.4. Samples: 9110616. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:55:50,761][08991] Avg episode reward: [(0, '5830.653')] +[2024-08-05 13:55:51,592][09051] Updated weights for policy 0, policy_version 17840 (0.0006) +[2024-08-05 13:55:55,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7136.9). Total num frames: 9166848. Throughput: 0: 7851.3. Samples: 9157736. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:55:55,753][08991] Avg episode reward: [(0, '5819.928')] +[2024-08-05 13:55:56,724][09051] Updated weights for policy 0, policy_version 17920 (0.0006) +[2024-08-05 13:56:00,753][08991] Fps is (10 sec: 7782.3, 60 sec: 7851.0, 300 sec: 7136.8). Total num frames: 9203712. Throughput: 0: 7813.2. Samples: 9203748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:56:00,761][08991] Avg episode reward: [(0, '6027.911')] +[2024-08-05 13:56:00,764][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017976_9203712.pth... +[2024-08-05 13:56:00,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017520_8970240.pth +[2024-08-05 13:56:01,979][09051] Updated weights for policy 0, policy_version 18000 (0.0006) +[2024-08-05 13:56:05,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7136.8). Total num frames: 9244672. Throughput: 0: 7808.3. Samples: 9228352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:56:05,753][08991] Avg episode reward: [(0, '6046.702')] +[2024-08-05 13:56:05,753][09037] Saving new best policy, reward=6046.702! +[2024-08-05 13:56:06,992][09051] Updated weights for policy 0, policy_version 18080 (0.0005) +[2024-08-05 13:56:10,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7918.9, 300 sec: 7136.8). Total num frames: 9285632. Throughput: 0: 7894.3. Samples: 9277488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:56:10,753][08991] Avg episode reward: [(0, '6300.732')] +[2024-08-05 13:56:10,753][09037] Saving new best policy, reward=6300.732! +[2024-08-05 13:56:12,190][09051] Updated weights for policy 0, policy_version 18160 (0.0006) +[2024-08-05 13:56:15,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7136.8). Total num frames: 9322496. Throughput: 0: 7844.7. Samples: 9322656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:56:15,761][08991] Avg episode reward: [(0, '5914.963')] +[2024-08-05 13:56:15,763][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018208_9322496.pth... +[2024-08-05 13:56:15,768][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017752_9089024.pth +[2024-08-05 13:56:17,510][09051] Updated weights for policy 0, policy_version 18240 (0.0006) +[2024-08-05 13:56:20,756][08991] Fps is (10 sec: 7779.9, 60 sec: 7850.2, 300 sec: 7136.7). Total num frames: 9363456. Throughput: 0: 7904.4. Samples: 9347524. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:56:20,756][08991] Avg episode reward: [(0, '5683.451')] +[2024-08-05 13:56:22,454][09051] Updated weights for policy 0, policy_version 18320 (0.0005) +[2024-08-05 13:56:25,753][08991] Fps is (10 sec: 8192.0, 60 sec: 7918.9, 300 sec: 7150.6). Total num frames: 9404416. Throughput: 0: 8006.6. Samples: 9397916. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:56:25,753][08991] Avg episode reward: [(0, '5953.624')] +[2024-08-05 13:56:27,342][09051] Updated weights for policy 0, policy_version 18400 (0.0005) +[2024-08-05 13:56:30,753][08991] Fps is (10 sec: 8194.6, 60 sec: 7918.9, 300 sec: 7164.5). Total num frames: 9445376. Throughput: 0: 7956.9. Samples: 9445288. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:56:30,753][08991] Avg episode reward: [(0, '6070.743')] +[2024-08-05 13:56:30,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018448_9445376.pth... +[2024-08-05 13:56:30,760][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000017976_9203712.pth +[2024-08-05 13:56:32,534][09051] Updated weights for policy 0, policy_version 18480 (0.0006) +[2024-08-05 13:56:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7918.9, 300 sec: 7150.6). Total num frames: 9482240. Throughput: 0: 7985.2. Samples: 9469952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:56:35,761][08991] Avg episode reward: [(0, '6144.983')] +[2024-08-05 13:56:37,769][09051] Updated weights for policy 0, policy_version 18560 (0.0006) +[2024-08-05 13:56:40,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7987.2, 300 sec: 7164.5). Total num frames: 9523200. Throughput: 0: 7945.3. Samples: 9515272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:56:40,761][08991] Avg episode reward: [(0, '6025.050')] +[2024-08-05 13:56:43,154][09051] Updated weights for policy 0, policy_version 18640 (0.0006) +[2024-08-05 13:56:45,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7150.6). Total num frames: 9560064. Throughput: 0: 7967.8. Samples: 9562300. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:56:45,753][08991] Avg episode reward: [(0, '6221.582')] +[2024-08-05 13:56:45,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018672_9560064.pth... +[2024-08-05 13:56:45,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018208_9322496.pth +[2024-08-05 13:56:48,687][09051] Updated weights for policy 0, policy_version 18720 (0.0007) +[2024-08-05 13:56:50,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7850.7, 300 sec: 7150.6). Total num frames: 9596928. Throughput: 0: 7900.3. Samples: 9583864. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:56:50,753][08991] Avg episode reward: [(0, '6188.949')] +[2024-08-05 13:56:54,278][09051] Updated weights for policy 0, policy_version 18800 (0.0006) +[2024-08-05 13:56:55,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7782.4, 300 sec: 7136.8). Total num frames: 9633792. Throughput: 0: 7777.2. Samples: 9627464. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2024-08-05 13:56:55,753][08991] Avg episode reward: [(0, '6261.119')] +[2024-08-05 13:56:59,646][09051] Updated weights for policy 0, policy_version 18880 (0.0007) +[2024-08-05 13:57:00,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7850.7, 300 sec: 7150.6). Total num frames: 9674752. Throughput: 0: 7810.8. Samples: 9674144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:57:00,753][08991] Avg episode reward: [(0, '5978.226')] +[2024-08-05 13:57:00,757][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018896_9674752.pth... +[2024-08-05 13:57:00,761][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018448_9445376.pth +[2024-08-05 13:57:05,373][09051] Updated weights for policy 0, policy_version 18960 (0.0006) +[2024-08-05 13:57:05,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7714.1, 300 sec: 7122.9). Total num frames: 9707520. Throughput: 0: 7724.1. Samples: 9695084. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:57:05,753][08991] Avg episode reward: [(0, '5817.907')] +[2024-08-05 13:57:10,694][09051] Updated weights for policy 0, policy_version 19040 (0.0006) +[2024-08-05 13:57:10,754][08991] Fps is (10 sec: 7372.2, 60 sec: 7714.0, 300 sec: 7122.9). Total num frames: 9748480. Throughput: 0: 7579.5. Samples: 9739000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:57:10,755][08991] Avg episode reward: [(0, '5786.818')] +[2024-08-05 13:57:15,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7645.9, 300 sec: 7095.1). Total num frames: 9781248. Throughput: 0: 7472.9. Samples: 9781568. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:57:15,753][08991] Avg episode reward: [(0, '5767.185')] +[2024-08-05 13:57:15,756][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000019104_9781248.pth... +[2024-08-05 13:57:15,759][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018672_9560064.pth +[2024-08-05 13:57:16,515][09051] Updated weights for policy 0, policy_version 19120 (0.0006) +[2024-08-05 13:57:20,753][08991] Fps is (10 sec: 7373.4, 60 sec: 7646.3, 300 sec: 7109.0). Total num frames: 9822208. Throughput: 0: 7464.6. Samples: 9805860. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) +[2024-08-05 13:57:20,753][08991] Avg episode reward: [(0, '5884.446')] +[2024-08-05 13:57:21,716][09051] Updated weights for policy 0, policy_version 19200 (0.0006) +[2024-08-05 13:57:25,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7122.9). Total num frames: 9859072. Throughput: 0: 7483.3. Samples: 9852020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:57:25,753][08991] Avg episode reward: [(0, '5763.094')] +[2024-08-05 13:57:26,905][09051] Updated weights for policy 0, policy_version 19280 (0.0006) +[2024-08-05 13:57:30,753][08991] Fps is (10 sec: 7782.5, 60 sec: 7577.6, 300 sec: 7122.9). Total num frames: 9900032. Throughput: 0: 7504.5. Samples: 9900004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:57:30,761][08991] Avg episode reward: [(0, '5958.293')] +[2024-08-05 13:57:30,763][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000019336_9900032.pth... +[2024-08-05 13:57:30,767][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000018896_9674752.pth +[2024-08-05 13:57:32,286][09051] Updated weights for policy 0, policy_version 19360 (0.0006) +[2024-08-05 13:57:35,753][08991] Fps is (10 sec: 7782.4, 60 sec: 7577.6, 300 sec: 7109.0). Total num frames: 9936896. Throughput: 0: 7507.8. Samples: 9921716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2024-08-05 13:57:35,753][08991] Avg episode reward: [(0, '5908.662')] +[2024-08-05 13:57:37,607][09051] Updated weights for policy 0, policy_version 19440 (0.0007) +[2024-08-05 13:57:40,753][08991] Fps is (10 sec: 7372.8, 60 sec: 7509.3, 300 sec: 7109.0). Total num frames: 9973760. Throughput: 0: 7583.3. Samples: 9968712. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2024-08-05 13:57:40,753][08991] Avg episode reward: [(0, '5523.661')] +[2024-08-05 13:57:43,232][09051] Updated weights for policy 0, policy_version 19520 (0.0007) +[2024-08-05 13:57:44,720][09037] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000 +[2024-08-05 13:57:44,721][08991] Component RolloutWorker_w4 stopped! +[2024-08-05 13:57:44,721][09053] Stopping RolloutWorker_w3... +[2024-08-05 13:57:44,721][09054] Stopping RolloutWorker_w2... +[2024-08-05 13:57:44,721][08991] Component RolloutWorker_w3 stopped! +[2024-08-05 13:57:44,721][09053] Loop rollout_proc3_evt_loop terminating... +[2024-08-05 13:57:44,721][09064] Stopping RolloutWorker_w6... +[2024-08-05 13:57:44,722][09050] Stopping RolloutWorker_w0... +[2024-08-05 13:57:44,722][08991] Component Batcher_0 stopped! +[2024-08-05 13:57:44,722][09064] Loop rollout_proc6_evt_loop terminating... +[2024-08-05 13:57:44,722][08991] Component RolloutWorker_w1 stopped! +[2024-08-05 13:57:44,722][09050] Loop rollout_proc0_evt_loop terminating... +[2024-08-05 13:57:44,722][08991] Component RolloutWorker_w5 stopped! +[2024-08-05 13:57:44,722][09065] Stopping RolloutWorker_w7... +[2024-08-05 13:57:44,722][08991] Component RolloutWorker_w2 stopped! +[2024-08-05 13:57:44,722][09065] Loop rollout_proc7_evt_loop terminating... +[2024-08-05 13:57:44,722][08991] Component RolloutWorker_w6 stopped! +[2024-08-05 13:57:44,722][08991] Component RolloutWorker_w0 stopped! +[2024-08-05 13:57:44,723][08991] Component RolloutWorker_w7 stopped! +[2024-08-05 13:57:44,723][09055] Stopping RolloutWorker_w4... +[2024-08-05 13:57:44,723][09055] Loop rollout_proc4_evt_loop terminating... +[2024-08-05 13:57:44,734][09051] Weights refcount: 2 0 +[2024-08-05 13:57:44,721][09052] Stopping RolloutWorker_w1... +[2024-08-05 13:57:44,721][09056] Stopping RolloutWorker_w5... +[2024-08-05 13:57:44,721][09037] Stopping Batcher_0... +[2024-08-05 13:57:44,721][09054] Loop rollout_proc2_evt_loop terminating... +[2024-08-05 13:57:44,735][09052] Loop rollout_proc1_evt_loop terminating... +[2024-08-05 13:57:44,735][09056] Loop rollout_proc5_evt_loop terminating... +[2024-08-05 13:57:44,735][09037] Loop batcher_evt_loop terminating... +[2024-08-05 13:57:44,722][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000019544_10006528.pth... +[2024-08-05 13:57:44,740][09051] Stopping InferenceWorker_p0-w0... +[2024-08-05 13:57:44,740][08991] Component InferenceWorker_p0-w0 stopped! +[2024-08-05 13:57:44,741][09051] Loop inference_proc0-0_evt_loop terminating... +[2024-08-05 13:57:44,744][09037] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000019104_9781248.pth +[2024-08-05 13:57:44,745][09037] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/mujoco_humanoid_first_run/checkpoint_p0/checkpoint_000019544_10006528.pth... +[2024-08-05 13:57:44,750][09037] Stopping LearnerWorker_p0... +[2024-08-05 13:57:44,750][08991] Component LearnerWorker_p0 stopped! +[2024-08-05 13:57:44,751][09037] Loop learner_proc0_evt_loop terminating... +[2024-08-05 13:57:44,751][08991] Waiting for process learner_proc0 to stop... +[2024-08-05 13:57:45,516][08991] Waiting for process inference_proc0-0 to join... +[2024-08-05 13:57:45,516][08991] Waiting for process rollout_proc0 to join... +[2024-08-05 13:57:45,516][08991] Waiting for process rollout_proc1 to join... +[2024-08-05 13:57:45,517][08991] Waiting for process rollout_proc2 to join... +[2024-08-05 13:57:45,517][08991] Waiting for process rollout_proc3 to join... +[2024-08-05 13:57:45,517][08991] Waiting for process rollout_proc4 to join... +[2024-08-05 13:57:45,517][08991] Waiting for process rollout_proc5 to join... +[2024-08-05 13:57:45,517][08991] Waiting for process rollout_proc6 to join... +[2024-08-05 13:57:45,517][08991] Waiting for process rollout_proc7 to join... +[2024-08-05 13:57:45,518][08991] Batcher 0 profile tree view: +batching: 46.1524, releasing_batches: 1.2764 +[2024-08-05 13:57:45,518][08991] InferenceWorker_p0-w0 profile tree view: +wait_policy: 0.0000 + wait_policy_total: 383.2305 +update_model: 13.4051 + weight_update: 0.0007 +one_step: 0.0013 + handle_policy_step: 883.6438 + deserialize: 25.7141, stack: 5.7803, obs_to_device_normalize: 220.6344, forward: 418.1444, send_messages: 53.0575 + prepare_outputs: 113.2659 + to_cpu: 64.6333 +[2024-08-05 13:57:45,518][08991] Learner 0 profile tree view: +misc: 0.0070, prepare_batch: 8.5310 +train: 91.9275 + epoch_init: 0.0381, minibatch_init: 1.4053, losses_postprocess: 2.9213, kl_divergence: 1.5013, after_optimizer: 1.5203 + calculate_losses: 29.4933 + losses_init: 0.0437, forward_head: 2.8280, bptt_initial: 0.2026, bptt: 0.1736, tail: 11.0967, advantages_returns: 1.5957, losses: 11.8880 + update: 53.0187 + clip: 6.6770 +[2024-08-05 13:57:45,518][08991] RolloutWorker_w0 profile tree view: +wait_for_trajectories: 0.3834, enqueue_policy_requests: 25.7295, env_step: 682.0626, overhead: 39.6932, complete_rollouts: 0.6146 +save_policy_outputs: 55.9674 + split_output_tensors: 18.6894 +[2024-08-05 13:57:45,518][08991] RolloutWorker_w7 profile tree view: +wait_for_trajectories: 0.4096, enqueue_policy_requests: 24.8788, env_step: 678.6346, overhead: 39.0253, complete_rollouts: 0.6506 +save_policy_outputs: 55.9591 + split_output_tensors: 19.0231 +[2024-08-05 13:57:45,518][08991] Loop Runner_EvtLoop terminating... +[2024-08-05 13:57:45,519][08991] Runner profile tree view: +main_loop: 1347.0776 +[2024-08-05 13:57:45,519][08991] Collected {0: 10006528}, FPS: 7428.3