diff --git "a/sf_log.txt" "b/sf_log.txt"
--- "a/sf_log.txt"
+++ "b/sf_log.txt"
@@ -1,50 +1,50 @@
-[2023-02-24 09:55:08,547][01623] Saving configuration to /content/train_dir/default_experiment/config.json...
-[2023-02-24 09:55:08,550][01623] Rollout worker 0 uses device cpu
-[2023-02-24 09:55:08,552][01623] Rollout worker 1 uses device cpu
-[2023-02-24 09:55:08,554][01623] Rollout worker 2 uses device cpu
-[2023-02-24 09:55:08,555][01623] Rollout worker 3 uses device cpu
-[2023-02-24 09:55:08,557][01623] Rollout worker 4 uses device cpu
-[2023-02-24 09:55:08,558][01623] Rollout worker 5 uses device cpu
-[2023-02-24 09:55:08,560][01623] Rollout worker 6 uses device cpu
-[2023-02-24 09:55:08,562][01623] Rollout worker 7 uses device cpu
-[2023-02-24 09:55:08,740][01623] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-24 09:55:08,743][01623] InferenceWorker_p0-w0: min num requests: 2
-[2023-02-24 09:55:08,773][01623] Starting all processes...
-[2023-02-24 09:55:08,774][01623] Starting process learner_proc0
-[2023-02-24 09:55:08,830][01623] Starting all processes...
-[2023-02-24 09:55:08,846][01623] Starting process inference_proc0-0
-[2023-02-24 09:55:08,852][01623] Starting process rollout_proc0
-[2023-02-24 09:55:08,869][01623] Starting process rollout_proc1
-[2023-02-24 09:55:08,870][01623] Starting process rollout_proc2
-[2023-02-24 09:55:08,870][01623] Starting process rollout_proc3
-[2023-02-24 09:55:08,870][01623] Starting process rollout_proc4
-[2023-02-24 09:55:08,870][01623] Starting process rollout_proc5
-[2023-02-24 09:55:08,870][01623] Starting process rollout_proc6
-[2023-02-24 09:55:08,871][01623] Starting process rollout_proc7
-[2023-02-24 09:55:20,565][15460] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-24 09:55:20,569][15460] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
-[2023-02-24 09:55:20,661][15485] Worker 6 uses CPU cores [0]
-[2023-02-24 09:55:20,682][15484] Worker 4 uses CPU cores [0]
-[2023-02-24 09:55:20,728][15476] Worker 1 uses CPU cores [1]
-[2023-02-24 09:55:20,738][15482] Worker 3 uses CPU cores [1]
-[2023-02-24 09:55:20,772][15481] Worker 2 uses CPU cores [0]
-[2023-02-24 09:55:20,780][15474] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-24 09:55:20,782][15474] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
-[2023-02-24 09:55:20,803][15486] Worker 7 uses CPU cores [1]
-[2023-02-24 09:55:20,911][15475] Worker 0 uses CPU cores [0]
-[2023-02-24 09:55:20,926][15483] Worker 5 uses CPU cores [1]
-[2023-02-24 09:55:21,394][15474] Num visible devices: 1
-[2023-02-24 09:55:21,394][15460] Num visible devices: 1
-[2023-02-24 09:55:21,413][15460] Starting seed is not provided
-[2023-02-24 09:55:21,414][15460] Using GPUs [0] for process 0 (actually maps to GPUs [0])
-[2023-02-24 09:55:21,414][15460] Initializing actor-critic model on device cuda:0
-[2023-02-24 09:55:21,415][15460] RunningMeanStd input shape: (3, 72, 128)
-[2023-02-24 09:55:21,417][15460] RunningMeanStd input shape: (1,)
-[2023-02-24 09:55:21,430][15460] ConvEncoder: input_channels=3
-[2023-02-24 09:55:21,706][15460] Conv encoder output size: 512
-[2023-02-24 09:55:21,706][15460] Policy head output size: 512
-[2023-02-24 09:55:21,753][15460] Created Actor Critic model with architecture:
-[2023-02-24 09:55:21,754][15460] ActorCriticSharedWeights(
+[2023-02-24 12:14:52,972][00205] Saving configuration to /content/train_dir/default_experiment/config.json...
+[2023-02-24 12:14:52,974][00205] Rollout worker 0 uses device cpu
+[2023-02-24 12:14:52,976][00205] Rollout worker 1 uses device cpu
+[2023-02-24 12:14:52,979][00205] Rollout worker 2 uses device cpu
+[2023-02-24 12:14:52,980][00205] Rollout worker 3 uses device cpu
+[2023-02-24 12:14:52,981][00205] Rollout worker 4 uses device cpu
+[2023-02-24 12:14:52,983][00205] Rollout worker 5 uses device cpu
+[2023-02-24 12:14:52,986][00205] Rollout worker 6 uses device cpu
+[2023-02-24 12:14:52,987][00205] Rollout worker 7 uses device cpu
+[2023-02-24 12:14:53,198][00205] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:14:53,201][00205] InferenceWorker_p0-w0: min num requests: 2
+[2023-02-24 12:14:53,241][00205] Starting all processes...
+[2023-02-24 12:14:53,243][00205] Starting process learner_proc0
+[2023-02-24 12:14:53,333][00205] Starting all processes...
+[2023-02-24 12:14:53,345][00205] Starting process inference_proc0-0
+[2023-02-24 12:14:53,345][00205] Starting process rollout_proc0
+[2023-02-24 12:14:53,345][00205] Starting process rollout_proc1
+[2023-02-24 12:14:53,345][00205] Starting process rollout_proc2
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc3
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc4
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc5
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc6
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc7
+[2023-02-24 12:15:02,266][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:02,266][11201] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2023-02-24 12:15:02,435][11223] Worker 3 uses CPU cores [1]
+[2023-02-24 12:15:02,459][11227] Worker 4 uses CPU cores [0]
+[2023-02-24 12:15:02,579][11222] Worker 2 uses CPU cores [0]
+[2023-02-24 12:15:02,669][11224] Worker 5 uses CPU cores [1]
+[2023-02-24 12:15:02,680][11215] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:02,680][11215] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2023-02-24 12:15:02,710][11216] Worker 0 uses CPU cores [0]
+[2023-02-24 12:15:02,785][11221] Worker 1 uses CPU cores [1]
+[2023-02-24 12:15:02,791][11226] Worker 7 uses CPU cores [1]
+[2023-02-24 12:15:02,922][11225] Worker 6 uses CPU cores [0]
+[2023-02-24 12:15:03,232][11215] Num visible devices: 1
+[2023-02-24 12:15:03,232][11201] Num visible devices: 1
+[2023-02-24 12:15:03,238][11201] Starting seed is not provided
+[2023-02-24 12:15:03,238][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:03,238][11201] Initializing actor-critic model on device cuda:0
+[2023-02-24 12:15:03,239][11201] RunningMeanStd input shape: (3, 72, 128)
+[2023-02-24 12:15:03,240][11201] RunningMeanStd input shape: (1,)
+[2023-02-24 12:15:03,253][11201] ConvEncoder: input_channels=3
+[2023-02-24 12:15:03,567][11201] Conv encoder output size: 512
+[2023-02-24 12:15:03,568][11201] Policy head output size: 512
+[2023-02-24 12:15:03,624][11201] Created Actor Critic model with architecture:
+[2023-02-24 12:15:03,625][11201] ActorCriticSharedWeights(
 (obs_normalizer): ObservationNormalizer(
 (running_mean_std): RunningMeanStdDictInPlace(
 (running_mean_std): ModuleDict(
@@ -85,2822 +85,3457 @@
 (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
 )
 )
-[2023-02-24 09:55:28,539][15460] Using optimizer
-[2023-02-24 09:55:28,541][15460] No checkpoints found
-[2023-02-24 09:55:28,541][15460] Did not load from checkpoint, starting from scratch! -[2023-02-24 09:55:28,542][15460] Initialized policy 0 weights for model version 0 -[2023-02-24 09:55:28,545][15460] LearnerWorker_p0 finished initialization! -[2023-02-24 09:55:28,547][15460] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-02-24 09:55:28,734][01623] Heartbeat connected on Batcher_0 -[2023-02-24 09:55:28,740][01623] Heartbeat connected on LearnerWorker_p0 -[2023-02-24 09:55:28,753][01623] Heartbeat connected on RolloutWorker_w0 -[2023-02-24 09:55:28,758][01623] Heartbeat connected on RolloutWorker_w1 -[2023-02-24 09:55:28,760][01623] Heartbeat connected on RolloutWorker_w2 -[2023-02-24 09:55:28,762][01623] Heartbeat connected on RolloutWorker_w3 -[2023-02-24 09:55:28,764][01623] Heartbeat connected on RolloutWorker_w4 -[2023-02-24 09:55:28,771][01623] Heartbeat connected on RolloutWorker_w5 -[2023-02-24 09:55:28,773][01623] Heartbeat connected on RolloutWorker_w6 -[2023-02-24 09:55:28,774][01623] Heartbeat connected on RolloutWorker_w7 -[2023-02-24 09:55:28,792][15474] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 09:55:28,793][15474] RunningMeanStd input shape: (1,) -[2023-02-24 09:55:28,816][15474] ConvEncoder: input_channels=3 -[2023-02-24 09:55:28,972][15474] Conv encoder output size: 512 -[2023-02-24 09:55:28,974][15474] Policy head output size: 512 -[2023-02-24 09:55:29,601][01623] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 09:55:31,694][01623] Inference worker 0-0 is ready! -[2023-02-24 09:55:31,696][01623] All inference workers are ready! Signal rollout workers to start! -[2023-02-24 09:55:31,707][01623] Heartbeat connected on InferenceWorker_p0-w0 -[2023-02-24 09:55:31,795][15476] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,810][15482] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,821][15483] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,853][15475] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,853][15486] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,866][15485] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,874][15484] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:31,877][15481] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 09:55:32,369][15485] Decorrelating experience for 0 frames... -[2023-02-24 09:55:32,715][15485] Decorrelating experience for 32 frames... -[2023-02-24 09:55:33,089][15483] Decorrelating experience for 0 frames... -[2023-02-24 09:55:33,096][15476] Decorrelating experience for 0 frames... -[2023-02-24 09:55:33,098][15482] Decorrelating experience for 0 frames... -[2023-02-24 09:55:33,102][15486] Decorrelating experience for 0 frames... -[2023-02-24 09:55:33,437][15486] Decorrelating experience for 32 frames... -[2023-02-24 09:55:33,839][15486] Decorrelating experience for 64 frames... -[2023-02-24 09:55:34,236][15486] Decorrelating experience for 96 frames... -[2023-02-24 09:55:34,445][15484] Decorrelating experience for 0 frames... -[2023-02-24 09:55:34,455][15481] Decorrelating experience for 0 frames... -[2023-02-24 09:55:34,495][15485] Decorrelating experience for 64 frames... -[2023-02-24 09:55:34,497][15475] Decorrelating experience for 0 frames... 
-[2023-02-24 09:55:34,601][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 09:55:35,204][15476] Decorrelating experience for 32 frames... -[2023-02-24 09:55:35,334][15483] Decorrelating experience for 32 frames... -[2023-02-24 09:55:35,786][15482] Decorrelating experience for 32 frames... -[2023-02-24 09:55:36,085][15484] Decorrelating experience for 32 frames... -[2023-02-24 09:55:36,167][15481] Decorrelating experience for 32 frames... -[2023-02-24 09:55:36,175][15475] Decorrelating experience for 32 frames... -[2023-02-24 09:55:36,591][15483] Decorrelating experience for 64 frames... -[2023-02-24 09:55:37,021][15476] Decorrelating experience for 64 frames... -[2023-02-24 09:55:37,779][15482] Decorrelating experience for 64 frames... -[2023-02-24 09:55:37,835][15476] Decorrelating experience for 96 frames... -[2023-02-24 09:55:38,414][15485] Decorrelating experience for 96 frames... -[2023-02-24 09:55:38,679][15484] Decorrelating experience for 64 frames... -[2023-02-24 09:55:38,914][15481] Decorrelating experience for 64 frames... -[2023-02-24 09:55:39,017][15475] Decorrelating experience for 64 frames... -[2023-02-24 09:55:39,601][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2.0. Samples: 20. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 09:55:39,612][01623] Avg episode reward: [(0, '1.280')] -[2023-02-24 09:55:39,953][15482] Decorrelating experience for 96 frames... -[2023-02-24 09:55:40,566][15484] Decorrelating experience for 96 frames... -[2023-02-24 09:55:40,679][15481] Decorrelating experience for 96 frames... -[2023-02-24 09:55:44,058][15460] Signal inference workers to stop experience collection... -[2023-02-24 09:55:44,081][15474] InferenceWorker_p0-w0: stopping experience collection -[2023-02-24 09:55:44,282][15483] Decorrelating experience for 96 frames... -[2023-02-24 09:55:44,602][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 140.7. Samples: 2110. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 09:55:44,608][01623] Avg episode reward: [(0, '3.098')] -[2023-02-24 09:55:45,110][15475] Decorrelating experience for 96 frames... -[2023-02-24 09:55:46,666][15460] Signal inference workers to resume experience collection... -[2023-02-24 09:55:46,667][15474] InferenceWorker_p0-w0: resuming experience collection -[2023-02-24 09:55:49,602][01623] Fps is (10 sec: 819.1, 60 sec: 409.6, 300 sec: 409.6). Total num frames: 8192. Throughput: 0: 113.0. Samples: 2260. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-02-24 09:55:49,605][01623] Avg episode reward: [(0, '3.056')] -[2023-02-24 09:55:54,601][01623] Fps is (10 sec: 2867.3, 60 sec: 1146.9, 300 sec: 1146.9). Total num frames: 28672. Throughput: 0: 257.8. Samples: 6444. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 09:55:54,606][01623] Avg episode reward: [(0, '3.857')] -[2023-02-24 09:55:56,720][15474] Updated weights for policy 0, policy_version 10 (0.0594) -[2023-02-24 09:55:59,602][01623] Fps is (10 sec: 4096.1, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 49152. Throughput: 0: 422.3. Samples: 12668. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) -[2023-02-24 09:55:59,606][01623] Avg episode reward: [(0, '4.322')] -[2023-02-24 09:56:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 1872.5, 300 sec: 1872.5). Total num frames: 65536. Throughput: 0: 432.9. Samples: 15150. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 09:56:04,605][01623] Avg episode reward: [(0, '4.341')] -[2023-02-24 09:56:09,601][01623] Fps is (10 sec: 2867.3, 60 sec: 1945.6, 300 sec: 1945.6). Total num frames: 77824. Throughput: 0: 473.4. Samples: 18936. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) -[2023-02-24 09:56:09,604][01623] Avg episode reward: [(0, '4.517')] -[2023-02-24 09:56:10,856][15474] Updated weights for policy 0, policy_version 20 (0.0017) -[2023-02-24 09:56:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 2093.5, 300 sec: 2093.5). Total num frames: 94208. Throughput: 0: 535.2. Samples: 24086. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 09:56:14,604][01623] Avg episode reward: [(0, '4.636')] -[2023-02-24 09:56:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 2293.8, 300 sec: 2293.8). Total num frames: 114688. Throughput: 0: 604.9. Samples: 27220. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:56:19,604][01623] Avg episode reward: [(0, '4.592')] -[2023-02-24 09:56:19,647][15460] Saving new best policy, reward=4.592! -[2023-02-24 09:56:20,610][15474] Updated weights for policy 0, policy_version 30 (0.0019) -[2023-02-24 09:56:24,602][01623] Fps is (10 sec: 3686.2, 60 sec: 2383.1, 300 sec: 2383.1). Total num frames: 131072. Throughput: 0: 727.5. Samples: 32760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:56:24,607][01623] Avg episode reward: [(0, '4.555')] -[2023-02-24 09:56:29,601][01623] Fps is (10 sec: 2867.2, 60 sec: 2389.3, 300 sec: 2389.3). Total num frames: 143360. Throughput: 0: 770.1. Samples: 36766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:56:29,604][01623] Avg episode reward: [(0, '4.497')] -[2023-02-24 09:56:34,608][01623] Fps is (10 sec: 2455.9, 60 sec: 2593.8, 300 sec: 2394.3). Total num frames: 155648. Throughput: 0: 802.0. Samples: 38354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 09:56:34,611][01623] Avg episode reward: [(0, '4.659')] -[2023-02-24 09:56:34,622][15460] Saving new best policy, reward=4.659! -[2023-02-24 09:56:36,662][15474] Updated weights for policy 0, policy_version 40 (0.0018) -[2023-02-24 09:56:39,604][01623] Fps is (10 sec: 2866.5, 60 sec: 2867.1, 300 sec: 2457.5). Total num frames: 172032. Throughput: 0: 796.4. Samples: 42284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:56:39,607][01623] Avg episode reward: [(0, '4.679')] -[2023-02-24 09:56:39,611][15460] Saving new best policy, reward=4.679! -[2023-02-24 09:56:44,601][01623] Fps is (10 sec: 3279.1, 60 sec: 3140.3, 300 sec: 2512.2). Total num frames: 188416. Throughput: 0: 770.5. Samples: 47342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:56:44,604][01623] Avg episode reward: [(0, '4.802')] -[2023-02-24 09:56:44,610][15460] Saving new best policy, reward=4.802! -[2023-02-24 09:56:49,601][01623] Fps is (10 sec: 2867.9, 60 sec: 3208.6, 300 sec: 2508.8). Total num frames: 200704. Throughput: 0: 760.8. Samples: 49384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:56:49,607][01623] Avg episode reward: [(0, '4.591')] -[2023-02-24 09:56:50,197][15474] Updated weights for policy 0, policy_version 50 (0.0044) -[2023-02-24 09:56:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2554.0). Total num frames: 217088. Throughput: 0: 767.2. Samples: 53460. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:56:54,605][01623] Avg episode reward: [(0, '4.416')] -[2023-02-24 09:56:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 2639.6). Total num frames: 237568. Throughput: 0: 789.7. Samples: 59624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:56:59,611][01623] Avg episode reward: [(0, '4.464')] -[2023-02-24 09:57:01,010][15474] Updated weights for policy 0, policy_version 60 (0.0036) -[2023-02-24 09:57:04,607][01623] Fps is (10 sec: 4093.4, 60 sec: 3208.2, 300 sec: 2716.1). Total num frames: 258048. Throughput: 0: 788.8. Samples: 62722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:57:04,610][01623] Avg episode reward: [(0, '4.713')] -[2023-02-24 09:57:04,622][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth... -[2023-02-24 09:57:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2703.4). Total num frames: 270336. Throughput: 0: 778.8. Samples: 67804. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:57:09,613][01623] Avg episode reward: [(0, '4.802')] -[2023-02-24 09:57:13,988][15474] Updated weights for policy 0, policy_version 70 (0.0028) -[2023-02-24 09:57:14,601][01623] Fps is (10 sec: 2869.0, 60 sec: 3208.5, 300 sec: 2730.7). Total num frames: 286720. Throughput: 0: 784.4. Samples: 72066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:57:14,604][01623] Avg episode reward: [(0, '4.838')] -[2023-02-24 09:57:14,621][15460] Saving new best policy, reward=4.838! -[2023-02-24 09:57:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2792.7). Total num frames: 307200. Throughput: 0: 811.3. Samples: 74858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:57:19,605][01623] Avg episode reward: [(0, '4.846')] -[2023-02-24 09:57:19,608][15460] Saving new best policy, reward=4.846! -[2023-02-24 09:57:23,711][15474] Updated weights for policy 0, policy_version 80 (0.0027) -[2023-02-24 09:57:24,602][01623] Fps is (10 sec: 4095.6, 60 sec: 3276.8, 300 sec: 2849.4). Total num frames: 327680. Throughput: 0: 868.3. Samples: 81356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 09:57:24,610][01623] Avg episode reward: [(0, '4.863')] -[2023-02-24 09:57:24,653][15460] Saving new best policy, reward=4.863! -[2023-02-24 09:57:29,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2867.2). Total num frames: 344064. Throughput: 0: 861.8. Samples: 86124. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:57:29,604][01623] Avg episode reward: [(0, '4.891')] -[2023-02-24 09:57:29,607][15460] Saving new best policy, reward=4.891! -[2023-02-24 09:57:34,601][01623] Fps is (10 sec: 2867.5, 60 sec: 3345.5, 300 sec: 2850.8). Total num frames: 356352. Throughput: 0: 860.9. Samples: 88126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:57:34,604][01623] Avg episode reward: [(0, '4.707')] -[2023-02-24 09:57:37,174][15474] Updated weights for policy 0, policy_version 90 (0.0032) -[2023-02-24 09:57:39,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 2898.7). Total num frames: 376832. Throughput: 0: 883.5. Samples: 93218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 09:57:39,608][01623] Avg episode reward: [(0, '4.664')] -[2023-02-24 09:57:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2943.1). Total num frames: 397312. Throughput: 0: 893.6. Samples: 99836. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:57:44,609][01623] Avg episode reward: [(0, '4.792')] -[2023-02-24 09:57:47,287][15474] Updated weights for policy 0, policy_version 100 (0.0020) -[2023-02-24 09:57:49,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 2955.0). Total num frames: 413696. Throughput: 0: 886.0. Samples: 102588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 09:57:49,604][01623] Avg episode reward: [(0, '5.149')] -[2023-02-24 09:57:49,607][15460] Saving new best policy, reward=5.149! -[2023-02-24 09:57:54,603][01623] Fps is (10 sec: 2866.6, 60 sec: 3481.5, 300 sec: 2937.8). Total num frames: 425984. Throughput: 0: 864.1. Samples: 106690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 09:57:54,607][01623] Avg episode reward: [(0, '5.300')] -[2023-02-24 09:57:54,622][15460] Saving new best policy, reward=5.300! -[2023-02-24 09:57:59,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 2976.4). Total num frames: 446464. Throughput: 0: 884.6. Samples: 111874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:57:59,604][01623] Avg episode reward: [(0, '5.256')] -[2023-02-24 09:57:59,842][15474] Updated weights for policy 0, policy_version 110 (0.0031) -[2023-02-24 09:58:04,602][01623] Fps is (10 sec: 4506.1, 60 sec: 3550.2, 300 sec: 3038.9). Total num frames: 471040. Throughput: 0: 896.2. Samples: 115190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:58:04,606][01623] Avg episode reward: [(0, '5.379')] -[2023-02-24 09:58:04,622][15460] Saving new best policy, reward=5.379! -[2023-02-24 09:58:09,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3046.4). Total num frames: 487424. Throughput: 0: 886.9. Samples: 121264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:58:09,606][01623] Avg episode reward: [(0, '5.480')] -[2023-02-24 09:58:09,613][15460] Saving new best policy, reward=5.480! -[2023-02-24 09:58:11,058][15474] Updated weights for policy 0, policy_version 120 (0.0025) -[2023-02-24 09:58:14,602][01623] Fps is (10 sec: 2867.4, 60 sec: 3549.8, 300 sec: 3028.6). Total num frames: 499712. Throughput: 0: 871.6. Samples: 125348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 09:58:14,611][01623] Avg episode reward: [(0, '5.273')] -[2023-02-24 09:58:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3035.9). Total num frames: 516096. Throughput: 0: 873.6. Samples: 127440. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 09:58:19,603][01623] Avg episode reward: [(0, '4.856')] -[2023-02-24 09:58:22,694][15474] Updated weights for policy 0, policy_version 130 (0.0025) -[2023-02-24 09:58:24,601][01623] Fps is (10 sec: 3686.6, 60 sec: 3481.7, 300 sec: 3066.2). Total num frames: 536576. Throughput: 0: 897.0. Samples: 133584. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:58:24,608][01623] Avg episode reward: [(0, '4.843')] -[2023-02-24 09:58:29,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3094.8). Total num frames: 557056. Throughput: 0: 878.3. Samples: 139360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:58:29,608][01623] Avg episode reward: [(0, '5.210')] -[2023-02-24 09:58:34,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3077.5). Total num frames: 569344. Throughput: 0: 864.3. Samples: 141480. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 09:58:34,608][01623] Avg episode reward: [(0, '5.300')] -[2023-02-24 09:58:35,374][15474] Updated weights for policy 0, policy_version 140 (0.0029) -[2023-02-24 09:58:39,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3082.8). Total num frames: 585728. Throughput: 0: 866.0. Samples: 145656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:58:39,608][01623] Avg episode reward: [(0, '5.623')] -[2023-02-24 09:58:39,612][15460] Saving new best policy, reward=5.623! -[2023-02-24 09:58:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3129.8). Total num frames: 610304. Throughput: 0: 897.0. Samples: 152238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 09:58:44,604][01623] Avg episode reward: [(0, '5.774')] -[2023-02-24 09:58:44,614][15460] Saving new best policy, reward=5.774! -[2023-02-24 09:58:45,451][15474] Updated weights for policy 0, policy_version 150 (0.0019) -[2023-02-24 09:58:49,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3133.4). Total num frames: 626688. Throughput: 0: 897.1. Samples: 155558. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:58:49,604][01623] Avg episode reward: [(0, '5.783')] -[2023-02-24 09:58:49,606][15460] Saving new best policy, reward=5.783! -[2023-02-24 09:58:54,606][01623] Fps is (10 sec: 3275.1, 60 sec: 3618.0, 300 sec: 3136.9). Total num frames: 643072. Throughput: 0: 862.2. Samples: 160068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:58:54,620][01623] Avg episode reward: [(0, '5.628')] -[2023-02-24 09:58:58,616][15474] Updated weights for policy 0, policy_version 160 (0.0025) -[2023-02-24 09:58:59,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3140.3). Total num frames: 659456. Throughput: 0: 870.0. Samples: 164496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:58:59,604][01623] Avg episode reward: [(0, '5.407')] -[2023-02-24 09:59:04,601][01623] Fps is (10 sec: 3688.2, 60 sec: 3481.7, 300 sec: 3162.5). Total num frames: 679936. Throughput: 0: 897.8. Samples: 167842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:59:04,607][01623] Avg episode reward: [(0, '5.295')] -[2023-02-24 09:59:04,618][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000166_679936.pth... -[2023-02-24 09:59:07,752][15474] Updated weights for policy 0, policy_version 170 (0.0021) -[2023-02-24 09:59:09,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3183.7). Total num frames: 700416. Throughput: 0: 908.1. Samples: 174448. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:59:09,603][01623] Avg episode reward: [(0, '5.823')] -[2023-02-24 09:59:09,609][15460] Saving new best policy, reward=5.823! -[2023-02-24 09:59:14,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3167.6). Total num frames: 712704. Throughput: 0: 876.0. Samples: 178778. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 09:59:14,611][01623] Avg episode reward: [(0, '6.096')] -[2023-02-24 09:59:14,627][15460] Saving new best policy, reward=6.096! -[2023-02-24 09:59:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3170.0). Total num frames: 729088. Throughput: 0: 874.6. Samples: 180836. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 09:59:19,608][01623] Avg episode reward: [(0, '5.990')] -[2023-02-24 09:59:21,228][15474] Updated weights for policy 0, policy_version 180 (0.0013) -[2023-02-24 09:59:24,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3189.7). Total num frames: 749568. Throughput: 0: 908.8. Samples: 186552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:59:24,607][01623] Avg episode reward: [(0, '5.658')] -[2023-02-24 09:59:29,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3208.5). Total num frames: 770048. Throughput: 0: 909.1. Samples: 193148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:59:29,605][01623] Avg episode reward: [(0, '5.918')] -[2023-02-24 09:59:31,346][15474] Updated weights for policy 0, policy_version 190 (0.0018) -[2023-02-24 09:59:34,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3209.9). Total num frames: 786432. Throughput: 0: 882.8. Samples: 195284. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 09:59:34,608][01623] Avg episode reward: [(0, '5.978')] -[2023-02-24 09:59:39,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3194.9). Total num frames: 798720. Throughput: 0: 876.1. Samples: 199488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:59:39,603][01623] Avg episode reward: [(0, '6.391')] -[2023-02-24 09:59:39,607][15460] Saving new best policy, reward=6.391! -[2023-02-24 09:59:43,847][15474] Updated weights for policy 0, policy_version 200 (0.0014) -[2023-02-24 09:59:44,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3212.5). Total num frames: 819200. Throughput: 0: 904.7. Samples: 205206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:59:44,604][01623] Avg episode reward: [(0, '6.701')] -[2023-02-24 09:59:44,613][15460] Saving new best policy, reward=6.701! -[2023-02-24 09:59:49,604][01623] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3229.5). Total num frames: 839680. Throughput: 0: 898.5. Samples: 208278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 09:59:49,607][01623] Avg episode reward: [(0, '6.829')] -[2023-02-24 09:59:49,609][15460] Saving new best policy, reward=6.829! -[2023-02-24 09:59:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3230.4). Total num frames: 856064. Throughput: 0: 867.7. Samples: 213496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 09:59:54,607][01623] Avg episode reward: [(0, '6.911')] -[2023-02-24 09:59:54,621][15460] Saving new best policy, reward=6.911! -[2023-02-24 09:59:55,876][15474] Updated weights for policy 0, policy_version 210 (0.0014) -[2023-02-24 09:59:59,601][01623] Fps is (10 sec: 2868.1, 60 sec: 3481.6, 300 sec: 3216.1). Total num frames: 868352. Throughput: 0: 862.9. Samples: 217608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 09:59:59,607][01623] Avg episode reward: [(0, '7.136')] -[2023-02-24 09:59:59,612][15460] Saving new best policy, reward=7.136! -[2023-02-24 10:00:04,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3232.1). Total num frames: 888832. Throughput: 0: 872.7. Samples: 220106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:00:04,604][01623] Avg episode reward: [(0, '6.957')] -[2023-02-24 10:00:07,073][15474] Updated weights for policy 0, policy_version 220 (0.0015) -[2023-02-24 10:00:09,602][01623] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3232.9). Total num frames: 905216. Throughput: 0: 887.9. Samples: 226508. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:00:09,605][01623] Avg episode reward: [(0, '6.833')] -[2023-02-24 10:00:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3219.3). Total num frames: 917504. Throughput: 0: 822.0. Samples: 230138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:00:14,609][01623] Avg episode reward: [(0, '6.858')] -[2023-02-24 10:00:19,601][01623] Fps is (10 sec: 2457.7, 60 sec: 3345.1, 300 sec: 3206.2). Total num frames: 929792. Throughput: 0: 812.1. Samples: 231830. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:00:19,606][01623] Avg episode reward: [(0, '7.099')] -[2023-02-24 10:00:23,971][15474] Updated weights for policy 0, policy_version 230 (0.0019) -[2023-02-24 10:00:24,604][01623] Fps is (10 sec: 2456.8, 60 sec: 3208.4, 300 sec: 3193.5). Total num frames: 942080. Throughput: 0: 792.8. Samples: 235168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:00:24,608][01623] Avg episode reward: [(0, '6.945')] -[2023-02-24 10:00:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 962560. Throughput: 0: 791.2. Samples: 240810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:00:29,604][01623] Avg episode reward: [(0, '6.872')] -[2023-02-24 10:00:33,671][15474] Updated weights for policy 0, policy_version 240 (0.0014) -[2023-02-24 10:00:34,601][01623] Fps is (10 sec: 4507.0, 60 sec: 3345.1, 300 sec: 3346.2). Total num frames: 987136. Throughput: 0: 796.9. Samples: 244138. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:00:34,604][01623] Avg episode reward: [(0, '7.081')] -[2023-02-24 10:00:39,606][01623] Fps is (10 sec: 3684.5, 60 sec: 3344.8, 300 sec: 3387.8). Total num frames: 999424. Throughput: 0: 806.1. Samples: 249774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:00:39,612][01623] Avg episode reward: [(0, '7.376')] -[2023-02-24 10:00:39,616][15460] Saving new best policy, reward=7.376! -[2023-02-24 10:00:44,601][01623] Fps is (10 sec: 2867.1, 60 sec: 3276.8, 300 sec: 3415.7). Total num frames: 1015808. Throughput: 0: 805.4. Samples: 253852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:00:44,604][01623] Avg episode reward: [(0, '7.371')] -[2023-02-24 10:00:46,830][15474] Updated weights for policy 0, policy_version 250 (0.0016) -[2023-02-24 10:00:49,601][01623] Fps is (10 sec: 3688.3, 60 sec: 3277.0, 300 sec: 3415.6). Total num frames: 1036288. Throughput: 0: 807.6. Samples: 256450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:00:49,607][01623] Avg episode reward: [(0, '7.348')] -[2023-02-24 10:00:54,601][01623] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 1056768. Throughput: 0: 815.5. Samples: 263206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:00:54,609][01623] Avg episode reward: [(0, '7.778')] -[2023-02-24 10:00:54,618][15460] Saving new best policy, reward=7.778! -[2023-02-24 10:00:56,311][15474] Updated weights for policy 0, policy_version 260 (0.0014) -[2023-02-24 10:00:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 1073152. Throughput: 0: 852.5. Samples: 268500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:00:59,606][01623] Avg episode reward: [(0, '8.148')] -[2023-02-24 10:00:59,618][15460] Saving new best policy, reward=8.148! -[2023-02-24 10:01:04,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 1085440. 
Throughput: 0: 860.3. Samples: 270544. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:01:04,606][01623] Avg episode reward: [(0, '8.419')] -[2023-02-24 10:01:04,623][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth... -[2023-02-24 10:01:04,757][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth -[2023-02-24 10:01:04,776][15460] Saving new best policy, reward=8.419! -[2023-02-24 10:01:09,456][15474] Updated weights for policy 0, policy_version 270 (0.0033) -[2023-02-24 10:01:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 1105920. Throughput: 0: 888.3. Samples: 275138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:01:09,609][01623] Avg episode reward: [(0, '8.458')] -[2023-02-24 10:01:09,614][15460] Saving new best policy, reward=8.458! -[2023-02-24 10:01:14,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1126400. Throughput: 0: 907.5. Samples: 281646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:01:14,607][01623] Avg episode reward: [(0, '8.902')] -[2023-02-24 10:01:14,620][15460] Saving new best policy, reward=8.902! -[2023-02-24 10:01:19,601][01623] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1142784. Throughput: 0: 899.6. Samples: 284620. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:01:19,607][01623] Avg episode reward: [(0, '9.433')] -[2023-02-24 10:01:19,613][15460] Saving new best policy, reward=9.433! -[2023-02-24 10:01:20,813][15474] Updated weights for policy 0, policy_version 280 (0.0016) -[2023-02-24 10:01:24,604][01623] Fps is (10 sec: 2866.3, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1155072. Throughput: 0: 863.7. Samples: 288638. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:01:24,611][01623] Avg episode reward: [(0, '9.001')] -[2023-02-24 10:01:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.4). Total num frames: 1175552. Throughput: 0: 881.1. Samples: 293500. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:01:29,608][01623] Avg episode reward: [(0, '9.622')] -[2023-02-24 10:01:29,612][15460] Saving new best policy, reward=9.622! -[2023-02-24 10:01:32,365][15474] Updated weights for policy 0, policy_version 290 (0.0013) -[2023-02-24 10:01:34,601][01623] Fps is (10 sec: 4097.3, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1196032. Throughput: 0: 895.4. Samples: 296742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:01:34,604][01623] Avg episode reward: [(0, '9.686')] -[2023-02-24 10:01:34,614][15460] Saving new best policy, reward=9.686! -[2023-02-24 10:01:39,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3471.2). Total num frames: 1212416. Throughput: 0: 882.5. Samples: 302918. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:01:39,605][01623] Avg episode reward: [(0, '9.937')] -[2023-02-24 10:01:39,610][15460] Saving new best policy, reward=9.937! -[2023-02-24 10:01:44,601][01623] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1224704. Throughput: 0: 856.0. Samples: 307020. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:01:44,606][01623] Avg episode reward: [(0, '11.370')] -[2023-02-24 10:01:44,620][15460] Saving new best policy, reward=11.370! 
-[2023-02-24 10:01:45,110][15474] Updated weights for policy 0, policy_version 300 (0.0013) -[2023-02-24 10:01:49,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1241088. Throughput: 0: 855.7. Samples: 309052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:01:49,604][01623] Avg episode reward: [(0, '11.794')] -[2023-02-24 10:01:49,678][15460] Saving new best policy, reward=11.794! -[2023-02-24 10:01:54,601][01623] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1265664. Throughput: 0: 887.7. Samples: 315086. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:01:54,604][01623] Avg episode reward: [(0, '12.334')] -[2023-02-24 10:01:54,615][15460] Saving new best policy, reward=12.334! -[2023-02-24 10:01:55,546][15474] Updated weights for policy 0, policy_version 310 (0.0016) -[2023-02-24 10:01:59,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.3). Total num frames: 1282048. Throughput: 0: 882.4. Samples: 321356. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:01:59,604][01623] Avg episode reward: [(0, '13.059')] -[2023-02-24 10:01:59,607][15460] Saving new best policy, reward=13.059! -[2023-02-24 10:02:04,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1298432. Throughput: 0: 862.0. Samples: 323412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:02:04,607][01623] Avg episode reward: [(0, '12.536')] -[2023-02-24 10:02:08,502][15474] Updated weights for policy 0, policy_version 320 (0.0038) -[2023-02-24 10:02:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1314816. Throughput: 0: 868.5. Samples: 327718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:02:09,608][01623] Avg episode reward: [(0, '12.612')] -[2023-02-24 10:02:14,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1335296. Throughput: 0: 904.1. Samples: 334184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:02:14,603][01623] Avg episode reward: [(0, '13.126')] -[2023-02-24 10:02:14,616][15460] Saving new best policy, reward=13.126! -[2023-02-24 10:02:17,750][15474] Updated weights for policy 0, policy_version 330 (0.0015) -[2023-02-24 10:02:19,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1355776. Throughput: 0: 904.7. Samples: 337454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:02:19,608][01623] Avg episode reward: [(0, '12.559')] -[2023-02-24 10:02:24,601][01623] Fps is (10 sec: 3276.7, 60 sec: 3550.0, 300 sec: 3471.2). Total num frames: 1368064. Throughput: 0: 873.2. Samples: 342212. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:02:24,609][01623] Avg episode reward: [(0, '13.064')] -[2023-02-24 10:02:29,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1384448. Throughput: 0: 875.4. Samples: 346412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:02:29,609][01623] Avg episode reward: [(0, '13.035')] -[2023-02-24 10:02:31,059][15474] Updated weights for policy 0, policy_version 340 (0.0012) -[2023-02-24 10:02:34,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1404928. Throughput: 0: 898.5. Samples: 349484. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:02:34,609][01623] Avg episode reward: [(0, '11.485')] -[2023-02-24 10:02:39,602][01623] Fps is (10 sec: 4095.8, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 1425408. Throughput: 0: 909.2. Samples: 356002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:02:39,605][01623] Avg episode reward: [(0, '11.966')] -[2023-02-24 10:02:41,293][15474] Updated weights for policy 0, policy_version 350 (0.0018) -[2023-02-24 10:02:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 1441792. Throughput: 0: 872.1. Samples: 360600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:02:44,603][01623] Avg episode reward: [(0, '11.484')] -[2023-02-24 10:02:49,601][01623] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1454080. Throughput: 0: 871.2. Samples: 362618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:02:49,608][01623] Avg episode reward: [(0, '11.155')] -[2023-02-24 10:02:53,807][15474] Updated weights for policy 0, policy_version 360 (0.0025) -[2023-02-24 10:02:54,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1474560. Throughput: 0: 894.1. Samples: 367954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:02:54,609][01623] Avg episode reward: [(0, '12.310')] -[2023-02-24 10:02:59,601][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 1499136. Throughput: 0: 893.6. Samples: 374398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:02:59,603][01623] Avg episode reward: [(0, '12.829')] -[2023-02-24 10:03:04,602][01623] Fps is (10 sec: 3686.0, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 1511424. Throughput: 0: 877.3. Samples: 376932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:03:04,606][01623] Avg episode reward: [(0, '12.561')] -[2023-02-24 10:03:04,620][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000369_1511424.pth... -[2023-02-24 10:03:04,748][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000166_679936.pth -[2023-02-24 10:03:05,545][15474] Updated weights for policy 0, policy_version 370 (0.0014) -[2023-02-24 10:03:09,601][01623] Fps is (10 sec: 2457.5, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1523712. Throughput: 0: 861.6. Samples: 380982. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:03:09,608][01623] Avg episode reward: [(0, '13.109')] -[2023-02-24 10:03:14,601][01623] Fps is (10 sec: 3277.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1544192. Throughput: 0: 885.0. Samples: 386238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:03:14,608][01623] Avg episode reward: [(0, '14.178')] -[2023-02-24 10:03:14,617][15460] Saving new best policy, reward=14.178! -[2023-02-24 10:03:17,307][15474] Updated weights for policy 0, policy_version 380 (0.0025) -[2023-02-24 10:03:19,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1564672. Throughput: 0: 884.6. Samples: 389292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:03:19,609][01623] Avg episode reward: [(0, '14.786')] -[2023-02-24 10:03:19,613][15460] Saving new best policy, reward=14.786! -[2023-02-24 10:03:24,603][01623] Fps is (10 sec: 3685.9, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 1581056. Throughput: 0: 859.0. Samples: 394656. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:03:24,605][01623] Avg episode reward: [(0, '14.672')] -[2023-02-24 10:03:29,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1593344. Throughput: 0: 845.6. Samples: 398650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:03:29,610][01623] Avg episode reward: [(0, '15.984')] -[2023-02-24 10:03:29,612][15460] Saving new best policy, reward=15.984! -[2023-02-24 10:03:30,812][15474] Updated weights for policy 0, policy_version 390 (0.0031) -[2023-02-24 10:03:34,601][01623] Fps is (10 sec: 2867.6, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1609728. Throughput: 0: 848.4. Samples: 400794. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:03:34,608][01623] Avg episode reward: [(0, '15.915')] -[2023-02-24 10:03:39,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1634304. Throughput: 0: 870.8. Samples: 407138. Policy #0 lag: (min: 0.0, avg: 0.8, max: 1.0) -[2023-02-24 10:03:39,604][01623] Avg episode reward: [(0, '15.268')] -[2023-02-24 10:03:40,517][15474] Updated weights for policy 0, policy_version 400 (0.0014) -[2023-02-24 10:03:44,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1650688. Throughput: 0: 849.5. Samples: 412624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:03:44,606][01623] Avg episode reward: [(0, '15.724')] -[2023-02-24 10:03:49,606][01623] Fps is (10 sec: 2865.7, 60 sec: 3481.3, 300 sec: 3457.3). Total num frames: 1662976. Throughput: 0: 839.3. Samples: 414706. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:03:49,614][01623] Avg episode reward: [(0, '16.029')] -[2023-02-24 10:03:49,618][15460] Saving new best policy, reward=16.029! -[2023-02-24 10:03:54,602][01623] Fps is (10 sec: 2047.8, 60 sec: 3276.7, 300 sec: 3429.5). Total num frames: 1671168. Throughput: 0: 821.2. Samples: 417936. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:03:54,610][01623] Avg episode reward: [(0, '15.228')] -[2023-02-24 10:03:56,578][15474] Updated weights for policy 0, policy_version 410 (0.0014) -[2023-02-24 10:03:59,602][01623] Fps is (10 sec: 2048.9, 60 sec: 3072.0, 300 sec: 3401.8). Total num frames: 1683456. Throughput: 0: 788.0. Samples: 421698. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:03:59,605][01623] Avg episode reward: [(0, '15.284')] -[2023-02-24 10:04:04,601][01623] Fps is (10 sec: 3277.2, 60 sec: 3208.6, 300 sec: 3401.8). Total num frames: 1703936. Throughput: 0: 782.1. Samples: 424488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:04:04,607][01623] Avg episode reward: [(0, '14.918')] -[2023-02-24 10:04:09,027][15474] Updated weights for policy 0, policy_version 420 (0.0015) -[2023-02-24 10:04:09,602][01623] Fps is (10 sec: 3686.2, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 1720320. Throughput: 0: 777.8. Samples: 429658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:04:09,607][01623] Avg episode reward: [(0, '15.504')] -[2023-02-24 10:04:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1732608. Throughput: 0: 780.0. Samples: 433748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:04:14,610][01623] Avg episode reward: [(0, '15.924')] -[2023-02-24 10:04:19,601][01623] Fps is (10 sec: 3277.1, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1753088. Throughput: 0: 792.7. Samples: 436466. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:04:19,607][01623] Avg episode reward: [(0, '16.672')] -[2023-02-24 10:04:19,611][15460] Saving new best policy, reward=16.672! -[2023-02-24 10:04:21,060][15474] Updated weights for policy 0, policy_version 430 (0.0022) -[2023-02-24 10:04:24,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3208.6, 300 sec: 3401.8). Total num frames: 1773568. Throughput: 0: 790.6. Samples: 442716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:04:24,608][01623] Avg episode reward: [(0, '17.575')] -[2023-02-24 10:04:24,619][15460] Saving new best policy, reward=17.575! -[2023-02-24 10:04:29,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 1789952. Throughput: 0: 779.3. Samples: 447692. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:04:29,611][01623] Avg episode reward: [(0, '17.729')] -[2023-02-24 10:04:29,615][15460] Saving new best policy, reward=17.729! -[2023-02-24 10:04:33,839][15474] Updated weights for policy 0, policy_version 440 (0.0023) -[2023-02-24 10:04:34,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 1802240. Throughput: 0: 777.4. Samples: 449686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:04:34,608][01623] Avg episode reward: [(0, '17.100')] -[2023-02-24 10:04:39,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1822720. Throughput: 0: 811.1. Samples: 454436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:04:39,604][01623] Avg episode reward: [(0, '16.484')] -[2023-02-24 10:04:44,330][15474] Updated weights for policy 0, policy_version 450 (0.0023) -[2023-02-24 10:04:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 1843200. Throughput: 0: 868.2. Samples: 460766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:04:44,604][01623] Avg episode reward: [(0, '15.741')] -[2023-02-24 10:04:49,602][01623] Fps is (10 sec: 3276.7, 60 sec: 3208.8, 300 sec: 3387.9). Total num frames: 1855488. Throughput: 0: 866.7. Samples: 463492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:04:49,604][01623] Avg episode reward: [(0, '15.975')] -[2023-02-24 10:04:54,605][01623] Fps is (10 sec: 2866.0, 60 sec: 3344.9, 300 sec: 3401.7). Total num frames: 1871872. Throughput: 0: 844.6. Samples: 467668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:04:54,607][01623] Avg episode reward: [(0, '15.501')] -[2023-02-24 10:04:57,617][15474] Updated weights for policy 0, policy_version 460 (0.0033) -[2023-02-24 10:04:59,601][01623] Fps is (10 sec: 3686.6, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 1892352. Throughput: 0: 868.9. Samples: 472850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:04:59,604][01623] Avg episode reward: [(0, '16.011')] -[2023-02-24 10:05:04,604][01623] Fps is (10 sec: 4096.5, 60 sec: 3481.4, 300 sec: 3415.6). Total num frames: 1912832. Throughput: 0: 878.4. Samples: 475998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:05:04,607][01623] Avg episode reward: [(0, '15.949')] -[2023-02-24 10:05:04,622][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000467_1912832.pth... 
-[2023-02-24 10:05:04,745][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth -[2023-02-24 10:05:07,438][15474] Updated weights for policy 0, policy_version 470 (0.0012) -[2023-02-24 10:05:09,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3429.5). Total num frames: 1929216. Throughput: 0: 873.2. Samples: 482012. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:05:09,609][01623] Avg episode reward: [(0, '16.740')] -[2023-02-24 10:05:14,601][01623] Fps is (10 sec: 2868.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1941504. Throughput: 0: 853.3. Samples: 486092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:05:14,604][01623] Avg episode reward: [(0, '17.286')] -[2023-02-24 10:05:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3443.5). Total num frames: 1957888. Throughput: 0: 852.9. Samples: 488068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:05:19,604][01623] Avg episode reward: [(0, '18.113')] -[2023-02-24 10:05:19,612][15460] Saving new best policy, reward=18.113! -[2023-02-24 10:05:20,728][15474] Updated weights for policy 0, policy_version 480 (0.0031) -[2023-02-24 10:05:24,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1978368. Throughput: 0: 884.7. Samples: 494248. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:05:24,607][01623] Avg episode reward: [(0, '19.522')] -[2023-02-24 10:05:24,652][15460] Saving new best policy, reward=19.522! -[2023-02-24 10:05:29,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1998848. Throughput: 0: 866.8. Samples: 499774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:05:29,604][01623] Avg episode reward: [(0, '19.879')] -[2023-02-24 10:05:29,610][15460] Saving new best policy, reward=19.879! -[2023-02-24 10:05:32,357][15474] Updated weights for policy 0, policy_version 490 (0.0014) -[2023-02-24 10:05:34,604][01623] Fps is (10 sec: 3275.8, 60 sec: 3481.4, 300 sec: 3429.6). Total num frames: 2011136. Throughput: 0: 851.0. Samples: 501788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:05:34,608][01623] Avg episode reward: [(0, '19.909')] -[2023-02-24 10:05:34,623][15460] Saving new best policy, reward=19.909! -[2023-02-24 10:05:39,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2027520. Throughput: 0: 849.9. Samples: 505910. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:05:39,603][01623] Avg episode reward: [(0, '19.955')] -[2023-02-24 10:05:39,613][15460] Saving new best policy, reward=19.955! -[2023-02-24 10:05:43,763][15474] Updated weights for policy 0, policy_version 500 (0.0016) -[2023-02-24 10:05:44,601][01623] Fps is (10 sec: 3687.6, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2048000. Throughput: 0: 882.6. Samples: 512566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:05:44,608][01623] Avg episode reward: [(0, '20.631')] -[2023-02-24 10:05:44,619][15460] Saving new best policy, reward=20.631! -[2023-02-24 10:05:49,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2068480. Throughput: 0: 885.4. Samples: 515840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:05:49,610][01623] Avg episode reward: [(0, '20.844')] -[2023-02-24 10:05:49,615][15460] Saving new best policy, reward=20.844! 
-[2023-02-24 10:05:54,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.8, 300 sec: 3415.6). Total num frames: 2080768. Throughput: 0: 853.4. Samples: 520414. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:05:54,606][01623] Avg episode reward: [(0, '21.272')] -[2023-02-24 10:05:54,626][15460] Saving new best policy, reward=21.272! -[2023-02-24 10:05:56,321][15474] Updated weights for policy 0, policy_version 510 (0.0016) -[2023-02-24 10:05:59,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2097152. Throughput: 0: 860.3. Samples: 524806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:05:59,608][01623] Avg episode reward: [(0, '21.640')] -[2023-02-24 10:05:59,611][15460] Saving new best policy, reward=21.640! -[2023-02-24 10:06:04,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 2121728. Throughput: 0: 889.1. Samples: 528076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:06:04,609][01623] Avg episode reward: [(0, '21.780')] -[2023-02-24 10:06:04,624][15460] Saving new best policy, reward=21.780! -[2023-02-24 10:06:06,286][15474] Updated weights for policy 0, policy_version 520 (0.0015) -[2023-02-24 10:06:09,603][01623] Fps is (10 sec: 4504.9, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 2142208. Throughput: 0: 898.5. Samples: 534684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:06:09,610][01623] Avg episode reward: [(0, '23.552')] -[2023-02-24 10:06:09,617][15460] Saving new best policy, reward=23.552! -[2023-02-24 10:06:14,603][01623] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3429.5). Total num frames: 2154496. Throughput: 0: 874.2. Samples: 539116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:06:14,610][01623] Avg episode reward: [(0, '23.781')] -[2023-02-24 10:06:14,627][15460] Saving new best policy, reward=23.781! -[2023-02-24 10:06:19,343][15474] Updated weights for policy 0, policy_version 530 (0.0026) -[2023-02-24 10:06:19,601][01623] Fps is (10 sec: 2867.7, 60 sec: 3549.9, 300 sec: 3443.5). Total num frames: 2170880. Throughput: 0: 875.3. Samples: 541174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:06:19,608][01623] Avg episode reward: [(0, '22.682')] -[2023-02-24 10:06:24,601][01623] Fps is (10 sec: 3687.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2191360. Throughput: 0: 912.2. Samples: 546958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:06:24,609][01623] Avg episode reward: [(0, '22.237')] -[2023-02-24 10:06:28,771][15474] Updated weights for policy 0, policy_version 540 (0.0027) -[2023-02-24 10:06:29,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2211840. Throughput: 0: 910.6. Samples: 553542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:06:29,608][01623] Avg episode reward: [(0, '20.801')] -[2023-02-24 10:06:34,607][01623] Fps is (10 sec: 3684.1, 60 sec: 3618.0, 300 sec: 3443.3). Total num frames: 2228224. Throughput: 0: 886.6. Samples: 555744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:06:34,617][01623] Avg episode reward: [(0, '18.850')] -[2023-02-24 10:06:39,602][01623] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 2240512. Throughput: 0: 875.7. Samples: 559820. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:06:39,608][01623] Avg episode reward: [(0, '17.854')] -[2023-02-24 10:06:41,896][15474] Updated weights for policy 0, policy_version 550 (0.0020) -[2023-02-24 10:06:44,601][01623] Fps is (10 sec: 3278.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2260992. Throughput: 0: 907.7. Samples: 565652. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:06:44,611][01623] Avg episode reward: [(0, '18.313')] -[2023-02-24 10:06:49,601][01623] Fps is (10 sec: 4506.2, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 2285568. Throughput: 0: 908.8. Samples: 568970. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:06:49,607][01623] Avg episode reward: [(0, '18.114')] -[2023-02-24 10:06:51,963][15474] Updated weights for policy 0, policy_version 560 (0.0014) -[2023-02-24 10:06:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 2297856. Throughput: 0: 882.3. Samples: 574386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:06:54,604][01623] Avg episode reward: [(0, '18.056')] -[2023-02-24 10:06:59,602][01623] Fps is (10 sec: 2867.0, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 2314240. Throughput: 0: 874.4. Samples: 578462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:06:59,609][01623] Avg episode reward: [(0, '18.771')] -[2023-02-24 10:07:04,344][15474] Updated weights for policy 0, policy_version 570 (0.0023) -[2023-02-24 10:07:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2334720. Throughput: 0: 889.2. Samples: 581190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:07:04,604][01623] Avg episode reward: [(0, '19.929')] -[2023-02-24 10:07:04,618][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000570_2334720.pth... -[2023-02-24 10:07:04,736][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000369_1511424.pth -[2023-02-24 10:07:09,601][01623] Fps is (10 sec: 4096.3, 60 sec: 3550.0, 300 sec: 3457.3). Total num frames: 2355200. Throughput: 0: 909.9. Samples: 587904. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:07:09,604][01623] Avg episode reward: [(0, '19.326')] -[2023-02-24 10:07:14,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3443.4). Total num frames: 2371584. Throughput: 0: 878.3. Samples: 593066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:07:14,604][01623] Avg episode reward: [(0, '18.592')] -[2023-02-24 10:07:15,816][15474] Updated weights for policy 0, policy_version 580 (0.0016) -[2023-02-24 10:07:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2383872. Throughput: 0: 875.5. Samples: 595138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:07:19,608][01623] Avg episode reward: [(0, '18.477')] -[2023-02-24 10:07:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2404352. Throughput: 0: 898.0. Samples: 600228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:07:24,606][01623] Avg episode reward: [(0, '20.244')] -[2023-02-24 10:07:27,745][15474] Updated weights for policy 0, policy_version 590 (0.0019) -[2023-02-24 10:07:29,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2420736. Throughput: 0: 883.7. Samples: 605420. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:07:29,606][01623] Avg episode reward: [(0, '19.592')] -[2023-02-24 10:07:34,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.4, 300 sec: 3401.8). Total num frames: 2428928. Throughput: 0: 851.5. Samples: 607288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:07:34,604][01623] Avg episode reward: [(0, '20.107')] -[2023-02-24 10:07:39,601][01623] Fps is (10 sec: 2048.0, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2441216. Throughput: 0: 803.9. Samples: 610562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:07:39,609][01623] Avg episode reward: [(0, '21.932')] -[2023-02-24 10:07:44,063][15474] Updated weights for policy 0, policy_version 600 (0.0026) -[2023-02-24 10:07:44,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 2457600. Throughput: 0: 804.8. Samples: 614676. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:07:44,604][01623] Avg episode reward: [(0, '22.029')] -[2023-02-24 10:07:49,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 2478080. Throughput: 0: 817.5. Samples: 617978. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:07:49,604][01623] Avg episode reward: [(0, '24.077')] -[2023-02-24 10:07:49,619][15460] Saving new best policy, reward=24.077! -[2023-02-24 10:07:53,257][15474] Updated weights for policy 0, policy_version 610 (0.0019) -[2023-02-24 10:07:54,603][01623] Fps is (10 sec: 4504.8, 60 sec: 3413.2, 300 sec: 3401.7). Total num frames: 2502656. Throughput: 0: 814.6. Samples: 624564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:07:54,638][01623] Avg episode reward: [(0, '21.676')] -[2023-02-24 10:07:59,603][01623] Fps is (10 sec: 3685.6, 60 sec: 3345.0, 300 sec: 3401.8). Total num frames: 2514944. Throughput: 0: 806.0. Samples: 629338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:07:59,607][01623] Avg episode reward: [(0, '21.667')] -[2023-02-24 10:08:04,601][01623] Fps is (10 sec: 2867.7, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 2531328. Throughput: 0: 806.5. Samples: 631432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:08:04,603][01623] Avg episode reward: [(0, '20.829')] -[2023-02-24 10:08:06,462][15474] Updated weights for policy 0, policy_version 620 (0.0014) -[2023-02-24 10:08:09,601][01623] Fps is (10 sec: 3687.2, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 2551808. Throughput: 0: 815.6. Samples: 636932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:08:09,604][01623] Avg episode reward: [(0, '21.048')] -[2023-02-24 10:08:14,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2572288. Throughput: 0: 850.2. Samples: 643680. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:08:14,609][01623] Avg episode reward: [(0, '21.229')] -[2023-02-24 10:08:16,146][15474] Updated weights for policy 0, policy_version 630 (0.0034) -[2023-02-24 10:08:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3415.7). Total num frames: 2588672. Throughput: 0: 864.9. Samples: 646208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:08:19,604][01623] Avg episode reward: [(0, '20.770')] -[2023-02-24 10:08:24,604][01623] Fps is (10 sec: 2866.4, 60 sec: 3276.7, 300 sec: 3415.6). Total num frames: 2600960. Throughput: 0: 886.5. Samples: 650456. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:08:24,609][01623] Avg episode reward: [(0, '22.123')] -[2023-02-24 10:08:28,723][15474] Updated weights for policy 0, policy_version 640 (0.0017) -[2023-02-24 10:08:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 2621440. Throughput: 0: 918.0. Samples: 655986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:08:29,609][01623] Avg episode reward: [(0, '22.073')] -[2023-02-24 10:08:34,601][01623] Fps is (10 sec: 4506.8, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 2646016. Throughput: 0: 917.7. Samples: 659276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:08:34,608][01623] Avg episode reward: [(0, '21.749')] -[2023-02-24 10:08:39,462][15474] Updated weights for policy 0, policy_version 650 (0.0015) -[2023-02-24 10:08:39,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3429.5). Total num frames: 2662400. Throughput: 0: 899.0. Samples: 665018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:08:39,607][01623] Avg episode reward: [(0, '21.415')] -[2023-02-24 10:08:44,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3429.6). Total num frames: 2674688. Throughput: 0: 885.6. Samples: 669188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:08:44,606][01623] Avg episode reward: [(0, '21.579')] -[2023-02-24 10:08:49,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 2695168. Throughput: 0: 890.8. Samples: 671520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:08:49,607][01623] Avg episode reward: [(0, '22.475')] -[2023-02-24 10:08:51,080][15474] Updated weights for policy 0, policy_version 660 (0.0012) -[2023-02-24 10:08:54,601][01623] Fps is (10 sec: 4505.7, 60 sec: 3618.2, 300 sec: 3512.8). Total num frames: 2719744. Throughput: 0: 918.0. Samples: 678240. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:08:54,607][01623] Avg episode reward: [(0, '21.894')] -[2023-02-24 10:08:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3485.1). Total num frames: 2732032. Throughput: 0: 893.2. Samples: 683874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:08:59,604][01623] Avg episode reward: [(0, '22.262')] -[2023-02-24 10:09:02,695][15474] Updated weights for policy 0, policy_version 670 (0.0017) -[2023-02-24 10:09:04,602][01623] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2748416. Throughput: 0: 885.4. Samples: 686052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:09:04,605][01623] Avg episode reward: [(0, '24.232')] -[2023-02-24 10:09:04,622][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth... -[2023-02-24 10:09:04,776][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000467_1912832.pth -[2023-02-24 10:09:04,789][15460] Saving new best policy, reward=24.232! -[2023-02-24 10:09:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2764800. Throughput: 0: 892.8. Samples: 690628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:09:09,604][01623] Avg episode reward: [(0, '24.161')] -[2023-02-24 10:09:13,529][15474] Updated weights for policy 0, policy_version 680 (0.0012) -[2023-02-24 10:09:14,601][01623] Fps is (10 sec: 4096.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 2789376. Throughput: 0: 910.6. Samples: 696964. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:09:14,608][01623] Avg episode reward: [(0, '24.324')] -[2023-02-24 10:09:14,621][15460] Saving new best policy, reward=24.324! -[2023-02-24 10:09:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2801664. Throughput: 0: 895.1. Samples: 699554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:09:19,605][01623] Avg episode reward: [(0, '24.367')] -[2023-02-24 10:09:19,610][15460] Saving new best policy, reward=24.367! -[2023-02-24 10:09:24,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3550.0, 300 sec: 3471.2). Total num frames: 2813952. Throughput: 0: 854.8. Samples: 703486. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:09:24,606][01623] Avg episode reward: [(0, '23.909')] -[2023-02-24 10:09:28,437][15474] Updated weights for policy 0, policy_version 690 (0.0030) -[2023-02-24 10:09:29,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2830336. Throughput: 0: 853.2. Samples: 707580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:09:29,607][01623] Avg episode reward: [(0, '23.510')] -[2023-02-24 10:09:34,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2850816. Throughput: 0: 872.7. Samples: 710790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:09:34,604][01623] Avg episode reward: [(0, '22.200')] -[2023-02-24 10:09:37,931][15474] Updated weights for policy 0, policy_version 700 (0.0012) -[2023-02-24 10:09:39,601][01623] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2871296. Throughput: 0: 865.6. Samples: 717190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:09:39,611][01623] Avg episode reward: [(0, '22.911')] -[2023-02-24 10:09:44,601][01623] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2883584. Throughput: 0: 832.7. Samples: 721344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:09:44,610][01623] Avg episode reward: [(0, '22.923')] -[2023-02-24 10:09:49,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2899968. Throughput: 0: 828.7. Samples: 723344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:09:49,608][01623] Avg episode reward: [(0, '22.383')] -[2023-02-24 10:09:51,240][15474] Updated weights for policy 0, policy_version 710 (0.0025) -[2023-02-24 10:09:54,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2920448. Throughput: 0: 857.6. Samples: 729220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:09:54,603][01623] Avg episode reward: [(0, '23.340')] -[2023-02-24 10:09:59,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2940928. Throughput: 0: 861.3. Samples: 735722. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:09:59,603][01623] Avg episode reward: [(0, '22.931')] -[2023-02-24 10:10:01,844][15474] Updated weights for policy 0, policy_version 720 (0.0036) -[2023-02-24 10:10:04,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3471.2). Total num frames: 2953216. Throughput: 0: 849.6. Samples: 737786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:10:04,608][01623] Avg episode reward: [(0, '21.784')] -[2023-02-24 10:10:09,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2969600. Throughput: 0: 855.0. Samples: 741962. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:10:09,607][01623] Avg episode reward: [(0, '21.709')] -[2023-02-24 10:10:13,855][15474] Updated weights for policy 0, policy_version 730 (0.0015) -[2023-02-24 10:10:14,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 2990080. Throughput: 0: 897.3. Samples: 747960. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:10:14,603][01623] Avg episode reward: [(0, '20.498')] -[2023-02-24 10:10:19,601][01623] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3014656. Throughput: 0: 899.9. Samples: 751286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:10:19,604][01623] Avg episode reward: [(0, '21.542')] -[2023-02-24 10:10:24,603][01623] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3485.0). Total num frames: 3026944. Throughput: 0: 872.2. Samples: 756440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:10:24,610][01623] Avg episode reward: [(0, '22.666')] -[2023-02-24 10:10:25,462][15474] Updated weights for policy 0, policy_version 740 (0.0013) -[2023-02-24 10:10:29,603][01623] Fps is (10 sec: 2457.1, 60 sec: 3481.5, 300 sec: 3485.1). Total num frames: 3039232. Throughput: 0: 869.5. Samples: 760472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:10:29,606][01623] Avg episode reward: [(0, '24.674')] -[2023-02-24 10:10:29,610][15460] Saving new best policy, reward=24.674! -[2023-02-24 10:10:34,601][01623] Fps is (10 sec: 3277.5, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3059712. Throughput: 0: 880.5. Samples: 762968. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:10:34,604][01623] Avg episode reward: [(0, '24.153')] -[2023-02-24 10:10:36,776][15474] Updated weights for policy 0, policy_version 750 (0.0014) -[2023-02-24 10:10:39,601][01623] Fps is (10 sec: 4506.5, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3084288. Throughput: 0: 897.2. Samples: 769592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:10:39,604][01623] Avg episode reward: [(0, '23.399')] -[2023-02-24 10:10:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3096576. Throughput: 0: 871.3. Samples: 774932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:10:44,608][01623] Avg episode reward: [(0, '21.750')] -[2023-02-24 10:10:49,245][15474] Updated weights for policy 0, policy_version 760 (0.0034) -[2023-02-24 10:10:49,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3112960. Throughput: 0: 872.8. Samples: 777062. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:10:49,603][01623] Avg episode reward: [(0, '19.816')] -[2023-02-24 10:10:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3133440. Throughput: 0: 889.6. Samples: 781994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:10:54,609][01623] Avg episode reward: [(0, '17.983')] -[2023-02-24 10:10:59,206][15474] Updated weights for policy 0, policy_version 770 (0.0014) -[2023-02-24 10:10:59,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3153920. Throughput: 0: 903.1. Samples: 788598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:10:59,609][01623] Avg episode reward: [(0, '18.945')] -[2023-02-24 10:11:04,602][01623] Fps is (10 sec: 3686.0, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 3170304. Throughput: 0: 892.8. Samples: 791464. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:11:04,608][01623] Avg episode reward: [(0, '19.522')] -[2023-02-24 10:11:04,620][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth... -[2023-02-24 10:11:04,758][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000570_2334720.pth -[2023-02-24 10:11:09,602][01623] Fps is (10 sec: 2457.3, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 3178496. Throughput: 0: 853.0. Samples: 794826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:11:09,605][01623] Avg episode reward: [(0, '19.659')] -[2023-02-24 10:11:14,601][01623] Fps is (10 sec: 2048.2, 60 sec: 3345.1, 300 sec: 3457.3). Total num frames: 3190784. Throughput: 0: 833.0. Samples: 797954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:11:14,610][01623] Avg episode reward: [(0, '19.986')] -[2023-02-24 10:11:15,992][15474] Updated weights for policy 0, policy_version 780 (0.0059) -[2023-02-24 10:11:19,601][01623] Fps is (10 sec: 2457.9, 60 sec: 3140.3, 300 sec: 3429.5). Total num frames: 3203072. Throughput: 0: 812.4. Samples: 799524. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:11:19,604][01623] Avg episode reward: [(0, '20.589')] -[2023-02-24 10:11:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3429.5). Total num frames: 3223552. Throughput: 0: 796.8. Samples: 805450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:11:24,604][01623] Avg episode reward: [(0, '21.398')] -[2023-02-24 10:11:26,498][15474] Updated weights for policy 0, policy_version 790 (0.0013) -[2023-02-24 10:11:29,606][01623] Fps is (10 sec: 4093.9, 60 sec: 3413.2, 300 sec: 3443.4). Total num frames: 3244032. Throughput: 0: 803.0. Samples: 811070. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:11:29,613][01623] Avg episode reward: [(0, '20.400')] -[2023-02-24 10:11:34,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3443.4). Total num frames: 3256320. Throughput: 0: 800.9. Samples: 813104. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:11:34,606][01623] Avg episode reward: [(0, '21.088')] -[2023-02-24 10:11:39,601][01623] Fps is (10 sec: 2868.7, 60 sec: 3140.3, 300 sec: 3429.5). Total num frames: 3272704. Throughput: 0: 777.4. Samples: 816976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:11:39,606][01623] Avg episode reward: [(0, '20.692')] -[2023-02-24 10:11:40,214][15474] Updated weights for policy 0, policy_version 800 (0.0028) -[2023-02-24 10:11:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 3293184. Throughput: 0: 770.7. Samples: 823278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:11:44,604][01623] Avg episode reward: [(0, '21.360')] -[2023-02-24 10:11:49,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3309568. Throughput: 0: 771.4. Samples: 826178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:11:49,604][01623] Avg episode reward: [(0, '20.705')] -[2023-02-24 10:11:52,431][15474] Updated weights for policy 0, policy_version 810 (0.0012) -[2023-02-24 10:11:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3415.7). Total num frames: 3321856. Throughput: 0: 786.6. Samples: 830222. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:11:54,608][01623] Avg episode reward: [(0, '20.444')] -[2023-02-24 10:11:59,603][01623] Fps is (10 sec: 2866.6, 60 sec: 3071.9, 300 sec: 3401.7). Total num frames: 3338240. Throughput: 0: 807.9. Samples: 834312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:11:59,612][01623] Avg episode reward: [(0, '21.143')] -[2023-02-24 10:12:04,452][15474] Updated weights for policy 0, policy_version 820 (0.0021) -[2023-02-24 10:12:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 3358720. Throughput: 0: 842.8. Samples: 837450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:12:04,604][01623] Avg episode reward: [(0, '21.276')] -[2023-02-24 10:12:09,602][01623] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3375104. Throughput: 0: 848.2. Samples: 843618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:12:09,607][01623] Avg episode reward: [(0, '21.101')] -[2023-02-24 10:12:14,608][01623] Fps is (10 sec: 2865.4, 60 sec: 3276.5, 300 sec: 3401.7). Total num frames: 3387392. Throughput: 0: 809.7. Samples: 847508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:12:14,618][01623] Avg episode reward: [(0, '21.444')] -[2023-02-24 10:12:17,960][15474] Updated weights for policy 0, policy_version 830 (0.0036) -[2023-02-24 10:12:19,602][01623] Fps is (10 sec: 2867.4, 60 sec: 3345.0, 300 sec: 3387.9). Total num frames: 3403776. Throughput: 0: 811.2. Samples: 849610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:12:19,608][01623] Avg episode reward: [(0, '21.444')] -[2023-02-24 10:12:24,601][01623] Fps is (10 sec: 3688.7, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3424256. Throughput: 0: 852.8. Samples: 855350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:12:24,604][01623] Avg episode reward: [(0, '22.164')] -[2023-02-24 10:12:28,049][15474] Updated weights for policy 0, policy_version 840 (0.0019) -[2023-02-24 10:12:29,606][01623] Fps is (10 sec: 4094.1, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 3444736. Throughput: 0: 845.7. Samples: 861340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:12:29,614][01623] Avg episode reward: [(0, '21.595')] -[2023-02-24 10:12:34,603][01623] Fps is (10 sec: 3276.2, 60 sec: 3345.0, 300 sec: 3443.4). Total num frames: 3457024. Throughput: 0: 822.1. Samples: 863176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:12:34,609][01623] Avg episode reward: [(0, '22.144')] -[2023-02-24 10:12:39,601][01623] Fps is (10 sec: 2458.9, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3469312. Throughput: 0: 815.2. Samples: 866906. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:12:39,604][01623] Avg episode reward: [(0, '22.594')] -[2023-02-24 10:12:42,299][15474] Updated weights for policy 0, policy_version 850 (0.0016) -[2023-02-24 10:12:44,601][01623] Fps is (10 sec: 3277.5, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3489792. Throughput: 0: 844.1. Samples: 872296. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:12:44,604][01623] Avg episode reward: [(0, '22.530')] -[2023-02-24 10:12:49,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 3510272. Throughput: 0: 841.7. Samples: 875326. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:12:49,604][01623] Avg episode reward: [(0, '21.928')] -[2023-02-24 10:12:54,081][15474] Updated weights for policy 0, policy_version 860 (0.0020) -[2023-02-24 10:12:54,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 3522560. Throughput: 0: 814.1. Samples: 880250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:12:54,605][01623] Avg episode reward: [(0, '22.352')] -[2023-02-24 10:12:59,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3276.9, 300 sec: 3401.8). Total num frames: 3534848. Throughput: 0: 814.2. Samples: 884144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:12:59,609][01623] Avg episode reward: [(0, '22.036')] -[2023-02-24 10:13:04,602][01623] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3555328. Throughput: 0: 826.2. Samples: 886790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:13:04,605][01623] Avg episode reward: [(0, '21.063')] -[2023-02-24 10:13:04,615][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000868_3555328.pth... -[2023-02-24 10:13:04,730][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth -[2023-02-24 10:13:06,324][15474] Updated weights for policy 0, policy_version 870 (0.0019) -[2023-02-24 10:13:09,602][01623] Fps is (10 sec: 4095.8, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3575808. Throughput: 0: 835.1. Samples: 892930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:13:09,609][01623] Avg episode reward: [(0, '23.167')] -[2023-02-24 10:13:14,601][01623] Fps is (10 sec: 3276.9, 60 sec: 3345.4, 300 sec: 3387.9). Total num frames: 3588096. Throughput: 0: 809.3. Samples: 897754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:13:14,604][01623] Avg episode reward: [(0, '23.379')] -[2023-02-24 10:13:19,369][15474] Updated weights for policy 0, policy_version 880 (0.0012) -[2023-02-24 10:13:19,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3604480. Throughput: 0: 813.7. Samples: 899790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:13:19,604][01623] Avg episode reward: [(0, '24.310')] -[2023-02-24 10:13:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 3620864. Throughput: 0: 837.6. Samples: 904596. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:13:24,608][01623] Avg episode reward: [(0, '25.492')] -[2023-02-24 10:13:24,618][15460] Saving new best policy, reward=25.492! -[2023-02-24 10:13:29,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3277.1, 300 sec: 3374.0). Total num frames: 3641344. Throughput: 0: 858.0. Samples: 910908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:13:29,607][01623] Avg episode reward: [(0, '24.617')] -[2023-02-24 10:13:29,630][15474] Updated weights for policy 0, policy_version 890 (0.0018) -[2023-02-24 10:13:34,604][01623] Fps is (10 sec: 3685.2, 60 sec: 3345.0, 300 sec: 3374.0). Total num frames: 3657728. Throughput: 0: 855.7. Samples: 913836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:13:34,609][01623] Avg episode reward: [(0, '26.436')] -[2023-02-24 10:13:34,628][15460] Saving new best policy, reward=26.436! -[2023-02-24 10:13:39,603][01623] Fps is (10 sec: 2866.7, 60 sec: 3345.0, 300 sec: 3374.0). Total num frames: 3670016. Throughput: 0: 832.0. Samples: 917692. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:13:39,612][01623] Avg episode reward: [(0, '27.368')] -[2023-02-24 10:13:39,620][15460] Saving new best policy, reward=27.368! -[2023-02-24 10:13:43,654][15474] Updated weights for policy 0, policy_version 900 (0.0013) -[2023-02-24 10:13:44,601][01623] Fps is (10 sec: 3277.9, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 3690496. Throughput: 0: 848.8. Samples: 922338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:13:44,610][01623] Avg episode reward: [(0, '27.946')] -[2023-02-24 10:13:44,625][15460] Saving new best policy, reward=27.946! -[2023-02-24 10:13:49,601][01623] Fps is (10 sec: 4096.6, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3710976. Throughput: 0: 859.9. Samples: 925484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:13:49,608][01623] Avg episode reward: [(0, '28.720')] -[2023-02-24 10:13:49,611][15460] Saving new best policy, reward=28.720! -[2023-02-24 10:13:53,752][15474] Updated weights for policy 0, policy_version 910 (0.0026) -[2023-02-24 10:13:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3727360. Throughput: 0: 858.2. Samples: 931548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:13:54,605][01623] Avg episode reward: [(0, '27.513')] -[2023-02-24 10:13:59,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 3739648. Throughput: 0: 844.0. Samples: 935736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:13:59,611][01623] Avg episode reward: [(0, '27.387')] -[2023-02-24 10:14:04,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3756032. Throughput: 0: 844.4. Samples: 937790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:14:04,604][01623] Avg episode reward: [(0, '28.952')] -[2023-02-24 10:14:04,615][15460] Saving new best policy, reward=28.952! -[2023-02-24 10:14:06,782][15474] Updated weights for policy 0, policy_version 920 (0.0022) -[2023-02-24 10:14:09,601][01623] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3346.2). Total num frames: 3776512. Throughput: 0: 870.1. Samples: 943750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:14:09,608][01623] Avg episode reward: [(0, '27.557')] -[2023-02-24 10:14:14,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 3796992. Throughput: 0: 863.6. Samples: 949768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:14:14,609][01623] Avg episode reward: [(0, '26.532')] -[2023-02-24 10:14:18,176][15474] Updated weights for policy 0, policy_version 930 (0.0016) -[2023-02-24 10:14:19,602][01623] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3809280. Throughput: 0: 843.1. Samples: 951774. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:14:19,609][01623] Avg episode reward: [(0, '26.854')] -[2023-02-24 10:14:24,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3825664. Throughput: 0: 848.5. Samples: 955872. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 10:14:24,610][01623] Avg episode reward: [(0, '25.906')] -[2023-02-24 10:14:29,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3846144. Throughput: 0: 874.9. Samples: 961710. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:14:29,610][01623] Avg episode reward: [(0, '27.590')] -[2023-02-24 10:14:30,188][15474] Updated weights for policy 0, policy_version 940 (0.0020) -[2023-02-24 10:14:34,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.8, 300 sec: 3374.0). Total num frames: 3866624. Throughput: 0: 873.6. Samples: 964796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:14:34,604][01623] Avg episode reward: [(0, '26.828')] -[2023-02-24 10:14:39,601][01623] Fps is (10 sec: 3276.9, 60 sec: 3481.7, 300 sec: 3374.0). Total num frames: 3878912. Throughput: 0: 851.1. Samples: 969846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:14:39,604][01623] Avg episode reward: [(0, '25.900')] -[2023-02-24 10:14:43,678][15474] Updated weights for policy 0, policy_version 950 (0.0022) -[2023-02-24 10:14:44,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3891200. Throughput: 0: 835.8. Samples: 973346. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:14:44,606][01623] Avg episode reward: [(0, '26.237')] -[2023-02-24 10:14:49,604][01623] Fps is (10 sec: 2456.8, 60 sec: 3208.4, 300 sec: 3332.3). Total num frames: 3903488. Throughput: 0: 825.8. Samples: 974952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:14:49,607][01623] Avg episode reward: [(0, '25.226')] -[2023-02-24 10:14:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 3919872. Throughput: 0: 780.5. Samples: 978874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:14:54,608][01623] Avg episode reward: [(0, '24.419')] -[2023-02-24 10:14:57,279][15474] Updated weights for policy 0, policy_version 960 (0.0034) -[2023-02-24 10:14:59,601][01623] Fps is (10 sec: 3277.9, 60 sec: 3276.8, 300 sec: 3332.3). Total num frames: 3936256. Throughput: 0: 778.3. Samples: 984792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:14:59,605][01623] Avg episode reward: [(0, '24.662')] -[2023-02-24 10:15:04,607][01623] Fps is (10 sec: 3274.8, 60 sec: 3276.5, 300 sec: 3332.3). Total num frames: 3952640. Throughput: 0: 779.6. Samples: 986862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:15:04,610][01623] Avg episode reward: [(0, '24.905')] -[2023-02-24 10:15:04,628][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000965_3952640.pth... -[2023-02-24 10:15:04,773][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth -[2023-02-24 10:15:09,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3304.6). Total num frames: 3964928. Throughput: 0: 778.5. Samples: 990904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:15:09,604][01623] Avg episode reward: [(0, '25.501')] -[2023-02-24 10:15:10,712][15474] Updated weights for policy 0, policy_version 970 (0.0027) -[2023-02-24 10:15:14,601][01623] Fps is (10 sec: 3278.8, 60 sec: 3140.3, 300 sec: 3290.7). Total num frames: 3985408. Throughput: 0: 783.4. Samples: 996964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:15:14,608][01623] Avg episode reward: [(0, '24.148')] -[2023-02-24 10:15:19,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 4005888. Throughput: 0: 782.5. Samples: 1000010. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:15:19,604][01623] Avg episode reward: [(0, '23.874')] -[2023-02-24 10:15:21,850][15474] Updated weights for policy 0, policy_version 980 (0.0015) -[2023-02-24 10:15:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 4018176. Throughput: 0: 772.6. Samples: 1004612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:15:24,604][01623] Avg episode reward: [(0, '24.898')] -[2023-02-24 10:15:29,602][01623] Fps is (10 sec: 2866.9, 60 sec: 3140.2, 300 sec: 3304.6). Total num frames: 4034560. Throughput: 0: 780.9. Samples: 1008488. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:15:29,609][01623] Avg episode reward: [(0, '24.381')] -[2023-02-24 10:15:34,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3276.8). Total num frames: 4050944. Throughput: 0: 803.7. Samples: 1011116. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 10:15:34,606][01623] Avg episode reward: [(0, '23.606')] -[2023-02-24 10:15:34,727][15474] Updated weights for policy 0, policy_version 990 (0.0023) -[2023-02-24 10:15:39,601][01623] Fps is (10 sec: 3686.7, 60 sec: 3208.5, 300 sec: 3304.6). Total num frames: 4071424. Throughput: 0: 852.1. Samples: 1017218. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:15:39,611][01623] Avg episode reward: [(0, '22.623')] -[2023-02-24 10:15:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4087808. Throughput: 0: 822.6. Samples: 1021808. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 10:15:44,603][01623] Avg episode reward: [(0, '23.347')] -[2023-02-24 10:15:47,916][15474] Updated weights for policy 0, policy_version 1000 (0.0035) -[2023-02-24 10:15:49,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3277.0, 300 sec: 3276.8). Total num frames: 4100096. Throughput: 0: 818.0. Samples: 1023666. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:15:49,609][01623] Avg episode reward: [(0, '23.191')] -[2023-02-24 10:15:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4116480. Throughput: 0: 833.9. Samples: 1028430. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:15:54,608][01623] Avg episode reward: [(0, '25.141')] -[2023-02-24 10:15:59,076][15474] Updated weights for policy 0, policy_version 1010 (0.0022) -[2023-02-24 10:15:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4136960. Throughput: 0: 835.3. Samples: 1034554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:15:59,609][01623] Avg episode reward: [(0, '25.842')] -[2023-02-24 10:16:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.4, 300 sec: 3304.6). Total num frames: 4153344. Throughput: 0: 823.3. Samples: 1037058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:16:04,604][01623] Avg episode reward: [(0, '26.643')] -[2023-02-24 10:16:09,601][01623] Fps is (10 sec: 2867.1, 60 sec: 3345.1, 300 sec: 3304.6). Total num frames: 4165632. Throughput: 0: 805.3. Samples: 1040852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:16:09,604][01623] Avg episode reward: [(0, '26.817')] -[2023-02-24 10:16:13,184][15474] Updated weights for policy 0, policy_version 1020 (0.0019) -[2023-02-24 10:16:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 4182016. Throughput: 0: 824.2. Samples: 1045576. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:16:14,607][01623] Avg episode reward: [(0, '26.849')] -[2023-02-24 10:16:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 4202496. Throughput: 0: 832.0. Samples: 1048558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:16:19,606][01623] Avg episode reward: [(0, '27.046')] -[2023-02-24 10:16:24,323][15474] Updated weights for policy 0, policy_version 1030 (0.0027) -[2023-02-24 10:16:24,602][01623] Fps is (10 sec: 3686.3, 60 sec: 3345.0, 300 sec: 3304.6). Total num frames: 4218880. Throughput: 0: 819.2. Samples: 1054080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:16:24,605][01623] Avg episode reward: [(0, '26.736')] -[2023-02-24 10:16:29,603][01623] Fps is (10 sec: 2866.7, 60 sec: 3276.8, 300 sec: 3304.5). Total num frames: 4231168. Throughput: 0: 810.2. Samples: 1058268. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:16:29,608][01623] Avg episode reward: [(0, '25.177')] -[2023-02-24 10:16:34,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4247552. Throughput: 0: 814.0. Samples: 1060294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:16:34,603][01623] Avg episode reward: [(0, '24.502')] -[2023-02-24 10:16:36,892][15474] Updated weights for policy 0, policy_version 1040 (0.0039) -[2023-02-24 10:16:39,601][01623] Fps is (10 sec: 3687.1, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4268032. Throughput: 0: 844.4. Samples: 1066426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:16:39,610][01623] Avg episode reward: [(0, '25.174')] -[2023-02-24 10:16:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4288512. Throughput: 0: 836.5. Samples: 1072196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:16:44,610][01623] Avg episode reward: [(0, '24.707')] -[2023-02-24 10:16:48,937][15474] Updated weights for policy 0, policy_version 1050 (0.0014) -[2023-02-24 10:16:49,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4300800. Throughput: 0: 827.1. Samples: 1074276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:16:49,610][01623] Avg episode reward: [(0, '26.092')] -[2023-02-24 10:16:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4317184. Throughput: 0: 834.7. Samples: 1078412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:16:54,603][01623] Avg episode reward: [(0, '26.249')] -[2023-02-24 10:16:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4337664. Throughput: 0: 868.2. Samples: 1084646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:16:59,604][01623] Avg episode reward: [(0, '26.634')] -[2023-02-24 10:17:00,178][15474] Updated weights for policy 0, policy_version 1060 (0.0031) -[2023-02-24 10:17:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4354048. Throughput: 0: 869.4. Samples: 1087680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:17:04,609][01623] Avg episode reward: [(0, '27.748')] -[2023-02-24 10:17:04,624][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001063_4354048.pth... 
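The Saving/Removing pairs throughout this run implement a simple rotation: each new checkpoint is written as checkpoint_<policy_version>_<env_frames>.pth and an older file is deleted shortly afterwards. A minimal stdlib-only sketch for decoding and listing such filenames (the directory path is the one used by this run; the helper names are ad hoc, not part of Sample Factory):

```python
import re
from pathlib import Path

# Filenames in this log look like: checkpoint_000001063_4354048.pth
# i.e. checkpoint_<policy version, zero-padded>_<total env frames>.pth
CKPT_RE = re.compile(r"checkpoint_(\d+)_(\d+)\.pth$")

def parse_checkpoint_name(name: str) -> tuple[int, int]:
    """Return (policy_version, env_frames) parsed from a checkpoint filename."""
    m = CKPT_RE.search(name)
    if m is None:
        raise ValueError(f"not a checkpoint filename: {name}")
    return int(m.group(1)), int(m.group(2))

def list_checkpoints(ckpt_dir: str) -> list[tuple[int, int, Path]]:
    """List checkpoints in a directory, oldest first (sorted by env frames)."""
    entries = []
    for p in Path(ckpt_dir).glob("checkpoint_*.pth"):
        version, frames = parse_checkpoint_name(p.name)
        entries.append((frames, version, p))
    return sorted(entries)

if __name__ == "__main__":
    # Example value taken directly from the log above.
    print(parse_checkpoint_name("checkpoint_000001063_4354048.pth"))  # (1063, 4354048)
    # Directory used by this particular run (Colab-style /content prefix):
    # list_checkpoints("/content/train_dir/default_experiment/checkpoint_p0")
```

In this log the two numbers are locked together: every checkpoint filename satisfies env_frames = policy_version * 4096 (for example 1100 * 4096 = 4505600).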
-[2023-02-24 10:17:04,779][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000868_3555328.pth -[2023-02-24 10:17:09,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4366336. Throughput: 0: 841.5. Samples: 1091948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:17:09,605][01623] Avg episode reward: [(0, '28.713')] -[2023-02-24 10:17:14,252][15474] Updated weights for policy 0, policy_version 1070 (0.0013) -[2023-02-24 10:17:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4382720. Throughput: 0: 836.7. Samples: 1095918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:17:14,609][01623] Avg episode reward: [(0, '28.407')] -[2023-02-24 10:17:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4403200. Throughput: 0: 856.5. Samples: 1098838. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:17:19,604][01623] Avg episode reward: [(0, '30.109')] -[2023-02-24 10:17:19,606][15460] Saving new best policy, reward=30.109! -[2023-02-24 10:17:24,248][15474] Updated weights for policy 0, policy_version 1080 (0.0013) -[2023-02-24 10:17:24,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3318.5). Total num frames: 4423680. Throughput: 0: 856.7. Samples: 1104978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:17:24,609][01623] Avg episode reward: [(0, '29.185')] -[2023-02-24 10:17:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3318.5). Total num frames: 4435968. Throughput: 0: 823.5. Samples: 1109254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:17:29,606][01623] Avg episode reward: [(0, '28.348')] -[2023-02-24 10:17:34,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4448256. Throughput: 0: 818.8. Samples: 1111124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:17:34,610][01623] Avg episode reward: [(0, '27.903')] -[2023-02-24 10:17:38,757][15474] Updated weights for policy 0, policy_version 1090 (0.0040) -[2023-02-24 10:17:39,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4464640. Throughput: 0: 830.8. Samples: 1115796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:17:39,610][01623] Avg episode reward: [(0, '28.200')] -[2023-02-24 10:17:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4485120. Throughput: 0: 820.6. Samples: 1121572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:17:44,607][01623] Avg episode reward: [(0, '29.896')] -[2023-02-24 10:17:49,601][01623] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4497408. Throughput: 0: 806.2. Samples: 1123958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:17:49,611][01623] Avg episode reward: [(0, '29.531')] -[2023-02-24 10:17:51,520][15460] Stopping Batcher_0... -[2023-02-24 10:17:51,522][15460] Loop batcher_evt_loop terminating... -[2023-02-24 10:17:51,523][01623] Component Batcher_0 stopped! -[2023-02-24 10:17:51,528][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... -[2023-02-24 10:17:51,545][15474] Updated weights for policy 0, policy_version 1100 (0.0014) -[2023-02-24 10:17:51,605][15474] Weights refcount: 2 0 -[2023-02-24 10:17:51,619][01623] Component InferenceWorker_p0-w0 stopped! 
-[2023-02-24 10:17:51,623][15474] Stopping InferenceWorker_p0-w0...
-[2023-02-24 10:17:51,632][15474] Loop inference_proc0-0_evt_loop terminating...
-[2023-02-24 10:17:51,674][01623] Component RolloutWorker_w7 stopped!
-[2023-02-24 10:17:51,682][01623] Component RolloutWorker_w4 stopped!
-[2023-02-24 10:17:51,682][15484] Stopping RolloutWorker_w4...
-[2023-02-24 10:17:51,687][15484] Loop rollout_proc4_evt_loop terminating...
-[2023-02-24 10:17:51,677][15486] Stopping RolloutWorker_w7...
-[2023-02-24 10:17:51,704][15486] Loop rollout_proc7_evt_loop terminating...
-[2023-02-24 10:17:51,707][01623] Component RolloutWorker_w3 stopped!
-[2023-02-24 10:17:51,709][15482] Stopping RolloutWorker_w3...
-[2023-02-24 10:17:51,710][15482] Loop rollout_proc3_evt_loop terminating...
-[2023-02-24 10:17:51,729][01623] Component RolloutWorker_w5 stopped!
-[2023-02-24 10:17:51,731][15483] Stopping RolloutWorker_w5...
-[2023-02-24 10:17:51,737][01623] Component RolloutWorker_w1 stopped!
-[2023-02-24 10:17:51,739][15476] Stopping RolloutWorker_w1...
-[2023-02-24 10:17:51,739][15476] Loop rollout_proc1_evt_loop terminating...
-[2023-02-24 10:17:51,751][15483] Loop rollout_proc5_evt_loop terminating...
-[2023-02-24 10:17:51,763][15485] Stopping RolloutWorker_w6...
-[2023-02-24 10:17:51,763][01623] Component RolloutWorker_w6 stopped!
-[2023-02-24 10:17:51,779][15475] Stopping RolloutWorker_w0...
-[2023-02-24 10:17:51,780][15475] Loop rollout_proc0_evt_loop terminating...
-[2023-02-24 10:17:51,779][01623] Component RolloutWorker_w0 stopped!
-[2023-02-24 10:17:51,788][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000965_3952640.pth
-[2023-02-24 10:17:51,798][15485] Loop rollout_proc6_evt_loop terminating...
-[2023-02-24 10:17:51,810][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth...
-[2023-02-24 10:17:51,820][15481] Stopping RolloutWorker_w2...
-[2023-02-24 10:17:51,821][15481] Loop rollout_proc2_evt_loop terminating...
-[2023-02-24 10:17:51,820][01623] Component RolloutWorker_w2 stopped!
-[2023-02-24 10:17:52,128][01623] Component LearnerWorker_p0 stopped!
-[2023-02-24 10:17:52,130][01623] Waiting for process learner_proc0 to stop...
-[2023-02-24 10:17:52,128][15460] Stopping LearnerWorker_p0...
-[2023-02-24 10:17:52,133][15460] Loop learner_proc0_evt_loop terminating...
-[2023-02-24 10:17:54,606][01623] Waiting for process inference_proc0-0 to join...
-[2023-02-24 10:17:55,158][01623] Waiting for process rollout_proc0 to join...
-[2023-02-24 10:17:55,837][01623] Waiting for process rollout_proc1 to join...
-[2023-02-24 10:17:55,839][01623] Waiting for process rollout_proc2 to join...
-[2023-02-24 10:17:55,842][01623] Waiting for process rollout_proc3 to join...
-[2023-02-24 10:17:55,846][01623] Waiting for process rollout_proc4 to join...
-[2023-02-24 10:17:55,850][01623] Waiting for process rollout_proc5 to join...
-[2023-02-24 10:17:55,852][01623] Waiting for process rollout_proc6 to join...
-[2023-02-24 10:17:55,854][01623] Waiting for process rollout_proc7 to join...
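With the learner, inference worker, and rollout workers shut down cleanly, the run above is dominated by two regular entry shapes: "Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...). Total num frames: ..." and "Avg episode reward: [(0, '...')]". Their format is stable enough that the learning curve can be recovered from the raw console output. A minimal stdlib-only sketch, assuming the log has been saved to a file (sf_log.txt is a placeholder name, not something the training run writes by itself):

```python
import re

# Matches entries like:
# [2023-02-24 10:05:24,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1978368.
FPS_RE = re.compile(
    r"\[(?P<ts>[\d\- :,]+)\]\[\d+\] Fps is \(10 sec: (?P<fps10>[\d.]+), "
    r"60 sec: (?P<fps60>[\d.]+), 300 sec: (?P<fps300>[\d.]+)\)\. "
    r"Total num frames: (?P<frames>\d+)"
)
# Matches entries like: [ts][pid] Avg episode reward: [(0, '19.522')]
REWARD_RE = re.compile(
    r"\[(?P<ts>[\d\- :,]+)\]\[\d+\] Avg episode reward: \[\(0, '(?P<reward>[-\d.]+)'\)\]"
)

def parse_progress(log_text: str):
    """Extract (total_frames, 10-sec FPS) pairs and avg episode rewards from the console log."""
    frames_fps = [(int(m["frames"]), float(m["fps10"])) for m in FPS_RE.finditer(log_text)]
    rewards = [float(m["reward"]) for m in REWARD_RE.finditer(log_text)]
    return frames_fps, rewards

if __name__ == "__main__":
    with open("sf_log.txt") as f:   # placeholder filename for this log
        curve, rewards = parse_progress(f.read())
    if curve:
        print("last report:", curve[-1], "(frames, 10-sec FPS)")
    if rewards:
        print("best avg episode reward seen during training:", max(rewards))
```

The regexes only target the training-time reward entries; the evaluation section later in the log uses a different "Avg episode rewards: #0: ..." format and is deliberately left out.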
-[2023-02-24 10:17:55,856][01623] Batcher 0 profile tree view:
-batching: 30.6921, releasing_batches: 0.0303
-[2023-02-24 10:17:55,858][01623] InferenceWorker_p0-w0 profile tree view:
-wait_policy: 0.0000
- wait_policy_total: 654.1631
-update_model: 9.7801
- weight_update: 0.0014
-one_step: 0.0027
- handle_policy_step: 625.2704
- deserialize: 17.8837, stack: 3.4432, obs_to_device_normalize: 134.9107, forward: 306.1076, send_messages: 30.4159
- prepare_outputs: 100.6715
- to_cpu: 61.9524
-[2023-02-24 10:17:55,859][01623] Learner 0 profile tree view:
-misc: 0.0073, prepare_batch: 19.6645
-train: 86.4863
- epoch_init: 0.0120, minibatch_init: 0.0111, losses_postprocess: 0.6117, kl_divergence: 0.5941, after_optimizer: 37.0888
- calculate_losses: 30.5907
- losses_init: 0.0042, forward_head: 2.1203, bptt_initial: 20.0486, tail: 1.2396, advantages_returns: 0.3540, losses: 3.8575
- bptt: 2.5773
- bptt_forward_core: 2.4497
- update: 16.7731
- clip: 1.6500
-[2023-02-24 10:17:55,861][01623] RolloutWorker_w0 profile tree view:
-wait_for_trajectories: 0.3763, enqueue_policy_requests: 186.0768, env_step: 996.3301, overhead: 26.8683, complete_rollouts: 8.7595
-save_policy_outputs: 25.3304
- split_output_tensors: 12.6970
-[2023-02-24 10:17:55,863][01623] RolloutWorker_w7 profile tree view:
-wait_for_trajectories: 0.3390, enqueue_policy_requests: 189.8915, env_step: 997.6871, overhead: 26.5033, complete_rollouts: 8.6160
-save_policy_outputs: 24.7950
- split_output_tensors: 11.7159
-[2023-02-24 10:17:55,864][01623] Loop Runner_EvtLoop terminating...
-[2023-02-24 10:17:55,871][01623] Runner profile tree view:
-main_loop: 1367.0984
-[2023-02-24 10:17:55,875][01623] Collected {0: 4505600}, FPS: 3295.7
-[2023-02-24 10:21:22,492][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
-[2023-02-24 10:21:22,493][01623] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-02-24 10:21:22,495][01623] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-02-24 10:21:22,500][01623] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-02-24 10:21:22,502][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-02-24 10:21:22,504][01623] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-02-24 10:21:22,505][01623] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
-[2023-02-24 10:21:22,506][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
-[2023-02-24 10:21:22,508][01623] Adding new argument 'push_to_hub'=False that is not in the saved config file!
-[2023-02-24 10:21:22,510][01623] Adding new argument 'hf_repository'=None that is not in the saved config file!
-[2023-02-24 10:21:22,512][01623] Adding new argument 'policy_index'=0 that is not in the saved config file!
-[2023-02-24 10:21:22,514][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
-[2023-02-24 10:21:22,517][01623] Adding new argument 'train_script'=None that is not in the saved config file!
-[2023-02-24 10:21:22,525][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file!
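The evaluation pass that follows reloads checkpoint_000001100_4505600.pth from the run's checkpoint directory. The log does not reveal the file's internal layout, so the sketch below only opens it and reports its top-level structure; it assumes nothing beyond PyTorch being installed and the file being loadable with torch.load:

```python
import torch

# Path exactly as it appears later in this log (Colab-style /content prefix).
CKPT = "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth"

def describe_checkpoint(path: str) -> None:
    """Load a saved checkpoint on CPU and print its top-level keys and tensor shapes."""
    state = torch.load(path, map_location="cpu")  # no GPU needed just to inspect
    if isinstance(state, dict):
        for key, value in state.items():
            if torch.is_tensor(value):
                print(f"{key}: tensor {tuple(value.shape)}")
            else:
                print(f"{key}: {type(value).__name__}")
    else:
        print(type(state).__name__)

if __name__ == "__main__":
    describe_checkpoint(CKPT)
```

Passing map_location="cpu" keeps the inspection independent of the cuda:0 device that produced the checkpoint during training.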
-[2023-02-24 10:21:22,526][01623] Using frameskip 1 and render_action_repeat=4 for evaluation -[2023-02-24 10:21:22,550][01623] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:21:22,555][01623] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 10:21:22,559][01623] RunningMeanStd input shape: (1,) -[2023-02-24 10:21:22,577][01623] ConvEncoder: input_channels=3 -[2023-02-24 10:21:23,261][01623] Conv encoder output size: 512 -[2023-02-24 10:21:23,263][01623] Policy head output size: 512 -[2023-02-24 10:21:26,135][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... -[2023-02-24 10:21:27,796][01623] Num frames 100... -[2023-02-24 10:21:27,904][01623] Num frames 200... -[2023-02-24 10:21:28,015][01623] Num frames 300... -[2023-02-24 10:21:28,127][01623] Avg episode rewards: #0: 4.520, true rewards: #0: 3.520 -[2023-02-24 10:21:28,128][01623] Avg episode reward: 4.520, avg true_objective: 3.520 -[2023-02-24 10:21:28,194][01623] Num frames 400... -[2023-02-24 10:21:28,316][01623] Num frames 500... -[2023-02-24 10:21:28,436][01623] Num frames 600... -[2023-02-24 10:21:28,549][01623] Num frames 700... -[2023-02-24 10:21:28,663][01623] Num frames 800... -[2023-02-24 10:21:28,782][01623] Num frames 900... -[2023-02-24 10:21:28,870][01623] Avg episode rewards: #0: 8.100, true rewards: #0: 4.600 -[2023-02-24 10:21:28,872][01623] Avg episode reward: 8.100, avg true_objective: 4.600 -[2023-02-24 10:21:28,968][01623] Num frames 1000... -[2023-02-24 10:21:29,089][01623] Num frames 1100... -[2023-02-24 10:21:29,209][01623] Num frames 1200... -[2023-02-24 10:21:29,323][01623] Num frames 1300... -[2023-02-24 10:21:29,438][01623] Num frames 1400... -[2023-02-24 10:21:29,555][01623] Num frames 1500... -[2023-02-24 10:21:29,667][01623] Num frames 1600... -[2023-02-24 10:21:29,785][01623] Num frames 1700... -[2023-02-24 10:21:29,905][01623] Num frames 1800... -[2023-02-24 10:21:30,017][01623] Num frames 1900... -[2023-02-24 10:21:30,121][01623] Avg episode rewards: #0: 14.473, true rewards: #0: 6.473 -[2023-02-24 10:21:30,123][01623] Avg episode reward: 14.473, avg true_objective: 6.473 -[2023-02-24 10:21:30,201][01623] Num frames 2000... -[2023-02-24 10:21:30,315][01623] Num frames 2100... -[2023-02-24 10:21:30,431][01623] Num frames 2200... -[2023-02-24 10:21:30,553][01623] Num frames 2300... -[2023-02-24 10:21:30,671][01623] Num frames 2400... -[2023-02-24 10:21:30,789][01623] Num frames 2500... -[2023-02-24 10:21:30,909][01623] Num frames 2600... -[2023-02-24 10:21:31,018][01623] Num frames 2700... -[2023-02-24 10:21:31,137][01623] Num frames 2800... -[2023-02-24 10:21:31,256][01623] Num frames 2900... -[2023-02-24 10:21:31,370][01623] Num frames 3000... -[2023-02-24 10:21:31,487][01623] Num frames 3100... -[2023-02-24 10:21:31,568][01623] Avg episode rewards: #0: 18.553, true rewards: #0: 7.802 -[2023-02-24 10:21:31,569][01623] Avg episode reward: 18.553, avg true_objective: 7.802 -[2023-02-24 10:21:31,665][01623] Num frames 3200... -[2023-02-24 10:21:31,782][01623] Num frames 3300... -[2023-02-24 10:21:31,904][01623] Num frames 3400... -[2023-02-24 10:21:32,013][01623] Num frames 3500... -[2023-02-24 10:21:32,131][01623] Num frames 3600... -[2023-02-24 10:21:32,243][01623] Num frames 3700... -[2023-02-24 10:21:32,359][01623] Num frames 3800... -[2023-02-24 10:21:32,470][01623] Num frames 3900... 
-[2023-02-24 10:21:32,556][01623] Avg episode rewards: #0: 18.452, true rewards: #0: 7.852 -[2023-02-24 10:21:32,558][01623] Avg episode reward: 18.452, avg true_objective: 7.852 -[2023-02-24 10:21:32,644][01623] Num frames 4000... -[2023-02-24 10:21:32,764][01623] Num frames 4100... -[2023-02-24 10:21:32,889][01623] Num frames 4200... -[2023-02-24 10:21:32,998][01623] Num frames 4300... -[2023-02-24 10:21:33,106][01623] Num frames 4400... -[2023-02-24 10:21:33,222][01623] Num frames 4500... -[2023-02-24 10:21:33,336][01623] Num frames 4600... -[2023-02-24 10:21:33,447][01623] Num frames 4700... -[2023-02-24 10:21:33,558][01623] Num frames 4800... -[2023-02-24 10:21:33,671][01623] Num frames 4900... -[2023-02-24 10:21:33,792][01623] Num frames 5000... -[2023-02-24 10:21:33,906][01623] Avg episode rewards: #0: 20.413, true rewards: #0: 8.413 -[2023-02-24 10:21:33,909][01623] Avg episode reward: 20.413, avg true_objective: 8.413 -[2023-02-24 10:21:33,975][01623] Num frames 5100... -[2023-02-24 10:21:34,102][01623] Num frames 5200... -[2023-02-24 10:21:34,216][01623] Num frames 5300... -[2023-02-24 10:21:34,340][01623] Num frames 5400... -[2023-02-24 10:21:34,392][01623] Avg episode rewards: #0: 18.143, true rewards: #0: 7.714 -[2023-02-24 10:21:34,396][01623] Avg episode reward: 18.143, avg true_objective: 7.714 -[2023-02-24 10:21:34,522][01623] Num frames 5500... -[2023-02-24 10:21:34,638][01623] Num frames 5600... -[2023-02-24 10:21:34,757][01623] Num frames 5700... -[2023-02-24 10:21:34,884][01623] Num frames 5800... -[2023-02-24 10:21:34,999][01623] Num frames 5900... -[2023-02-24 10:21:35,109][01623] Num frames 6000... -[2023-02-24 10:21:35,245][01623] Avg episode rewards: #0: 17.215, true rewards: #0: 7.590 -[2023-02-24 10:21:35,247][01623] Avg episode reward: 17.215, avg true_objective: 7.590 -[2023-02-24 10:21:35,283][01623] Num frames 6100... -[2023-02-24 10:21:35,409][01623] Num frames 6200... -[2023-02-24 10:21:35,519][01623] Num frames 6300... -[2023-02-24 10:21:35,641][01623] Num frames 6400... -[2023-02-24 10:21:35,754][01623] Num frames 6500... -[2023-02-24 10:21:35,888][01623] Num frames 6600... -[2023-02-24 10:21:35,999][01623] Num frames 6700... -[2023-02-24 10:21:36,112][01623] Num frames 6800... -[2023-02-24 10:21:36,224][01623] Num frames 6900... -[2023-02-24 10:21:36,336][01623] Num frames 7000... -[2023-02-24 10:21:36,449][01623] Num frames 7100... -[2023-02-24 10:21:36,560][01623] Num frames 7200... -[2023-02-24 10:21:36,671][01623] Num frames 7300... -[2023-02-24 10:21:36,815][01623] Avg episode rewards: #0: 18.649, true rewards: #0: 8.204 -[2023-02-24 10:21:36,817][01623] Avg episode reward: 18.649, avg true_objective: 8.204 -[2023-02-24 10:21:36,840][01623] Num frames 7400... -[2023-02-24 10:21:36,961][01623] Num frames 7500... -[2023-02-24 10:21:37,089][01623] Num frames 7600... -[2023-02-24 10:21:37,207][01623] Num frames 7700... -[2023-02-24 10:21:37,320][01623] Num frames 7800... -[2023-02-24 10:21:37,435][01623] Num frames 7900... -[2023-02-24 10:21:37,568][01623] Num frames 8000... -[2023-02-24 10:21:37,724][01623] Num frames 8100... -[2023-02-24 10:21:37,882][01623] Num frames 8200... -[2023-02-24 10:21:38,036][01623] Num frames 8300... -[2023-02-24 10:21:38,190][01623] Num frames 8400... -[2023-02-24 10:21:38,356][01623] Num frames 8500... -[2023-02-24 10:21:38,510][01623] Num frames 8600... -[2023-02-24 10:21:38,674][01623] Num frames 8700... 
-[2023-02-24 10:21:38,775][01623] Avg episode rewards: #0: 20.028, true rewards: #0: 8.728 -[2023-02-24 10:21:38,777][01623] Avg episode reward: 20.028, avg true_objective: 8.728 -[2023-02-24 10:22:38,283][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4! -[2023-02-24 10:27:22,190][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json -[2023-02-24 10:27:22,192][01623] Overriding arg 'num_workers' with value 1 passed from command line -[2023-02-24 10:27:22,194][01623] Adding new argument 'no_render'=True that is not in the saved config file! -[2023-02-24 10:27:22,199][01623] Adding new argument 'save_video'=True that is not in the saved config file! -[2023-02-24 10:27:22,201][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! -[2023-02-24 10:27:22,203][01623] Adding new argument 'video_name'=None that is not in the saved config file! -[2023-02-24 10:27:22,204][01623] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! -[2023-02-24 10:27:22,206][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! -[2023-02-24 10:27:22,208][01623] Adding new argument 'push_to_hub'=True that is not in the saved config file! -[2023-02-24 10:27:22,209][01623] Adding new argument 'hf_repository'='ThomasSimonini/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! -[2023-02-24 10:27:22,210][01623] Adding new argument 'policy_index'=0 that is not in the saved config file! -[2023-02-24 10:27:22,212][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file! -[2023-02-24 10:27:22,213][01623] Adding new argument 'train_script'=None that is not in the saved config file! -[2023-02-24 10:27:22,214][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file! -[2023-02-24 10:27:22,216][01623] Using frameskip 1 and render_action_repeat=4 for evaluation -[2023-02-24 10:27:22,246][01623] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 10:27:22,249][01623] RunningMeanStd input shape: (1,) -[2023-02-24 10:27:22,273][01623] ConvEncoder: input_channels=3 -[2023-02-24 10:27:22,335][01623] Conv encoder output size: 512 -[2023-02-24 10:27:22,339][01623] Policy head output size: 512 -[2023-02-24 10:27:22,371][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... -[2023-02-24 10:27:23,003][01623] Num frames 100... -[2023-02-24 10:27:23,164][01623] Num frames 200... -[2023-02-24 10:27:23,329][01623] Num frames 300... -[2023-02-24 10:27:23,494][01623] Num frames 400... -[2023-02-24 10:27:23,654][01623] Num frames 500... -[2023-02-24 10:27:23,737][01623] Avg episode rewards: #0: 9.120, true rewards: #0: 5.120 -[2023-02-24 10:27:23,740][01623] Avg episode reward: 9.120, avg true_objective: 5.120 -[2023-02-24 10:27:23,877][01623] Num frames 600... -[2023-02-24 10:27:24,041][01623] Num frames 700... -[2023-02-24 10:27:24,190][01623] Num frames 800... -[2023-02-24 10:27:24,301][01623] Num frames 900... -[2023-02-24 10:27:24,418][01623] Num frames 1000... -[2023-02-24 10:27:24,531][01623] Num frames 1100... -[2023-02-24 10:27:24,644][01623] Num frames 1200... -[2023-02-24 10:27:24,760][01623] Num frames 1300... -[2023-02-24 10:27:24,875][01623] Num frames 1400... -[2023-02-24 10:27:24,995][01623] Num frames 1500... -[2023-02-24 10:27:25,106][01623] Num frames 1600... 
-[2023-02-24 10:27:25,223][01623] Num frames 1700... -[2023-02-24 10:27:25,339][01623] Num frames 1800... -[2023-02-24 10:27:25,450][01623] Num frames 1900... -[2023-02-24 10:27:25,514][01623] Avg episode rewards: #0: 21.525, true rewards: #0: 9.525 -[2023-02-24 10:27:25,517][01623] Avg episode reward: 21.525, avg true_objective: 9.525 -[2023-02-24 10:27:25,629][01623] Num frames 2000... -[2023-02-24 10:27:25,747][01623] Num frames 2100... -[2023-02-24 10:27:25,862][01623] Num frames 2200... -[2023-02-24 10:27:25,993][01623] Num frames 2300... -[2023-02-24 10:27:26,076][01623] Avg episode rewards: #0: 16.403, true rewards: #0: 7.737 -[2023-02-24 10:27:26,078][01623] Avg episode reward: 16.403, avg true_objective: 7.737 -[2023-02-24 10:27:26,171][01623] Num frames 2400... -[2023-02-24 10:27:26,294][01623] Num frames 2500... -[2023-02-24 10:27:26,411][01623] Num frames 2600... -[2023-02-24 10:27:26,529][01623] Num frames 2700... -[2023-02-24 10:27:26,638][01623] Num frames 2800... -[2023-02-24 10:27:26,749][01623] Num frames 2900... -[2023-02-24 10:27:26,857][01623] Num frames 3000... -[2023-02-24 10:27:26,987][01623] Num frames 3100... -[2023-02-24 10:27:27,097][01623] Num frames 3200... -[2023-02-24 10:27:27,205][01623] Num frames 3300... -[2023-02-24 10:27:27,319][01623] Num frames 3400... -[2023-02-24 10:27:27,430][01623] Num frames 3500... -[2023-02-24 10:27:27,540][01623] Num frames 3600... -[2023-02-24 10:27:27,650][01623] Num frames 3700... -[2023-02-24 10:27:27,767][01623] Num frames 3800... -[2023-02-24 10:27:27,882][01623] Num frames 3900... -[2023-02-24 10:27:28,000][01623] Num frames 4000... -[2023-02-24 10:27:28,111][01623] Num frames 4100... -[2023-02-24 10:27:28,222][01623] Num frames 4200... -[2023-02-24 10:27:28,337][01623] Num frames 4300... -[2023-02-24 10:27:28,455][01623] Num frames 4400... -[2023-02-24 10:27:28,535][01623] Avg episode rewards: #0: 27.552, true rewards: #0: 11.053 -[2023-02-24 10:27:28,537][01623] Avg episode reward: 27.552, avg true_objective: 11.053 -[2023-02-24 10:27:28,634][01623] Num frames 4500... -[2023-02-24 10:27:28,754][01623] Num frames 4600... -[2023-02-24 10:27:28,880][01623] Num frames 4700... -[2023-02-24 10:27:28,996][01623] Num frames 4800... -[2023-02-24 10:27:29,106][01623] Num frames 4900... -[2023-02-24 10:27:29,216][01623] Num frames 5000... -[2023-02-24 10:27:29,335][01623] Num frames 5100... -[2023-02-24 10:27:29,445][01623] Num frames 5200... -[2023-02-24 10:27:29,554][01623] Num frames 5300... -[2023-02-24 10:27:29,662][01623] Num frames 5400... -[2023-02-24 10:27:29,780][01623] Num frames 5500... -[2023-02-24 10:27:29,886][01623] Num frames 5600... -[2023-02-24 10:27:30,014][01623] Num frames 5700... -[2023-02-24 10:27:30,123][01623] Num frames 5800... -[2023-02-24 10:27:30,240][01623] Num frames 5900... -[2023-02-24 10:27:30,352][01623] Num frames 6000... -[2023-02-24 10:27:30,472][01623] Num frames 6100... -[2023-02-24 10:27:30,583][01623] Avg episode rewards: #0: 30.892, true rewards: #0: 12.292 -[2023-02-24 10:27:30,585][01623] Avg episode reward: 30.892, avg true_objective: 12.292 -[2023-02-24 10:27:30,657][01623] Num frames 6200... -[2023-02-24 10:27:30,781][01623] Num frames 6300... -[2023-02-24 10:27:30,898][01623] Num frames 6400... -[2023-02-24 10:27:31,007][01623] Num frames 6500... -[2023-02-24 10:27:31,138][01623] Num frames 6600... -[2023-02-24 10:27:31,249][01623] Num frames 6700... -[2023-02-24 10:27:31,367][01623] Num frames 6800... -[2023-02-24 10:27:31,479][01623] Num frames 6900... 
-[2023-02-24 10:27:31,594][01623] Num frames 7000... -[2023-02-24 10:27:31,705][01623] Num frames 7100... -[2023-02-24 10:27:31,815][01623] Num frames 7200... -[2023-02-24 10:27:31,933][01623] Num frames 7300... -[2023-02-24 10:27:32,054][01623] Num frames 7400... -[2023-02-24 10:27:32,163][01623] Num frames 7500... -[2023-02-24 10:27:32,282][01623] Num frames 7600... -[2023-02-24 10:27:32,393][01623] Num frames 7700... -[2023-02-24 10:27:32,505][01623] Num frames 7800... -[2023-02-24 10:27:32,617][01623] Num frames 7900... -[2023-02-24 10:27:32,738][01623] Num frames 8000... -[2023-02-24 10:27:32,830][01623] Avg episode rewards: #0: 33.223, true rewards: #0: 13.390 -[2023-02-24 10:27:32,831][01623] Avg episode reward: 33.223, avg true_objective: 13.390 -[2023-02-24 10:27:32,907][01623] Num frames 8100... -[2023-02-24 10:27:33,025][01623] Num frames 8200... -[2023-02-24 10:27:33,144][01623] Num frames 8300... -[2023-02-24 10:27:33,254][01623] Num frames 8400... -[2023-02-24 10:27:33,365][01623] Num frames 8500... -[2023-02-24 10:27:33,478][01623] Num frames 8600... -[2023-02-24 10:27:33,593][01623] Num frames 8700... -[2023-02-24 10:27:33,708][01623] Num frames 8800... -[2023-02-24 10:27:33,819][01623] Num frames 8900... -[2023-02-24 10:27:33,935][01623] Num frames 9000... -[2023-02-24 10:27:34,043][01623] Num frames 9100... -[2023-02-24 10:27:34,142][01623] Avg episode rewards: #0: 31.904, true rewards: #0: 13.047 -[2023-02-24 10:27:34,144][01623] Avg episode reward: 31.904, avg true_objective: 13.047 -[2023-02-24 10:27:34,246][01623] Num frames 9200... -[2023-02-24 10:27:34,409][01623] Num frames 9300... -[2023-02-24 10:27:34,566][01623] Num frames 9400... -[2023-02-24 10:27:34,726][01623] Num frames 9500... -[2023-02-24 10:27:34,883][01623] Num frames 9600... -[2023-02-24 10:27:35,037][01623] Num frames 9700... -[2023-02-24 10:27:35,201][01623] Num frames 9800... -[2023-02-24 10:27:35,357][01623] Num frames 9900... -[2023-02-24 10:27:35,510][01623] Num frames 10000... -[2023-02-24 10:27:35,668][01623] Num frames 10100... -[2023-02-24 10:27:35,836][01623] Num frames 10200... -[2023-02-24 10:27:35,995][01623] Num frames 10300... -[2023-02-24 10:27:36,160][01623] Num frames 10400... -[2023-02-24 10:27:36,323][01623] Num frames 10500... -[2023-02-24 10:27:36,399][01623] Avg episode rewards: #0: 32.011, true rewards: #0: 13.136 -[2023-02-24 10:27:36,402][01623] Avg episode reward: 32.011, avg true_objective: 13.136 -[2023-02-24 10:27:36,561][01623] Num frames 10600... -[2023-02-24 10:27:36,720][01623] Num frames 10700... -[2023-02-24 10:27:36,880][01623] Num frames 10800... -[2023-02-24 10:27:37,031][01623] Num frames 10900... -[2023-02-24 10:27:37,192][01623] Num frames 11000... -[2023-02-24 10:27:37,499][01623] Num frames 11100... -[2023-02-24 10:27:37,666][01623] Num frames 11200... -[2023-02-24 10:27:37,805][01623] Num frames 11300... -[2023-02-24 10:27:37,923][01623] Num frames 11400... -[2023-02-24 10:27:38,035][01623] Num frames 11500... -[2023-02-24 10:27:38,148][01623] Num frames 11600... -[2023-02-24 10:27:38,266][01623] Num frames 11700... -[2023-02-24 10:27:38,381][01623] Num frames 11800... -[2023-02-24 10:27:38,505][01623] Num frames 11900... -[2023-02-24 10:27:38,621][01623] Num frames 12000... -[2023-02-24 10:27:38,739][01623] Num frames 12100... -[2023-02-24 10:27:38,858][01623] Num frames 12200... -[2023-02-24 10:27:38,971][01623] Num frames 12300... -[2023-02-24 10:27:39,082][01623] Num frames 12400... -[2023-02-24 10:27:39,202][01623] Num frames 12500... 
-[2023-02-24 10:27:39,291][01623] Avg episode rewards: #0: 34.583, true rewards: #0: 13.917 -[2023-02-24 10:27:39,293][01623] Avg episode reward: 34.583, avg true_objective: 13.917 -[2023-02-24 10:27:39,383][01623] Num frames 12600... -[2023-02-24 10:27:39,504][01623] Num frames 12700... -[2023-02-24 10:27:39,620][01623] Num frames 12800... -[2023-02-24 10:27:39,732][01623] Num frames 12900... -[2023-02-24 10:27:39,843][01623] Num frames 13000... -[2023-02-24 10:27:39,955][01623] Num frames 13100... -[2023-02-24 10:27:40,070][01623] Num frames 13200... -[2023-02-24 10:27:40,181][01623] Num frames 13300... -[2023-02-24 10:27:40,305][01623] Num frames 13400... -[2023-02-24 10:27:40,418][01623] Num frames 13500... -[2023-02-24 10:27:40,533][01623] Num frames 13600... -[2023-02-24 10:27:40,647][01623] Num frames 13700... -[2023-02-24 10:27:40,762][01623] Num frames 13800... -[2023-02-24 10:27:40,880][01623] Num frames 13900... -[2023-02-24 10:27:40,989][01623] Num frames 14000... -[2023-02-24 10:27:41,078][01623] Avg episode rewards: #0: 34.829, true rewards: #0: 14.029 -[2023-02-24 10:27:41,079][01623] Avg episode reward: 34.829, avg true_objective: 14.029 -[2023-02-24 10:29:09,913][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4! -[2023-02-24 10:34:07,056][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json -[2023-02-24 10:34:07,057][01623] Overriding arg 'num_workers' with value 1 passed from command line -[2023-02-24 10:34:07,060][01623] Adding new argument 'no_render'=True that is not in the saved config file! -[2023-02-24 10:34:07,062][01623] Adding new argument 'save_video'=True that is not in the saved config file! -[2023-02-24 10:34:07,064][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! -[2023-02-24 10:34:07,065][01623] Adding new argument 'video_name'=None that is not in the saved config file! -[2023-02-24 10:34:07,067][01623] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! -[2023-02-24 10:34:07,068][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! -[2023-02-24 10:34:07,069][01623] Adding new argument 'push_to_hub'=True that is not in the saved config file! -[2023-02-24 10:34:07,070][01623] Adding new argument 'hf_repository'='dbaibak/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! -[2023-02-24 10:34:07,072][01623] Adding new argument 'policy_index'=0 that is not in the saved config file! -[2023-02-24 10:34:07,073][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file! -[2023-02-24 10:34:07,074][01623] Adding new argument 'train_script'=None that is not in the saved config file! -[2023-02-24 10:34:07,075][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file! -[2023-02-24 10:34:07,076][01623] Using frameskip 1 and render_action_repeat=4 for evaluation -[2023-02-24 10:34:07,106][01623] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 10:34:07,108][01623] RunningMeanStd input shape: (1,) -[2023-02-24 10:34:07,122][01623] ConvEncoder: input_channels=3 -[2023-02-24 10:34:07,156][01623] Conv encoder output size: 512 -[2023-02-24 10:34:07,158][01623] Policy head output size: 512 -[2023-02-24 10:34:07,178][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... -[2023-02-24 10:34:07,615][01623] Num frames 100... 
-[2023-02-24 10:34:07,729][01623] Num frames 200... -[2023-02-24 10:34:07,843][01623] Num frames 300... -[2023-02-24 10:34:07,956][01623] Num frames 400... -[2023-02-24 10:34:08,075][01623] Num frames 500... -[2023-02-24 10:34:08,185][01623] Num frames 600... -[2023-02-24 10:34:08,302][01623] Num frames 700... -[2023-02-24 10:34:08,435][01623] Num frames 800... -[2023-02-24 10:34:08,562][01623] Avg episode rewards: #0: 17.640, true rewards: #0: 8.640 -[2023-02-24 10:34:08,565][01623] Avg episode reward: 17.640, avg true_objective: 8.640 -[2023-02-24 10:34:08,612][01623] Num frames 900... -[2023-02-24 10:34:08,725][01623] Num frames 1000... -[2023-02-24 10:34:08,838][01623] Num frames 1100... -[2023-02-24 10:34:08,954][01623] Num frames 1200... -[2023-02-24 10:34:09,066][01623] Num frames 1300... -[2023-02-24 10:34:09,180][01623] Num frames 1400... -[2023-02-24 10:34:09,304][01623] Num frames 1500... -[2023-02-24 10:34:09,417][01623] Num frames 1600... -[2023-02-24 10:34:09,534][01623] Num frames 1700... -[2023-02-24 10:34:09,643][01623] Num frames 1800... -[2023-02-24 10:34:09,784][01623] Avg episode rewards: #0: 19.890, true rewards: #0: 9.390 -[2023-02-24 10:34:09,786][01623] Avg episode reward: 19.890, avg true_objective: 9.390 -[2023-02-24 10:34:09,817][01623] Num frames 1900... -[2023-02-24 10:34:09,941][01623] Num frames 2000... -[2023-02-24 10:34:10,060][01623] Num frames 2100... -[2023-02-24 10:34:10,182][01623] Num frames 2200... -[2023-02-24 10:34:10,294][01623] Num frames 2300... -[2023-02-24 10:34:10,406][01623] Num frames 2400... -[2023-02-24 10:34:10,517][01623] Num frames 2500... -[2023-02-24 10:34:10,624][01623] Num frames 2600... -[2023-02-24 10:34:10,735][01623] Num frames 2700... -[2023-02-24 10:34:10,845][01623] Num frames 2800... -[2023-02-24 10:34:10,973][01623] Num frames 2900... -[2023-02-24 10:34:11,084][01623] Num frames 3000... -[2023-02-24 10:34:11,204][01623] Num frames 3100... -[2023-02-24 10:34:11,338][01623] Num frames 3200... -[2023-02-24 10:34:11,456][01623] Num frames 3300... -[2023-02-24 10:34:11,574][01623] Num frames 3400... -[2023-02-24 10:34:11,690][01623] Num frames 3500... -[2023-02-24 10:34:11,810][01623] Num frames 3600... -[2023-02-24 10:34:11,930][01623] Num frames 3700... -[2023-02-24 10:34:12,054][01623] Num frames 3800... -[2023-02-24 10:34:12,167][01623] Num frames 3900... -[2023-02-24 10:34:12,355][01623] Avg episode rewards: #0: 32.926, true rewards: #0: 13.260 -[2023-02-24 10:34:12,357][01623] Avg episode reward: 32.926, avg true_objective: 13.260 -[2023-02-24 10:34:12,398][01623] Num frames 4000... -[2023-02-24 10:34:12,561][01623] Num frames 4100... -[2023-02-24 10:34:12,718][01623] Num frames 4200... -[2023-02-24 10:34:12,871][01623] Num frames 4300... -[2023-02-24 10:34:13,044][01623] Num frames 4400... -[2023-02-24 10:34:13,205][01623] Num frames 4500... -[2023-02-24 10:34:13,368][01623] Num frames 4600... -[2023-02-24 10:34:13,533][01623] Num frames 4700... -[2023-02-24 10:34:13,702][01623] Num frames 4800... -[2023-02-24 10:34:13,863][01623] Num frames 4900... -[2023-02-24 10:34:14,026][01623] Num frames 5000... -[2023-02-24 10:34:14,184][01623] Num frames 5100... -[2023-02-24 10:34:14,344][01623] Num frames 5200... -[2023-02-24 10:34:14,445][01623] Avg episode rewards: #0: 32.065, true rewards: #0: 13.065 -[2023-02-24 10:34:14,447][01623] Avg episode reward: 32.065, avg true_objective: 13.065 -[2023-02-24 10:34:14,573][01623] Num frames 5300... -[2023-02-24 10:34:14,731][01623] Num frames 5400... 
-[2023-02-24 10:34:14,886][01623] Num frames 5500... -[2023-02-24 10:34:15,050][01623] Num frames 5600... -[2023-02-24 10:34:15,177][01623] Avg episode rewards: #0: 26.884, true rewards: #0: 11.284 -[2023-02-24 10:34:15,180][01623] Avg episode reward: 26.884, avg true_objective: 11.284 -[2023-02-24 10:34:15,284][01623] Num frames 5700... -[2023-02-24 10:34:15,447][01623] Num frames 5800... -[2023-02-24 10:34:15,610][01623] Num frames 5900... -[2023-02-24 10:34:15,717][01623] Avg episode rewards: #0: 23.217, true rewards: #0: 9.883 -[2023-02-24 10:34:15,720][01623] Avg episode reward: 23.217, avg true_objective: 9.883 -[2023-02-24 10:34:15,812][01623] Num frames 6000... -[2023-02-24 10:34:15,927][01623] Num frames 6100... -[2023-02-24 10:34:16,040][01623] Num frames 6200... -[2023-02-24 10:34:16,155][01623] Num frames 6300... -[2023-02-24 10:34:16,268][01623] Num frames 6400... -[2023-02-24 10:34:16,401][01623] Avg episode rewards: #0: 21.100, true rewards: #0: 9.243 -[2023-02-24 10:34:16,403][01623] Avg episode reward: 21.100, avg true_objective: 9.243 -[2023-02-24 10:34:16,440][01623] Num frames 6500... -[2023-02-24 10:34:16,571][01623] Num frames 6600... -[2023-02-24 10:34:16,685][01623] Num frames 6700... -[2023-02-24 10:34:16,796][01623] Num frames 6800... -[2023-02-24 10:34:16,905][01623] Num frames 6900... -[2023-02-24 10:34:17,018][01623] Num frames 7000... -[2023-02-24 10:34:17,190][01623] Avg episode rewards: #0: 19.874, true rewards: #0: 8.874 -[2023-02-24 10:34:17,193][01623] Avg episode reward: 19.874, avg true_objective: 8.874 -[2023-02-24 10:34:17,197][01623] Num frames 7100... -[2023-02-24 10:34:17,318][01623] Num frames 7200... -[2023-02-24 10:34:17,441][01623] Num frames 7300... -[2023-02-24 10:34:17,553][01623] Num frames 7400... -[2023-02-24 10:34:17,662][01623] Num frames 7500... -[2023-02-24 10:34:17,775][01623] Num frames 7600... -[2023-02-24 10:34:17,886][01623] Num frames 7700... -[2023-02-24 10:34:17,997][01623] Num frames 7800... -[2023-02-24 10:34:18,115][01623] Num frames 7900... -[2023-02-24 10:34:18,225][01623] Num frames 8000... -[2023-02-24 10:34:18,307][01623] Avg episode rewards: #0: 19.909, true rewards: #0: 8.909 -[2023-02-24 10:34:18,310][01623] Avg episode reward: 19.909, avg true_objective: 8.909 -[2023-02-24 10:34:18,408][01623] Num frames 8100... -[2023-02-24 10:34:18,523][01623] Num frames 8200... -[2023-02-24 10:34:18,637][01623] Num frames 8300... -[2023-02-24 10:34:18,755][01623] Num frames 8400... -[2023-02-24 10:34:18,863][01623] Num frames 8500... -[2023-02-24 10:34:18,969][01623] Num frames 8600... -[2023-02-24 10:34:19,082][01623] Num frames 8700... -[2023-02-24 10:34:19,203][01623] Num frames 8800... -[2023-02-24 10:34:19,314][01623] Num frames 8900... -[2023-02-24 10:34:19,426][01623] Num frames 9000... -[2023-02-24 10:34:19,537][01623] Num frames 9100... -[2023-02-24 10:34:19,662][01623] Num frames 9200... -[2023-02-24 10:34:19,773][01623] Num frames 9300... -[2023-02-24 10:34:19,881][01623] Num frames 9400... -[2023-02-24 10:34:19,965][01623] Avg episode rewards: #0: 21.426, true rewards: #0: 9.426 -[2023-02-24 10:34:19,967][01623] Avg episode reward: 21.426, avg true_objective: 9.426 -[2023-02-24 10:35:20,550][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4! -[2023-02-24 10:35:37,448][01623] The model has been pushed to https://huggingface.co/dbaibak/rl_course_vizdoom_health_gathering_supreme -[2023-02-24 10:38:14,091][01623] Environment doom_basic already registered, overwriting... 
-[2023-02-24 10:38:14,094][01623] Environment doom_two_colors_easy already registered, overwriting... -[2023-02-24 10:38:14,096][01623] Environment doom_two_colors_hard already registered, overwriting... -[2023-02-24 10:38:14,097][01623] Environment doom_dm already registered, overwriting... -[2023-02-24 10:38:14,103][01623] Environment doom_dwango5 already registered, overwriting... -[2023-02-24 10:38:14,106][01623] Environment doom_my_way_home_flat_actions already registered, overwriting... -[2023-02-24 10:38:14,107][01623] Environment doom_defend_the_center_flat_actions already registered, overwriting... -[2023-02-24 10:38:14,109][01623] Environment doom_my_way_home already registered, overwriting... -[2023-02-24 10:38:14,110][01623] Environment doom_deadly_corridor already registered, overwriting... -[2023-02-24 10:38:14,112][01623] Environment doom_defend_the_center already registered, overwriting... -[2023-02-24 10:38:14,114][01623] Environment doom_defend_the_line already registered, overwriting... -[2023-02-24 10:38:14,116][01623] Environment doom_health_gathering already registered, overwriting... -[2023-02-24 10:38:14,117][01623] Environment doom_health_gathering_supreme already registered, overwriting... -[2023-02-24 10:38:14,119][01623] Environment doom_battle already registered, overwriting... -[2023-02-24 10:38:14,120][01623] Environment doom_battle2 already registered, overwriting... -[2023-02-24 10:38:14,122][01623] Environment doom_duel_bots already registered, overwriting... -[2023-02-24 10:38:14,124][01623] Environment doom_deathmatch_bots already registered, overwriting... -[2023-02-24 10:38:14,127][01623] Environment doom_duel already registered, overwriting... -[2023-02-24 10:38:14,129][01623] Environment doom_deathmatch_full already registered, overwriting... -[2023-02-24 10:38:14,131][01623] Environment doom_benchmark already registered, overwriting... -[2023-02-24 10:38:14,133][01623] register_encoder_factory: -[2023-02-24 10:38:14,160][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json -[2023-02-24 10:38:14,162][01623] Overriding arg 'train_for_env_steps' with value 10000000 passed from command line -[2023-02-24 10:38:14,169][01623] Experiment dir /content/train_dir/default_experiment already exists! -[2023-02-24 10:38:14,170][01623] Resuming existing experiment from /content/train_dir/default_experiment... 
-[2023-02-24 10:38:14,172][01623] Weights and Biases integration disabled
-[2023-02-24 10:38:14,176][01623] Environment var CUDA_VISIBLE_DEVICES is 0
-
-[2023-02-24 10:38:15,667][01623] Starting experiment with the following configuration:
-help=False
-algo=APPO
-env=doom_health_gathering_supreme
-experiment=default_experiment
-train_dir=/content/train_dir
-restart_behavior=resume
-device=gpu
-seed=None
-num_policies=1
-async_rl=True
-serial_mode=False
-batched_sampling=False
-num_batches_to_accumulate=2
-worker_num_splits=2
-policy_workers_per_policy=1
-max_policy_lag=1000
-num_workers=8
-num_envs_per_worker=4
-batch_size=1024
-num_batches_per_epoch=1
-num_epochs=1
-rollout=32
-recurrence=32
-shuffle_minibatches=False
-gamma=0.99
-reward_scale=1.0
-reward_clip=1000.0
-value_bootstrap=False
-normalize_returns=True
-exploration_loss_coeff=0.001
-value_loss_coeff=0.5
-kl_loss_coeff=0.0
-exploration_loss=symmetric_kl
-gae_lambda=0.95
-ppo_clip_ratio=0.1
-ppo_clip_value=0.2
-with_vtrace=False
-vtrace_rho=1.0
-vtrace_c=1.0
-optimizer=adam
-adam_eps=1e-06
-adam_beta1=0.9
-adam_beta2=0.999
-max_grad_norm=4.0
-learning_rate=0.0001
-lr_schedule=constant
-lr_schedule_kl_threshold=0.008
-lr_adaptive_min=1e-06
-lr_adaptive_max=0.01
-obs_subtract_mean=0.0
-obs_scale=255.0
-normalize_input=True
-normalize_input_keys=None
-decorrelate_experience_max_seconds=0
-decorrelate_envs_on_one_worker=True
-actor_worker_gpus=[]
-set_workers_cpu_affinity=True
-force_envs_single_thread=False
-default_niceness=0
-log_to_file=True
-experiment_summaries_interval=10
-flush_summaries_interval=30
-stats_avg=100
-summaries_use_frameskip=True
-heartbeat_interval=20
-heartbeat_reporting_interval=600
-train_for_env_steps=10000000
-train_for_seconds=10000000000
-save_every_sec=120
-keep_checkpoints=2
-load_checkpoint_kind=latest
-save_milestones_sec=-1
-save_best_every_sec=5
-save_best_metric=reward
-save_best_after=100000
-benchmark=False
-encoder_mlp_layers=[512, 512]
-encoder_conv_architecture=convnet_simple
-encoder_conv_mlp_layers=[512]
-use_rnn=True
-rnn_size=512
-rnn_type=gru
-rnn_num_layers=1
-decoder_mlp_layers=[]
-nonlinearity=elu
-policy_initialization=orthogonal
-policy_init_gain=1.0
-actor_critic_share_weights=True
-adaptive_stddev=True
-continuous_tanh_scale=0.0
-initial_stddev=1.0
-use_env_info_cache=False
-env_gpu_actions=False
-env_gpu_observations=True
-env_frameskip=4
-env_framestack=1
-pixel_format=CHW
-use_record_episode_statistics=False
-with_wandb=False
-wandb_user=None
-wandb_project=sample_factory
-wandb_group=None
-wandb_job_type=SF
-wandb_tags=[]
-with_pbt=False
-pbt_mix_policies_in_one_env=True
-pbt_period_env_steps=5000000
-pbt_start_mutation=20000000
-pbt_replace_fraction=0.3
-pbt_mutation_rate=0.15
-pbt_replace_reward_gap=0.1
-pbt_replace_reward_gap_absolute=1e-06
-pbt_optimize_gamma=False
-pbt_target_objective=true_objective
-pbt_perturb_min=1.1
-pbt_perturb_max=1.5
-num_agents=-1
-num_humans=0
-num_bots=-1
-start_bot_difficulty=None
-timelimit=None
-res_w=128
-res_h=72
-wide_aspect_ratio=False
-eval_env_frameskip=1
-fps=35
-command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4500000
-cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4500000}
-git_hash=unknown
-git_repo_name=not a git repository
-[2023-02-24 10:38:15,671][01623] Saving configuration to /content/train_dir/default_experiment/config.json...
-[2023-02-24 10:38:15,677][01623] Rollout worker 0 uses device cpu -[2023-02-24 10:38:15,678][01623] Rollout worker 1 uses device cpu -[2023-02-24 10:38:15,682][01623] Rollout worker 2 uses device cpu -[2023-02-24 10:38:15,684][01623] Rollout worker 3 uses device cpu -[2023-02-24 10:38:15,689][01623] Rollout worker 4 uses device cpu -[2023-02-24 10:38:15,690][01623] Rollout worker 5 uses device cpu -[2023-02-24 10:38:15,692][01623] Rollout worker 6 uses device cpu -[2023-02-24 10:38:15,694][01623] Rollout worker 7 uses device cpu -[2023-02-24 10:38:15,809][01623] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-02-24 10:38:15,812][01623] InferenceWorker_p0-w0: min num requests: 2 -[2023-02-24 10:38:15,842][01623] Starting all processes... -[2023-02-24 10:38:15,843][01623] Starting process learner_proc0 -[2023-02-24 10:38:15,988][01623] Starting all processes... -[2023-02-24 10:38:15,996][01623] Starting process inference_proc0-0 -[2023-02-24 10:38:15,996][01623] Starting process rollout_proc0 -[2023-02-24 10:38:15,998][01623] Starting process rollout_proc1 -[2023-02-24 10:38:15,998][01623] Starting process rollout_proc2 -[2023-02-24 10:38:16,076][01623] Starting process rollout_proc3 -[2023-02-24 10:38:16,083][01623] Starting process rollout_proc4 -[2023-02-24 10:38:16,083][01623] Starting process rollout_proc5 -[2023-02-24 10:38:16,083][01623] Starting process rollout_proc6 -[2023-02-24 10:38:16,083][01623] Starting process rollout_proc7 -[2023-02-24 10:38:25,215][28910] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-02-24 10:38:25,219][28910] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 -[2023-02-24 10:38:25,245][28910] Num visible devices: 1 -[2023-02-24 10:38:25,269][28910] Starting seed is not provided -[2023-02-24 10:38:25,270][28910] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-02-24 10:38:25,271][28910] Initializing actor-critic model on device cuda:0 -[2023-02-24 10:38:25,272][28910] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 10:38:25,273][28910] RunningMeanStd input shape: (1,) -[2023-02-24 10:38:25,329][28910] ConvEncoder: input_channels=3 -[2023-02-24 10:38:26,185][28910] Conv encoder output size: 512 -[2023-02-24 10:38:26,191][28910] Policy head output size: 512 -[2023-02-24 10:38:26,264][28910] Created Actor Critic model with architecture: -[2023-02-24 10:38:26,283][28910] ActorCriticSharedWeights( - (obs_normalizer): ObservationNormalizer( - (running_mean_std): RunningMeanStdDictInPlace( - (running_mean_std): ModuleDict( - (obs): RunningMeanStdInPlace() - ) - ) - ) - (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) - (encoder): VizdoomEncoder( - (basic_encoder): ConvEncoder( - (enc): RecursiveScriptModule( - original_name=ConvEncoderImpl - (conv_head): RecursiveScriptModule( - original_name=Sequential - (0): RecursiveScriptModule(original_name=Conv2d) - (1): RecursiveScriptModule(original_name=ELU) - (2): RecursiveScriptModule(original_name=Conv2d) - (3): RecursiveScriptModule(original_name=ELU) - (4): RecursiveScriptModule(original_name=Conv2d) - (5): RecursiveScriptModule(original_name=ELU) - ) - (mlp_layers): RecursiveScriptModule( - original_name=Sequential - (0): RecursiveScriptModule(original_name=Linear) - (1): RecursiveScriptModule(original_name=ELU) - ) - ) - ) - ) - (core): ModelCoreRNN( - (core): GRU(512, 512) - ) - (decoder): MlpDecoder( - (mlp): Identity() - ) - (critic_linear): Linear(in_features=512, out_features=1, 
bias=True) - (action_parameterization): ActionParameterizationDefault( - (distribution_linear): Linear(in_features=512, out_features=5, bias=True) - ) -) -[2023-02-24 10:38:27,193][28926] Worker 0 uses CPU cores [0] -[2023-02-24 10:38:27,416][28925] Worker 1 uses CPU cores [1] -[2023-02-24 10:38:27,475][28924] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-02-24 10:38:27,477][28924] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 -[2023-02-24 10:38:27,538][28924] Num visible devices: 1 -[2023-02-24 10:38:27,857][28927] Worker 3 uses CPU cores [1] -[2023-02-24 10:38:28,100][28937] Worker 2 uses CPU cores [0] -[2023-02-24 10:38:28,226][28931] Worker 4 uses CPU cores [0] -[2023-02-24 10:38:28,248][28939] Worker 5 uses CPU cores [1] -[2023-02-24 10:38:28,369][28943] Worker 6 uses CPU cores [0] -[2023-02-24 10:38:28,460][28941] Worker 7 uses CPU cores [1] -[2023-02-24 10:38:30,388][28910] Using optimizer -[2023-02-24 10:38:30,389][28910] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... -[2023-02-24 10:38:30,422][28910] Loading model from checkpoint -[2023-02-24 10:38:30,426][28910] Loaded experiment state at self.train_step=1100, self.env_steps=4505600 -[2023-02-24 10:38:30,427][28910] Initialized policy 0 weights for model version 1100 -[2023-02-24 10:38:30,430][28910] LearnerWorker_p0 finished initialization! -[2023-02-24 10:38:30,432][28910] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-02-24 10:38:30,573][28924] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 10:38:30,574][28924] RunningMeanStd input shape: (1,) -[2023-02-24 10:38:30,586][28924] ConvEncoder: input_channels=3 -[2023-02-24 10:38:30,684][28924] Conv encoder output size: 512 -[2023-02-24 10:38:30,684][28924] Policy head output size: 512 -[2023-02-24 10:38:32,989][01623] Inference worker 0-0 is ready! -[2023-02-24 10:38:32,992][01623] All inference workers are ready! Signal rollout workers to start! -[2023-02-24 10:38:33,115][28926] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,118][28931] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,122][28937] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,144][28925] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,144][28927] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,154][28941] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,160][28939] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,163][28943] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-02-24 10:38:33,657][28941] Decorrelating experience for 0 frames... -[2023-02-24 10:38:34,177][01623] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4505600. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 10:38:34,387][28939] Decorrelating experience for 0 frames... -[2023-02-24 10:38:34,389][28941] Decorrelating experience for 32 frames... -[2023-02-24 10:38:34,655][28926] Decorrelating experience for 0 frames... -[2023-02-24 10:38:34,666][28931] Decorrelating experience for 0 frames... -[2023-02-24 10:38:34,682][28943] Decorrelating experience for 0 frames... -[2023-02-24 10:38:34,680][28937] Decorrelating experience for 0 frames... -[2023-02-24 10:38:35,313][28939] Decorrelating experience for 32 frames... 
-[2023-02-24 10:38:35,805][01623] Heartbeat connected on Batcher_0 -[2023-02-24 10:38:35,808][01623] Heartbeat connected on LearnerWorker_p0 -[2023-02-24 10:38:35,836][01623] Heartbeat connected on InferenceWorker_p0-w0 -[2023-02-24 10:38:35,946][28926] Decorrelating experience for 32 frames... -[2023-02-24 10:38:35,986][28937] Decorrelating experience for 32 frames... -[2023-02-24 10:38:36,160][28941] Decorrelating experience for 64 frames... -[2023-02-24 10:38:36,179][28927] Decorrelating experience for 0 frames... -[2023-02-24 10:38:36,951][28943] Decorrelating experience for 32 frames... -[2023-02-24 10:38:37,596][28939] Decorrelating experience for 64 frames... -[2023-02-24 10:38:37,836][28925] Decorrelating experience for 0 frames... -[2023-02-24 10:38:37,867][28927] Decorrelating experience for 32 frames... -[2023-02-24 10:38:37,995][28926] Decorrelating experience for 64 frames... -[2023-02-24 10:38:38,027][28937] Decorrelating experience for 64 frames... -[2023-02-24 10:38:38,803][28943] Decorrelating experience for 64 frames... -[2023-02-24 10:38:39,029][28939] Decorrelating experience for 96 frames... -[2023-02-24 10:38:39,176][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4505600. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 10:38:39,197][28931] Decorrelating experience for 32 frames... -[2023-02-24 10:38:39,384][01623] Heartbeat connected on RolloutWorker_w5 -[2023-02-24 10:38:39,690][28925] Decorrelating experience for 32 frames... -[2023-02-24 10:38:39,993][28937] Decorrelating experience for 96 frames... -[2023-02-24 10:38:40,616][01623] Heartbeat connected on RolloutWorker_w2 -[2023-02-24 10:38:41,112][28927] Decorrelating experience for 64 frames... -[2023-02-24 10:38:41,762][28925] Decorrelating experience for 64 frames... -[2023-02-24 10:38:42,447][28943] Decorrelating experience for 96 frames... -[2023-02-24 10:38:42,521][28927] Decorrelating experience for 96 frames... -[2023-02-24 10:38:42,758][01623] Heartbeat connected on RolloutWorker_w3 -[2023-02-24 10:38:43,095][01623] Heartbeat connected on RolloutWorker_w6 -[2023-02-24 10:38:43,092][28926] Decorrelating experience for 96 frames... -[2023-02-24 10:38:43,979][01623] Heartbeat connected on RolloutWorker_w0 -[2023-02-24 10:38:44,177][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4505600. Throughput: 0: 6.4. Samples: 64. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 10:38:44,185][01623] Avg episode reward: [(0, '3.288')] -[2023-02-24 10:38:45,174][28931] Decorrelating experience for 64 frames... -[2023-02-24 10:38:45,212][28941] Decorrelating experience for 96 frames... -[2023-02-24 10:38:45,609][01623] Heartbeat connected on RolloutWorker_w7 -[2023-02-24 10:38:46,374][28910] Signal inference workers to stop experience collection... -[2023-02-24 10:38:46,388][28924] InferenceWorker_p0-w0: stopping experience collection -[2023-02-24 10:38:47,033][28931] Decorrelating experience for 96 frames... -[2023-02-24 10:38:47,111][01623] Heartbeat connected on RolloutWorker_w4 -[2023-02-24 10:38:47,127][28925] Decorrelating experience for 96 frames... -[2023-02-24 10:38:47,206][01623] Heartbeat connected on RolloutWorker_w1 -[2023-02-24 10:38:49,180][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4505600. Throughput: 0: 160.4. Samples: 2406. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-02-24 10:38:49,182][01623] Avg episode reward: [(0, '4.869')] -[2023-02-24 10:38:49,426][28910] Signal inference workers to resume experience collection... -[2023-02-24 10:38:49,427][28924] InferenceWorker_p0-w0: resuming experience collection -[2023-02-24 10:38:54,177][01623] Fps is (10 sec: 2048.0, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 4526080. Throughput: 0: 322.3. Samples: 6446. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) -[2023-02-24 10:38:54,182][01623] Avg episode reward: [(0, '9.415')] -[2023-02-24 10:38:59,177][01623] Fps is (10 sec: 3687.2, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 4542464. Throughput: 0: 338.3. Samples: 8458. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:38:59,196][01623] Avg episode reward: [(0, '11.976')] -[2023-02-24 10:39:00,348][28924] Updated weights for policy 0, policy_version 1110 (0.0018) -[2023-02-24 10:39:04,177][01623] Fps is (10 sec: 2867.2, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 4554752. Throughput: 0: 421.9. Samples: 12656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:39:04,179][01623] Avg episode reward: [(0, '14.624')] -[2023-02-24 10:39:09,177][01623] Fps is (10 sec: 3276.8, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 4575232. Throughput: 0: 520.2. Samples: 18208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:39:09,184][01623] Avg episode reward: [(0, '16.763')] -[2023-02-24 10:39:11,415][28924] Updated weights for policy 0, policy_version 1120 (0.0012) -[2023-02-24 10:39:14,177][01623] Fps is (10 sec: 4096.0, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 4595712. Throughput: 0: 540.0. Samples: 21602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:39:14,182][01623] Avg episode reward: [(0, '19.612')] -[2023-02-24 10:39:19,177][01623] Fps is (10 sec: 4096.0, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 4616192. Throughput: 0: 611.3. Samples: 27508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:39:19,179][01623] Avg episode reward: [(0, '22.054')] -[2023-02-24 10:39:23,184][28924] Updated weights for policy 0, policy_version 1130 (0.0015) -[2023-02-24 10:39:24,177][01623] Fps is (10 sec: 3276.8, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 4628480. Throughput: 0: 705.5. Samples: 31748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:39:24,183][01623] Avg episode reward: [(0, '23.631')] -[2023-02-24 10:39:29,177][01623] Fps is (10 sec: 3276.8, 60 sec: 2606.5, 300 sec: 2606.5). Total num frames: 4648960. Throughput: 0: 753.2. Samples: 33958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:39:29,180][01623] Avg episode reward: [(0, '26.722')] -[2023-02-24 10:39:33,745][28924] Updated weights for policy 0, policy_version 1140 (0.0017) -[2023-02-24 10:39:34,177][01623] Fps is (10 sec: 4096.0, 60 sec: 2730.7, 300 sec: 2730.7). Total num frames: 4669440. Throughput: 0: 851.8. Samples: 40736. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:39:34,179][01623] Avg episode reward: [(0, '28.759')] -[2023-02-24 10:39:39,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3072.0, 300 sec: 2835.7). Total num frames: 4689920. Throughput: 0: 889.9. Samples: 46490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:39:39,179][01623] Avg episode reward: [(0, '29.656')] -[2023-02-24 10:39:44,177][01623] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 2808.7). Total num frames: 4702208. Throughput: 0: 891.9. 
Samples: 48592. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:39:44,180][01623] Avg episode reward: [(0, '30.191')] -[2023-02-24 10:39:44,194][28910] Saving new best policy, reward=30.191! -[2023-02-24 10:39:46,576][28924] Updated weights for policy 0, policy_version 1150 (0.0028) -[2023-02-24 10:39:49,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 2839.9). Total num frames: 4718592. Throughput: 0: 897.6. Samples: 53048. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:39:49,179][01623] Avg episode reward: [(0, '29.059')] -[2023-02-24 10:39:54,177][01623] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 2969.6). Total num frames: 4743168. Throughput: 0: 926.8. Samples: 59916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:39:54,180][01623] Avg episode reward: [(0, '27.072')] -[2023-02-24 10:39:55,782][28924] Updated weights for policy 0, policy_version 1160 (0.0022) -[2023-02-24 10:39:59,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 2987.7). Total num frames: 4759552. Throughput: 0: 927.5. Samples: 63340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:39:59,184][01623] Avg episode reward: [(0, '26.260')] -[2023-02-24 10:40:04,179][01623] Fps is (10 sec: 2866.4, 60 sec: 3618.0, 300 sec: 2958.1). Total num frames: 4771840. Throughput: 0: 882.5. Samples: 67222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:40:04,185][01623] Avg episode reward: [(0, '26.167')] -[2023-02-24 10:40:09,177][01623] Fps is (10 sec: 2457.4, 60 sec: 3481.6, 300 sec: 2931.9). Total num frames: 4784128. Throughput: 0: 862.3. Samples: 70550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:40:09,182][01623] Avg episode reward: [(0, '25.514')] -[2023-02-24 10:40:11,694][28924] Updated weights for policy 0, policy_version 1170 (0.0031) -[2023-02-24 10:40:14,177][01623] Fps is (10 sec: 2458.3, 60 sec: 3345.1, 300 sec: 2908.2). Total num frames: 4796416. Throughput: 0: 852.3. Samples: 72310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:40:14,185][01623] Avg episode reward: [(0, '26.188')] -[2023-02-24 10:40:14,193][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001171_4796416.pth... -[2023-02-24 10:40:14,438][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001063_4354048.pth -[2023-02-24 10:40:19,177][01623] Fps is (10 sec: 3277.0, 60 sec: 3345.1, 300 sec: 2964.7). Total num frames: 4816896. Throughput: 0: 826.8. Samples: 77940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:40:19,179][01623] Avg episode reward: [(0, '24.339')] -[2023-02-24 10:40:22,058][28924] Updated weights for policy 0, policy_version 1180 (0.0033) -[2023-02-24 10:40:24,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3016.1). Total num frames: 4837376. Throughput: 0: 834.5. Samples: 84044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:40:24,179][01623] Avg episode reward: [(0, '24.590')] -[2023-02-24 10:40:29,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3027.5). Total num frames: 4853760. Throughput: 0: 835.3. Samples: 86182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:40:29,183][01623] Avg episode reward: [(0, '24.623')] -[2023-02-24 10:40:34,177][01623] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3037.9). Total num frames: 4870144. Throughput: 0: 832.8. Samples: 90522. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:40:34,179][01623] Avg episode reward: [(0, '24.255')] -[2023-02-24 10:40:35,253][28924] Updated weights for policy 0, policy_version 1190 (0.0017) -[2023-02-24 10:40:39,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3080.2). Total num frames: 4890624. Throughput: 0: 824.1. Samples: 97002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:40:39,179][01623] Avg episode reward: [(0, '24.742')] -[2023-02-24 10:40:44,177][01623] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3119.3). Total num frames: 4911104. Throughput: 0: 823.2. Samples: 100384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:40:44,183][01623] Avg episode reward: [(0, '23.775')] -[2023-02-24 10:40:45,026][28924] Updated weights for policy 0, policy_version 1200 (0.0012) -[2023-02-24 10:40:49,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3094.8). Total num frames: 4923392. Throughput: 0: 841.3. Samples: 105078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:40:49,184][01623] Avg episode reward: [(0, '24.153')] -[2023-02-24 10:40:54,177][01623] Fps is (10 sec: 2867.3, 60 sec: 3276.8, 300 sec: 3101.3). Total num frames: 4939776. Throughput: 0: 863.6. Samples: 109412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:40:54,179][01623] Avg episode reward: [(0, '24.563')] -[2023-02-24 10:40:57,263][28924] Updated weights for policy 0, policy_version 1210 (0.0017) -[2023-02-24 10:40:59,180][01623] Fps is (10 sec: 4094.5, 60 sec: 3413.1, 300 sec: 3163.7). Total num frames: 4964352. Throughput: 0: 900.9. Samples: 112854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:40:59,184][01623] Avg episode reward: [(0, '25.321')] -[2023-02-24 10:41:04,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3550.0, 300 sec: 3194.9). Total num frames: 4984832. Throughput: 0: 926.0. Samples: 119608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:41:04,185][01623] Avg episode reward: [(0, '25.737')] -[2023-02-24 10:41:08,081][28924] Updated weights for policy 0, policy_version 1220 (0.0020) -[2023-02-24 10:41:09,177][01623] Fps is (10 sec: 3278.0, 60 sec: 3549.9, 300 sec: 3171.1). Total num frames: 4997120. Throughput: 0: 888.8. Samples: 124042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:41:09,182][01623] Avg episode reward: [(0, '24.884')] -[2023-02-24 10:41:14,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3174.4). Total num frames: 5013504. Throughput: 0: 886.6. Samples: 126080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:41:14,180][01623] Avg episode reward: [(0, '25.072')] -[2023-02-24 10:41:19,178][01623] Fps is (10 sec: 3686.0, 60 sec: 3618.1, 300 sec: 3202.3). Total num frames: 5033984. Throughput: 0: 915.9. Samples: 131740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:41:19,184][01623] Avg episode reward: [(0, '25.720')] -[2023-02-24 10:41:19,744][28924] Updated weights for policy 0, policy_version 1230 (0.0014) -[2023-02-24 10:41:24,177][01623] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3228.6). Total num frames: 5054464. Throughput: 0: 923.4. Samples: 138554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:41:24,179][01623] Avg episode reward: [(0, '25.557')] -[2023-02-24 10:41:29,177][01623] Fps is (10 sec: 3686.8, 60 sec: 3618.1, 300 sec: 3230.0). Total num frames: 5070848. Throughput: 0: 899.7. Samples: 140872. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:41:29,182][01623] Avg episode reward: [(0, '23.273')] -[2023-02-24 10:41:31,515][28924] Updated weights for policy 0, policy_version 1240 (0.0022) -[2023-02-24 10:41:34,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3208.5). Total num frames: 5083136. Throughput: 0: 889.3. Samples: 145096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:41:34,183][01623] Avg episode reward: [(0, '23.467')] -[2023-02-24 10:41:39,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3232.5). Total num frames: 5103616. Throughput: 0: 919.0. Samples: 150766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:41:39,183][01623] Avg episode reward: [(0, '24.702')] -[2023-02-24 10:41:42,080][28924] Updated weights for policy 0, policy_version 1250 (0.0018) -[2023-02-24 10:41:44,177][01623] Fps is (10 sec: 4505.5, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 5128192. Throughput: 0: 915.6. Samples: 154054. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:41:44,179][01623] Avg episode reward: [(0, '24.497')] -[2023-02-24 10:41:49,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3276.8). Total num frames: 5144576. Throughput: 0: 889.6. Samples: 159642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:41:49,180][01623] Avg episode reward: [(0, '23.015')] -[2023-02-24 10:41:54,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3256.3). Total num frames: 5156864. Throughput: 0: 885.1. Samples: 163870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:41:54,184][01623] Avg episode reward: [(0, '23.580')] -[2023-02-24 10:41:54,765][28924] Updated weights for policy 0, policy_version 1260 (0.0017) -[2023-02-24 10:41:59,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3550.1, 300 sec: 3276.8). Total num frames: 5177344. Throughput: 0: 896.3. Samples: 166414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:41:59,184][01623] Avg episode reward: [(0, '24.457')] -[2023-02-24 10:42:04,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3296.3). Total num frames: 5197824. Throughput: 0: 918.2. Samples: 173056. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:42:04,180][01623] Avg episode reward: [(0, '23.990')] -[2023-02-24 10:42:04,435][28924] Updated weights for policy 0, policy_version 1270 (0.0014) -[2023-02-24 10:42:09,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3295.9). Total num frames: 5214208. Throughput: 0: 888.8. Samples: 178550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:42:09,180][01623] Avg episode reward: [(0, '24.410')] -[2023-02-24 10:42:14,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3295.4). Total num frames: 5230592. Throughput: 0: 884.8. Samples: 180688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:42:14,186][01623] Avg episode reward: [(0, '23.390')] -[2023-02-24 10:42:14,198][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001277_5230592.pth... -[2023-02-24 10:42:14,402][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth -[2023-02-24 10:42:17,384][28924] Updated weights for policy 0, policy_version 1280 (0.0013) -[2023-02-24 10:42:19,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3313.2). Total num frames: 5251072. Throughput: 0: 898.5. Samples: 185530. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:42:19,178][01623] Avg episode reward: [(0, '26.329')] -[2023-02-24 10:42:24,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3330.2). Total num frames: 5271552. Throughput: 0: 922.8. Samples: 192294. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:42:24,180][01623] Avg episode reward: [(0, '25.696')] -[2023-02-24 10:42:26,508][28924] Updated weights for policy 0, policy_version 1290 (0.0014) -[2023-02-24 10:42:29,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3329.1). Total num frames: 5287936. Throughput: 0: 920.7. Samples: 195486. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:42:29,183][01623] Avg episode reward: [(0, '26.962')] -[2023-02-24 10:42:34,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3328.0). Total num frames: 5304320. Throughput: 0: 891.2. Samples: 199748. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:42:34,180][01623] Avg episode reward: [(0, '26.428')] -[2023-02-24 10:42:39,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3327.0). Total num frames: 5320704. Throughput: 0: 912.1. Samples: 204916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:42:39,181][01623] Avg episode reward: [(0, '26.146')] -[2023-02-24 10:42:39,451][28924] Updated weights for policy 0, policy_version 1300 (0.0033) -[2023-02-24 10:42:44,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3358.7). Total num frames: 5345280. Throughput: 0: 929.8. Samples: 208256. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:42:44,178][01623] Avg episode reward: [(0, '26.688')] -[2023-02-24 10:42:49,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3357.1). Total num frames: 5361664. Throughput: 0: 920.4. Samples: 214476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:42:49,183][01623] Avg episode reward: [(0, '24.312')] -[2023-02-24 10:42:49,510][28924] Updated weights for policy 0, policy_version 1310 (0.0014) -[2023-02-24 10:42:54,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3355.6). Total num frames: 5378048. Throughput: 0: 894.5. Samples: 218802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:42:54,190][01623] Avg episode reward: [(0, '24.828')] -[2023-02-24 10:42:59,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3354.1). Total num frames: 5394432. Throughput: 0: 896.2. Samples: 221016. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:42:59,179][01623] Avg episode reward: [(0, '24.184')] -[2023-02-24 10:43:01,657][28924] Updated weights for policy 0, policy_version 1320 (0.0018) -[2023-02-24 10:43:04,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3367.8). Total num frames: 5414912. Throughput: 0: 931.4. Samples: 227442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:43:04,179][01623] Avg episode reward: [(0, '23.402')] -[2023-02-24 10:43:09,178][01623] Fps is (10 sec: 4095.2, 60 sec: 3686.3, 300 sec: 3381.0). Total num frames: 5435392. Throughput: 0: 922.0. Samples: 233786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:43:09,181][01623] Avg episode reward: [(0, '23.133')] -[2023-02-24 10:43:12,186][28924] Updated weights for policy 0, policy_version 1330 (0.0012) -[2023-02-24 10:43:14,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3379.2). Total num frames: 5451776. Throughput: 0: 898.3. Samples: 235908. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:43:14,182][01623] Avg episode reward: [(0, '24.456')] -[2023-02-24 10:43:19,176][01623] Fps is (10 sec: 3277.4, 60 sec: 3618.1, 300 sec: 3377.4). Total num frames: 5468160. Throughput: 0: 900.1. Samples: 240254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:43:19,181][01623] Avg episode reward: [(0, '25.422')] -[2023-02-24 10:43:23,594][28924] Updated weights for policy 0, policy_version 1340 (0.0014) -[2023-02-24 10:43:24,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3389.8). Total num frames: 5488640. Throughput: 0: 930.1. Samples: 246770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:43:24,185][01623] Avg episode reward: [(0, '25.943')] -[2023-02-24 10:43:29,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3401.8). Total num frames: 5509120. Throughput: 0: 931.7. Samples: 250182. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:43:29,184][01623] Avg episode reward: [(0, '24.858')] -[2023-02-24 10:43:34,177][01623] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3457.3). Total num frames: 5525504. Throughput: 0: 901.9. Samples: 255062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:43:34,179][01623] Avg episode reward: [(0, '25.832')] -[2023-02-24 10:43:35,158][28924] Updated weights for policy 0, policy_version 1350 (0.0023) -[2023-02-24 10:43:39,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 5537792. Throughput: 0: 894.3. Samples: 259044. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:43:39,180][01623] Avg episode reward: [(0, '25.780')] -[2023-02-24 10:43:44,177][01623] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 5550080. Throughput: 0: 886.8. Samples: 260924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:43:44,184][01623] Avg episode reward: [(0, '25.613')] -[2023-02-24 10:43:49,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 5566464. Throughput: 0: 841.9. Samples: 265326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:43:49,179][01623] Avg episode reward: [(0, '25.162')] -[2023-02-24 10:43:49,645][28924] Updated weights for policy 0, policy_version 1360 (0.0020) -[2023-02-24 10:43:54,180][01623] Fps is (10 sec: 3275.6, 60 sec: 3413.1, 300 sec: 3526.7). Total num frames: 5582848. Throughput: 0: 805.4. Samples: 270032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:43:54,183][01623] Avg episode reward: [(0, '25.255')] -[2023-02-24 10:43:59,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 5595136. Throughput: 0: 804.8. Samples: 272126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:43:59,183][01623] Avg episode reward: [(0, '26.377')] -[2023-02-24 10:44:02,621][28924] Updated weights for policy 0, policy_version 1370 (0.0012) -[2023-02-24 10:44:04,177][01623] Fps is (10 sec: 3278.0, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 5615616. Throughput: 0: 827.4. Samples: 277486. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:44:04,182][01623] Avg episode reward: [(0, '26.656')] -[2023-02-24 10:44:09,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3413.4, 300 sec: 3540.6). Total num frames: 5640192. Throughput: 0: 831.4. Samples: 284182. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:44:09,184][01623] Avg episode reward: [(0, '26.001')] -[2023-02-24 10:44:12,728][28924] Updated weights for policy 0, policy_version 1380 (0.0014) -[2023-02-24 10:44:14,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3512.8). Total num frames: 5652480. Throughput: 0: 813.8. Samples: 286802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:44:14,181][01623] Avg episode reward: [(0, '26.197')] -[2023-02-24 10:44:14,282][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001381_5656576.pth... -[2023-02-24 10:44:14,608][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001171_4796416.pth -[2023-02-24 10:44:19,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 5668864. Throughput: 0: 797.2. Samples: 290936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:44:19,179][01623] Avg episode reward: [(0, '25.554')] -[2023-02-24 10:44:24,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 5689344. Throughput: 0: 831.6. Samples: 296468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:44:24,182][01623] Avg episode reward: [(0, '26.177')] -[2023-02-24 10:44:25,077][28924] Updated weights for policy 0, policy_version 1390 (0.0021) -[2023-02-24 10:44:29,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 5709824. Throughput: 0: 859.7. Samples: 299610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:44:29,183][01623] Avg episode reward: [(0, '24.695')] -[2023-02-24 10:44:34,179][01623] Fps is (10 sec: 3685.5, 60 sec: 3344.9, 300 sec: 3512.8). Total num frames: 5726208. Throughput: 0: 883.9. Samples: 305106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:44:34,186][01623] Avg episode reward: [(0, '27.673')] -[2023-02-24 10:44:36,633][28924] Updated weights for policy 0, policy_version 1400 (0.0013) -[2023-02-24 10:44:39,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3512.8). Total num frames: 5738496. Throughput: 0: 870.9. Samples: 309218. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 10:44:39,183][01623] Avg episode reward: [(0, '27.093')] -[2023-02-24 10:44:44,177][01623] Fps is (10 sec: 3277.6, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 5758976. Throughput: 0: 874.5. Samples: 311480. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:44:44,179][01623] Avg episode reward: [(0, '27.463')] -[2023-02-24 10:44:47,976][28924] Updated weights for policy 0, policy_version 1410 (0.0022) -[2023-02-24 10:44:49,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 5779456. Throughput: 0: 903.9. Samples: 318160. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:44:49,184][01623] Avg episode reward: [(0, '29.117')] -[2023-02-24 10:44:54,177][01623] Fps is (10 sec: 3686.5, 60 sec: 3550.1, 300 sec: 3512.8). Total num frames: 5795840. Throughput: 0: 883.1. Samples: 323922. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:44:54,179][01623] Avg episode reward: [(0, '29.151')] -[2023-02-24 10:44:59,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3526.8). Total num frames: 5812224. Throughput: 0: 873.1. Samples: 326090. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:44:59,179][01623] Avg episode reward: [(0, '29.213')] -[2023-02-24 10:45:00,342][28924] Updated weights for policy 0, policy_version 1420 (0.0020) -[2023-02-24 10:45:04,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5828608. Throughput: 0: 884.0. Samples: 330714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:45:04,184][01623] Avg episode reward: [(0, '29.695')] -[2023-02-24 10:45:09,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 5853184. Throughput: 0: 913.1. Samples: 337556. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:45:09,183][01623] Avg episode reward: [(0, '27.375')] -[2023-02-24 10:45:09,828][28924] Updated weights for policy 0, policy_version 1430 (0.0020) -[2023-02-24 10:45:14,177][01623] Fps is (10 sec: 4505.4, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 5873664. Throughput: 0: 917.7. Samples: 340908. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:45:14,181][01623] Avg episode reward: [(0, '27.900')] -[2023-02-24 10:45:19,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 5885952. Throughput: 0: 892.4. Samples: 345262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:45:19,186][01623] Avg episode reward: [(0, '26.080')] -[2023-02-24 10:45:22,745][28924] Updated weights for policy 0, policy_version 1440 (0.0032) -[2023-02-24 10:45:24,177][01623] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5902336. Throughput: 0: 910.2. Samples: 350176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:45:24,179][01623] Avg episode reward: [(0, '26.142')] -[2023-02-24 10:45:29,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5926912. Throughput: 0: 935.0. Samples: 353556. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:45:29,180][01623] Avg episode reward: [(0, '26.819')] -[2023-02-24 10:45:32,016][28924] Updated weights for policy 0, policy_version 1450 (0.0013) -[2023-02-24 10:45:34,178][01623] Fps is (10 sec: 4095.5, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 5943296. Throughput: 0: 926.9. Samples: 359872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:45:34,181][01623] Avg episode reward: [(0, '27.108')] -[2023-02-24 10:45:39,181][01623] Fps is (10 sec: 3275.5, 60 sec: 3686.2, 300 sec: 3554.4). Total num frames: 5959680. Throughput: 0: 890.9. Samples: 364014. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:45:39,189][01623] Avg episode reward: [(0, '26.686')] -[2023-02-24 10:45:44,177][01623] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5971968. Throughput: 0: 890.0. Samples: 366140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:45:44,185][01623] Avg episode reward: [(0, '27.805')] -[2023-02-24 10:45:45,288][28924] Updated weights for policy 0, policy_version 1460 (0.0020) -[2023-02-24 10:45:49,177][01623] Fps is (10 sec: 3687.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5996544. Throughput: 0: 922.3. Samples: 372216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:45:49,180][01623] Avg episode reward: [(0, '27.385')] -[2023-02-24 10:45:54,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 6017024. Throughput: 0: 910.5. Samples: 378528. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:45:54,182][01623] Avg episode reward: [(0, '26.535')] -[2023-02-24 10:45:55,362][28924] Updated weights for policy 0, policy_version 1470 (0.0012) -[2023-02-24 10:45:59,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6029312. Throughput: 0: 883.5. Samples: 380666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:45:59,181][01623] Avg episode reward: [(0, '26.269')] -[2023-02-24 10:46:04,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6045696. Throughput: 0: 882.4. Samples: 384972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:46:04,185][01623] Avg episode reward: [(0, '25.310')] -[2023-02-24 10:46:07,264][28924] Updated weights for policy 0, policy_version 1480 (0.0034) -[2023-02-24 10:46:09,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 6070272. Throughput: 0: 921.5. Samples: 391642. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:46:09,183][01623] Avg episode reward: [(0, '27.435')] -[2023-02-24 10:46:14,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 6090752. Throughput: 0: 919.8. Samples: 394948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:46:14,179][01623] Avg episode reward: [(0, '26.578')] -[2023-02-24 10:46:14,191][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001487_6090752.pth... -[2023-02-24 10:46:14,364][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001277_5230592.pth -[2023-02-24 10:46:18,678][28924] Updated weights for policy 0, policy_version 1490 (0.0012) -[2023-02-24 10:46:19,178][01623] Fps is (10 sec: 3276.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6103040. Throughput: 0: 880.8. Samples: 399508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:46:19,182][01623] Avg episode reward: [(0, '26.928')] -[2023-02-24 10:46:24,177][01623] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6115328. Throughput: 0: 879.6. Samples: 403594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:46:24,180][01623] Avg episode reward: [(0, '28.441')] -[2023-02-24 10:46:29,177][01623] Fps is (10 sec: 3686.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6139904. Throughput: 0: 904.7. Samples: 406850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:46:29,180][01623] Avg episode reward: [(0, '28.376')] -[2023-02-24 10:46:29,991][28924] Updated weights for policy 0, policy_version 1500 (0.0020) -[2023-02-24 10:46:34,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 6160384. Throughput: 0: 917.1. Samples: 413486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:46:34,182][01623] Avg episode reward: [(0, '27.725')] -[2023-02-24 10:46:39,177][01623] Fps is (10 sec: 3276.6, 60 sec: 3550.1, 300 sec: 3540.6). Total num frames: 6172672. Throughput: 0: 879.8. Samples: 418118. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:46:39,181][01623] Avg episode reward: [(0, '27.315')] -[2023-02-24 10:46:42,306][28924] Updated weights for policy 0, policy_version 1510 (0.0014) -[2023-02-24 10:46:44,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6189056. Throughput: 0: 877.7. Samples: 420162. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:46:44,180][01623] Avg episode reward: [(0, '26.192')] -[2023-02-24 10:46:49,177][01623] Fps is (10 sec: 3686.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6209536. Throughput: 0: 903.6. Samples: 425632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:46:49,185][01623] Avg episode reward: [(0, '25.696')] -[2023-02-24 10:46:52,462][28924] Updated weights for policy 0, policy_version 1520 (0.0022) -[2023-02-24 10:46:54,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6230016. Throughput: 0: 905.1. Samples: 432370. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:46:54,185][01623] Avg episode reward: [(0, '24.908')] -[2023-02-24 10:46:59,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6246400. Throughput: 0: 884.0. Samples: 434730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:46:59,183][01623] Avg episode reward: [(0, '25.616')] -[2023-02-24 10:47:04,178][01623] Fps is (10 sec: 2866.7, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 6258688. Throughput: 0: 876.4. Samples: 438948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:47:04,181][01623] Avg episode reward: [(0, '25.023')] -[2023-02-24 10:47:05,561][28924] Updated weights for policy 0, policy_version 1530 (0.0013) -[2023-02-24 10:47:09,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 6279168. Throughput: 0: 912.7. Samples: 444666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:09,179][01623] Avg episode reward: [(0, '24.310')] -[2023-02-24 10:47:14,177][01623] Fps is (10 sec: 4506.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6303744. Throughput: 0: 911.5. Samples: 447868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:14,183][01623] Avg episode reward: [(0, '24.429')] -[2023-02-24 10:47:15,228][28924] Updated weights for policy 0, policy_version 1540 (0.0023) -[2023-02-24 10:47:19,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6316032. Throughput: 0: 871.5. Samples: 452704. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:19,179][01623] Avg episode reward: [(0, '25.264')] -[2023-02-24 10:47:24,177][01623] Fps is (10 sec: 2048.0, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 6324224. Throughput: 0: 845.0. Samples: 456142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:24,182][01623] Avg episode reward: [(0, '24.670')] -[2023-02-24 10:47:29,177][01623] Fps is (10 sec: 2048.0, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 6336512. Throughput: 0: 835.4. Samples: 457756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:29,183][01623] Avg episode reward: [(0, '24.629')] -[2023-02-24 10:47:32,089][28924] Updated weights for policy 0, policy_version 1550 (0.0028) -[2023-02-24 10:47:34,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 6356992. Throughput: 0: 815.8. Samples: 462344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:47:34,179][01623] Avg episode reward: [(0, '24.879')] -[2023-02-24 10:47:39,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3499.0). Total num frames: 6377472. Throughput: 0: 812.5. Samples: 468934. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:47:39,179][01623] Avg episode reward: [(0, '25.972')] -[2023-02-24 10:47:41,659][28924] Updated weights for policy 0, policy_version 1560 (0.0029) -[2023-02-24 10:47:44,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 6393856. Throughput: 0: 825.8. Samples: 471890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:44,181][01623] Avg episode reward: [(0, '26.030')] -[2023-02-24 10:47:49,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 6410240. Throughput: 0: 824.1. Samples: 476032. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:47:49,183][01623] Avg episode reward: [(0, '25.609')] -[2023-02-24 10:47:54,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 6426624. Throughput: 0: 809.9. Samples: 481110. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:47:54,179][01623] Avg episode reward: [(0, '25.638')] -[2023-02-24 10:47:54,543][28924] Updated weights for policy 0, policy_version 1570 (0.0018) -[2023-02-24 10:47:59,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 6447104. Throughput: 0: 813.2. Samples: 484462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:47:59,183][01623] Avg episode reward: [(0, '26.261')] -[2023-02-24 10:48:04,177][01623] Fps is (10 sec: 4095.8, 60 sec: 3481.7, 300 sec: 3499.0). Total num frames: 6467584. Throughput: 0: 843.5. Samples: 490664. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:48:04,185][01623] Avg episode reward: [(0, '25.736')] -[2023-02-24 10:48:05,151][28924] Updated weights for policy 0, policy_version 1580 (0.0018) -[2023-02-24 10:48:09,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 6479872. Throughput: 0: 860.8. Samples: 494880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:48:09,180][01623] Avg episode reward: [(0, '25.759')] -[2023-02-24 10:48:14,177][01623] Fps is (10 sec: 2867.3, 60 sec: 3208.5, 300 sec: 3485.1). Total num frames: 6496256. Throughput: 0: 869.1. Samples: 496866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:48:14,185][01623] Avg episode reward: [(0, '24.768')] -[2023-02-24 10:48:14,199][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001586_6496256.pth... -[2023-02-24 10:48:14,382][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001381_5656576.pth -[2023-02-24 10:48:16,969][28924] Updated weights for policy 0, policy_version 1590 (0.0015) -[2023-02-24 10:48:19,179][01623] Fps is (10 sec: 4094.9, 60 sec: 3413.2, 300 sec: 3498.9). Total num frames: 6520832. Throughput: 0: 909.3. Samples: 503266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:19,187][01623] Avg episode reward: [(0, '25.565')] -[2023-02-24 10:48:24,180][01623] Fps is (10 sec: 4094.9, 60 sec: 3549.7, 300 sec: 3485.0). Total num frames: 6537216. Throughput: 0: 896.1. Samples: 509260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:48:24,183][01623] Avg episode reward: [(0, '26.238')] -[2023-02-24 10:48:28,920][28924] Updated weights for policy 0, policy_version 1600 (0.0026) -[2023-02-24 10:48:29,177][01623] Fps is (10 sec: 3277.7, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 6553600. Throughput: 0: 875.4. Samples: 511282. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:29,183][01623] Avg episode reward: [(0, '25.663')] -[2023-02-24 10:48:34,177][01623] Fps is (10 sec: 3277.7, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 6569984. Throughput: 0: 876.6. Samples: 515480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:34,180][01623] Avg episode reward: [(0, '24.993')] -[2023-02-24 10:48:39,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6590464. Throughput: 0: 909.1. Samples: 522018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:39,183][01623] Avg episode reward: [(0, '26.526')] -[2023-02-24 10:48:39,771][28924] Updated weights for policy 0, policy_version 1610 (0.0026) -[2023-02-24 10:48:44,177][01623] Fps is (10 sec: 4095.7, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6610944. Throughput: 0: 908.1. Samples: 525328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:44,181][01623] Avg episode reward: [(0, '26.041')] -[2023-02-24 10:48:49,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.8). Total num frames: 6623232. Throughput: 0: 872.1. Samples: 529906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:48:49,183][01623] Avg episode reward: [(0, '26.145')] -[2023-02-24 10:48:52,578][28924] Updated weights for policy 0, policy_version 1620 (0.0012) -[2023-02-24 10:48:54,177][01623] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6639616. Throughput: 0: 875.6. Samples: 534280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:54,184][01623] Avg episode reward: [(0, '25.430')] -[2023-02-24 10:48:59,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6660096. Throughput: 0: 905.8. Samples: 537626. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:48:59,179][01623] Avg episode reward: [(0, '25.385')] -[2023-02-24 10:49:01,961][28924] Updated weights for policy 0, policy_version 1630 (0.0021) -[2023-02-24 10:49:04,179][01623] Fps is (10 sec: 4504.6, 60 sec: 3618.0, 300 sec: 3540.6). Total num frames: 6684672. Throughput: 0: 914.5. Samples: 544416. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 10:49:04,185][01623] Avg episode reward: [(0, '24.762')] -[2023-02-24 10:49:09,183][01623] Fps is (10 sec: 3684.1, 60 sec: 3617.8, 300 sec: 3540.5). Total num frames: 6696960. Throughput: 0: 881.9. Samples: 548950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:49:09,186][01623] Avg episode reward: [(0, '22.924')] -[2023-02-24 10:49:14,177][01623] Fps is (10 sec: 2867.9, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6713344. Throughput: 0: 882.9. Samples: 551014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:49:14,184][01623] Avg episode reward: [(0, '22.828')] -[2023-02-24 10:49:15,034][28924] Updated weights for policy 0, policy_version 1640 (0.0032) -[2023-02-24 10:49:19,177][01623] Fps is (10 sec: 3688.7, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 6733824. Throughput: 0: 920.9. Samples: 556920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:49:19,180][01623] Avg episode reward: [(0, '23.898')] -[2023-02-24 10:49:24,178][01623] Fps is (10 sec: 4095.2, 60 sec: 3618.2, 300 sec: 3540.6). Total num frames: 6754304. Throughput: 0: 923.4. Samples: 563574. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:49:24,187][01623] Avg episode reward: [(0, '24.169')] -[2023-02-24 10:49:24,451][28924] Updated weights for policy 0, policy_version 1650 (0.0012) -[2023-02-24 10:49:29,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6770688. Throughput: 0: 896.5. Samples: 565672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:49:29,185][01623] Avg episode reward: [(0, '23.879')] -[2023-02-24 10:49:34,177][01623] Fps is (10 sec: 2867.7, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6782976. Throughput: 0: 887.0. Samples: 569820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:49:34,180][01623] Avg episode reward: [(0, '24.856')] -[2023-02-24 10:49:37,371][28924] Updated weights for policy 0, policy_version 1660 (0.0026) -[2023-02-24 10:49:39,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6807552. Throughput: 0: 925.0. Samples: 575906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:49:39,183][01623] Avg episode reward: [(0, '24.689')] -[2023-02-24 10:49:44,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 6828032. Throughput: 0: 921.8. Samples: 579106. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-02-24 10:49:44,184][01623] Avg episode reward: [(0, '25.806')] -[2023-02-24 10:49:47,767][28924] Updated weights for policy 0, policy_version 1670 (0.0020) -[2023-02-24 10:49:49,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 6844416. Throughput: 0: 887.1. Samples: 584334. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:49:49,181][01623] Avg episode reward: [(0, '26.098')] -[2023-02-24 10:49:54,178][01623] Fps is (10 sec: 2866.7, 60 sec: 3618.0, 300 sec: 3540.6). Total num frames: 6856704. Throughput: 0: 880.7. Samples: 588578. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:49:54,182][01623] Avg episode reward: [(0, '25.973')] -[2023-02-24 10:49:59,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6877184. Throughput: 0: 897.0. Samples: 591380. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:49:59,183][01623] Avg episode reward: [(0, '25.533')] -[2023-02-24 10:49:59,592][28924] Updated weights for policy 0, policy_version 1680 (0.0012) -[2023-02-24 10:50:04,177][01623] Fps is (10 sec: 4096.7, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 6897664. Throughput: 0: 916.3. Samples: 598154. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:50:04,185][01623] Avg episode reward: [(0, '27.167')] -[2023-02-24 10:50:09,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.5, 300 sec: 3526.7). Total num frames: 6914048. Throughput: 0: 886.1. Samples: 603446. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:50:09,186][01623] Avg episode reward: [(0, '27.809')] -[2023-02-24 10:50:11,013][28924] Updated weights for policy 0, policy_version 1690 (0.0015) -[2023-02-24 10:50:14,177][01623] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6930432. Throughput: 0: 887.0. Samples: 605588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:50:14,181][01623] Avg episode reward: [(0, '26.213')] -[2023-02-24 10:50:14,193][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001692_6930432.pth... 
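The checkpoint save/remove pair above, together with the recurring "Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...). Total num frames: ..." and "Avg episode reward: [...]" entries, are the main machine-readable progress signals in this log. Below is a minimal, illustrative Python sketch of how a log with entries formatted like these could be parsed into (frames, fps, reward) points for plotting. It is not part of Sample Factory; the file name "training.log", the regexes, and the choice to pair each reward entry with the most recent throughput entry are assumptions made for this example.

# Illustrative log parser (assumption: plain-text log lines formatted like the
# "Fps is ..." and "Avg episode reward: ..." entries shown above). Not part of
# Sample Factory; function names and the "training.log" path are hypothetical.
import re

FPS_RE = re.compile(
    r"Fps is \(10 sec: ([\d.]+), 60 sec: ([\d.]+), 300 sec: ([\d.]+)\)\. "
    r"Total num frames: (\d+)"
)
REWARD_RE = re.compile(r"Avg episode reward: \[\(0, '(-?[\d.]+)'\)\]")


def parse_training_log(path):
    """Yield (total_frames, fps_10s, avg_reward) triples, pairing each reward
    entry with the most recent throughput entry seen before it."""
    last = None  # (total_frames, fps_10s) from the latest "Fps is ..." entry
    with open(path, encoding="utf-8") as f:
        for line in f:
            m = FPS_RE.search(line)
            if m:
                last = (int(m.group(4)), float(m.group(1)))
            m = REWARD_RE.search(line)
            if m and last is not None:
                yield last[0], last[1], float(m.group(1))


if __name__ == "__main__":
    for frames, fps_10s, reward in parse_training_log("training.log"):
        print(f"{frames:>10d} frames  {fps_10s:7.1f} fps (10s)  avg reward {reward:.3f}")

Feeding the resulting triples into any plotting tool gives the reward-versus-frames curve that the raw entries above only show implicitly.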
-[2023-02-24 10:50:14,458][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001487_6090752.pth -[2023-02-24 10:50:19,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6946816. Throughput: 0: 904.3. Samples: 610514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:50:19,183][01623] Avg episode reward: [(0, '25.021')] -[2023-02-24 10:50:22,137][28924] Updated weights for policy 0, policy_version 1700 (0.0017) -[2023-02-24 10:50:24,177][01623] Fps is (10 sec: 4096.2, 60 sec: 3618.2, 300 sec: 3540.6). Total num frames: 6971392. Throughput: 0: 915.3. Samples: 617096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:50:24,180][01623] Avg episode reward: [(0, '26.422')] -[2023-02-24 10:50:29,180][01623] Fps is (10 sec: 4094.7, 60 sec: 3617.9, 300 sec: 3540.6). Total num frames: 6987776. Throughput: 0: 912.3. Samples: 620164. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:50:29,183][01623] Avg episode reward: [(0, '25.641')] -[2023-02-24 10:50:34,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3526.8). Total num frames: 7000064. Throughput: 0: 888.2. Samples: 624302. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:50:34,183][01623] Avg episode reward: [(0, '24.012')] -[2023-02-24 10:50:34,297][28924] Updated weights for policy 0, policy_version 1710 (0.0011) -[2023-02-24 10:50:39,177][01623] Fps is (10 sec: 3277.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7020544. Throughput: 0: 912.2. Samples: 629626. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:50:39,179][01623] Avg episode reward: [(0, '23.486')] -[2023-02-24 10:50:44,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7041024. Throughput: 0: 925.2. Samples: 633014. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:50:44,184][01623] Avg episode reward: [(0, '24.315')] -[2023-02-24 10:50:44,201][28924] Updated weights for policy 0, policy_version 1720 (0.0016) -[2023-02-24 10:50:49,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 7061504. Throughput: 0: 907.6. Samples: 638996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:50:49,179][01623] Avg episode reward: [(0, '24.132')] -[2023-02-24 10:50:54,178][01623] Fps is (10 sec: 3276.3, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 7073792. Throughput: 0: 866.4. Samples: 642436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:50:54,181][01623] Avg episode reward: [(0, '23.521')] -[2023-02-24 10:50:59,177][01623] Fps is (10 sec: 2047.8, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 7081984. Throughput: 0: 854.5. Samples: 644042. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:50:59,180][01623] Avg episode reward: [(0, '23.887')] -[2023-02-24 10:50:59,992][28924] Updated weights for policy 0, policy_version 1730 (0.0067) -[2023-02-24 10:51:04,182][01623] Fps is (10 sec: 2456.6, 60 sec: 3344.7, 300 sec: 3485.0). Total num frames: 7098368. Throughput: 0: 824.5. Samples: 647620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:51:04,185][01623] Avg episode reward: [(0, '25.100')] -[2023-02-24 10:51:09,177][01623] Fps is (10 sec: 3686.7, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 7118848. Throughput: 0: 821.2. Samples: 654050. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:51:09,184][01623] Avg episode reward: [(0, '26.697')] -[2023-02-24 10:51:10,793][28924] Updated weights for policy 0, policy_version 1740 (0.0016) -[2023-02-24 10:51:14,179][01623] Fps is (10 sec: 3687.7, 60 sec: 3413.2, 300 sec: 3498.9). Total num frames: 7135232. Throughput: 0: 824.0. Samples: 657244. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:51:14,182][01623] Avg episode reward: [(0, '26.712')] -[2023-02-24 10:51:19,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 7151616. Throughput: 0: 826.1. Samples: 661478. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:51:19,179][01623] Avg episode reward: [(0, '26.547')] -[2023-02-24 10:51:23,838][28924] Updated weights for policy 0, policy_version 1750 (0.0016) -[2023-02-24 10:51:24,177][01623] Fps is (10 sec: 3277.5, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 7168000. Throughput: 0: 818.4. Samples: 666456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:51:24,180][01623] Avg episode reward: [(0, '25.635')] -[2023-02-24 10:51:29,179][01623] Fps is (10 sec: 3685.3, 60 sec: 3345.1, 300 sec: 3485.0). Total num frames: 7188480. Throughput: 0: 818.2. Samples: 669834. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:51:29,182][01623] Avg episode reward: [(0, '24.960')] -[2023-02-24 10:51:33,268][28924] Updated weights for policy 0, policy_version 1760 (0.0012) -[2023-02-24 10:51:34,177][01623] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 7208960. Throughput: 0: 827.2. Samples: 676218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:51:34,182][01623] Avg episode reward: [(0, '24.270')] -[2023-02-24 10:51:39,177][01623] Fps is (10 sec: 3687.5, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 7225344. Throughput: 0: 844.9. Samples: 680454. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:51:39,180][01623] Avg episode reward: [(0, '24.442')] -[2023-02-24 10:51:44,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 7241728. Throughput: 0: 857.0. Samples: 682606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:51:44,179][01623] Avg episode reward: [(0, '24.606')] -[2023-02-24 10:51:45,843][28924] Updated weights for policy 0, policy_version 1770 (0.0024) -[2023-02-24 10:51:49,177][01623] Fps is (10 sec: 3686.3, 60 sec: 3345.0, 300 sec: 3499.0). Total num frames: 7262208. Throughput: 0: 922.2. Samples: 689114. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:51:49,186][01623] Avg episode reward: [(0, '24.156')] -[2023-02-24 10:51:54,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3512.8). Total num frames: 7282688. Throughput: 0: 917.6. Samples: 695342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:51:54,184][01623] Avg episode reward: [(0, '24.310')] -[2023-02-24 10:51:56,134][28924] Updated weights for policy 0, policy_version 1780 (0.0012) -[2023-02-24 10:51:59,177][01623] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 7294976. Throughput: 0: 893.4. Samples: 697444. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:51:59,184][01623] Avg episode reward: [(0, '24.189')] -[2023-02-24 10:52:04,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3550.2, 300 sec: 3499.0). Total num frames: 7311360. Throughput: 0: 895.6. Samples: 701782. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:52:04,180][01623] Avg episode reward: [(0, '25.950')] -[2023-02-24 10:52:07,819][28924] Updated weights for policy 0, policy_version 1790 (0.0029) -[2023-02-24 10:52:09,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 7335936. Throughput: 0: 936.1. Samples: 708580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:52:09,186][01623] Avg episode reward: [(0, '25.192')] -[2023-02-24 10:52:14,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3686.5, 300 sec: 3526.7). Total num frames: 7356416. Throughput: 0: 935.5. Samples: 711930. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:52:14,181][01623] Avg episode reward: [(0, '25.358')] -[2023-02-24 10:52:14,195][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001796_7356416.pth... -[2023-02-24 10:52:14,363][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001586_6496256.pth -[2023-02-24 10:52:19,075][28924] Updated weights for policy 0, policy_version 1800 (0.0031) -[2023-02-24 10:52:19,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 7372800. Throughput: 0: 895.8. Samples: 716530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:52:19,184][01623] Avg episode reward: [(0, '25.843')] -[2023-02-24 10:52:24,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7385088. Throughput: 0: 899.6. Samples: 720938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:52:24,179][01623] Avg episode reward: [(0, '27.058')] -[2023-02-24 10:52:29,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.6, 300 sec: 3568.4). Total num frames: 7409664. Throughput: 0: 926.4. Samples: 724292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:52:29,186][01623] Avg episode reward: [(0, '27.456')] -[2023-02-24 10:52:29,969][28924] Updated weights for policy 0, policy_version 1810 (0.0021) -[2023-02-24 10:52:34,179][01623] Fps is (10 sec: 4504.6, 60 sec: 3686.3, 300 sec: 3568.4). Total num frames: 7430144. Throughput: 0: 929.0. Samples: 730922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:52:34,186][01623] Avg episode reward: [(0, '26.833')] -[2023-02-24 10:52:39,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7442432. Throughput: 0: 891.8. Samples: 735472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:52:39,191][01623] Avg episode reward: [(0, '27.059')] -[2023-02-24 10:52:42,379][28924] Updated weights for policy 0, policy_version 1820 (0.0014) -[2023-02-24 10:52:44,179][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 7458816. Throughput: 0: 892.4. Samples: 737604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:52:44,186][01623] Avg episode reward: [(0, '27.815')] -[2023-02-24 10:52:49,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 7479296. Throughput: 0: 926.5. Samples: 743476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:52:49,185][01623] Avg episode reward: [(0, '27.613')] -[2023-02-24 10:52:52,456][28924] Updated weights for policy 0, policy_version 1830 (0.0015) -[2023-02-24 10:52:54,177][01623] Fps is (10 sec: 4096.9, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 7499776. Throughput: 0: 921.8. Samples: 750060. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:52:54,183][01623] Avg episode reward: [(0, '27.478')] -[2023-02-24 10:52:59,177][01623] Fps is (10 sec: 3686.1, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 7516160. Throughput: 0: 896.0. Samples: 752250. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:52:59,181][01623] Avg episode reward: [(0, '26.935')] -[2023-02-24 10:53:04,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 7532544. Throughput: 0: 891.3. Samples: 756640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:53:04,179][01623] Avg episode reward: [(0, '26.621')] -[2023-02-24 10:53:05,207][28924] Updated weights for policy 0, policy_version 1840 (0.0013) -[2023-02-24 10:53:09,181][01623] Fps is (10 sec: 3684.9, 60 sec: 3617.8, 300 sec: 3582.2). Total num frames: 7553024. Throughput: 0: 929.3. Samples: 762762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:09,188][01623] Avg episode reward: [(0, '26.538')] -[2023-02-24 10:53:14,149][28924] Updated weights for policy 0, policy_version 1850 (0.0016) -[2023-02-24 10:53:14,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 7577600. Throughput: 0: 931.3. Samples: 766200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:14,179][01623] Avg episode reward: [(0, '26.391')] -[2023-02-24 10:53:19,177][01623] Fps is (10 sec: 3688.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 7589888. Throughput: 0: 905.9. Samples: 771684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:19,179][01623] Avg episode reward: [(0, '26.516')] -[2023-02-24 10:53:24,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 7606272. Throughput: 0: 899.2. Samples: 775936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:24,185][01623] Avg episode reward: [(0, '26.852')] -[2023-02-24 10:53:27,036][28924] Updated weights for policy 0, policy_version 1860 (0.0015) -[2023-02-24 10:53:29,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 7626752. Throughput: 0: 913.4. Samples: 778704. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:53:29,182][01623] Avg episode reward: [(0, '27.809')] -[2023-02-24 10:53:34,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3582.3). Total num frames: 7647232. Throughput: 0: 932.6. Samples: 785442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:34,183][01623] Avg episode reward: [(0, '28.648')] -[2023-02-24 10:53:36,679][28924] Updated weights for policy 0, policy_version 1870 (0.0021) -[2023-02-24 10:53:39,178][01623] Fps is (10 sec: 3685.8, 60 sec: 3686.3, 300 sec: 3568.4). Total num frames: 7663616. Throughput: 0: 903.4. Samples: 790716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:39,183][01623] Avg episode reward: [(0, '29.804')] -[2023-02-24 10:53:44,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.5, 300 sec: 3582.3). Total num frames: 7680000. Throughput: 0: 902.9. Samples: 792880. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:53:44,185][01623] Avg episode reward: [(0, '28.952')] -[2023-02-24 10:53:49,177][01623] Fps is (10 sec: 3277.3, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 7696384. Throughput: 0: 915.7. Samples: 797848. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:53:49,179][01623] Avg episode reward: [(0, '29.583')] -[2023-02-24 10:53:49,311][28924] Updated weights for policy 0, policy_version 1880 (0.0012) -[2023-02-24 10:53:54,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 7720960. Throughput: 0: 927.3. Samples: 804488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:53:54,184][01623] Avg episode reward: [(0, '29.967')] -[2023-02-24 10:53:59,180][01623] Fps is (10 sec: 4094.4, 60 sec: 3686.2, 300 sec: 3568.4). Total num frames: 7737344. Throughput: 0: 917.0. Samples: 807468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:53:59,184][01623] Avg episode reward: [(0, '29.194')] -[2023-02-24 10:53:59,779][28924] Updated weights for policy 0, policy_version 1890 (0.0019) -[2023-02-24 10:54:04,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 7753728. Throughput: 0: 892.0. Samples: 811824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:54:04,179][01623] Avg episode reward: [(0, '29.722')] -[2023-02-24 10:54:09,177][01623] Fps is (10 sec: 3278.0, 60 sec: 3618.4, 300 sec: 3582.3). Total num frames: 7770112. Throughput: 0: 917.2. Samples: 817208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:54:09,185][01623] Avg episode reward: [(0, '28.400')] -[2023-02-24 10:54:11,372][28924] Updated weights for policy 0, policy_version 1900 (0.0017) -[2023-02-24 10:54:14,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 7794688. Throughput: 0: 930.7. Samples: 820584. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:54:14,179][01623] Avg episode reward: [(0, '29.216')] -[2023-02-24 10:54:14,198][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001903_7794688.pth... -[2023-02-24 10:54:14,378][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001692_6930432.pth -[2023-02-24 10:54:19,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 7811072. Throughput: 0: 919.1. Samples: 826800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:54:19,184][01623] Avg episode reward: [(0, '27.770')] -[2023-02-24 10:54:22,858][28924] Updated weights for policy 0, policy_version 1910 (0.0047) -[2023-02-24 10:54:24,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 7823360. Throughput: 0: 887.0. Samples: 830628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:54:24,184][01623] Avg episode reward: [(0, '27.442')] -[2023-02-24 10:54:29,177][01623] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 7835648. Throughput: 0: 876.3. Samples: 832314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:54:29,183][01623] Avg episode reward: [(0, '26.922')] -[2023-02-24 10:54:34,177][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 7847936. Throughput: 0: 848.2. Samples: 836016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:54:34,180][01623] Avg episode reward: [(0, '27.101')] -[2023-02-24 10:54:37,267][28924] Updated weights for policy 0, policy_version 1920 (0.0018) -[2023-02-24 10:54:39,177][01623] Fps is (10 sec: 3686.5, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 7872512. Throughput: 0: 839.7. Samples: 842276. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:54:39,180][01623] Avg episode reward: [(0, '26.350')] -[2023-02-24 10:54:44,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7888896. Throughput: 0: 837.3. Samples: 845144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:54:44,182][01623] Avg episode reward: [(0, '26.592')] -[2023-02-24 10:54:49,177][01623] Fps is (10 sec: 2867.0, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 7901184. Throughput: 0: 833.9. Samples: 849348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:54:49,184][01623] Avg episode reward: [(0, '25.915')] -[2023-02-24 10:54:49,761][28924] Updated weights for policy 0, policy_version 1930 (0.0016) -[2023-02-24 10:54:54,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 7921664. Throughput: 0: 835.8. Samples: 854818. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:54:54,183][01623] Avg episode reward: [(0, '26.963')] -[2023-02-24 10:54:59,177][01623] Fps is (10 sec: 4096.2, 60 sec: 3413.5, 300 sec: 3540.6). Total num frames: 7942144. Throughput: 0: 835.3. Samples: 858172. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:54:59,183][01623] Avg episode reward: [(0, '27.215')] -[2023-02-24 10:54:59,439][28924] Updated weights for policy 0, policy_version 1940 (0.0023) -[2023-02-24 10:55:04,177][01623] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 7962624. Throughput: 0: 833.0. Samples: 864286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:55:04,181][01623] Avg episode reward: [(0, '28.408')] -[2023-02-24 10:55:09,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 7974912. Throughput: 0: 841.3. Samples: 868488. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:55:09,183][01623] Avg episode reward: [(0, '28.632')] -[2023-02-24 10:55:12,367][28924] Updated weights for policy 0, policy_version 1950 (0.0015) -[2023-02-24 10:55:14,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3554.5). Total num frames: 7995392. Throughput: 0: 852.5. Samples: 870678. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-02-24 10:55:14,179][01623] Avg episode reward: [(0, '28.717')] -[2023-02-24 10:55:19,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 8015872. Throughput: 0: 921.1. Samples: 877466. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:55:19,179][01623] Avg episode reward: [(0, '27.587')] -[2023-02-24 10:55:21,529][28924] Updated weights for policy 0, policy_version 1960 (0.0014) -[2023-02-24 10:55:24,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8036352. Throughput: 0: 913.2. Samples: 883370. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:55:24,183][01623] Avg episode reward: [(0, '26.901')] -[2023-02-24 10:55:29,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8048640. Throughput: 0: 897.5. Samples: 885532. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:55:29,186][01623] Avg episode reward: [(0, '26.682')] -[2023-02-24 10:55:34,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 8065024. Throughput: 0: 902.1. Samples: 889940. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:55:34,179][01623] Avg episode reward: [(0, '24.487')] -[2023-02-24 10:55:34,466][28924] Updated weights for policy 0, policy_version 1970 (0.0026) -[2023-02-24 10:55:39,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8089600. Throughput: 0: 931.3. Samples: 896728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:55:39,179][01623] Avg episode reward: [(0, '24.311')] -[2023-02-24 10:55:43,755][28924] Updated weights for policy 0, policy_version 1980 (0.0013) -[2023-02-24 10:55:44,184][01623] Fps is (10 sec: 4502.1, 60 sec: 3685.9, 300 sec: 3554.4). Total num frames: 8110080. Throughput: 0: 932.0. Samples: 900120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:55:44,191][01623] Avg episode reward: [(0, '24.030')] -[2023-02-24 10:55:49,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 8122368. Throughput: 0: 900.4. Samples: 904802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:55:49,179][01623] Avg episode reward: [(0, '25.781')] -[2023-02-24 10:55:54,177][01623] Fps is (10 sec: 2869.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8138752. Throughput: 0: 909.3. Samples: 909408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:55:54,180][01623] Avg episode reward: [(0, '25.867')] -[2023-02-24 10:55:56,364][28924] Updated weights for policy 0, policy_version 1990 (0.0015) -[2023-02-24 10:55:59,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3610.1). Total num frames: 8163328. Throughput: 0: 933.8. Samples: 912698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:55:59,185][01623] Avg episode reward: [(0, '26.098')] -[2023-02-24 10:56:04,178][01623] Fps is (10 sec: 4504.9, 60 sec: 3686.3, 300 sec: 3610.0). Total num frames: 8183808. Throughput: 0: 934.9. Samples: 919536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:56:04,180][01623] Avg episode reward: [(0, '27.690')] -[2023-02-24 10:56:06,563][28924] Updated weights for policy 0, policy_version 2000 (0.0020) -[2023-02-24 10:56:09,178][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 8196096. Throughput: 0: 901.7. Samples: 923948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:56:09,189][01623] Avg episode reward: [(0, '27.582')] -[2023-02-24 10:56:14,177][01623] Fps is (10 sec: 2867.6, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 8212480. Throughput: 0: 900.8. Samples: 926070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:56:14,180][01623] Avg episode reward: [(0, '26.813')] -[2023-02-24 10:56:14,188][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002005_8212480.pth... -[2023-02-24 10:56:14,372][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001796_7356416.pth -[2023-02-24 10:56:18,408][28924] Updated weights for policy 0, policy_version 2010 (0.0015) -[2023-02-24 10:56:19,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 8232960. Throughput: 0: 931.5. Samples: 931856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:56:19,179][01623] Avg episode reward: [(0, '25.944')] -[2023-02-24 10:56:24,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 8257536. Throughput: 0: 933.7. Samples: 938744. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:56:24,186][01623] Avg episode reward: [(0, '26.111')] -[2023-02-24 10:56:29,183][01623] Fps is (10 sec: 3683.9, 60 sec: 3686.0, 300 sec: 3596.1). Total num frames: 8269824. Throughput: 0: 905.5. Samples: 940866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:56:29,186][01623] Avg episode reward: [(0, '26.173')] -[2023-02-24 10:56:29,658][28924] Updated weights for policy 0, policy_version 2020 (0.0015) -[2023-02-24 10:56:34,177][01623] Fps is (10 sec: 2457.7, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8282112. Throughput: 0: 892.5. Samples: 944964. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:56:34,180][01623] Avg episode reward: [(0, '27.397')] -[2023-02-24 10:56:39,177][01623] Fps is (10 sec: 3688.9, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 8306688. Throughput: 0: 922.9. Samples: 950940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:56:39,179][01623] Avg episode reward: [(0, '26.306')] -[2023-02-24 10:56:40,741][28924] Updated weights for policy 0, policy_version 2030 (0.0018) -[2023-02-24 10:56:44,178][01623] Fps is (10 sec: 4504.8, 60 sec: 3618.5, 300 sec: 3610.0). Total num frames: 8327168. Throughput: 0: 921.0. Samples: 954146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:56:44,181][01623] Avg episode reward: [(0, '28.025')] -[2023-02-24 10:56:49,178][01623] Fps is (10 sec: 3685.7, 60 sec: 3686.3, 300 sec: 3596.1). Total num frames: 8343552. Throughput: 0: 887.6. Samples: 959480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 10:56:49,185][01623] Avg episode reward: [(0, '29.013')] -[2023-02-24 10:56:53,179][28924] Updated weights for policy 0, policy_version 2040 (0.0012) -[2023-02-24 10:56:54,177][01623] Fps is (10 sec: 2867.7, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 8355840. Throughput: 0: 882.0. Samples: 963636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:56:54,185][01623] Avg episode reward: [(0, '27.806')] -[2023-02-24 10:56:59,177][01623] Fps is (10 sec: 3277.4, 60 sec: 3549.9, 300 sec: 3610.0). Total num frames: 8376320. Throughput: 0: 895.6. Samples: 966370. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:56:59,179][01623] Avg episode reward: [(0, '27.261')] -[2023-02-24 10:57:03,093][28924] Updated weights for policy 0, policy_version 2050 (0.0015) -[2023-02-24 10:57:04,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3610.0). Total num frames: 8400896. Throughput: 0: 920.0. Samples: 973254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:57:04,182][01623] Avg episode reward: [(0, '26.492')] -[2023-02-24 10:57:09,183][01623] Fps is (10 sec: 4093.2, 60 sec: 3686.0, 300 sec: 3596.1). Total num frames: 8417280. Throughput: 0: 886.1. Samples: 978624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:57:09,186][01623] Avg episode reward: [(0, '26.822')] -[2023-02-24 10:57:14,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 8429568. Throughput: 0: 887.6. Samples: 980800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:57:14,181][01623] Avg episode reward: [(0, '27.264')] -[2023-02-24 10:57:15,740][28924] Updated weights for policy 0, policy_version 2060 (0.0027) -[2023-02-24 10:57:19,177][01623] Fps is (10 sec: 3278.9, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 8450048. Throughput: 0: 909.2. Samples: 985880. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:57:19,185][01623] Avg episode reward: [(0, '26.415')] -[2023-02-24 10:57:24,177][01623] Fps is (10 sec: 4095.7, 60 sec: 3549.8, 300 sec: 3596.1). Total num frames: 8470528. Throughput: 0: 929.0. Samples: 992744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:57:24,180][01623] Avg episode reward: [(0, '27.332')] -[2023-02-24 10:57:24,992][28924] Updated weights for policy 0, policy_version 2070 (0.0016) -[2023-02-24 10:57:29,177][01623] Fps is (10 sec: 4096.1, 60 sec: 3686.8, 300 sec: 3596.2). Total num frames: 8491008. Throughput: 0: 925.1. Samples: 995774. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:57:29,181][01623] Avg episode reward: [(0, '27.371')] -[2023-02-24 10:57:34,177][01623] Fps is (10 sec: 3277.0, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 8503296. Throughput: 0: 903.3. Samples: 1000126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:57:34,179][01623] Avg episode reward: [(0, '26.507')] -[2023-02-24 10:57:38,022][28924] Updated weights for policy 0, policy_version 2080 (0.0015) -[2023-02-24 10:57:39,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3610.1). Total num frames: 8523776. Throughput: 0: 923.6. Samples: 1005196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:57:39,185][01623] Avg episode reward: [(0, '26.413')] -[2023-02-24 10:57:44,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3610.0). Total num frames: 8544256. Throughput: 0: 936.9. Samples: 1008532. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:57:44,179][01623] Avg episode reward: [(0, '25.975')] -[2023-02-24 10:57:46,958][28924] Updated weights for policy 0, policy_version 2090 (0.0024) -[2023-02-24 10:57:49,183][01623] Fps is (10 sec: 4093.2, 60 sec: 3686.1, 300 sec: 3610.0). Total num frames: 8564736. Throughput: 0: 928.7. Samples: 1015052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:57:49,186][01623] Avg episode reward: [(0, '26.465')] -[2023-02-24 10:57:54,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 8581120. Throughput: 0: 903.2. Samples: 1019264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:57:54,184][01623] Avg episode reward: [(0, '25.332')] -[2023-02-24 10:57:59,177][01623] Fps is (10 sec: 2459.3, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 8589312. Throughput: 0: 894.9. Samples: 1021072. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:57:59,181][01623] Avg episode reward: [(0, '24.737')] -[2023-02-24 10:58:02,347][28924] Updated weights for policy 0, policy_version 2100 (0.0023) -[2023-02-24 10:58:04,177][01623] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 8605696. Throughput: 0: 868.5. Samples: 1024964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 10:58:04,179][01623] Avg episode reward: [(0, '24.348')] -[2023-02-24 10:58:09,178][01623] Fps is (10 sec: 3276.2, 60 sec: 3413.6, 300 sec: 3540.6). Total num frames: 8622080. Throughput: 0: 834.2. Samples: 1030284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:58:09,187][01623] Avg episode reward: [(0, '25.506')] -[2023-02-24 10:58:14,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 8638464. Throughput: 0: 812.8. Samples: 1032348. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:58:14,191][01623] Avg episode reward: [(0, '26.833')] -[2023-02-24 10:58:14,203][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002109_8638464.pth... -[2023-02-24 10:58:14,511][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001903_7794688.pth -[2023-02-24 10:58:15,744][28924] Updated weights for policy 0, policy_version 2110 (0.0040) -[2023-02-24 10:58:19,177][01623] Fps is (10 sec: 2867.7, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 8650752. Throughput: 0: 806.4. Samples: 1036412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:58:19,183][01623] Avg episode reward: [(0, '25.630')] -[2023-02-24 10:58:24,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 8671232. Throughput: 0: 832.3. Samples: 1042648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:58:24,183][01623] Avg episode reward: [(0, '25.383')] -[2023-02-24 10:58:26,034][28924] Updated weights for policy 0, policy_version 2120 (0.0017) -[2023-02-24 10:58:29,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 8695808. Throughput: 0: 831.6. Samples: 1045954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:58:29,180][01623] Avg episode reward: [(0, '25.320')] -[2023-02-24 10:58:34,179][01623] Fps is (10 sec: 3685.4, 60 sec: 3413.2, 300 sec: 3540.6). Total num frames: 8708096. Throughput: 0: 805.4. Samples: 1051290. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:58:34,183][01623] Avg episode reward: [(0, '24.742')] -[2023-02-24 10:58:38,411][28924] Updated weights for policy 0, policy_version 2130 (0.0015) -[2023-02-24 10:58:39,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 8724480. Throughput: 0: 808.5. Samples: 1055646. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:58:39,181][01623] Avg episode reward: [(0, '23.439')] -[2023-02-24 10:58:44,177][01623] Fps is (10 sec: 3687.4, 60 sec: 3345.1, 300 sec: 3554.5). Total num frames: 8744960. Throughput: 0: 831.2. Samples: 1058474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:58:44,184][01623] Avg episode reward: [(0, '23.126')] -[2023-02-24 10:58:48,087][28924] Updated weights for policy 0, policy_version 2140 (0.0023) -[2023-02-24 10:58:49,176][01623] Fps is (10 sec: 4505.7, 60 sec: 3413.7, 300 sec: 3554.5). Total num frames: 8769536. Throughput: 0: 896.9. Samples: 1065326. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 10:58:49,182][01623] Avg episode reward: [(0, '23.553')] -[2023-02-24 10:58:54,180][01623] Fps is (10 sec: 3685.0, 60 sec: 3344.8, 300 sec: 3540.6). Total num frames: 8781824. Throughput: 0: 896.0. Samples: 1070606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:58:54,183][01623] Avg episode reward: [(0, '25.100')] -[2023-02-24 10:58:59,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 8798208. Throughput: 0: 898.0. Samples: 1072760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:58:59,186][01623] Avg episode reward: [(0, '26.125')] -[2023-02-24 10:59:01,161][28924] Updated weights for policy 0, policy_version 2150 (0.0021) -[2023-02-24 10:59:04,177][01623] Fps is (10 sec: 3687.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8818688. Throughput: 0: 923.4. Samples: 1077964. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:59:04,185][01623] Avg episode reward: [(0, '27.145')] -[2023-02-24 10:59:09,177][01623] Fps is (10 sec: 4505.7, 60 sec: 3686.5, 300 sec: 3554.5). Total num frames: 8843264. Throughput: 0: 937.7. Samples: 1084844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:59:09,185][01623] Avg episode reward: [(0, '28.273')] -[2023-02-24 10:59:10,075][28924] Updated weights for policy 0, policy_version 2160 (0.0014) -[2023-02-24 10:59:14,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 8859648. Throughput: 0: 928.4. Samples: 1087734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:59:14,179][01623] Avg episode reward: [(0, '30.143')] -[2023-02-24 10:59:19,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 8871936. Throughput: 0: 904.1. Samples: 1091974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:59:19,178][01623] Avg episode reward: [(0, '30.906')] -[2023-02-24 10:59:19,190][28910] Saving new best policy, reward=30.906! -[2023-02-24 10:59:23,224][28924] Updated weights for policy 0, policy_version 2170 (0.0021) -[2023-02-24 10:59:24,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 8892416. Throughput: 0: 922.6. Samples: 1097162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:59:24,179][01623] Avg episode reward: [(0, '29.352')] -[2023-02-24 10:59:29,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 8912896. Throughput: 0: 934.9. Samples: 1100544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 10:59:29,180][01623] Avg episode reward: [(0, '28.611')] -[2023-02-24 10:59:32,927][28924] Updated weights for policy 0, policy_version 2180 (0.0012) -[2023-02-24 10:59:34,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.6, 300 sec: 3582.3). Total num frames: 8929280. Throughput: 0: 916.2. Samples: 1106554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:59:34,180][01623] Avg episode reward: [(0, '29.746')] -[2023-02-24 10:59:39,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 8945664. Throughput: 0: 894.9. Samples: 1110874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 10:59:39,180][01623] Avg episode reward: [(0, '29.647')] -[2023-02-24 10:59:44,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 8962048. Throughput: 0: 894.8. Samples: 1113024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:59:44,180][01623] Avg episode reward: [(0, '30.851')] -[2023-02-24 10:59:45,461][28924] Updated weights for policy 0, policy_version 2190 (0.0029) -[2023-02-24 10:59:49,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 8986624. Throughput: 0: 926.8. Samples: 1119672. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-02-24 10:59:49,183][01623] Avg episode reward: [(0, '29.256')] -[2023-02-24 10:59:54,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3754.9, 300 sec: 3610.0). Total num frames: 9007104. Throughput: 0: 909.2. Samples: 1125758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:59:54,182][01623] Avg episode reward: [(0, '28.856')] -[2023-02-24 10:59:55,668][28924] Updated weights for policy 0, policy_version 2200 (0.0017) -[2023-02-24 10:59:59,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 9019392. Throughput: 0: 893.2. 
Samples: 1127926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 10:59:59,182][01623] Avg episode reward: [(0, '30.200')] -[2023-02-24 11:00:04,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9035776. Throughput: 0: 896.6. Samples: 1132320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:04,180][01623] Avg episode reward: [(0, '30.995')] -[2023-02-24 11:00:04,196][28910] Saving new best policy, reward=30.995! -[2023-02-24 11:00:07,407][28924] Updated weights for policy 0, policy_version 2210 (0.0030) -[2023-02-24 11:00:09,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 9056256. Throughput: 0: 929.1. Samples: 1138972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-02-24 11:00:09,181][01623] Avg episode reward: [(0, '30.846')] -[2023-02-24 11:00:14,179][01623] Fps is (10 sec: 4094.8, 60 sec: 3618.0, 300 sec: 3596.1). Total num frames: 9076736. Throughput: 0: 927.5. Samples: 1142286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:14,188][01623] Avg episode reward: [(0, '29.888')] -[2023-02-24 11:00:14,210][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002216_9076736.pth... -[2023-02-24 11:00:14,529][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002005_8212480.pth -[2023-02-24 11:00:18,884][28924] Updated weights for policy 0, policy_version 2220 (0.0012) -[2023-02-24 11:00:19,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 9093120. Throughput: 0: 894.7. Samples: 1146814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:19,185][01623] Avg episode reward: [(0, '30.221')] -[2023-02-24 11:00:24,177][01623] Fps is (10 sec: 2868.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9105408. Throughput: 0: 898.6. Samples: 1151312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:00:24,179][01623] Avg episode reward: [(0, '31.021')] -[2023-02-24 11:00:24,249][28910] Saving new best policy, reward=31.021! -[2023-02-24 11:00:29,177][01623] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 9129984. Throughput: 0: 922.8. Samples: 1154550. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:29,181][01623] Avg episode reward: [(0, '31.943')] -[2023-02-24 11:00:29,186][28910] Saving new best policy, reward=31.943! -[2023-02-24 11:00:29,933][28924] Updated weights for policy 0, policy_version 2230 (0.0013) -[2023-02-24 11:00:34,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 9150464. Throughput: 0: 919.5. Samples: 1161048. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:34,184][01623] Avg episode reward: [(0, '30.633')] -[2023-02-24 11:00:39,179][01623] Fps is (10 sec: 3275.9, 60 sec: 3618.0, 300 sec: 3568.4). Total num frames: 9162752. Throughput: 0: 883.5. Samples: 1165516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:39,185][01623] Avg episode reward: [(0, '31.235')] -[2023-02-24 11:00:42,448][28924] Updated weights for policy 0, policy_version 2240 (0.0030) -[2023-02-24 11:00:44,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9179136. Throughput: 0: 882.0. Samples: 1167616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:00:44,179][01623] Avg episode reward: [(0, '30.624')] -[2023-02-24 11:00:49,177][01623] Fps is (10 sec: 3687.4, 60 sec: 3549.9, 300 sec: 3596.2). 
Total num frames: 9199616. Throughput: 0: 913.2. Samples: 1173412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:49,179][01623] Avg episode reward: [(0, '29.648')] -[2023-02-24 11:00:52,307][28924] Updated weights for policy 0, policy_version 2250 (0.0023) -[2023-02-24 11:00:54,177][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9224192. Throughput: 0: 915.1. Samples: 1180150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:54,179][01623] Avg episode reward: [(0, '27.824')] -[2023-02-24 11:00:59,178][01623] Fps is (10 sec: 3685.8, 60 sec: 3618.0, 300 sec: 3568.4). Total num frames: 9236480. Throughput: 0: 891.4. Samples: 1182398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:00:59,182][01623] Avg episode reward: [(0, '27.319')] -[2023-02-24 11:01:04,177][01623] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9252864. Throughput: 0: 882.5. Samples: 1186528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:01:04,182][01623] Avg episode reward: [(0, '26.603')] -[2023-02-24 11:01:05,562][28924] Updated weights for policy 0, policy_version 2260 (0.0021) -[2023-02-24 11:01:09,176][01623] Fps is (10 sec: 3687.0, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 9273344. Throughput: 0: 912.6. Samples: 1192378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:01:09,179][01623] Avg episode reward: [(0, '26.534')] -[2023-02-24 11:01:14,177][01623] Fps is (10 sec: 4096.1, 60 sec: 3618.3, 300 sec: 3596.1). Total num frames: 9293824. Throughput: 0: 914.8. Samples: 1195716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:01:14,179][01623] Avg episode reward: [(0, '25.960')] -[2023-02-24 11:01:14,537][28924] Updated weights for policy 0, policy_version 2270 (0.0014) -[2023-02-24 11:01:19,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9310208. Throughput: 0: 893.9. Samples: 1201274. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:01:19,181][01623] Avg episode reward: [(0, '25.395')] -[2023-02-24 11:01:24,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 9326592. Throughput: 0: 892.2. Samples: 1205664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:01:24,179][01623] Avg episode reward: [(0, '26.514')] -[2023-02-24 11:01:28,177][28924] Updated weights for policy 0, policy_version 2280 (0.0019) -[2023-02-24 11:01:29,177][01623] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 9338880. Throughput: 0: 900.5. Samples: 1208138. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 11:01:29,180][01623] Avg episode reward: [(0, '25.870')] -[2023-02-24 11:01:34,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 9355264. Throughput: 0: 867.6. Samples: 1212452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:01:34,180][01623] Avg episode reward: [(0, '25.679')] -[2023-02-24 11:01:39,177][01623] Fps is (10 sec: 2867.3, 60 sec: 3413.5, 300 sec: 3526.7). Total num frames: 9367552. Throughput: 0: 800.7. Samples: 1216180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 11:01:39,185][01623] Avg episode reward: [(0, '25.468')] -[2023-02-24 11:01:43,423][28924] Updated weights for policy 0, policy_version 2290 (0.0017) -[2023-02-24 11:01:44,177][01623] Fps is (10 sec: 2457.4, 60 sec: 3345.0, 300 sec: 3512.9). Total num frames: 9379840. Throughput: 0: 795.3. 
Samples: 1218184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:01:44,184][01623] Avg episode reward: [(0, '25.464')] -[2023-02-24 11:01:49,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 9400320. Throughput: 0: 812.8. Samples: 1223106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:01:49,183][01623] Avg episode reward: [(0, '24.950')] -[2023-02-24 11:01:53,528][28924] Updated weights for policy 0, policy_version 2300 (0.0012) -[2023-02-24 11:01:54,177][01623] Fps is (10 sec: 4096.3, 60 sec: 3276.8, 300 sec: 3540.6). Total num frames: 9420800. Throughput: 0: 833.0. Samples: 1229862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:01:54,183][01623] Avg episode reward: [(0, '26.796')] -[2023-02-24 11:01:59,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3526.7). Total num frames: 9441280. Throughput: 0: 831.8. Samples: 1233146. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-02-24 11:01:59,182][01623] Avg episode reward: [(0, '25.902')] -[2023-02-24 11:02:04,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3512.9). Total num frames: 9453568. Throughput: 0: 801.9. Samples: 1237360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-02-24 11:02:04,184][01623] Avg episode reward: [(0, '27.125')] -[2023-02-24 11:02:06,201][28924] Updated weights for policy 0, policy_version 2310 (0.0015) -[2023-02-24 11:02:09,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3526.7). Total num frames: 9469952. Throughput: 0: 818.8. Samples: 1242510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:02:09,182][01623] Avg episode reward: [(0, '27.341')] -[2023-02-24 11:02:14,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 9494528. Throughput: 0: 838.5. Samples: 1245868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:02:14,182][01623] Avg episode reward: [(0, '27.595')] -[2023-02-24 11:02:14,198][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002318_9494528.pth... -[2023-02-24 11:02:14,367][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002109_8638464.pth -[2023-02-24 11:02:15,723][28924] Updated weights for policy 0, policy_version 2320 (0.0014) -[2023-02-24 11:02:19,177][01623] Fps is (10 sec: 4505.5, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 9515008. Throughput: 0: 884.1. Samples: 1252238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:02:19,179][01623] Avg episode reward: [(0, '28.559')] -[2023-02-24 11:02:24,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3512.8). Total num frames: 9527296. Throughput: 0: 896.9. Samples: 1256542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 11:02:24,183][01623] Avg episode reward: [(0, '29.460')] -[2023-02-24 11:02:28,581][28924] Updated weights for policy 0, policy_version 2330 (0.0041) -[2023-02-24 11:02:29,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 9543680. Throughput: 0: 899.5. Samples: 1258662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:02:29,185][01623] Avg episode reward: [(0, '29.382')] -[2023-02-24 11:02:34,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 9568256. Throughput: 0: 934.8. Samples: 1265170. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:02:34,180][01623] Avg episode reward: [(0, '29.301')] -[2023-02-24 11:02:37,610][28924] Updated weights for policy 0, policy_version 2340 (0.0013) -[2023-02-24 11:02:39,177][01623] Fps is (10 sec: 4505.8, 60 sec: 3686.4, 300 sec: 3540.6). Total num frames: 9588736. Throughput: 0: 920.5. Samples: 1271284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:02:39,179][01623] Avg episode reward: [(0, '29.703')] -[2023-02-24 11:02:44,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3512.9). Total num frames: 9601024. Throughput: 0: 894.3. Samples: 1273388. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:02:44,183][01623] Avg episode reward: [(0, '30.472')] -[2023-02-24 11:02:49,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 9617408. Throughput: 0: 893.7. Samples: 1277576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:02:49,185][01623] Avg episode reward: [(0, '31.420')] -[2023-02-24 11:02:50,813][28924] Updated weights for policy 0, policy_version 2350 (0.0015) -[2023-02-24 11:02:54,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 9637888. Throughput: 0: 926.3. Samples: 1284194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:02:54,180][01623] Avg episode reward: [(0, '30.242')] -[2023-02-24 11:02:59,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9658368. Throughput: 0: 927.4. Samples: 1287602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:02:59,185][01623] Avg episode reward: [(0, '29.147')] -[2023-02-24 11:03:00,881][28924] Updated weights for policy 0, policy_version 2360 (0.0013) -[2023-02-24 11:03:04,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 9674752. Throughput: 0: 888.5. Samples: 1292222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:03:04,186][01623] Avg episode reward: [(0, '29.383')] -[2023-02-24 11:03:09,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 9687040. Throughput: 0: 890.0. Samples: 1296592. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:03:09,185][01623] Avg episode reward: [(0, '29.458')] -[2023-02-24 11:03:13,106][28924] Updated weights for policy 0, policy_version 2370 (0.0018) -[2023-02-24 11:03:14,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9711616. Throughput: 0: 917.3. Samples: 1299938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:03:14,181][01623] Avg episode reward: [(0, '28.881')] -[2023-02-24 11:03:19,177][01623] Fps is (10 sec: 4505.5, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9732096. Throughput: 0: 920.6. Samples: 1306598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:03:19,181][01623] Avg episode reward: [(0, '28.473')] -[2023-02-24 11:03:24,000][28924] Updated weights for policy 0, policy_version 2380 (0.0012) -[2023-02-24 11:03:24,179][01623] Fps is (10 sec: 3685.7, 60 sec: 3686.3, 300 sec: 3568.4). Total num frames: 9748480. Throughput: 0: 888.1. Samples: 1311250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:03:24,188][01623] Avg episode reward: [(0, '29.448')] -[2023-02-24 11:03:29,177][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 9760768. Throughput: 0: 889.9. Samples: 1313434. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:03:29,182][01623] Avg episode reward: [(0, '30.294')] -[2023-02-24 11:03:34,177][01623] Fps is (10 sec: 3277.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9781248. Throughput: 0: 925.5. Samples: 1319224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 11:03:34,179][01623] Avg episode reward: [(0, '29.362')] -[2023-02-24 11:03:35,192][28924] Updated weights for policy 0, policy_version 2390 (0.0018) -[2023-02-24 11:03:39,178][01623] Fps is (10 sec: 4504.8, 60 sec: 3618.0, 300 sec: 3596.1). Total num frames: 9805824. Throughput: 0: 925.2. Samples: 1325828. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:03:39,184][01623] Avg episode reward: [(0, '29.915')] -[2023-02-24 11:03:44,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 9818112. Throughput: 0: 900.8. Samples: 1328136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 11:03:44,183][01623] Avg episode reward: [(0, '30.350')] -[2023-02-24 11:03:47,084][28924] Updated weights for policy 0, policy_version 2400 (0.0020) -[2023-02-24 11:03:49,178][01623] Fps is (10 sec: 2867.3, 60 sec: 3618.0, 300 sec: 3568.4). Total num frames: 9834496. Throughput: 0: 891.7. Samples: 1332348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:03:49,185][01623] Avg episode reward: [(0, '30.460')] -[2023-02-24 11:03:54,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9854976. Throughput: 0: 926.0. Samples: 1338260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:03:54,184][01623] Avg episode reward: [(0, '30.649')] -[2023-02-24 11:03:57,383][28924] Updated weights for policy 0, policy_version 2410 (0.0019) -[2023-02-24 11:03:59,177][01623] Fps is (10 sec: 4096.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9875456. Throughput: 0: 927.5. Samples: 1341676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-02-24 11:03:59,179][01623] Avg episode reward: [(0, '28.935')] -[2023-02-24 11:04:04,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 9895936. Throughput: 0: 903.8. Samples: 1347270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-02-24 11:04:04,186][01623] Avg episode reward: [(0, '28.967')] -[2023-02-24 11:04:09,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 9908224. Throughput: 0: 893.0. Samples: 1351432. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:04:09,184][01623] Avg episode reward: [(0, '30.409')] -[2023-02-24 11:04:10,077][28924] Updated weights for policy 0, policy_version 2420 (0.0024) -[2023-02-24 11:04:14,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9928704. Throughput: 0: 903.7. Samples: 1354100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-02-24 11:04:14,179][01623] Avg episode reward: [(0, '30.000')] -[2023-02-24 11:04:14,186][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002424_9928704.pth... -[2023-02-24 11:04:14,355][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002216_9076736.pth -[2023-02-24 11:04:19,177][01623] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9949184. Throughput: 0: 925.9. Samples: 1360888. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-02-24 11:04:19,179][01623] Avg episode reward: [(0, '30.205')]
-[2023-02-24 11:04:19,447][28924] Updated weights for policy 0, policy_version 2430 (0.0014)
-[2023-02-24 11:04:24,177][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 9965568. Throughput: 0: 895.8. Samples: 1366136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-02-24 11:04:24,185][01623] Avg episode reward: [(0, '29.022')]
-[2023-02-24 11:04:29,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 9981952. Throughput: 0: 893.3. Samples: 1368334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-02-24 11:04:29,181][01623] Avg episode reward: [(0, '29.417')]
-[2023-02-24 11:04:32,717][28924] Updated weights for policy 0, policy_version 2440 (0.0018)
-[2023-02-24 11:04:34,177][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9998336. Throughput: 0: 907.0. Samples: 1373160. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-02-24 11:04:34,180][01623] Avg episode reward: [(0, '30.233')]
-[2023-02-24 11:04:35,540][01623] Component Batcher_0 stopped!
-[2023-02-24 11:04:35,539][28910] Stopping Batcher_0...
-[2023-02-24 11:04:35,544][28910] Loop batcher_evt_loop terminating...
-[2023-02-24 11:04:35,540][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
-[2023-02-24 11:04:35,621][28924] Weights refcount: 2 0
-[2023-02-24 11:04:35,646][28924] Stopping InferenceWorker_p0-w0...
-[2023-02-24 11:04:35,646][01623] Component InferenceWorker_p0-w0 stopped!
-[2023-02-24 11:04:35,656][28924] Loop inference_proc0-0_evt_loop terminating...
-[2023-02-24 11:04:35,693][28910] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002318_9494528.pth
-[2023-02-24 11:04:35,705][28910] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
-[2023-02-24 11:04:35,821][28910] Stopping LearnerWorker_p0...
-[2023-02-24 11:04:35,821][28910] Loop learner_proc0_evt_loop terminating...
-[2023-02-24 11:04:35,818][01623] Component LearnerWorker_p0 stopped!
-[2023-02-24 11:04:35,923][28937] Stopping RolloutWorker_w2...
-[2023-02-24 11:04:35,923][28937] Loop rollout_proc2_evt_loop terminating...
-[2023-02-24 11:04:35,920][28926] Stopping RolloutWorker_w0...
-[2023-02-24 11:04:35,920][01623] Component RolloutWorker_w0 stopped!
-[2023-02-24 11:04:35,924][28926] Loop rollout_proc0_evt_loop terminating...
-[2023-02-24 11:04:35,927][01623] Component RolloutWorker_w2 stopped!
-[2023-02-24 11:04:35,934][28931] Stopping RolloutWorker_w4...
-[2023-02-24 11:04:35,936][28931] Loop rollout_proc4_evt_loop terminating...
-[2023-02-24 11:04:35,934][01623] Component RolloutWorker_w4 stopped!
-[2023-02-24 11:04:35,941][28943] Stopping RolloutWorker_w6...
-[2023-02-24 11:04:35,942][28943] Loop rollout_proc6_evt_loop terminating...
-[2023-02-24 11:04:35,941][01623] Component RolloutWorker_w6 stopped!
-[2023-02-24 11:04:35,964][01623] Component RolloutWorker_w1 stopped!
-[2023-02-24 11:04:35,968][28925] Stopping RolloutWorker_w1...
-[2023-02-24 11:04:35,969][28925] Loop rollout_proc1_evt_loop terminating...
-[2023-02-24 11:04:35,996][28939] Stopping RolloutWorker_w5...
-[2023-02-24 11:04:35,996][01623] Component RolloutWorker_w7 stopped!
-[2023-02-24 11:04:36,005][01623] Component RolloutWorker_w5 stopped!
-[2023-02-24 11:04:36,016][01623] Component RolloutWorker_w3 stopped!
-[2023-02-24 11:04:36,017][01623] Waiting for process learner_proc0 to stop...
-[2023-02-24 11:04:36,027][28927] Stopping RolloutWorker_w3...
-[2023-02-24 11:04:36,027][28927] Loop rollout_proc3_evt_loop terminating...
-[2023-02-24 11:04:35,998][28941] Stopping RolloutWorker_w7...
-[2023-02-24 11:04:35,997][28939] Loop rollout_proc5_evt_loop terminating...
-[2023-02-24 11:04:36,039][28941] Loop rollout_proc7_evt_loop terminating...
-[2023-02-24 11:04:39,181][01623] Waiting for process inference_proc0-0 to join...
-[2023-02-24 11:04:39,183][01623] Waiting for process rollout_proc0 to join...
-[2023-02-24 11:04:39,187][01623] Waiting for process rollout_proc1 to join...
-[2023-02-24 11:04:39,193][01623] Waiting for process rollout_proc2 to join...
-[2023-02-24 11:04:39,193][01623] Waiting for process rollout_proc3 to join...
-[2023-02-24 11:04:39,195][01623] Waiting for process rollout_proc4 to join...
-[2023-02-24 11:04:39,196][01623] Waiting for process rollout_proc5 to join...
-[2023-02-24 11:04:39,198][01623] Waiting for process rollout_proc6 to join...
-[2023-02-24 11:04:39,206][01623] Waiting for process rollout_proc7 to join...
-[2023-02-24 11:04:39,209][01623] Batcher 0 profile tree view:
-batching: 35.6157, releasing_batches: 0.0329
-[2023-02-24 11:04:39,211][01623] InferenceWorker_p0-w0 profile tree view:
-wait_policy: 0.0065
- wait_policy_total: 747.8741
-update_model: 10.2728
- weight_update: 0.0014
-one_step: 0.0027
- handle_policy_step: 744.4427
- deserialize: 20.7052, stack: 4.1453, obs_to_device_normalize: 160.9053, forward: 363.6956, send_messages: 36.5638
- prepare_outputs: 121.5696
- to_cpu: 76.1946
-[2023-02-24 11:04:39,214][01623] Learner 0 profile tree view:
-misc: 0.0085, prepare_batch: 21.2186
-train: 110.5493
- epoch_init: 0.0081, minibatch_init: 0.0083, losses_postprocess: 0.7529, kl_divergence: 0.7241, after_optimizer: 4.4222
- calculate_losses: 36.8311
- losses_init: 0.0144, forward_head: 2.4315, bptt_initial: 24.0398, tail: 1.5803, advantages_returns: 0.4345, losses: 4.6497
- bptt: 3.1972
- bptt_forward_core: 3.0780
- update: 66.7834
- clip: 1.9878
-[2023-02-24 11:04:39,216][01623] RolloutWorker_w0 profile tree view:
-wait_for_trajectories: 0.4310, enqueue_policy_requests: 208.4554, env_step: 1174.5719, overhead: 31.4377, complete_rollouts: 10.4650
-save_policy_outputs: 29.8957
- split_output_tensors: 14.5189
-[2023-02-24 11:04:39,221][01623] RolloutWorker_w7 profile tree view:
-wait_for_trajectories: 0.4278, enqueue_policy_requests: 203.1694, env_step: 1178.1392, overhead: 29.8791, complete_rollouts: 10.1686
-save_policy_outputs: 28.9438
- split_output_tensors: 14.4667
-[2023-02-24 11:04:39,223][01623] Loop Runner_EvtLoop terminating...
-[2023-02-24 11:04:39,225][01623] Runner profile tree view:
-main_loop: 1583.3832
-[2023-02-24 11:04:39,227][01623] Collected {0: 10006528}, FPS: 3474.2
-[2023-02-24 11:06:52,177][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
-[2023-02-24 11:06:52,180][01623] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-02-24 11:06:52,182][01623] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-02-24 11:06:52,186][01623] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-02-24 11:06:52,189][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-02-24 11:06:52,191][01623] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-02-24 11:06:52,194][01623] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! -[2023-02-24 11:06:52,196][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! -[2023-02-24 11:06:52,198][01623] Adding new argument 'push_to_hub'=False that is not in the saved config file! -[2023-02-24 11:06:52,199][01623] Adding new argument 'hf_repository'=None that is not in the saved config file! -[2023-02-24 11:06:52,200][01623] Adding new argument 'policy_index'=0 that is not in the saved config file! -[2023-02-24 11:06:52,201][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file! -[2023-02-24 11:06:52,202][01623] Adding new argument 'train_script'=None that is not in the saved config file! -[2023-02-24 11:06:52,204][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file! -[2023-02-24 11:06:52,205][01623] Using frameskip 1 and render_action_repeat=4 for evaluation -[2023-02-24 11:06:52,244][01623] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 11:06:52,251][01623] RunningMeanStd input shape: (1,) -[2023-02-24 11:06:52,274][01623] ConvEncoder: input_channels=3 -[2023-02-24 11:06:52,418][01623] Conv encoder output size: 512 -[2023-02-24 11:06:52,419][01623] Policy head output size: 512 -[2023-02-24 11:06:52,520][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth... -[2023-02-24 11:06:53,433][01623] Num frames 100... -[2023-02-24 11:06:53,550][01623] Num frames 200... -[2023-02-24 11:06:53,658][01623] Num frames 300... -[2023-02-24 11:06:53,776][01623] Num frames 400... -[2023-02-24 11:06:53,897][01623] Num frames 500... -[2023-02-24 11:06:54,011][01623] Num frames 600... -[2023-02-24 11:06:54,121][01623] Num frames 700... -[2023-02-24 11:06:54,233][01623] Num frames 800... -[2023-02-24 11:06:54,356][01623] Num frames 900... -[2023-02-24 11:06:54,478][01623] Num frames 1000... -[2023-02-24 11:06:54,601][01623] Num frames 1100... -[2023-02-24 11:06:54,715][01623] Num frames 1200... -[2023-02-24 11:06:54,820][01623] Avg episode rewards: #0: 39.390, true rewards: #0: 12.390 -[2023-02-24 11:06:54,821][01623] Avg episode reward: 39.390, avg true_objective: 12.390 -[2023-02-24 11:06:54,895][01623] Num frames 1300... -[2023-02-24 11:06:55,018][01623] Num frames 1400... -[2023-02-24 11:06:55,137][01623] Num frames 1500... -[2023-02-24 11:06:55,245][01623] Num frames 1600... -[2023-02-24 11:06:55,360][01623] Num frames 1700... -[2023-02-24 11:06:55,484][01623] Num frames 1800... -[2023-02-24 11:06:55,605][01623] Num frames 1900... -[2023-02-24 11:06:55,718][01623] Num frames 2000... -[2023-02-24 11:06:55,834][01623] Num frames 2100... -[2023-02-24 11:06:55,951][01623] Num frames 2200... -[2023-02-24 11:06:56,067][01623] Num frames 2300... -[2023-02-24 11:06:56,179][01623] Num frames 2400... -[2023-02-24 11:06:56,291][01623] Num frames 2500... -[2023-02-24 11:06:56,407][01623] Num frames 2600... -[2023-02-24 11:06:56,524][01623] Num frames 2700... -[2023-02-24 11:06:56,633][01623] Num frames 2800... -[2023-02-24 11:06:56,746][01623] Num frames 2900... -[2023-02-24 11:06:56,848][01623] Avg episode rewards: #0: 40.175, true rewards: #0: 14.675 -[2023-02-24 11:06:56,850][01623] Avg episode reward: 40.175, avg true_objective: 14.675 -[2023-02-24 11:06:56,932][01623] Num frames 3000... -[2023-02-24 11:06:57,056][01623] Num frames 3100... -[2023-02-24 11:06:57,176][01623] Num frames 3200... 
-[2023-02-24 11:06:57,294][01623] Num frames 3300... -[2023-02-24 11:06:57,463][01623] Num frames 3400... -[2023-02-24 11:06:57,545][01623] Avg episode rewards: #0: 30.383, true rewards: #0: 11.383 -[2023-02-24 11:06:57,547][01623] Avg episode reward: 30.383, avg true_objective: 11.383 -[2023-02-24 11:06:57,699][01623] Num frames 3500... -[2023-02-24 11:06:57,863][01623] Num frames 3600... -[2023-02-24 11:06:58,031][01623] Num frames 3700... -[2023-02-24 11:06:58,185][01623] Num frames 3800... -[2023-02-24 11:06:58,345][01623] Num frames 3900... -[2023-02-24 11:06:58,499][01623] Num frames 4000... -[2023-02-24 11:06:58,658][01623] Num frames 4100... -[2023-02-24 11:06:58,815][01623] Num frames 4200... -[2023-02-24 11:06:58,973][01623] Num frames 4300... -[2023-02-24 11:06:59,127][01623] Num frames 4400... -[2023-02-24 11:06:59,290][01623] Num frames 4500... -[2023-02-24 11:06:59,450][01623] Num frames 4600... -[2023-02-24 11:06:59,612][01623] Num frames 4700... -[2023-02-24 11:06:59,775][01623] Num frames 4800... -[2023-02-24 11:06:59,938][01623] Num frames 4900... -[2023-02-24 11:07:00,095][01623] Num frames 5000... -[2023-02-24 11:07:00,255][01623] Num frames 5100... -[2023-02-24 11:07:00,424][01623] Num frames 5200... -[2023-02-24 11:07:00,589][01623] Num frames 5300... -[2023-02-24 11:07:00,804][01623] Avg episode rewards: #0: 34.497, true rewards: #0: 13.497 -[2023-02-24 11:07:00,807][01623] Avg episode reward: 34.497, avg true_objective: 13.497 -[2023-02-24 11:07:00,811][01623] Num frames 5400... -[2023-02-24 11:07:00,935][01623] Num frames 5500... -[2023-02-24 11:07:01,046][01623] Num frames 5600... -[2023-02-24 11:07:01,155][01623] Num frames 5700... -[2023-02-24 11:07:01,272][01623] Num frames 5800... -[2023-02-24 11:07:01,385][01623] Num frames 5900... -[2023-02-24 11:07:01,495][01623] Num frames 6000... -[2023-02-24 11:07:01,603][01623] Num frames 6100... -[2023-02-24 11:07:01,714][01623] Num frames 6200... -[2023-02-24 11:07:01,830][01623] Num frames 6300... -[2023-02-24 11:07:01,966][01623] Num frames 6400... -[2023-02-24 11:07:02,090][01623] Num frames 6500... -[2023-02-24 11:07:02,219][01623] Num frames 6600... -[2023-02-24 11:07:02,350][01623] Num frames 6700... -[2023-02-24 11:07:02,472][01623] Num frames 6800... -[2023-02-24 11:07:02,593][01623] Num frames 6900... -[2023-02-24 11:07:02,715][01623] Num frames 7000... -[2023-02-24 11:07:02,822][01623] Avg episode rewards: #0: 35.288, true rewards: #0: 14.088 -[2023-02-24 11:07:02,823][01623] Avg episode reward: 35.288, avg true_objective: 14.088 -[2023-02-24 11:07:02,899][01623] Num frames 7100... -[2023-02-24 11:07:03,024][01623] Num frames 7200... -[2023-02-24 11:07:03,138][01623] Num frames 7300... -[2023-02-24 11:07:03,246][01623] Num frames 7400... -[2023-02-24 11:07:03,360][01623] Num frames 7500... -[2023-02-24 11:07:03,471][01623] Num frames 7600... -[2023-02-24 11:07:03,549][01623] Avg episode rewards: #0: 31.533, true rewards: #0: 12.700 -[2023-02-24 11:07:03,550][01623] Avg episode reward: 31.533, avg true_objective: 12.700 -[2023-02-24 11:07:03,649][01623] Num frames 7700... -[2023-02-24 11:07:03,769][01623] Num frames 7800... -[2023-02-24 11:07:03,888][01623] Num frames 7900... -[2023-02-24 11:07:04,017][01623] Num frames 8000... -[2023-02-24 11:07:04,134][01623] Num frames 8100... -[2023-02-24 11:07:04,253][01623] Num frames 8200... -[2023-02-24 11:07:04,371][01623] Num frames 8300... -[2023-02-24 11:07:04,485][01623] Num frames 8400... -[2023-02-24 11:07:04,598][01623] Num frames 8500... 
-[2023-02-24 11:07:04,715][01623] Num frames 8600... -[2023-02-24 11:07:04,824][01623] Num frames 8700... -[2023-02-24 11:07:04,941][01623] Num frames 8800... -[2023-02-24 11:07:05,060][01623] Num frames 8900... -[2023-02-24 11:07:05,176][01623] Num frames 9000... -[2023-02-24 11:07:05,288][01623] Num frames 9100... -[2023-02-24 11:07:05,436][01623] Avg episode rewards: #0: 32.265, true rewards: #0: 13.123 -[2023-02-24 11:07:05,437][01623] Avg episode reward: 32.265, avg true_objective: 13.123 -[2023-02-24 11:07:05,459][01623] Num frames 9200... -[2023-02-24 11:07:05,576][01623] Num frames 9300... -[2023-02-24 11:07:05,693][01623] Num frames 9400... -[2023-02-24 11:07:05,807][01623] Num frames 9500... -[2023-02-24 11:07:05,924][01623] Num frames 9600... -[2023-02-24 11:07:06,051][01623] Num frames 9700... -[2023-02-24 11:07:06,178][01623] Num frames 9800... -[2023-02-24 11:07:06,285][01623] Avg episode rewards: #0: 30.054, true rewards: #0: 12.304 -[2023-02-24 11:07:06,288][01623] Avg episode reward: 30.054, avg true_objective: 12.304 -[2023-02-24 11:07:06,361][01623] Num frames 9900... -[2023-02-24 11:07:06,483][01623] Num frames 10000... -[2023-02-24 11:07:06,603][01623] Num frames 10100... -[2023-02-24 11:07:06,728][01623] Num frames 10200... -[2023-02-24 11:07:06,885][01623] Avg episode rewards: #0: 27.323, true rewards: #0: 11.434 -[2023-02-24 11:07:06,887][01623] Avg episode reward: 27.323, avg true_objective: 11.434 -[2023-02-24 11:07:06,904][01623] Num frames 10300... -[2023-02-24 11:07:07,034][01623] Num frames 10400... -[2023-02-24 11:07:07,174][01623] Num frames 10500... -[2023-02-24 11:07:07,309][01623] Num frames 10600... -[2023-02-24 11:07:07,426][01623] Num frames 10700... -[2023-02-24 11:07:07,543][01623] Num frames 10800... -[2023-02-24 11:07:07,659][01623] Num frames 10900... -[2023-02-24 11:07:07,772][01623] Num frames 11000... -[2023-02-24 11:07:07,890][01623] Num frames 11100... -[2023-02-24 11:07:08,005][01623] Num frames 11200... -[2023-02-24 11:07:08,125][01623] Num frames 11300... -[2023-02-24 11:07:08,290][01623] Avg episode rewards: #0: 27.294, true rewards: #0: 11.394 -[2023-02-24 11:07:08,292][01623] Avg episode reward: 27.294, avg true_objective: 11.394 -[2023-02-24 11:08:20,362][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4! -[2023-02-24 11:13:41,500][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json -[2023-02-24 11:13:41,504][01623] Overriding arg 'num_workers' with value 1 passed from command line -[2023-02-24 11:13:41,506][01623] Adding new argument 'no_render'=True that is not in the saved config file! -[2023-02-24 11:13:41,511][01623] Adding new argument 'save_video'=True that is not in the saved config file! -[2023-02-24 11:13:41,515][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! -[2023-02-24 11:13:41,516][01623] Adding new argument 'video_name'=None that is not in the saved config file! -[2023-02-24 11:13:41,518][01623] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! -[2023-02-24 11:13:41,519][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! -[2023-02-24 11:13:41,523][01623] Adding new argument 'push_to_hub'=True that is not in the saved config file! -[2023-02-24 11:13:41,524][01623] Adding new argument 'hf_repository'='dbaibak/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! 
-[2023-02-24 11:13:41,525][01623] Adding new argument 'policy_index'=0 that is not in the saved config file! -[2023-02-24 11:13:41,526][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file! -[2023-02-24 11:13:41,528][01623] Adding new argument 'train_script'=None that is not in the saved config file! -[2023-02-24 11:13:41,531][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file! -[2023-02-24 11:13:41,532][01623] Using frameskip 1 and render_action_repeat=4 for evaluation -[2023-02-24 11:13:41,570][01623] RunningMeanStd input shape: (3, 72, 128) -[2023-02-24 11:13:41,572][01623] RunningMeanStd input shape: (1,) -[2023-02-24 11:13:41,593][01623] ConvEncoder: input_channels=3 -[2023-02-24 11:13:41,655][01623] Conv encoder output size: 512 -[2023-02-24 11:13:41,657][01623] Policy head output size: 512 -[2023-02-24 11:13:41,685][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth... -[2023-02-24 11:13:42,350][01623] Num frames 100... -[2023-02-24 11:13:42,514][01623] Num frames 200... -[2023-02-24 11:13:42,675][01623] Num frames 300... -[2023-02-24 11:13:42,839][01623] Num frames 400... -[2023-02-24 11:13:42,962][01623] Avg episode rewards: #0: 9.410, true rewards: #0: 4.410 -[2023-02-24 11:13:42,965][01623] Avg episode reward: 9.410, avg true_objective: 4.410 -[2023-02-24 11:13:43,060][01623] Num frames 500... -[2023-02-24 11:13:43,193][01623] Num frames 600... -[2023-02-24 11:13:43,303][01623] Num frames 700... -[2023-02-24 11:13:43,419][01623] Num frames 800... -[2023-02-24 11:13:43,538][01623] Num frames 900... -[2023-02-24 11:13:43,661][01623] Num frames 1000... -[2023-02-24 11:13:43,775][01623] Num frames 1100... -[2023-02-24 11:13:43,895][01623] Num frames 1200... -[2023-02-24 11:13:43,961][01623] Avg episode rewards: #0: 13.045, true rewards: #0: 6.045 -[2023-02-24 11:13:43,963][01623] Avg episode reward: 13.045, avg true_objective: 6.045 -[2023-02-24 11:13:44,063][01623] Num frames 1300... -[2023-02-24 11:13:44,172][01623] Num frames 1400... -[2023-02-24 11:13:44,293][01623] Num frames 1500... -[2023-02-24 11:13:44,413][01623] Num frames 1600... -[2023-02-24 11:13:44,530][01623] Num frames 1700... -[2023-02-24 11:13:44,639][01623] Num frames 1800... -[2023-02-24 11:13:44,783][01623] Avg episode rewards: #0: 12.937, true rewards: #0: 6.270 -[2023-02-24 11:13:44,785][01623] Avg episode reward: 12.937, avg true_objective: 6.270 -[2023-02-24 11:13:44,812][01623] Num frames 1900... -[2023-02-24 11:13:44,925][01623] Num frames 2000... -[2023-02-24 11:13:45,038][01623] Num frames 2100... -[2023-02-24 11:13:45,154][01623] Num frames 2200... -[2023-02-24 11:13:45,271][01623] Num frames 2300... -[2023-02-24 11:13:45,387][01623] Num frames 2400... -[2023-02-24 11:13:45,498][01623] Num frames 2500... -[2023-02-24 11:13:45,614][01623] Num frames 2600... -[2023-02-24 11:13:45,737][01623] Num frames 2700... -[2023-02-24 11:13:45,850][01623] Num frames 2800... -[2023-02-24 11:13:45,980][01623] Num frames 2900... -[2023-02-24 11:13:46,106][01623] Num frames 3000... -[2023-02-24 11:13:46,222][01623] Num frames 3100... -[2023-02-24 11:13:46,351][01623] Num frames 3200... -[2023-02-24 11:13:46,480][01623] Num frames 3300... -[2023-02-24 11:13:46,570][01623] Avg episode rewards: #0: 20.577, true rewards: #0: 8.327 -[2023-02-24 11:13:46,571][01623] Avg episode reward: 20.577, avg true_objective: 8.327 -[2023-02-24 11:13:46,656][01623] Num frames 3400... 
-[2023-02-24 11:13:46,787][01623] Num frames 3500... -[2023-02-24 11:13:46,906][01623] Num frames 3600... -[2023-02-24 11:13:47,024][01623] Num frames 3700... -[2023-02-24 11:13:47,145][01623] Num frames 3800... -[2023-02-24 11:13:47,267][01623] Num frames 3900... -[2023-02-24 11:13:47,391][01623] Num frames 4000... -[2023-02-24 11:13:47,517][01623] Num frames 4100... -[2023-02-24 11:13:47,631][01623] Num frames 4200... -[2023-02-24 11:13:47,754][01623] Num frames 4300... -[2023-02-24 11:13:47,882][01623] Num frames 4400... -[2023-02-24 11:13:48,010][01623] Num frames 4500... -[2023-02-24 11:13:48,135][01623] Num frames 4600... -[2023-02-24 11:13:48,248][01623] Num frames 4700... -[2023-02-24 11:13:48,367][01623] Num frames 4800... -[2023-02-24 11:13:48,489][01623] Num frames 4900... -[2023-02-24 11:13:48,601][01623] Num frames 5000... -[2023-02-24 11:13:48,720][01623] Num frames 5100... -[2023-02-24 11:13:48,842][01623] Num frames 5200... -[2023-02-24 11:13:48,966][01623] Num frames 5300... -[2023-02-24 11:13:49,077][01623] Num frames 5400... -[2023-02-24 11:13:49,168][01623] Avg episode rewards: #0: 28.262, true rewards: #0: 10.862 -[2023-02-24 11:13:49,170][01623] Avg episode reward: 28.262, avg true_objective: 10.862 -[2023-02-24 11:13:49,270][01623] Num frames 5500... -[2023-02-24 11:13:49,399][01623] Num frames 5600... -[2023-02-24 11:13:49,519][01623] Num frames 5700... -[2023-02-24 11:13:49,630][01623] Num frames 5800... -[2023-02-24 11:13:49,748][01623] Num frames 5900... -[2023-02-24 11:13:49,866][01623] Num frames 6000... -[2023-02-24 11:13:49,982][01623] Num frames 6100... -[2023-02-24 11:13:50,110][01623] Num frames 6200... -[2023-02-24 11:13:50,225][01623] Num frames 6300... -[2023-02-24 11:13:50,424][01623] Num frames 6400... -[2023-02-24 11:13:50,588][01623] Num frames 6500... -[2023-02-24 11:13:50,667][01623] Avg episode rewards: #0: 27.198, true rewards: #0: 10.865 -[2023-02-24 11:13:50,669][01623] Avg episode reward: 27.198, avg true_objective: 10.865 -[2023-02-24 11:13:50,772][01623] Num frames 6600... -[2023-02-24 11:13:50,903][01623] Num frames 6700... -[2023-02-24 11:13:51,025][01623] Num frames 6800... -[2023-02-24 11:13:51,150][01623] Num frames 6900... -[2023-02-24 11:13:51,263][01623] Num frames 7000... -[2023-02-24 11:13:51,385][01623] Num frames 7100... -[2023-02-24 11:13:51,513][01623] Num frames 7200... -[2023-02-24 11:13:51,630][01623] Num frames 7300... -[2023-02-24 11:13:51,747][01623] Num frames 7400... -[2023-02-24 11:13:51,867][01623] Num frames 7500... -[2023-02-24 11:13:51,985][01623] Num frames 7600... -[2023-02-24 11:13:52,105][01623] Num frames 7700... -[2023-02-24 11:13:52,234][01623] Num frames 7800... -[2023-02-24 11:13:52,370][01623] Num frames 7900... -[2023-02-24 11:13:52,508][01623] Num frames 8000... -[2023-02-24 11:13:52,628][01623] Num frames 8100... -[2023-02-24 11:13:52,749][01623] Num frames 8200... -[2023-02-24 11:13:52,880][01623] Num frames 8300... -[2023-02-24 11:13:52,999][01623] Num frames 8400... -[2023-02-24 11:13:53,086][01623] Avg episode rewards: #0: 29.894, true rewards: #0: 12.037 -[2023-02-24 11:13:53,088][01623] Avg episode reward: 29.894, avg true_objective: 12.037 -[2023-02-24 11:13:53,214][01623] Num frames 8500... -[2023-02-24 11:13:53,405][01623] Num frames 8600... -[2023-02-24 11:13:53,576][01623] Num frames 8700... -[2023-02-24 11:13:53,740][01623] Num frames 8800... 
-[2023-02-24 11:13:53,916][01623] Avg episode rewards: #0: 26.842, true rewards: #0: 11.092
-[2023-02-24 11:13:53,918][01623] Avg episode reward: 26.842, avg true_objective: 11.092
-[2023-02-24 11:13:53,962][01623] Num frames 8900...
-[2023-02-24 11:13:54,113][01623] Num frames 9000...
-[2023-02-24 11:13:54,272][01623] Num frames 9100...
-[2023-02-24 11:13:54,432][01623] Num frames 9200...
-[2023-02-24 11:13:54,592][01623] Num frames 9300...
-[2023-02-24 11:13:54,692][01623] Avg episode rewards: #0: 24.802, true rewards: #0: 10.358
-[2023-02-24 11:13:54,693][01623] Avg episode reward: 24.802, avg true_objective: 10.358
-[2023-02-24 11:13:54,820][01623] Num frames 9400...
-[2023-02-24 11:13:54,991][01623] Num frames 9500...
-[2023-02-24 11:13:55,153][01623] Num frames 9600...
-[2023-02-24 11:13:55,316][01623] Num frames 9700...
-[2023-02-24 11:13:55,490][01623] Num frames 9800...
-[2023-02-24 11:13:55,659][01623] Num frames 9900...
-[2023-02-24 11:13:55,827][01623] Num frames 10000...
-[2023-02-24 11:13:55,991][01623] Num frames 10100...
-[2023-02-24 11:13:56,174][01623] Num frames 10200...
-[2023-02-24 11:13:56,359][01623] Num frames 10300...
-[2023-02-24 11:13:56,542][01623] Num frames 10400...
-[2023-02-24 11:13:56,714][01623] Num frames 10500...
-[2023-02-24 11:13:56,810][01623] Avg episode rewards: #0: 25.324, true rewards: #0: 10.524
-[2023-02-24 11:13:56,813][01623] Avg episode reward: 25.324, avg true_objective: 10.524
-[2023-02-24 11:15:04,289][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
+[2023-02-24 12:15:10,546][11201] Using optimizer
+[2023-02-24 12:15:10,547][11201] No checkpoints found
+[2023-02-24 12:15:10,547][11201] Did not load from checkpoint, starting from scratch!
+[2023-02-24 12:15:10,547][11201] Initialized policy 0 weights for model version 0
+[2023-02-24 12:15:10,551][11201] LearnerWorker_p0 finished initialization!
+[2023-02-24 12:15:10,551][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:10,778][11215] RunningMeanStd input shape: (3, 72, 128)
+[2023-02-24 12:15:10,779][11215] RunningMeanStd input shape: (1,)
+[2023-02-24 12:15:10,792][11215] ConvEncoder: input_channels=3
+[2023-02-24 12:15:10,892][11215] Conv encoder output size: 512
+[2023-02-24 12:15:10,892][11215] Policy head output size: 512
+[2023-02-24 12:15:12,870][00205] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-02-24 12:15:13,135][00205] Inference worker 0-0 is ready!
+[2023-02-24 12:15:13,137][00205] All inference workers are ready! Signal rollout workers to start!
+[2023-02-24 12:15:13,189][00205] Heartbeat connected on Batcher_0 +[2023-02-24 12:15:13,193][00205] Heartbeat connected on LearnerWorker_p0 +[2023-02-24 12:15:13,242][00205] Heartbeat connected on InferenceWorker_p0-w0 +[2023-02-24 12:15:13,280][11226] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,277][11227] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,288][11222] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,307][11224] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,311][11216] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,327][11223] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,328][11221] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,325][11225] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:14,505][11223] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,502][11216] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,504][11224] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,503][11227] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,506][11226] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,505][11225] Decorrelating experience for 0 frames... +[2023-02-24 12:15:15,534][11221] Decorrelating experience for 0 frames... +[2023-02-24 12:15:15,544][11225] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,551][11226] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,550][11216] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,555][11224] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,553][11227] Decorrelating experience for 32 frames... +[2023-02-24 12:15:16,392][11223] Decorrelating experience for 32 frames... +[2023-02-24 12:15:16,405][11222] Decorrelating experience for 0 frames... +[2023-02-24 12:15:16,494][11216] Decorrelating experience for 64 frames... +[2023-02-24 12:15:16,504][11224] Decorrelating experience for 64 frames... +[2023-02-24 12:15:17,196][11222] Decorrelating experience for 32 frames... +[2023-02-24 12:15:17,361][11223] Decorrelating experience for 64 frames... +[2023-02-24 12:15:17,441][11224] Decorrelating experience for 96 frames... +[2023-02-24 12:15:17,581][11216] Decorrelating experience for 96 frames... +[2023-02-24 12:15:17,619][00205] Heartbeat connected on RolloutWorker_w5 +[2023-02-24 12:15:17,800][00205] Heartbeat connected on RolloutWorker_w0 +[2023-02-24 12:15:17,870][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:18,286][11227] Decorrelating experience for 64 frames... +[2023-02-24 12:15:18,426][11222] Decorrelating experience for 64 frames... +[2023-02-24 12:15:18,910][11227] Decorrelating experience for 96 frames... +[2023-02-24 12:15:19,117][00205] Heartbeat connected on RolloutWorker_w4 +[2023-02-24 12:15:19,278][11223] Decorrelating experience for 96 frames... +[2023-02-24 12:15:19,558][11222] Decorrelating experience for 96 frames... +[2023-02-24 12:15:19,660][00205] Heartbeat connected on RolloutWorker_w3 +[2023-02-24 12:15:19,678][00205] Heartbeat connected on RolloutWorker_w2 +[2023-02-24 12:15:19,943][11226] Decorrelating experience for 64 frames... +[2023-02-24 12:15:21,076][11221] Decorrelating experience for 32 frames... 
+[2023-02-24 12:15:21,171][11226] Decorrelating experience for 96 frames... +[2023-02-24 12:15:21,501][00205] Heartbeat connected on RolloutWorker_w7 +[2023-02-24 12:15:21,637][11225] Decorrelating experience for 64 frames... +[2023-02-24 12:15:22,201][11225] Decorrelating experience for 96 frames... +[2023-02-24 12:15:22,366][00205] Heartbeat connected on RolloutWorker_w6 +[2023-02-24 12:15:22,492][11221] Decorrelating experience for 64 frames... +[2023-02-24 12:15:22,876][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 3.6. Samples: 36. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:24,394][11221] Decorrelating experience for 96 frames... +[2023-02-24 12:15:25,075][00205] Heartbeat connected on RolloutWorker_w1 +[2023-02-24 12:15:27,251][11201] Signal inference workers to stop experience collection... +[2023-02-24 12:15:27,262][11215] InferenceWorker_p0-w0: stopping experience collection +[2023-02-24 12:15:27,870][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 108.0. Samples: 1620. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:27,872][00205] Avg episode reward: [(0, '2.148')] +[2023-02-24 12:15:29,651][11201] Signal inference workers to resume experience collection... +[2023-02-24 12:15:29,653][11215] InferenceWorker_p0-w0: resuming experience collection +[2023-02-24 12:15:32,870][00205] Fps is (10 sec: 1639.3, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 191.1. Samples: 3822. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-24 12:15:32,876][00205] Avg episode reward: [(0, '3.277')] +[2023-02-24 12:15:37,870][00205] Fps is (10 sec: 3686.3, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 36864. Throughput: 0: 408.2. Samples: 10204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:15:37,877][00205] Avg episode reward: [(0, '3.970')] +[2023-02-24 12:15:37,944][11215] Updated weights for policy 0, policy_version 10 (0.0017) +[2023-02-24 12:15:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 1774.9, 300 sec: 1774.9). Total num frames: 53248. Throughput: 0: 414.3. Samples: 12428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:15:42,879][00205] Avg episode reward: [(0, '4.276')] +[2023-02-24 12:15:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 467.9. Samples: 16378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:15:47,872][00205] Avg episode reward: [(0, '4.387')] +[2023-02-24 12:15:50,151][11215] Updated weights for policy 0, policy_version 20 (0.0018) +[2023-02-24 12:15:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 90112. Throughput: 0: 582.8. Samples: 23310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:15:52,877][00205] Avg episode reward: [(0, '4.311')] +[2023-02-24 12:15:57,870][00205] Fps is (10 sec: 4505.4, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 114688. Throughput: 0: 597.3. Samples: 26878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:15:57,876][00205] Avg episode reward: [(0, '4.498')] +[2023-02-24 12:15:57,885][11201] Saving new best policy, reward=4.498! +[2023-02-24 12:16:00,589][11215] Updated weights for policy 0, policy_version 30 (0.0021) +[2023-02-24 12:16:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 2539.5, 300 sec: 2539.5). Total num frames: 126976. Throughput: 0: 702.6. Samples: 31616. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:02,876][00205] Avg episode reward: [(0, '4.507')] +[2023-02-24 12:16:02,879][11201] Saving new best policy, reward=4.507! +[2023-02-24 12:16:07,870][00205] Fps is (10 sec: 3277.0, 60 sec: 2681.0, 300 sec: 2681.0). Total num frames: 147456. Throughput: 0: 814.0. Samples: 36662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:07,878][00205] Avg episode reward: [(0, '4.360')] +[2023-02-24 12:16:11,386][11215] Updated weights for policy 0, policy_version 40 (0.0012) +[2023-02-24 12:16:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 2798.9, 300 sec: 2798.9). Total num frames: 167936. Throughput: 0: 856.4. Samples: 40160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:16:12,873][00205] Avg episode reward: [(0, '4.326')] +[2023-02-24 12:16:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3140.3, 300 sec: 2898.7). Total num frames: 188416. Throughput: 0: 957.8. Samples: 46924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:17,872][00205] Avg episode reward: [(0, '4.480')] +[2023-02-24 12:16:22,602][11215] Updated weights for policy 0, policy_version 50 (0.0019) +[2023-02-24 12:16:22,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3413.6, 300 sec: 2925.7). Total num frames: 204800. Throughput: 0: 914.2. Samples: 51344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:22,875][00205] Avg episode reward: [(0, '4.418')] +[2023-02-24 12:16:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3003.7). Total num frames: 225280. Throughput: 0: 916.9. Samples: 53690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:16:27,873][00205] Avg episode reward: [(0, '4.363')] +[2023-02-24 12:16:32,346][11215] Updated weights for policy 0, policy_version 60 (0.0012) +[2023-02-24 12:16:32,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3072.0). Total num frames: 245760. Throughput: 0: 982.3. Samples: 60580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:32,873][00205] Avg episode reward: [(0, '4.446')] +[2023-02-24 12:16:37,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3132.1). Total num frames: 266240. Throughput: 0: 968.6. Samples: 66900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:37,875][00205] Avg episode reward: [(0, '4.487')] +[2023-02-24 12:16:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3094.8). Total num frames: 278528. Throughput: 0: 939.7. Samples: 69166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:42,879][00205] Avg episode reward: [(0, '4.349')] +[2023-02-24 12:16:44,177][11215] Updated weights for policy 0, policy_version 70 (0.0048) +[2023-02-24 12:16:47,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3822.9, 300 sec: 3147.4). Total num frames: 299008. Throughput: 0: 945.0. Samples: 74142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:47,875][00205] Avg episode reward: [(0, '4.269')] +[2023-02-24 12:16:47,972][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth... +[2023-02-24 12:16:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3235.8). Total num frames: 323584. Throughput: 0: 985.9. Samples: 81028. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:16:52,878][00205] Avg episode reward: [(0, '4.319')] +[2023-02-24 12:16:53,454][11215] Updated weights for policy 0, policy_version 80 (0.0018) +[2023-02-24 12:16:57,874][00205] Fps is (10 sec: 4094.3, 60 sec: 3754.4, 300 sec: 3237.7). Total num frames: 339968. Throughput: 0: 988.0. Samples: 84624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:16:57,883][00205] Avg episode reward: [(0, '4.369')] +[2023-02-24 12:17:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3239.6). Total num frames: 356352. Throughput: 0: 939.6. Samples: 89206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:17:02,874][00205] Avg episode reward: [(0, '4.496')] +[2023-02-24 12:17:05,500][11215] Updated weights for policy 0, policy_version 90 (0.0025) +[2023-02-24 12:17:07,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 376832. Throughput: 0: 962.2. Samples: 94642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:17:07,875][00205] Avg episode reward: [(0, '4.494')] +[2023-02-24 12:17:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3345.1). Total num frames: 401408. Throughput: 0: 989.4. Samples: 98212. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:17:12,873][00205] Avg episode reward: [(0, '4.592')] +[2023-02-24 12:17:12,880][11201] Saving new best policy, reward=4.592! +[2023-02-24 12:17:14,319][11215] Updated weights for policy 0, policy_version 100 (0.0029) +[2023-02-24 12:17:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3342.3). Total num frames: 417792. Throughput: 0: 981.1. Samples: 104730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:17:17,872][00205] Avg episode reward: [(0, '4.498')] +[2023-02-24 12:17:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3339.8). Total num frames: 434176. Throughput: 0: 934.1. Samples: 108934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:17:22,873][00205] Avg episode reward: [(0, '4.362')] +[2023-02-24 12:17:27,018][11215] Updated weights for policy 0, policy_version 110 (0.0029) +[2023-02-24 12:17:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3367.8). Total num frames: 454656. Throughput: 0: 934.0. Samples: 111194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:27,872][00205] Avg episode reward: [(0, '4.453')] +[2023-02-24 12:17:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3393.8). Total num frames: 475136. Throughput: 0: 978.1. Samples: 118156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:17:32,878][00205] Avg episode reward: [(0, '4.647')] +[2023-02-24 12:17:32,884][11201] Saving new best policy, reward=4.647! +[2023-02-24 12:17:36,281][11215] Updated weights for policy 0, policy_version 120 (0.0015) +[2023-02-24 12:17:37,875][00205] Fps is (10 sec: 4094.1, 60 sec: 3822.8, 300 sec: 3417.9). Total num frames: 495616. Throughput: 0: 958.5. Samples: 124166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:17:37,878][00205] Avg episode reward: [(0, '4.540')] +[2023-02-24 12:17:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3386.0). Total num frames: 507904. Throughput: 0: 924.4. Samples: 126220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:42,873][00205] Avg episode reward: [(0, '4.494')] +[2023-02-24 12:17:47,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3754.7, 300 sec: 3382.5). Total num frames: 524288. 
Throughput: 0: 924.0. Samples: 130784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:47,873][00205] Avg episode reward: [(0, '4.480')] +[2023-02-24 12:17:48,958][11215] Updated weights for policy 0, policy_version 130 (0.0029) +[2023-02-24 12:17:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3430.4). Total num frames: 548864. Throughput: 0: 948.7. Samples: 137334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:52,877][00205] Avg episode reward: [(0, '4.535')] +[2023-02-24 12:17:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3425.7). Total num frames: 565248. Throughput: 0: 944.3. Samples: 140704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:17:57,875][00205] Avg episode reward: [(0, '4.569')] +[2023-02-24 12:17:59,506][11215] Updated weights for policy 0, policy_version 140 (0.0025) +[2023-02-24 12:18:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3421.4). Total num frames: 581632. Throughput: 0: 893.5. Samples: 144938. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:02,872][00205] Avg episode reward: [(0, '4.381')] +[2023-02-24 12:18:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3417.2). Total num frames: 598016. Throughput: 0: 900.6. Samples: 149460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:18:07,872][00205] Avg episode reward: [(0, '4.500')] +[2023-02-24 12:18:11,604][11215] Updated weights for policy 0, policy_version 150 (0.0025) +[2023-02-24 12:18:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3436.1). Total num frames: 618496. Throughput: 0: 919.8. Samples: 152584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:18:12,873][00205] Avg episode reward: [(0, '4.487')] +[2023-02-24 12:18:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3431.8). Total num frames: 634880. Throughput: 0: 904.4. Samples: 158856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:17,873][00205] Avg episode reward: [(0, '4.541')] +[2023-02-24 12:18:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3406.1). Total num frames: 647168. Throughput: 0: 862.2. Samples: 162960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:18:22,874][00205] Avg episode reward: [(0, '4.610')] +[2023-02-24 12:18:24,292][11215] Updated weights for policy 0, policy_version 160 (0.0015) +[2023-02-24 12:18:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3423.8). Total num frames: 667648. Throughput: 0: 863.5. Samples: 165078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:18:27,873][00205] Avg episode reward: [(0, '4.621')] +[2023-02-24 12:18:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3440.6). Total num frames: 688128. Throughput: 0: 904.7. Samples: 171496. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:32,873][00205] Avg episode reward: [(0, '4.513')] +[2023-02-24 12:18:34,056][11215] Updated weights for policy 0, policy_version 170 (0.0012) +[2023-02-24 12:18:37,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3550.1, 300 sec: 3456.6). Total num frames: 708608. Throughput: 0: 891.1. Samples: 177432. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:37,873][00205] Avg episode reward: [(0, '4.413')] +[2023-02-24 12:18:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3432.8). Total num frames: 720896. Throughput: 0: 862.0. Samples: 179496. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:18:42,874][00205] Avg episode reward: [(0, '4.574')] +[2023-02-24 12:18:47,477][11215] Updated weights for policy 0, policy_version 180 (0.0025) +[2023-02-24 12:18:47,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3429.2). Total num frames: 737280. Throughput: 0: 854.2. Samples: 183378. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:47,876][00205] Avg episode reward: [(0, '4.684')] +[2023-02-24 12:18:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth... +[2023-02-24 12:18:48,003][11201] Saving new best policy, reward=4.684! +[2023-02-24 12:18:52,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3444.4). Total num frames: 757760. Throughput: 0: 887.9. Samples: 189418. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:52,876][00205] Avg episode reward: [(0, '4.764')] +[2023-02-24 12:18:52,881][11201] Saving new best policy, reward=4.764! +[2023-02-24 12:18:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3440.6). Total num frames: 774144. Throughput: 0: 890.4. Samples: 192650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:18:57,872][00205] Avg episode reward: [(0, '4.736')] +[2023-02-24 12:18:58,010][11215] Updated weights for policy 0, policy_version 190 (0.0013) +[2023-02-24 12:19:02,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3437.1). Total num frames: 790528. Throughput: 0: 854.0. Samples: 197286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:19:02,875][00205] Avg episode reward: [(0, '4.664')] +[2023-02-24 12:19:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3433.7). Total num frames: 806912. Throughput: 0: 866.8. Samples: 201966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:19:07,873][00205] Avg episode reward: [(0, '4.628')] +[2023-02-24 12:19:10,101][11215] Updated weights for policy 0, policy_version 200 (0.0025) +[2023-02-24 12:19:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3447.5). Total num frames: 827392. Throughput: 0: 894.9. Samples: 205346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:12,878][00205] Avg episode reward: [(0, '4.605')] +[2023-02-24 12:19:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3460.7). Total num frames: 847872. Throughput: 0: 899.6. Samples: 211980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:19:17,874][00205] Avg episode reward: [(0, '4.689')] +[2023-02-24 12:19:21,222][11215] Updated weights for policy 0, policy_version 210 (0.0014) +[2023-02-24 12:19:22,872][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3457.0). Total num frames: 864256. Throughput: 0: 862.4. Samples: 216242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:22,877][00205] Avg episode reward: [(0, '4.659')] +[2023-02-24 12:19:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3453.5). Total num frames: 880640. Throughput: 0: 865.1. Samples: 218426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:19:27,872][00205] Avg episode reward: [(0, '4.987')] +[2023-02-24 12:19:27,885][11201] Saving new best policy, reward=4.987! +[2023-02-24 12:19:32,149][11215] Updated weights for policy 0, policy_version 220 (0.0026) +[2023-02-24 12:19:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3465.8). Total num frames: 901120. Throughput: 0: 920.1. Samples: 224780. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:19:32,873][00205] Avg episode reward: [(0, '4.751')] +[2023-02-24 12:19:37,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3477.7). Total num frames: 921600. Throughput: 0: 923.4. Samples: 230972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:19:37,872][00205] Avg episode reward: [(0, '4.536')] +[2023-02-24 12:19:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3474.0). Total num frames: 937984. Throughput: 0: 898.8. Samples: 233098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:19:42,874][00205] Avg episode reward: [(0, '4.607')] +[2023-02-24 12:19:44,170][11215] Updated weights for policy 0, policy_version 230 (0.0030) +[2023-02-24 12:19:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.2, 300 sec: 3470.4). Total num frames: 954368. Throughput: 0: 895.2. Samples: 237572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:47,875][00205] Avg episode reward: [(0, '4.886')] +[2023-02-24 12:19:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3481.6). Total num frames: 974848. Throughput: 0: 937.3. Samples: 244142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:19:52,873][00205] Avg episode reward: [(0, '4.803')] +[2023-02-24 12:19:54,244][11215] Updated weights for policy 0, policy_version 240 (0.0026) +[2023-02-24 12:19:57,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3686.2, 300 sec: 3492.3). Total num frames: 995328. Throughput: 0: 932.5. Samples: 247310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:57,875][00205] Avg episode reward: [(0, '4.768')] +[2023-02-24 12:20:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3474.5). Total num frames: 1007616. Throughput: 0: 881.9. Samples: 251664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:02,877][00205] Avg episode reward: [(0, '4.945')] +[2023-02-24 12:20:07,425][11215] Updated weights for policy 0, policy_version 250 (0.0012) +[2023-02-24 12:20:07,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3618.2, 300 sec: 3471.2). Total num frames: 1024000. Throughput: 0: 887.0. Samples: 256156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:07,877][00205] Avg episode reward: [(0, '5.081')] +[2023-02-24 12:20:07,888][11201] Saving new best policy, reward=5.081! +[2023-02-24 12:20:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1044480. Throughput: 0: 908.4. Samples: 259306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:12,874][00205] Avg episode reward: [(0, '5.107')] +[2023-02-24 12:20:12,880][11201] Saving new best policy, reward=5.107! +[2023-02-24 12:20:17,756][11215] Updated weights for policy 0, policy_version 260 (0.0012) +[2023-02-24 12:20:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3610.1). Total num frames: 1064960. Throughput: 0: 907.4. Samples: 265614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:17,875][00205] Avg episode reward: [(0, '5.047')] +[2023-02-24 12:20:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1077248. Throughput: 0: 858.2. Samples: 269590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:20:22,873][00205] Avg episode reward: [(0, '5.014')] +[2023-02-24 12:20:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1093632. Throughput: 0: 853.6. Samples: 271510. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:20:27,877][00205] Avg episode reward: [(0, '5.123')] +[2023-02-24 12:20:27,887][11201] Saving new best policy, reward=5.123! +[2023-02-24 12:20:30,782][11215] Updated weights for policy 0, policy_version 270 (0.0030) +[2023-02-24 12:20:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1114112. Throughput: 0: 887.3. Samples: 277500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:32,872][00205] Avg episode reward: [(0, '5.457')] +[2023-02-24 12:20:32,877][11201] Saving new best policy, reward=5.457! +[2023-02-24 12:20:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 1130496. Throughput: 0: 872.3. Samples: 283396. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:20:37,876][00205] Avg episode reward: [(0, '5.337')] +[2023-02-24 12:20:42,877][00205] Fps is (10 sec: 2865.1, 60 sec: 3412.9, 300 sec: 3637.7). Total num frames: 1142784. Throughput: 0: 846.3. Samples: 285396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:42,881][00205] Avg episode reward: [(0, '5.401')] +[2023-02-24 12:20:42,915][11215] Updated weights for policy 0, policy_version 280 (0.0013) +[2023-02-24 12:20:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 1159168. Throughput: 0: 840.1. Samples: 289468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:20:47,873][00205] Avg episode reward: [(0, '5.472')] +[2023-02-24 12:20:47,886][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000283_1159168.pth... +[2023-02-24 12:20:47,995][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth +[2023-02-24 12:20:48,011][11201] Saving new best policy, reward=5.472! +[2023-02-24 12:20:52,870][00205] Fps is (10 sec: 4099.0, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 1183744. Throughput: 0: 878.3. Samples: 295680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:52,876][00205] Avg episode reward: [(0, '5.501')] +[2023-02-24 12:20:52,880][11201] Saving new best policy, reward=5.501! +[2023-02-24 12:20:53,934][11215] Updated weights for policy 0, policy_version 290 (0.0015) +[2023-02-24 12:20:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3637.8). Total num frames: 1200128. Throughput: 0: 875.2. Samples: 298692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:57,879][00205] Avg episode reward: [(0, '5.351')] +[2023-02-24 12:21:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1212416. Throughput: 0: 833.4. Samples: 303116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:02,874][00205] Avg episode reward: [(0, '5.385')] +[2023-02-24 12:21:07,198][11215] Updated weights for policy 0, policy_version 300 (0.0012) +[2023-02-24 12:21:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3596.2). Total num frames: 1228800. Throughput: 0: 845.3. Samples: 307628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:21:07,872][00205] Avg episode reward: [(0, '5.232')] +[2023-02-24 12:21:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3596.1). Total num frames: 1249280. Throughput: 0: 872.5. Samples: 310772. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:21:12,875][00205] Avg episode reward: [(0, '5.181')] +[2023-02-24 12:21:17,493][11215] Updated weights for policy 0, policy_version 310 (0.0016) +[2023-02-24 12:21:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1269760. Throughput: 0: 880.0. Samples: 317100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:17,873][00205] Avg episode reward: [(0, '5.654')] +[2023-02-24 12:21:17,890][11201] Saving new best policy, reward=5.654! +[2023-02-24 12:21:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3582.3). Total num frames: 1282048. Throughput: 0: 836.2. Samples: 321026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:21:22,872][00205] Avg episode reward: [(0, '5.684')] +[2023-02-24 12:21:22,875][11201] Saving new best policy, reward=5.684! +[2023-02-24 12:21:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1298432. Throughput: 0: 836.8. Samples: 323048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:27,876][00205] Avg episode reward: [(0, '5.906')] +[2023-02-24 12:21:27,887][11201] Saving new best policy, reward=5.906! +[2023-02-24 12:21:30,480][11215] Updated weights for policy 0, policy_version 320 (0.0018) +[2023-02-24 12:21:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1318912. Throughput: 0: 879.6. Samples: 329050. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:21:32,873][00205] Avg episode reward: [(0, '5.781')] +[2023-02-24 12:21:37,874][00205] Fps is (10 sec: 3685.1, 60 sec: 3413.1, 300 sec: 3582.2). Total num frames: 1335296. Throughput: 0: 872.0. Samples: 334922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:37,884][00205] Avg episode reward: [(0, '5.859')] +[2023-02-24 12:21:42,523][11215] Updated weights for policy 0, policy_version 330 (0.0022) +[2023-02-24 12:21:42,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3481.9, 300 sec: 3568.4). Total num frames: 1351680. Throughput: 0: 849.4. Samples: 336916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:21:42,875][00205] Avg episode reward: [(0, '6.205')] +[2023-02-24 12:21:42,881][11201] Saving new best policy, reward=6.205! +[2023-02-24 12:21:47,870][00205] Fps is (10 sec: 3278.0, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1368064. Throughput: 0: 842.8. Samples: 341040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:47,877][00205] Avg episode reward: [(0, '6.115')] +[2023-02-24 12:21:52,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 1388544. Throughput: 0: 886.0. Samples: 347496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:21:52,876][00205] Avg episode reward: [(0, '5.819')] +[2023-02-24 12:21:53,336][11215] Updated weights for policy 0, policy_version 340 (0.0013) +[2023-02-24 12:21:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 1409024. Throughput: 0: 888.5. Samples: 350756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:57,875][00205] Avg episode reward: [(0, '5.845')] +[2023-02-24 12:22:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1421312. Throughput: 0: 839.8. Samples: 354890. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:22:02,878][00205] Avg episode reward: [(0, '5.821')] +[2023-02-24 12:22:06,387][11215] Updated weights for policy 0, policy_version 350 (0.0019) +[2023-02-24 12:22:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1437696. Throughput: 0: 860.0. Samples: 359724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:22:07,878][00205] Avg episode reward: [(0, '6.200')] +[2023-02-24 12:22:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1458176. Throughput: 0: 887.9. Samples: 363002. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:22:12,878][00205] Avg episode reward: [(0, '6.405')] +[2023-02-24 12:22:12,882][11201] Saving new best policy, reward=6.405! +[2023-02-24 12:22:16,660][11215] Updated weights for policy 0, policy_version 360 (0.0020) +[2023-02-24 12:22:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 1474560. Throughput: 0: 887.8. Samples: 369000. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:22:17,875][00205] Avg episode reward: [(0, '5.882')] +[2023-02-24 12:22:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1490944. Throughput: 0: 849.6. Samples: 373150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:22,877][00205] Avg episode reward: [(0, '6.185')] +[2023-02-24 12:22:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1507328. Throughput: 0: 850.6. Samples: 375192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:27,873][00205] Avg episode reward: [(0, '6.282')] +[2023-02-24 12:22:29,299][11215] Updated weights for policy 0, policy_version 370 (0.0015) +[2023-02-24 12:22:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1527808. Throughput: 0: 898.9. Samples: 381492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:32,872][00205] Avg episode reward: [(0, '6.499')] +[2023-02-24 12:22:32,875][11201] Saving new best policy, reward=6.499! +[2023-02-24 12:22:37,873][00205] Fps is (10 sec: 3685.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1544192. Throughput: 0: 877.9. Samples: 387004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:22:37,887][00205] Avg episode reward: [(0, '6.315')] +[2023-02-24 12:22:41,166][11215] Updated weights for policy 0, policy_version 380 (0.0012) +[2023-02-24 12:22:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3512.8). Total num frames: 1560576. Throughput: 0: 850.4. Samples: 389024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:22:42,872][00205] Avg episode reward: [(0, '6.222')] +[2023-02-24 12:22:47,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1576960. Throughput: 0: 857.3. Samples: 393470. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:22:47,878][00205] Avg episode reward: [(0, '6.490')] +[2023-02-24 12:22:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_1576960.pth... +[2023-02-24 12:22:48,008][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth +[2023-02-24 12:22:52,505][11215] Updated weights for policy 0, policy_version 390 (0.0016) +[2023-02-24 12:22:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1597440. 
Throughput: 0: 888.3. Samples: 399698. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:52,873][00205] Avg episode reward: [(0, '6.882')] +[2023-02-24 12:22:52,876][11201] Saving new best policy, reward=6.882! +[2023-02-24 12:22:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 1613824. Throughput: 0: 884.8. Samples: 402816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:22:57,874][00205] Avg episode reward: [(0, '6.792')] +[2023-02-24 12:23:02,874][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1626112. Throughput: 0: 838.5. Samples: 406734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:02,876][00205] Avg episode reward: [(0, '7.109')] +[2023-02-24 12:23:02,878][11201] Saving new best policy, reward=7.109! +[2023-02-24 12:23:05,821][11215] Updated weights for policy 0, policy_version 400 (0.0013) +[2023-02-24 12:23:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1646592. Throughput: 0: 854.3. Samples: 411594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:23:07,873][00205] Avg episode reward: [(0, '7.453')] +[2023-02-24 12:23:07,881][11201] Saving new best policy, reward=7.453! +[2023-02-24 12:23:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1667072. Throughput: 0: 878.7. Samples: 414732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:12,873][00205] Avg episode reward: [(0, '7.514')] +[2023-02-24 12:23:12,877][11201] Saving new best policy, reward=7.514! +[2023-02-24 12:23:15,700][11215] Updated weights for policy 0, policy_version 410 (0.0012) +[2023-02-24 12:23:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1683456. Throughput: 0: 869.8. Samples: 420632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:17,872][00205] Avg episode reward: [(0, '7.496')] +[2023-02-24 12:23:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1695744. Throughput: 0: 836.9. Samples: 424662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:23:22,877][00205] Avg episode reward: [(0, '7.483')] +[2023-02-24 12:23:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1716224. Throughput: 0: 837.9. Samples: 426728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:27,878][00205] Avg episode reward: [(0, '7.827')] +[2023-02-24 12:23:27,886][11201] Saving new best policy, reward=7.827! +[2023-02-24 12:23:28,928][11215] Updated weights for policy 0, policy_version 420 (0.0020) +[2023-02-24 12:23:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1736704. Throughput: 0: 878.6. Samples: 433008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:32,879][00205] Avg episode reward: [(0, '8.086')] +[2023-02-24 12:23:32,883][11201] Saving new best policy, reward=8.086! +[2023-02-24 12:23:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 3485.1). Total num frames: 1748992. Throughput: 0: 859.6. Samples: 438378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:37,874][00205] Avg episode reward: [(0, '7.910')] +[2023-02-24 12:23:40,867][11215] Updated weights for policy 0, policy_version 430 (0.0011) +[2023-02-24 12:23:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1765376. Throughput: 0: 833.9. 
Samples: 440342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:23:42,878][00205] Avg episode reward: [(0, '7.784')] +[2023-02-24 12:23:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1781760. Throughput: 0: 848.8. Samples: 444928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:23:47,872][00205] Avg episode reward: [(0, '7.886')] +[2023-02-24 12:23:52,061][11215] Updated weights for policy 0, policy_version 440 (0.0018) +[2023-02-24 12:23:52,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1802240. Throughput: 0: 882.7. Samples: 451314. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:23:52,883][00205] Avg episode reward: [(0, '8.041')] +[2023-02-24 12:23:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1818624. Throughput: 0: 878.9. Samples: 454284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:23:57,872][00205] Avg episode reward: [(0, '8.421')] +[2023-02-24 12:23:57,892][11201] Saving new best policy, reward=8.421! +[2023-02-24 12:24:02,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1830912. Throughput: 0: 835.1. Samples: 458214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:02,875][00205] Avg episode reward: [(0, '8.603')] +[2023-02-24 12:24:02,878][11201] Saving new best policy, reward=8.603! +[2023-02-24 12:24:05,500][11215] Updated weights for policy 0, policy_version 450 (0.0032) +[2023-02-24 12:24:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1851392. Throughput: 0: 853.7. Samples: 463078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:24:07,880][00205] Avg episode reward: [(0, '9.296')] +[2023-02-24 12:24:07,890][11201] Saving new best policy, reward=9.296! +[2023-02-24 12:24:12,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1871872. Throughput: 0: 876.9. Samples: 466190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:24:12,872][00205] Avg episode reward: [(0, '9.761')] +[2023-02-24 12:24:12,880][11201] Saving new best policy, reward=9.761! +[2023-02-24 12:24:16,003][11215] Updated weights for policy 0, policy_version 460 (0.0027) +[2023-02-24 12:24:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1888256. Throughput: 0: 861.4. Samples: 471770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:17,874][00205] Avg episode reward: [(0, '9.532')] +[2023-02-24 12:24:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1900544. Throughput: 0: 829.6. Samples: 475712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:22,876][00205] Avg episode reward: [(0, '10.124')] +[2023-02-24 12:24:22,881][11201] Saving new best policy, reward=10.124! +[2023-02-24 12:24:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1916928. Throughput: 0: 836.3. Samples: 477974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:27,872][00205] Avg episode reward: [(0, '9.217')] +[2023-02-24 12:24:28,884][11215] Updated weights for policy 0, policy_version 470 (0.0033) +[2023-02-24 12:24:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1937408. Throughput: 0: 870.9. Samples: 484120. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:32,873][00205] Avg episode reward: [(0, '9.530')] +[2023-02-24 12:24:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1953792. Throughput: 0: 843.8. Samples: 489286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:37,876][00205] Avg episode reward: [(0, '9.708')] +[2023-02-24 12:24:41,426][11215] Updated weights for policy 0, policy_version 480 (0.0041) +[2023-02-24 12:24:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 1966080. Throughput: 0: 821.3. Samples: 491242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:42,873][00205] Avg episode reward: [(0, '9.560')] +[2023-02-24 12:24:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1986560. Throughput: 0: 839.7. Samples: 496002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:47,877][00205] Avg episode reward: [(0, '10.393')] +[2023-02-24 12:24:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000485_1986560.pth... +[2023-02-24 12:24:48,015][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000283_1159168.pth +[2023-02-24 12:24:48,026][11201] Saving new best policy, reward=10.393! +[2023-02-24 12:24:52,339][11215] Updated weights for policy 0, policy_version 490 (0.0012) +[2023-02-24 12:24:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 2007040. Throughput: 0: 867.1. Samples: 502096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:52,873][00205] Avg episode reward: [(0, '10.734')] +[2023-02-24 12:24:52,876][11201] Saving new best policy, reward=10.734! +[2023-02-24 12:24:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2023424. Throughput: 0: 857.2. Samples: 504764. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:24:57,880][00205] Avg episode reward: [(0, '11.146')] +[2023-02-24 12:24:57,896][11201] Saving new best policy, reward=11.146! +[2023-02-24 12:25:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3429.5). Total num frames: 2035712. Throughput: 0: 815.2. Samples: 508456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:02,877][00205] Avg episode reward: [(0, '10.893')] +[2023-02-24 12:25:06,006][11215] Updated weights for policy 0, policy_version 500 (0.0025) +[2023-02-24 12:25:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2052096. Throughput: 0: 844.3. Samples: 513704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:07,877][00205] Avg episode reward: [(0, '10.468')] +[2023-02-24 12:25:12,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3344.9, 300 sec: 3415.6). Total num frames: 2072576. Throughput: 0: 864.1. Samples: 516860. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:12,878][00205] Avg episode reward: [(0, '10.186')] +[2023-02-24 12:25:16,819][11215] Updated weights for policy 0, policy_version 510 (0.0012) +[2023-02-24 12:25:17,874][00205] Fps is (10 sec: 3684.7, 60 sec: 3344.8, 300 sec: 3429.5). Total num frames: 2088960. Throughput: 0: 853.7. Samples: 522542. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:17,877][00205] Avg episode reward: [(0, '10.895')] +[2023-02-24 12:25:22,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2105344. 
Throughput: 0: 830.5. Samples: 526658. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:22,876][00205] Avg episode reward: [(0, '11.838')] +[2023-02-24 12:25:22,883][11201] Saving new best policy, reward=11.838! +[2023-02-24 12:25:27,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2125824. Throughput: 0: 842.8. Samples: 529168. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:27,876][00205] Avg episode reward: [(0, '11.191')] +[2023-02-24 12:25:28,778][11215] Updated weights for policy 0, policy_version 520 (0.0015) +[2023-02-24 12:25:32,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2146304. Throughput: 0: 879.1. Samples: 535562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:25:32,872][00205] Avg episode reward: [(0, '12.906')] +[2023-02-24 12:25:32,881][11201] Saving new best policy, reward=12.906! +[2023-02-24 12:25:37,873][00205] Fps is (10 sec: 3275.7, 60 sec: 3413.1, 300 sec: 3443.5). Total num frames: 2158592. Throughput: 0: 854.1. Samples: 540534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:25:37,876][00205] Avg episode reward: [(0, '12.234')] +[2023-02-24 12:25:41,184][11215] Updated weights for policy 0, policy_version 530 (0.0022) +[2023-02-24 12:25:42,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3481.5, 300 sec: 3443.4). Total num frames: 2174976. Throughput: 0: 839.1. Samples: 542524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:42,874][00205] Avg episode reward: [(0, '11.783')] +[2023-02-24 12:25:47,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2191360. Throughput: 0: 867.4. Samples: 547488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:47,875][00205] Avg episode reward: [(0, '11.929')] +[2023-02-24 12:25:51,704][11215] Updated weights for policy 0, policy_version 540 (0.0012) +[2023-02-24 12:25:52,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2215936. Throughput: 0: 894.9. Samples: 553976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:52,877][00205] Avg episode reward: [(0, '11.139')] +[2023-02-24 12:25:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2228224. Throughput: 0: 882.2. Samples: 556558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:25:57,873][00205] Avg episode reward: [(0, '12.047')] +[2023-02-24 12:26:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2244608. Throughput: 0: 846.2. Samples: 560618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:02,873][00205] Avg episode reward: [(0, '12.846')] +[2023-02-24 12:26:05,046][11215] Updated weights for policy 0, policy_version 550 (0.0019) +[2023-02-24 12:26:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2265088. Throughput: 0: 875.7. Samples: 566064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:07,872][00205] Avg episode reward: [(0, '13.604')] +[2023-02-24 12:26:07,888][11201] Saving new best policy, reward=13.604! +[2023-02-24 12:26:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3443.4). Total num frames: 2285568. Throughput: 0: 888.0. Samples: 569126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:12,878][00205] Avg episode reward: [(0, '13.737')] +[2023-02-24 12:26:12,885][11201] Saving new best policy, reward=13.737! 
+[2023-02-24 12:26:15,515][11215] Updated weights for policy 0, policy_version 560 (0.0015) +[2023-02-24 12:26:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.9, 300 sec: 3443.4). Total num frames: 2297856. Throughput: 0: 863.5. Samples: 574418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:17,873][00205] Avg episode reward: [(0, '14.525')] +[2023-02-24 12:26:17,890][11201] Saving new best policy, reward=14.525! +[2023-02-24 12:26:22,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2310144. Throughput: 0: 839.6. Samples: 578312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:22,874][00205] Avg episode reward: [(0, '14.451')] +[2023-02-24 12:26:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2330624. Throughput: 0: 856.8. Samples: 581076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:27,873][00205] Avg episode reward: [(0, '14.278')] +[2023-02-24 12:26:28,206][11215] Updated weights for policy 0, policy_version 570 (0.0023) +[2023-02-24 12:26:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3443.5). Total num frames: 2351104. Throughput: 0: 885.1. Samples: 587318. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:26:32,872][00205] Avg episode reward: [(0, '14.832')] +[2023-02-24 12:26:32,876][11201] Saving new best policy, reward=14.832! +[2023-02-24 12:26:37,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 2367488. Throughput: 0: 847.5. Samples: 592114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:37,878][00205] Avg episode reward: [(0, '14.690')] +[2023-02-24 12:26:40,690][11215] Updated weights for policy 0, policy_version 580 (0.0018) +[2023-02-24 12:26:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.5, 300 sec: 3429.5). Total num frames: 2379776. Throughput: 0: 835.2. Samples: 594142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:42,875][00205] Avg episode reward: [(0, '13.701')] +[2023-02-24 12:26:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2400256. Throughput: 0: 862.8. Samples: 599444. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:26:47,872][00205] Avg episode reward: [(0, '15.186')] +[2023-02-24 12:26:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000586_2400256.pth... +[2023-02-24 12:26:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_1576960.pth +[2023-02-24 12:26:48,046][11201] Saving new best policy, reward=15.186! +[2023-02-24 12:26:51,208][11215] Updated weights for policy 0, policy_version 590 (0.0014) +[2023-02-24 12:26:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2420736. Throughput: 0: 881.0. Samples: 605710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:52,878][00205] Avg episode reward: [(0, '14.601')] +[2023-02-24 12:26:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2437120. Throughput: 0: 866.8. Samples: 608134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:57,873][00205] Avg episode reward: [(0, '14.396')] +[2023-02-24 12:27:02,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2449408. Throughput: 0: 839.4. Samples: 612190. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:27:02,880][00205] Avg episode reward: [(0, '14.811')] +[2023-02-24 12:27:04,374][11215] Updated weights for policy 0, policy_version 600 (0.0020) +[2023-02-24 12:27:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2469888. Throughput: 0: 884.1. Samples: 618096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:27:07,878][00205] Avg episode reward: [(0, '15.226')] +[2023-02-24 12:27:07,887][11201] Saving new best policy, reward=15.226! +[2023-02-24 12:27:12,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2494464. Throughput: 0: 892.8. Samples: 621252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:27:12,878][00205] Avg episode reward: [(0, '15.343')] +[2023-02-24 12:27:12,883][11201] Saving new best policy, reward=15.343! +[2023-02-24 12:27:14,283][11215] Updated weights for policy 0, policy_version 610 (0.0012) +[2023-02-24 12:27:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2506752. Throughput: 0: 870.8. Samples: 626504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:27:17,880][00205] Avg episode reward: [(0, '15.456')] +[2023-02-24 12:27:17,895][11201] Saving new best policy, reward=15.456! +[2023-02-24 12:27:22,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2519040. Throughput: 0: 856.6. Samples: 630662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:22,875][00205] Avg episode reward: [(0, '16.697')] +[2023-02-24 12:27:22,879][11201] Saving new best policy, reward=16.697! +[2023-02-24 12:27:27,025][11215] Updated weights for policy 0, policy_version 620 (0.0025) +[2023-02-24 12:27:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2539520. Throughput: 0: 875.3. Samples: 633530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:27,875][00205] Avg episode reward: [(0, '16.487')] +[2023-02-24 12:27:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2564096. Throughput: 0: 902.6. Samples: 640060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:32,873][00205] Avg episode reward: [(0, '18.070')] +[2023-02-24 12:27:32,880][11201] Saving new best policy, reward=18.070! +[2023-02-24 12:27:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2576384. Throughput: 0: 865.5. Samples: 644656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:37,876][00205] Avg episode reward: [(0, '19.315')] +[2023-02-24 12:27:37,889][11201] Saving new best policy, reward=19.315! +[2023-02-24 12:27:38,589][11215] Updated weights for policy 0, policy_version 630 (0.0016) +[2023-02-24 12:27:42,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2588672. Throughput: 0: 854.9. Samples: 646606. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:27:42,872][00205] Avg episode reward: [(0, '18.752')] +[2023-02-24 12:27:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2613248. Throughput: 0: 886.7. Samples: 652090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:27:47,873][00205] Avg episode reward: [(0, '19.421')] +[2023-02-24 12:27:47,885][11201] Saving new best policy, reward=19.421! 
+[2023-02-24 12:27:49,817][11215] Updated weights for policy 0, policy_version 640 (0.0027) +[2023-02-24 12:27:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2633728. Throughput: 0: 899.0. Samples: 658552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:52,874][00205] Avg episode reward: [(0, '18.641')] +[2023-02-24 12:27:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2646016. Throughput: 0: 879.2. Samples: 660816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:57,874][00205] Avg episode reward: [(0, '18.167')] +[2023-02-24 12:28:02,866][11215] Updated weights for policy 0, policy_version 650 (0.0028) +[2023-02-24 12:28:02,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2662400. Throughput: 0: 853.8. Samples: 664926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:02,879][00205] Avg episode reward: [(0, '18.468')] +[2023-02-24 12:28:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2682880. Throughput: 0: 893.1. Samples: 670852. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:07,879][00205] Avg episode reward: [(0, '19.198')] +[2023-02-24 12:28:12,188][11215] Updated weights for policy 0, policy_version 660 (0.0021) +[2023-02-24 12:28:12,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 2703360. Throughput: 0: 901.2. Samples: 674086. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:28:12,883][00205] Avg episode reward: [(0, '19.262')] +[2023-02-24 12:28:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2715648. Throughput: 0: 869.5. Samples: 679188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:28:17,873][00205] Avg episode reward: [(0, '19.826')] +[2023-02-24 12:28:17,896][11201] Saving new best policy, reward=19.826! +[2023-02-24 12:28:22,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2732032. Throughput: 0: 857.9. Samples: 683262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:28:22,873][00205] Avg episode reward: [(0, '20.419')] +[2023-02-24 12:28:22,875][11201] Saving new best policy, reward=20.419! +[2023-02-24 12:28:25,364][11215] Updated weights for policy 0, policy_version 670 (0.0015) +[2023-02-24 12:28:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2752512. Throughput: 0: 883.5. Samples: 686364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:28:27,872][00205] Avg episode reward: [(0, '19.222')] +[2023-02-24 12:28:32,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 2772992. Throughput: 0: 906.4. Samples: 692878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:28:32,879][00205] Avg episode reward: [(0, '20.182')] +[2023-02-24 12:28:36,282][11215] Updated weights for policy 0, policy_version 680 (0.0027) +[2023-02-24 12:28:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2789376. Throughput: 0: 864.6. Samples: 697458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:28:37,873][00205] Avg episode reward: [(0, '19.719')] +[2023-02-24 12:28:42,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2801664. Throughput: 0: 858.8. Samples: 699460. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:28:42,878][00205] Avg episode reward: [(0, '18.783')] +[2023-02-24 12:28:47,846][11215] Updated weights for policy 0, policy_version 690 (0.0014) +[2023-02-24 12:28:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2826240. Throughput: 0: 895.8. Samples: 705238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:28:47,872][00205] Avg episode reward: [(0, '17.900')] +[2023-02-24 12:28:47,882][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000690_2826240.pth... +[2023-02-24 12:28:47,997][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000485_1986560.pth +[2023-02-24 12:28:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2842624. Throughput: 0: 907.5. Samples: 711688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:52,874][00205] Avg episode reward: [(0, '16.623')] +[2023-02-24 12:28:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2859008. Throughput: 0: 878.2. Samples: 713602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:57,873][00205] Avg episode reward: [(0, '16.645')] +[2023-02-24 12:29:00,608][11215] Updated weights for policy 0, policy_version 700 (0.0028) +[2023-02-24 12:29:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2871296. Throughput: 0: 855.0. Samples: 717664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:02,876][00205] Avg episode reward: [(0, '17.588')] +[2023-02-24 12:29:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2891776. Throughput: 0: 897.9. Samples: 723666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:07,873][00205] Avg episode reward: [(0, '17.711')] +[2023-02-24 12:29:10,964][11215] Updated weights for policy 0, policy_version 710 (0.0037) +[2023-02-24 12:29:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3471.2). Total num frames: 2912256. Throughput: 0: 898.6. Samples: 726800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:29:12,876][00205] Avg episode reward: [(0, '18.661')] +[2023-02-24 12:29:17,875][00205] Fps is (10 sec: 3275.0, 60 sec: 3481.3, 300 sec: 3471.1). Total num frames: 2924544. Throughput: 0: 857.9. Samples: 731486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:17,878][00205] Avg episode reward: [(0, '19.159')] +[2023-02-24 12:29:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2940928. Throughput: 0: 854.0. Samples: 735890. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:29:22,876][00205] Avg episode reward: [(0, '19.729')] +[2023-02-24 12:29:23,915][11215] Updated weights for policy 0, policy_version 720 (0.0026) +[2023-02-24 12:29:27,870][00205] Fps is (10 sec: 4098.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2965504. Throughput: 0: 880.8. Samples: 739098. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:29:27,873][00205] Avg episode reward: [(0, '19.640')] +[2023-02-24 12:29:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 2981888. Throughput: 0: 898.1. Samples: 745654. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:29:32,876][00205] Avg episode reward: [(0, '20.169')] +[2023-02-24 12:29:34,526][11215] Updated weights for policy 0, policy_version 730 (0.0015) +[2023-02-24 12:29:37,872][00205] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3498.9). Total num frames: 2998272. Throughput: 0: 850.3. Samples: 749954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:29:37,875][00205] Avg episode reward: [(0, '20.339')] +[2023-02-24 12:29:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3014656. Throughput: 0: 852.5. Samples: 751966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:29:42,873][00205] Avg episode reward: [(0, '21.573')] +[2023-02-24 12:29:42,881][11201] Saving new best policy, reward=21.573! +[2023-02-24 12:29:46,581][11215] Updated weights for policy 0, policy_version 740 (0.0023) +[2023-02-24 12:29:47,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3035136. Throughput: 0: 892.8. Samples: 757842. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:29:47,872][00205] Avg episode reward: [(0, '22.027')] +[2023-02-24 12:29:47,886][11201] Saving new best policy, reward=22.027! +[2023-02-24 12:29:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3051520. Throughput: 0: 895.1. Samples: 763946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:52,875][00205] Avg episode reward: [(0, '21.699')] +[2023-02-24 12:29:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3067904. Throughput: 0: 870.3. Samples: 765962. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:29:57,874][00205] Avg episode reward: [(0, '22.312')] +[2023-02-24 12:29:57,886][11201] Saving new best policy, reward=22.312! +[2023-02-24 12:29:59,122][11215] Updated weights for policy 0, policy_version 750 (0.0020) +[2023-02-24 12:30:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3084288. Throughput: 0: 853.4. Samples: 769884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:30:02,873][00205] Avg episode reward: [(0, '21.738')] +[2023-02-24 12:30:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3104768. Throughput: 0: 899.1. Samples: 776350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:30:07,877][00205] Avg episode reward: [(0, '22.720')] +[2023-02-24 12:30:07,891][11201] Saving new best policy, reward=22.720! +[2023-02-24 12:30:09,415][11215] Updated weights for policy 0, policy_version 760 (0.0014) +[2023-02-24 12:30:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3121152. Throughput: 0: 898.2. Samples: 779516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:12,873][00205] Avg episode reward: [(0, '22.461')] +[2023-02-24 12:30:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.2, 300 sec: 3499.0). Total num frames: 3137536. Throughput: 0: 852.9. Samples: 784036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:17,876][00205] Avg episode reward: [(0, '22.451')] +[2023-02-24 12:30:22,406][11215] Updated weights for policy 0, policy_version 770 (0.0017) +[2023-02-24 12:30:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3153920. Throughput: 0: 862.7. Samples: 788772. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:30:22,872][00205] Avg episode reward: [(0, '21.952')] +[2023-02-24 12:30:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3178496. Throughput: 0: 891.9. Samples: 792100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:30:27,879][00205] Avg episode reward: [(0, '21.292')] +[2023-02-24 12:30:31,449][11215] Updated weights for policy 0, policy_version 780 (0.0014) +[2023-02-24 12:30:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 3194880. Throughput: 0: 917.6. Samples: 799132. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:30:32,876][00205] Avg episode reward: [(0, '20.844')] +[2023-02-24 12:30:37,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 3211264. Throughput: 0: 869.6. Samples: 803078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:37,878][00205] Avg episode reward: [(0, '20.241')] +[2023-02-24 12:30:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3227648. Throughput: 0: 871.1. Samples: 805160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:30:42,872][00205] Avg episode reward: [(0, '21.477')] +[2023-02-24 12:30:44,598][11215] Updated weights for policy 0, policy_version 790 (0.0023) +[2023-02-24 12:30:47,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3248128. Throughput: 0: 921.2. Samples: 811336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:30:47,873][00205] Avg episode reward: [(0, '22.575')] +[2023-02-24 12:30:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000793_3248128.pth... +[2023-02-24 12:30:48,009][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000586_2400256.pth +[2023-02-24 12:30:52,875][00205] Fps is (10 sec: 4093.7, 60 sec: 3617.8, 300 sec: 3526.7). Total num frames: 3268608. Throughput: 0: 910.7. Samples: 817338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:52,885][00205] Avg episode reward: [(0, '22.106')] +[2023-02-24 12:30:55,704][11215] Updated weights for policy 0, policy_version 800 (0.0029) +[2023-02-24 12:30:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3280896. Throughput: 0: 886.8. Samples: 819424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:30:57,874][00205] Avg episode reward: [(0, '22.843')] +[2023-02-24 12:30:57,885][11201] Saving new best policy, reward=22.843! +[2023-02-24 12:31:02,870][00205] Fps is (10 sec: 2868.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3297280. Throughput: 0: 879.2. Samples: 823602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:02,872][00205] Avg episode reward: [(0, '24.269')] +[2023-02-24 12:31:02,882][11201] Saving new best policy, reward=24.269! +[2023-02-24 12:31:07,107][11215] Updated weights for policy 0, policy_version 810 (0.0023) +[2023-02-24 12:31:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3549.8, 300 sec: 3499.0). Total num frames: 3317760. Throughput: 0: 918.7. Samples: 830114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:07,873][00205] Avg episode reward: [(0, '25.845')] +[2023-02-24 12:31:07,886][11201] Saving new best policy, reward=25.845! +[2023-02-24 12:31:12,874][00205] Fps is (10 sec: 4094.2, 60 sec: 3617.9, 300 sec: 3526.7). Total num frames: 3338240. 
Throughput: 0: 915.9. Samples: 833320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:12,877][00205] Avg episode reward: [(0, '24.641')] +[2023-02-24 12:31:17,872][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3350528. Throughput: 0: 857.0. Samples: 837698. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:31:17,877][00205] Avg episode reward: [(0, '24.794')] +[2023-02-24 12:31:19,799][11215] Updated weights for policy 0, policy_version 820 (0.0014) +[2023-02-24 12:31:22,870][00205] Fps is (10 sec: 3278.2, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 3371008. Throughput: 0: 879.1. Samples: 842638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:31:22,881][00205] Avg episode reward: [(0, '25.251')] +[2023-02-24 12:31:27,870][00205] Fps is (10 sec: 4096.6, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3391488. Throughput: 0: 905.2. Samples: 845896. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:31:27,873][00205] Avg episode reward: [(0, '25.312')] +[2023-02-24 12:31:29,403][11215] Updated weights for policy 0, policy_version 830 (0.0035) +[2023-02-24 12:31:32,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3407872. Throughput: 0: 905.1. Samples: 852064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:32,872][00205] Avg episode reward: [(0, '25.504')] +[2023-02-24 12:31:37,872][00205] Fps is (10 sec: 3276.2, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 3424256. Throughput: 0: 864.5. Samples: 856236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:37,875][00205] Avg episode reward: [(0, '24.761')] +[2023-02-24 12:31:42,707][11215] Updated weights for policy 0, policy_version 840 (0.0026) +[2023-02-24 12:31:42,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3440640. Throughput: 0: 862.1. Samples: 858220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:31:42,875][00205] Avg episode reward: [(0, '25.354')] +[2023-02-24 12:31:47,870][00205] Fps is (10 sec: 3687.3, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3461120. Throughput: 0: 913.2. Samples: 864694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:47,876][00205] Avg episode reward: [(0, '24.708')] +[2023-02-24 12:31:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.9, 300 sec: 3526.7). Total num frames: 3477504. Throughput: 0: 897.5. Samples: 870502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:52,873][00205] Avg episode reward: [(0, '24.505')] +[2023-02-24 12:31:52,901][11215] Updated weights for policy 0, policy_version 850 (0.0019) +[2023-02-24 12:31:57,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 3493888. Throughput: 0: 871.6. Samples: 872538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:57,877][00205] Avg episode reward: [(0, '24.874')] +[2023-02-24 12:32:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3510272. Throughput: 0: 877.1. Samples: 877168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:32:02,872][00205] Avg episode reward: [(0, '23.638')] +[2023-02-24 12:32:04,697][11215] Updated weights for policy 0, policy_version 860 (0.0038) +[2023-02-24 12:32:07,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3526.7). Total num frames: 3534848. Throughput: 0: 915.9. Samples: 883856. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:32:07,875][00205] Avg episode reward: [(0, '23.967')] +[2023-02-24 12:32:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 3551232. Throughput: 0: 912.1. Samples: 886942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:32:12,874][00205] Avg episode reward: [(0, '21.795')] +[2023-02-24 12:32:16,744][11215] Updated weights for policy 0, policy_version 870 (0.0022) +[2023-02-24 12:32:17,872][00205] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3563520. Throughput: 0: 866.7. Samples: 891068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:32:17,876][00205] Avg episode reward: [(0, '21.779')] +[2023-02-24 12:32:22,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3584000. Throughput: 0: 887.8. Samples: 896186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:22,875][00205] Avg episode reward: [(0, '22.045')] +[2023-02-24 12:32:27,234][11215] Updated weights for policy 0, policy_version 880 (0.0022) +[2023-02-24 12:32:27,870][00205] Fps is (10 sec: 4096.7, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3604480. Throughput: 0: 915.7. Samples: 899428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:27,875][00205] Avg episode reward: [(0, '21.926')] +[2023-02-24 12:32:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3620864. Throughput: 0: 907.9. Samples: 905550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:32:32,875][00205] Avg episode reward: [(0, '21.742')] +[2023-02-24 12:32:37,872][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 3637248. Throughput: 0: 870.2. Samples: 909660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:37,874][00205] Avg episode reward: [(0, '21.922')] +[2023-02-24 12:32:40,249][11215] Updated weights for policy 0, policy_version 890 (0.0026) +[2023-02-24 12:32:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3653632. Throughput: 0: 873.9. Samples: 911862. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:32:42,878][00205] Avg episode reward: [(0, '22.072')] +[2023-02-24 12:32:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3678208. Throughput: 0: 918.6. Samples: 918504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:47,873][00205] Avg episode reward: [(0, '21.527')] +[2023-02-24 12:32:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000898_3678208.pth... +[2023-02-24 12:32:48,000][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000690_2826240.pth +[2023-02-24 12:32:49,504][11215] Updated weights for policy 0, policy_version 900 (0.0014) +[2023-02-24 12:32:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3694592. Throughput: 0: 894.0. Samples: 924084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:52,876][00205] Avg episode reward: [(0, '20.238')] +[2023-02-24 12:32:57,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3706880. Throughput: 0: 871.4. Samples: 926154. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:57,874][00205] Avg episode reward: [(0, '20.008')] +[2023-02-24 12:33:02,222][11215] Updated weights for policy 0, policy_version 910 (0.0020) +[2023-02-24 12:33:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3727360. Throughput: 0: 889.2. Samples: 931082. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:33:02,878][00205] Avg episode reward: [(0, '20.590')] +[2023-02-24 12:33:07,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 3751936. Throughput: 0: 922.6. Samples: 937702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:33:07,872][00205] Avg episode reward: [(0, '20.570')] +[2023-02-24 12:33:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3764224. Throughput: 0: 914.8. Samples: 940592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:33:12,880][00205] Avg episode reward: [(0, '21.241')] +[2023-02-24 12:33:13,123][11215] Updated weights for policy 0, policy_version 920 (0.0015) +[2023-02-24 12:33:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 3780608. Throughput: 0: 871.6. Samples: 944770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:33:17,875][00205] Avg episode reward: [(0, '20.790')] +[2023-02-24 12:33:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3801088. Throughput: 0: 901.6. Samples: 950230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:33:22,877][00205] Avg episode reward: [(0, '23.611')] +[2023-02-24 12:33:24,717][11215] Updated weights for policy 0, policy_version 930 (0.0038) +[2023-02-24 12:33:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3821568. Throughput: 0: 924.8. Samples: 953480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:33:27,872][00205] Avg episode reward: [(0, '23.023')] +[2023-02-24 12:33:32,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3837952. Throughput: 0: 907.0. Samples: 959318. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:32,873][00205] Avg episode reward: [(0, '22.371')] +[2023-02-24 12:33:36,764][11215] Updated weights for policy 0, policy_version 940 (0.0026) +[2023-02-24 12:33:37,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 3850240. Throughput: 0: 874.8. Samples: 963452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:33:37,873][00205] Avg episode reward: [(0, '23.057')] +[2023-02-24 12:33:42,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3870720. Throughput: 0: 883.0. Samples: 965890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:42,872][00205] Avg episode reward: [(0, '23.216')] +[2023-02-24 12:33:46,999][11215] Updated weights for policy 0, policy_version 950 (0.0012) +[2023-02-24 12:33:47,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3891200. Throughput: 0: 919.3. Samples: 972450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:47,872][00205] Avg episode reward: [(0, '21.615')] +[2023-02-24 12:33:52,874][00205] Fps is (10 sec: 3684.7, 60 sec: 3549.6, 300 sec: 3554.4). Total num frames: 3907584. Throughput: 0: 889.4. Samples: 977728. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:52,876][00205] Avg episode reward: [(0, '22.134')] +[2023-02-24 12:33:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 3923968. Throughput: 0: 870.5. Samples: 979764. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:57,875][00205] Avg episode reward: [(0, '22.780')] +[2023-02-24 12:34:00,073][11215] Updated weights for policy 0, policy_version 960 (0.0019) +[2023-02-24 12:34:02,870][00205] Fps is (10 sec: 3688.1, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 3944448. Throughput: 0: 892.1. Samples: 984916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:02,876][00205] Avg episode reward: [(0, '22.696')] +[2023-02-24 12:34:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3964928. Throughput: 0: 920.1. Samples: 991634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:34:07,872][00205] Avg episode reward: [(0, '22.213')] +[2023-02-24 12:34:09,599][11215] Updated weights for policy 0, policy_version 970 (0.0014) +[2023-02-24 12:34:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 3981312. Throughput: 0: 907.9. Samples: 994334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:12,875][00205] Avg episode reward: [(0, '22.002')] +[2023-02-24 12:34:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3993600. Throughput: 0: 869.8. Samples: 998460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:34:17,876][00205] Avg episode reward: [(0, '21.600')] +[2023-02-24 12:34:22,322][11215] Updated weights for policy 0, policy_version 980 (0.0019) +[2023-02-24 12:34:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4014080. Throughput: 0: 903.1. Samples: 1004092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:22,876][00205] Avg episode reward: [(0, '24.195')] +[2023-02-24 12:34:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4038656. Throughput: 0: 920.5. Samples: 1007312. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:34:27,878][00205] Avg episode reward: [(0, '23.973')] +[2023-02-24 12:34:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4050944. Throughput: 0: 899.2. Samples: 1012912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:32,876][00205] Avg episode reward: [(0, '23.066')] +[2023-02-24 12:34:33,331][11215] Updated weights for policy 0, policy_version 990 (0.0013) +[2023-02-24 12:34:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 4067328. Throughput: 0: 874.4. Samples: 1017072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:37,876][00205] Avg episode reward: [(0, '25.073')] +[2023-02-24 12:34:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4087808. Throughput: 0: 890.0. Samples: 1019816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:42,873][00205] Avg episode reward: [(0, '26.109')] +[2023-02-24 12:34:42,875][11201] Saving new best policy, reward=26.109! +[2023-02-24 12:34:44,417][11215] Updated weights for policy 0, policy_version 1000 (0.0017) +[2023-02-24 12:34:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4108288. Throughput: 0: 918.4. Samples: 1026242. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:34:47,873][00205] Avg episode reward: [(0, '26.954')] +[2023-02-24 12:34:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_4108288.pth... +[2023-02-24 12:34:48,050][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000793_3248128.pth +[2023-02-24 12:34:48,061][11201] Saving new best policy, reward=26.954! +[2023-02-24 12:34:52,870][00205] Fps is (10 sec: 3276.6, 60 sec: 3550.1, 300 sec: 3568.4). Total num frames: 4120576. Throughput: 0: 873.1. Samples: 1030926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:52,873][00205] Avg episode reward: [(0, '26.611')] +[2023-02-24 12:34:57,874][00205] Fps is (10 sec: 2456.6, 60 sec: 3481.3, 300 sec: 3554.4). Total num frames: 4132864. Throughput: 0: 857.2. Samples: 1032910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:57,877][00205] Avg episode reward: [(0, '26.001')] +[2023-02-24 12:34:57,910][11215] Updated weights for policy 0, policy_version 1010 (0.0021) +[2023-02-24 12:35:02,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4153344. Throughput: 0: 877.5. Samples: 1037946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:02,873][00205] Avg episode reward: [(0, '26.890')] +[2023-02-24 12:35:07,732][11215] Updated weights for policy 0, policy_version 1020 (0.0026) +[2023-02-24 12:35:07,870][00205] Fps is (10 sec: 4507.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4177920. Throughput: 0: 898.0. Samples: 1044500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:07,872][00205] Avg episode reward: [(0, '25.811')] +[2023-02-24 12:35:12,873][00205] Fps is (10 sec: 3685.1, 60 sec: 3481.4, 300 sec: 3568.3). Total num frames: 4190208. Throughput: 0: 880.8. Samples: 1046952. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:35:12,883][00205] Avg episode reward: [(0, '26.692')] +[2023-02-24 12:35:17,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 4206592. Throughput: 0: 847.7. Samples: 1051058. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:35:17,874][00205] Avg episode reward: [(0, '25.030')] +[2023-02-24 12:35:20,544][11215] Updated weights for policy 0, policy_version 1030 (0.0021) +[2023-02-24 12:35:22,870][00205] Fps is (10 sec: 3687.7, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4227072. Throughput: 0: 888.2. Samples: 1057042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:22,872][00205] Avg episode reward: [(0, '24.995')] +[2023-02-24 12:35:27,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 4247552. Throughput: 0: 898.8. Samples: 1060262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:27,876][00205] Avg episode reward: [(0, '23.880')] +[2023-02-24 12:35:30,855][11215] Updated weights for policy 0, policy_version 1040 (0.0025) +[2023-02-24 12:35:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4263936. Throughput: 0: 874.1. Samples: 1065578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:32,876][00205] Avg episode reward: [(0, '24.515')] +[2023-02-24 12:35:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4276224. Throughput: 0: 863.2. Samples: 1069770. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:37,873][00205] Avg episode reward: [(0, '24.452')] +[2023-02-24 12:35:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4296704. Throughput: 0: 883.5. Samples: 1072664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:35:42,873][00205] Avg episode reward: [(0, '23.207')] +[2023-02-24 12:35:43,053][11215] Updated weights for policy 0, policy_version 1050 (0.0015) +[2023-02-24 12:35:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4321280. Throughput: 0: 917.2. Samples: 1079218. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:35:47,873][00205] Avg episode reward: [(0, '23.072')] +[2023-02-24 12:35:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4333568. Throughput: 0: 878.6. Samples: 1084036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:52,873][00205] Avg episode reward: [(0, '23.314')] +[2023-02-24 12:35:54,833][11215] Updated weights for policy 0, policy_version 1060 (0.0015) +[2023-02-24 12:35:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.4, 300 sec: 3568.4). Total num frames: 4349952. Throughput: 0: 870.1. Samples: 1086102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:57,879][00205] Avg episode reward: [(0, '22.996')] +[2023-02-24 12:36:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4370432. Throughput: 0: 904.5. Samples: 1091762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:36:02,877][00205] Avg episode reward: [(0, '23.478')] +[2023-02-24 12:36:05,185][11215] Updated weights for policy 0, policy_version 1070 (0.0016) +[2023-02-24 12:36:07,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3568.4). Total num frames: 4390912. Throughput: 0: 917.2. Samples: 1098320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:36:07,879][00205] Avg episode reward: [(0, '23.825')] +[2023-02-24 12:36:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3582.3). Total num frames: 4407296. Throughput: 0: 896.0. Samples: 1100582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:12,877][00205] Avg episode reward: [(0, '24.060')] +[2023-02-24 12:36:17,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4419584. Throughput: 0: 871.6. Samples: 1104800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:17,873][00205] Avg episode reward: [(0, '25.036')] +[2023-02-24 12:36:18,075][11215] Updated weights for policy 0, policy_version 1080 (0.0020) +[2023-02-24 12:36:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4440064. Throughput: 0: 911.4. Samples: 1110784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:36:22,875][00205] Avg episode reward: [(0, '24.692')] +[2023-02-24 12:36:27,528][11215] Updated weights for policy 0, policy_version 1090 (0.0016) +[2023-02-24 12:36:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4464640. Throughput: 0: 919.2. Samples: 1114026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:27,873][00205] Avg episode reward: [(0, '25.739')] +[2023-02-24 12:36:32,871][00205] Fps is (10 sec: 3686.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 4476928. Throughput: 0: 888.7. Samples: 1119210. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:32,875][00205] Avg episode reward: [(0, '25.352')] +[2023-02-24 12:36:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4493312. Throughput: 0: 872.7. Samples: 1123308. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:36:37,879][00205] Avg episode reward: [(0, '25.245')] +[2023-02-24 12:36:40,517][11215] Updated weights for policy 0, policy_version 1100 (0.0023) +[2023-02-24 12:36:42,870][00205] Fps is (10 sec: 3686.7, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4513792. Throughput: 0: 895.7. Samples: 1126410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:36:42,872][00205] Avg episode reward: [(0, '24.665')] +[2023-02-24 12:36:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4534272. Throughput: 0: 913.9. Samples: 1132888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:36:47,879][00205] Avg episode reward: [(0, '23.801')] +[2023-02-24 12:36:47,902][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001107_4534272.pth... +[2023-02-24 12:36:48,048][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000898_3678208.pth +[2023-02-24 12:36:51,888][11215] Updated weights for policy 0, policy_version 1110 (0.0016) +[2023-02-24 12:36:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4546560. Throughput: 0: 868.5. Samples: 1137400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:36:52,874][00205] Avg episode reward: [(0, '23.689')] +[2023-02-24 12:36:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4562944. Throughput: 0: 862.9. Samples: 1139412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:36:57,874][00205] Avg episode reward: [(0, '24.247')] +[2023-02-24 12:37:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4583424. Throughput: 0: 898.9. Samples: 1145252. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:37:02,873][00205] Avg episode reward: [(0, '22.963')] +[2023-02-24 12:37:03,079][11215] Updated weights for policy 0, policy_version 1120 (0.0016) +[2023-02-24 12:37:07,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 4603904. Throughput: 0: 912.9. Samples: 1151864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:07,876][00205] Avg episode reward: [(0, '22.026')] +[2023-02-24 12:37:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4620288. Throughput: 0: 886.8. Samples: 1153932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:37:12,876][00205] Avg episode reward: [(0, '22.896')] +[2023-02-24 12:37:15,580][11215] Updated weights for policy 0, policy_version 1130 (0.0020) +[2023-02-24 12:37:17,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4632576. Throughput: 0: 862.9. Samples: 1158038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:17,873][00205] Avg episode reward: [(0, '24.056')] +[2023-02-24 12:37:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4657152. Throughput: 0: 908.4. Samples: 1164188. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:22,873][00205] Avg episode reward: [(0, '25.627')] +[2023-02-24 12:37:25,359][11215] Updated weights for policy 0, policy_version 1140 (0.0020) +[2023-02-24 12:37:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4677632. Throughput: 0: 912.5. Samples: 1167472. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:37:27,872][00205] Avg episode reward: [(0, '24.836')] +[2023-02-24 12:37:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4689920. Throughput: 0: 877.6. Samples: 1172380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:32,873][00205] Avg episode reward: [(0, '25.505')] +[2023-02-24 12:37:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4706304. Throughput: 0: 871.8. Samples: 1176632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:37,872][00205] Avg episode reward: [(0, '27.100')] +[2023-02-24 12:37:37,897][11201] Saving new best policy, reward=27.100! +[2023-02-24 12:37:38,645][11215] Updated weights for policy 0, policy_version 1150 (0.0031) +[2023-02-24 12:37:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4726784. Throughput: 0: 897.6. Samples: 1179804. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:37:42,873][00205] Avg episode reward: [(0, '26.473')] +[2023-02-24 12:37:47,873][00205] Fps is (10 sec: 4094.6, 60 sec: 3549.7, 300 sec: 3568.3). Total num frames: 4747264. Throughput: 0: 911.5. Samples: 1186272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:37:47,876][00205] Avg episode reward: [(0, '27.358')] +[2023-02-24 12:37:47,884][11201] Saving new best policy, reward=27.358! +[2023-02-24 12:37:48,971][11215] Updated weights for policy 0, policy_version 1160 (0.0013) +[2023-02-24 12:37:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4759552. Throughput: 0: 859.2. Samples: 1190528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:37:52,872][00205] Avg episode reward: [(0, '26.911')] +[2023-02-24 12:37:57,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4775936. Throughput: 0: 858.6. Samples: 1192568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:37:57,872][00205] Avg episode reward: [(0, '26.721')] +[2023-02-24 12:38:01,317][11215] Updated weights for policy 0, policy_version 1170 (0.0011) +[2023-02-24 12:38:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4796416. Throughput: 0: 901.9. Samples: 1198624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:38:02,872][00205] Avg episode reward: [(0, '25.721')] +[2023-02-24 12:38:07,870][00205] Fps is (10 sec: 4095.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4816896. Throughput: 0: 904.7. Samples: 1204900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:38:07,880][00205] Avg episode reward: [(0, '25.433')] +[2023-02-24 12:38:12,859][11215] Updated weights for policy 0, policy_version 1180 (0.0023) +[2023-02-24 12:38:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4833280. Throughput: 0: 878.1. Samples: 1206988. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:38:12,879][00205] Avg episode reward: [(0, '25.316')] +[2023-02-24 12:38:17,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4845568. Throughput: 0: 862.8. Samples: 1211208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:38:17,872][00205] Avg episode reward: [(0, '26.604')] +[2023-02-24 12:38:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4870144. Throughput: 0: 908.6. Samples: 1217520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:38:22,873][00205] Avg episode reward: [(0, '24.923')] +[2023-02-24 12:38:23,700][11215] Updated weights for policy 0, policy_version 1190 (0.0028) +[2023-02-24 12:38:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4890624. Throughput: 0: 911.3. Samples: 1220812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:38:27,879][00205] Avg episode reward: [(0, '26.016')] +[2023-02-24 12:38:32,877][00205] Fps is (10 sec: 3274.4, 60 sec: 3549.4, 300 sec: 3568.3). Total num frames: 4902912. Throughput: 0: 873.1. Samples: 1225564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:32,884][00205] Avg episode reward: [(0, '25.041')] +[2023-02-24 12:38:36,551][11215] Updated weights for policy 0, policy_version 1200 (0.0014) +[2023-02-24 12:38:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4919296. Throughput: 0: 879.7. Samples: 1230116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:38:37,872][00205] Avg episode reward: [(0, '25.457')] +[2023-02-24 12:38:42,870][00205] Fps is (10 sec: 3689.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4939776. Throughput: 0: 908.0. Samples: 1233430. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:42,878][00205] Avg episode reward: [(0, '26.832')] +[2023-02-24 12:38:45,830][11215] Updated weights for policy 0, policy_version 1210 (0.0022) +[2023-02-24 12:38:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 4960256. Throughput: 0: 919.5. Samples: 1240002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:47,874][00205] Avg episode reward: [(0, '26.514')] +[2023-02-24 12:38:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001211_4960256.pth... +[2023-02-24 12:38:48,096][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_4108288.pth +[2023-02-24 12:38:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4972544. Throughput: 0: 868.1. Samples: 1243964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:52,882][00205] Avg episode reward: [(0, '27.554')] +[2023-02-24 12:38:52,891][11201] Saving new best policy, reward=27.554! +[2023-02-24 12:38:57,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4988928. Throughput: 0: 866.5. Samples: 1245980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:57,872][00205] Avg episode reward: [(0, '27.820')] +[2023-02-24 12:38:57,889][11201] Saving new best policy, reward=27.820! +[2023-02-24 12:38:59,147][11215] Updated weights for policy 0, policy_version 1220 (0.0026) +[2023-02-24 12:39:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 5013504. Throughput: 0: 906.9. Samples: 1252020. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:02,875][00205] Avg episode reward: [(0, '27.834')] +[2023-02-24 12:39:02,880][11201] Saving new best policy, reward=27.834! +[2023-02-24 12:39:07,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5029888. Throughput: 0: 901.1. Samples: 1258070. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:07,880][00205] Avg episode reward: [(0, '27.955')] +[2023-02-24 12:39:07,897][11201] Saving new best policy, reward=27.955! +[2023-02-24 12:39:10,087][11215] Updated weights for policy 0, policy_version 1230 (0.0019) +[2023-02-24 12:39:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5042176. Throughput: 0: 871.9. Samples: 1260046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:12,875][00205] Avg episode reward: [(0, '26.738')] +[2023-02-24 12:39:17,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5058560. Throughput: 0: 860.2. Samples: 1264266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:17,873][00205] Avg episode reward: [(0, '26.010')] +[2023-02-24 12:39:21,693][11215] Updated weights for policy 0, policy_version 1240 (0.0025) +[2023-02-24 12:39:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5083136. Throughput: 0: 900.9. Samples: 1270656. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:22,873][00205] Avg episode reward: [(0, '25.109')] +[2023-02-24 12:39:27,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5099520. Throughput: 0: 899.9. Samples: 1273924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:27,873][00205] Avg episode reward: [(0, '23.278')] +[2023-02-24 12:39:32,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3550.3, 300 sec: 3554.5). Total num frames: 5115904. Throughput: 0: 853.5. Samples: 1278410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:32,875][00205] Avg episode reward: [(0, '23.698')] +[2023-02-24 12:39:34,002][11215] Updated weights for policy 0, policy_version 1250 (0.0014) +[2023-02-24 12:39:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5132288. Throughput: 0: 864.5. Samples: 1282866. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:39:37,872][00205] Avg episode reward: [(0, '23.343')] +[2023-02-24 12:39:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5152768. Throughput: 0: 891.0. Samples: 1286074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:39:42,872][00205] Avg episode reward: [(0, '23.898')] +[2023-02-24 12:39:44,431][11215] Updated weights for policy 0, policy_version 1260 (0.0019) +[2023-02-24 12:39:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5169152. Throughput: 0: 905.9. Samples: 1292786. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:39:47,877][00205] Avg episode reward: [(0, '26.005')] +[2023-02-24 12:39:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5185536. Throughput: 0: 860.8. Samples: 1296806. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:52,873][00205] Avg episode reward: [(0, '25.523')] +[2023-02-24 12:39:57,597][11215] Updated weights for policy 0, policy_version 1270 (0.0011) +[2023-02-24 12:39:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5201920. Throughput: 0: 864.5. Samples: 1298948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:39:57,875][00205] Avg episode reward: [(0, '26.334')] +[2023-02-24 12:40:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5222400. Throughput: 0: 900.7. Samples: 1304796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:02,873][00205] Avg episode reward: [(0, '27.262')] +[2023-02-24 12:40:07,703][11215] Updated weights for policy 0, policy_version 1280 (0.0016) +[2023-02-24 12:40:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 5242880. Throughput: 0: 895.7. Samples: 1310962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:40:07,879][00205] Avg episode reward: [(0, '27.918')] +[2023-02-24 12:40:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5255168. Throughput: 0: 867.7. Samples: 1312972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:12,873][00205] Avg episode reward: [(0, '27.645')] +[2023-02-24 12:40:17,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 5271552. Throughput: 0: 861.7. Samples: 1317186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:17,875][00205] Avg episode reward: [(0, '28.400')] +[2023-02-24 12:40:17,889][11201] Saving new best policy, reward=28.400! +[2023-02-24 12:40:20,215][11215] Updated weights for policy 0, policy_version 1290 (0.0030) +[2023-02-24 12:40:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5292032. Throughput: 0: 904.2. Samples: 1323556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:22,883][00205] Avg episode reward: [(0, '27.920')] +[2023-02-24 12:40:27,871][00205] Fps is (10 sec: 4095.6, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5312512. Throughput: 0: 906.3. Samples: 1326860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:27,874][00205] Avg episode reward: [(0, '27.805')] +[2023-02-24 12:40:31,668][11215] Updated weights for policy 0, policy_version 1300 (0.0027) +[2023-02-24 12:40:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5324800. Throughput: 0: 858.2. Samples: 1331404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:32,876][00205] Avg episode reward: [(0, '27.152')] +[2023-02-24 12:40:37,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5345280. Throughput: 0: 874.9. Samples: 1336176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:37,876][00205] Avg episode reward: [(0, '25.343')] +[2023-02-24 12:40:42,540][11215] Updated weights for policy 0, policy_version 1310 (0.0020) +[2023-02-24 12:40:42,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 5365760. Throughput: 0: 899.0. Samples: 1339404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:40:42,874][00205] Avg episode reward: [(0, '27.124')] +[2023-02-24 12:40:47,872][00205] Fps is (10 sec: 3685.7, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5382144. Throughput: 0: 911.1. 
Samples: 1345798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:47,875][00205] Avg episode reward: [(0, '25.789')] +[2023-02-24 12:40:47,889][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001314_5382144.pth... +[2023-02-24 12:40:48,035][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001107_4534272.pth +[2023-02-24 12:40:52,872][00205] Fps is (10 sec: 3277.0, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5398528. Throughput: 0: 862.0. Samples: 1349754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:52,879][00205] Avg episode reward: [(0, '24.927')] +[2023-02-24 12:40:55,663][11215] Updated weights for policy 0, policy_version 1320 (0.0021) +[2023-02-24 12:40:57,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5414912. Throughput: 0: 863.6. Samples: 1351836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:57,872][00205] Avg episode reward: [(0, '26.058')] +[2023-02-24 12:41:02,870][00205] Fps is (10 sec: 3687.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5435392. Throughput: 0: 916.1. Samples: 1358412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:02,872][00205] Avg episode reward: [(0, '27.403')] +[2023-02-24 12:41:05,026][11215] Updated weights for policy 0, policy_version 1330 (0.0015) +[2023-02-24 12:41:07,871][00205] Fps is (10 sec: 4095.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5455872. Throughput: 0: 904.6. Samples: 1364266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:07,879][00205] Avg episode reward: [(0, '27.649')] +[2023-02-24 12:41:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5468160. Throughput: 0: 876.3. Samples: 1366294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:12,873][00205] Avg episode reward: [(0, '26.789')] +[2023-02-24 12:41:17,870][00205] Fps is (10 sec: 2867.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5484544. Throughput: 0: 877.1. Samples: 1370872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:41:17,873][00205] Avg episode reward: [(0, '27.554')] +[2023-02-24 12:41:17,904][11215] Updated weights for policy 0, policy_version 1340 (0.0013) +[2023-02-24 12:41:22,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3540.6). Total num frames: 5509120. Throughput: 0: 915.6. Samples: 1377378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:22,875][00205] Avg episode reward: [(0, '28.234')] +[2023-02-24 12:41:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5525504. Throughput: 0: 917.0. Samples: 1380666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:27,875][00205] Avg episode reward: [(0, '28.472')] +[2023-02-24 12:41:27,890][11201] Saving new best policy, reward=28.472! +[2023-02-24 12:41:28,502][11215] Updated weights for policy 0, policy_version 1350 (0.0018) +[2023-02-24 12:41:32,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5537792. Throughput: 0: 863.7. Samples: 1384662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:41:32,874][00205] Avg episode reward: [(0, '27.413')] +[2023-02-24 12:41:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5558272. Throughput: 0: 887.5. Samples: 1389688. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:41:37,873][00205] Avg episode reward: [(0, '27.655')] +[2023-02-24 12:41:40,251][11215] Updated weights for policy 0, policy_version 1360 (0.0032) +[2023-02-24 12:41:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 5578752. Throughput: 0: 913.2. Samples: 1392928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:42,876][00205] Avg episode reward: [(0, '28.925')] +[2023-02-24 12:41:42,879][11201] Saving new best policy, reward=28.925! +[2023-02-24 12:41:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5595136. Throughput: 0: 897.9. Samples: 1398818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:41:47,876][00205] Avg episode reward: [(0, '29.357')] +[2023-02-24 12:41:47,888][11201] Saving new best policy, reward=29.357! +[2023-02-24 12:41:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 5607424. Throughput: 0: 856.0. Samples: 1402786. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:41:52,874][00205] Avg episode reward: [(0, '29.483')] +[2023-02-24 12:41:52,906][11201] Saving new best policy, reward=29.483! +[2023-02-24 12:41:52,911][11215] Updated weights for policy 0, policy_version 1370 (0.0017) +[2023-02-24 12:41:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5627904. Throughput: 0: 857.4. Samples: 1404878. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:41:57,872][00205] Avg episode reward: [(0, '30.130')] +[2023-02-24 12:41:57,887][11201] Saving new best policy, reward=30.130! +[2023-02-24 12:42:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5648384. Throughput: 0: 898.9. Samples: 1411324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:02,880][00205] Avg episode reward: [(0, '29.209')] +[2023-02-24 12:42:03,348][11215] Updated weights for policy 0, policy_version 1380 (0.0022) +[2023-02-24 12:42:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 5664768. Throughput: 0: 881.8. Samples: 1417056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:42:07,873][00205] Avg episode reward: [(0, '28.847')] +[2023-02-24 12:42:12,872][00205] Fps is (10 sec: 3276.1, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5681152. Throughput: 0: 854.9. Samples: 1419140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:42:12,874][00205] Avg episode reward: [(0, '28.500')] +[2023-02-24 12:42:16,061][11215] Updated weights for policy 0, policy_version 1390 (0.0028) +[2023-02-24 12:42:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 5701632. Throughput: 0: 878.1. Samples: 1424178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:42:17,878][00205] Avg episode reward: [(0, '28.080')] +[2023-02-24 12:42:22,870][00205] Fps is (10 sec: 4506.6, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 5726208. Throughput: 0: 925.9. Samples: 1431352. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:22,873][00205] Avg episode reward: [(0, '27.170')] +[2023-02-24 12:42:24,609][11215] Updated weights for policy 0, policy_version 1400 (0.0023) +[2023-02-24 12:42:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5742592. Throughput: 0: 927.2. Samples: 1434654. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:27,876][00205] Avg episode reward: [(0, '26.678')] +[2023-02-24 12:42:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5758976. Throughput: 0: 893.1. Samples: 1439008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:42:32,872][00205] Avg episode reward: [(0, '27.204')] +[2023-02-24 12:42:36,876][11215] Updated weights for policy 0, policy_version 1410 (0.0020) +[2023-02-24 12:42:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5779456. Throughput: 0: 932.0. Samples: 1444728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:42:37,872][00205] Avg episode reward: [(0, '26.727')] +[2023-02-24 12:42:42,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 5804032. Throughput: 0: 965.6. Samples: 1448332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:42:42,872][00205] Avg episode reward: [(0, '25.694')] +[2023-02-24 12:42:45,928][11215] Updated weights for policy 0, policy_version 1420 (0.0024) +[2023-02-24 12:42:47,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3596.1). Total num frames: 5820416. Throughput: 0: 965.8. Samples: 1454792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:42:47,878][00205] Avg episode reward: [(0, '25.460')] +[2023-02-24 12:42:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001421_5820416.pth... +[2023-02-24 12:42:48,041][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001211_4960256.pth +[2023-02-24 12:42:52,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 5832704. Throughput: 0: 926.7. Samples: 1458756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:42:52,875][00205] Avg episode reward: [(0, '25.142')] +[2023-02-24 12:42:57,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5849088. Throughput: 0: 930.5. Samples: 1461010. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:57,875][00205] Avg episode reward: [(0, '24.673')] +[2023-02-24 12:42:58,934][11215] Updated weights for policy 0, policy_version 1430 (0.0017) +[2023-02-24 12:43:02,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3582.3). Total num frames: 5873664. Throughput: 0: 962.7. Samples: 1467498. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:43:02,879][00205] Avg episode reward: [(0, '25.489')] +[2023-02-24 12:43:07,871][00205] Fps is (10 sec: 4095.6, 60 sec: 3754.6, 300 sec: 3582.2). Total num frames: 5890048. Throughput: 0: 927.3. Samples: 1473084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:07,878][00205] Avg episode reward: [(0, '24.572')] +[2023-02-24 12:43:10,110][11215] Updated weights for policy 0, policy_version 1440 (0.0022) +[2023-02-24 12:43:12,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3686.5, 300 sec: 3582.3). Total num frames: 5902336. Throughput: 0: 899.6. Samples: 1475136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:12,875][00205] Avg episode reward: [(0, '24.618')] +[2023-02-24 12:43:17,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5922816. Throughput: 0: 910.5. Samples: 1479980. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:43:17,877][00205] Avg episode reward: [(0, '23.787')] +[2023-02-24 12:43:21,151][11215] Updated weights for policy 0, policy_version 1450 (0.0021) +[2023-02-24 12:43:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5943296. Throughput: 0: 928.1. Samples: 1486492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:43:22,873][00205] Avg episode reward: [(0, '25.619')] +[2023-02-24 12:43:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.4). Total num frames: 5959680. Throughput: 0: 914.4. Samples: 1489480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:43:27,876][00205] Avg episode reward: [(0, '25.045')] +[2023-02-24 12:43:32,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5976064. Throughput: 0: 861.5. Samples: 1493554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:43:32,874][00205] Avg episode reward: [(0, '25.332')] +[2023-02-24 12:43:34,228][11215] Updated weights for policy 0, policy_version 1460 (0.0027) +[2023-02-24 12:43:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5996544. Throughput: 0: 893.2. Samples: 1498950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:37,872][00205] Avg episode reward: [(0, '25.772')] +[2023-02-24 12:43:42,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6017024. Throughput: 0: 914.1. Samples: 1502144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:43:42,878][00205] Avg episode reward: [(0, '25.907')] +[2023-02-24 12:43:43,586][11215] Updated weights for policy 0, policy_version 1470 (0.0016) +[2023-02-24 12:43:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3596.1). Total num frames: 6033408. Throughput: 0: 896.5. Samples: 1507838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:47,877][00205] Avg episode reward: [(0, '26.580')] +[2023-02-24 12:43:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6045696. Throughput: 0: 862.3. Samples: 1511886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:52,879][00205] Avg episode reward: [(0, '25.424')] +[2023-02-24 12:43:56,726][11215] Updated weights for policy 0, policy_version 1480 (0.0012) +[2023-02-24 12:43:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 6066176. Throughput: 0: 871.0. Samples: 1514332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:57,878][00205] Avg episode reward: [(0, '24.721')] +[2023-02-24 12:44:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6086656. Throughput: 0: 907.5. Samples: 1520818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:44:02,877][00205] Avg episode reward: [(0, '24.597')] +[2023-02-24 12:44:07,384][11215] Updated weights for policy 0, policy_version 1490 (0.0016) +[2023-02-24 12:44:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6103040. Throughput: 0: 881.4. Samples: 1526156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:44:07,873][00205] Avg episode reward: [(0, '25.599')] +[2023-02-24 12:44:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6115328. Throughput: 0: 860.8. Samples: 1528214. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:44:12,875][00205] Avg episode reward: [(0, '26.135')] +[2023-02-24 12:44:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6135808. Throughput: 0: 886.1. Samples: 1533430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:17,875][00205] Avg episode reward: [(0, '26.670')] +[2023-02-24 12:44:18,915][11215] Updated weights for policy 0, policy_version 1500 (0.0016) +[2023-02-24 12:44:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 6160384. Throughput: 0: 911.6. Samples: 1539972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:22,873][00205] Avg episode reward: [(0, '27.034')] +[2023-02-24 12:44:27,872][00205] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 6172672. Throughput: 0: 896.8. Samples: 1542502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:27,875][00205] Avg episode reward: [(0, '28.232')] +[2023-02-24 12:44:31,281][11215] Updated weights for policy 0, policy_version 1510 (0.0015) +[2023-02-24 12:44:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6189056. Throughput: 0: 862.0. Samples: 1546626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:44:32,877][00205] Avg episode reward: [(0, '28.717')] +[2023-02-24 12:44:37,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6209536. Throughput: 0: 899.2. Samples: 1552352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:37,875][00205] Avg episode reward: [(0, '29.136')] +[2023-02-24 12:44:41,437][11215] Updated weights for policy 0, policy_version 1520 (0.0015) +[2023-02-24 12:44:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6230016. Throughput: 0: 917.2. Samples: 1555608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:42,876][00205] Avg episode reward: [(0, '27.976')] +[2023-02-24 12:44:47,873][00205] Fps is (10 sec: 3685.2, 60 sec: 3549.7, 300 sec: 3596.1). Total num frames: 6246400. Throughput: 0: 893.7. Samples: 1561038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:47,877][00205] Avg episode reward: [(0, '27.555')] +[2023-02-24 12:44:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001525_6246400.pth... +[2023-02-24 12:44:48,033][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001314_5382144.pth +[2023-02-24 12:44:52,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6258688. Throughput: 0: 862.0. Samples: 1564946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:44:52,876][00205] Avg episode reward: [(0, '26.975')] +[2023-02-24 12:44:54,636][11215] Updated weights for policy 0, policy_version 1530 (0.0017) +[2023-02-24 12:44:57,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6279168. Throughput: 0: 877.6. Samples: 1567708. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:44:57,873][00205] Avg episode reward: [(0, '26.468')] +[2023-02-24 12:45:02,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6299648. Throughput: 0: 902.7. Samples: 1574052. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:45:02,878][00205] Avg episode reward: [(0, '25.282')] +[2023-02-24 12:45:05,088][11215] Updated weights for policy 0, policy_version 1540 (0.0013) +[2023-02-24 12:45:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6311936. Throughput: 0: 866.2. Samples: 1578952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:45:07,874][00205] Avg episode reward: [(0, '25.806')] +[2023-02-24 12:45:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6328320. Throughput: 0: 855.0. Samples: 1580976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:45:12,875][00205] Avg episode reward: [(0, '26.113')] +[2023-02-24 12:45:17,190][11215] Updated weights for policy 0, policy_version 1550 (0.0016) +[2023-02-24 12:45:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6348800. Throughput: 0: 888.3. Samples: 1586598. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:45:17,872][00205] Avg episode reward: [(0, '27.188')] +[2023-02-24 12:45:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6369280. Throughput: 0: 903.1. Samples: 1592990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:45:22,875][00205] Avg episode reward: [(0, '27.562')] +[2023-02-24 12:45:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 6385664. Throughput: 0: 881.2. Samples: 1595262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:27,875][00205] Avg episode reward: [(0, '27.304')] +[2023-02-24 12:45:29,203][11215] Updated weights for policy 0, policy_version 1560 (0.0017) +[2023-02-24 12:45:32,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 6397952. Throughput: 0: 851.9. Samples: 1599370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:45:32,879][00205] Avg episode reward: [(0, '27.522')] +[2023-02-24 12:45:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6422528. Throughput: 0: 898.2. Samples: 1605366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:37,872][00205] Avg episode reward: [(0, '27.148')] +[2023-02-24 12:45:39,717][11215] Updated weights for policy 0, policy_version 1570 (0.0020) +[2023-02-24 12:45:42,870][00205] Fps is (10 sec: 4505.4, 60 sec: 3549.8, 300 sec: 3596.2). Total num frames: 6443008. Throughput: 0: 909.9. Samples: 1608652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:42,873][00205] Avg episode reward: [(0, '26.756')] +[2023-02-24 12:45:47,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3481.7, 300 sec: 3582.3). Total num frames: 6455296. Throughput: 0: 879.4. Samples: 1613624. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:45:47,874][00205] Avg episode reward: [(0, '24.953')] +[2023-02-24 12:45:52,870][00205] Fps is (10 sec: 2457.7, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 6467584. Throughput: 0: 860.0. Samples: 1617654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:52,872][00205] Avg episode reward: [(0, '24.925')] +[2023-02-24 12:45:53,081][11215] Updated weights for policy 0, policy_version 1580 (0.0029) +[2023-02-24 12:45:57,870][00205] Fps is (10 sec: 3686.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6492160. Throughput: 0: 887.6. Samples: 1620918. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:45:57,875][00205] Avg episode reward: [(0, '25.196')] +[2023-02-24 12:46:02,276][11215] Updated weights for policy 0, policy_version 1590 (0.0024) +[2023-02-24 12:46:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6512640. Throughput: 0: 907.6. Samples: 1627438. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:02,873][00205] Avg episode reward: [(0, '24.389')] +[2023-02-24 12:46:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6524928. Throughput: 0: 865.2. Samples: 1631924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:46:07,873][00205] Avg episode reward: [(0, '25.683')] +[2023-02-24 12:46:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6541312. Throughput: 0: 859.7. Samples: 1633948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:46:12,874][00205] Avg episode reward: [(0, '24.758')] +[2023-02-24 12:46:15,361][11215] Updated weights for policy 0, policy_version 1600 (0.0012) +[2023-02-24 12:46:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6561792. Throughput: 0: 900.7. Samples: 1639900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:17,873][00205] Avg episode reward: [(0, '26.283')] +[2023-02-24 12:46:22,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 6582272. Throughput: 0: 908.8. Samples: 1646262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:22,874][00205] Avg episode reward: [(0, '25.931')] +[2023-02-24 12:46:26,337][11215] Updated weights for policy 0, policy_version 1610 (0.0015) +[2023-02-24 12:46:27,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6594560. Throughput: 0: 880.8. Samples: 1648286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:46:27,876][00205] Avg episode reward: [(0, '25.591')] +[2023-02-24 12:46:32,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6610944. Throughput: 0: 860.2. Samples: 1652332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:46:32,880][00205] Avg episode reward: [(0, '25.339')] +[2023-02-24 12:46:37,665][11215] Updated weights for policy 0, policy_version 1620 (0.0016) +[2023-02-24 12:46:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6635520. Throughput: 0: 914.7. Samples: 1658814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:46:37,873][00205] Avg episode reward: [(0, '25.575')] +[2023-02-24 12:46:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6651904. Throughput: 0: 915.1. Samples: 1662098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:46:42,873][00205] Avg episode reward: [(0, '25.991')] +[2023-02-24 12:46:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6668288. Throughput: 0: 871.0. Samples: 1666634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:47,877][00205] Avg episode reward: [(0, '26.007')] +[2023-02-24 12:46:47,889][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth... 
+[2023-02-24 12:46:48,098][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001421_5820416.pth +[2023-02-24 12:46:50,894][11215] Updated weights for policy 0, policy_version 1630 (0.0025) +[2023-02-24 12:46:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6680576. Throughput: 0: 870.9. Samples: 1671114. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:46:52,872][00205] Avg episode reward: [(0, '25.376')] +[2023-02-24 12:46:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6705152. Throughput: 0: 898.4. Samples: 1674378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:57,875][00205] Avg episode reward: [(0, '24.305')] +[2023-02-24 12:47:00,430][11215] Updated weights for policy 0, policy_version 1640 (0.0031) +[2023-02-24 12:47:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6721536. Throughput: 0: 909.5. Samples: 1680826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:47:02,873][00205] Avg episode reward: [(0, '24.712')] +[2023-02-24 12:47:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 6737920. Throughput: 0: 861.4. Samples: 1685024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:47:07,881][00205] Avg episode reward: [(0, '23.768')] +[2023-02-24 12:47:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 6754304. Throughput: 0: 861.9. Samples: 1687070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:12,873][00205] Avg episode reward: [(0, '25.100')] +[2023-02-24 12:47:13,240][11215] Updated weights for policy 0, policy_version 1650 (0.0024) +[2023-02-24 12:47:17,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6774784. Throughput: 0: 910.3. Samples: 1693294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:47:17,881][00205] Avg episode reward: [(0, '24.110')] +[2023-02-24 12:47:22,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 6795264. Throughput: 0: 903.2. Samples: 1699456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:47:22,872][00205] Avg episode reward: [(0, '24.172')] +[2023-02-24 12:47:23,684][11215] Updated weights for policy 0, policy_version 1660 (0.0018) +[2023-02-24 12:47:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6807552. Throughput: 0: 876.6. Samples: 1701544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:47:27,874][00205] Avg episode reward: [(0, '24.170')] +[2023-02-24 12:47:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6828032. Throughput: 0: 869.7. Samples: 1705772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:47:32,872][00205] Avg episode reward: [(0, '24.176')] +[2023-02-24 12:47:35,573][11215] Updated weights for policy 0, policy_version 1670 (0.0019) +[2023-02-24 12:47:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6848512. Throughput: 0: 919.2. Samples: 1712476. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:47:37,873][00205] Avg episode reward: [(0, '23.699')] +[2023-02-24 12:47:42,883][00205] Fps is (10 sec: 4090.5, 60 sec: 3617.3, 300 sec: 3554.4). Total num frames: 6868992. Throughput: 0: 918.7. Samples: 1715732. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:42,897][00205] Avg episode reward: [(0, '24.180')] +[2023-02-24 12:47:47,330][11215] Updated weights for policy 0, policy_version 1680 (0.0012) +[2023-02-24 12:47:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6881280. Throughput: 0: 870.4. Samples: 1719994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:47,874][00205] Avg episode reward: [(0, '25.763')] +[2023-02-24 12:47:52,870][00205] Fps is (10 sec: 2871.1, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6897664. Throughput: 0: 886.6. Samples: 1724922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:52,878][00205] Avg episode reward: [(0, '25.965')] +[2023-02-24 12:47:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6918144. Throughput: 0: 913.5. Samples: 1728176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:47:57,873][00205] Avg episode reward: [(0, '25.937')] +[2023-02-24 12:47:57,956][11215] Updated weights for policy 0, policy_version 1690 (0.0014) +[2023-02-24 12:48:02,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 6938624. Throughput: 0: 909.1. Samples: 1734204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:02,880][00205] Avg episode reward: [(0, '24.937')] +[2023-02-24 12:48:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6950912. Throughput: 0: 865.3. Samples: 1738396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:07,875][00205] Avg episode reward: [(0, '25.501')] +[2023-02-24 12:48:10,916][11215] Updated weights for policy 0, policy_version 1700 (0.0011) +[2023-02-24 12:48:12,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6971392. Throughput: 0: 868.0. Samples: 1740602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:12,876][00205] Avg episode reward: [(0, '25.679')] +[2023-02-24 12:48:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6991872. Throughput: 0: 921.3. Samples: 1747232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:48:17,873][00205] Avg episode reward: [(0, '24.206')] +[2023-02-24 12:48:20,241][11215] Updated weights for policy 0, policy_version 1710 (0.0011) +[2023-02-24 12:48:22,872][00205] Fps is (10 sec: 3685.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7008256. Throughput: 0: 896.1. Samples: 1752804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:48:22,881][00205] Avg episode reward: [(0, '23.939')] +[2023-02-24 12:48:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7024640. Throughput: 0: 870.0. Samples: 1754872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:48:27,878][00205] Avg episode reward: [(0, '25.550')] +[2023-02-24 12:48:32,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7041024. Throughput: 0: 881.2. Samples: 1759646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:32,873][00205] Avg episode reward: [(0, '26.768')] +[2023-02-24 12:48:33,291][11215] Updated weights for policy 0, policy_version 1720 (0.0019) +[2023-02-24 12:48:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7061504. Throughput: 0: 917.9. Samples: 1766228. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:48:37,876][00205] Avg episode reward: [(0, '26.012')] +[2023-02-24 12:48:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3482.4, 300 sec: 3540.6). Total num frames: 7077888. Throughput: 0: 911.4. Samples: 1769188. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:48:42,873][00205] Avg episode reward: [(0, '25.593')] +[2023-02-24 12:48:44,671][11215] Updated weights for policy 0, policy_version 1730 (0.0015) +[2023-02-24 12:48:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7094272. Throughput: 0: 868.4. Samples: 1773282. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:48:47,880][00205] Avg episode reward: [(0, '25.264')] +[2023-02-24 12:48:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001732_7094272.pth... +[2023-02-24 12:48:48,073][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001525_6246400.pth +[2023-02-24 12:48:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7110656. Throughput: 0: 892.0. Samples: 1778538. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:48:52,878][00205] Avg episode reward: [(0, '24.936')] +[2023-02-24 12:48:55,723][11215] Updated weights for policy 0, policy_version 1740 (0.0015) +[2023-02-24 12:48:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7135232. Throughput: 0: 915.3. Samples: 1781790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:48:57,873][00205] Avg episode reward: [(0, '24.912')] +[2023-02-24 12:49:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7151616. Throughput: 0: 892.8. Samples: 1787408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:02,877][00205] Avg episode reward: [(0, '25.786')] +[2023-02-24 12:49:07,871][00205] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7163904. Throughput: 0: 859.7. Samples: 1791492. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:49:07,879][00205] Avg episode reward: [(0, '25.678')] +[2023-02-24 12:49:08,364][11215] Updated weights for policy 0, policy_version 1750 (0.0021) +[2023-02-24 12:49:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7184384. Throughput: 0: 873.7. Samples: 1794190. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:49:12,873][00205] Avg episode reward: [(0, '26.490')] +[2023-02-24 12:49:17,872][00205] Fps is (10 sec: 4095.6, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 7204864. Throughput: 0: 912.6. Samples: 1800716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:17,875][00205] Avg episode reward: [(0, '26.776')] +[2023-02-24 12:49:18,189][11215] Updated weights for policy 0, policy_version 1760 (0.0019) +[2023-02-24 12:49:22,871][00205] Fps is (10 sec: 3685.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7221248. Throughput: 0: 881.0. Samples: 1805876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:22,874][00205] Avg episode reward: [(0, '25.964')] +[2023-02-24 12:49:27,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7233536. Throughput: 0: 860.6. Samples: 1807916. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:27,874][00205] Avg episode reward: [(0, '26.699')] +[2023-02-24 12:49:31,253][11215] Updated weights for policy 0, policy_version 1770 (0.0017) +[2023-02-24 12:49:32,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7254016. Throughput: 0: 882.6. Samples: 1812998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:32,872][00205] Avg episode reward: [(0, '24.613')] +[2023-02-24 12:49:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7278592. Throughput: 0: 910.6. Samples: 1819514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:49:37,872][00205] Avg episode reward: [(0, '23.298')] +[2023-02-24 12:49:41,760][11215] Updated weights for policy 0, policy_version 1780 (0.0011) +[2023-02-24 12:49:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7290880. Throughput: 0: 897.6. Samples: 1822180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:42,877][00205] Avg episode reward: [(0, '24.331')] +[2023-02-24 12:49:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7307264. Throughput: 0: 863.6. Samples: 1826270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:49:47,880][00205] Avg episode reward: [(0, '23.984')] +[2023-02-24 12:49:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7327744. Throughput: 0: 899.7. Samples: 1831978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:52,878][00205] Avg episode reward: [(0, '24.064')] +[2023-02-24 12:49:53,624][11215] Updated weights for policy 0, policy_version 1790 (0.0031) +[2023-02-24 12:49:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7348224. Throughput: 0: 913.0. Samples: 1835276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:49:57,877][00205] Avg episode reward: [(0, '24.798')] +[2023-02-24 12:50:02,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 7364608. Throughput: 0: 883.9. Samples: 1840488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:50:02,874][00205] Avg episode reward: [(0, '25.309')] +[2023-02-24 12:50:05,692][11215] Updated weights for policy 0, policy_version 1800 (0.0016) +[2023-02-24 12:50:07,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7376896. Throughput: 0: 862.8. Samples: 1844704. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:50:07,879][00205] Avg episode reward: [(0, '26.188')] +[2023-02-24 12:50:12,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7397376. Throughput: 0: 881.2. Samples: 1847570. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:50:12,874][00205] Avg episode reward: [(0, '25.033')] +[2023-02-24 12:50:15,993][11215] Updated weights for policy 0, policy_version 1810 (0.0020) +[2023-02-24 12:50:17,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7417856. Throughput: 0: 914.2. Samples: 1854138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:50:17,872][00205] Avg episode reward: [(0, '25.616')] +[2023-02-24 12:50:22,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7434240. Throughput: 0: 882.8. Samples: 1859244. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:50:22,875][00205] Avg episode reward: [(0, '26.669')] +[2023-02-24 12:50:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7446528. Throughput: 0: 869.6. Samples: 1861312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:50:27,873][00205] Avg episode reward: [(0, '25.640')] +[2023-02-24 12:50:29,015][11215] Updated weights for policy 0, policy_version 1820 (0.0029) +[2023-02-24 12:50:32,870][00205] Fps is (10 sec: 3687.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7471104. Throughput: 0: 896.4. Samples: 1866610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:50:32,873][00205] Avg episode reward: [(0, '26.284')] +[2023-02-24 12:50:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7491584. Throughput: 0: 918.1. Samples: 1873294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:50:37,872][00205] Avg episode reward: [(0, '25.689')] +[2023-02-24 12:50:38,382][11215] Updated weights for policy 0, policy_version 1830 (0.0020) +[2023-02-24 12:50:42,872][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 7507968. Throughput: 0: 899.0. Samples: 1875732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:50:42,877][00205] Avg episode reward: [(0, '26.624')] +[2023-02-24 12:50:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7520256. Throughput: 0: 875.2. Samples: 1879870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:50:47,879][00205] Avg episode reward: [(0, '26.732')] +[2023-02-24 12:50:47,895][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001836_7520256.pth... +[2023-02-24 12:50:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth +[2023-02-24 12:50:51,327][11215] Updated weights for policy 0, policy_version 1840 (0.0023) +[2023-02-24 12:50:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7540736. Throughput: 0: 908.0. Samples: 1885564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:50:52,873][00205] Avg episode reward: [(0, '26.612')] +[2023-02-24 12:50:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7561216. Throughput: 0: 915.1. Samples: 1888750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:50:57,877][00205] Avg episode reward: [(0, '27.062')] +[2023-02-24 12:51:02,874][11215] Updated weights for policy 0, policy_version 1850 (0.0012) +[2023-02-24 12:51:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7577600. Throughput: 0: 880.1. Samples: 1893744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:02,880][00205] Avg episode reward: [(0, '26.215')] +[2023-02-24 12:51:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7589888. Throughput: 0: 856.6. Samples: 1897788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:51:07,872][00205] Avg episode reward: [(0, '26.323')] +[2023-02-24 12:51:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7610368. Throughput: 0: 876.8. Samples: 1900768. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:51:12,873][00205] Avg episode reward: [(0, '27.406')] +[2023-02-24 12:51:14,417][11215] Updated weights for policy 0, policy_version 1860 (0.0014) +[2023-02-24 12:51:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7630848. Throughput: 0: 899.7. Samples: 1907096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:17,872][00205] Avg episode reward: [(0, '28.808')] +[2023-02-24 12:51:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3554.5). Total num frames: 7643136. Throughput: 0: 849.4. Samples: 1911516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:51:22,877][00205] Avg episode reward: [(0, '29.578')] +[2023-02-24 12:51:27,793][11215] Updated weights for policy 0, policy_version 1870 (0.0016) +[2023-02-24 12:51:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7659520. Throughput: 0: 838.7. Samples: 1913472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:51:27,880][00205] Avg episode reward: [(0, '28.223')] +[2023-02-24 12:51:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7680000. Throughput: 0: 871.3. Samples: 1919078. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:51:32,873][00205] Avg episode reward: [(0, '27.765')] +[2023-02-24 12:51:37,190][11215] Updated weights for policy 0, policy_version 1880 (0.0018) +[2023-02-24 12:51:37,873][00205] Fps is (10 sec: 4094.6, 60 sec: 3481.4, 300 sec: 3554.5). Total num frames: 7700480. Throughput: 0: 891.2. Samples: 1925672. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:51:37,876][00205] Avg episode reward: [(0, '27.676')] +[2023-02-24 12:51:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 7712768. Throughput: 0: 866.8. Samples: 1927756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:51:42,877][00205] Avg episode reward: [(0, '26.898')] +[2023-02-24 12:51:47,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 7729152. Throughput: 0: 847.1. Samples: 1931862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:47,881][00205] Avg episode reward: [(0, '26.736')] +[2023-02-24 12:51:50,187][11215] Updated weights for policy 0, policy_version 1890 (0.0014) +[2023-02-24 12:51:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7749632. Throughput: 0: 897.0. Samples: 1938152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:52,875][00205] Avg episode reward: [(0, '25.938')] +[2023-02-24 12:51:57,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3481.5, 300 sec: 3554.5). Total num frames: 7770112. Throughput: 0: 902.8. Samples: 1941396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:51:57,874][00205] Avg episode reward: [(0, '26.483')] +[2023-02-24 12:52:00,893][11215] Updated weights for policy 0, policy_version 1900 (0.0020) +[2023-02-24 12:52:02,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3481.5, 300 sec: 3554.5). Total num frames: 7786496. Throughput: 0: 869.7. Samples: 1946236. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:52:02,880][00205] Avg episode reward: [(0, '26.473')] +[2023-02-24 12:52:07,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7802880. Throughput: 0: 868.6. Samples: 1950604. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:52:07,879][00205] Avg episode reward: [(0, '26.178')] +[2023-02-24 12:52:12,327][11215] Updated weights for policy 0, policy_version 1910 (0.0013) +[2023-02-24 12:52:12,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7823360. Throughput: 0: 898.8. Samples: 1953916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:12,880][00205] Avg episode reward: [(0, '25.571')] +[2023-02-24 12:52:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7843840. Throughput: 0: 920.7. Samples: 1960510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:17,879][00205] Avg episode reward: [(0, '25.162')] +[2023-02-24 12:52:22,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7856128. Throughput: 0: 871.2. Samples: 1964872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:52:22,872][00205] Avg episode reward: [(0, '24.254')] +[2023-02-24 12:52:24,711][11215] Updated weights for policy 0, policy_version 1920 (0.0018) +[2023-02-24 12:52:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7872512. Throughput: 0: 869.5. Samples: 1966884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:27,875][00205] Avg episode reward: [(0, '26.902')] +[2023-02-24 12:52:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 7897088. Throughput: 0: 913.4. Samples: 1972966. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:32,878][00205] Avg episode reward: [(0, '27.304')] +[2023-02-24 12:52:34,676][11215] Updated weights for policy 0, policy_version 1930 (0.0012) +[2023-02-24 12:52:37,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.1, 300 sec: 3540.8). Total num frames: 7913472. Throughput: 0: 914.4. Samples: 1979300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:37,876][00205] Avg episode reward: [(0, '27.444')] +[2023-02-24 12:52:42,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7929856. Throughput: 0: 888.5. Samples: 1981376. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:52:42,872][00205] Avg episode reward: [(0, '27.197')] +[2023-02-24 12:52:47,644][11215] Updated weights for policy 0, policy_version 1940 (0.0018) +[2023-02-24 12:52:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7946240. Throughput: 0: 874.3. Samples: 1985578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:52:47,876][00205] Avg episode reward: [(0, '28.879')] +[2023-02-24 12:52:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001940_7946240.pth... +[2023-02-24 12:52:48,009][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001732_7094272.pth +[2023-02-24 12:52:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7966720. Throughput: 0: 919.1. Samples: 1991962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:52,873][00205] Avg episode reward: [(0, '29.815')] +[2023-02-24 12:52:57,558][11215] Updated weights for policy 0, policy_version 1950 (0.0023) +[2023-02-24 12:52:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 7987200. Throughput: 0: 918.8. Samples: 1995260. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:52:57,872][00205] Avg episode reward: [(0, '31.595')] +[2023-02-24 12:52:57,882][11201] Saving new best policy, reward=31.595! +[2023-02-24 12:53:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7999488. Throughput: 0: 871.2. Samples: 1999716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:02,878][00205] Avg episode reward: [(0, '30.868')] +[2023-02-24 12:53:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8015872. Throughput: 0: 877.0. Samples: 2004338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:07,876][00205] Avg episode reward: [(0, '31.491')] +[2023-02-24 12:53:10,102][11215] Updated weights for policy 0, policy_version 1960 (0.0012) +[2023-02-24 12:53:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8036352. Throughput: 0: 904.5. Samples: 2007588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:12,873][00205] Avg episode reward: [(0, '31.804')] +[2023-02-24 12:53:12,898][11201] Saving new best policy, reward=31.804! +[2023-02-24 12:53:17,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 8056832. Throughput: 0: 915.1. Samples: 2014144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:53:17,875][00205] Avg episode reward: [(0, '31.345')] +[2023-02-24 12:53:21,407][11215] Updated weights for policy 0, policy_version 1970 (0.0035) +[2023-02-24 12:53:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8073216. Throughput: 0: 867.2. Samples: 2018322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:53:22,874][00205] Avg episode reward: [(0, '31.241')] +[2023-02-24 12:53:27,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8089600. Throughput: 0: 868.6. Samples: 2020464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:27,874][00205] Avg episode reward: [(0, '29.691')] +[2023-02-24 12:53:32,595][11215] Updated weights for policy 0, policy_version 1980 (0.0012) +[2023-02-24 12:53:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8110080. Throughput: 0: 912.1. Samples: 2026624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:53:32,872][00205] Avg episode reward: [(0, '27.546')] +[2023-02-24 12:53:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8130560. Throughput: 0: 903.2. Samples: 2032604. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:37,876][00205] Avg episode reward: [(0, '26.472')] +[2023-02-24 12:53:42,871][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8142848. Throughput: 0: 875.6. Samples: 2034662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:42,874][00205] Avg episode reward: [(0, '25.526')] +[2023-02-24 12:53:45,377][11215] Updated weights for policy 0, policy_version 1990 (0.0025) +[2023-02-24 12:53:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8159232. Throughput: 0: 871.5. Samples: 2038934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:47,877][00205] Avg episode reward: [(0, '25.844')] +[2023-02-24 12:53:52,873][00205] Fps is (10 sec: 3685.1, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 8179712. Throughput: 0: 911.0. Samples: 2045334. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:53:52,875][00205] Avg episode reward: [(0, '25.736')] +[2023-02-24 12:53:55,101][11215] Updated weights for policy 0, policy_version 2000 (0.0011) +[2023-02-24 12:53:57,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 8200192. Throughput: 0: 910.7. Samples: 2048570. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:57,876][00205] Avg episode reward: [(0, '25.999')] +[2023-02-24 12:54:02,870][00205] Fps is (10 sec: 3278.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8212480. Throughput: 0: 863.1. Samples: 2052980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:54:02,875][00205] Avg episode reward: [(0, '26.480')] +[2023-02-24 12:54:07,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8228864. Throughput: 0: 879.3. Samples: 2057890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:54:07,876][00205] Avg episode reward: [(0, '27.098')] +[2023-02-24 12:54:08,056][11215] Updated weights for policy 0, policy_version 2010 (0.0030) +[2023-02-24 12:54:12,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8253440. Throughput: 0: 903.7. Samples: 2061132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:12,873][00205] Avg episode reward: [(0, '28.201')] +[2023-02-24 12:54:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8269824. Throughput: 0: 908.6. Samples: 2067510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:17,872][00205] Avg episode reward: [(0, '26.869')] +[2023-02-24 12:54:18,176][11215] Updated weights for policy 0, policy_version 2020 (0.0015) +[2023-02-24 12:54:22,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8286208. Throughput: 0: 868.8. Samples: 2071702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:22,873][00205] Avg episode reward: [(0, '26.320')] +[2023-02-24 12:54:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8302592. Throughput: 0: 870.4. Samples: 2073830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:27,872][00205] Avg episode reward: [(0, '26.621')] +[2023-02-24 12:54:30,189][11215] Updated weights for policy 0, policy_version 2030 (0.0017) +[2023-02-24 12:54:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8323072. Throughput: 0: 920.5. Samples: 2080358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:32,875][00205] Avg episode reward: [(0, '26.115')] +[2023-02-24 12:54:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8343552. Throughput: 0: 907.8. Samples: 2086182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:37,872][00205] Avg episode reward: [(0, '27.023')] +[2023-02-24 12:54:41,947][11215] Updated weights for policy 0, policy_version 2040 (0.0019) +[2023-02-24 12:54:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8355840. Throughput: 0: 883.0. Samples: 2088302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:54:42,879][00205] Avg episode reward: [(0, '27.933')] +[2023-02-24 12:54:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8376320. Throughput: 0: 888.0. Samples: 2092938. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:47,873][00205] Avg episode reward: [(0, '28.366')] +[2023-02-24 12:54:47,883][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002045_8376320.pth... +[2023-02-24 12:54:48,006][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001836_7520256.pth +[2023-02-24 12:54:52,537][11215] Updated weights for policy 0, policy_version 2050 (0.0013) +[2023-02-24 12:54:52,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 8396800. Throughput: 0: 921.9. Samples: 2099376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:52,879][00205] Avg episode reward: [(0, '28.745')] +[2023-02-24 12:54:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8413184. Throughput: 0: 919.3. Samples: 2102500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:57,875][00205] Avg episode reward: [(0, '29.440')] +[2023-02-24 12:55:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8425472. Throughput: 0: 865.4. Samples: 2106454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:02,878][00205] Avg episode reward: [(0, '28.903')] +[2023-02-24 12:55:05,727][11215] Updated weights for policy 0, policy_version 2060 (0.0016) +[2023-02-24 12:55:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8445952. Throughput: 0: 887.2. Samples: 2111624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:55:07,873][00205] Avg episode reward: [(0, '28.550')] +[2023-02-24 12:55:12,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8466432. Throughput: 0: 911.7. Samples: 2114858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:12,879][00205] Avg episode reward: [(0, '28.947')] +[2023-02-24 12:55:15,273][11215] Updated weights for policy 0, policy_version 2070 (0.0011) +[2023-02-24 12:55:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8482816. Throughput: 0: 897.2. Samples: 2120732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:55:17,878][00205] Avg episode reward: [(0, '27.978')] +[2023-02-24 12:55:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8499200. Throughput: 0: 859.9. Samples: 2124878. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:55:22,873][00205] Avg episode reward: [(0, '27.826')] +[2023-02-24 12:55:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8515584. Throughput: 0: 868.5. Samples: 2127384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:27,875][00205] Avg episode reward: [(0, '27.059')] +[2023-02-24 12:55:28,038][11215] Updated weights for policy 0, policy_version 2080 (0.0035) +[2023-02-24 12:55:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8540160. Throughput: 0: 912.2. Samples: 2133986. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:55:32,872][00205] Avg episode reward: [(0, '29.171')] +[2023-02-24 12:55:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8556544. Throughput: 0: 889.4. Samples: 2139398. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:37,877][00205] Avg episode reward: [(0, '29.604')] +[2023-02-24 12:55:38,962][11215] Updated weights for policy 0, policy_version 2090 (0.0012) +[2023-02-24 12:55:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8568832. Throughput: 0: 865.1. Samples: 2141430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:55:42,875][00205] Avg episode reward: [(0, '30.581')] +[2023-02-24 12:55:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8589312. Throughput: 0: 890.4. Samples: 2146524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:55:47,875][00205] Avg episode reward: [(0, '29.880')] +[2023-02-24 12:55:50,256][11215] Updated weights for policy 0, policy_version 2100 (0.0031) +[2023-02-24 12:55:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8609792. Throughput: 0: 920.1. Samples: 2153030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:55:52,872][00205] Avg episode reward: [(0, '29.177')] +[2023-02-24 12:55:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8626176. Throughput: 0: 911.2. Samples: 2155860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:55:57,877][00205] Avg episode reward: [(0, '30.053')] +[2023-02-24 12:56:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8638464. Throughput: 0: 871.9. Samples: 2159966. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:02,878][00205] Avg episode reward: [(0, '28.558')] +[2023-02-24 12:56:03,106][11215] Updated weights for policy 0, policy_version 2110 (0.0042) +[2023-02-24 12:56:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8658944. Throughput: 0: 901.4. Samples: 2165442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:56:07,872][00205] Avg episode reward: [(0, '27.659')] +[2023-02-24 12:56:12,814][11215] Updated weights for policy 0, policy_version 2120 (0.0012) +[2023-02-24 12:56:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8683520. Throughput: 0: 915.2. Samples: 2168566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:12,872][00205] Avg episode reward: [(0, '28.102')] +[2023-02-24 12:56:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8695808. Throughput: 0: 895.8. Samples: 2174298. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:56:17,875][00205] Avg episode reward: [(0, '27.283')] +[2023-02-24 12:56:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8712192. Throughput: 0: 867.5. Samples: 2178436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:56:22,877][00205] Avg episode reward: [(0, '27.355')] +[2023-02-24 12:56:25,659][11215] Updated weights for policy 0, policy_version 2130 (0.0022) +[2023-02-24 12:56:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8732672. Throughput: 0: 882.9. Samples: 2181162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:27,873][00205] Avg episode reward: [(0, '29.071')] +[2023-02-24 12:56:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8753152. Throughput: 0: 916.0. Samples: 2187742. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:32,878][00205] Avg episode reward: [(0, '28.206')] +[2023-02-24 12:56:35,983][11215] Updated weights for policy 0, policy_version 2140 (0.0020) +[2023-02-24 12:56:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 8769536. Throughput: 0: 885.6. Samples: 2192880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:37,879][00205] Avg episode reward: [(0, '29.021')] +[2023-02-24 12:56:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8781824. Throughput: 0: 867.6. Samples: 2194900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:42,872][00205] Avg episode reward: [(0, '29.582')] +[2023-02-24 12:56:47,810][11215] Updated weights for policy 0, policy_version 2150 (0.0023) +[2023-02-24 12:56:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8806400. Throughput: 0: 896.2. Samples: 2200294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:47,877][00205] Avg episode reward: [(0, '28.292')] +[2023-02-24 12:56:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002150_8806400.pth... +[2023-02-24 12:56:48,011][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001940_7946240.pth +[2023-02-24 12:56:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8826880. Throughput: 0: 920.0. Samples: 2206842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:52,874][00205] Avg episode reward: [(0, '27.672')] +[2023-02-24 12:56:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8839168. Throughput: 0: 906.6. Samples: 2209364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:57,876][00205] Avg episode reward: [(0, '27.579')] +[2023-02-24 12:56:59,635][11215] Updated weights for policy 0, policy_version 2160 (0.0015) +[2023-02-24 12:57:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8855552. Throughput: 0: 869.4. Samples: 2213422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:02,873][00205] Avg episode reward: [(0, '26.788')] +[2023-02-24 12:57:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8876032. Throughput: 0: 909.3. Samples: 2219356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:07,877][00205] Avg episode reward: [(0, '28.423')] +[2023-02-24 12:57:10,273][11215] Updated weights for policy 0, policy_version 2170 (0.0012) +[2023-02-24 12:57:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8896512. Throughput: 0: 919.4. Samples: 2222536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:12,879][00205] Avg episode reward: [(0, '28.220')] +[2023-02-24 12:57:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8912896. Throughput: 0: 890.7. Samples: 2227826. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:57:17,875][00205] Avg episode reward: [(0, '28.280')] +[2023-02-24 12:57:22,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8925184. Throughput: 0: 871.3. Samples: 2232088. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:22,877][00205] Avg episode reward: [(0, '27.497')] +[2023-02-24 12:57:23,192][11215] Updated weights for policy 0, policy_version 2180 (0.0028) +[2023-02-24 12:57:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8949760. Throughput: 0: 891.6. Samples: 2235024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:27,873][00205] Avg episode reward: [(0, '29.520')] +[2023-02-24 12:57:32,529][11215] Updated weights for policy 0, policy_version 2190 (0.0018) +[2023-02-24 12:57:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8970240. Throughput: 0: 919.3. Samples: 2241662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:32,872][00205] Avg episode reward: [(0, '29.056')] +[2023-02-24 12:57:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8982528. Throughput: 0: 882.6. Samples: 2246558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:37,872][00205] Avg episode reward: [(0, '28.206')] +[2023-02-24 12:57:42,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8998912. Throughput: 0: 872.4. Samples: 2248622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:42,884][00205] Avg episode reward: [(0, '27.766')] +[2023-02-24 12:57:45,525][11215] Updated weights for policy 0, policy_version 2200 (0.0026) +[2023-02-24 12:57:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9019392. Throughput: 0: 904.6. Samples: 2254130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:47,872][00205] Avg episode reward: [(0, '27.727')] +[2023-02-24 12:57:52,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9039872. Throughput: 0: 918.0. Samples: 2260664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:57:52,873][00205] Avg episode reward: [(0, '28.475')] +[2023-02-24 12:57:56,016][11215] Updated weights for policy 0, policy_version 2210 (0.0018) +[2023-02-24 12:57:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9056256. Throughput: 0: 899.6. Samples: 2263016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:57,877][00205] Avg episode reward: [(0, '26.638')] +[2023-02-24 12:58:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9068544. Throughput: 0: 873.3. Samples: 2267124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:02,879][00205] Avg episode reward: [(0, '26.333')] +[2023-02-24 12:58:07,770][11215] Updated weights for policy 0, policy_version 2220 (0.0034) +[2023-02-24 12:58:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9093120. Throughput: 0: 913.2. Samples: 2273180. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:58:07,873][00205] Avg episode reward: [(0, '25.888')] +[2023-02-24 12:58:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9113600. Throughput: 0: 920.0. Samples: 2276424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:58:12,875][00205] Avg episode reward: [(0, '24.725')] +[2023-02-24 12:58:17,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3549.6, 300 sec: 3568.3). Total num frames: 9125888. Throughput: 0: 887.7. Samples: 2281612. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:17,879][00205] Avg episode reward: [(0, '24.592')] +[2023-02-24 12:58:19,754][11215] Updated weights for policy 0, policy_version 2230 (0.0016) +[2023-02-24 12:58:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9142272. Throughput: 0: 872.0. Samples: 2285796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:22,878][00205] Avg episode reward: [(0, '24.101')] +[2023-02-24 12:58:27,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9162752. Throughput: 0: 895.9. Samples: 2288936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:27,872][00205] Avg episode reward: [(0, '25.623')] +[2023-02-24 12:58:30,071][11215] Updated weights for policy 0, policy_version 2240 (0.0014) +[2023-02-24 12:58:32,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 9183232. Throughput: 0: 919.9. Samples: 2295528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:32,874][00205] Avg episode reward: [(0, '26.641')] +[2023-02-24 12:58:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9199616. Throughput: 0: 881.3. Samples: 2300322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:37,878][00205] Avg episode reward: [(0, '26.937')] +[2023-02-24 12:58:42,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9211904. Throughput: 0: 872.7. Samples: 2302288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:42,878][00205] Avg episode reward: [(0, '27.227')] +[2023-02-24 12:58:42,996][11215] Updated weights for policy 0, policy_version 2250 (0.0017) +[2023-02-24 12:58:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9236480. Throughput: 0: 912.1. Samples: 2308170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:58:47,872][00205] Avg episode reward: [(0, '26.217')] +[2023-02-24 12:58:47,884][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002255_9236480.pth... +[2023-02-24 12:58:48,000][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002045_8376320.pth +[2023-02-24 12:58:52,474][11215] Updated weights for policy 0, policy_version 2260 (0.0020) +[2023-02-24 12:58:52,872][00205] Fps is (10 sec: 4504.5, 60 sec: 3618.0, 300 sec: 3582.3). Total num frames: 9256960. Throughput: 0: 922.3. Samples: 2314686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:58:52,879][00205] Avg episode reward: [(0, '27.372')] +[2023-02-24 12:58:57,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 9269248. Throughput: 0: 896.1. Samples: 2316748. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:58:57,876][00205] Avg episode reward: [(0, '26.596')] +[2023-02-24 12:59:02,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9285632. Throughput: 0: 872.1. Samples: 2320854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:59:02,873][00205] Avg episode reward: [(0, '26.704')] +[2023-02-24 12:59:05,405][11215] Updated weights for policy 0, policy_version 2270 (0.0020) +[2023-02-24 12:59:07,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9306112. Throughput: 0: 917.7. Samples: 2327092. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:59:07,872][00205] Avg episode reward: [(0, '26.088')] +[2023-02-24 12:59:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9326592. Throughput: 0: 919.9. Samples: 2330330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:59:12,874][00205] Avg episode reward: [(0, '26.037')] +[2023-02-24 12:59:16,197][11215] Updated weights for policy 0, policy_version 2280 (0.0037) +[2023-02-24 12:59:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.4, 300 sec: 3582.3). Total num frames: 9342976. Throughput: 0: 882.9. Samples: 2335258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:59:17,876][00205] Avg episode reward: [(0, '26.873')] +[2023-02-24 12:59:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9359360. Throughput: 0: 871.3. Samples: 2339530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:59:22,872][00205] Avg episode reward: [(0, '27.022')] +[2023-02-24 12:59:27,569][11215] Updated weights for policy 0, policy_version 2290 (0.0015) +[2023-02-24 12:59:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9379840. Throughput: 0: 900.6. Samples: 2342814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:59:27,872][00205] Avg episode reward: [(0, '27.863')] +[2023-02-24 12:59:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 9400320. Throughput: 0: 917.4. Samples: 2349454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:59:32,875][00205] Avg episode reward: [(0, '27.149')] +[2023-02-24 12:59:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9412608. Throughput: 0: 869.7. Samples: 2353822. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:59:37,873][00205] Avg episode reward: [(0, '27.391')] +[2023-02-24 12:59:40,176][11215] Updated weights for policy 0, policy_version 2300 (0.0019) +[2023-02-24 12:59:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9428992. Throughput: 0: 869.1. Samples: 2355858. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:59:42,872][00205] Avg episode reward: [(0, '28.339')] +[2023-02-24 12:59:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9449472. Throughput: 0: 911.9. Samples: 2361888. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:59:47,873][00205] Avg episode reward: [(0, '28.968')] +[2023-02-24 12:59:49,807][11215] Updated weights for policy 0, policy_version 2310 (0.0011) +[2023-02-24 12:59:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3582.3). Total num frames: 9469952. Throughput: 0: 913.8. Samples: 2368214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:59:52,880][00205] Avg episode reward: [(0, '28.679')] +[2023-02-24 12:59:57,871][00205] Fps is (10 sec: 3685.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9486336. Throughput: 0: 888.1. Samples: 2370298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:59:57,875][00205] Avg episode reward: [(0, '28.517')] +[2023-02-24 13:00:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9498624. Throughput: 0: 867.0. Samples: 2374274. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:02,872][00205] Avg episode reward: [(0, '29.111')] +[2023-02-24 13:00:03,025][11215] Updated weights for policy 0, policy_version 2320 (0.0020) +[2023-02-24 13:00:07,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9523200. Throughput: 0: 918.4. Samples: 2380856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:07,873][00205] Avg episode reward: [(0, '29.438')] +[2023-02-24 13:00:12,873][00205] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 9539584. Throughput: 0: 919.3. Samples: 2384184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:12,876][00205] Avg episode reward: [(0, '29.535')] +[2023-02-24 13:00:12,976][11215] Updated weights for policy 0, policy_version 2330 (0.0015) +[2023-02-24 13:00:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 9555968. Throughput: 0: 871.9. Samples: 2388692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:17,876][00205] Avg episode reward: [(0, '29.166')] +[2023-02-24 13:00:22,870][00205] Fps is (10 sec: 3277.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9572352. Throughput: 0: 878.2. Samples: 2393340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:22,879][00205] Avg episode reward: [(0, '28.664')] +[2023-02-24 13:00:25,293][11215] Updated weights for policy 0, policy_version 2340 (0.0011) +[2023-02-24 13:00:27,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9592832. Throughput: 0: 907.0. Samples: 2396672. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:00:27,877][00205] Avg episode reward: [(0, '29.703')] +[2023-02-24 13:00:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9613312. Throughput: 0: 916.7. Samples: 2403140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:32,873][00205] Avg episode reward: [(0, '28.917')] +[2023-02-24 13:00:36,509][11215] Updated weights for policy 0, policy_version 2350 (0.0013) +[2023-02-24 13:00:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9625600. Throughput: 0: 869.5. Samples: 2407340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:00:37,872][00205] Avg episode reward: [(0, '30.191')] +[2023-02-24 13:00:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9641984. Throughput: 0: 868.9. Samples: 2409398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:00:42,881][00205] Avg episode reward: [(0, '28.555')] +[2023-02-24 13:00:47,770][11215] Updated weights for policy 0, policy_version 2360 (0.0012) +[2023-02-24 13:00:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9666560. Throughput: 0: 915.4. Samples: 2415466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:47,872][00205] Avg episode reward: [(0, '28.287')] +[2023-02-24 13:00:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002360_9666560.pth... +[2023-02-24 13:00:48,005][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002150_8806400.pth +[2023-02-24 13:00:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9682944. Throughput: 0: 906.2. Samples: 2421636. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:00:52,875][00205] Avg episode reward: [(0, '28.414')] +[2023-02-24 13:00:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 9699328. Throughput: 0: 876.8. Samples: 2423636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:00:57,872][00205] Avg episode reward: [(0, '28.747')] +[2023-02-24 13:01:00,664][11215] Updated weights for policy 0, policy_version 2370 (0.0020) +[2023-02-24 13:01:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9715712. Throughput: 0: 870.8. Samples: 2427878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:01:02,879][00205] Avg episode reward: [(0, '28.769')] +[2023-02-24 13:01:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9736192. Throughput: 0: 916.4. Samples: 2434578. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:01:07,873][00205] Avg episode reward: [(0, '29.246')] +[2023-02-24 13:01:10,109][11215] Updated weights for policy 0, policy_version 2380 (0.0012) +[2023-02-24 13:01:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3596.1). Total num frames: 9756672. Throughput: 0: 914.9. Samples: 2437842. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:01:12,877][00205] Avg episode reward: [(0, '28.743')] +[2023-02-24 13:01:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9768960. Throughput: 0: 871.5. Samples: 2442356. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:01:17,873][00205] Avg episode reward: [(0, '27.426')] +[2023-02-24 13:01:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9785344. Throughput: 0: 881.3. Samples: 2447000. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:01:22,872][00205] Avg episode reward: [(0, '29.245')] +[2023-02-24 13:01:22,886][11215] Updated weights for policy 0, policy_version 2390 (0.0014) +[2023-02-24 13:01:27,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9809920. Throughput: 0: 909.5. Samples: 2450324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:27,872][00205] Avg episode reward: [(0, '28.138')] +[2023-02-24 13:01:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9826304. Throughput: 0: 919.0. Samples: 2456820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:32,880][00205] Avg episode reward: [(0, '29.533')] +[2023-02-24 13:01:33,120][11215] Updated weights for policy 0, policy_version 2400 (0.0017) +[2023-02-24 13:01:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9842688. Throughput: 0: 873.8. Samples: 2460956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:37,875][00205] Avg episode reward: [(0, '26.979')] +[2023-02-24 13:01:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9859072. Throughput: 0: 876.7. Samples: 2463086. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:42,873][00205] Avg episode reward: [(0, '26.891')] +[2023-02-24 13:01:45,223][11215] Updated weights for policy 0, policy_version 2410 (0.0014) +[2023-02-24 13:01:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9879552. Throughput: 0: 921.8. Samples: 2469360. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:01:47,873][00205] Avg episode reward: [(0, '29.293')] +[2023-02-24 13:01:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9900032. Throughput: 0: 906.2. Samples: 2475356. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:52,877][00205] Avg episode reward: [(0, '31.339')] +[2023-02-24 13:01:56,888][11215] Updated weights for policy 0, policy_version 2420 (0.0014) +[2023-02-24 13:01:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9912320. Throughput: 0: 879.5. Samples: 2477420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:57,880][00205] Avg episode reward: [(0, '30.273')] +[2023-02-24 13:02:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9928704. Throughput: 0: 876.2. Samples: 2481786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:02,879][00205] Avg episode reward: [(0, '30.971')] +[2023-02-24 13:02:07,534][11215] Updated weights for policy 0, policy_version 2430 (0.0027) +[2023-02-24 13:02:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9953280. Throughput: 0: 920.0. Samples: 2488400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:07,872][00205] Avg episode reward: [(0, '30.673')] +[2023-02-24 13:02:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 9973760. Throughput: 0: 926.2. Samples: 2492002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:12,879][00205] Avg episode reward: [(0, '32.861')] +[2023-02-24 13:02:12,881][11201] Saving new best policy, reward=32.861! +[2023-02-24 13:02:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9986048. Throughput: 0: 885.0. Samples: 2496644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:02:17,873][00205] Avg episode reward: [(0, '32.329')] +[2023-02-24 13:02:19,426][11215] Updated weights for policy 0, policy_version 2440 (0.0027) +[2023-02-24 13:02:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 10006528. Throughput: 0: 913.5. Samples: 2502064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:22,878][00205] Avg episode reward: [(0, '33.120')] +[2023-02-24 13:02:22,881][11201] Saving new best policy, reward=33.120! +[2023-02-24 13:02:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 10031104. Throughput: 0: 944.7. Samples: 2505598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:27,872][00205] Avg episode reward: [(0, '32.114')] +[2023-02-24 13:02:28,321][11215] Updated weights for policy 0, policy_version 2450 (0.0011) +[2023-02-24 13:02:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10051584. Throughput: 0: 958.1. Samples: 2512476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:32,872][00205] Avg episode reward: [(0, '32.733')] +[2023-02-24 13:02:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10067968. Throughput: 0: 928.1. Samples: 2517120. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:37,876][00205] Avg episode reward: [(0, '31.443')] +[2023-02-24 13:02:40,375][11215] Updated weights for policy 0, policy_version 2460 (0.0024) +[2023-02-24 13:02:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 10088448. Throughput: 0: 934.0. Samples: 2519452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:42,873][00205] Avg episode reward: [(0, '31.606')] +[2023-02-24 13:02:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 10108928. Throughput: 0: 995.9. Samples: 2526602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:47,876][00205] Avg episode reward: [(0, '31.046')] +[2023-02-24 13:02:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002468_10108928.pth... +[2023-02-24 13:02:48,003][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002255_9236480.pth +[2023-02-24 13:02:48,873][11215] Updated weights for policy 0, policy_version 2470 (0.0014) +[2023-02-24 13:02:52,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3822.8, 300 sec: 3637.8). Total num frames: 10129408. Throughput: 0: 986.9. Samples: 2532812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:52,876][00205] Avg episode reward: [(0, '29.429')] +[2023-02-24 13:02:57,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3891.0, 300 sec: 3651.7). Total num frames: 10145792. Throughput: 0: 956.8. Samples: 2535062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:57,880][00205] Avg episode reward: [(0, '28.950')] +[2023-02-24 13:03:01,104][11215] Updated weights for policy 0, policy_version 2480 (0.0018) +[2023-02-24 13:03:02,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3959.5, 300 sec: 3637.8). Total num frames: 10166272. Throughput: 0: 965.6. Samples: 2540094. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:02,877][00205] Avg episode reward: [(0, '28.169')] +[2023-02-24 13:03:07,870][00205] Fps is (10 sec: 4097.1, 60 sec: 3891.2, 300 sec: 3637.8). Total num frames: 10186752. Throughput: 0: 1006.3. Samples: 2547348. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:03:07,874][00205] Avg episode reward: [(0, '28.697')] +[2023-02-24 13:03:09,547][11215] Updated weights for policy 0, policy_version 2490 (0.0022) +[2023-02-24 13:03:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 10207232. Throughput: 0: 1006.0. Samples: 2550870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:03:12,874][00205] Avg episode reward: [(0, '28.847')] +[2023-02-24 13:03:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3665.6). Total num frames: 10223616. Throughput: 0: 952.1. Samples: 2555322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:17,876][00205] Avg episode reward: [(0, '29.297')] +[2023-02-24 13:03:21,717][11215] Updated weights for policy 0, policy_version 2500 (0.0025) +[2023-02-24 13:03:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3665.6). Total num frames: 10244096. Throughput: 0: 975.1. Samples: 2560998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:03:22,877][00205] Avg episode reward: [(0, '30.799')] +[2023-02-24 13:03:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3679.5). Total num frames: 10268672. Throughput: 0: 1003.7. Samples: 2564618. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:03:27,873][00205] Avg episode reward: [(0, '30.425')] +[2023-02-24 13:03:30,474][11215] Updated weights for policy 0, policy_version 2510 (0.0011) +[2023-02-24 13:03:32,870][00205] Fps is (10 sec: 4095.8, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 10285056. Throughput: 0: 992.0. Samples: 2571242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:32,874][00205] Avg episode reward: [(0, '30.344')] +[2023-02-24 13:03:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 10301440. Throughput: 0: 956.1. Samples: 2575834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:37,878][00205] Avg episode reward: [(0, '30.769')] +[2023-02-24 13:03:42,226][11215] Updated weights for policy 0, policy_version 2520 (0.0019) +[2023-02-24 13:03:42,871][00205] Fps is (10 sec: 3686.3, 60 sec: 3891.1, 300 sec: 3679.4). Total num frames: 10321920. Throughput: 0: 967.2. Samples: 2578586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:03:42,878][00205] Avg episode reward: [(0, '31.018')] +[2023-02-24 13:03:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3693.4). Total num frames: 10346496. Throughput: 0: 1017.7. Samples: 2585892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:47,873][00205] Avg episode reward: [(0, '28.245')] +[2023-02-24 13:03:51,216][11215] Updated weights for policy 0, policy_version 2530 (0.0011) +[2023-02-24 13:03:52,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 10366976. Throughput: 0: 990.8. Samples: 2591934. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) +[2023-02-24 13:03:52,878][00205] Avg episode reward: [(0, '28.230')] +[2023-02-24 13:03:57,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3891.3, 300 sec: 3707.2). Total num frames: 10379264. Throughput: 0: 960.7. Samples: 2594100. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 13:03:57,878][00205] Avg episode reward: [(0, '29.088')] +[2023-02-24 13:04:02,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 10399744. Throughput: 0: 981.2. Samples: 2599476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:04:02,875][00205] Avg episode reward: [(0, '27.915')] +[2023-02-24 13:04:02,958][11215] Updated weights for policy 0, policy_version 2540 (0.0022) +[2023-02-24 13:04:07,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 10424320. Throughput: 0: 1016.7. Samples: 2606750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:04:07,875][00205] Avg episode reward: [(0, '27.682')] +[2023-02-24 13:04:12,555][11215] Updated weights for policy 0, policy_version 2550 (0.0021) +[2023-02-24 13:04:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 10444800. Throughput: 0: 1008.3. Samples: 2609992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:04:12,878][00205] Avg episode reward: [(0, '26.928')] +[2023-02-24 13:04:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 10457088. Throughput: 0: 962.1. Samples: 2614536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:04:17,880][00205] Avg episode reward: [(0, '27.941')] +[2023-02-24 13:04:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 10481664. Throughput: 0: 993.4. Samples: 2620536. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:04:22,872][00205] Avg episode reward: [(0, '26.358')] +[2023-02-24 13:04:23,456][11215] Updated weights for policy 0, policy_version 2560 (0.0016) +[2023-02-24 13:04:27,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 10506240. Throughput: 0: 1012.4. Samples: 2624142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:04:27,872][00205] Avg episode reward: [(0, '25.797')] +[2023-02-24 13:04:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 3762.7). Total num frames: 10522624. Throughput: 0: 987.9. Samples: 2630348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:04:32,875][00205] Avg episode reward: [(0, '27.648')] +[2023-02-24 13:04:33,598][11215] Updated weights for policy 0, policy_version 2570 (0.0011) +[2023-02-24 13:04:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 10539008. Throughput: 0: 955.4. Samples: 2634928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:04:37,877][00205] Avg episode reward: [(0, '28.105')] +[2023-02-24 13:04:42,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 10559488. Throughput: 0: 972.3. Samples: 2637854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:04:42,873][00205] Avg episode reward: [(0, '29.040')] +[2023-02-24 13:04:43,957][11215] Updated weights for policy 0, policy_version 2580 (0.0014) +[2023-02-24 13:04:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3776.6). Total num frames: 10584064. Throughput: 0: 1014.0. Samples: 2645106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:04:47,873][00205] Avg episode reward: [(0, '28.718')] +[2023-02-24 13:04:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002584_10584064.pth... +[2023-02-24 13:04:48,023][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002360_9666560.pth +[2023-02-24 13:04:52,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 10600448. Throughput: 0: 974.1. Samples: 2650584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:04:52,873][00205] Avg episode reward: [(0, '29.225')] +[2023-02-24 13:04:54,891][11215] Updated weights for policy 0, policy_version 2590 (0.0017) +[2023-02-24 13:04:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 10616832. Throughput: 0: 951.6. Samples: 2652814. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:04:57,879][00205] Avg episode reward: [(0, '30.032')] +[2023-02-24 13:05:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 10637312. Throughput: 0: 972.8. Samples: 2658312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:05:02,873][00205] Avg episode reward: [(0, '30.642')] +[2023-02-24 13:05:05,055][11215] Updated weights for policy 0, policy_version 2600 (0.0022) +[2023-02-24 13:05:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3804.5). Total num frames: 10661888. Throughput: 0: 1001.2. Samples: 2665590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:05:07,880][00205] Avg episode reward: [(0, '29.211')] +[2023-02-24 13:05:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 10678272. Throughput: 0: 983.6. Samples: 2668406. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:05:12,877][00205] Avg episode reward: [(0, '30.018')] +[2023-02-24 13:05:16,702][11215] Updated weights for policy 0, policy_version 2610 (0.0038) +[2023-02-24 13:05:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 10690560. Throughput: 0: 946.1. Samples: 2672920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:05:17,876][00205] Avg episode reward: [(0, '30.640')] +[2023-02-24 13:05:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 10715136. Throughput: 0: 981.7. Samples: 2679104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:22,873][00205] Avg episode reward: [(0, '30.615')] +[2023-02-24 13:05:25,826][11215] Updated weights for policy 0, policy_version 2620 (0.0031) +[2023-02-24 13:05:27,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 10739712. Throughput: 0: 996.9. Samples: 2682712. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:05:27,873][00205] Avg episode reward: [(0, '31.885')] +[2023-02-24 13:05:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.4, 300 sec: 3832.2). Total num frames: 10756096. Throughput: 0: 969.5. Samples: 2688734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:32,877][00205] Avg episode reward: [(0, '31.684')] +[2023-02-24 13:05:37,665][11215] Updated weights for policy 0, policy_version 2630 (0.0021) +[2023-02-24 13:05:37,873][00205] Fps is (10 sec: 3275.7, 60 sec: 3891.0, 300 sec: 3832.1). Total num frames: 10772480. Throughput: 0: 949.5. Samples: 2693316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:37,882][00205] Avg episode reward: [(0, '32.848')] +[2023-02-24 13:05:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 10792960. Throughput: 0: 972.3. Samples: 2696566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:42,872][00205] Avg episode reward: [(0, '32.834')] +[2023-02-24 13:05:46,239][11215] Updated weights for policy 0, policy_version 2640 (0.0014) +[2023-02-24 13:05:47,870][00205] Fps is (10 sec: 4916.9, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 10821632. Throughput: 0: 1012.2. Samples: 2703862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:05:47,878][00205] Avg episode reward: [(0, '33.923')] +[2023-02-24 13:05:47,893][11201] Saving new best policy, reward=33.923! +[2023-02-24 13:05:52,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3891.1, 300 sec: 3846.0). Total num frames: 10833920. Throughput: 0: 966.0. Samples: 2709064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:05:52,879][00205] Avg episode reward: [(0, '32.833')] +[2023-02-24 13:05:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 10850304. Throughput: 0: 952.7. Samples: 2711278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:57,873][00205] Avg episode reward: [(0, '30.821')] +[2023-02-24 13:05:58,763][11215] Updated weights for policy 0, policy_version 2650 (0.0031) +[2023-02-24 13:06:02,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 10870784. Throughput: 0: 982.6. Samples: 2717138. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:06:02,872][00205] Avg episode reward: [(0, '28.643')] +[2023-02-24 13:06:07,363][11215] Updated weights for policy 0, policy_version 2660 (0.0016) +[2023-02-24 13:06:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 10895360. Throughput: 0: 1004.5. Samples: 2724308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:07,877][00205] Avg episode reward: [(0, '28.408')] +[2023-02-24 13:06:12,874][00205] Fps is (10 sec: 4094.4, 60 sec: 3890.9, 300 sec: 3873.8). Total num frames: 10911744. Throughput: 0: 982.5. Samples: 2726928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:12,879][00205] Avg episode reward: [(0, '28.654')] +[2023-02-24 13:06:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 10928128. Throughput: 0: 951.6. Samples: 2731556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:06:17,875][00205] Avg episode reward: [(0, '27.336')] +[2023-02-24 13:06:19,284][11215] Updated weights for policy 0, policy_version 2670 (0.0011) +[2023-02-24 13:06:22,877][00205] Fps is (10 sec: 4094.8, 60 sec: 3959.0, 300 sec: 3873.8). Total num frames: 10952704. Throughput: 0: 995.8. Samples: 2738132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:22,883][00205] Avg episode reward: [(0, '27.078')] +[2023-02-24 13:06:27,724][11215] Updated weights for policy 0, policy_version 2680 (0.0025) +[2023-02-24 13:06:27,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 10977280. Throughput: 0: 1004.9. Samples: 2741788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:06:27,877][00205] Avg episode reward: [(0, '29.119')] +[2023-02-24 13:06:32,870][00205] Fps is (10 sec: 3688.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 10989568. Throughput: 0: 968.8. Samples: 2747460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:06:32,874][00205] Avg episode reward: [(0, '29.675')] +[2023-02-24 13:06:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.4, 300 sec: 3887.7). Total num frames: 11005952. Throughput: 0: 952.8. Samples: 2751940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:37,872][00205] Avg episode reward: [(0, '29.712')] +[2023-02-24 13:06:39,952][11215] Updated weights for policy 0, policy_version 2690 (0.0012) +[2023-02-24 13:06:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11030528. Throughput: 0: 981.0. Samples: 2755424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:06:42,877][00205] Avg episode reward: [(0, '30.627')] +[2023-02-24 13:06:47,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11055104. Throughput: 0: 1011.0. Samples: 2762634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:06:47,877][00205] Avg episode reward: [(0, '31.811')] +[2023-02-24 13:06:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002699_11055104.pth... +[2023-02-24 13:06:48,038][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002468_10108928.pth +[2023-02-24 13:06:49,145][11215] Updated weights for policy 0, policy_version 2700 (0.0011) +[2023-02-24 13:06:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 11067392. Throughput: 0: 960.7. Samples: 2767540. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:06:52,875][00205] Avg episode reward: [(0, '32.203')] +[2023-02-24 13:06:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11083776. Throughput: 0: 952.8. Samples: 2769800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:57,872][00205] Avg episode reward: [(0, '31.032')] +[2023-02-24 13:07:00,880][11215] Updated weights for policy 0, policy_version 2710 (0.0011) +[2023-02-24 13:07:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 11108352. Throughput: 0: 986.4. Samples: 2775946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:07:02,873][00205] Avg episode reward: [(0, '29.995')] +[2023-02-24 13:07:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11128832. Throughput: 0: 999.9. Samples: 2783122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:07,873][00205] Avg episode reward: [(0, '31.850')] +[2023-02-24 13:07:10,795][11215] Updated weights for policy 0, policy_version 2720 (0.0023) +[2023-02-24 13:07:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.5, 300 sec: 3929.4). Total num frames: 11145216. Throughput: 0: 969.5. Samples: 2785414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:12,872][00205] Avg episode reward: [(0, '30.840')] +[2023-02-24 13:07:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11161600. Throughput: 0: 943.5. Samples: 2789918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:07:17,873][00205] Avg episode reward: [(0, '29.945')] +[2023-02-24 13:07:21,861][11215] Updated weights for policy 0, policy_version 2730 (0.0015) +[2023-02-24 13:07:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3891.6, 300 sec: 3915.5). Total num frames: 11186176. Throughput: 0: 988.6. Samples: 2796426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:07:22,872][00205] Avg episode reward: [(0, '28.391')] +[2023-02-24 13:07:27,872][00205] Fps is (10 sec: 4504.7, 60 sec: 3822.8, 300 sec: 3915.5). Total num frames: 11206656. Throughput: 0: 988.4. Samples: 2799904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:07:27,880][00205] Avg episode reward: [(0, '28.858')] +[2023-02-24 13:07:32,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3901.6). Total num frames: 11218944. Throughput: 0: 941.9. Samples: 2805020. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:07:32,872][00205] Avg episode reward: [(0, '31.238')] +[2023-02-24 13:07:33,132][11215] Updated weights for policy 0, policy_version 2740 (0.0011) +[2023-02-24 13:07:37,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 11235328. Throughput: 0: 931.2. Samples: 2809444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:37,872][00205] Avg episode reward: [(0, '30.567')] +[2023-02-24 13:07:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11259904. Throughput: 0: 961.5. Samples: 2813068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:42,878][00205] Avg episode reward: [(0, '30.264')] +[2023-02-24 13:07:42,998][11215] Updated weights for policy 0, policy_version 2750 (0.0012) +[2023-02-24 13:07:47,875][00205] Fps is (10 sec: 4912.7, 60 sec: 3822.6, 300 sec: 3915.4). Total num frames: 11284480. Throughput: 0: 986.8. Samples: 2820358. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:47,880][00205] Avg episode reward: [(0, '31.168')] +[2023-02-24 13:07:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11296768. Throughput: 0: 935.2. Samples: 2825206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:07:52,874][00205] Avg episode reward: [(0, '32.204')] +[2023-02-24 13:07:54,363][11215] Updated weights for policy 0, policy_version 2760 (0.0034) +[2023-02-24 13:07:57,870][00205] Fps is (10 sec: 3278.5, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11317248. Throughput: 0: 934.0. Samples: 2827446. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:57,872][00205] Avg episode reward: [(0, '33.362')] +[2023-02-24 13:08:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11337728. Throughput: 0: 978.8. Samples: 2833964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:02,878][00205] Avg episode reward: [(0, '33.448')] +[2023-02-24 13:08:03,838][11215] Updated weights for policy 0, policy_version 2770 (0.0017) +[2023-02-24 13:08:07,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11362304. Throughput: 0: 987.8. Samples: 2840876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:08:07,880][00205] Avg episode reward: [(0, '32.975')] +[2023-02-24 13:08:12,872][00205] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3901.6). Total num frames: 11374592. Throughput: 0: 960.1. Samples: 2843108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:12,880][00205] Avg episode reward: [(0, '32.114')] +[2023-02-24 13:08:15,661][11215] Updated weights for policy 0, policy_version 2780 (0.0012) +[2023-02-24 13:08:17,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11395072. Throughput: 0: 945.3. Samples: 2847558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:08:17,873][00205] Avg episode reward: [(0, '30.239')] +[2023-02-24 13:08:22,870][00205] Fps is (10 sec: 4506.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11419648. Throughput: 0: 1004.7. Samples: 2854656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:22,873][00205] Avg episode reward: [(0, '29.244')] +[2023-02-24 13:08:24,526][11215] Updated weights for policy 0, policy_version 2790 (0.0011) +[2023-02-24 13:08:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 11440128. Throughput: 0: 1002.7. Samples: 2858190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:08:27,874][00205] Avg episode reward: [(0, '28.940')] +[2023-02-24 13:08:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11452416. Throughput: 0: 953.8. Samples: 2863276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:08:32,872][00205] Avg episode reward: [(0, '27.881')] +[2023-02-24 13:08:36,720][11215] Updated weights for policy 0, policy_version 2800 (0.0016) +[2023-02-24 13:08:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11472896. Throughput: 0: 957.2. Samples: 2868282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:08:37,880][00205] Avg episode reward: [(0, '27.963')] +[2023-02-24 13:08:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11497472. Throughput: 0: 986.5. Samples: 2871838. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:42,872][00205] Avg episode reward: [(0, '29.039')] +[2023-02-24 13:08:45,237][11215] Updated weights for policy 0, policy_version 2810 (0.0018) +[2023-02-24 13:08:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.5, 300 sec: 3901.6). Total num frames: 11517952. Throughput: 0: 1002.6. Samples: 2879082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:47,876][00205] Avg episode reward: [(0, '29.905')] +[2023-02-24 13:08:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002812_11517952.pth... +[2023-02-24 13:08:48,023][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002584_10584064.pth +[2023-02-24 13:08:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11530240. Throughput: 0: 947.2. Samples: 2883500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:08:52,874][00205] Avg episode reward: [(0, '30.130')] +[2023-02-24 13:08:57,602][11215] Updated weights for policy 0, policy_version 2820 (0.0034) +[2023-02-24 13:08:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11550720. Throughput: 0: 947.8. Samples: 2885758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:57,881][00205] Avg episode reward: [(0, '31.498')] +[2023-02-24 13:09:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11575296. Throughput: 0: 999.3. Samples: 2892526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:09:02,873][00205] Avg episode reward: [(0, '30.983')] +[2023-02-24 13:09:06,160][11215] Updated weights for policy 0, policy_version 2830 (0.0021) +[2023-02-24 13:09:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11595776. Throughput: 0: 991.9. Samples: 2899292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:07,873][00205] Avg episode reward: [(0, '30.991')] +[2023-02-24 13:09:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3915.5). Total num frames: 11612160. Throughput: 0: 964.2. Samples: 2901580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:12,877][00205] Avg episode reward: [(0, '30.112')] +[2023-02-24 13:09:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11628544. Throughput: 0: 954.1. Samples: 2906212. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:17,872][00205] Avg episode reward: [(0, '28.870')] +[2023-02-24 13:09:18,305][11215] Updated weights for policy 0, policy_version 2840 (0.0025) +[2023-02-24 13:09:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11653120. Throughput: 0: 1004.3. Samples: 2913476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:09:22,872][00205] Avg episode reward: [(0, '27.218')] +[2023-02-24 13:09:27,339][11215] Updated weights for policy 0, policy_version 2850 (0.0025) +[2023-02-24 13:09:27,875][00205] Fps is (10 sec: 4503.2, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 11673600. Throughput: 0: 1003.3. Samples: 2916992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:09:27,881][00205] Avg episode reward: [(0, '26.111')] +[2023-02-24 13:09:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11689984. Throughput: 0: 950.6. Samples: 2921860. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:32,876][00205] Avg episode reward: [(0, '26.217')] +[2023-02-24 13:09:37,870][00205] Fps is (10 sec: 3278.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11706368. Throughput: 0: 971.2. Samples: 2927204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:37,881][00205] Avg episode reward: [(0, '27.113')] +[2023-02-24 13:09:38,824][11215] Updated weights for policy 0, policy_version 2860 (0.0024) +[2023-02-24 13:09:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11730944. Throughput: 0: 1000.1. Samples: 2930764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:42,878][00205] Avg episode reward: [(0, '27.195')] +[2023-02-24 13:09:47,874][00205] Fps is (10 sec: 4503.5, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 11751424. Throughput: 0: 999.8. Samples: 2937522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:09:47,883][00205] Avg episode reward: [(0, '26.724')] +[2023-02-24 13:09:48,667][11215] Updated weights for policy 0, policy_version 2870 (0.0023) +[2023-02-24 13:09:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11763712. Throughput: 0: 948.6. Samples: 2941980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:09:52,878][00205] Avg episode reward: [(0, '28.349')] +[2023-02-24 13:09:57,870][00205] Fps is (10 sec: 3278.3, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11784192. Throughput: 0: 948.8. Samples: 2944278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:09:57,876][00205] Avg episode reward: [(0, '28.480')] +[2023-02-24 13:10:00,095][11215] Updated weights for policy 0, policy_version 2880 (0.0020) +[2023-02-24 13:10:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11808768. Throughput: 0: 995.2. Samples: 2950994. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:10:02,879][00205] Avg episode reward: [(0, '29.259')] +[2023-02-24 13:10:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11829248. Throughput: 0: 972.9. Samples: 2957256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:10:07,882][00205] Avg episode reward: [(0, '28.575')] +[2023-02-24 13:10:10,845][11215] Updated weights for policy 0, policy_version 2890 (0.0022) +[2023-02-24 13:10:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11841536. Throughput: 0: 944.2. Samples: 2959476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:10:12,874][00205] Avg episode reward: [(0, '28.388')] +[2023-02-24 13:10:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11862016. Throughput: 0: 943.7. Samples: 2964328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:17,873][00205] Avg episode reward: [(0, '27.157')] +[2023-02-24 13:10:21,046][11215] Updated weights for policy 0, policy_version 2900 (0.0014) +[2023-02-24 13:10:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11886592. Throughput: 0: 984.7. Samples: 2971514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:22,874][00205] Avg episode reward: [(0, '27.272')] +[2023-02-24 13:10:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.3, 300 sec: 3887.7). Total num frames: 11902976. Throughput: 0: 980.3. Samples: 2974876. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:10:27,875][00205] Avg episode reward: [(0, '27.192')] +[2023-02-24 13:10:32,746][11215] Updated weights for policy 0, policy_version 2910 (0.0014) +[2023-02-24 13:10:32,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3822.9, 300 sec: 3887.8). Total num frames: 11919360. Throughput: 0: 928.2. Samples: 2979288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:32,874][00205] Avg episode reward: [(0, '28.060')] +[2023-02-24 13:10:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 11935744. Throughput: 0: 948.3. Samples: 2984652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:37,875][00205] Avg episode reward: [(0, '28.655')] +[2023-02-24 13:10:42,349][11215] Updated weights for policy 0, policy_version 2920 (0.0031) +[2023-02-24 13:10:42,870][00205] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 11960320. Throughput: 0: 974.4. Samples: 2988128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:42,872][00205] Avg episode reward: [(0, '29.528')] +[2023-02-24 13:10:47,875][00205] Fps is (10 sec: 4093.7, 60 sec: 3754.6, 300 sec: 3873.8). Total num frames: 11976704. Throughput: 0: 964.8. Samples: 2994414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:10:47,883][00205] Avg episode reward: [(0, '30.080')] +[2023-02-24 13:10:47,923][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002925_11980800.pth... +[2023-02-24 13:10:48,074][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002699_11055104.pth +[2023-02-24 13:10:52,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3822.7, 300 sec: 3873.8). Total num frames: 11993088. Throughput: 0: 923.6. Samples: 2998820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:52,885][00205] Avg episode reward: [(0, '29.245')] +[2023-02-24 13:10:54,979][11215] Updated weights for policy 0, policy_version 2930 (0.0023) +[2023-02-24 13:10:57,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12013568. Throughput: 0: 926.2. Samples: 3001154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:57,875][00205] Avg episode reward: [(0, '29.862')] +[2023-02-24 13:11:02,870][00205] Fps is (10 sec: 4507.4, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12038144. Throughput: 0: 979.2. Samples: 3008392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:11:02,877][00205] Avg episode reward: [(0, '29.976')] +[2023-02-24 13:11:03,367][11215] Updated weights for policy 0, policy_version 2940 (0.0012) +[2023-02-24 13:11:07,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3873.8). Total num frames: 12054528. Throughput: 0: 956.8. Samples: 3014576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:11:07,881][00205] Avg episode reward: [(0, '29.996')] +[2023-02-24 13:11:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12070912. Throughput: 0: 931.0. Samples: 3016772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:12,874][00205] Avg episode reward: [(0, '29.561')] +[2023-02-24 13:11:15,626][11215] Updated weights for policy 0, policy_version 2950 (0.0040) +[2023-02-24 13:11:17,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12091392. Throughput: 0: 949.2. Samples: 3022000. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:17,872][00205] Avg episode reward: [(0, '29.569')] +[2023-02-24 13:11:22,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12115968. Throughput: 0: 988.6. Samples: 3029138. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:11:22,880][00205] Avg episode reward: [(0, '29.785')] +[2023-02-24 13:11:24,419][11215] Updated weights for policy 0, policy_version 2960 (0.0035) +[2023-02-24 13:11:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12132352. Throughput: 0: 983.2. Samples: 3032370. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:11:27,876][00205] Avg episode reward: [(0, '30.146')] +[2023-02-24 13:11:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3873.8). Total num frames: 12148736. Throughput: 0: 944.2. Samples: 3036898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:32,873][00205] Avg episode reward: [(0, '29.465')] +[2023-02-24 13:11:36,684][11215] Updated weights for policy 0, policy_version 2970 (0.0027) +[2023-02-24 13:11:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 12169216. Throughput: 0: 965.7. Samples: 3042274. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:11:37,880][00205] Avg episode reward: [(0, '29.023')] +[2023-02-24 13:11:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12189696. Throughput: 0: 992.4. Samples: 3045810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:42,872][00205] Avg episode reward: [(0, '28.890')] +[2023-02-24 13:11:46,019][11215] Updated weights for policy 0, policy_version 2980 (0.0028) +[2023-02-24 13:11:47,871][00205] Fps is (10 sec: 4095.7, 60 sec: 3891.5, 300 sec: 3873.8). Total num frames: 12210176. Throughput: 0: 972.4. Samples: 3052150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:11:47,876][00205] Avg episode reward: [(0, '29.086')] +[2023-02-24 13:11:52,871][00205] Fps is (10 sec: 3276.6, 60 sec: 3823.2, 300 sec: 3859.9). Total num frames: 12222464. Throughput: 0: 931.8. Samples: 3056502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:11:52,873][00205] Avg episode reward: [(0, '28.779')] +[2023-02-24 13:11:57,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12242944. Throughput: 0: 936.8. Samples: 3058930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:11:57,877][00205] Avg episode reward: [(0, '28.479')] +[2023-02-24 13:11:58,062][11215] Updated weights for policy 0, policy_version 2990 (0.0019) +[2023-02-24 13:12:02,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12267520. Throughput: 0: 975.8. Samples: 3065910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:02,879][00205] Avg episode reward: [(0, '28.994')] +[2023-02-24 13:12:07,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3823.1, 300 sec: 3859.9). Total num frames: 12283904. Throughput: 0: 948.9. Samples: 3071840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:07,877][00205] Avg episode reward: [(0, '27.318')] +[2023-02-24 13:12:08,107][11215] Updated weights for policy 0, policy_version 3000 (0.0012) +[2023-02-24 13:12:12,871][00205] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3859.9). Total num frames: 12300288. Throughput: 0: 926.0. Samples: 3074042. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:12,880][00205] Avg episode reward: [(0, '27.280')] +[2023-02-24 13:12:17,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12320768. Throughput: 0: 939.7. Samples: 3079182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:12:17,874][00205] Avg episode reward: [(0, '27.394')] +[2023-02-24 13:12:19,207][11215] Updated weights for policy 0, policy_version 3010 (0.0028) +[2023-02-24 13:12:22,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3823.0, 300 sec: 3860.0). Total num frames: 12345344. Throughput: 0: 976.9. Samples: 3086234. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:22,873][00205] Avg episode reward: [(0, '27.006')] +[2023-02-24 13:12:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12361728. Throughput: 0: 968.4. Samples: 3089388. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:12:27,874][00205] Avg episode reward: [(0, '27.804')] +[2023-02-24 13:12:30,189][11215] Updated weights for policy 0, policy_version 3020 (0.0017) +[2023-02-24 13:12:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 12374016. Throughput: 0: 923.7. Samples: 3093716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:32,872][00205] Avg episode reward: [(0, '26.419')] +[2023-02-24 13:12:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 12394496. Throughput: 0: 950.6. Samples: 3099278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:37,877][00205] Avg episode reward: [(0, '26.483')] +[2023-02-24 13:12:40,594][11215] Updated weights for policy 0, policy_version 3030 (0.0017) +[2023-02-24 13:12:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12419072. Throughput: 0: 974.0. Samples: 3102758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:12:42,875][00205] Avg episode reward: [(0, '28.327')] +[2023-02-24 13:12:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 12435456. Throughput: 0: 959.6. Samples: 3109090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:47,874][00205] Avg episode reward: [(0, '28.554')] +[2023-02-24 13:12:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003036_12435456.pth... +[2023-02-24 13:12:48,048][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002812_11517952.pth +[2023-02-24 13:12:52,202][11215] Updated weights for policy 0, policy_version 3040 (0.0017) +[2023-02-24 13:12:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 12451840. Throughput: 0: 923.6. Samples: 3113400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:52,875][00205] Avg episode reward: [(0, '28.876')] +[2023-02-24 13:12:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12472320. Throughput: 0: 930.4. Samples: 3115908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:12:57,880][00205] Avg episode reward: [(0, '29.707')] +[2023-02-24 13:13:02,140][11215] Updated weights for policy 0, policy_version 3050 (0.0024) +[2023-02-24 13:13:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12492800. Throughput: 0: 966.6. Samples: 3122680. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:02,872][00205] Avg episode reward: [(0, '29.861')] +[2023-02-24 13:13:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 12509184. Throughput: 0: 929.6. Samples: 3128068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:07,878][00205] Avg episode reward: [(0, '30.553')] +[2023-02-24 13:13:12,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12525568. Throughput: 0: 906.7. Samples: 3130188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:13:12,878][00205] Avg episode reward: [(0, '30.168')] +[2023-02-24 13:13:14,790][11215] Updated weights for policy 0, policy_version 3060 (0.0019) +[2023-02-24 13:13:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12546048. Throughput: 0: 931.7. Samples: 3135642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:17,879][00205] Avg episode reward: [(0, '29.916')] +[2023-02-24 13:13:22,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12570624. Throughput: 0: 963.1. Samples: 3142616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:13:22,873][00205] Avg episode reward: [(0, '28.476')] +[2023-02-24 13:13:23,608][11215] Updated weights for policy 0, policy_version 3070 (0.0025) +[2023-02-24 13:13:27,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3846.1). Total num frames: 12587008. Throughput: 0: 950.5. Samples: 3145532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:13:27,877][00205] Avg episode reward: [(0, '27.511')] +[2023-02-24 13:13:32,871][00205] Fps is (10 sec: 2866.8, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 12599296. Throughput: 0: 904.5. Samples: 3149794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:32,876][00205] Avg episode reward: [(0, '27.058')] +[2023-02-24 13:13:36,092][11215] Updated weights for policy 0, policy_version 3080 (0.0022) +[2023-02-24 13:13:37,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12623872. Throughput: 0: 941.1. Samples: 3155750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:13:37,872][00205] Avg episode reward: [(0, '27.133')] +[2023-02-24 13:13:42,870][00205] Fps is (10 sec: 4506.2, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12644352. Throughput: 0: 964.2. Samples: 3159296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:42,881][00205] Avg episode reward: [(0, '27.275')] +[2023-02-24 13:13:45,463][11215] Updated weights for policy 0, policy_version 3090 (0.0025) +[2023-02-24 13:13:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12660736. Throughput: 0: 945.8. Samples: 3165240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:47,874][00205] Avg episode reward: [(0, '26.354')] +[2023-02-24 13:13:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12677120. Throughput: 0: 925.4. Samples: 3169712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:13:52,877][00205] Avg episode reward: [(0, '25.739')] +[2023-02-24 13:13:57,098][11215] Updated weights for policy 0, policy_version 3100 (0.0019) +[2023-02-24 13:13:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 12697600. Throughput: 0: 943.2. Samples: 3172630. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:13:57,879][00205] Avg episode reward: [(0, '25.633')] +[2023-02-24 13:14:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12722176. Throughput: 0: 977.0. Samples: 3179608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:14:02,877][00205] Avg episode reward: [(0, '27.165')] +[2023-02-24 13:14:07,380][11215] Updated weights for policy 0, policy_version 3110 (0.0033) +[2023-02-24 13:14:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12738560. Throughput: 0: 941.1. Samples: 3184964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:14:07,875][00205] Avg episode reward: [(0, '25.456')] +[2023-02-24 13:14:12,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 12750848. Throughput: 0: 925.6. Samples: 3187186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:14:12,881][00205] Avg episode reward: [(0, '25.933')] +[2023-02-24 13:14:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12775424. Throughput: 0: 960.3. Samples: 3193004. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:14:17,872][00205] Avg episode reward: [(0, '26.471')] +[2023-02-24 13:14:18,152][11215] Updated weights for policy 0, policy_version 3120 (0.0019) +[2023-02-24 13:14:22,870][00205] Fps is (10 sec: 4915.5, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 12800000. Throughput: 0: 986.0. Samples: 3200122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:14:22,874][00205] Avg episode reward: [(0, '27.418')] +[2023-02-24 13:14:27,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3822.6, 300 sec: 3818.2). Total num frames: 12816384. Throughput: 0: 965.5. Samples: 3202750. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:14:27,883][00205] Avg episode reward: [(0, '27.568')] +[2023-02-24 13:14:28,893][11215] Updated weights for policy 0, policy_version 3130 (0.0023) +[2023-02-24 13:14:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 12828672. Throughput: 0: 932.2. Samples: 3207188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:14:32,875][00205] Avg episode reward: [(0, '26.551')] +[2023-02-24 13:14:37,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12853248. Throughput: 0: 975.5. Samples: 3213610. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:14:37,872][00205] Avg episode reward: [(0, '26.103')] +[2023-02-24 13:14:39,206][11215] Updated weights for policy 0, policy_version 3140 (0.0014) +[2023-02-24 13:14:42,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3891.2, 300 sec: 3818.4). Total num frames: 12877824. Throughput: 0: 990.2. Samples: 3217188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:14:42,872][00205] Avg episode reward: [(0, '25.734')] +[2023-02-24 13:14:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 12894208. Throughput: 0: 965.4. Samples: 3223052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:14:47,872][00205] Avg episode reward: [(0, '25.618')] +[2023-02-24 13:14:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003148_12894208.pth... 
+[2023-02-24 13:14:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002925_11980800.pth
+[2023-02-24 13:14:50,425][11215] Updated weights for policy 0, policy_version 3150 (0.0013)
+[2023-02-24 13:14:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12906496. Throughput: 0: 947.2. Samples: 3227590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-02-24 13:14:52,873][00205] Avg episode reward: [(0, '25.574')]
+[2023-02-24 13:14:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 12931072. Throughput: 0: 968.8. Samples: 3230782. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-02-24 13:14:57,872][00205] Avg episode reward: [(0, '25.325')]
+[2023-02-24 13:15:00,192][11215] Updated weights for policy 0, policy_version 3160 (0.0023)
+[2023-02-24 13:15:02,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 12955648. Throughput: 0: 988.6. Samples: 3237490. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2023-02-24 13:15:02,872][00205] Avg episode reward: [(0, '27.306')]
+[2023-02-24 13:15:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12967936. Throughput: 0: 951.6. Samples: 3242942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:15:07,876][00205] Avg episode reward: [(0, '28.133')]
+[2023-02-24 13:15:12,154][11215] Updated weights for policy 0, policy_version 3170 (0.0011)
+[2023-02-24 13:15:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.3, 300 sec: 3804.4). Total num frames: 12984320. Throughput: 0: 943.1. Samples: 3245184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:15:12,879][00205] Avg episode reward: [(0, '28.380')]
+[2023-02-24 13:15:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 13008896. Throughput: 0: 981.4. Samples: 3251352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:15:17,879][00205] Avg episode reward: [(0, '29.088')]
+[2023-02-24 13:15:20,814][11215] Updated weights for policy 0, policy_version 3180 (0.0021)
+[2023-02-24 13:15:22,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13033472. Throughput: 0: 998.4. Samples: 3258536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:15:22,873][00205] Avg episode reward: [(0, '31.138')]
+[2023-02-24 13:15:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.6, 300 sec: 3832.2). Total num frames: 13049856. Throughput: 0: 974.3. Samples: 3261032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:15:27,873][00205] Avg episode reward: [(0, '30.174')]
+[2023-02-24 13:15:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13062144. Throughput: 0: 944.3. Samples: 3265546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-02-24 13:15:32,873][00205] Avg episode reward: [(0, '28.223')]
+[2023-02-24 13:15:33,036][11215] Updated weights for policy 0, policy_version 3190 (0.0014)
+[2023-02-24 13:15:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13086720. Throughput: 0: 992.4. Samples: 3272246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2023-02-24 13:15:37,873][00205] Avg episode reward: [(0, '29.218')]
+[2023-02-24 13:15:41,379][11215] Updated weights for policy 0, policy_version 3200 (0.0024)
+[2023-02-24 13:15:42,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13111296. Throughput: 0: 1001.8. Samples: 3275864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:15:42,878][00205] Avg episode reward: [(0, '28.585')]
+[2023-02-24 13:15:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13127680. Throughput: 0: 978.6. Samples: 3281528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:15:47,876][00205] Avg episode reward: [(0, '28.434')]
+[2023-02-24 13:15:52,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 13144064. Throughput: 0: 958.6. Samples: 3286080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:15:52,875][00205] Avg episode reward: [(0, '27.921')]
+[2023-02-24 13:15:53,572][11215] Updated weights for policy 0, policy_version 3210 (0.0019)
+[2023-02-24 13:15:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13164544. Throughput: 0: 983.6. Samples: 3289448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:15:57,880][00205] Avg episode reward: [(0, '28.219')]
+[2023-02-24 13:16:02,336][11215] Updated weights for policy 0, policy_version 3220 (0.0019)
+[2023-02-24 13:16:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13189120. Throughput: 0: 1001.8. Samples: 3296434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:02,876][00205] Avg episode reward: [(0, '26.576')]
+[2023-02-24 13:16:07,876][00205] Fps is (10 sec: 4093.3, 60 sec: 3959.1, 300 sec: 3846.0). Total num frames: 13205504. Throughput: 0: 955.4. Samples: 3301536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-02-24 13:16:07,879][00205] Avg episode reward: [(0, '26.522')]
+[2023-02-24 13:16:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 13221888. Throughput: 0: 949.9. Samples: 3303776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:12,873][00205] Avg episode reward: [(0, '27.093')]
+[2023-02-24 13:16:14,616][11215] Updated weights for policy 0, policy_version 3230 (0.0030)
+[2023-02-24 13:16:17,870][00205] Fps is (10 sec: 3688.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13242368. Throughput: 0: 988.4. Samples: 3310022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-02-24 13:16:17,872][00205] Avg episode reward: [(0, '26.161')]
+[2023-02-24 13:16:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13266944. Throughput: 0: 999.3. Samples: 3317214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:22,877][00205] Avg episode reward: [(0, '27.452')]
+[2023-02-24 13:16:23,608][11215] Updated weights for policy 0, policy_version 3240 (0.0011)
+[2023-02-24 13:16:27,878][00205] Fps is (10 sec: 4092.8, 60 sec: 3890.7, 300 sec: 3846.0). Total num frames: 13283328. Throughput: 0: 968.9. Samples: 3319472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:27,881][00205] Avg episode reward: [(0, '27.243')]
+[2023-02-24 13:16:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13295616. Throughput: 0: 943.2. Samples: 3323972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:32,873][00205] Avg episode reward: [(0, '27.434')]
+[2023-02-24 13:16:35,436][11215] Updated weights for policy 0, policy_version 3250 (0.0025)
+[2023-02-24 13:16:37,870][00205] Fps is (10 sec: 3689.2, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13320192. Throughput: 0: 990.8. Samples: 3330666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:16:37,872][00205] Avg episode reward: [(0, '28.732')]
+[2023-02-24 13:16:42,873][00205] Fps is (10 sec: 4913.5, 60 sec: 3891.0, 300 sec: 3846.0). Total num frames: 13344768. Throughput: 0: 995.5. Samples: 3334250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:42,877][00205] Avg episode reward: [(0, '29.226')]
+[2023-02-24 13:16:44,956][11215] Updated weights for policy 0, policy_version 3260 (0.0011)
+[2023-02-24 13:16:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 13361152. Throughput: 0: 960.8. Samples: 3339670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:16:47,872][00205] Avg episode reward: [(0, '30.104')]
+[2023-02-24 13:16:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003262_13361152.pth...
+[2023-02-24 13:16:48,050][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003036_12435456.pth
+[2023-02-24 13:16:52,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13377536. Throughput: 0: 948.0. Samples: 3344192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-02-24 13:16:52,879][00205] Avg episode reward: [(0, '29.504')]
+[2023-02-24 13:16:56,264][11215] Updated weights for policy 0, policy_version 3270 (0.0012)
+[2023-02-24 13:16:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13398016. Throughput: 0: 975.4. Samples: 3347668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:16:57,872][00205] Avg episode reward: [(0, '28.290')]
+[2023-02-24 13:17:02,871][00205] Fps is (10 sec: 4505.0, 60 sec: 3891.1, 300 sec: 3860.0). Total num frames: 13422592. Throughput: 0: 992.6. Samples: 3354692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:17:02,875][00205] Avg episode reward: [(0, '29.657')]
+[2023-02-24 13:17:06,777][11215] Updated weights for policy 0, policy_version 3280 (0.0015)
+[2023-02-24 13:17:07,873][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.3, 300 sec: 3846.1). Total num frames: 13434880. Throughput: 0: 941.9. Samples: 3359600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:17:07,882][00205] Avg episode reward: [(0, '29.137')]
+[2023-02-24 13:17:12,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 13451264. Throughput: 0: 943.9. Samples: 3361942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:17:12,872][00205] Avg episode reward: [(0, '28.440')]
+[2023-02-24 13:17:17,438][11215] Updated weights for policy 0, policy_version 3290 (0.0018)
+[2023-02-24 13:17:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13475840. Throughput: 0: 982.1. Samples: 3368166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:17:17,872][00205] Avg episode reward: [(0, '27.276')]
+[2023-02-24 13:17:22,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 13500416. Throughput: 0: 993.2. Samples: 3375360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:17:22,872][00205] Avg episode reward: [(0, '26.586')]
+[2023-02-24 13:17:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.4, 300 sec: 3860.0). Total num frames: 13512704. Throughput: 0: 962.5. Samples: 3377558.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:17:27,873][00205] Avg episode reward: [(0, '26.672')] +[2023-02-24 13:17:28,242][11215] Updated weights for policy 0, policy_version 3300 (0.0015) +[2023-02-24 13:17:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13529088. Throughput: 0: 934.9. Samples: 3381740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:32,874][00205] Avg episode reward: [(0, '25.957')] +[2023-02-24 13:17:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 13549568. Throughput: 0: 976.0. Samples: 3388112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:17:37,875][00205] Avg episode reward: [(0, '26.734')] +[2023-02-24 13:17:38,777][11215] Updated weights for policy 0, policy_version 3310 (0.0016) +[2023-02-24 13:17:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3823.1, 300 sec: 3860.0). Total num frames: 13574144. Throughput: 0: 975.6. Samples: 3391572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:42,875][00205] Avg episode reward: [(0, '27.235')] +[2023-02-24 13:17:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13586432. Throughput: 0: 935.1. Samples: 3396768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:47,872][00205] Avg episode reward: [(0, '26.936')] +[2023-02-24 13:17:50,851][11215] Updated weights for policy 0, policy_version 3320 (0.0019) +[2023-02-24 13:17:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13606912. Throughput: 0: 928.7. Samples: 3401392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:52,871][00205] Avg episode reward: [(0, '27.285')] +[2023-02-24 13:17:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13627392. Throughput: 0: 955.6. Samples: 3404946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:17:57,879][00205] Avg episode reward: [(0, '28.580')] +[2023-02-24 13:17:59,878][11215] Updated weights for policy 0, policy_version 3330 (0.0019) +[2023-02-24 13:18:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3860.0). Total num frames: 13647872. Throughput: 0: 976.7. Samples: 3412118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:18:02,873][00205] Avg episode reward: [(0, '28.180')] +[2023-02-24 13:18:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 13664256. Throughput: 0: 918.7. Samples: 3416702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:07,877][00205] Avg episode reward: [(0, '27.651')] +[2023-02-24 13:18:12,307][11215] Updated weights for policy 0, policy_version 3340 (0.0041) +[2023-02-24 13:18:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13680640. Throughput: 0: 918.0. Samples: 3418870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:12,873][00205] Avg episode reward: [(0, '28.435')] +[2023-02-24 13:18:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13705216. Throughput: 0: 967.1. Samples: 3425260. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:17,873][00205] Avg episode reward: [(0, '29.291')] +[2023-02-24 13:18:20,960][11215] Updated weights for policy 0, policy_version 3350 (0.0017) +[2023-02-24 13:18:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13725696. 
Throughput: 0: 975.4. Samples: 3432004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:22,876][00205] Avg episode reward: [(0, '29.159')] +[2023-02-24 13:18:27,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3873.9). Total num frames: 13742080. Throughput: 0: 946.8. Samples: 3434176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:27,878][00205] Avg episode reward: [(0, '28.704')] +[2023-02-24 13:18:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13758464. Throughput: 0: 929.7. Samples: 3438606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:18:32,879][00205] Avg episode reward: [(0, '29.137')] +[2023-02-24 13:18:33,539][11215] Updated weights for policy 0, policy_version 3360 (0.0020) +[2023-02-24 13:18:37,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13778944. Throughput: 0: 975.9. Samples: 3445306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:37,873][00205] Avg episode reward: [(0, '28.880')] +[2023-02-24 13:18:42,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13799424. Throughput: 0: 975.5. Samples: 3448844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:42,873][00205] Avg episode reward: [(0, '29.770')] +[2023-02-24 13:18:43,020][11215] Updated weights for policy 0, policy_version 3370 (0.0013) +[2023-02-24 13:18:47,872][00205] Fps is (10 sec: 3685.8, 60 sec: 3822.8, 300 sec: 3859.9). Total num frames: 13815808. Throughput: 0: 923.1. Samples: 3453658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:47,877][00205] Avg episode reward: [(0, '30.464')] +[2023-02-24 13:18:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003373_13815808.pth... +[2023-02-24 13:18:48,036][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003148_12894208.pth +[2023-02-24 13:18:52,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13832192. Throughput: 0: 928.3. Samples: 3458474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:52,873][00205] Avg episode reward: [(0, '30.365')] +[2023-02-24 13:18:54,924][11215] Updated weights for policy 0, policy_version 3380 (0.0023) +[2023-02-24 13:18:57,870][00205] Fps is (10 sec: 4096.7, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13856768. Throughput: 0: 957.3. Samples: 3461948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:57,879][00205] Avg episode reward: [(0, '31.903')] +[2023-02-24 13:19:02,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 13877248. Throughput: 0: 968.5. Samples: 3468844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:19:02,880][00205] Avg episode reward: [(0, '30.509')] +[2023-02-24 13:19:05,667][11215] Updated weights for policy 0, policy_version 3390 (0.0014) +[2023-02-24 13:19:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13889536. Throughput: 0: 913.1. Samples: 3473092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:19:07,876][00205] Avg episode reward: [(0, '30.280')] +[2023-02-24 13:19:12,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13910016. Throughput: 0: 915.0. Samples: 3475350. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:19:12,878][00205] Avg episode reward: [(0, '28.337')] +[2023-02-24 13:19:16,419][11215] Updated weights for policy 0, policy_version 3400 (0.0013) +[2023-02-24 13:19:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 13930496. Throughput: 0: 961.9. Samples: 3481890. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:17,879][00205] Avg episode reward: [(0, '27.576')] +[2023-02-24 13:19:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13950976. Throughput: 0: 952.9. Samples: 3488186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:22,873][00205] Avg episode reward: [(0, '26.552')] +[2023-02-24 13:19:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 13963264. Throughput: 0: 922.4. Samples: 3490352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:19:27,876][00205] Avg episode reward: [(0, '25.387')] +[2023-02-24 13:19:28,281][11215] Updated weights for policy 0, policy_version 3410 (0.0023) +[2023-02-24 13:19:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 13983744. Throughput: 0: 915.1. Samples: 3494838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:19:32,875][00205] Avg episode reward: [(0, '25.694')] +[2023-02-24 13:19:37,787][11215] Updated weights for policy 0, policy_version 3420 (0.0014) +[2023-02-24 13:19:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 14008320. Throughput: 0: 959.7. Samples: 3501662. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) +[2023-02-24 13:19:37,879][00205] Avg episode reward: [(0, '27.661')] +[2023-02-24 13:19:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 14024704. Throughput: 0: 960.4. Samples: 3505164. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:19:42,874][00205] Avg episode reward: [(0, '27.639')] +[2023-02-24 13:19:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 14041088. Throughput: 0: 909.3. Samples: 3509764. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 13:19:47,875][00205] Avg episode reward: [(0, '29.394')] +[2023-02-24 13:19:50,471][11215] Updated weights for policy 0, policy_version 3430 (0.0027) +[2023-02-24 13:19:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14057472. Throughput: 0: 929.2. Samples: 3514906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:52,872][00205] Avg episode reward: [(0, '29.038')] +[2023-02-24 13:19:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14082048. Throughput: 0: 958.0. Samples: 3518462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:57,873][00205] Avg episode reward: [(0, '28.238')] +[2023-02-24 13:19:59,275][11215] Updated weights for policy 0, policy_version 3440 (0.0022) +[2023-02-24 13:20:02,872][00205] Fps is (10 sec: 4504.6, 60 sec: 3754.5, 300 sec: 3846.0). Total num frames: 14102528. Throughput: 0: 955.6. Samples: 3524892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:20:02,877][00205] Avg episode reward: [(0, '30.013')] +[2023-02-24 13:20:07,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3832.2). Total num frames: 14114816. Throughput: 0: 910.3. Samples: 3529152. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:20:07,874][00205] Avg episode reward: [(0, '26.897')] +[2023-02-24 13:20:11,975][11215] Updated weights for policy 0, policy_version 3450 (0.0021) +[2023-02-24 13:20:12,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14135296. Throughput: 0: 911.4. Samples: 3531366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:12,872][00205] Avg episode reward: [(0, '26.146')] +[2023-02-24 13:20:17,870][00205] Fps is (10 sec: 4096.5, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14155776. Throughput: 0: 966.9. Samples: 3538348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:20:17,872][00205] Avg episode reward: [(0, '25.074')] +[2023-02-24 13:20:20,670][11215] Updated weights for policy 0, policy_version 3460 (0.0011) +[2023-02-24 13:20:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14176256. Throughput: 0: 956.0. Samples: 3544684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:22,874][00205] Avg episode reward: [(0, '25.799')] +[2023-02-24 13:20:27,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 14192640. Throughput: 0: 926.8. Samples: 3546870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:27,875][00205] Avg episode reward: [(0, '27.608')] +[2023-02-24 13:20:32,831][11215] Updated weights for policy 0, policy_version 3470 (0.0030) +[2023-02-24 13:20:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 14213120. Throughput: 0: 933.2. Samples: 3551758. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:20:32,872][00205] Avg episode reward: [(0, '28.704')] +[2023-02-24 13:20:37,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14233600. Throughput: 0: 975.5. Samples: 3558802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:20:37,878][00205] Avg episode reward: [(0, '30.275')] +[2023-02-24 13:20:42,548][11215] Updated weights for policy 0, policy_version 3480 (0.0024) +[2023-02-24 13:20:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 14254080. Throughput: 0: 973.6. Samples: 3562272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:20:42,872][00205] Avg episode reward: [(0, '29.947')] +[2023-02-24 13:20:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14266368. Throughput: 0: 927.9. Samples: 3566644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:20:47,873][00205] Avg episode reward: [(0, '31.964')] +[2023-02-24 13:20:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003483_14266368.pth... +[2023-02-24 13:20:48,049][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003262_13361152.pth +[2023-02-24 13:20:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14286848. Throughput: 0: 952.0. Samples: 3571990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:20:52,872][00205] Avg episode reward: [(0, '32.424')] +[2023-02-24 13:20:54,231][11215] Updated weights for policy 0, policy_version 3490 (0.0026) +[2023-02-24 13:20:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14311424. Throughput: 0: 979.7. Samples: 3575454. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:57,875][00205] Avg episode reward: [(0, '31.058')] +[2023-02-24 13:21:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3804.5). Total num frames: 14327808. Throughput: 0: 969.7. Samples: 3581984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:21:02,878][00205] Avg episode reward: [(0, '31.370')] +[2023-02-24 13:21:04,717][11215] Updated weights for policy 0, policy_version 3500 (0.0030) +[2023-02-24 13:21:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 14344192. Throughput: 0: 923.7. Samples: 3586250. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:07,875][00205] Avg episode reward: [(0, '30.257')] +[2023-02-24 13:21:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14360576. Throughput: 0: 925.1. Samples: 3588500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:12,879][00205] Avg episode reward: [(0, '30.011')] +[2023-02-24 13:21:15,474][11215] Updated weights for policy 0, policy_version 3510 (0.0020) +[2023-02-24 13:21:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14385152. Throughput: 0: 973.6. Samples: 3595572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:17,876][00205] Avg episode reward: [(0, '29.541')] +[2023-02-24 13:21:22,872][00205] Fps is (10 sec: 4504.5, 60 sec: 3822.8, 300 sec: 3804.5). Total num frames: 14405632. Throughput: 0: 955.8. Samples: 3601814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:21:22,877][00205] Avg episode reward: [(0, '28.882')] +[2023-02-24 13:21:26,956][11215] Updated weights for policy 0, policy_version 3520 (0.0026) +[2023-02-24 13:21:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14417920. Throughput: 0: 926.4. Samples: 3603962. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:21:27,872][00205] Avg episode reward: [(0, '29.000')] +[2023-02-24 13:21:32,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14438400. Throughput: 0: 935.4. Samples: 3608738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:32,873][00205] Avg episode reward: [(0, '28.622')] +[2023-02-24 13:21:36,772][11215] Updated weights for policy 0, policy_version 3530 (0.0026) +[2023-02-24 13:21:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 14462976. Throughput: 0: 972.3. Samples: 3615742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:37,873][00205] Avg episode reward: [(0, '28.415')] +[2023-02-24 13:21:42,877][00205] Fps is (10 sec: 4093.0, 60 sec: 3754.2, 300 sec: 3790.4). Total num frames: 14479360. Throughput: 0: 972.9. Samples: 3619240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:42,884][00205] Avg episode reward: [(0, '26.604')] +[2023-02-24 13:21:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14495744. Throughput: 0: 922.6. Samples: 3623500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:47,873][00205] Avg episode reward: [(0, '28.410')] +[2023-02-24 13:21:48,978][11215] Updated weights for policy 0, policy_version 3540 (0.0028) +[2023-02-24 13:21:52,870][00205] Fps is (10 sec: 3689.1, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14516224. Throughput: 0: 948.1. Samples: 3628914. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:21:52,872][00205] Avg episode reward: [(0, '28.067')] +[2023-02-24 13:21:57,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3776.7). Total num frames: 14536704. Throughput: 0: 976.2. Samples: 3632428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:57,873][00205] Avg episode reward: [(0, '28.859')] +[2023-02-24 13:21:58,067][11215] Updated weights for policy 0, policy_version 3550 (0.0013) +[2023-02-24 13:22:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14557184. Throughput: 0: 959.7. Samples: 3638758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:02,871][00205] Avg episode reward: [(0, '29.466')] +[2023-02-24 13:22:07,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14569472. Throughput: 0: 917.2. Samples: 3643084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:07,873][00205] Avg episode reward: [(0, '29.846')] +[2023-02-24 13:22:10,606][11215] Updated weights for policy 0, policy_version 3560 (0.0020) +[2023-02-24 13:22:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14589952. Throughput: 0: 922.3. Samples: 3645464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:12,872][00205] Avg episode reward: [(0, '30.779')] +[2023-02-24 13:22:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14614528. Throughput: 0: 971.7. Samples: 3652466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:22:17,872][00205] Avg episode reward: [(0, '30.570')] +[2023-02-24 13:22:19,483][11215] Updated weights for policy 0, policy_version 3570 (0.0020) +[2023-02-24 13:22:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 14630912. Throughput: 0: 947.1. Samples: 3658362. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:22:22,875][00205] Avg episode reward: [(0, '30.270')] +[2023-02-24 13:22:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14647296. Throughput: 0: 917.7. Samples: 3660532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:22:27,879][00205] Avg episode reward: [(0, '30.798')] +[2023-02-24 13:22:31,857][11215] Updated weights for policy 0, policy_version 3580 (0.0017) +[2023-02-24 13:22:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14667776. Throughput: 0: 940.0. Samples: 3665800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:32,879][00205] Avg episode reward: [(0, '28.957')] +[2023-02-24 13:22:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14692352. Throughput: 0: 974.2. Samples: 3672752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:22:37,877][00205] Avg episode reward: [(0, '28.236')] +[2023-02-24 13:22:41,102][11215] Updated weights for policy 0, policy_version 3590 (0.0026) +[2023-02-24 13:22:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.4, 300 sec: 3804.4). Total num frames: 14708736. Throughput: 0: 967.5. Samples: 3675964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:22:42,877][00205] Avg episode reward: [(0, '27.926')] +[2023-02-24 13:22:47,874][00205] Fps is (10 sec: 2866.0, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 14721024. Throughput: 0: 925.2. Samples: 3680394. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:47,879][00205] Avg episode reward: [(0, '27.837')] +[2023-02-24 13:22:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003594_14721024.pth... +[2023-02-24 13:22:48,060][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003373_13815808.pth +[2023-02-24 13:22:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 14741504. Throughput: 0: 953.4. Samples: 3685988. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:52,872][00205] Avg episode reward: [(0, '27.800')] +[2023-02-24 13:22:53,026][11215] Updated weights for policy 0, policy_version 3600 (0.0021) +[2023-02-24 13:22:57,870][00205] Fps is (10 sec: 4507.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 14766080. Throughput: 0: 978.2. Samples: 3689482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:57,879][00205] Avg episode reward: [(0, '28.941')] +[2023-02-24 13:23:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14782464. Throughput: 0: 961.3. Samples: 3695726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:23:02,875][00205] Avg episode reward: [(0, '28.939')] +[2023-02-24 13:23:03,086][11215] Updated weights for policy 0, policy_version 3610 (0.0011) +[2023-02-24 13:23:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14798848. Throughput: 0: 925.8. Samples: 3700024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:07,873][00205] Avg episode reward: [(0, '29.466')] +[2023-02-24 13:23:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14819328. Throughput: 0: 934.0. Samples: 3702560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:23:12,872][00205] Avg episode reward: [(0, '30.402')] +[2023-02-24 13:23:14,294][11215] Updated weights for policy 0, policy_version 3620 (0.0024) +[2023-02-24 13:23:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14843904. Throughput: 0: 972.8. Samples: 3709576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:23:17,872][00205] Avg episode reward: [(0, '31.584')] +[2023-02-24 13:23:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14860288. Throughput: 0: 947.2. Samples: 3715374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:22,873][00205] Avg episode reward: [(0, '31.972')] +[2023-02-24 13:23:25,541][11215] Updated weights for policy 0, policy_version 3630 (0.0030) +[2023-02-24 13:23:27,870][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 14872576. Throughput: 0: 922.3. Samples: 3717468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:23:27,878][00205] Avg episode reward: [(0, '31.147')] +[2023-02-24 13:23:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 14893056. Throughput: 0: 940.1. Samples: 3722696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:32,879][00205] Avg episode reward: [(0, '30.206')] +[2023-02-24 13:23:35,677][11215] Updated weights for policy 0, policy_version 3640 (0.0013) +[2023-02-24 13:23:37,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14917632. Throughput: 0: 972.9. Samples: 3729768. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:37,872][00205] Avg episode reward: [(0, '29.361')] +[2023-02-24 13:23:42,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 14934016. Throughput: 0: 962.9. Samples: 3732814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:42,874][00205] Avg episode reward: [(0, '29.718')] +[2023-02-24 13:23:47,828][11215] Updated weights for policy 0, policy_version 3650 (0.0019) +[2023-02-24 13:23:47,875][00205] Fps is (10 sec: 3275.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14950400. Throughput: 0: 916.6. Samples: 3736978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:47,882][00205] Avg episode reward: [(0, '29.596')] +[2023-02-24 13:23:52,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14970880. Throughput: 0: 950.8. Samples: 3742810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:23:52,878][00205] Avg episode reward: [(0, '27.900')] +[2023-02-24 13:23:56,875][11215] Updated weights for policy 0, policy_version 3660 (0.0011) +[2023-02-24 13:23:57,870][00205] Fps is (10 sec: 4507.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14995456. Throughput: 0: 972.5. Samples: 3746322. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:23:57,872][00205] Avg episode reward: [(0, '28.600')] +[2023-02-24 13:24:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15011840. Throughput: 0: 952.3. Samples: 3752430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:02,877][00205] Avg episode reward: [(0, '29.330')] +[2023-02-24 13:24:07,873][00205] Fps is (10 sec: 2866.3, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 15024128. Throughput: 0: 917.5. Samples: 3756664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:07,878][00205] Avg episode reward: [(0, '29.552')] +[2023-02-24 13:24:09,391][11215] Updated weights for policy 0, policy_version 3670 (0.0020) +[2023-02-24 13:24:12,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 15044608. Throughput: 0: 936.1. Samples: 3759594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:12,877][00205] Avg episode reward: [(0, '30.132')] +[2023-02-24 13:24:17,870][00205] Fps is (10 sec: 4506.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15069184. Throughput: 0: 975.4. Samples: 3766590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:17,874][00205] Avg episode reward: [(0, '30.715')] +[2023-02-24 13:24:18,083][11215] Updated weights for policy 0, policy_version 3680 (0.0022) +[2023-02-24 13:24:22,873][00205] Fps is (10 sec: 4096.4, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 15085568. Throughput: 0: 940.3. Samples: 3772084. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:24:22,876][00205] Avg episode reward: [(0, '31.799')] +[2023-02-24 13:24:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 15101952. Throughput: 0: 921.7. Samples: 3774290. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:24:27,873][00205] Avg episode reward: [(0, '31.516')] +[2023-02-24 13:24:30,543][11215] Updated weights for policy 0, policy_version 3690 (0.0024) +[2023-02-24 13:24:32,870][00205] Fps is (10 sec: 3687.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 15122432. Throughput: 0: 954.7. Samples: 3779936. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:32,873][00205] Avg episode reward: [(0, '31.794')] +[2023-02-24 13:24:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15147008. Throughput: 0: 982.4. Samples: 3787018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:37,876][00205] Avg episode reward: [(0, '30.642')] +[2023-02-24 13:24:39,576][11215] Updated weights for policy 0, policy_version 3700 (0.0017) +[2023-02-24 13:24:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 15163392. Throughput: 0: 965.6. Samples: 3789772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:42,877][00205] Avg episode reward: [(0, '28.464')] +[2023-02-24 13:24:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3755.0, 300 sec: 3790.5). Total num frames: 15175680. Throughput: 0: 923.3. Samples: 3793980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:47,878][00205] Avg episode reward: [(0, '28.684')] +[2023-02-24 13:24:47,897][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth... +[2023-02-24 13:24:48,079][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003483_14266368.pth +[2023-02-24 13:24:51,750][11215] Updated weights for policy 0, policy_version 3710 (0.0020) +[2023-02-24 13:24:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15200256. Throughput: 0: 963.0. Samples: 3799994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:52,873][00205] Avg episode reward: [(0, '27.226')] +[2023-02-24 13:24:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 15220736. Throughput: 0: 975.4. Samples: 3803482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:24:57,875][00205] Avg episode reward: [(0, '27.058')] +[2023-02-24 13:25:02,409][11215] Updated weights for policy 0, policy_version 3720 (0.0028) +[2023-02-24 13:25:02,873][00205] Fps is (10 sec: 3685.3, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 15237120. Throughput: 0: 941.8. Samples: 3808976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:25:02,879][00205] Avg episode reward: [(0, '27.016')] +[2023-02-24 13:25:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.9, 300 sec: 3776.7). Total num frames: 15249408. Throughput: 0: 916.3. Samples: 3813316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:07,879][00205] Avg episode reward: [(0, '26.833')] +[2023-02-24 13:25:12,870][00205] Fps is (10 sec: 3687.5, 60 sec: 3823.2, 300 sec: 3790.5). Total num frames: 15273984. Throughput: 0: 934.8. Samples: 3816354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:12,873][00205] Avg episode reward: [(0, '28.488')] +[2023-02-24 13:25:13,510][11215] Updated weights for policy 0, policy_version 3730 (0.0017) +[2023-02-24 13:25:17,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15298560. Throughput: 0: 961.5. Samples: 3823204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:25:17,873][00205] Avg episode reward: [(0, '29.344')] +[2023-02-24 13:25:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 15310848. Throughput: 0: 923.4. Samples: 3828570. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:25:22,874][00205] Avg episode reward: [(0, '30.927')] +[2023-02-24 13:25:24,623][11215] Updated weights for policy 0, policy_version 3740 (0.0016) +[2023-02-24 13:25:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15327232. Throughput: 0: 910.9. Samples: 3830762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:27,873][00205] Avg episode reward: [(0, '30.591')] +[2023-02-24 13:25:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15351808. Throughput: 0: 946.9. Samples: 3836590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:25:32,872][00205] Avg episode reward: [(0, '30.033')] +[2023-02-24 13:25:34,539][11215] Updated weights for policy 0, policy_version 3750 (0.0017) +[2023-02-24 13:25:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15372288. Throughput: 0: 968.7. Samples: 3843584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:25:37,879][00205] Avg episode reward: [(0, '29.382')] +[2023-02-24 13:25:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15388672. Throughput: 0: 949.6. Samples: 3846214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:42,872][00205] Avg episode reward: [(0, '29.231')] +[2023-02-24 13:25:46,555][11215] Updated weights for policy 0, policy_version 3760 (0.0030) +[2023-02-24 13:25:47,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15405056. Throughput: 0: 924.2. Samples: 3850564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:25:47,879][00205] Avg episode reward: [(0, '28.233')] +[2023-02-24 13:25:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15425536. Throughput: 0: 968.5. Samples: 3856900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:52,879][00205] Avg episode reward: [(0, '28.024')] +[2023-02-24 13:25:55,699][11215] Updated weights for policy 0, policy_version 3770 (0.0011) +[2023-02-24 13:25:57,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15450112. Throughput: 0: 979.4. Samples: 3860428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:57,878][00205] Avg episode reward: [(0, '28.417')] +[2023-02-24 13:26:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 15466496. Throughput: 0: 952.5. Samples: 3866066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:26:02,873][00205] Avg episode reward: [(0, '29.659')] +[2023-02-24 13:26:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15478784. Throughput: 0: 930.6. Samples: 3870448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:26:07,872][00205] Avg episode reward: [(0, '30.114')] +[2023-02-24 13:26:08,097][11215] Updated weights for policy 0, policy_version 3780 (0.0037) +[2023-02-24 13:26:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15503360. Throughput: 0: 952.3. Samples: 3873616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:12,880][00205] Avg episode reward: [(0, '30.489')] +[2023-02-24 13:26:16,975][11215] Updated weights for policy 0, policy_version 3790 (0.0017) +[2023-02-24 13:26:17,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 15527936. 
Throughput: 0: 975.8. Samples: 3880500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:17,879][00205] Avg episode reward: [(0, '31.722')] +[2023-02-24 13:26:22,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15540224. Throughput: 0: 936.0. Samples: 3885702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:22,874][00205] Avg episode reward: [(0, '32.746')] +[2023-02-24 13:26:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15556608. Throughput: 0: 924.5. Samples: 3887818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:27,873][00205] Avg episode reward: [(0, '31.724')] +[2023-02-24 13:26:29,367][11215] Updated weights for policy 0, policy_version 3800 (0.0011) +[2023-02-24 13:26:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15577088. Throughput: 0: 960.3. Samples: 3893776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:26:32,876][00205] Avg episode reward: [(0, '29.443')] +[2023-02-24 13:26:37,872][00205] Fps is (10 sec: 4504.6, 60 sec: 3822.8, 300 sec: 3804.5). Total num frames: 15601664. Throughput: 0: 974.1. Samples: 3900736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:37,878][00205] Avg episode reward: [(0, '28.960')] +[2023-02-24 13:26:38,499][11215] Updated weights for policy 0, policy_version 3810 (0.0017) +[2023-02-24 13:26:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15613952. Throughput: 0: 948.3. Samples: 3903102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:42,874][00205] Avg episode reward: [(0, '28.649')] +[2023-02-24 13:26:47,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15630336. Throughput: 0: 919.1. Samples: 3907426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:26:47,872][00205] Avg episode reward: [(0, '27.857')] +[2023-02-24 13:26:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003816_15630336.pth... +[2023-02-24 13:26:48,007][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003594_14721024.pth +[2023-02-24 13:26:50,814][11215] Updated weights for policy 0, policy_version 3820 (0.0020) +[2023-02-24 13:26:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15654912. Throughput: 0: 961.1. Samples: 3913696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:52,872][00205] Avg episode reward: [(0, '27.325')] +[2023-02-24 13:26:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15675392. Throughput: 0: 968.2. Samples: 3917186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:57,879][00205] Avg episode reward: [(0, '27.446')] +[2023-02-24 13:27:00,812][11215] Updated weights for policy 0, policy_version 3830 (0.0016) +[2023-02-24 13:27:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15691776. Throughput: 0: 937.4. Samples: 3922684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:27:02,877][00205] Avg episode reward: [(0, '29.302')] +[2023-02-24 13:27:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15708160. Throughput: 0: 919.9. Samples: 3927096. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:27:07,873][00205] Avg episode reward: [(0, '28.767')] +[2023-02-24 13:27:12,018][11215] Updated weights for policy 0, policy_version 3840 (0.0025) +[2023-02-24 13:27:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15728640. Throughput: 0: 947.1. Samples: 3930436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:27:12,872][00205] Avg episode reward: [(0, '30.464')] +[2023-02-24 13:27:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15753216. Throughput: 0: 967.7. Samples: 3937324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:27:17,872][00205] Avg episode reward: [(0, '31.113')] +[2023-02-24 13:27:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15765504. Throughput: 0: 925.3. Samples: 3942374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:27:22,881][00205] Avg episode reward: [(0, '32.603')] +[2023-02-24 13:27:23,047][11215] Updated weights for policy 0, policy_version 3850 (0.0019) +[2023-02-24 13:27:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 15781888. Throughput: 0: 922.0. Samples: 3944590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:27,875][00205] Avg episode reward: [(0, '33.059')] +[2023-02-24 13:27:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 15806464. Throughput: 0: 959.2. Samples: 3950590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:27:32,872][00205] Avg episode reward: [(0, '33.152')] +[2023-02-24 13:27:33,423][11215] Updated weights for policy 0, policy_version 3860 (0.0012) +[2023-02-24 13:27:37,878][00205] Fps is (10 sec: 4501.8, 60 sec: 3754.3, 300 sec: 3790.4). Total num frames: 15826944. Throughput: 0: 974.5. Samples: 3957556. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:37,881][00205] Avg episode reward: [(0, '34.449')] +[2023-02-24 13:27:37,899][11201] Saving new best policy, reward=34.449! +[2023-02-24 13:27:42,874][00205] Fps is (10 sec: 3684.8, 60 sec: 3822.7, 300 sec: 3804.4). Total num frames: 15843328. Throughput: 0: 945.7. Samples: 3959746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:27:42,876][00205] Avg episode reward: [(0, '34.549')] +[2023-02-24 13:27:42,886][11201] Saving new best policy, reward=34.549! +[2023-02-24 13:27:45,283][11215] Updated weights for policy 0, policy_version 3870 (0.0011) +[2023-02-24 13:27:47,870][00205] Fps is (10 sec: 3279.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15859712. Throughput: 0: 921.7. Samples: 3964162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:27:47,872][00205] Avg episode reward: [(0, '31.631')] +[2023-02-24 13:27:52,870][00205] Fps is (10 sec: 4097.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15884288. Throughput: 0: 970.6. Samples: 3970774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:52,876][00205] Avg episode reward: [(0, '30.482')] +[2023-02-24 13:27:54,630][11215] Updated weights for policy 0, policy_version 3880 (0.0019) +[2023-02-24 13:27:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15904768. Throughput: 0: 975.8. Samples: 3974346. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:57,872][00205] Avg episode reward: [(0, '30.739')] +[2023-02-24 13:28:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15921152. Throughput: 0: 940.8. Samples: 3979662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:28:02,873][00205] Avg episode reward: [(0, '31.045')] +[2023-02-24 13:28:06,900][11215] Updated weights for policy 0, policy_version 3890 (0.0026) +[2023-02-24 13:28:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15937536. Throughput: 0: 928.7. Samples: 3984164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:07,873][00205] Avg episode reward: [(0, '31.096')] +[2023-02-24 13:28:12,871][00205] Fps is (10 sec: 3686.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 15958016. Throughput: 0: 958.3. Samples: 3987712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:28:12,878][00205] Avg episode reward: [(0, '28.555')] +[2023-02-24 13:28:15,825][11215] Updated weights for policy 0, policy_version 3900 (0.0014) +[2023-02-24 13:28:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15978496. Throughput: 0: 976.8. Samples: 3994548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:17,877][00205] Avg episode reward: [(0, '29.449')] +[2023-02-24 13:28:22,870][00205] Fps is (10 sec: 3686.7, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15994880. Throughput: 0: 925.8. Samples: 3999208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:22,872][00205] Avg episode reward: [(0, '31.371')] +[2023-02-24 13:28:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16011264. Throughput: 0: 924.0. Samples: 4001324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:27,879][00205] Avg episode reward: [(0, '30.108')] +[2023-02-24 13:28:28,307][11215] Updated weights for policy 0, policy_version 3910 (0.0016) +[2023-02-24 13:28:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16035840. Throughput: 0: 968.8. Samples: 4007756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:28:32,872][00205] Avg episode reward: [(0, '30.784')] +[2023-02-24 13:28:37,024][11215] Updated weights for policy 0, policy_version 3920 (0.0011) +[2023-02-24 13:28:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3823.5, 300 sec: 3804.4). Total num frames: 16056320. Throughput: 0: 973.9. Samples: 4014598. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:37,878][00205] Avg episode reward: [(0, '30.628')] +[2023-02-24 13:28:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3804.5). Total num frames: 16072704. Throughput: 0: 942.5. Samples: 4016758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:42,879][00205] Avg episode reward: [(0, '30.219')] +[2023-02-24 13:28:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16089088. Throughput: 0: 924.1. Samples: 4021248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:28:47,872][00205] Avg episode reward: [(0, '30.982')] +[2023-02-24 13:28:47,881][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003928_16089088.pth... 
+[2023-02-24 13:28:47,999][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth +[2023-02-24 13:28:49,555][11215] Updated weights for policy 0, policy_version 3930 (0.0020) +[2023-02-24 13:28:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16109568. Throughput: 0: 971.3. Samples: 4027874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:52,872][00205] Avg episode reward: [(0, '31.041')] +[2023-02-24 13:28:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16130048. Throughput: 0: 967.6. Samples: 4031252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:28:57,877][00205] Avg episode reward: [(0, '32.061')] +[2023-02-24 13:28:59,693][11215] Updated weights for policy 0, policy_version 3940 (0.0017) +[2023-02-24 13:29:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.5). Total num frames: 16146432. Throughput: 0: 929.4. Samples: 4036370. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:29:02,879][00205] Avg episode reward: [(0, '30.960')] +[2023-02-24 13:29:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3790.6). Total num frames: 16162816. Throughput: 0: 931.7. Samples: 4041134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:29:07,875][00205] Avg episode reward: [(0, '31.018')] +[2023-02-24 13:29:10,828][11215] Updated weights for policy 0, policy_version 3950 (0.0020) +[2023-02-24 13:29:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 16187392. Throughput: 0: 961.6. Samples: 4044598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:12,880][00205] Avg episode reward: [(0, '31.144')] +[2023-02-24 13:29:17,876][00205] Fps is (10 sec: 4502.9, 60 sec: 3822.5, 300 sec: 3804.4). Total num frames: 16207872. Throughput: 0: 972.3. Samples: 4051514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:17,879][00205] Avg episode reward: [(0, '32.337')] +[2023-02-24 13:29:21,731][11215] Updated weights for policy 0, policy_version 3960 (0.0014) +[2023-02-24 13:29:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16220160. Throughput: 0: 919.1. Samples: 4055956. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:29:22,876][00205] Avg episode reward: [(0, '32.456')] +[2023-02-24 13:29:27,870][00205] Fps is (10 sec: 3278.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16240640. Throughput: 0: 918.0. Samples: 4058066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:29:27,872][00205] Avg episode reward: [(0, '32.083')] +[2023-02-24 13:29:32,158][11215] Updated weights for policy 0, policy_version 3970 (0.0017) +[2023-02-24 13:29:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16261120. Throughput: 0: 965.4. Samples: 4064690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:32,873][00205] Avg episode reward: [(0, '30.460')] +[2023-02-24 13:29:37,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 16281600. Throughput: 0: 961.6. Samples: 4071148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:29:37,883][00205] Avg episode reward: [(0, '30.129')] +[2023-02-24 13:29:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 16297984. Throughput: 0: 936.3. Samples: 4073386. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:29:42,875][00205] Avg episode reward: [(0, '29.237')] +[2023-02-24 13:29:43,843][11215] Updated weights for policy 0, policy_version 3980 (0.0012) +[2023-02-24 13:29:47,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16314368. Throughput: 0: 922.6. Samples: 4077888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:47,875][00205] Avg episode reward: [(0, '29.045')] +[2023-02-24 13:29:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16338944. Throughput: 0: 966.9. Samples: 4084644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:29:52,878][00205] Avg episode reward: [(0, '31.121')] +[2023-02-24 13:29:53,682][11215] Updated weights for policy 0, policy_version 3990 (0.0019) +[2023-02-24 13:29:57,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 16359424. Throughput: 0: 965.5. Samples: 4088046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:29:57,873][00205] Avg episode reward: [(0, '30.992')] +[2023-02-24 13:30:02,872][00205] Fps is (10 sec: 3276.0, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 16371712. Throughput: 0: 912.0. Samples: 4092550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:02,875][00205] Avg episode reward: [(0, '30.782')] +[2023-02-24 13:30:06,278][11215] Updated weights for policy 0, policy_version 4000 (0.0021) +[2023-02-24 13:30:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16388096. Throughput: 0: 923.2. Samples: 4097502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:07,873][00205] Avg episode reward: [(0, '30.459')] +[2023-02-24 13:30:12,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16412672. Throughput: 0: 953.2. Samples: 4100958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:30:12,879][00205] Avg episode reward: [(0, '32.182')] +[2023-02-24 13:30:15,315][11215] Updated weights for policy 0, policy_version 4010 (0.0013) +[2023-02-24 13:30:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3755.1, 300 sec: 3804.4). Total num frames: 16433152. Throughput: 0: 956.1. Samples: 4107714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:30:17,876][00205] Avg episode reward: [(0, '32.199')] +[2023-02-24 13:30:22,870][00205] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 16445440. Throughput: 0: 908.6. Samples: 4112032. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:30:22,878][00205] Avg episode reward: [(0, '31.692')] +[2023-02-24 13:30:27,659][11215] Updated weights for policy 0, policy_version 4020 (0.0027) +[2023-02-24 13:30:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16465920. Throughput: 0: 907.7. Samples: 4114234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:30:27,881][00205] Avg episode reward: [(0, '30.593')] +[2023-02-24 13:30:32,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16486400. Throughput: 0: 958.9. Samples: 4121038. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:30:32,872][00205] Avg episode reward: [(0, '29.981')] +[2023-02-24 13:30:37,153][11215] Updated weights for policy 0, policy_version 4030 (0.0027) +[2023-02-24 13:30:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 16506880. 
Throughput: 0: 947.2. Samples: 4127268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:37,872][00205] Avg episode reward: [(0, '30.057')] +[2023-02-24 13:30:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 16519168. Throughput: 0: 917.9. Samples: 4129350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:42,873][00205] Avg episode reward: [(0, '29.359')] +[2023-02-24 13:30:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16539648. Throughput: 0: 928.5. Samples: 4134332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:30:47,873][00205] Avg episode reward: [(0, '29.214')] +[2023-02-24 13:30:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004038_16539648.pth... +[2023-02-24 13:30:48,013][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003816_15630336.pth +[2023-02-24 13:30:49,043][11215] Updated weights for policy 0, policy_version 4040 (0.0023) +[2023-02-24 13:30:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16564224. Throughput: 0: 971.4. Samples: 4141216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:30:52,878][00205] Avg episode reward: [(0, '29.870')] +[2023-02-24 13:30:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 16580608. Throughput: 0: 968.6. Samples: 4144546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:57,878][00205] Avg episode reward: [(0, '29.219')] +[2023-02-24 13:30:59,542][11215] Updated weights for policy 0, policy_version 4050 (0.0013) +[2023-02-24 13:31:02,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16596992. Throughput: 0: 916.0. Samples: 4148936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:31:02,877][00205] Avg episode reward: [(0, '29.501')] +[2023-02-24 13:31:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16617472. Throughput: 0: 941.9. Samples: 4154416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:07,873][00205] Avg episode reward: [(0, '28.976')] +[2023-02-24 13:31:10,200][11215] Updated weights for policy 0, policy_version 4060 (0.0016) +[2023-02-24 13:31:12,870][00205] Fps is (10 sec: 4096.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16637952. Throughput: 0: 970.3. Samples: 4157898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:12,872][00205] Avg episode reward: [(0, '30.104')] +[2023-02-24 13:31:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16658432. Throughput: 0: 957.9. Samples: 4164142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:17,877][00205] Avg episode reward: [(0, '30.695')] +[2023-02-24 13:31:21,723][11215] Updated weights for policy 0, policy_version 4070 (0.0028) +[2023-02-24 13:31:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16670720. Throughput: 0: 916.6. Samples: 4168516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:22,877][00205] Avg episode reward: [(0, '29.653')] +[2023-02-24 13:31:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16691200. Throughput: 0: 928.3. Samples: 4171124. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:31:27,872][00205] Avg episode reward: [(0, '29.709')] +[2023-02-24 13:31:31,542][11215] Updated weights for policy 0, policy_version 4080 (0.0019) +[2023-02-24 13:31:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16715776. Throughput: 0: 968.9. Samples: 4177932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:32,873][00205] Avg episode reward: [(0, '29.639')] +[2023-02-24 13:31:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16732160. Throughput: 0: 942.6. Samples: 4183632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:37,875][00205] Avg episode reward: [(0, '29.612')] +[2023-02-24 13:31:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16748544. Throughput: 0: 917.0. Samples: 4185810. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:31:42,873][00205] Avg episode reward: [(0, '27.973')] +[2023-02-24 13:31:43,966][11215] Updated weights for policy 0, policy_version 4090 (0.0015) +[2023-02-24 13:31:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16769024. Throughput: 0: 935.5. Samples: 4191032. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:47,877][00205] Avg episode reward: [(0, '27.151')] +[2023-02-24 13:31:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16789504. Throughput: 0: 963.7. Samples: 4197782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:31:52,876][00205] Avg episode reward: [(0, '27.364')] +[2023-02-24 13:31:53,266][11215] Updated weights for policy 0, policy_version 4100 (0.0016) +[2023-02-24 13:31:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16805888. Throughput: 0: 955.4. Samples: 4200890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:31:57,878][00205] Avg episode reward: [(0, '27.743')] +[2023-02-24 13:32:02,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16822272. Throughput: 0: 911.9. Samples: 4205180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:02,873][00205] Avg episode reward: [(0, '26.935')] +[2023-02-24 13:32:05,764][11215] Updated weights for policy 0, policy_version 4110 (0.0015) +[2023-02-24 13:32:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16842752. Throughput: 0: 941.7. Samples: 4210894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:32:07,884][00205] Avg episode reward: [(0, '27.696')] +[2023-02-24 13:32:12,873][00205] Fps is (10 sec: 4504.2, 60 sec: 3822.7, 300 sec: 3776.6). Total num frames: 16867328. Throughput: 0: 959.8. Samples: 4214318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:12,877][00205] Avg episode reward: [(0, '30.044')] +[2023-02-24 13:32:15,059][11215] Updated weights for policy 0, policy_version 4120 (0.0017) +[2023-02-24 13:32:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16883712. Throughput: 0: 943.8. Samples: 4220404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:32:17,877][00205] Avg episode reward: [(0, '32.229')] +[2023-02-24 13:32:22,870][00205] Fps is (10 sec: 2868.1, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16896000. Throughput: 0: 913.1. Samples: 4224720. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:22,879][00205] Avg episode reward: [(0, '33.955')] +[2023-02-24 13:32:27,093][11215] Updated weights for policy 0, policy_version 4130 (0.0011) +[2023-02-24 13:32:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16916480. Throughput: 0: 925.8. Samples: 4227470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:32:27,872][00205] Avg episode reward: [(0, '34.055')] +[2023-02-24 13:32:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3776.8). Total num frames: 16941056. Throughput: 0: 966.7. Samples: 4234534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:32,873][00205] Avg episode reward: [(0, '34.715')] +[2023-02-24 13:32:32,875][11201] Saving new best policy, reward=34.715! +[2023-02-24 13:32:36,765][11215] Updated weights for policy 0, policy_version 4140 (0.0025) +[2023-02-24 13:32:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16957440. Throughput: 0: 940.8. Samples: 4240118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:32:37,873][00205] Avg episode reward: [(0, '36.120')] +[2023-02-24 13:32:37,884][11201] Saving new best policy, reward=36.120! +[2023-02-24 13:32:42,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 16973824. Throughput: 0: 919.4. Samples: 4242266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:32:42,880][00205] Avg episode reward: [(0, '35.290')] +[2023-02-24 13:32:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16994304. Throughput: 0: 944.2. Samples: 4247670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:32:47,877][00205] Avg episode reward: [(0, '34.136')] +[2023-02-24 13:32:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004149_16994304.pth... +[2023-02-24 13:32:48,010][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003928_16089088.pth +[2023-02-24 13:32:48,301][11215] Updated weights for policy 0, policy_version 4150 (0.0027) +[2023-02-24 13:32:52,870][00205] Fps is (10 sec: 4506.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 17018880. Throughput: 0: 971.9. Samples: 4254628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:32:52,879][00205] Avg episode reward: [(0, '32.439')] +[2023-02-24 13:32:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17031168. Throughput: 0: 957.7. Samples: 4257412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:32:57,875][00205] Avg episode reward: [(0, '32.089')] +[2023-02-24 13:32:59,334][11215] Updated weights for policy 0, policy_version 4160 (0.0020) +[2023-02-24 13:33:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17047552. Throughput: 0: 919.1. Samples: 4261762. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:33:02,877][00205] Avg episode reward: [(0, '32.550')] +[2023-02-24 13:33:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17068032. Throughput: 0: 954.0. Samples: 4267650. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:33:07,876][00205] Avg episode reward: [(0, '32.966')] +[2023-02-24 13:33:09,698][11215] Updated weights for policy 0, policy_version 4170 (0.0016) +[2023-02-24 13:33:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.9, 300 sec: 3776.7). Total num frames: 17092608. Throughput: 0: 968.8. Samples: 4271064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:33:12,872][00205] Avg episode reward: [(0, '32.938')] +[2023-02-24 13:33:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17108992. Throughput: 0: 939.4. Samples: 4276808. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:33:17,872][00205] Avg episode reward: [(0, '32.660')] +[2023-02-24 13:33:21,800][11215] Updated weights for policy 0, policy_version 4180 (0.0027) +[2023-02-24 13:33:22,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17121280. Throughput: 0: 912.0. Samples: 4281156. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:33:22,878][00205] Avg episode reward: [(0, '31.478')] +[2023-02-24 13:33:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17145856. Throughput: 0: 928.6. Samples: 4284050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:33:27,879][00205] Avg episode reward: [(0, '31.509')] +[2023-02-24 13:33:31,446][11215] Updated weights for policy 0, policy_version 4190 (0.0018) +[2023-02-24 13:33:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17166336. Throughput: 0: 958.2. Samples: 4290790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:33:32,873][00205] Avg episode reward: [(0, '32.313')] +[2023-02-24 13:33:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17182720. Throughput: 0: 916.0. Samples: 4295850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:33:37,874][00205] Avg episode reward: [(0, '33.047')] +[2023-02-24 13:33:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3748.9). Total num frames: 17195008. Throughput: 0: 901.0. Samples: 4297958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:33:42,874][00205] Avg episode reward: [(0, '32.472')] +[2023-02-24 13:33:44,122][11215] Updated weights for policy 0, policy_version 4200 (0.0020) +[2023-02-24 13:33:47,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 17219584. Throughput: 0: 935.4. Samples: 4303854. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:33:47,879][00205] Avg episode reward: [(0, '32.023')] +[2023-02-24 13:33:52,823][11215] Updated weights for policy 0, policy_version 4210 (0.0021) +[2023-02-24 13:33:52,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17244160. Throughput: 0: 960.3. Samples: 4310864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:33:52,877][00205] Avg episode reward: [(0, '32.352')] +[2023-02-24 13:33:57,870][00205] Fps is (10 sec: 3686.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17256448. Throughput: 0: 939.6. Samples: 4313348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:33:57,873][00205] Avg episode reward: [(0, '33.993')] +[2023-02-24 13:34:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17272832. Throughput: 0: 907.8. Samples: 4317660. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:02,872][00205] Avg episode reward: [(0, '34.625')] +[2023-02-24 13:34:05,487][11215] Updated weights for policy 0, policy_version 4220 (0.0018) +[2023-02-24 13:34:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17293312. Throughput: 0: 950.6. Samples: 4323932. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:34:07,874][00205] Avg episode reward: [(0, '33.173')] +[2023-02-24 13:34:12,879][00205] Fps is (10 sec: 4503.3, 60 sec: 3754.4, 300 sec: 3762.8). Total num frames: 17317888. Throughput: 0: 961.7. Samples: 4327330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:12,882][00205] Avg episode reward: [(0, '31.212')] +[2023-02-24 13:34:15,093][11215] Updated weights for policy 0, policy_version 4230 (0.0012) +[2023-02-24 13:34:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 17330176. Throughput: 0: 935.1. Samples: 4332870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:17,878][00205] Avg episode reward: [(0, '30.270')] +[2023-02-24 13:34:22,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17346560. Throughput: 0: 920.5. Samples: 4337274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:22,873][00205] Avg episode reward: [(0, '30.476')] +[2023-02-24 13:34:26,653][11215] Updated weights for policy 0, policy_version 4240 (0.0013) +[2023-02-24 13:34:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17371136. Throughput: 0: 945.7. Samples: 4340514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:27,876][00205] Avg episode reward: [(0, '29.960')] +[2023-02-24 13:34:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17391616. Throughput: 0: 969.8. Samples: 4347496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 13:34:32,874][00205] Avg episode reward: [(0, '29.325')] +[2023-02-24 13:34:37,210][11215] Updated weights for policy 0, policy_version 4250 (0.0015) +[2023-02-24 13:34:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17408000. Throughput: 0: 926.6. Samples: 4352562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:37,873][00205] Avg episode reward: [(0, '29.577')] +[2023-02-24 13:34:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17424384. Throughput: 0: 918.4. Samples: 4354676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:42,872][00205] Avg episode reward: [(0, '31.387')] +[2023-02-24 13:34:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17444864. Throughput: 0: 957.4. Samples: 4360744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:47,872][00205] Avg episode reward: [(0, '31.086')] +[2023-02-24 13:34:47,922][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004260_17448960.pth... +[2023-02-24 13:34:47,925][11215] Updated weights for policy 0, policy_version 4260 (0.0018) +[2023-02-24 13:34:48,064][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004038_16539648.pth +[2023-02-24 13:34:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17469440. Throughput: 0: 972.8. Samples: 4367708. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:34:52,875][00205] Avg episode reward: [(0, '30.335')] +[2023-02-24 13:34:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17481728. Throughput: 0: 945.0. Samples: 4369850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:57,877][00205] Avg episode reward: [(0, '29.849')] +[2023-02-24 13:34:59,967][11215] Updated weights for policy 0, policy_version 4270 (0.0018) +[2023-02-24 13:35:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17498112. Throughput: 0: 914.1. Samples: 4374004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:35:02,872][00205] Avg episode reward: [(0, '31.191')] +[2023-02-24 13:35:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17522688. Throughput: 0: 963.6. Samples: 4380634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:07,873][00205] Avg episode reward: [(0, '32.240')] +[2023-02-24 13:35:09,432][11215] Updated weights for policy 0, policy_version 4280 (0.0018) +[2023-02-24 13:35:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3755.0, 300 sec: 3762.8). Total num frames: 17543168. Throughput: 0: 969.5. Samples: 4384140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:12,873][00205] Avg episode reward: [(0, '31.934')] +[2023-02-24 13:35:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 17559552. Throughput: 0: 930.0. Samples: 4389344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:17,875][00205] Avg episode reward: [(0, '31.448')] +[2023-02-24 13:35:21,830][11215] Updated weights for policy 0, policy_version 4290 (0.0011) +[2023-02-24 13:35:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17575936. Throughput: 0: 919.2. Samples: 4393924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:35:22,873][00205] Avg episode reward: [(0, '31.255')] +[2023-02-24 13:35:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17596416. Throughput: 0: 946.6. Samples: 4397274. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:27,872][00205] Avg episode reward: [(0, '32.735')] +[2023-02-24 13:35:30,785][11215] Updated weights for policy 0, policy_version 4300 (0.0012) +[2023-02-24 13:35:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3762.7). Total num frames: 17616896. Throughput: 0: 965.7. Samples: 4404202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:35:32,875][00205] Avg episode reward: [(0, '32.606')] +[2023-02-24 13:35:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17633280. Throughput: 0: 912.5. Samples: 4408772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:35:37,872][00205] Avg episode reward: [(0, '30.765')] +[2023-02-24 13:35:42,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17649664. Throughput: 0: 912.8. Samples: 4410926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:35:42,877][00205] Avg episode reward: [(0, '31.647')] +[2023-02-24 13:35:43,238][11215] Updated weights for policy 0, policy_version 4310 (0.0018) +[2023-02-24 13:35:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17674240. Throughput: 0: 963.7. Samples: 4417370. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:47,872][00205] Avg episode reward: [(0, '32.506')] +[2023-02-24 13:35:52,482][11215] Updated weights for policy 0, policy_version 4320 (0.0016) +[2023-02-24 13:35:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17694720. Throughput: 0: 965.4. Samples: 4424076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:52,876][00205] Avg episode reward: [(0, '32.686')] +[2023-02-24 13:35:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17707008. Throughput: 0: 933.8. Samples: 4426162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:57,878][00205] Avg episode reward: [(0, '31.490')] +[2023-02-24 13:36:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17727488. Throughput: 0: 916.5. Samples: 4430588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:02,875][00205] Avg episode reward: [(0, '30.287')] +[2023-02-24 13:36:04,490][11215] Updated weights for policy 0, policy_version 4330 (0.0012) +[2023-02-24 13:36:07,871][00205] Fps is (10 sec: 4095.7, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 17747968. Throughput: 0: 969.9. Samples: 4437570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:07,876][00205] Avg episode reward: [(0, '29.798')] +[2023-02-24 13:36:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3762.7). Total num frames: 17768448. Throughput: 0: 971.0. Samples: 4440970. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:36:12,874][00205] Avg episode reward: [(0, '30.527')] +[2023-02-24 13:36:14,723][11215] Updated weights for policy 0, policy_version 4340 (0.0026) +[2023-02-24 13:36:17,871][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 17784832. Throughput: 0: 925.7. Samples: 4445858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:36:17,878][00205] Avg episode reward: [(0, '30.486')] +[2023-02-24 13:36:22,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17801216. Throughput: 0: 933.2. Samples: 4450768. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:36:22,872][00205] Avg episode reward: [(0, '29.989')] +[2023-02-24 13:36:26,034][11215] Updated weights for policy 0, policy_version 4350 (0.0013) +[2023-02-24 13:36:27,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17825792. Throughput: 0: 958.6. Samples: 4454064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:27,879][00205] Avg episode reward: [(0, '32.190')] +[2023-02-24 13:36:32,873][00205] Fps is (10 sec: 4504.0, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 17846272. Throughput: 0: 967.0. Samples: 4460888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:32,875][00205] Avg episode reward: [(0, '31.899')] +[2023-02-24 13:36:37,228][11215] Updated weights for policy 0, policy_version 4360 (0.0014) +[2023-02-24 13:36:37,877][00205] Fps is (10 sec: 3274.4, 60 sec: 3754.2, 300 sec: 3762.7). Total num frames: 17858560. Throughput: 0: 914.3. Samples: 4465228. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:37,885][00205] Avg episode reward: [(0, '31.766')] +[2023-02-24 13:36:42,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17874944. Throughput: 0: 915.7. Samples: 4467370. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:36:42,873][00205] Avg episode reward: [(0, '31.646')] +[2023-02-24 13:36:47,420][11215] Updated weights for policy 0, policy_version 4370 (0.0017) +[2023-02-24 13:36:47,870][00205] Fps is (10 sec: 4099.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17899520. Throughput: 0: 964.9. Samples: 4474008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:47,881][00205] Avg episode reward: [(0, '30.047')] +[2023-02-24 13:36:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004370_17899520.pth... +[2023-02-24 13:36:48,033][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004149_16994304.pth +[2023-02-24 13:36:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17920000. Throughput: 0: 948.7. Samples: 4480262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:52,876][00205] Avg episode reward: [(0, '30.306')] +[2023-02-24 13:36:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17932288. Throughput: 0: 919.7. Samples: 4482356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:57,875][00205] Avg episode reward: [(0, '28.813')] +[2023-02-24 13:36:59,646][11215] Updated weights for policy 0, policy_version 4380 (0.0033) +[2023-02-24 13:37:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 17948672. Throughput: 0: 910.0. Samples: 4486808. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:37:02,873][00205] Avg episode reward: [(0, '27.859')] +[2023-02-24 13:37:07,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17973248. Throughput: 0: 955.5. Samples: 4493766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:37:07,873][00205] Avg episode reward: [(0, '29.218')] +[2023-02-24 13:37:08,913][11215] Updated weights for policy 0, policy_version 4390 (0.0012) +[2023-02-24 13:37:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.8, 300 sec: 3762.8). Total num frames: 17993728. Throughput: 0: 959.2. Samples: 4497226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:37:12,880][00205] Avg episode reward: [(0, '31.056')] +[2023-02-24 13:37:17,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18010112. Throughput: 0: 911.0. Samples: 4501882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:37:17,878][00205] Avg episode reward: [(0, '30.981')] +[2023-02-24 13:37:21,376][11215] Updated weights for policy 0, policy_version 4400 (0.0015) +[2023-02-24 13:37:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18026496. Throughput: 0: 930.7. Samples: 4507104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:37:22,872][00205] Avg episode reward: [(0, '30.979')] +[2023-02-24 13:37:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18051072. Throughput: 0: 960.0. Samples: 4510572. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:37:27,879][00205] Avg episode reward: [(0, '31.649')] +[2023-02-24 13:37:30,306][11215] Updated weights for policy 0, policy_version 4410 (0.0023) +[2023-02-24 13:37:32,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3686.6, 300 sec: 3762.8). Total num frames: 18067456. Throughput: 0: 960.0. Samples: 4517206. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:32,873][00205] Avg episode reward: [(0, '31.672')] +[2023-02-24 13:37:37,873][00205] Fps is (10 sec: 3275.8, 60 sec: 3754.9, 300 sec: 3762.7). Total num frames: 18083840. Throughput: 0: 918.5. Samples: 4521598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:37:37,875][00205] Avg episode reward: [(0, '31.408')] +[2023-02-24 13:37:42,551][11215] Updated weights for policy 0, policy_version 4420 (0.0025) +[2023-02-24 13:37:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18104320. Throughput: 0: 920.1. Samples: 4523760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:37:42,872][00205] Avg episode reward: [(0, '31.068')] +[2023-02-24 13:37:47,870][00205] Fps is (10 sec: 4507.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18128896. Throughput: 0: 978.8. Samples: 4530854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:47,872][00205] Avg episode reward: [(0, '29.337')] +[2023-02-24 13:37:51,721][11215] Updated weights for policy 0, policy_version 4430 (0.0013) +[2023-02-24 13:37:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18145280. Throughput: 0: 962.1. Samples: 4537058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:52,877][00205] Avg episode reward: [(0, '28.751')] +[2023-02-24 13:37:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18161664. Throughput: 0: 932.4. Samples: 4539182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:57,878][00205] Avg episode reward: [(0, '28.305')] +[2023-02-24 13:38:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18178048. Throughput: 0: 936.1. Samples: 4544008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:38:02,872][00205] Avg episode reward: [(0, '29.431')] +[2023-02-24 13:38:03,901][11215] Updated weights for policy 0, policy_version 4440 (0.0023) +[2023-02-24 13:38:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 18202624. Throughput: 0: 975.9. Samples: 4551018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:38:07,872][00205] Avg episode reward: [(0, '29.386')] +[2023-02-24 13:38:12,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18223104. Throughput: 0: 972.9. Samples: 4554352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:38:12,876][00205] Avg episode reward: [(0, '31.248')] +[2023-02-24 13:38:14,241][11215] Updated weights for policy 0, policy_version 4450 (0.0022) +[2023-02-24 13:38:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18235392. Throughput: 0: 921.5. Samples: 4558672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:38:17,879][00205] Avg episode reward: [(0, '31.288')] +[2023-02-24 13:38:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18255872. Throughput: 0: 943.7. Samples: 4564060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:22,872][00205] Avg episode reward: [(0, '31.926')] +[2023-02-24 13:38:25,306][11215] Updated weights for policy 0, policy_version 4460 (0.0025) +[2023-02-24 13:38:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18276352. Throughput: 0: 971.2. Samples: 4567464. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:27,873][00205] Avg episode reward: [(0, '33.241')] +[2023-02-24 13:38:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18296832. Throughput: 0: 953.2. Samples: 4573748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:38:32,880][00205] Avg episode reward: [(0, '32.765')] +[2023-02-24 13:38:36,710][11215] Updated weights for policy 0, policy_version 4470 (0.0011) +[2023-02-24 13:38:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3776.6). Total num frames: 18309120. Throughput: 0: 912.5. Samples: 4578120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:37,877][00205] Avg episode reward: [(0, '32.218')] +[2023-02-24 13:38:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18329600. Throughput: 0: 920.8. Samples: 4580616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:38:42,873][00205] Avg episode reward: [(0, '32.247')] +[2023-02-24 13:38:46,660][11215] Updated weights for policy 0, policy_version 4480 (0.0034) +[2023-02-24 13:38:47,870][00205] Fps is (10 sec: 4505.4, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18354176. Throughput: 0: 966.6. Samples: 4587504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:47,872][00205] Avg episode reward: [(0, '30.376')] +[2023-02-24 13:38:47,882][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004481_18354176.pth... +[2023-02-24 13:38:48,010][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004260_17448960.pth +[2023-02-24 13:38:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18370560. Throughput: 0: 938.8. Samples: 4593266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:38:52,875][00205] Avg episode reward: [(0, '29.966')] +[2023-02-24 13:38:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 18386944. Throughput: 0: 912.1. Samples: 4595396. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:57,872][00205] Avg episode reward: [(0, '30.101')] +[2023-02-24 13:38:59,192][11215] Updated weights for policy 0, policy_version 4490 (0.0016) +[2023-02-24 13:39:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18407424. Throughput: 0: 929.7. Samples: 4600508. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:39:02,877][00205] Avg episode reward: [(0, '29.211')] +[2023-02-24 13:39:07,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18427904. Throughput: 0: 965.3. Samples: 4607500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:07,874][00205] Avg episode reward: [(0, '31.558')] +[2023-02-24 13:39:08,138][11215] Updated weights for policy 0, policy_version 4500 (0.0017) +[2023-02-24 13:39:12,875][00205] Fps is (10 sec: 3684.4, 60 sec: 3686.1, 300 sec: 3776.6). Total num frames: 18444288. Throughput: 0: 957.4. Samples: 4610554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:12,880][00205] Avg episode reward: [(0, '30.965')] +[2023-02-24 13:39:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18460672. Throughput: 0: 914.6. Samples: 4614906. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:39:17,877][00205] Avg episode reward: [(0, '31.455')] +[2023-02-24 13:39:20,479][11215] Updated weights for policy 0, policy_version 4510 (0.0017) +[2023-02-24 13:39:22,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18481152. Throughput: 0: 946.6. Samples: 4620718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:22,882][00205] Avg episode reward: [(0, '32.129')] +[2023-02-24 13:39:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18505728. Throughput: 0: 968.6. Samples: 4624202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:39:27,881][00205] Avg episode reward: [(0, '32.640')] +[2023-02-24 13:39:29,696][11215] Updated weights for policy 0, policy_version 4520 (0.0014) +[2023-02-24 13:39:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18522112. Throughput: 0: 947.4. Samples: 4630138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:32,874][00205] Avg episode reward: [(0, '33.105')] +[2023-02-24 13:39:37,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18534400. Throughput: 0: 916.4. Samples: 4634506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:37,879][00205] Avg episode reward: [(0, '33.413')] +[2023-02-24 13:39:41,942][11215] Updated weights for policy 0, policy_version 4530 (0.0016) +[2023-02-24 13:39:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18558976. Throughput: 0: 930.5. Samples: 4637266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:42,875][00205] Avg episode reward: [(0, '31.860')] +[2023-02-24 13:39:47,870][00205] Fps is (10 sec: 4505.9, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18579456. Throughput: 0: 973.5. Samples: 4644316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:47,871][00205] Avg episode reward: [(0, '33.947')] +[2023-02-24 13:39:51,730][11215] Updated weights for policy 0, policy_version 4540 (0.0011) +[2023-02-24 13:39:52,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18595840. Throughput: 0: 941.1. Samples: 4649848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:39:52,877][00205] Avg episode reward: [(0, '33.288')] +[2023-02-24 13:39:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18612224. Throughput: 0: 921.8. Samples: 4652028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:57,876][00205] Avg episode reward: [(0, '32.833')] +[2023-02-24 13:40:02,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18632704. Throughput: 0: 941.5. Samples: 4657272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:02,872][00205] Avg episode reward: [(0, '32.925')] +[2023-02-24 13:40:03,409][11215] Updated weights for policy 0, policy_version 4550 (0.0014) +[2023-02-24 13:40:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18657280. Throughput: 0: 968.4. Samples: 4664296. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:40:07,875][00205] Avg episode reward: [(0, '32.421')] +[2023-02-24 13:40:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3823.2, 300 sec: 3776.6). Total num frames: 18673664. Throughput: 0: 952.5. Samples: 4667064. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:40:12,874][00205] Avg episode reward: [(0, '32.367')] +[2023-02-24 13:40:14,224][11215] Updated weights for policy 0, policy_version 4560 (0.0014) +[2023-02-24 13:40:17,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18685952. Throughput: 0: 916.9. Samples: 4671398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:40:17,874][00205] Avg episode reward: [(0, '33.540')] +[2023-02-24 13:40:22,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18710528. Throughput: 0: 956.8. Samples: 4677562. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:40:22,872][00205] Avg episode reward: [(0, '31.933')] +[2023-02-24 13:40:24,288][11215] Updated weights for policy 0, policy_version 4570 (0.0020) +[2023-02-24 13:40:27,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3754.6, 300 sec: 3776.7). Total num frames: 18731008. Throughput: 0: 974.3. Samples: 4681112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:27,872][00205] Avg episode reward: [(0, '30.706')] +[2023-02-24 13:40:32,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18747392. Throughput: 0: 941.7. Samples: 4686694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:40:32,877][00205] Avg episode reward: [(0, '31.070')] +[2023-02-24 13:40:36,269][11215] Updated weights for policy 0, policy_version 4580 (0.0018) +[2023-02-24 13:40:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3776.6). Total num frames: 18763776. Throughput: 0: 917.0. Samples: 4691112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:40:37,872][00205] Avg episode reward: [(0, '30.738')] +[2023-02-24 13:40:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18784256. Throughput: 0: 934.1. Samples: 4694062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:40:42,873][00205] Avg episode reward: [(0, '31.084')] +[2023-02-24 13:40:45,892][11215] Updated weights for policy 0, policy_version 4590 (0.0022) +[2023-02-24 13:40:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18808832. Throughput: 0: 974.7. Samples: 4701134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:47,872][00205] Avg episode reward: [(0, '30.620')] +[2023-02-24 13:40:47,880][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_18808832.pth... +[2023-02-24 13:40:48,029][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004370_17899520.pth +[2023-02-24 13:40:52,871][00205] Fps is (10 sec: 3685.9, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 18821120. Throughput: 0: 937.5. Samples: 4706484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:52,875][00205] Avg episode reward: [(0, '30.381')] +[2023-02-24 13:40:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18837504. Throughput: 0: 923.6. Samples: 4708624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:40:57,874][00205] Avg episode reward: [(0, '31.576')] +[2023-02-24 13:40:58,500][11215] Updated weights for policy 0, policy_version 4600 (0.0011) +[2023-02-24 13:41:02,870][00205] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18862080. Throughput: 0: 950.1. Samples: 4714150. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:02,880][00205] Avg episode reward: [(0, '31.787')] +[2023-02-24 13:41:07,349][11215] Updated weights for policy 0, policy_version 4610 (0.0026) +[2023-02-24 13:41:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18882560. Throughput: 0: 969.6. Samples: 4721196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:07,873][00205] Avg episode reward: [(0, '31.181')] +[2023-02-24 13:41:12,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.8, 300 sec: 3776.7). Total num frames: 18898944. Throughput: 0: 949.5. Samples: 4723840. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:41:12,873][00205] Avg episode reward: [(0, '32.176')] +[2023-02-24 13:41:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18911232. Throughput: 0: 924.0. Samples: 4728272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:41:17,874][00205] Avg episode reward: [(0, '31.709')] +[2023-02-24 13:41:19,673][11215] Updated weights for policy 0, policy_version 4620 (0.0025) +[2023-02-24 13:41:22,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18935808. Throughput: 0: 963.2. Samples: 4734458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:41:22,873][00205] Avg episode reward: [(0, '31.460')] +[2023-02-24 13:41:27,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18960384. Throughput: 0: 975.9. Samples: 4737980. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:41:27,881][00205] Avg episode reward: [(0, '32.309')] +[2023-02-24 13:41:28,487][11215] Updated weights for policy 0, policy_version 4630 (0.0022) +[2023-02-24 13:41:32,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 18976768. Throughput: 0: 945.1. Samples: 4743662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:41:32,872][00205] Avg episode reward: [(0, '31.874')] +[2023-02-24 13:41:37,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18989056. Throughput: 0: 922.5. Samples: 4747996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:41:37,875][00205] Avg episode reward: [(0, '33.078')] +[2023-02-24 13:41:40,808][11215] Updated weights for policy 0, policy_version 4640 (0.0026) +[2023-02-24 13:41:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19013632. Throughput: 0: 944.5. Samples: 4751128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:42,876][00205] Avg episode reward: [(0, '32.005')] +[2023-02-24 13:41:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19034112. Throughput: 0: 977.3. Samples: 4758130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:47,875][00205] Avg episode reward: [(0, '30.901')] +[2023-02-24 13:41:50,662][11215] Updated weights for policy 0, policy_version 4650 (0.0016) +[2023-02-24 13:41:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 19050496. Throughput: 0: 937.7. Samples: 4763392. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:41:52,881][00205] Avg episode reward: [(0, '31.801')] +[2023-02-24 13:41:57,872][00205] Fps is (10 sec: 3276.0, 60 sec: 3822.8, 300 sec: 3790.5). Total num frames: 19066880. Throughput: 0: 926.2. Samples: 4765520. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:41:57,879][00205] Avg episode reward: [(0, '32.167')] +[2023-02-24 13:42:02,148][11215] Updated weights for policy 0, policy_version 4660 (0.0017) +[2023-02-24 13:42:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19087360. Throughput: 0: 955.6. Samples: 4771276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:02,873][00205] Avg episode reward: [(0, '29.330')] +[2023-02-24 13:42:07,870][00205] Fps is (10 sec: 4506.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19111936. Throughput: 0: 972.3. Samples: 4778212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:07,873][00205] Avg episode reward: [(0, '29.882')] +[2023-02-24 13:42:12,785][11215] Updated weights for policy 0, policy_version 4670 (0.0015) +[2023-02-24 13:42:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19128320. Throughput: 0: 949.2. Samples: 4780692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:12,872][00205] Avg episode reward: [(0, '30.252')] +[2023-02-24 13:42:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19140608. Throughput: 0: 917.6. Samples: 4784952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:17,872][00205] Avg episode reward: [(0, '30.203')] +[2023-02-24 13:42:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3776.7). Total num frames: 19165184. Throughput: 0: 960.3. Samples: 4791208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:22,873][00205] Avg episode reward: [(0, '29.424')] +[2023-02-24 13:42:23,628][11215] Updated weights for policy 0, policy_version 4680 (0.0020) +[2023-02-24 13:42:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19185664. Throughput: 0: 969.2. Samples: 4794740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:27,872][00205] Avg episode reward: [(0, '28.006')] +[2023-02-24 13:42:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 19202048. Throughput: 0: 935.9. Samples: 4800244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:32,872][00205] Avg episode reward: [(0, '28.539')] +[2023-02-24 13:42:35,105][11215] Updated weights for policy 0, policy_version 4690 (0.0016) +[2023-02-24 13:42:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19218432. Throughput: 0: 918.4. Samples: 4804718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:37,881][00205] Avg episode reward: [(0, '29.047')] +[2023-02-24 13:42:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 19238912. Throughput: 0: 945.2. Samples: 4808052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:42,872][00205] Avg episode reward: [(0, '27.868')] +[2023-02-24 13:42:44,645][11215] Updated weights for policy 0, policy_version 4700 (0.0011) +[2023-02-24 13:42:47,874][00205] Fps is (10 sec: 4503.7, 60 sec: 3822.7, 300 sec: 3790.5). Total num frames: 19263488. Throughput: 0: 973.2. Samples: 4815076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:42:47,877][00205] Avg episode reward: [(0, '28.748')] +[2023-02-24 13:42:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004703_19263488.pth... 
+[2023-02-24 13:42:48,040][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004481_18354176.pth +[2023-02-24 13:42:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19275776. Throughput: 0: 929.0. Samples: 4820018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:52,878][00205] Avg episode reward: [(0, '27.869')] +[2023-02-24 13:42:57,042][11215] Updated weights for policy 0, policy_version 4710 (0.0023) +[2023-02-24 13:42:57,870][00205] Fps is (10 sec: 2868.4, 60 sec: 3754.8, 300 sec: 3776.6). Total num frames: 19292160. Throughput: 0: 922.1. Samples: 4822186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:57,873][00205] Avg episode reward: [(0, '28.164')] +[2023-02-24 13:43:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19316736. Throughput: 0: 957.1. Samples: 4828020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:43:02,873][00205] Avg episode reward: [(0, '28.988')] +[2023-02-24 13:43:06,185][11215] Updated weights for policy 0, policy_version 4720 (0.0012) +[2023-02-24 13:43:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19337216. Throughput: 0: 974.3. Samples: 4835052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:43:07,873][00205] Avg episode reward: [(0, '27.640')] +[2023-02-24 13:43:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19353600. Throughput: 0: 947.6. Samples: 4837380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:43:12,873][00205] Avg episode reward: [(0, '28.411')] +[2023-02-24 13:43:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19369984. Throughput: 0: 924.0. Samples: 4841824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:17,873][00205] Avg episode reward: [(0, '28.179')] +[2023-02-24 13:43:18,659][11215] Updated weights for policy 0, policy_version 4730 (0.0034) +[2023-02-24 13:43:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19390464. Throughput: 0: 969.9. Samples: 4848362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:22,873][00205] Avg episode reward: [(0, '28.688')] +[2023-02-24 13:43:27,183][11215] Updated weights for policy 0, policy_version 4740 (0.0017) +[2023-02-24 13:43:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19415040. Throughput: 0: 973.9. Samples: 4851878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:43:27,872][00205] Avg episode reward: [(0, '28.992')] +[2023-02-24 13:43:32,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 19427328. Throughput: 0: 935.0. Samples: 4857146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:32,875][00205] Avg episode reward: [(0, '29.533')] +[2023-02-24 13:43:37,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 19443712. Throughput: 0: 921.0. Samples: 4861466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:37,875][00205] Avg episode reward: [(0, '29.846')] +[2023-02-24 13:43:39,807][11215] Updated weights for policy 0, policy_version 4750 (0.0029) +[2023-02-24 13:43:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19468288. Throughput: 0: 949.6. Samples: 4864920. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:42,872][00205] Avg episode reward: [(0, '29.126')] +[2023-02-24 13:43:47,873][00205] Fps is (10 sec: 4505.2, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19488768. Throughput: 0: 975.4. Samples: 4871916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:43:47,876][00205] Avg episode reward: [(0, '30.106')] +[2023-02-24 13:43:49,625][11215] Updated weights for policy 0, policy_version 4760 (0.0016) +[2023-02-24 13:43:52,874][00205] Fps is (10 sec: 3684.9, 60 sec: 3822.7, 300 sec: 3790.5). Total num frames: 19505152. Throughput: 0: 924.6. Samples: 4876662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:52,876][00205] Avg episode reward: [(0, '30.569')] +[2023-02-24 13:43:57,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 19521536. Throughput: 0: 920.2. Samples: 4878788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:43:57,872][00205] Avg episode reward: [(0, '30.493')] +[2023-02-24 13:44:00,981][11215] Updated weights for policy 0, policy_version 4770 (0.0026) +[2023-02-24 13:44:02,870][00205] Fps is (10 sec: 4097.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19546112. Throughput: 0: 961.9. Samples: 4885108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:44:02,872][00205] Avg episode reward: [(0, '29.534')] +[2023-02-24 13:44:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 19566592. Throughput: 0: 967.4. Samples: 4891896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:07,872][00205] Avg episode reward: [(0, '30.184')] +[2023-02-24 13:44:11,976][11215] Updated weights for policy 0, policy_version 4780 (0.0011) +[2023-02-24 13:44:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19578880. Throughput: 0: 936.6. Samples: 4894026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:12,874][00205] Avg episode reward: [(0, '31.095')] +[2023-02-24 13:44:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19595264. Throughput: 0: 916.0. Samples: 4898364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:44:17,872][00205] Avg episode reward: [(0, '30.332')] +[2023-02-24 13:44:22,239][11215] Updated weights for policy 0, policy_version 4790 (0.0017) +[2023-02-24 13:44:22,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19619840. Throughput: 0: 977.0. Samples: 4905430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:22,873][00205] Avg episode reward: [(0, '30.039')] +[2023-02-24 13:44:27,874][00205] Fps is (10 sec: 4503.6, 60 sec: 3754.4, 300 sec: 3790.5). Total num frames: 19640320. Throughput: 0: 978.0. Samples: 4908936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:44:27,877][00205] Avg episode reward: [(0, '29.960')] +[2023-02-24 13:44:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 19656704. Throughput: 0: 931.4. Samples: 4913828. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:44:32,878][00205] Avg episode reward: [(0, '29.704')] +[2023-02-24 13:44:33,987][11215] Updated weights for policy 0, policy_version 4800 (0.0026) +[2023-02-24 13:44:37,870][00205] Fps is (10 sec: 3278.2, 60 sec: 3823.1, 300 sec: 3776.7). Total num frames: 19673088. Throughput: 0: 934.4. Samples: 4918708. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:44:37,873][00205] Avg episode reward: [(0, '30.977')] +[2023-02-24 13:44:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19697664. Throughput: 0: 962.5. Samples: 4922100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:44:42,872][00205] Avg episode reward: [(0, '32.188')] +[2023-02-24 13:44:43,543][11215] Updated weights for policy 0, policy_version 4810 (0.0024) +[2023-02-24 13:44:47,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 19718144. Throughput: 0: 978.1. Samples: 4929124. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:44:47,874][00205] Avg episode reward: [(0, '30.658')] +[2023-02-24 13:44:47,886][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004814_19718144.pth... +[2023-02-24 13:44:48,035][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_18808832.pth +[2023-02-24 13:44:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 19730432. Throughput: 0: 923.7. Samples: 4933462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:52,880][00205] Avg episode reward: [(0, '32.048')] +[2023-02-24 13:44:56,130][11215] Updated weights for policy 0, policy_version 4820 (0.0019) +[2023-02-24 13:44:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19746816. Throughput: 0: 923.3. Samples: 4935574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:44:57,872][00205] Avg episode reward: [(0, '31.180')] +[2023-02-24 13:45:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19771392. Throughput: 0: 963.9. Samples: 4941740. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:45:02,872][00205] Avg episode reward: [(0, '30.934')] +[2023-02-24 13:45:05,453][11215] Updated weights for policy 0, policy_version 4830 (0.0019) +[2023-02-24 13:45:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 19787776. Throughput: 0: 946.6. Samples: 4948026. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:45:07,874][00205] Avg episode reward: [(0, '30.586')] +[2023-02-24 13:45:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19804160. Throughput: 0: 916.7. Samples: 4950184. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:45:12,874][00205] Avg episode reward: [(0, '31.153')] +[2023-02-24 13:45:17,852][11215] Updated weights for policy 0, policy_version 4840 (0.0012) +[2023-02-24 13:45:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19824640. Throughput: 0: 908.3. Samples: 4954702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:45:17,878][00205] Avg episode reward: [(0, '29.579')] +[2023-02-24 13:45:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19845120. Throughput: 0: 957.3. Samples: 4961788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:45:22,879][00205] Avg episode reward: [(0, '30.000')] +[2023-02-24 13:45:27,014][11215] Updated weights for policy 0, policy_version 4850 (0.0021) +[2023-02-24 13:45:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 19865600. Throughput: 0: 959.2. Samples: 4965266. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-02-24 13:45:27,878][00205] Avg episode reward: [(0, '29.956')]
+[2023-02-24 13:45:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19881984. Throughput: 0: 906.3. Samples: 4969906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:45:32,874][00205] Avg episode reward: [(0, '31.605')]
+[2023-02-24 13:45:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19898368. Throughput: 0: 920.4. Samples: 4974880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-02-24 13:45:37,878][00205] Avg episode reward: [(0, '30.971')]
+[2023-02-24 13:45:39,131][11215] Updated weights for policy 0, policy_version 4860 (0.0014)
+[2023-02-24 13:45:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19922944. Throughput: 0: 951.0. Samples: 4978368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:45:42,879][00205] Avg episode reward: [(0, '30.475')]
+[2023-02-24 13:45:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3790.6). Total num frames: 19939328. Throughput: 0: 963.4. Samples: 4985094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-02-24 13:45:47,880][00205] Avg episode reward: [(0, '30.778')]
+[2023-02-24 13:45:49,491][11215] Updated weights for policy 0, policy_version 4870 (0.0015)
+[2023-02-24 13:45:52,875][00205] Fps is (10 sec: 3275.0, 60 sec: 3754.3, 300 sec: 3790.5). Total num frames: 19955712. Throughput: 0: 921.1. Samples: 4989480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-02-24 13:45:52,879][00205] Avg episode reward: [(0, '31.618')]
+[2023-02-24 13:45:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19976192. Throughput: 0: 922.4. Samples: 4991690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-02-24 13:45:57,880][00205] Avg episode reward: [(0, '31.637')]
+[2023-02-24 13:46:00,362][11215] Updated weights for policy 0, policy_version 4880 (0.0016)
+[2023-02-24 13:46:02,870][00205] Fps is (10 sec: 4098.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19996672. Throughput: 0: 975.7. Samples: 4998610. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-02-24 13:46:02,882][00205] Avg episode reward: [(0, '31.607')]
+[2023-02-24 13:46:03,954][11201] Stopping Batcher_0...
+[2023-02-24 13:46:03,954][11201] Loop batcher_evt_loop terminating...
+[2023-02-24 13:46:03,954][00205] Component Batcher_0 stopped!
+[2023-02-24 13:46:03,960][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth...
+[2023-02-24 13:46:04,002][11215] Weights refcount: 2 0
+[2023-02-24 13:46:04,006][00205] Component InferenceWorker_p0-w0 stopped!
+[2023-02-24 13:46:04,017][11215] Stopping InferenceWorker_p0-w0...
+[2023-02-24 13:46:04,017][11215] Loop inference_proc0-0_evt_loop terminating...
+[2023-02-24 13:46:04,036][11221] Stopping RolloutWorker_w1...
+[2023-02-24 13:46:04,031][11226] Stopping RolloutWorker_w7...
+[2023-02-24 13:46:04,032][00205] Component RolloutWorker_w7 stopped!
+[2023-02-24 13:46:04,038][00205] Component RolloutWorker_w1 stopped!
+[2023-02-24 13:46:04,041][00205] Component RolloutWorker_w5 stopped!
+[2023-02-24 13:46:04,041][11224] Stopping RolloutWorker_w5...
+[2023-02-24 13:46:04,044][11224] Loop rollout_proc5_evt_loop terminating...
+[2023-02-24 13:46:04,038][11221] Loop rollout_proc1_evt_loop terminating...
+[2023-02-24 13:46:04,045][11226] Loop rollout_proc7_evt_loop terminating...
+[2023-02-24 13:46:04,047][11223] Stopping RolloutWorker_w3...
+[2023-02-24 13:46:04,048][11223] Loop rollout_proc3_evt_loop terminating...
+[2023-02-24 13:46:04,049][00205] Component RolloutWorker_w3 stopped!
+[2023-02-24 13:46:04,064][11225] Stopping RolloutWorker_w6...
+[2023-02-24 13:46:04,064][00205] Component RolloutWorker_w6 stopped!
+[2023-02-24 13:46:04,072][11225] Loop rollout_proc6_evt_loop terminating...
+[2023-02-24 13:46:04,078][11222] Stopping RolloutWorker_w2...
+[2023-02-24 13:46:04,079][11222] Loop rollout_proc2_evt_loop terminating...
+[2023-02-24 13:46:04,078][00205] Component RolloutWorker_w2 stopped!
+[2023-02-24 13:46:04,088][11216] Stopping RolloutWorker_w0...
+[2023-02-24 13:46:04,088][00205] Component RolloutWorker_w0 stopped!
+[2023-02-24 13:46:04,093][11227] Stopping RolloutWorker_w4...
+[2023-02-24 13:46:04,093][00205] Component RolloutWorker_w4 stopped!
+[2023-02-24 13:46:04,095][11227] Loop rollout_proc4_evt_loop terminating...
+[2023-02-24 13:46:04,099][11216] Loop rollout_proc0_evt_loop terminating...
+[2023-02-24 13:46:04,139][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004703_19263488.pth
+[2023-02-24 13:46:04,151][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth...
+[2023-02-24 13:46:04,323][00205] Component LearnerWorker_p0 stopped!
+[2023-02-24 13:46:04,328][00205] Waiting for process learner_proc0 to stop...
+[2023-02-24 13:46:04,343][11201] Stopping LearnerWorker_p0...
+[2023-02-24 13:46:04,344][11201] Loop learner_proc0_evt_loop terminating...
+[2023-02-24 13:46:06,567][00205] Waiting for process inference_proc0-0 to join...
+[2023-02-24 13:46:07,271][00205] Waiting for process rollout_proc0 to join...
+[2023-02-24 13:46:08,011][00205] Waiting for process rollout_proc1 to join...
+[2023-02-24 13:46:08,013][00205] Waiting for process rollout_proc2 to join...
+[2023-02-24 13:46:08,022][00205] Waiting for process rollout_proc3 to join...
+[2023-02-24 13:46:08,024][00205] Waiting for process rollout_proc4 to join...
+[2023-02-24 13:46:08,025][00205] Waiting for process rollout_proc5 to join...
+[2023-02-24 13:46:08,026][00205] Waiting for process rollout_proc6 to join...
+[2023-02-24 13:46:08,028][00205] Waiting for process rollout_proc7 to join...
+[2023-02-24 13:46:08,030][00205] Batcher 0 profile tree view:
+batching: 125.6692, releasing_batches: 0.1240
+[2023-02-24 13:46:08,032][00205] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0001
+ wait_policy_total: 2737.0422
+update_model: 37.9221
+ weight_update: 0.0016
+one_step: 0.0118
+ handle_policy_step: 2483.2568
+ deserialize: 75.3816, stack: 14.8616, obs_to_device_normalize: 559.5526, forward: 1184.2792, send_messages: 130.3140
+ prepare_outputs: 395.5048
+ to_cpu: 242.4212
+[2023-02-24 13:46:08,034][00205] Learner 0 profile tree view:
+misc: 0.0290, prepare_batch: 60.7874
+train: 366.6477
+ epoch_init: 0.0621, minibatch_init: 0.0458, losses_postprocess: 2.9527, kl_divergence: 2.8728, after_optimizer: 162.1073
+ calculate_losses: 130.0243
+ losses_init: 0.0470, forward_head: 7.9970, bptt_initial: 86.1198, tail: 5.1093, advantages_returns: 1.4157, losses: 16.7013
+ bptt: 11.0390
+ bptt_forward_core: 10.5633
+ update: 65.4562
+ clip: 7.0785
+[2023-02-24 13:46:08,036][00205] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 1.6596, enqueue_policy_requests: 779.0498, env_step: 4091.0265, overhead: 107.0679, complete_rollouts: 33.4033
+save_policy_outputs: 98.6476
+ split_output_tensors: 47.6182
+[2023-02-24 13:46:08,038][00205] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 1.8929, enqueue_policy_requests: 766.4891, env_step: 4101.5740, overhead: 109.4752, complete_rollouts: 35.1840
+save_policy_outputs: 99.1619
+ split_output_tensors: 48.3378
+[2023-02-24 13:46:08,040][00205] Loop Runner_EvtLoop terminating...
+[2023-02-24 13:46:08,043][00205] Runner profile tree view:
+main_loop: 5474.8034
+[2023-02-24 13:46:08,055][00205] Collected {0: 20004864}, FPS: 3654.0
+[2023-02-24 14:12:39,442][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
+[2023-02-24 14:12:39,445][00205] Overriding arg 'num_workers' with value 1 passed from command line
+[2023-02-24 14:12:39,447][00205] Adding new argument 'no_render'=True that is not in the saved config file!
+[2023-02-24 14:12:39,449][00205] Adding new argument 'save_video'=True that is not in the saved config file!
+[2023-02-24 14:12:39,451][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+[2023-02-24 14:12:39,452][00205] Adding new argument 'video_name'=None that is not in the saved config file!
+[2023-02-24 14:12:39,456][00205] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
+[2023-02-24 14:12:39,457][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+[2023-02-24 14:12:39,458][00205] Adding new argument 'push_to_hub'=False that is not in the saved config file!
+[2023-02-24 14:12:39,459][00205] Adding new argument 'hf_repository'=None that is not in the saved config file!
+[2023-02-24 14:12:39,462][00205] Adding new argument 'policy_index'=0 that is not in the saved config file!
+[2023-02-24 14:12:39,463][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+[2023-02-24 14:12:39,465][00205] Adding new argument 'train_script'=None that is not in the saved config file!
+[2023-02-24 14:12:39,466][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+[2023-02-24 14:12:39,467][00205] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 14:12:39,510][00205] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 14:12:39,514][00205] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 14:12:39,519][00205] RunningMeanStd input shape: (1,) +[2023-02-24 14:12:39,546][00205] ConvEncoder: input_channels=3 +[2023-02-24 14:12:40,349][00205] Conv encoder output size: 512 +[2023-02-24 14:12:40,352][00205] Policy head output size: 512 +[2023-02-24 14:12:43,178][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 14:12:44,426][00205] Num frames 100... +[2023-02-24 14:12:44,538][00205] Num frames 200... +[2023-02-24 14:12:44,654][00205] Num frames 300... +[2023-02-24 14:12:44,765][00205] Num frames 400... +[2023-02-24 14:12:44,880][00205] Num frames 500... +[2023-02-24 14:12:44,989][00205] Num frames 600... +[2023-02-24 14:12:45,102][00205] Num frames 700... +[2023-02-24 14:12:45,213][00205] Num frames 800... +[2023-02-24 14:12:45,331][00205] Num frames 900... +[2023-02-24 14:12:45,443][00205] Num frames 1000... +[2023-02-24 14:12:45,561][00205] Num frames 1100... +[2023-02-24 14:12:45,677][00205] Num frames 1200... +[2023-02-24 14:12:45,797][00205] Num frames 1300... +[2023-02-24 14:12:45,909][00205] Num frames 1400... +[2023-02-24 14:12:46,022][00205] Num frames 1500... +[2023-02-24 14:12:46,083][00205] Avg episode rewards: #0: 36.040, true rewards: #0: 15.040 +[2023-02-24 14:12:46,085][00205] Avg episode reward: 36.040, avg true_objective: 15.040 +[2023-02-24 14:12:46,195][00205] Num frames 1600... +[2023-02-24 14:12:46,307][00205] Num frames 1700... +[2023-02-24 14:12:46,416][00205] Num frames 1800... +[2023-02-24 14:12:46,529][00205] Num frames 1900... +[2023-02-24 14:12:46,651][00205] Num frames 2000... +[2023-02-24 14:12:46,762][00205] Num frames 2100... +[2023-02-24 14:12:46,873][00205] Num frames 2200... +[2023-02-24 14:12:46,990][00205] Num frames 2300... +[2023-02-24 14:12:47,109][00205] Num frames 2400... +[2023-02-24 14:12:47,229][00205] Num frames 2500... +[2023-02-24 14:12:47,338][00205] Num frames 2600... +[2023-02-24 14:12:47,450][00205] Num frames 2700... +[2023-02-24 14:12:47,531][00205] Avg episode rewards: #0: 32.600, true rewards: #0: 13.600 +[2023-02-24 14:12:47,533][00205] Avg episode reward: 32.600, avg true_objective: 13.600 +[2023-02-24 14:12:47,623][00205] Num frames 2800... +[2023-02-24 14:12:47,739][00205] Num frames 2900... +[2023-02-24 14:12:47,850][00205] Num frames 3000... +[2023-02-24 14:12:47,973][00205] Num frames 3100... +[2023-02-24 14:12:48,083][00205] Num frames 3200... +[2023-02-24 14:12:48,194][00205] Num frames 3300... +[2023-02-24 14:12:48,304][00205] Num frames 3400... +[2023-02-24 14:12:48,419][00205] Num frames 3500... +[2023-02-24 14:12:48,564][00205] Avg episode rewards: #0: 27.613, true rewards: #0: 11.947 +[2023-02-24 14:12:48,565][00205] Avg episode reward: 27.613, avg true_objective: 11.947 +[2023-02-24 14:12:48,589][00205] Num frames 3600... +[2023-02-24 14:12:48,704][00205] Num frames 3700... +[2023-02-24 14:12:48,821][00205] Num frames 3800... +[2023-02-24 14:12:48,929][00205] Num frames 3900... +[2023-02-24 14:12:49,048][00205] Num frames 4000... +[2023-02-24 14:12:49,166][00205] Num frames 4100... +[2023-02-24 14:12:49,278][00205] Num frames 4200... +[2023-02-24 14:12:49,390][00205] Num frames 4300... +[2023-02-24 14:12:49,502][00205] Num frames 4400... 
+[2023-02-24 14:12:49,616][00205] Num frames 4500... +[2023-02-24 14:12:49,733][00205] Num frames 4600... +[2023-02-24 14:12:49,842][00205] Num frames 4700... +[2023-02-24 14:12:49,955][00205] Num frames 4800... +[2023-02-24 14:12:50,066][00205] Num frames 4900... +[2023-02-24 14:12:50,178][00205] Num frames 5000... +[2023-02-24 14:12:50,290][00205] Num frames 5100... +[2023-02-24 14:12:50,405][00205] Num frames 5200... +[2023-02-24 14:12:50,523][00205] Num frames 5300... +[2023-02-24 14:12:50,633][00205] Num frames 5400... +[2023-02-24 14:12:50,778][00205] Avg episode rewards: #0: 32.680, true rewards: #0: 13.680 +[2023-02-24 14:12:50,779][00205] Avg episode reward: 32.680, avg true_objective: 13.680 +[2023-02-24 14:12:50,815][00205] Num frames 5500... +[2023-02-24 14:12:50,926][00205] Num frames 5600... +[2023-02-24 14:12:51,041][00205] Num frames 5700... +[2023-02-24 14:12:51,148][00205] Num frames 5800... +[2023-02-24 14:12:51,258][00205] Num frames 5900... +[2023-02-24 14:12:51,336][00205] Avg episode rewards: #0: 27.240, true rewards: #0: 11.840 +[2023-02-24 14:12:51,339][00205] Avg episode reward: 27.240, avg true_objective: 11.840 +[2023-02-24 14:12:51,437][00205] Num frames 6000... +[2023-02-24 14:12:51,547][00205] Num frames 6100... +[2023-02-24 14:12:51,658][00205] Num frames 6200... +[2023-02-24 14:12:51,774][00205] Num frames 6300... +[2023-02-24 14:12:51,889][00205] Num frames 6400... +[2023-02-24 14:12:52,001][00205] Num frames 6500... +[2023-02-24 14:12:52,112][00205] Num frames 6600... +[2023-02-24 14:12:52,223][00205] Num frames 6700... +[2023-02-24 14:12:52,391][00205] Num frames 6800... +[2023-02-24 14:12:52,551][00205] Num frames 6900... +[2023-02-24 14:12:52,704][00205] Num frames 7000... +[2023-02-24 14:12:52,875][00205] Num frames 7100... +[2023-02-24 14:12:53,035][00205] Num frames 7200... +[2023-02-24 14:12:53,190][00205] Num frames 7300... +[2023-02-24 14:12:53,354][00205] Num frames 7400... +[2023-02-24 14:12:53,432][00205] Avg episode rewards: #0: 29.685, true rewards: #0: 12.352 +[2023-02-24 14:12:53,437][00205] Avg episode reward: 29.685, avg true_objective: 12.352 +[2023-02-24 14:12:53,581][00205] Num frames 7500... +[2023-02-24 14:12:53,741][00205] Num frames 7600... +[2023-02-24 14:12:53,896][00205] Num frames 7700... +[2023-02-24 14:12:54,052][00205] Num frames 7800... +[2023-02-24 14:12:54,215][00205] Num frames 7900... +[2023-02-24 14:12:54,378][00205] Num frames 8000... +[2023-02-24 14:12:54,538][00205] Num frames 8100... +[2023-02-24 14:12:54,696][00205] Num frames 8200... +[2023-02-24 14:12:54,861][00205] Num frames 8300... +[2023-02-24 14:12:55,032][00205] Avg episode rewards: #0: 28.530, true rewards: #0: 11.959 +[2023-02-24 14:12:55,034][00205] Avg episode reward: 28.530, avg true_objective: 11.959 +[2023-02-24 14:12:55,091][00205] Num frames 8400... +[2023-02-24 14:12:55,262][00205] Num frames 8500... +[2023-02-24 14:12:55,425][00205] Num frames 8600... +[2023-02-24 14:12:55,584][00205] Num frames 8700... +[2023-02-24 14:12:55,744][00205] Num frames 8800... +[2023-02-24 14:12:55,863][00205] Avg episode rewards: #0: 26.321, true rewards: #0: 11.071 +[2023-02-24 14:12:55,866][00205] Avg episode reward: 26.321, avg true_objective: 11.071 +[2023-02-24 14:12:55,918][00205] Num frames 8900... +[2023-02-24 14:12:56,027][00205] Num frames 9000... +[2023-02-24 14:12:56,143][00205] Num frames 9100... +[2023-02-24 14:12:56,255][00205] Num frames 9200... +[2023-02-24 14:12:56,369][00205] Num frames 9300... +[2023-02-24 14:12:56,482][00205] Num frames 9400... 
+[2023-02-24 14:12:56,600][00205] Num frames 9500... +[2023-02-24 14:12:56,711][00205] Num frames 9600... +[2023-02-24 14:12:56,835][00205] Num frames 9700... +[2023-02-24 14:12:56,966][00205] Num frames 9800... +[2023-02-24 14:12:57,076][00205] Num frames 9900... +[2023-02-24 14:12:57,148][00205] Avg episode rewards: #0: 25.903, true rewards: #0: 11.014 +[2023-02-24 14:12:57,150][00205] Avg episode reward: 25.903, avg true_objective: 11.014 +[2023-02-24 14:12:57,248][00205] Num frames 10000... +[2023-02-24 14:12:57,368][00205] Num frames 10100... +[2023-02-24 14:12:57,479][00205] Num frames 10200... +[2023-02-24 14:12:57,591][00205] Num frames 10300... +[2023-02-24 14:12:57,701][00205] Num frames 10400... +[2023-02-24 14:12:57,814][00205] Num frames 10500... +[2023-02-24 14:12:57,930][00205] Num frames 10600... +[2023-02-24 14:12:58,040][00205] Num frames 10700... +[2023-02-24 14:12:58,151][00205] Num frames 10800... +[2023-02-24 14:12:58,227][00205] Avg episode rewards: #0: 25.217, true rewards: #0: 10.817 +[2023-02-24 14:12:58,229][00205] Avg episode reward: 25.217, avg true_objective: 10.817 +[2023-02-24 14:14:02,653][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4! +[2023-02-24 14:22:15,412][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 14:22:15,415][00205] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 14:22:15,417][00205] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 14:22:15,419][00205] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 14:22:15,421][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 14:22:15,423][00205] Adding new argument 'video_name'=None that is not in the saved config file! +[2023-02-24 14:22:15,425][00205] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2023-02-24 14:22:15,427][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 14:22:15,433][00205] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2023-02-24 14:22:15,434][00205] Adding new argument 'hf_repository'='dbaibak/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2023-02-24 14:22:15,436][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 14:22:15,439][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 14:22:15,441][00205] Adding new argument 'train_script'=None that is not in the saved config file! +[2023-02-24 14:22:15,443][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2023-02-24 14:22:15,445][00205] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 14:22:15,463][00205] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 14:22:15,466][00205] RunningMeanStd input shape: (1,) +[2023-02-24 14:22:15,480][00205] ConvEncoder: input_channels=3 +[2023-02-24 14:22:15,518][00205] Conv encoder output size: 512 +[2023-02-24 14:22:15,520][00205] Policy head output size: 512 +[2023-02-24 14:22:15,540][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 14:22:15,996][00205] Num frames 100... +[2023-02-24 14:22:16,126][00205] Num frames 200... 
+[2023-02-24 14:22:16,252][00205] Num frames 300... +[2023-02-24 14:22:16,374][00205] Num frames 400... +[2023-02-24 14:22:16,493][00205] Num frames 500... +[2023-02-24 14:22:16,613][00205] Num frames 600... +[2023-02-24 14:22:16,738][00205] Num frames 700... +[2023-02-24 14:22:16,854][00205] Num frames 800... +[2023-02-24 14:22:16,974][00205] Num frames 900... +[2023-02-24 14:22:17,099][00205] Num frames 1000... +[2023-02-24 14:22:17,235][00205] Num frames 1100... +[2023-02-24 14:22:17,354][00205] Num frames 1200... +[2023-02-24 14:22:17,484][00205] Num frames 1300... +[2023-02-24 14:22:17,606][00205] Num frames 1400... +[2023-02-24 14:22:17,733][00205] Num frames 1500... +[2023-02-24 14:22:17,857][00205] Num frames 1600... +[2023-02-24 14:22:18,032][00205] Avg episode rewards: #0: 41.980, true rewards: #0: 16.980 +[2023-02-24 14:22:18,034][00205] Avg episode reward: 41.980, avg true_objective: 16.980 +[2023-02-24 14:22:18,040][00205] Num frames 1700... +[2023-02-24 14:22:18,166][00205] Num frames 1800... +[2023-02-24 14:22:18,286][00205] Num frames 1900... +[2023-02-24 14:22:18,408][00205] Num frames 2000... +[2023-02-24 14:22:18,519][00205] Num frames 2100... +[2023-02-24 14:22:18,628][00205] Num frames 2200... +[2023-02-24 14:22:18,738][00205] Num frames 2300... +[2023-02-24 14:22:18,856][00205] Num frames 2400... +[2023-02-24 14:22:18,968][00205] Num frames 2500... +[2023-02-24 14:22:19,082][00205] Num frames 2600... +[2023-02-24 14:22:19,202][00205] Num frames 2700... +[2023-02-24 14:22:19,322][00205] Num frames 2800... +[2023-02-24 14:22:19,435][00205] Num frames 2900... +[2023-02-24 14:22:19,549][00205] Num frames 3000... +[2023-02-24 14:22:19,622][00205] Avg episode rewards: #0: 36.555, true rewards: #0: 15.055 +[2023-02-24 14:22:19,624][00205] Avg episode reward: 36.555, avg true_objective: 15.055 +[2023-02-24 14:22:19,732][00205] Num frames 3100... +[2023-02-24 14:22:19,843][00205] Num frames 3200... +[2023-02-24 14:22:19,967][00205] Num frames 3300... +[2023-02-24 14:22:20,082][00205] Num frames 3400... +[2023-02-24 14:22:20,201][00205] Num frames 3500... +[2023-02-24 14:22:20,315][00205] Num frames 3600... +[2023-02-24 14:22:20,430][00205] Num frames 3700... +[2023-02-24 14:22:20,544][00205] Num frames 3800... +[2023-02-24 14:22:20,660][00205] Num frames 3900... +[2023-02-24 14:22:20,775][00205] Num frames 4000... +[2023-02-24 14:22:20,890][00205] Num frames 4100... +[2023-02-24 14:22:21,004][00205] Num frames 4200... +[2023-02-24 14:22:21,134][00205] Avg episode rewards: #0: 34.890, true rewards: #0: 14.223 +[2023-02-24 14:22:21,135][00205] Avg episode reward: 34.890, avg true_objective: 14.223 +[2023-02-24 14:22:21,192][00205] Num frames 4300... +[2023-02-24 14:22:21,310][00205] Num frames 4400... +[2023-02-24 14:22:21,426][00205] Num frames 4500... +[2023-02-24 14:22:21,542][00205] Num frames 4600... +[2023-02-24 14:22:21,666][00205] Num frames 4700... +[2023-02-24 14:22:21,782][00205] Num frames 4800... +[2023-02-24 14:22:21,908][00205] Num frames 4900... +[2023-02-24 14:22:22,022][00205] Num frames 5000... +[2023-02-24 14:22:22,135][00205] Num frames 5100... +[2023-02-24 14:22:22,260][00205] Num frames 5200... +[2023-02-24 14:22:22,373][00205] Num frames 5300... +[2023-02-24 14:22:22,489][00205] Num frames 5400... +[2023-02-24 14:22:22,603][00205] Num frames 5500... +[2023-02-24 14:22:22,720][00205] Num frames 5600... +[2023-02-24 14:22:22,835][00205] Num frames 5700... 
+[2023-02-24 14:22:22,942][00205] Avg episode rewards: #0: 36.850, true rewards: #0: 14.350 +[2023-02-24 14:22:22,944][00205] Avg episode reward: 36.850, avg true_objective: 14.350 +[2023-02-24 14:22:23,016][00205] Num frames 5800... +[2023-02-24 14:22:23,132][00205] Num frames 5900... +[2023-02-24 14:22:23,254][00205] Num frames 6000... +[2023-02-24 14:22:23,366][00205] Num frames 6100... +[2023-02-24 14:22:23,511][00205] Avg episode rewards: #0: 30.960, true rewards: #0: 12.360 +[2023-02-24 14:22:23,513][00205] Avg episode reward: 30.960, avg true_objective: 12.360 +[2023-02-24 14:22:23,542][00205] Num frames 6200... +[2023-02-24 14:22:23,655][00205] Num frames 6300... +[2023-02-24 14:22:23,776][00205] Num frames 6400... +[2023-02-24 14:22:23,901][00205] Num frames 6500... +[2023-02-24 14:22:24,018][00205] Num frames 6600... +[2023-02-24 14:22:24,132][00205] Num frames 6700... +[2023-02-24 14:22:24,255][00205] Num frames 6800... +[2023-02-24 14:22:24,376][00205] Num frames 6900... +[2023-02-24 14:22:24,490][00205] Num frames 7000... +[2023-02-24 14:22:24,610][00205] Num frames 7100... +[2023-02-24 14:22:24,737][00205] Num frames 7200... +[2023-02-24 14:22:24,852][00205] Num frames 7300... +[2023-02-24 14:22:24,970][00205] Num frames 7400... +[2023-02-24 14:22:25,086][00205] Num frames 7500... +[2023-02-24 14:22:25,244][00205] Num frames 7600... +[2023-02-24 14:22:25,425][00205] Num frames 7700... +[2023-02-24 14:22:25,592][00205] Num frames 7800... +[2023-02-24 14:22:25,756][00205] Num frames 7900... +[2023-02-24 14:22:25,919][00205] Num frames 8000... +[2023-02-24 14:22:26,008][00205] Avg episode rewards: #0: 33.361, true rewards: #0: 13.362 +[2023-02-24 14:22:26,015][00205] Avg episode reward: 33.361, avg true_objective: 13.362 +[2023-02-24 14:22:26,151][00205] Num frames 8100... +[2023-02-24 14:22:26,324][00205] Num frames 8200... +[2023-02-24 14:22:26,489][00205] Num frames 8300... +[2023-02-24 14:22:26,641][00205] Num frames 8400... +[2023-02-24 14:22:26,803][00205] Num frames 8500... +[2023-02-24 14:22:26,968][00205] Num frames 8600... +[2023-02-24 14:22:27,120][00205] Avg episode rewards: #0: 30.653, true rewards: #0: 12.367 +[2023-02-24 14:22:27,123][00205] Avg episode reward: 30.653, avg true_objective: 12.367 +[2023-02-24 14:22:27,202][00205] Num frames 8700... +[2023-02-24 14:22:27,379][00205] Num frames 8800... +[2023-02-24 14:22:27,549][00205] Num frames 8900... +[2023-02-24 14:22:27,718][00205] Num frames 9000... +[2023-02-24 14:22:27,893][00205] Num frames 9100... +[2023-02-24 14:22:28,060][00205] Num frames 9200... +[2023-02-24 14:22:28,227][00205] Num frames 9300... +[2023-02-24 14:22:28,398][00205] Num frames 9400... +[2023-02-24 14:22:28,576][00205] Num frames 9500... +[2023-02-24 14:22:28,741][00205] Num frames 9600... +[2023-02-24 14:22:28,860][00205] Num frames 9700... +[2023-02-24 14:22:28,987][00205] Num frames 9800... +[2023-02-24 14:22:29,095][00205] Avg episode rewards: #0: 30.429, true rewards: #0: 12.304 +[2023-02-24 14:22:29,096][00205] Avg episode reward: 30.429, avg true_objective: 12.304 +[2023-02-24 14:22:29,167][00205] Num frames 9900... +[2023-02-24 14:22:29,289][00205] Num frames 10000... +[2023-02-24 14:22:29,402][00205] Num frames 10100... +[2023-02-24 14:22:29,522][00205] Num frames 10200... +[2023-02-24 14:22:29,639][00205] Num frames 10300... +[2023-02-24 14:22:29,752][00205] Num frames 10400... +[2023-02-24 14:22:29,869][00205] Num frames 10500... +[2023-02-24 14:22:29,982][00205] Num frames 10600... 
+[2023-02-24 14:22:30,093][00205] Avg episode rewards: #0: 28.937, true rewards: #0: 11.826 +[2023-02-24 14:22:30,096][00205] Avg episode reward: 28.937, avg true_objective: 11.826 +[2023-02-24 14:22:30,163][00205] Num frames 10700... +[2023-02-24 14:22:30,280][00205] Num frames 10800... +[2023-02-24 14:22:30,399][00205] Num frames 10900... +[2023-02-24 14:22:30,527][00205] Num frames 11000... +[2023-02-24 14:22:30,640][00205] Num frames 11100... +[2023-02-24 14:22:30,756][00205] Num frames 11200... +[2023-02-24 14:22:30,874][00205] Num frames 11300... +[2023-02-24 14:22:30,988][00205] Num frames 11400... +[2023-02-24 14:22:31,105][00205] Num frames 11500... +[2023-02-24 14:22:31,218][00205] Num frames 11600... +[2023-02-24 14:22:31,341][00205] Num frames 11700... +[2023-02-24 14:22:31,461][00205] Num frames 11800... +[2023-02-24 14:22:31,580][00205] Num frames 11900... +[2023-02-24 14:22:31,701][00205] Num frames 12000... +[2023-02-24 14:22:31,781][00205] Avg episode rewards: #0: 29.819, true rewards: #0: 12.019 +[2023-02-24 14:22:31,782][00205] Avg episode reward: 29.819, avg true_objective: 12.019 +[2023-02-24 14:23:45,181][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4! +[2023-02-24 14:28:54,610][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 14:28:54,613][00205] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 14:28:54,616][00205] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 14:28:54,619][00205] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 14:28:54,621][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 14:28:54,624][00205] Adding new argument 'video_name'=None that is not in the saved config file! +[2023-02-24 14:28:54,625][00205] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2023-02-24 14:28:54,626][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 14:28:54,628][00205] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2023-02-24 14:28:54,629][00205] Adding new argument 'hf_repository'='dbaibak/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2023-02-24 14:28:54,630][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 14:28:54,632][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 14:28:54,633][00205] Adding new argument 'train_script'=None that is not in the saved config file! +[2023-02-24 14:28:54,635][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2023-02-24 14:28:54,636][00205] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 14:28:54,667][00205] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 14:28:54,669][00205] RunningMeanStd input shape: (1,) +[2023-02-24 14:28:54,686][00205] ConvEncoder: input_channels=3 +[2023-02-24 14:28:54,746][00205] Conv encoder output size: 512 +[2023-02-24 14:28:54,747][00205] Policy head output size: 512 +[2023-02-24 14:28:54,769][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 14:28:55,225][00205] Num frames 100... +[2023-02-24 14:28:55,343][00205] Num frames 200... 
+[2023-02-24 14:28:55,454][00205] Num frames 300... +[2023-02-24 14:28:55,583][00205] Num frames 400... +[2023-02-24 14:28:55,694][00205] Num frames 500... +[2023-02-24 14:28:55,818][00205] Num frames 600... +[2023-02-24 14:28:55,945][00205] Num frames 700... +[2023-02-24 14:28:56,066][00205] Num frames 800... +[2023-02-24 14:28:56,179][00205] Num frames 900... +[2023-02-24 14:28:56,290][00205] Num frames 1000... +[2023-02-24 14:28:56,401][00205] Num frames 1100... +[2023-02-24 14:28:56,516][00205] Num frames 1200... +[2023-02-24 14:28:56,638][00205] Num frames 1300... +[2023-02-24 14:28:56,749][00205] Num frames 1400... +[2023-02-24 14:28:56,861][00205] Num frames 1500... +[2023-02-24 14:28:56,973][00205] Num frames 1600... +[2023-02-24 14:28:57,087][00205] Num frames 1700... +[2023-02-24 14:28:57,207][00205] Num frames 1800... +[2023-02-24 14:28:57,332][00205] Num frames 1900... +[2023-02-24 14:28:57,450][00205] Num frames 2000... +[2023-02-24 14:28:57,575][00205] Num frames 2100... +[2023-02-24 14:28:57,628][00205] Avg episode rewards: #0: 64.999, true rewards: #0: 21.000 +[2023-02-24 14:28:57,629][00205] Avg episode reward: 64.999, avg true_objective: 21.000 +[2023-02-24 14:28:57,748][00205] Num frames 2200... +[2023-02-24 14:28:57,873][00205] Num frames 2300... +[2023-02-24 14:28:57,988][00205] Num frames 2400... +[2023-02-24 14:28:58,102][00205] Num frames 2500... +[2023-02-24 14:28:58,215][00205] Num frames 2600... +[2023-02-24 14:28:58,339][00205] Num frames 2700... +[2023-02-24 14:28:58,493][00205] Num frames 2800... +[2023-02-24 14:28:58,617][00205] Num frames 2900... +[2023-02-24 14:28:58,732][00205] Num frames 3000... +[2023-02-24 14:28:58,851][00205] Num frames 3100... +[2023-02-24 14:28:58,964][00205] Num frames 3200... +[2023-02-24 14:28:59,134][00205] Num frames 3300... +[2023-02-24 14:28:59,299][00205] Num frames 3400... +[2023-02-24 14:28:59,421][00205] Avg episode rewards: #0: 49.200, true rewards: #0: 17.200 +[2023-02-24 14:28:59,427][00205] Avg episode reward: 49.200, avg true_objective: 17.200 +[2023-02-24 14:28:59,527][00205] Num frames 3500... +[2023-02-24 14:28:59,691][00205] Num frames 3600... +[2023-02-24 14:28:59,846][00205] Num frames 3700... +[2023-02-24 14:29:00,000][00205] Num frames 3800... +[2023-02-24 14:29:00,169][00205] Num frames 3900... +[2023-02-24 14:29:00,327][00205] Num frames 4000... +[2023-02-24 14:29:00,485][00205] Num frames 4100... +[2023-02-24 14:29:00,654][00205] Num frames 4200... +[2023-02-24 14:29:00,749][00205] Avg episode rewards: #0: 39.740, true rewards: #0: 14.073 +[2023-02-24 14:29:00,751][00205] Avg episode reward: 39.740, avg true_objective: 14.073 +[2023-02-24 14:29:00,877][00205] Num frames 4300... +[2023-02-24 14:29:01,032][00205] Num frames 4400... +[2023-02-24 14:29:01,190][00205] Num frames 4500... +[2023-02-24 14:29:01,356][00205] Num frames 4600... +[2023-02-24 14:29:01,519][00205] Num frames 4700... +[2023-02-24 14:29:01,687][00205] Num frames 4800... +[2023-02-24 14:29:01,850][00205] Num frames 4900... +[2023-02-24 14:29:02,013][00205] Num frames 5000... +[2023-02-24 14:29:02,209][00205] Avg episode rewards: #0: 33.965, true rewards: #0: 12.715 +[2023-02-24 14:29:02,212][00205] Avg episode reward: 33.965, avg true_objective: 12.715 +[2023-02-24 14:29:02,243][00205] Num frames 5100... +[2023-02-24 14:29:02,413][00205] Num frames 5200... +[2023-02-24 14:29:02,533][00205] Num frames 5300... +[2023-02-24 14:29:02,647][00205] Num frames 5400... +[2023-02-24 14:29:02,766][00205] Num frames 5500... 
+[2023-02-24 14:29:02,880][00205] Num frames 5600... +[2023-02-24 14:29:02,997][00205] Num frames 5700... +[2023-02-24 14:29:03,113][00205] Num frames 5800... +[2023-02-24 14:29:03,230][00205] Num frames 5900... +[2023-02-24 14:29:03,340][00205] Num frames 6000... +[2023-02-24 14:29:03,451][00205] Num frames 6100... +[2023-02-24 14:29:03,565][00205] Num frames 6200... +[2023-02-24 14:29:03,684][00205] Num frames 6300... +[2023-02-24 14:29:03,805][00205] Num frames 6400... +[2023-02-24 14:29:03,920][00205] Num frames 6500... +[2023-02-24 14:29:04,053][00205] Avg episode rewards: #0: 35.338, true rewards: #0: 13.138 +[2023-02-24 14:29:04,055][00205] Avg episode reward: 35.338, avg true_objective: 13.138 +[2023-02-24 14:29:04,093][00205] Num frames 6600... +[2023-02-24 14:29:04,214][00205] Num frames 6700... +[2023-02-24 14:29:04,326][00205] Num frames 6800... +[2023-02-24 14:29:04,449][00205] Num frames 6900... +[2023-02-24 14:29:04,562][00205] Num frames 7000... +[2023-02-24 14:29:04,673][00205] Num frames 7100... +[2023-02-24 14:29:04,798][00205] Num frames 7200... +[2023-02-24 14:29:04,918][00205] Num frames 7300... +[2023-02-24 14:29:04,982][00205] Avg episode rewards: #0: 31.675, true rewards: #0: 12.175 +[2023-02-24 14:29:04,986][00205] Avg episode reward: 31.675, avg true_objective: 12.175 +[2023-02-24 14:29:05,094][00205] Num frames 7400... +[2023-02-24 14:29:05,206][00205] Num frames 7500... +[2023-02-24 14:29:05,326][00205] Num frames 7600... +[2023-02-24 14:29:05,438][00205] Num frames 7700... +[2023-02-24 14:29:05,553][00205] Num frames 7800... +[2023-02-24 14:29:05,669][00205] Num frames 7900... +[2023-02-24 14:29:05,795][00205] Num frames 8000... +[2023-02-24 14:29:05,909][00205] Num frames 8100... +[2023-02-24 14:29:06,023][00205] Num frames 8200... +[2023-02-24 14:29:06,134][00205] Num frames 8300... +[2023-02-24 14:29:06,250][00205] Num frames 8400... +[2023-02-24 14:29:06,361][00205] Num frames 8500... +[2023-02-24 14:29:06,480][00205] Num frames 8600... +[2023-02-24 14:29:06,559][00205] Avg episode rewards: #0: 31.314, true rewards: #0: 12.314 +[2023-02-24 14:29:06,560][00205] Avg episode reward: 31.314, avg true_objective: 12.314 +[2023-02-24 14:29:06,657][00205] Num frames 8700... +[2023-02-24 14:29:06,785][00205] Num frames 8800... +[2023-02-24 14:29:06,903][00205] Num frames 8900... +[2023-02-24 14:29:07,018][00205] Num frames 9000... +[2023-02-24 14:29:07,138][00205] Num frames 9100... +[2023-02-24 14:29:07,248][00205] Num frames 9200... +[2023-02-24 14:29:07,363][00205] Num frames 9300... +[2023-02-24 14:29:07,479][00205] Num frames 9400... +[2023-02-24 14:29:07,595][00205] Num frames 9500... +[2023-02-24 14:29:07,707][00205] Num frames 9600... +[2023-02-24 14:29:07,833][00205] Num frames 9700... +[2023-02-24 14:29:07,946][00205] Num frames 9800... +[2023-02-24 14:29:08,060][00205] Num frames 9900... +[2023-02-24 14:29:08,175][00205] Avg episode rewards: #0: 31.062, true rewards: #0: 12.437 +[2023-02-24 14:29:08,178][00205] Avg episode reward: 31.062, avg true_objective: 12.437 +[2023-02-24 14:29:08,238][00205] Num frames 10000... +[2023-02-24 14:29:08,357][00205] Num frames 10100... +[2023-02-24 14:29:08,469][00205] Num frames 10200... +[2023-02-24 14:29:08,584][00205] Num frames 10300... +[2023-02-24 14:29:08,715][00205] Avg episode rewards: #0: 28.296, true rewards: #0: 11.518 +[2023-02-24 14:29:08,717][00205] Avg episode reward: 28.296, avg true_objective: 11.518 +[2023-02-24 14:29:08,760][00205] Num frames 10400... 
+[2023-02-24 14:29:08,882][00205] Num frames 10500... +[2023-02-24 14:29:09,002][00205] Num frames 10600... +[2023-02-24 14:29:09,118][00205] Num frames 10700... +[2023-02-24 14:29:09,230][00205] Num frames 10800... +[2023-02-24 14:29:09,342][00205] Num frames 10900... +[2023-02-24 14:29:09,458][00205] Num frames 11000... +[2023-02-24 14:29:09,579][00205] Num frames 11100... +[2023-02-24 14:29:09,691][00205] Num frames 11200... +[2023-02-24 14:29:09,802][00205] Num frames 11300... +[2023-02-24 14:29:09,923][00205] Num frames 11400... +[2023-02-24 14:29:10,035][00205] Num frames 11500... +[2023-02-24 14:29:10,152][00205] Num frames 11600... +[2023-02-24 14:29:10,244][00205] Avg episode rewards: #0: 28.832, true rewards: #0: 11.632 +[2023-02-24 14:29:10,246][00205] Avg episode reward: 28.832, avg true_objective: 11.632 +[2023-02-24 14:30:20,364][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4!