diff --git "a/sf_log.txt" "b/sf_log.txt"
new file mode 100644
--- /dev/null
+++ "b/sf_log.txt"
@@ -0,0 +1,1418 @@
+[2023-02-24 09:55:08,547][01623] Saving configuration to /content/train_dir/default_experiment/config.json... +[2023-02-24 09:55:08,550][01623] Rollout worker 0 uses device cpu +[2023-02-24 09:55:08,552][01623] Rollout worker 1 uses device cpu +[2023-02-24 09:55:08,554][01623] Rollout worker 2 uses device cpu +[2023-02-24 09:55:08,555][01623] Rollout worker 3 uses device cpu +[2023-02-24 09:55:08,557][01623] Rollout worker 4 uses device cpu +[2023-02-24 09:55:08,558][01623] Rollout worker 5 uses device cpu +[2023-02-24 09:55:08,560][01623] Rollout worker 6 uses device cpu +[2023-02-24 09:55:08,562][01623] Rollout worker 7 uses device cpu +[2023-02-24 09:55:08,740][01623] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-02-24 09:55:08,743][01623] InferenceWorker_p0-w0: min num requests: 2 +[2023-02-24 09:55:08,773][01623] Starting all processes... +[2023-02-24 09:55:08,774][01623] Starting process learner_proc0 +[2023-02-24 09:55:08,830][01623] Starting all processes... 
+[2023-02-24 09:55:08,846][01623] Starting process inference_proc0-0 +[2023-02-24 09:55:08,852][01623] Starting process rollout_proc0 +[2023-02-24 09:55:08,869][01623] Starting process rollout_proc1 +[2023-02-24 09:55:08,870][01623] Starting process rollout_proc2 +[2023-02-24 09:55:08,870][01623] Starting process rollout_proc3 +[2023-02-24 09:55:08,870][01623] Starting process rollout_proc4 +[2023-02-24 09:55:08,870][01623] Starting process rollout_proc5 +[2023-02-24 09:55:08,870][01623] Starting process rollout_proc6 +[2023-02-24 09:55:08,871][01623] Starting process rollout_proc7 +[2023-02-24 09:55:20,565][15460] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-02-24 09:55:20,569][15460] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2023-02-24 09:55:20,661][15485] Worker 6 uses CPU cores [0] +[2023-02-24 09:55:20,682][15484] Worker 4 uses CPU cores [0] +[2023-02-24 09:55:20,728][15476] Worker 1 uses CPU cores [1] +[2023-02-24 09:55:20,738][15482] Worker 3 uses CPU cores [1] +[2023-02-24 09:55:20,772][15481] Worker 2 uses CPU cores [0] +[2023-02-24 09:55:20,780][15474] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-02-24 09:55:20,782][15474] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2023-02-24 09:55:20,803][15486] Worker 7 uses CPU cores [1] +[2023-02-24 09:55:20,911][15475] Worker 0 uses CPU cores [0] +[2023-02-24 09:55:20,926][15483] Worker 5 uses CPU cores [1] +[2023-02-24 09:55:21,394][15474] Num visible devices: 1 +[2023-02-24 09:55:21,394][15460] Num visible devices: 1 +[2023-02-24 09:55:21,413][15460] Starting seed is not provided +[2023-02-24 09:55:21,414][15460] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-02-24 09:55:21,414][15460] Initializing actor-critic model on device cuda:0 +[2023-02-24 09:55:21,415][15460] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 09:55:21,417][15460] RunningMeanStd input 
shape: (1,) +[2023-02-24 09:55:21,430][15460] ConvEncoder: input_channels=3 +[2023-02-24 09:55:21,706][15460] Conv encoder output size: 512 +[2023-02-24 09:55:21,706][15460] Policy head output size: 512 +[2023-02-24 09:55:21,753][15460] Created Actor Critic model with architecture: +[2023-02-24 09:55:21,754][15460] ActorCriticSharedWeights( + (obs_normalizer): ObservationNormalizer( + (running_mean_std): RunningMeanStdDictInPlace( + (running_mean_std): ModuleDict( + (obs): RunningMeanStdInPlace() + ) + ) + ) + (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) + (encoder): VizdoomEncoder( + (basic_encoder): ConvEncoder( + (enc): RecursiveScriptModule( + original_name=ConvEncoderImpl + (conv_head): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Conv2d) + (1): RecursiveScriptModule(original_name=ELU) + (2): RecursiveScriptModule(original_name=Conv2d) + (3): RecursiveScriptModule(original_name=ELU) + (4): RecursiveScriptModule(original_name=Conv2d) + (5): RecursiveScriptModule(original_name=ELU) + ) + (mlp_layers): RecursiveScriptModule( + original_name=Sequential + (0): RecursiveScriptModule(original_name=Linear) + (1): RecursiveScriptModule(original_name=ELU) + ) + ) + ) + ) + (core): ModelCoreRNN( + (core): GRU(512, 512) + ) + (decoder): MlpDecoder( + (mlp): Identity() + ) + (critic_linear): Linear(in_features=512, out_features=1, bias=True) + (action_parameterization): ActionParameterizationDefault( + (distribution_linear): Linear(in_features=512, out_features=5, bias=True) + ) +) +[2023-02-24 09:55:28,539][15460] Using optimizer +[2023-02-24 09:55:28,541][15460] No checkpoints found +[2023-02-24 09:55:28,541][15460] Did not load from checkpoint, starting from scratch! +[2023-02-24 09:55:28,542][15460] Initialized policy 0 weights for model version 0 +[2023-02-24 09:55:28,545][15460] LearnerWorker_p0 finished initialization! 
+[2023-02-24 09:55:28,547][15460] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-02-24 09:55:28,734][01623] Heartbeat connected on Batcher_0 +[2023-02-24 09:55:28,740][01623] Heartbeat connected on LearnerWorker_p0 +[2023-02-24 09:55:28,753][01623] Heartbeat connected on RolloutWorker_w0 +[2023-02-24 09:55:28,758][01623] Heartbeat connected on RolloutWorker_w1 +[2023-02-24 09:55:28,760][01623] Heartbeat connected on RolloutWorker_w2 +[2023-02-24 09:55:28,762][01623] Heartbeat connected on RolloutWorker_w3 +[2023-02-24 09:55:28,764][01623] Heartbeat connected on RolloutWorker_w4 +[2023-02-24 09:55:28,771][01623] Heartbeat connected on RolloutWorker_w5 +[2023-02-24 09:55:28,773][01623] Heartbeat connected on RolloutWorker_w6 +[2023-02-24 09:55:28,774][01623] Heartbeat connected on RolloutWorker_w7 +[2023-02-24 09:55:28,792][15474] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 09:55:28,793][15474] RunningMeanStd input shape: (1,) +[2023-02-24 09:55:28,816][15474] ConvEncoder: input_channels=3 +[2023-02-24 09:55:28,972][15474] Conv encoder output size: 512 +[2023-02-24 09:55:28,974][15474] Policy head output size: 512 +[2023-02-24 09:55:29,601][01623] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 09:55:31,694][01623] Inference worker 0-0 is ready! +[2023-02-24 09:55:31,696][01623] All inference workers are ready! Signal rollout workers to start! 
+[2023-02-24 09:55:31,707][01623] Heartbeat connected on InferenceWorker_p0-w0 +[2023-02-24 09:55:31,795][15476] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,810][15482] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,821][15483] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,853][15475] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,853][15486] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,866][15485] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,874][15484] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:31,877][15481] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 09:55:32,369][15485] Decorrelating experience for 0 frames... +[2023-02-24 09:55:32,715][15485] Decorrelating experience for 32 frames... +[2023-02-24 09:55:33,089][15483] Decorrelating experience for 0 frames... +[2023-02-24 09:55:33,096][15476] Decorrelating experience for 0 frames... +[2023-02-24 09:55:33,098][15482] Decorrelating experience for 0 frames... +[2023-02-24 09:55:33,102][15486] Decorrelating experience for 0 frames... +[2023-02-24 09:55:33,437][15486] Decorrelating experience for 32 frames... +[2023-02-24 09:55:33,839][15486] Decorrelating experience for 64 frames... +[2023-02-24 09:55:34,236][15486] Decorrelating experience for 96 frames... +[2023-02-24 09:55:34,445][15484] Decorrelating experience for 0 frames... +[2023-02-24 09:55:34,455][15481] Decorrelating experience for 0 frames... +[2023-02-24 09:55:34,495][15485] Decorrelating experience for 64 frames... +[2023-02-24 09:55:34,497][15475] Decorrelating experience for 0 frames... +[2023-02-24 09:55:34,601][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 09:55:35,204][15476] Decorrelating experience for 32 frames... +[2023-02-24 09:55:35,334][15483] Decorrelating experience for 32 frames... +[2023-02-24 09:55:35,786][15482] Decorrelating experience for 32 frames... +[2023-02-24 09:55:36,085][15484] Decorrelating experience for 32 frames... +[2023-02-24 09:55:36,167][15481] Decorrelating experience for 32 frames... +[2023-02-24 09:55:36,175][15475] Decorrelating experience for 32 frames... +[2023-02-24 09:55:36,591][15483] Decorrelating experience for 64 frames... +[2023-02-24 09:55:37,021][15476] Decorrelating experience for 64 frames... +[2023-02-24 09:55:37,779][15482] Decorrelating experience for 64 frames... +[2023-02-24 09:55:37,835][15476] Decorrelating experience for 96 frames... +[2023-02-24 09:55:38,414][15485] Decorrelating experience for 96 frames... +[2023-02-24 09:55:38,679][15484] Decorrelating experience for 64 frames... +[2023-02-24 09:55:38,914][15481] Decorrelating experience for 64 frames... +[2023-02-24 09:55:39,017][15475] Decorrelating experience for 64 frames... +[2023-02-24 09:55:39,601][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2.0. Samples: 20. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 09:55:39,612][01623] Avg episode reward: [(0, '1.280')] +[2023-02-24 09:55:39,953][15482] Decorrelating experience for 96 frames... +[2023-02-24 09:55:40,566][15484] Decorrelating experience for 96 frames... +[2023-02-24 09:55:40,679][15481] Decorrelating experience for 96 frames... +[2023-02-24 09:55:44,058][15460] Signal inference workers to stop experience collection... +[2023-02-24 09:55:44,081][15474] InferenceWorker_p0-w0: stopping experience collection +[2023-02-24 09:55:44,282][15483] Decorrelating experience for 96 frames... +[2023-02-24 09:55:44,602][01623] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 140.7. Samples: 2110. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 09:55:44,608][01623] Avg episode reward: [(0, '3.098')] +[2023-02-24 09:55:45,110][15475] Decorrelating experience for 96 frames... +[2023-02-24 09:55:46,666][15460] Signal inference workers to resume experience collection... +[2023-02-24 09:55:46,667][15474] InferenceWorker_p0-w0: resuming experience collection +[2023-02-24 09:55:49,602][01623] Fps is (10 sec: 819.1, 60 sec: 409.6, 300 sec: 409.6). Total num frames: 8192. Throughput: 0: 113.0. Samples: 2260. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) +[2023-02-24 09:55:49,605][01623] Avg episode reward: [(0, '3.056')] +[2023-02-24 09:55:54,601][01623] Fps is (10 sec: 2867.3, 60 sec: 1146.9, 300 sec: 1146.9). Total num frames: 28672. Throughput: 0: 257.8. Samples: 6444. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 09:55:54,606][01623] Avg episode reward: [(0, '3.857')] +[2023-02-24 09:55:56,720][15474] Updated weights for policy 0, policy_version 10 (0.0594) +[2023-02-24 09:55:59,602][01623] Fps is (10 sec: 4096.1, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 49152. Throughput: 0: 422.3. Samples: 12668. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) +[2023-02-24 09:55:59,606][01623] Avg episode reward: [(0, '4.322')] +[2023-02-24 09:56:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 1872.5, 300 sec: 1872.5). Total num frames: 65536. Throughput: 0: 432.9. Samples: 15150. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 09:56:04,605][01623] Avg episode reward: [(0, '4.341')] +[2023-02-24 09:56:09,601][01623] Fps is (10 sec: 2867.3, 60 sec: 1945.6, 300 sec: 1945.6). Total num frames: 77824. Throughput: 0: 473.4. Samples: 18936. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) +[2023-02-24 09:56:09,604][01623] Avg episode reward: [(0, '4.517')] +[2023-02-24 09:56:10,856][15474] Updated weights for policy 0, policy_version 20 (0.0017) +[2023-02-24 09:56:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 2093.5, 300 sec: 2093.5). 
Total num frames: 94208. Throughput: 0: 535.2. Samples: 24086. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 09:56:14,604][01623] Avg episode reward: [(0, '4.636')] +[2023-02-24 09:56:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 2293.8, 300 sec: 2293.8). Total num frames: 114688. Throughput: 0: 604.9. Samples: 27220. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:56:19,604][01623] Avg episode reward: [(0, '4.592')] +[2023-02-24 09:56:19,647][15460] Saving new best policy, reward=4.592! +[2023-02-24 09:56:20,610][15474] Updated weights for policy 0, policy_version 30 (0.0019) +[2023-02-24 09:56:24,602][01623] Fps is (10 sec: 3686.2, 60 sec: 2383.1, 300 sec: 2383.1). Total num frames: 131072. Throughput: 0: 727.5. Samples: 32760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:56:24,607][01623] Avg episode reward: [(0, '4.555')] +[2023-02-24 09:56:29,601][01623] Fps is (10 sec: 2867.2, 60 sec: 2389.3, 300 sec: 2389.3). Total num frames: 143360. Throughput: 0: 770.1. Samples: 36766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:56:29,604][01623] Avg episode reward: [(0, '4.497')] +[2023-02-24 09:56:34,608][01623] Fps is (10 sec: 2455.9, 60 sec: 2593.8, 300 sec: 2394.3). Total num frames: 155648. Throughput: 0: 802.0. Samples: 38354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 09:56:34,611][01623] Avg episode reward: [(0, '4.659')] +[2023-02-24 09:56:34,622][15460] Saving new best policy, reward=4.659! +[2023-02-24 09:56:36,662][15474] Updated weights for policy 0, policy_version 40 (0.0018) +[2023-02-24 09:56:39,604][01623] Fps is (10 sec: 2866.5, 60 sec: 2867.1, 300 sec: 2457.5). Total num frames: 172032. Throughput: 0: 796.4. Samples: 42284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:56:39,607][01623] Avg episode reward: [(0, '4.679')] +[2023-02-24 09:56:39,611][15460] Saving new best policy, reward=4.679! 
+[2023-02-24 09:56:44,601][01623] Fps is (10 sec: 3279.1, 60 sec: 3140.3, 300 sec: 2512.2). Total num frames: 188416. Throughput: 0: 770.5. Samples: 47342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:56:44,604][01623] Avg episode reward: [(0, '4.802')] +[2023-02-24 09:56:44,610][15460] Saving new best policy, reward=4.802! +[2023-02-24 09:56:49,601][01623] Fps is (10 sec: 2867.9, 60 sec: 3208.6, 300 sec: 2508.8). Total num frames: 200704. Throughput: 0: 760.8. Samples: 49384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:56:49,607][01623] Avg episode reward: [(0, '4.591')] +[2023-02-24 09:56:50,197][15474] Updated weights for policy 0, policy_version 50 (0.0044) +[2023-02-24 09:56:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2554.0). Total num frames: 217088. Throughput: 0: 767.2. Samples: 53460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:56:54,605][01623] Avg episode reward: [(0, '4.416')] +[2023-02-24 09:56:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 2639.6). Total num frames: 237568. Throughput: 0: 789.7. Samples: 59624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:56:59,611][01623] Avg episode reward: [(0, '4.464')] +[2023-02-24 09:57:01,010][15474] Updated weights for policy 0, policy_version 60 (0.0036) +[2023-02-24 09:57:04,607][01623] Fps is (10 sec: 4093.4, 60 sec: 3208.2, 300 sec: 2716.1). Total num frames: 258048. Throughput: 0: 788.8. Samples: 62722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:57:04,610][01623] Avg episode reward: [(0, '4.713')] +[2023-02-24 09:57:04,622][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth... +[2023-02-24 09:57:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2703.4). Total num frames: 270336. Throughput: 0: 778.8. Samples: 67804. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:57:09,613][01623] Avg episode reward: [(0, '4.802')] +[2023-02-24 09:57:13,988][15474] Updated weights for policy 0, policy_version 70 (0.0028) +[2023-02-24 09:57:14,601][01623] Fps is (10 sec: 2869.0, 60 sec: 3208.5, 300 sec: 2730.7). Total num frames: 286720. Throughput: 0: 784.4. Samples: 72066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:57:14,604][01623] Avg episode reward: [(0, '4.838')] +[2023-02-24 09:57:14,621][15460] Saving new best policy, reward=4.838! +[2023-02-24 09:57:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2792.7). Total num frames: 307200. Throughput: 0: 811.3. Samples: 74858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:57:19,605][01623] Avg episode reward: [(0, '4.846')] +[2023-02-24 09:57:19,608][15460] Saving new best policy, reward=4.846! +[2023-02-24 09:57:23,711][15474] Updated weights for policy 0, policy_version 80 (0.0027) +[2023-02-24 09:57:24,602][01623] Fps is (10 sec: 4095.6, 60 sec: 3276.8, 300 sec: 2849.4). Total num frames: 327680. Throughput: 0: 868.3. Samples: 81356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 09:57:24,610][01623] Avg episode reward: [(0, '4.863')] +[2023-02-24 09:57:24,653][15460] Saving new best policy, reward=4.863! +[2023-02-24 09:57:29,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2867.2). Total num frames: 344064. Throughput: 0: 861.8. Samples: 86124. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:57:29,604][01623] Avg episode reward: [(0, '4.891')] +[2023-02-24 09:57:29,607][15460] Saving new best policy, reward=4.891! +[2023-02-24 09:57:34,601][01623] Fps is (10 sec: 2867.5, 60 sec: 3345.5, 300 sec: 2850.8). Total num frames: 356352. Throughput: 0: 860.9. Samples: 88126. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:57:34,604][01623] Avg episode reward: [(0, '4.707')] +[2023-02-24 09:57:37,174][15474] Updated weights for policy 0, policy_version 90 (0.0032) +[2023-02-24 09:57:39,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 2898.7). Total num frames: 376832. Throughput: 0: 883.5. Samples: 93218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 09:57:39,608][01623] Avg episode reward: [(0, '4.664')] +[2023-02-24 09:57:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2943.1). Total num frames: 397312. Throughput: 0: 893.6. Samples: 99836. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:57:44,609][01623] Avg episode reward: [(0, '4.792')] +[2023-02-24 09:57:47,287][15474] Updated weights for policy 0, policy_version 100 (0.0020) +[2023-02-24 09:57:49,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 2955.0). Total num frames: 413696. Throughput: 0: 886.0. Samples: 102588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 09:57:49,604][01623] Avg episode reward: [(0, '5.149')] +[2023-02-24 09:57:49,607][15460] Saving new best policy, reward=5.149! +[2023-02-24 09:57:54,603][01623] Fps is (10 sec: 2866.6, 60 sec: 3481.5, 300 sec: 2937.8). Total num frames: 425984. Throughput: 0: 864.1. Samples: 106690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 09:57:54,607][01623] Avg episode reward: [(0, '5.300')] +[2023-02-24 09:57:54,622][15460] Saving new best policy, reward=5.300! +[2023-02-24 09:57:59,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 2976.4). Total num frames: 446464. Throughput: 0: 884.6. Samples: 111874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:57:59,604][01623] Avg episode reward: [(0, '5.256')] +[2023-02-24 09:57:59,842][15474] Updated weights for policy 0, policy_version 110 (0.0031) +[2023-02-24 09:58:04,602][01623] Fps is (10 sec: 4506.1, 60 sec: 3550.2, 300 sec: 3038.9). 
Total num frames: 471040. Throughput: 0: 896.2. Samples: 115190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:58:04,606][01623] Avg episode reward: [(0, '5.379')] +[2023-02-24 09:58:04,622][15460] Saving new best policy, reward=5.379! +[2023-02-24 09:58:09,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3046.4). Total num frames: 487424. Throughput: 0: 886.9. Samples: 121264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:58:09,606][01623] Avg episode reward: [(0, '5.480')] +[2023-02-24 09:58:09,613][15460] Saving new best policy, reward=5.480! +[2023-02-24 09:58:11,058][15474] Updated weights for policy 0, policy_version 120 (0.0025) +[2023-02-24 09:58:14,602][01623] Fps is (10 sec: 2867.4, 60 sec: 3549.8, 300 sec: 3028.6). Total num frames: 499712. Throughput: 0: 871.6. Samples: 125348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 09:58:14,611][01623] Avg episode reward: [(0, '5.273')] +[2023-02-24 09:58:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3035.9). Total num frames: 516096. Throughput: 0: 873.6. Samples: 127440. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 09:58:19,603][01623] Avg episode reward: [(0, '4.856')] +[2023-02-24 09:58:22,694][15474] Updated weights for policy 0, policy_version 130 (0.0025) +[2023-02-24 09:58:24,601][01623] Fps is (10 sec: 3686.6, 60 sec: 3481.7, 300 sec: 3066.2). Total num frames: 536576. Throughput: 0: 897.0. Samples: 133584. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:58:24,608][01623] Avg episode reward: [(0, '4.843')] +[2023-02-24 09:58:29,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3094.8). Total num frames: 557056. Throughput: 0: 878.3. Samples: 139360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:58:29,608][01623] Avg episode reward: [(0, '5.210')] +[2023-02-24 09:58:34,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3077.5). Total num frames: 569344. 
Throughput: 0: 864.3. Samples: 141480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 09:58:34,608][01623] Avg episode reward: [(0, '5.300')] +[2023-02-24 09:58:35,374][15474] Updated weights for policy 0, policy_version 140 (0.0029) +[2023-02-24 09:58:39,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3082.8). Total num frames: 585728. Throughput: 0: 866.0. Samples: 145656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:58:39,608][01623] Avg episode reward: [(0, '5.623')] +[2023-02-24 09:58:39,612][15460] Saving new best policy, reward=5.623! +[2023-02-24 09:58:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3129.8). Total num frames: 610304. Throughput: 0: 897.0. Samples: 152238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 09:58:44,604][01623] Avg episode reward: [(0, '5.774')] +[2023-02-24 09:58:44,614][15460] Saving new best policy, reward=5.774! +[2023-02-24 09:58:45,451][15474] Updated weights for policy 0, policy_version 150 (0.0019) +[2023-02-24 09:58:49,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3133.4). Total num frames: 626688. Throughput: 0: 897.1. Samples: 155558. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:58:49,604][01623] Avg episode reward: [(0, '5.783')] +[2023-02-24 09:58:49,606][15460] Saving new best policy, reward=5.783! +[2023-02-24 09:58:54,606][01623] Fps is (10 sec: 3275.1, 60 sec: 3618.0, 300 sec: 3136.9). Total num frames: 643072. Throughput: 0: 862.2. Samples: 160068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:58:54,620][01623] Avg episode reward: [(0, '5.628')] +[2023-02-24 09:58:58,616][15474] Updated weights for policy 0, policy_version 160 (0.0025) +[2023-02-24 09:58:59,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3140.3). Total num frames: 659456. Throughput: 0: 870.0. Samples: 164496. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:58:59,604][01623] Avg episode reward: [(0, '5.407')] +[2023-02-24 09:59:04,601][01623] Fps is (10 sec: 3688.2, 60 sec: 3481.7, 300 sec: 3162.5). Total num frames: 679936. Throughput: 0: 897.8. Samples: 167842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:59:04,607][01623] Avg episode reward: [(0, '5.295')] +[2023-02-24 09:59:04,618][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000166_679936.pth... +[2023-02-24 09:59:07,752][15474] Updated weights for policy 0, policy_version 170 (0.0021) +[2023-02-24 09:59:09,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3183.7). Total num frames: 700416. Throughput: 0: 908.1. Samples: 174448. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:59:09,603][01623] Avg episode reward: [(0, '5.823')] +[2023-02-24 09:59:09,609][15460] Saving new best policy, reward=5.823! +[2023-02-24 09:59:14,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3167.6). Total num frames: 712704. Throughput: 0: 876.0. Samples: 178778. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 09:59:14,611][01623] Avg episode reward: [(0, '6.096')] +[2023-02-24 09:59:14,627][15460] Saving new best policy, reward=6.096! +[2023-02-24 09:59:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3170.0). Total num frames: 729088. Throughput: 0: 874.6. Samples: 180836. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 09:59:19,608][01623] Avg episode reward: [(0, '5.990')] +[2023-02-24 09:59:21,228][15474] Updated weights for policy 0, policy_version 180 (0.0013) +[2023-02-24 09:59:24,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3189.7). Total num frames: 749568. Throughput: 0: 908.8. Samples: 186552. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:59:24,607][01623] Avg episode reward: [(0, '5.658')] +[2023-02-24 09:59:29,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3208.5). Total num frames: 770048. Throughput: 0: 909.1. Samples: 193148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:59:29,605][01623] Avg episode reward: [(0, '5.918')] +[2023-02-24 09:59:31,346][15474] Updated weights for policy 0, policy_version 190 (0.0018) +[2023-02-24 09:59:34,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3209.9). Total num frames: 786432. Throughput: 0: 882.8. Samples: 195284. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 09:59:34,608][01623] Avg episode reward: [(0, '5.978')] +[2023-02-24 09:59:39,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3194.9). Total num frames: 798720. Throughput: 0: 876.1. Samples: 199488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:59:39,603][01623] Avg episode reward: [(0, '6.391')] +[2023-02-24 09:59:39,607][15460] Saving new best policy, reward=6.391! +[2023-02-24 09:59:43,847][15474] Updated weights for policy 0, policy_version 200 (0.0014) +[2023-02-24 09:59:44,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3212.5). Total num frames: 819200. Throughput: 0: 904.7. Samples: 205206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:59:44,604][01623] Avg episode reward: [(0, '6.701')] +[2023-02-24 09:59:44,613][15460] Saving new best policy, reward=6.701! +[2023-02-24 09:59:49,604][01623] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3229.5). Total num frames: 839680. Throughput: 0: 898.5. Samples: 208278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 09:59:49,607][01623] Avg episode reward: [(0, '6.829')] +[2023-02-24 09:59:49,609][15460] Saving new best policy, reward=6.829! +[2023-02-24 09:59:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3230.4). Total num frames: 856064. 
Throughput: 0: 867.7. Samples: 213496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 09:59:54,607][01623] Avg episode reward: [(0, '6.911')] +[2023-02-24 09:59:54,621][15460] Saving new best policy, reward=6.911! +[2023-02-24 09:59:55,876][15474] Updated weights for policy 0, policy_version 210 (0.0014) +[2023-02-24 09:59:59,601][01623] Fps is (10 sec: 2868.1, 60 sec: 3481.6, 300 sec: 3216.1). Total num frames: 868352. Throughput: 0: 862.9. Samples: 217608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 09:59:59,607][01623] Avg episode reward: [(0, '7.136')] +[2023-02-24 09:59:59,612][15460] Saving new best policy, reward=7.136! +[2023-02-24 10:00:04,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3232.1). Total num frames: 888832. Throughput: 0: 872.7. Samples: 220106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:00:04,604][01623] Avg episode reward: [(0, '6.957')] +[2023-02-24 10:00:07,073][15474] Updated weights for policy 0, policy_version 220 (0.0015) +[2023-02-24 10:00:09,602][01623] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3232.9). Total num frames: 905216. Throughput: 0: 887.9. Samples: 226508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:00:09,605][01623] Avg episode reward: [(0, '6.833')] +[2023-02-24 10:00:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3219.3). Total num frames: 917504. Throughput: 0: 822.0. Samples: 230138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:00:14,609][01623] Avg episode reward: [(0, '6.858')] +[2023-02-24 10:00:19,601][01623] Fps is (10 sec: 2457.7, 60 sec: 3345.1, 300 sec: 3206.2). Total num frames: 929792. Throughput: 0: 812.1. Samples: 231830. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:00:19,606][01623] Avg episode reward: [(0, '7.099')] +[2023-02-24 10:00:23,971][15474] Updated weights for policy 0, policy_version 230 (0.0019) +[2023-02-24 10:00:24,604][01623] Fps is (10 sec: 2456.8, 60 sec: 3208.4, 300 sec: 3193.5). Total num frames: 942080. Throughput: 0: 792.8. Samples: 235168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:00:24,608][01623] Avg episode reward: [(0, '6.945')] +[2023-02-24 10:00:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 962560. Throughput: 0: 791.2. Samples: 240810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:00:29,604][01623] Avg episode reward: [(0, '6.872')] +[2023-02-24 10:00:33,671][15474] Updated weights for policy 0, policy_version 240 (0.0014) +[2023-02-24 10:00:34,601][01623] Fps is (10 sec: 4507.0, 60 sec: 3345.1, 300 sec: 3346.2). Total num frames: 987136. Throughput: 0: 796.9. Samples: 244138. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:00:34,604][01623] Avg episode reward: [(0, '7.081')] +[2023-02-24 10:00:39,606][01623] Fps is (10 sec: 3684.5, 60 sec: 3344.8, 300 sec: 3387.8). Total num frames: 999424. Throughput: 0: 806.1. Samples: 249774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:00:39,612][01623] Avg episode reward: [(0, '7.376')] +[2023-02-24 10:00:39,616][15460] Saving new best policy, reward=7.376! +[2023-02-24 10:00:44,601][01623] Fps is (10 sec: 2867.1, 60 sec: 3276.8, 300 sec: 3415.7). Total num frames: 1015808. Throughput: 0: 805.4. Samples: 253852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:00:44,604][01623] Avg episode reward: [(0, '7.371')] +[2023-02-24 10:00:46,830][15474] Updated weights for policy 0, policy_version 250 (0.0016) +[2023-02-24 10:00:49,601][01623] Fps is (10 sec: 3688.3, 60 sec: 3277.0, 300 sec: 3415.6). Total num frames: 1036288. Throughput: 0: 807.6. Samples: 256450. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:00:49,607][01623] Avg episode reward: [(0, '7.348')] +[2023-02-24 10:00:54,601][01623] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 1056768. Throughput: 0: 815.5. Samples: 263206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:00:54,609][01623] Avg episode reward: [(0, '7.778')] +[2023-02-24 10:00:54,618][15460] Saving new best policy, reward=7.778! +[2023-02-24 10:00:56,311][15474] Updated weights for policy 0, policy_version 260 (0.0014) +[2023-02-24 10:00:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 1073152. Throughput: 0: 852.5. Samples: 268500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:00:59,606][01623] Avg episode reward: [(0, '8.148')] +[2023-02-24 10:00:59,618][15460] Saving new best policy, reward=8.148! +[2023-02-24 10:01:04,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 1085440. Throughput: 0: 860.3. Samples: 270544. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:01:04,606][01623] Avg episode reward: [(0, '8.419')] +[2023-02-24 10:01:04,623][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth... +[2023-02-24 10:01:04,757][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth +[2023-02-24 10:01:04,776][15460] Saving new best policy, reward=8.419! +[2023-02-24 10:01:09,456][15474] Updated weights for policy 0, policy_version 270 (0.0033) +[2023-02-24 10:01:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 1105920. Throughput: 0: 888.3. Samples: 275138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:01:09,609][01623] Avg episode reward: [(0, '8.458')] +[2023-02-24 10:01:09,614][15460] Saving new best policy, reward=8.458! 
+[2023-02-24 10:01:14,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1126400. Throughput: 0: 907.5. Samples: 281646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:01:14,607][01623] Avg episode reward: [(0, '8.902')] +[2023-02-24 10:01:14,620][15460] Saving new best policy, reward=8.902! +[2023-02-24 10:01:19,601][01623] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1142784. Throughput: 0: 899.6. Samples: 284620. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:01:19,607][01623] Avg episode reward: [(0, '9.433')] +[2023-02-24 10:01:19,613][15460] Saving new best policy, reward=9.433! +[2023-02-24 10:01:20,813][15474] Updated weights for policy 0, policy_version 280 (0.0016) +[2023-02-24 10:01:24,604][01623] Fps is (10 sec: 2866.3, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1155072. Throughput: 0: 863.7. Samples: 288638. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 10:01:24,611][01623] Avg episode reward: [(0, '9.001')] +[2023-02-24 10:01:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.4). Total num frames: 1175552. Throughput: 0: 881.1. Samples: 293500. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:01:29,608][01623] Avg episode reward: [(0, '9.622')] +[2023-02-24 10:01:29,612][15460] Saving new best policy, reward=9.622! +[2023-02-24 10:01:32,365][15474] Updated weights for policy 0, policy_version 290 (0.0013) +[2023-02-24 10:01:34,601][01623] Fps is (10 sec: 4097.3, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1196032. Throughput: 0: 895.4. Samples: 296742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:01:34,604][01623] Avg episode reward: [(0, '9.686')] +[2023-02-24 10:01:34,614][15460] Saving new best policy, reward=9.686! +[2023-02-24 10:01:39,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3471.2). Total num frames: 1212416. Throughput: 0: 882.5. Samples: 302918. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:01:39,605][01623] Avg episode reward: [(0, '9.937')] +[2023-02-24 10:01:39,610][15460] Saving new best policy, reward=9.937! +[2023-02-24 10:01:44,601][01623] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1224704. Throughput: 0: 856.0. Samples: 307020. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:01:44,606][01623] Avg episode reward: [(0, '11.370')] +[2023-02-24 10:01:44,620][15460] Saving new best policy, reward=11.370! +[2023-02-24 10:01:45,110][15474] Updated weights for policy 0, policy_version 300 (0.0013) +[2023-02-24 10:01:49,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1241088. Throughput: 0: 855.7. Samples: 309052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:01:49,604][01623] Avg episode reward: [(0, '11.794')] +[2023-02-24 10:01:49,678][15460] Saving new best policy, reward=11.794! +[2023-02-24 10:01:54,601][01623] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1265664. Throughput: 0: 887.7. Samples: 315086. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:01:54,604][01623] Avg episode reward: [(0, '12.334')] +[2023-02-24 10:01:54,615][15460] Saving new best policy, reward=12.334! +[2023-02-24 10:01:55,546][15474] Updated weights for policy 0, policy_version 310 (0.0016) +[2023-02-24 10:01:59,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.3). Total num frames: 1282048. Throughput: 0: 882.4. Samples: 321356. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:01:59,604][01623] Avg episode reward: [(0, '13.059')] +[2023-02-24 10:01:59,607][15460] Saving new best policy, reward=13.059! +[2023-02-24 10:02:04,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1298432. Throughput: 0: 862.0. Samples: 323412. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:02:04,607][01623] Avg episode reward: [(0, '12.536')] +[2023-02-24 10:02:08,502][15474] Updated weights for policy 0, policy_version 320 (0.0038) +[2023-02-24 10:02:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1314816. Throughput: 0: 868.5. Samples: 327718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:02:09,608][01623] Avg episode reward: [(0, '12.612')] +[2023-02-24 10:02:14,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1335296. Throughput: 0: 904.1. Samples: 334184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:02:14,603][01623] Avg episode reward: [(0, '13.126')] +[2023-02-24 10:02:14,616][15460] Saving new best policy, reward=13.126! +[2023-02-24 10:02:17,750][15474] Updated weights for policy 0, policy_version 330 (0.0015) +[2023-02-24 10:02:19,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1355776. Throughput: 0: 904.7. Samples: 337454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:02:19,608][01623] Avg episode reward: [(0, '12.559')] +[2023-02-24 10:02:24,601][01623] Fps is (10 sec: 3276.7, 60 sec: 3550.0, 300 sec: 3471.2). Total num frames: 1368064. Throughput: 0: 873.2. Samples: 342212. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:02:24,609][01623] Avg episode reward: [(0, '13.064')] +[2023-02-24 10:02:29,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1384448. Throughput: 0: 875.4. Samples: 346412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:02:29,609][01623] Avg episode reward: [(0, '13.035')] +[2023-02-24 10:02:31,059][15474] Updated weights for policy 0, policy_version 340 (0.0012) +[2023-02-24 10:02:34,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1404928. Throughput: 0: 898.5. Samples: 349484. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:02:34,609][01623] Avg episode reward: [(0, '11.485')] +[2023-02-24 10:02:39,602][01623] Fps is (10 sec: 4095.8, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 1425408. Throughput: 0: 909.2. Samples: 356002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:02:39,605][01623] Avg episode reward: [(0, '11.966')] +[2023-02-24 10:02:41,293][15474] Updated weights for policy 0, policy_version 350 (0.0018) +[2023-02-24 10:02:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 1441792. Throughput: 0: 872.1. Samples: 360600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:02:44,603][01623] Avg episode reward: [(0, '11.484')] +[2023-02-24 10:02:49,601][01623] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1454080. Throughput: 0: 871.2. Samples: 362618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:02:49,608][01623] Avg episode reward: [(0, '11.155')] +[2023-02-24 10:02:53,807][15474] Updated weights for policy 0, policy_version 360 (0.0025) +[2023-02-24 10:02:54,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1474560. Throughput: 0: 894.1. Samples: 367954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:02:54,609][01623] Avg episode reward: [(0, '12.310')] +[2023-02-24 10:02:59,601][01623] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 1499136. Throughput: 0: 893.6. Samples: 374398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:02:59,603][01623] Avg episode reward: [(0, '12.829')] +[2023-02-24 10:03:04,602][01623] Fps is (10 sec: 3686.0, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 1511424. Throughput: 0: 877.3. Samples: 376932. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:03:04,606][01623] Avg episode reward: [(0, '12.561')] +[2023-02-24 10:03:04,620][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000369_1511424.pth... +[2023-02-24 10:03:04,748][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000166_679936.pth +[2023-02-24 10:03:05,545][15474] Updated weights for policy 0, policy_version 370 (0.0014) +[2023-02-24 10:03:09,601][01623] Fps is (10 sec: 2457.5, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1523712. Throughput: 0: 861.6. Samples: 380982. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 10:03:09,608][01623] Avg episode reward: [(0, '13.109')] +[2023-02-24 10:03:14,601][01623] Fps is (10 sec: 3277.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1544192. Throughput: 0: 885.0. Samples: 386238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:03:14,608][01623] Avg episode reward: [(0, '14.178')] +[2023-02-24 10:03:14,617][15460] Saving new best policy, reward=14.178! +[2023-02-24 10:03:17,307][15474] Updated weights for policy 0, policy_version 380 (0.0025) +[2023-02-24 10:03:19,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1564672. Throughput: 0: 884.6. Samples: 389292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:03:19,609][01623] Avg episode reward: [(0, '14.786')] +[2023-02-24 10:03:19,613][15460] Saving new best policy, reward=14.786! +[2023-02-24 10:03:24,603][01623] Fps is (10 sec: 3685.9, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 1581056. Throughput: 0: 859.0. Samples: 394656. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:03:24,605][01623] Avg episode reward: [(0, '14.672')] +[2023-02-24 10:03:29,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1593344. Throughput: 0: 845.6. Samples: 398650. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:03:29,610][01623] Avg episode reward: [(0, '15.984')] +[2023-02-24 10:03:29,612][15460] Saving new best policy, reward=15.984! +[2023-02-24 10:03:30,812][15474] Updated weights for policy 0, policy_version 390 (0.0031) +[2023-02-24 10:03:34,601][01623] Fps is (10 sec: 2867.6, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1609728. Throughput: 0: 848.4. Samples: 400794. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:03:34,608][01623] Avg episode reward: [(0, '15.915')] +[2023-02-24 10:03:39,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1634304. Throughput: 0: 870.8. Samples: 407138. Policy #0 lag: (min: 0.0, avg: 0.8, max: 1.0) +[2023-02-24 10:03:39,604][01623] Avg episode reward: [(0, '15.268')] +[2023-02-24 10:03:40,517][15474] Updated weights for policy 0, policy_version 400 (0.0014) +[2023-02-24 10:03:44,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1650688. Throughput: 0: 849.5. Samples: 412624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:03:44,606][01623] Avg episode reward: [(0, '15.724')] +[2023-02-24 10:03:49,606][01623] Fps is (10 sec: 2865.7, 60 sec: 3481.3, 300 sec: 3457.3). Total num frames: 1662976. Throughput: 0: 839.3. Samples: 414706. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:03:49,614][01623] Avg episode reward: [(0, '16.029')] +[2023-02-24 10:03:49,618][15460] Saving new best policy, reward=16.029! +[2023-02-24 10:03:54,602][01623] Fps is (10 sec: 2047.8, 60 sec: 3276.7, 300 sec: 3429.5). Total num frames: 1671168. Throughput: 0: 821.2. Samples: 417936. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:03:54,610][01623] Avg episode reward: [(0, '15.228')] +[2023-02-24 10:03:56,578][15474] Updated weights for policy 0, policy_version 410 (0.0014) +[2023-02-24 10:03:59,602][01623] Fps is (10 sec: 2048.9, 60 sec: 3072.0, 300 sec: 3401.8). 
Total num frames: 1683456. Throughput: 0: 788.0. Samples: 421698. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:03:59,605][01623] Avg episode reward: [(0, '15.284')] +[2023-02-24 10:04:04,601][01623] Fps is (10 sec: 3277.2, 60 sec: 3208.6, 300 sec: 3401.8). Total num frames: 1703936. Throughput: 0: 782.1. Samples: 424488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:04:04,607][01623] Avg episode reward: [(0, '14.918')] +[2023-02-24 10:04:09,027][15474] Updated weights for policy 0, policy_version 420 (0.0015) +[2023-02-24 10:04:09,602][01623] Fps is (10 sec: 3686.2, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 1720320. Throughput: 0: 777.8. Samples: 429658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:04:09,607][01623] Avg episode reward: [(0, '15.504')] +[2023-02-24 10:04:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1732608. Throughput: 0: 780.0. Samples: 433748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:04:14,610][01623] Avg episode reward: [(0, '15.924')] +[2023-02-24 10:04:19,601][01623] Fps is (10 sec: 3277.1, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1753088. Throughput: 0: 792.7. Samples: 436466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:04:19,607][01623] Avg episode reward: [(0, '16.672')] +[2023-02-24 10:04:19,611][15460] Saving new best policy, reward=16.672! +[2023-02-24 10:04:21,060][15474] Updated weights for policy 0, policy_version 430 (0.0022) +[2023-02-24 10:04:24,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3208.6, 300 sec: 3401.8). Total num frames: 1773568. Throughput: 0: 790.6. Samples: 442716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:04:24,608][01623] Avg episode reward: [(0, '17.575')] +[2023-02-24 10:04:24,619][15460] Saving new best policy, reward=17.575! +[2023-02-24 10:04:29,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3401.8). 
Total num frames: 1789952. Throughput: 0: 779.3. Samples: 447692. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:04:29,611][01623] Avg episode reward: [(0, '17.729')] +[2023-02-24 10:04:29,615][15460] Saving new best policy, reward=17.729! +[2023-02-24 10:04:33,839][15474] Updated weights for policy 0, policy_version 440 (0.0023) +[2023-02-24 10:04:34,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 1802240. Throughput: 0: 777.4. Samples: 449686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:04:34,608][01623] Avg episode reward: [(0, '17.100')] +[2023-02-24 10:04:39,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1822720. Throughput: 0: 811.1. Samples: 454436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:04:39,604][01623] Avg episode reward: [(0, '16.484')] +[2023-02-24 10:04:44,330][15474] Updated weights for policy 0, policy_version 450 (0.0023) +[2023-02-24 10:04:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 1843200. Throughput: 0: 868.2. Samples: 460766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:04:44,604][01623] Avg episode reward: [(0, '15.741')] +[2023-02-24 10:04:49,602][01623] Fps is (10 sec: 3276.7, 60 sec: 3208.8, 300 sec: 3387.9). Total num frames: 1855488. Throughput: 0: 866.7. Samples: 463492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:04:49,604][01623] Avg episode reward: [(0, '15.975')] +[2023-02-24 10:04:54,605][01623] Fps is (10 sec: 2866.0, 60 sec: 3344.9, 300 sec: 3401.7). Total num frames: 1871872. Throughput: 0: 844.6. Samples: 467668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:04:54,607][01623] Avg episode reward: [(0, '15.501')] +[2023-02-24 10:04:57,617][15474] Updated weights for policy 0, policy_version 460 (0.0033) +[2023-02-24 10:04:59,601][01623] Fps is (10 sec: 3686.6, 60 sec: 3481.6, 300 sec: 3401.8). 
Total num frames: 1892352. Throughput: 0: 868.9. Samples: 472850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:04:59,604][01623] Avg episode reward: [(0, '16.011')] +[2023-02-24 10:05:04,604][01623] Fps is (10 sec: 4096.5, 60 sec: 3481.4, 300 sec: 3415.6). Total num frames: 1912832. Throughput: 0: 878.4. Samples: 475998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:05:04,607][01623] Avg episode reward: [(0, '15.949')] +[2023-02-24 10:05:04,622][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000467_1912832.pth... +[2023-02-24 10:05:04,745][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth +[2023-02-24 10:05:07,438][15474] Updated weights for policy 0, policy_version 470 (0.0012) +[2023-02-24 10:05:09,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3429.5). Total num frames: 1929216. Throughput: 0: 873.2. Samples: 482012. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:05:09,609][01623] Avg episode reward: [(0, '16.740')] +[2023-02-24 10:05:14,601][01623] Fps is (10 sec: 2868.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1941504. Throughput: 0: 853.3. Samples: 486092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:05:14,604][01623] Avg episode reward: [(0, '17.286')] +[2023-02-24 10:05:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3443.5). Total num frames: 1957888. Throughput: 0: 852.9. Samples: 488068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:05:19,604][01623] Avg episode reward: [(0, '18.113')] +[2023-02-24 10:05:19,612][15460] Saving new best policy, reward=18.113! +[2023-02-24 10:05:20,728][15474] Updated weights for policy 0, policy_version 480 (0.0031) +[2023-02-24 10:05:24,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1978368. Throughput: 0: 884.7. Samples: 494248. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:05:24,607][01623] Avg episode reward: [(0, '19.522')] +[2023-02-24 10:05:24,652][15460] Saving new best policy, reward=19.522! +[2023-02-24 10:05:29,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1998848. Throughput: 0: 866.8. Samples: 499774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:05:29,604][01623] Avg episode reward: [(0, '19.879')] +[2023-02-24 10:05:29,610][15460] Saving new best policy, reward=19.879! +[2023-02-24 10:05:32,357][15474] Updated weights for policy 0, policy_version 490 (0.0014) +[2023-02-24 10:05:34,604][01623] Fps is (10 sec: 3275.8, 60 sec: 3481.4, 300 sec: 3429.6). Total num frames: 2011136. Throughput: 0: 851.0. Samples: 501788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:05:34,608][01623] Avg episode reward: [(0, '19.909')] +[2023-02-24 10:05:34,623][15460] Saving new best policy, reward=19.909! +[2023-02-24 10:05:39,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2027520. Throughput: 0: 849.9. Samples: 505910. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:05:39,603][01623] Avg episode reward: [(0, '19.955')] +[2023-02-24 10:05:39,613][15460] Saving new best policy, reward=19.955! +[2023-02-24 10:05:43,763][15474] Updated weights for policy 0, policy_version 500 (0.0016) +[2023-02-24 10:05:44,601][01623] Fps is (10 sec: 3687.6, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2048000. Throughput: 0: 882.6. Samples: 512566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:05:44,608][01623] Avg episode reward: [(0, '20.631')] +[2023-02-24 10:05:44,619][15460] Saving new best policy, reward=20.631! +[2023-02-24 10:05:49,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2068480. Throughput: 0: 885.4. Samples: 515840. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:05:49,610][01623] Avg episode reward: [(0, '20.844')] +[2023-02-24 10:05:49,615][15460] Saving new best policy, reward=20.844! +[2023-02-24 10:05:54,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3481.8, 300 sec: 3415.6). Total num frames: 2080768. Throughput: 0: 853.4. Samples: 520414. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:05:54,606][01623] Avg episode reward: [(0, '21.272')] +[2023-02-24 10:05:54,626][15460] Saving new best policy, reward=21.272! +[2023-02-24 10:05:56,321][15474] Updated weights for policy 0, policy_version 510 (0.0016) +[2023-02-24 10:05:59,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2097152. Throughput: 0: 860.3. Samples: 524806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:05:59,608][01623] Avg episode reward: [(0, '21.640')] +[2023-02-24 10:05:59,611][15460] Saving new best policy, reward=21.640! +[2023-02-24 10:06:04,601][01623] Fps is (10 sec: 4095.9, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 2121728. Throughput: 0: 889.1. Samples: 528076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:06:04,609][01623] Avg episode reward: [(0, '21.780')] +[2023-02-24 10:06:04,624][15460] Saving new best policy, reward=21.780! +[2023-02-24 10:06:06,286][15474] Updated weights for policy 0, policy_version 520 (0.0015) +[2023-02-24 10:06:09,603][01623] Fps is (10 sec: 4504.9, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 2142208. Throughput: 0: 898.5. Samples: 534684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:06:09,610][01623] Avg episode reward: [(0, '23.552')] +[2023-02-24 10:06:09,617][15460] Saving new best policy, reward=23.552! +[2023-02-24 10:06:14,603][01623] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3429.5). Total num frames: 2154496. Throughput: 0: 874.2. Samples: 539116. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:06:14,610][01623] Avg episode reward: [(0, '23.781')] +[2023-02-24 10:06:14,627][15460] Saving new best policy, reward=23.781! +[2023-02-24 10:06:19,343][15474] Updated weights for policy 0, policy_version 530 (0.0026) +[2023-02-24 10:06:19,601][01623] Fps is (10 sec: 2867.7, 60 sec: 3549.9, 300 sec: 3443.5). Total num frames: 2170880. Throughput: 0: 875.3. Samples: 541174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:06:19,608][01623] Avg episode reward: [(0, '22.682')] +[2023-02-24 10:06:24,601][01623] Fps is (10 sec: 3687.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2191360. Throughput: 0: 912.2. Samples: 546958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:06:24,609][01623] Avg episode reward: [(0, '22.237')] +[2023-02-24 10:06:28,771][15474] Updated weights for policy 0, policy_version 540 (0.0027) +[2023-02-24 10:06:29,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2211840. Throughput: 0: 910.6. Samples: 553542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:06:29,608][01623] Avg episode reward: [(0, '20.801')] +[2023-02-24 10:06:34,607][01623] Fps is (10 sec: 3684.1, 60 sec: 3618.0, 300 sec: 3443.3). Total num frames: 2228224. Throughput: 0: 886.6. Samples: 555744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:06:34,617][01623] Avg episode reward: [(0, '18.850')] +[2023-02-24 10:06:39,602][01623] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 2240512. Throughput: 0: 875.7. Samples: 559820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:06:39,608][01623] Avg episode reward: [(0, '17.854')] +[2023-02-24 10:06:41,896][15474] Updated weights for policy 0, policy_version 550 (0.0020) +[2023-02-24 10:06:44,601][01623] Fps is (10 sec: 3278.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2260992. Throughput: 0: 907.7. Samples: 565652. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:06:44,611][01623] Avg episode reward: [(0, '18.313')] +[2023-02-24 10:06:49,601][01623] Fps is (10 sec: 4506.2, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 2285568. Throughput: 0: 908.8. Samples: 568970. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:06:49,607][01623] Avg episode reward: [(0, '18.114')] +[2023-02-24 10:06:51,963][15474] Updated weights for policy 0, policy_version 560 (0.0014) +[2023-02-24 10:06:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 2297856. Throughput: 0: 882.3. Samples: 574386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:06:54,604][01623] Avg episode reward: [(0, '18.056')] +[2023-02-24 10:06:59,602][01623] Fps is (10 sec: 2867.0, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 2314240. Throughput: 0: 874.4. Samples: 578462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:06:59,609][01623] Avg episode reward: [(0, '18.771')] +[2023-02-24 10:07:04,344][15474] Updated weights for policy 0, policy_version 570 (0.0023) +[2023-02-24 10:07:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2334720. Throughput: 0: 889.2. Samples: 581190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:07:04,604][01623] Avg episode reward: [(0, '19.929')] +[2023-02-24 10:07:04,618][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000570_2334720.pth... +[2023-02-24 10:07:04,736][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000369_1511424.pth +[2023-02-24 10:07:09,601][01623] Fps is (10 sec: 4096.3, 60 sec: 3550.0, 300 sec: 3457.3). Total num frames: 2355200. Throughput: 0: 909.9. Samples: 587904. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:07:09,604][01623] Avg episode reward: [(0, '19.326')] +[2023-02-24 10:07:14,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3443.4). Total num frames: 2371584. Throughput: 0: 878.3. Samples: 593066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:07:14,604][01623] Avg episode reward: [(0, '18.592')] +[2023-02-24 10:07:15,816][15474] Updated weights for policy 0, policy_version 580 (0.0016) +[2023-02-24 10:07:19,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2383872. Throughput: 0: 875.5. Samples: 595138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:07:19,608][01623] Avg episode reward: [(0, '18.477')] +[2023-02-24 10:07:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2404352. Throughput: 0: 898.0. Samples: 600228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:07:24,606][01623] Avg episode reward: [(0, '20.244')] +[2023-02-24 10:07:27,745][15474] Updated weights for policy 0, policy_version 590 (0.0019) +[2023-02-24 10:07:29,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2420736. Throughput: 0: 883.7. Samples: 605420. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:07:29,606][01623] Avg episode reward: [(0, '19.592')] +[2023-02-24 10:07:34,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.4, 300 sec: 3401.8). Total num frames: 2428928. Throughput: 0: 851.5. Samples: 607288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:07:34,604][01623] Avg episode reward: [(0, '20.107')] +[2023-02-24 10:07:39,601][01623] Fps is (10 sec: 2048.0, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2441216. Throughput: 0: 803.9. Samples: 610562. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:07:39,609][01623] Avg episode reward: [(0, '21.932')] +[2023-02-24 10:07:44,063][15474] Updated weights for policy 0, policy_version 600 (0.0026) +[2023-02-24 10:07:44,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 2457600. Throughput: 0: 804.8. Samples: 614676. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:07:44,604][01623] Avg episode reward: [(0, '22.029')] +[2023-02-24 10:07:49,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 2478080. Throughput: 0: 817.5. Samples: 617978. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:07:49,604][01623] Avg episode reward: [(0, '24.077')] +[2023-02-24 10:07:49,619][15460] Saving new best policy, reward=24.077! +[2023-02-24 10:07:53,257][15474] Updated weights for policy 0, policy_version 610 (0.0019) +[2023-02-24 10:07:54,603][01623] Fps is (10 sec: 4504.8, 60 sec: 3413.2, 300 sec: 3401.7). Total num frames: 2502656. Throughput: 0: 814.6. Samples: 624564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:07:54,638][01623] Avg episode reward: [(0, '21.676')] +[2023-02-24 10:07:59,603][01623] Fps is (10 sec: 3685.6, 60 sec: 3345.0, 300 sec: 3401.8). Total num frames: 2514944. Throughput: 0: 806.0. Samples: 629338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:07:59,607][01623] Avg episode reward: [(0, '21.667')] +[2023-02-24 10:08:04,601][01623] Fps is (10 sec: 2867.7, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 2531328. Throughput: 0: 806.5. Samples: 631432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 10:08:04,603][01623] Avg episode reward: [(0, '20.829')] +[2023-02-24 10:08:06,462][15474] Updated weights for policy 0, policy_version 620 (0.0014) +[2023-02-24 10:08:09,601][01623] Fps is (10 sec: 3687.2, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 2551808. Throughput: 0: 815.6. Samples: 636932. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:08:09,604][01623] Avg episode reward: [(0, '21.048')] +[2023-02-24 10:08:14,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2572288. Throughput: 0: 850.2. Samples: 643680. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:08:14,609][01623] Avg episode reward: [(0, '21.229')] +[2023-02-24 10:08:16,146][15474] Updated weights for policy 0, policy_version 630 (0.0034) +[2023-02-24 10:08:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3415.7). Total num frames: 2588672. Throughput: 0: 864.9. Samples: 646208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:08:19,604][01623] Avg episode reward: [(0, '20.770')] +[2023-02-24 10:08:24,604][01623] Fps is (10 sec: 2866.4, 60 sec: 3276.7, 300 sec: 3415.6). Total num frames: 2600960. Throughput: 0: 886.5. Samples: 650456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:08:24,609][01623] Avg episode reward: [(0, '22.123')] +[2023-02-24 10:08:28,723][15474] Updated weights for policy 0, policy_version 640 (0.0017) +[2023-02-24 10:08:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 2621440. Throughput: 0: 918.0. Samples: 655986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:08:29,609][01623] Avg episode reward: [(0, '22.073')] +[2023-02-24 10:08:34,601][01623] Fps is (10 sec: 4506.8, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 2646016. Throughput: 0: 917.7. Samples: 659276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:08:34,608][01623] Avg episode reward: [(0, '21.749')] +[2023-02-24 10:08:39,462][15474] Updated weights for policy 0, policy_version 650 (0.0015) +[2023-02-24 10:08:39,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3429.5). Total num frames: 2662400. Throughput: 0: 899.0. Samples: 665018. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:08:39,607][01623] Avg episode reward: [(0, '21.415')] +[2023-02-24 10:08:44,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3429.6). Total num frames: 2674688. Throughput: 0: 885.6. Samples: 669188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:08:44,606][01623] Avg episode reward: [(0, '21.579')] +[2023-02-24 10:08:49,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 2695168. Throughput: 0: 890.8. Samples: 671520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:08:49,607][01623] Avg episode reward: [(0, '22.475')] +[2023-02-24 10:08:51,080][15474] Updated weights for policy 0, policy_version 660 (0.0012) +[2023-02-24 10:08:54,601][01623] Fps is (10 sec: 4505.7, 60 sec: 3618.2, 300 sec: 3512.8). Total num frames: 2719744. Throughput: 0: 918.0. Samples: 678240. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:08:54,607][01623] Avg episode reward: [(0, '21.894')] +[2023-02-24 10:08:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3485.1). Total num frames: 2732032. Throughput: 0: 893.2. Samples: 683874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:08:59,604][01623] Avg episode reward: [(0, '22.262')] +[2023-02-24 10:09:02,695][15474] Updated weights for policy 0, policy_version 670 (0.0017) +[2023-02-24 10:09:04,602][01623] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2748416. Throughput: 0: 885.4. Samples: 686052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:09:04,605][01623] Avg episode reward: [(0, '24.232')] +[2023-02-24 10:09:04,622][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth... +[2023-02-24 10:09:04,776][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000467_1912832.pth +[2023-02-24 10:09:04,789][15460] Saving new best policy, reward=24.232! 
+[2023-02-24 10:09:09,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2764800. Throughput: 0: 892.8. Samples: 690628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:09:09,604][01623] Avg episode reward: [(0, '24.161')] +[2023-02-24 10:09:13,529][15474] Updated weights for policy 0, policy_version 680 (0.0012) +[2023-02-24 10:09:14,601][01623] Fps is (10 sec: 4096.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 2789376. Throughput: 0: 910.6. Samples: 696964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:09:14,608][01623] Avg episode reward: [(0, '24.324')] +[2023-02-24 10:09:14,621][15460] Saving new best policy, reward=24.324! +[2023-02-24 10:09:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2801664. Throughput: 0: 895.1. Samples: 699554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:09:19,605][01623] Avg episode reward: [(0, '24.367')] +[2023-02-24 10:09:19,610][15460] Saving new best policy, reward=24.367! +[2023-02-24 10:09:24,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3550.0, 300 sec: 3471.2). Total num frames: 2813952. Throughput: 0: 854.8. Samples: 703486. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:09:24,606][01623] Avg episode reward: [(0, '23.909')] +[2023-02-24 10:09:28,437][15474] Updated weights for policy 0, policy_version 690 (0.0030) +[2023-02-24 10:09:29,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2830336. Throughput: 0: 853.2. Samples: 707580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:09:29,607][01623] Avg episode reward: [(0, '23.510')] +[2023-02-24 10:09:34,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2850816. Throughput: 0: 872.7. Samples: 710790. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:09:34,604][01623] Avg episode reward: [(0, '22.200')] +[2023-02-24 10:09:37,931][15474] Updated weights for policy 0, policy_version 700 (0.0012) +[2023-02-24 10:09:39,601][01623] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2871296. Throughput: 0: 865.6. Samples: 717190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:09:39,611][01623] Avg episode reward: [(0, '22.911')] +[2023-02-24 10:09:44,601][01623] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2883584. Throughput: 0: 832.7. Samples: 721344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:09:44,610][01623] Avg episode reward: [(0, '22.923')] +[2023-02-24 10:09:49,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2899968. Throughput: 0: 828.7. Samples: 723344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:09:49,608][01623] Avg episode reward: [(0, '22.383')] +[2023-02-24 10:09:51,240][15474] Updated weights for policy 0, policy_version 710 (0.0025) +[2023-02-24 10:09:54,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2920448. Throughput: 0: 857.6. Samples: 729220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:09:54,603][01623] Avg episode reward: [(0, '23.340')] +[2023-02-24 10:09:59,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2940928. Throughput: 0: 861.3. Samples: 735722. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:09:59,603][01623] Avg episode reward: [(0, '22.931')] +[2023-02-24 10:10:01,844][15474] Updated weights for policy 0, policy_version 720 (0.0036) +[2023-02-24 10:10:04,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3471.2). Total num frames: 2953216. Throughput: 0: 849.6. Samples: 737786. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:10:04,608][01623] Avg episode reward: [(0, '21.784')] +[2023-02-24 10:10:09,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2969600. Throughput: 0: 855.0. Samples: 741962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:10:09,607][01623] Avg episode reward: [(0, '21.709')] +[2023-02-24 10:10:13,855][15474] Updated weights for policy 0, policy_version 730 (0.0015) +[2023-02-24 10:10:14,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 2990080. Throughput: 0: 897.3. Samples: 747960. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:10:14,603][01623] Avg episode reward: [(0, '20.498')] +[2023-02-24 10:10:19,601][01623] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3014656. Throughput: 0: 899.9. Samples: 751286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:10:19,604][01623] Avg episode reward: [(0, '21.542')] +[2023-02-24 10:10:24,603][01623] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3485.0). Total num frames: 3026944. Throughput: 0: 872.2. Samples: 756440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:10:24,610][01623] Avg episode reward: [(0, '22.666')] +[2023-02-24 10:10:25,462][15474] Updated weights for policy 0, policy_version 740 (0.0013) +[2023-02-24 10:10:29,603][01623] Fps is (10 sec: 2457.1, 60 sec: 3481.5, 300 sec: 3485.1). Total num frames: 3039232. Throughput: 0: 869.5. Samples: 760472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:10:29,606][01623] Avg episode reward: [(0, '24.674')] +[2023-02-24 10:10:29,610][15460] Saving new best policy, reward=24.674! +[2023-02-24 10:10:34,601][01623] Fps is (10 sec: 3277.5, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3059712. Throughput: 0: 880.5. Samples: 762968. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:10:34,604][01623] Avg episode reward: [(0, '24.153')] +[2023-02-24 10:10:36,776][15474] Updated weights for policy 0, policy_version 750 (0.0014) +[2023-02-24 10:10:39,601][01623] Fps is (10 sec: 4506.5, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3084288. Throughput: 0: 897.2. Samples: 769592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:10:39,604][01623] Avg episode reward: [(0, '23.399')] +[2023-02-24 10:10:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3096576. Throughput: 0: 871.3. Samples: 774932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:10:44,608][01623] Avg episode reward: [(0, '21.750')] +[2023-02-24 10:10:49,245][15474] Updated weights for policy 0, policy_version 760 (0.0034) +[2023-02-24 10:10:49,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3112960. Throughput: 0: 872.8. Samples: 777062. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:10:49,603][01623] Avg episode reward: [(0, '19.816')] +[2023-02-24 10:10:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3133440. Throughput: 0: 889.6. Samples: 781994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:10:54,609][01623] Avg episode reward: [(0, '17.983')] +[2023-02-24 10:10:59,206][15474] Updated weights for policy 0, policy_version 770 (0.0014) +[2023-02-24 10:10:59,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3153920. Throughput: 0: 903.1. Samples: 788598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:10:59,609][01623] Avg episode reward: [(0, '18.945')] +[2023-02-24 10:11:04,602][01623] Fps is (10 sec: 3686.0, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 3170304. Throughput: 0: 892.8. Samples: 791464. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:11:04,608][01623] Avg episode reward: [(0, '19.522')] +[2023-02-24 10:11:04,620][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth... +[2023-02-24 10:11:04,758][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000570_2334720.pth +[2023-02-24 10:11:09,602][01623] Fps is (10 sec: 2457.3, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 3178496. Throughput: 0: 853.0. Samples: 794826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:11:09,605][01623] Avg episode reward: [(0, '19.659')] +[2023-02-24 10:11:14,601][01623] Fps is (10 sec: 2048.2, 60 sec: 3345.1, 300 sec: 3457.3). Total num frames: 3190784. Throughput: 0: 833.0. Samples: 797954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:11:14,610][01623] Avg episode reward: [(0, '19.986')] +[2023-02-24 10:11:15,992][15474] Updated weights for policy 0, policy_version 780 (0.0059) +[2023-02-24 10:11:19,601][01623] Fps is (10 sec: 2457.9, 60 sec: 3140.3, 300 sec: 3429.5). Total num frames: 3203072. Throughput: 0: 812.4. Samples: 799524. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:11:19,604][01623] Avg episode reward: [(0, '20.589')] +[2023-02-24 10:11:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3429.5). Total num frames: 3223552. Throughput: 0: 796.8. Samples: 805450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:11:24,604][01623] Avg episode reward: [(0, '21.398')] +[2023-02-24 10:11:26,498][15474] Updated weights for policy 0, policy_version 790 (0.0013) +[2023-02-24 10:11:29,606][01623] Fps is (10 sec: 4093.9, 60 sec: 3413.2, 300 sec: 3443.4). Total num frames: 3244032. Throughput: 0: 803.0. Samples: 811070. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:11:29,613][01623] Avg episode reward: [(0, '20.400')] +[2023-02-24 10:11:34,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3443.4). Total num frames: 3256320. Throughput: 0: 800.9. Samples: 813104. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:11:34,606][01623] Avg episode reward: [(0, '21.088')] +[2023-02-24 10:11:39,601][01623] Fps is (10 sec: 2868.7, 60 sec: 3140.3, 300 sec: 3429.5). Total num frames: 3272704. Throughput: 0: 777.4. Samples: 816976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:11:39,606][01623] Avg episode reward: [(0, '20.692')] +[2023-02-24 10:11:40,214][15474] Updated weights for policy 0, policy_version 800 (0.0028) +[2023-02-24 10:11:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 3293184. Throughput: 0: 770.7. Samples: 823278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:11:44,604][01623] Avg episode reward: [(0, '21.360')] +[2023-02-24 10:11:49,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3309568. Throughput: 0: 771.4. Samples: 826178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:11:49,604][01623] Avg episode reward: [(0, '20.705')] +[2023-02-24 10:11:52,431][15474] Updated weights for policy 0, policy_version 810 (0.0012) +[2023-02-24 10:11:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3415.7). Total num frames: 3321856. Throughput: 0: 786.6. Samples: 830222. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:11:54,608][01623] Avg episode reward: [(0, '20.444')] +[2023-02-24 10:11:59,603][01623] Fps is (10 sec: 2866.6, 60 sec: 3071.9, 300 sec: 3401.7). Total num frames: 3338240. Throughput: 0: 807.9. Samples: 834312. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:11:59,612][01623] Avg episode reward: [(0, '21.143')] +[2023-02-24 10:12:04,452][15474] Updated weights for policy 0, policy_version 820 (0.0021) +[2023-02-24 10:12:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 3358720. Throughput: 0: 842.8. Samples: 837450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:12:04,604][01623] Avg episode reward: [(0, '21.276')] +[2023-02-24 10:12:09,602][01623] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3375104. Throughput: 0: 848.2. Samples: 843618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:12:09,607][01623] Avg episode reward: [(0, '21.101')] +[2023-02-24 10:12:14,608][01623] Fps is (10 sec: 2865.4, 60 sec: 3276.5, 300 sec: 3401.7). Total num frames: 3387392. Throughput: 0: 809.7. Samples: 847508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:12:14,618][01623] Avg episode reward: [(0, '21.444')] +[2023-02-24 10:12:17,960][15474] Updated weights for policy 0, policy_version 830 (0.0036) +[2023-02-24 10:12:19,602][01623] Fps is (10 sec: 2867.4, 60 sec: 3345.0, 300 sec: 3387.9). Total num frames: 3403776. Throughput: 0: 811.2. Samples: 849610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:12:19,608][01623] Avg episode reward: [(0, '21.444')] +[2023-02-24 10:12:24,601][01623] Fps is (10 sec: 3688.7, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3424256. Throughput: 0: 852.8. Samples: 855350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:12:24,604][01623] Avg episode reward: [(0, '22.164')] +[2023-02-24 10:12:28,049][15474] Updated weights for policy 0, policy_version 840 (0.0019) +[2023-02-24 10:12:29,606][01623] Fps is (10 sec: 4094.1, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 3444736. Throughput: 0: 845.7. Samples: 861340. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:12:29,614][01623] Avg episode reward: [(0, '21.595')] +[2023-02-24 10:12:34,603][01623] Fps is (10 sec: 3276.2, 60 sec: 3345.0, 300 sec: 3443.4). Total num frames: 3457024. Throughput: 0: 822.1. Samples: 863176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:12:34,609][01623] Avg episode reward: [(0, '22.144')] +[2023-02-24 10:12:39,601][01623] Fps is (10 sec: 2458.9, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3469312. Throughput: 0: 815.2. Samples: 866906. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:12:39,604][01623] Avg episode reward: [(0, '22.594')] +[2023-02-24 10:12:42,299][15474] Updated weights for policy 0, policy_version 850 (0.0016) +[2023-02-24 10:12:44,601][01623] Fps is (10 sec: 3277.5, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3489792. Throughput: 0: 844.1. Samples: 872296. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:12:44,604][01623] Avg episode reward: [(0, '22.530')] +[2023-02-24 10:12:49,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 3510272. Throughput: 0: 841.7. Samples: 875326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:12:49,604][01623] Avg episode reward: [(0, '21.928')] +[2023-02-24 10:12:54,081][15474] Updated weights for policy 0, policy_version 860 (0.0020) +[2023-02-24 10:12:54,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 3522560. Throughput: 0: 814.1. Samples: 880250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:12:54,605][01623] Avg episode reward: [(0, '22.352')] +[2023-02-24 10:12:59,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3276.9, 300 sec: 3401.8). Total num frames: 3534848. Throughput: 0: 814.2. Samples: 884144. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:12:59,609][01623] Avg episode reward: [(0, '22.036')] +[2023-02-24 10:13:04,602][01623] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3555328. Throughput: 0: 826.2. Samples: 886790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:13:04,605][01623] Avg episode reward: [(0, '21.063')] +[2023-02-24 10:13:04,615][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000868_3555328.pth... +[2023-02-24 10:13:04,730][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth +[2023-02-24 10:13:06,324][15474] Updated weights for policy 0, policy_version 870 (0.0019) +[2023-02-24 10:13:09,602][01623] Fps is (10 sec: 4095.8, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3575808. Throughput: 0: 835.1. Samples: 892930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:13:09,609][01623] Avg episode reward: [(0, '23.167')] +[2023-02-24 10:13:14,601][01623] Fps is (10 sec: 3276.9, 60 sec: 3345.4, 300 sec: 3387.9). Total num frames: 3588096. Throughput: 0: 809.3. Samples: 897754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:13:14,604][01623] Avg episode reward: [(0, '23.379')] +[2023-02-24 10:13:19,369][15474] Updated weights for policy 0, policy_version 880 (0.0012) +[2023-02-24 10:13:19,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3604480. Throughput: 0: 813.7. Samples: 899790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:13:19,604][01623] Avg episode reward: [(0, '24.310')] +[2023-02-24 10:13:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 3620864. Throughput: 0: 837.6. Samples: 904596. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:13:24,608][01623] Avg episode reward: [(0, '25.492')] +[2023-02-24 10:13:24,618][15460] Saving new best policy, reward=25.492! 
+[2023-02-24 10:13:29,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3277.1, 300 sec: 3374.0). Total num frames: 3641344. Throughput: 0: 858.0. Samples: 910908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:13:29,607][01623] Avg episode reward: [(0, '24.617')] +[2023-02-24 10:13:29,630][15474] Updated weights for policy 0, policy_version 890 (0.0018) +[2023-02-24 10:13:34,604][01623] Fps is (10 sec: 3685.2, 60 sec: 3345.0, 300 sec: 3374.0). Total num frames: 3657728. Throughput: 0: 855.7. Samples: 913836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:13:34,609][01623] Avg episode reward: [(0, '26.436')] +[2023-02-24 10:13:34,628][15460] Saving new best policy, reward=26.436! +[2023-02-24 10:13:39,603][01623] Fps is (10 sec: 2866.7, 60 sec: 3345.0, 300 sec: 3374.0). Total num frames: 3670016. Throughput: 0: 832.0. Samples: 917692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:13:39,612][01623] Avg episode reward: [(0, '27.368')] +[2023-02-24 10:13:39,620][15460] Saving new best policy, reward=27.368! +[2023-02-24 10:13:43,654][15474] Updated weights for policy 0, policy_version 900 (0.0013) +[2023-02-24 10:13:44,601][01623] Fps is (10 sec: 3277.9, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 3690496. Throughput: 0: 848.8. Samples: 922338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:13:44,610][01623] Avg episode reward: [(0, '27.946')] +[2023-02-24 10:13:44,625][15460] Saving new best policy, reward=27.946! +[2023-02-24 10:13:49,601][01623] Fps is (10 sec: 4096.6, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3710976. Throughput: 0: 859.9. Samples: 925484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:13:49,608][01623] Avg episode reward: [(0, '28.720')] +[2023-02-24 10:13:49,611][15460] Saving new best policy, reward=28.720! 
+[2023-02-24 10:13:53,752][15474] Updated weights for policy 0, policy_version 910 (0.0026) +[2023-02-24 10:13:54,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3727360. Throughput: 0: 858.2. Samples: 931548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:13:54,605][01623] Avg episode reward: [(0, '27.513')] +[2023-02-24 10:13:59,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 3739648. Throughput: 0: 844.0. Samples: 935736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:13:59,611][01623] Avg episode reward: [(0, '27.387')] +[2023-02-24 10:14:04,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3756032. Throughput: 0: 844.4. Samples: 937790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:14:04,604][01623] Avg episode reward: [(0, '28.952')] +[2023-02-24 10:14:04,615][15460] Saving new best policy, reward=28.952! +[2023-02-24 10:14:06,782][15474] Updated weights for policy 0, policy_version 920 (0.0022) +[2023-02-24 10:14:09,601][01623] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3346.2). Total num frames: 3776512. Throughput: 0: 870.1. Samples: 943750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:14:09,608][01623] Avg episode reward: [(0, '27.557')] +[2023-02-24 10:14:14,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 3796992. Throughput: 0: 863.6. Samples: 949768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:14:14,609][01623] Avg episode reward: [(0, '26.532')] +[2023-02-24 10:14:18,176][15474] Updated weights for policy 0, policy_version 930 (0.0016) +[2023-02-24 10:14:19,602][01623] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3809280. Throughput: 0: 843.1. Samples: 951774. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:14:19,609][01623] Avg episode reward: [(0, '26.854')] +[2023-02-24 10:14:24,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3825664. Throughput: 0: 848.5. Samples: 955872. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 10:14:24,610][01623] Avg episode reward: [(0, '25.906')] +[2023-02-24 10:14:29,601][01623] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3846144. Throughput: 0: 874.9. Samples: 961710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:14:29,610][01623] Avg episode reward: [(0, '27.590')] +[2023-02-24 10:14:30,188][15474] Updated weights for policy 0, policy_version 940 (0.0020) +[2023-02-24 10:14:34,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3481.8, 300 sec: 3374.0). Total num frames: 3866624. Throughput: 0: 873.6. Samples: 964796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:14:34,604][01623] Avg episode reward: [(0, '26.828')] +[2023-02-24 10:14:39,601][01623] Fps is (10 sec: 3276.9, 60 sec: 3481.7, 300 sec: 3374.0). Total num frames: 3878912. Throughput: 0: 851.1. Samples: 969846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:14:39,604][01623] Avg episode reward: [(0, '25.900')] +[2023-02-24 10:14:43,678][15474] Updated weights for policy 0, policy_version 950 (0.0022) +[2023-02-24 10:14:44,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3891200. Throughput: 0: 835.8. Samples: 973346. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:14:44,606][01623] Avg episode reward: [(0, '26.237')] +[2023-02-24 10:14:49,604][01623] Fps is (10 sec: 2456.8, 60 sec: 3208.4, 300 sec: 3332.3). Total num frames: 3903488. Throughput: 0: 825.8. Samples: 974952. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:14:49,607][01623] Avg episode reward: [(0, '25.226')] +[2023-02-24 10:14:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 3919872. Throughput: 0: 780.5. Samples: 978874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:14:54,608][01623] Avg episode reward: [(0, '24.419')] +[2023-02-24 10:14:57,279][15474] Updated weights for policy 0, policy_version 960 (0.0034) +[2023-02-24 10:14:59,601][01623] Fps is (10 sec: 3277.9, 60 sec: 3276.8, 300 sec: 3332.3). Total num frames: 3936256. Throughput: 0: 778.3. Samples: 984792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:14:59,605][01623] Avg episode reward: [(0, '24.662')] +[2023-02-24 10:15:04,607][01623] Fps is (10 sec: 3274.8, 60 sec: 3276.5, 300 sec: 3332.3). Total num frames: 3952640. Throughput: 0: 779.6. Samples: 986862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:15:04,610][01623] Avg episode reward: [(0, '24.905')] +[2023-02-24 10:15:04,628][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000965_3952640.pth... +[2023-02-24 10:15:04,773][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth +[2023-02-24 10:15:09,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3304.6). Total num frames: 3964928. Throughput: 0: 778.5. Samples: 990904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:15:09,604][01623] Avg episode reward: [(0, '25.501')] +[2023-02-24 10:15:10,712][15474] Updated weights for policy 0, policy_version 970 (0.0027) +[2023-02-24 10:15:14,601][01623] Fps is (10 sec: 3278.8, 60 sec: 3140.3, 300 sec: 3290.7). Total num frames: 3985408. Throughput: 0: 783.4. Samples: 996964. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:15:14,608][01623] Avg episode reward: [(0, '24.148')] +[2023-02-24 10:15:19,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 4005888. Throughput: 0: 782.5. Samples: 1000010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:15:19,604][01623] Avg episode reward: [(0, '23.874')] +[2023-02-24 10:15:21,850][15474] Updated weights for policy 0, policy_version 980 (0.0015) +[2023-02-24 10:15:24,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 4018176. Throughput: 0: 772.6. Samples: 1004612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:15:24,604][01623] Avg episode reward: [(0, '24.898')] +[2023-02-24 10:15:29,602][01623] Fps is (10 sec: 2866.9, 60 sec: 3140.2, 300 sec: 3304.6). Total num frames: 4034560. Throughput: 0: 780.9. Samples: 1008488. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:15:29,609][01623] Avg episode reward: [(0, '24.381')] +[2023-02-24 10:15:34,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3276.8). Total num frames: 4050944. Throughput: 0: 803.7. Samples: 1011116. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 10:15:34,606][01623] Avg episode reward: [(0, '23.606')] +[2023-02-24 10:15:34,727][15474] Updated weights for policy 0, policy_version 990 (0.0023) +[2023-02-24 10:15:39,601][01623] Fps is (10 sec: 3686.7, 60 sec: 3208.5, 300 sec: 3304.6). Total num frames: 4071424. Throughput: 0: 852.1. Samples: 1017218. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:15:39,611][01623] Avg episode reward: [(0, '22.623')] +[2023-02-24 10:15:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4087808. Throughput: 0: 822.6. Samples: 1021808. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 10:15:44,603][01623] Avg episode reward: [(0, '23.347')] +[2023-02-24 10:15:47,916][15474] Updated weights for policy 0, policy_version 1000 (0.0035) +[2023-02-24 10:15:49,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3277.0, 300 sec: 3276.8). Total num frames: 4100096. Throughput: 0: 818.0. Samples: 1023666. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:15:49,609][01623] Avg episode reward: [(0, '23.191')] +[2023-02-24 10:15:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4116480. Throughput: 0: 833.9. Samples: 1028430. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:15:54,608][01623] Avg episode reward: [(0, '25.141')] +[2023-02-24 10:15:59,076][15474] Updated weights for policy 0, policy_version 1010 (0.0022) +[2023-02-24 10:15:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4136960. Throughput: 0: 835.3. Samples: 1034554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:15:59,609][01623] Avg episode reward: [(0, '25.842')] +[2023-02-24 10:16:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.4, 300 sec: 3304.6). Total num frames: 4153344. Throughput: 0: 823.3. Samples: 1037058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:16:04,604][01623] Avg episode reward: [(0, '26.643')] +[2023-02-24 10:16:09,601][01623] Fps is (10 sec: 2867.1, 60 sec: 3345.1, 300 sec: 3304.6). Total num frames: 4165632. Throughput: 0: 805.3. Samples: 1040852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:16:09,604][01623] Avg episode reward: [(0, '26.817')] +[2023-02-24 10:16:13,184][15474] Updated weights for policy 0, policy_version 1020 (0.0019) +[2023-02-24 10:16:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 4182016. Throughput: 0: 824.2. Samples: 1045576. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:16:14,607][01623] Avg episode reward: [(0, '26.849')] +[2023-02-24 10:16:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 4202496. Throughput: 0: 832.0. Samples: 1048558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 10:16:19,606][01623] Avg episode reward: [(0, '27.046')] +[2023-02-24 10:16:24,323][15474] Updated weights for policy 0, policy_version 1030 (0.0027) +[2023-02-24 10:16:24,602][01623] Fps is (10 sec: 3686.3, 60 sec: 3345.0, 300 sec: 3304.6). Total num frames: 4218880. Throughput: 0: 819.2. Samples: 1054080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:16:24,605][01623] Avg episode reward: [(0, '26.736')] +[2023-02-24 10:16:29,603][01623] Fps is (10 sec: 2866.7, 60 sec: 3276.8, 300 sec: 3304.5). Total num frames: 4231168. Throughput: 0: 810.2. Samples: 1058268. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 10:16:29,608][01623] Avg episode reward: [(0, '25.177')] +[2023-02-24 10:16:34,601][01623] Fps is (10 sec: 2867.3, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4247552. Throughput: 0: 814.0. Samples: 1060294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:16:34,603][01623] Avg episode reward: [(0, '24.502')] +[2023-02-24 10:16:36,892][15474] Updated weights for policy 0, policy_version 1040 (0.0039) +[2023-02-24 10:16:39,601][01623] Fps is (10 sec: 3687.1, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4268032. Throughput: 0: 844.4. Samples: 1066426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:16:39,610][01623] Avg episode reward: [(0, '25.174')] +[2023-02-24 10:16:44,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4288512. Throughput: 0: 836.5. Samples: 1072196. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:16:44,610][01623] Avg episode reward: [(0, '24.707')] +[2023-02-24 10:16:48,937][15474] Updated weights for policy 0, policy_version 1050 (0.0014) +[2023-02-24 10:16:49,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4300800. Throughput: 0: 827.1. Samples: 1074276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:16:49,610][01623] Avg episode reward: [(0, '26.092')] +[2023-02-24 10:16:54,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4317184. Throughput: 0: 834.7. Samples: 1078412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:16:54,603][01623] Avg episode reward: [(0, '26.249')] +[2023-02-24 10:16:59,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4337664. Throughput: 0: 868.2. Samples: 1084646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:16:59,604][01623] Avg episode reward: [(0, '26.634')] +[2023-02-24 10:17:00,178][15474] Updated weights for policy 0, policy_version 1060 (0.0031) +[2023-02-24 10:17:04,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4354048. Throughput: 0: 869.4. Samples: 1087680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:17:04,609][01623] Avg episode reward: [(0, '27.748')] +[2023-02-24 10:17:04,624][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001063_4354048.pth... +[2023-02-24 10:17:04,779][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000868_3555328.pth +[2023-02-24 10:17:09,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4366336. Throughput: 0: 841.5. Samples: 1091948. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 10:17:09,605][01623] Avg episode reward: [(0, '28.713')] +[2023-02-24 10:17:14,252][15474] Updated weights for policy 0, policy_version 1070 (0.0013) +[2023-02-24 10:17:14,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4382720. Throughput: 0: 836.7. Samples: 1095918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 10:17:14,609][01623] Avg episode reward: [(0, '28.407')] +[2023-02-24 10:17:19,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4403200. Throughput: 0: 856.5. Samples: 1098838. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 10:17:19,604][01623] Avg episode reward: [(0, '30.109')] +[2023-02-24 10:17:19,606][15460] Saving new best policy, reward=30.109! +[2023-02-24 10:17:24,248][15474] Updated weights for policy 0, policy_version 1080 (0.0013) +[2023-02-24 10:17:24,601][01623] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3318.5). Total num frames: 4423680. Throughput: 0: 856.7. Samples: 1104978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:17:24,609][01623] Avg episode reward: [(0, '29.185')] +[2023-02-24 10:17:29,601][01623] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3318.5). Total num frames: 4435968. Throughput: 0: 823.5. Samples: 1109254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:17:29,606][01623] Avg episode reward: [(0, '28.348')] +[2023-02-24 10:17:34,601][01623] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 4448256. Throughput: 0: 818.8. Samples: 1111124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 10:17:34,610][01623] Avg episode reward: [(0, '27.903')] +[2023-02-24 10:17:38,757][15474] Updated weights for policy 0, policy_version 1090 (0.0040) +[2023-02-24 10:17:39,601][01623] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4464640. Throughput: 0: 830.8. 
Samples: 1115796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:17:39,610][01623] Avg episode reward: [(0, '28.200')] +[2023-02-24 10:17:44,601][01623] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4485120. Throughput: 0: 820.6. Samples: 1121572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 10:17:44,607][01623] Avg episode reward: [(0, '29.896')] +[2023-02-24 10:17:49,601][01623] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 4497408. Throughput: 0: 806.2. Samples: 1123958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 10:17:49,611][01623] Avg episode reward: [(0, '29.531')] +[2023-02-24 10:17:51,520][15460] Stopping Batcher_0... +[2023-02-24 10:17:51,522][15460] Loop batcher_evt_loop terminating... +[2023-02-24 10:17:51,523][01623] Component Batcher_0 stopped! +[2023-02-24 10:17:51,528][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... +[2023-02-24 10:17:51,545][15474] Updated weights for policy 0, policy_version 1100 (0.0014) +[2023-02-24 10:17:51,605][15474] Weights refcount: 2 0 +[2023-02-24 10:17:51,619][01623] Component InferenceWorker_p0-w0 stopped! +[2023-02-24 10:17:51,623][15474] Stopping InferenceWorker_p0-w0... +[2023-02-24 10:17:51,632][15474] Loop inference_proc0-0_evt_loop terminating... +[2023-02-24 10:17:51,674][01623] Component RolloutWorker_w7 stopped! +[2023-02-24 10:17:51,682][01623] Component RolloutWorker_w4 stopped! +[2023-02-24 10:17:51,682][15484] Stopping RolloutWorker_w4... +[2023-02-24 10:17:51,687][15484] Loop rollout_proc4_evt_loop terminating... +[2023-02-24 10:17:51,677][15486] Stopping RolloutWorker_w7... +[2023-02-24 10:17:51,704][15486] Loop rollout_proc7_evt_loop terminating... +[2023-02-24 10:17:51,707][01623] Component RolloutWorker_w3 stopped! +[2023-02-24 10:17:51,709][15482] Stopping RolloutWorker_w3... 
+[2023-02-24 10:17:51,710][15482] Loop rollout_proc3_evt_loop terminating... +[2023-02-24 10:17:51,729][01623] Component RolloutWorker_w5 stopped! +[2023-02-24 10:17:51,731][15483] Stopping RolloutWorker_w5... +[2023-02-24 10:17:51,737][01623] Component RolloutWorker_w1 stopped! +[2023-02-24 10:17:51,739][15476] Stopping RolloutWorker_w1... +[2023-02-24 10:17:51,739][15476] Loop rollout_proc1_evt_loop terminating... +[2023-02-24 10:17:51,751][15483] Loop rollout_proc5_evt_loop terminating... +[2023-02-24 10:17:51,763][15485] Stopping RolloutWorker_w6... +[2023-02-24 10:17:51,763][01623] Component RolloutWorker_w6 stopped! +[2023-02-24 10:17:51,779][15475] Stopping RolloutWorker_w0... +[2023-02-24 10:17:51,780][15475] Loop rollout_proc0_evt_loop terminating... +[2023-02-24 10:17:51,779][01623] Component RolloutWorker_w0 stopped! +[2023-02-24 10:17:51,788][15460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000965_3952640.pth +[2023-02-24 10:17:51,798][15485] Loop rollout_proc6_evt_loop terminating... +[2023-02-24 10:17:51,810][15460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... +[2023-02-24 10:17:51,820][15481] Stopping RolloutWorker_w2... +[2023-02-24 10:17:51,821][15481] Loop rollout_proc2_evt_loop terminating... +[2023-02-24 10:17:51,820][01623] Component RolloutWorker_w2 stopped! +[2023-02-24 10:17:52,128][01623] Component LearnerWorker_p0 stopped! +[2023-02-24 10:17:52,130][01623] Waiting for process learner_proc0 to stop... +[2023-02-24 10:17:52,128][15460] Stopping LearnerWorker_p0... +[2023-02-24 10:17:52,133][15460] Loop learner_proc0_evt_loop terminating... +[2023-02-24 10:17:54,606][01623] Waiting for process inference_proc0-0 to join... +[2023-02-24 10:17:55,158][01623] Waiting for process rollout_proc0 to join... +[2023-02-24 10:17:55,837][01623] Waiting for process rollout_proc1 to join... +[2023-02-24 10:17:55,839][01623] Waiting for process rollout_proc2 to join... 
+[2023-02-24 10:17:55,842][01623] Waiting for process rollout_proc3 to join...
+[2023-02-24 10:17:55,846][01623] Waiting for process rollout_proc4 to join...
+[2023-02-24 10:17:55,850][01623] Waiting for process rollout_proc5 to join...
+[2023-02-24 10:17:55,852][01623] Waiting for process rollout_proc6 to join...
+[2023-02-24 10:17:55,854][01623] Waiting for process rollout_proc7 to join...
+[2023-02-24 10:17:55,856][01623] Batcher 0 profile tree view:
+batching: 30.6921, releasing_batches: 0.0303
+[2023-02-24 10:17:55,858][01623] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0000
+ wait_policy_total: 654.1631
+update_model: 9.7801
+ weight_update: 0.0014
+one_step: 0.0027
+ handle_policy_step: 625.2704
+ deserialize: 17.8837, stack: 3.4432, obs_to_device_normalize: 134.9107, forward: 306.1076, send_messages: 30.4159
+ prepare_outputs: 100.6715
+ to_cpu: 61.9524
+[2023-02-24 10:17:55,859][01623] Learner 0 profile tree view:
+misc: 0.0073, prepare_batch: 19.6645
+train: 86.4863
+ epoch_init: 0.0120, minibatch_init: 0.0111, losses_postprocess: 0.6117, kl_divergence: 0.5941, after_optimizer: 37.0888
+ calculate_losses: 30.5907
+ losses_init: 0.0042, forward_head: 2.1203, bptt_initial: 20.0486, tail: 1.2396, advantages_returns: 0.3540, losses: 3.8575
+ bptt: 2.5773
+ bptt_forward_core: 2.4497
+ update: 16.7731
+ clip: 1.6500
+[2023-02-24 10:17:55,861][01623] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.3763, enqueue_policy_requests: 186.0768, env_step: 996.3301, overhead: 26.8683, complete_rollouts: 8.7595
+save_policy_outputs: 25.3304
+ split_output_tensors: 12.6970
+[2023-02-24 10:17:55,863][01623] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.3390, enqueue_policy_requests: 189.8915, env_step: 997.6871, overhead: 26.5033, complete_rollouts: 8.6160
+save_policy_outputs: 24.7950
+ split_output_tensors: 11.7159
+[2023-02-24 10:17:55,864][01623] Loop Runner_EvtLoop terminating...
+[2023-02-24 10:17:55,871][01623] Runner profile tree view:
+main_loop: 1367.0984
+[2023-02-24 10:17:55,875][01623] Collected {0: 4505600}, FPS: 3295.7
+[2023-02-24 10:21:22,492][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
+[2023-02-24 10:21:22,493][01623] Overriding arg 'num_workers' with value 1 passed from command line
+[2023-02-24 10:21:22,495][01623] Adding new argument 'no_render'=True that is not in the saved config file!
+[2023-02-24 10:21:22,500][01623] Adding new argument 'save_video'=True that is not in the saved config file!
+[2023-02-24 10:21:22,502][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+[2023-02-24 10:21:22,504][01623] Adding new argument 'video_name'=None that is not in the saved config file!
+[2023-02-24 10:21:22,505][01623] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
+[2023-02-24 10:21:22,506][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+[2023-02-24 10:21:22,508][01623] Adding new argument 'push_to_hub'=False that is not in the saved config file!
+[2023-02-24 10:21:22,510][01623] Adding new argument 'hf_repository'=None that is not in the saved config file!
+[2023-02-24 10:21:22,512][01623] Adding new argument 'policy_index'=0 that is not in the saved config file!
+[2023-02-24 10:21:22,514][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+[2023-02-24 10:21:22,517][01623] Adding new argument 'train_script'=None that is not in the saved config file!
+[2023-02-24 10:21:22,525][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+[2023-02-24 10:21:22,526][01623] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 10:21:22,550][01623] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 10:21:22,555][01623] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 10:21:22,559][01623] RunningMeanStd input shape: (1,) +[2023-02-24 10:21:22,577][01623] ConvEncoder: input_channels=3 +[2023-02-24 10:21:23,261][01623] Conv encoder output size: 512 +[2023-02-24 10:21:23,263][01623] Policy head output size: 512 +[2023-02-24 10:21:26,135][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... +[2023-02-24 10:21:27,796][01623] Num frames 100... +[2023-02-24 10:21:27,904][01623] Num frames 200... +[2023-02-24 10:21:28,015][01623] Num frames 300... +[2023-02-24 10:21:28,127][01623] Avg episode rewards: #0: 4.520, true rewards: #0: 3.520 +[2023-02-24 10:21:28,128][01623] Avg episode reward: 4.520, avg true_objective: 3.520 +[2023-02-24 10:21:28,194][01623] Num frames 400... +[2023-02-24 10:21:28,316][01623] Num frames 500... +[2023-02-24 10:21:28,436][01623] Num frames 600... +[2023-02-24 10:21:28,549][01623] Num frames 700... +[2023-02-24 10:21:28,663][01623] Num frames 800... +[2023-02-24 10:21:28,782][01623] Num frames 900... +[2023-02-24 10:21:28,870][01623] Avg episode rewards: #0: 8.100, true rewards: #0: 4.600 +[2023-02-24 10:21:28,872][01623] Avg episode reward: 8.100, avg true_objective: 4.600 +[2023-02-24 10:21:28,968][01623] Num frames 1000... +[2023-02-24 10:21:29,089][01623] Num frames 1100... +[2023-02-24 10:21:29,209][01623] Num frames 1200... +[2023-02-24 10:21:29,323][01623] Num frames 1300... +[2023-02-24 10:21:29,438][01623] Num frames 1400... +[2023-02-24 10:21:29,555][01623] Num frames 1500... +[2023-02-24 10:21:29,667][01623] Num frames 1600... +[2023-02-24 10:21:29,785][01623] Num frames 1700... +[2023-02-24 10:21:29,905][01623] Num frames 1800... 
+[2023-02-24 10:21:30,017][01623] Num frames 1900... +[2023-02-24 10:21:30,121][01623] Avg episode rewards: #0: 14.473, true rewards: #0: 6.473 +[2023-02-24 10:21:30,123][01623] Avg episode reward: 14.473, avg true_objective: 6.473 +[2023-02-24 10:21:30,201][01623] Num frames 2000... +[2023-02-24 10:21:30,315][01623] Num frames 2100... +[2023-02-24 10:21:30,431][01623] Num frames 2200... +[2023-02-24 10:21:30,553][01623] Num frames 2300... +[2023-02-24 10:21:30,671][01623] Num frames 2400... +[2023-02-24 10:21:30,789][01623] Num frames 2500... +[2023-02-24 10:21:30,909][01623] Num frames 2600... +[2023-02-24 10:21:31,018][01623] Num frames 2700... +[2023-02-24 10:21:31,137][01623] Num frames 2800... +[2023-02-24 10:21:31,256][01623] Num frames 2900... +[2023-02-24 10:21:31,370][01623] Num frames 3000... +[2023-02-24 10:21:31,487][01623] Num frames 3100... +[2023-02-24 10:21:31,568][01623] Avg episode rewards: #0: 18.553, true rewards: #0: 7.802 +[2023-02-24 10:21:31,569][01623] Avg episode reward: 18.553, avg true_objective: 7.802 +[2023-02-24 10:21:31,665][01623] Num frames 3200... +[2023-02-24 10:21:31,782][01623] Num frames 3300... +[2023-02-24 10:21:31,904][01623] Num frames 3400... +[2023-02-24 10:21:32,013][01623] Num frames 3500... +[2023-02-24 10:21:32,131][01623] Num frames 3600... +[2023-02-24 10:21:32,243][01623] Num frames 3700... +[2023-02-24 10:21:32,359][01623] Num frames 3800... +[2023-02-24 10:21:32,470][01623] Num frames 3900... +[2023-02-24 10:21:32,556][01623] Avg episode rewards: #0: 18.452, true rewards: #0: 7.852 +[2023-02-24 10:21:32,558][01623] Avg episode reward: 18.452, avg true_objective: 7.852 +[2023-02-24 10:21:32,644][01623] Num frames 4000... +[2023-02-24 10:21:32,764][01623] Num frames 4100... +[2023-02-24 10:21:32,889][01623] Num frames 4200... +[2023-02-24 10:21:32,998][01623] Num frames 4300... +[2023-02-24 10:21:33,106][01623] Num frames 4400... +[2023-02-24 10:21:33,222][01623] Num frames 4500... 
+[2023-02-24 10:21:33,336][01623] Num frames 4600... +[2023-02-24 10:21:33,447][01623] Num frames 4700... +[2023-02-24 10:21:33,558][01623] Num frames 4800... +[2023-02-24 10:21:33,671][01623] Num frames 4900... +[2023-02-24 10:21:33,792][01623] Num frames 5000... +[2023-02-24 10:21:33,906][01623] Avg episode rewards: #0: 20.413, true rewards: #0: 8.413 +[2023-02-24 10:21:33,909][01623] Avg episode reward: 20.413, avg true_objective: 8.413 +[2023-02-24 10:21:33,975][01623] Num frames 5100... +[2023-02-24 10:21:34,102][01623] Num frames 5200... +[2023-02-24 10:21:34,216][01623] Num frames 5300... +[2023-02-24 10:21:34,340][01623] Num frames 5400... +[2023-02-24 10:21:34,392][01623] Avg episode rewards: #0: 18.143, true rewards: #0: 7.714 +[2023-02-24 10:21:34,396][01623] Avg episode reward: 18.143, avg true_objective: 7.714 +[2023-02-24 10:21:34,522][01623] Num frames 5500... +[2023-02-24 10:21:34,638][01623] Num frames 5600... +[2023-02-24 10:21:34,757][01623] Num frames 5700... +[2023-02-24 10:21:34,884][01623] Num frames 5800... +[2023-02-24 10:21:34,999][01623] Num frames 5900... +[2023-02-24 10:21:35,109][01623] Num frames 6000... +[2023-02-24 10:21:35,245][01623] Avg episode rewards: #0: 17.215, true rewards: #0: 7.590 +[2023-02-24 10:21:35,247][01623] Avg episode reward: 17.215, avg true_objective: 7.590 +[2023-02-24 10:21:35,283][01623] Num frames 6100... +[2023-02-24 10:21:35,409][01623] Num frames 6200... +[2023-02-24 10:21:35,519][01623] Num frames 6300... +[2023-02-24 10:21:35,641][01623] Num frames 6400... +[2023-02-24 10:21:35,754][01623] Num frames 6500... +[2023-02-24 10:21:35,888][01623] Num frames 6600... +[2023-02-24 10:21:35,999][01623] Num frames 6700... +[2023-02-24 10:21:36,112][01623] Num frames 6800... +[2023-02-24 10:21:36,224][01623] Num frames 6900... +[2023-02-24 10:21:36,336][01623] Num frames 7000... +[2023-02-24 10:21:36,449][01623] Num frames 7100... +[2023-02-24 10:21:36,560][01623] Num frames 7200... 
+[2023-02-24 10:21:36,671][01623] Num frames 7300... +[2023-02-24 10:21:36,815][01623] Avg episode rewards: #0: 18.649, true rewards: #0: 8.204 +[2023-02-24 10:21:36,817][01623] Avg episode reward: 18.649, avg true_objective: 8.204 +[2023-02-24 10:21:36,840][01623] Num frames 7400... +[2023-02-24 10:21:36,961][01623] Num frames 7500... +[2023-02-24 10:21:37,089][01623] Num frames 7600... +[2023-02-24 10:21:37,207][01623] Num frames 7700... +[2023-02-24 10:21:37,320][01623] Num frames 7800... +[2023-02-24 10:21:37,435][01623] Num frames 7900... +[2023-02-24 10:21:37,568][01623] Num frames 8000... +[2023-02-24 10:21:37,724][01623] Num frames 8100... +[2023-02-24 10:21:37,882][01623] Num frames 8200... +[2023-02-24 10:21:38,036][01623] Num frames 8300... +[2023-02-24 10:21:38,190][01623] Num frames 8400... +[2023-02-24 10:21:38,356][01623] Num frames 8500... +[2023-02-24 10:21:38,510][01623] Num frames 8600... +[2023-02-24 10:21:38,674][01623] Num frames 8700... +[2023-02-24 10:21:38,775][01623] Avg episode rewards: #0: 20.028, true rewards: #0: 8.728 +[2023-02-24 10:21:38,777][01623] Avg episode reward: 20.028, avg true_objective: 8.728 +[2023-02-24 10:22:38,283][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4! +[2023-02-24 10:27:22,190][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 10:27:22,192][01623] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 10:27:22,194][01623] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 10:27:22,199][01623] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 10:27:22,201][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 10:27:22,203][01623] Adding new argument 'video_name'=None that is not in the saved config file! 
+[2023-02-24 10:27:22,204][01623] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2023-02-24 10:27:22,206][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 10:27:22,208][01623] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2023-02-24 10:27:22,209][01623] Adding new argument 'hf_repository'='ThomasSimonini/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2023-02-24 10:27:22,210][01623] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 10:27:22,212][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 10:27:22,213][01623] Adding new argument 'train_script'=None that is not in the saved config file! +[2023-02-24 10:27:22,214][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2023-02-24 10:27:22,216][01623] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 10:27:22,246][01623] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 10:27:22,249][01623] RunningMeanStd input shape: (1,) +[2023-02-24 10:27:22,273][01623] ConvEncoder: input_channels=3 +[2023-02-24 10:27:22,335][01623] Conv encoder output size: 512 +[2023-02-24 10:27:22,339][01623] Policy head output size: 512 +[2023-02-24 10:27:22,371][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... +[2023-02-24 10:27:23,003][01623] Num frames 100... +[2023-02-24 10:27:23,164][01623] Num frames 200... +[2023-02-24 10:27:23,329][01623] Num frames 300... +[2023-02-24 10:27:23,494][01623] Num frames 400... +[2023-02-24 10:27:23,654][01623] Num frames 500... 
+[2023-02-24 10:27:23,737][01623] Avg episode rewards: #0: 9.120, true rewards: #0: 5.120 +[2023-02-24 10:27:23,740][01623] Avg episode reward: 9.120, avg true_objective: 5.120 +[2023-02-24 10:27:23,877][01623] Num frames 600... +[2023-02-24 10:27:24,041][01623] Num frames 700... +[2023-02-24 10:27:24,190][01623] Num frames 800... +[2023-02-24 10:27:24,301][01623] Num frames 900... +[2023-02-24 10:27:24,418][01623] Num frames 1000... +[2023-02-24 10:27:24,531][01623] Num frames 1100... +[2023-02-24 10:27:24,644][01623] Num frames 1200... +[2023-02-24 10:27:24,760][01623] Num frames 1300... +[2023-02-24 10:27:24,875][01623] Num frames 1400... +[2023-02-24 10:27:24,995][01623] Num frames 1500... +[2023-02-24 10:27:25,106][01623] Num frames 1600... +[2023-02-24 10:27:25,223][01623] Num frames 1700... +[2023-02-24 10:27:25,339][01623] Num frames 1800... +[2023-02-24 10:27:25,450][01623] Num frames 1900... +[2023-02-24 10:27:25,514][01623] Avg episode rewards: #0: 21.525, true rewards: #0: 9.525 +[2023-02-24 10:27:25,517][01623] Avg episode reward: 21.525, avg true_objective: 9.525 +[2023-02-24 10:27:25,629][01623] Num frames 2000... +[2023-02-24 10:27:25,747][01623] Num frames 2100... +[2023-02-24 10:27:25,862][01623] Num frames 2200... +[2023-02-24 10:27:25,993][01623] Num frames 2300... +[2023-02-24 10:27:26,076][01623] Avg episode rewards: #0: 16.403, true rewards: #0: 7.737 +[2023-02-24 10:27:26,078][01623] Avg episode reward: 16.403, avg true_objective: 7.737 +[2023-02-24 10:27:26,171][01623] Num frames 2400... +[2023-02-24 10:27:26,294][01623] Num frames 2500... +[2023-02-24 10:27:26,411][01623] Num frames 2600... +[2023-02-24 10:27:26,529][01623] Num frames 2700... +[2023-02-24 10:27:26,638][01623] Num frames 2800... +[2023-02-24 10:27:26,749][01623] Num frames 2900... +[2023-02-24 10:27:26,857][01623] Num frames 3000... +[2023-02-24 10:27:26,987][01623] Num frames 3100... +[2023-02-24 10:27:27,097][01623] Num frames 3200... 
+[2023-02-24 10:27:27,205][01623] Num frames 3300... +[2023-02-24 10:27:27,319][01623] Num frames 3400... +[2023-02-24 10:27:27,430][01623] Num frames 3500... +[2023-02-24 10:27:27,540][01623] Num frames 3600... +[2023-02-24 10:27:27,650][01623] Num frames 3700... +[2023-02-24 10:27:27,767][01623] Num frames 3800... +[2023-02-24 10:27:27,882][01623] Num frames 3900... +[2023-02-24 10:27:28,000][01623] Num frames 4000... +[2023-02-24 10:27:28,111][01623] Num frames 4100... +[2023-02-24 10:27:28,222][01623] Num frames 4200... +[2023-02-24 10:27:28,337][01623] Num frames 4300... +[2023-02-24 10:27:28,455][01623] Num frames 4400... +[2023-02-24 10:27:28,535][01623] Avg episode rewards: #0: 27.552, true rewards: #0: 11.053 +[2023-02-24 10:27:28,537][01623] Avg episode reward: 27.552, avg true_objective: 11.053 +[2023-02-24 10:27:28,634][01623] Num frames 4500... +[2023-02-24 10:27:28,754][01623] Num frames 4600... +[2023-02-24 10:27:28,880][01623] Num frames 4700... +[2023-02-24 10:27:28,996][01623] Num frames 4800... +[2023-02-24 10:27:29,106][01623] Num frames 4900... +[2023-02-24 10:27:29,216][01623] Num frames 5000... +[2023-02-24 10:27:29,335][01623] Num frames 5100... +[2023-02-24 10:27:29,445][01623] Num frames 5200... +[2023-02-24 10:27:29,554][01623] Num frames 5300... +[2023-02-24 10:27:29,662][01623] Num frames 5400... +[2023-02-24 10:27:29,780][01623] Num frames 5500... +[2023-02-24 10:27:29,886][01623] Num frames 5600... +[2023-02-24 10:27:30,014][01623] Num frames 5700... +[2023-02-24 10:27:30,123][01623] Num frames 5800... +[2023-02-24 10:27:30,240][01623] Num frames 5900... +[2023-02-24 10:27:30,352][01623] Num frames 6000... +[2023-02-24 10:27:30,472][01623] Num frames 6100... +[2023-02-24 10:27:30,583][01623] Avg episode rewards: #0: 30.892, true rewards: #0: 12.292 +[2023-02-24 10:27:30,585][01623] Avg episode reward: 30.892, avg true_objective: 12.292 +[2023-02-24 10:27:30,657][01623] Num frames 6200... 
+[2023-02-24 10:27:30,781][01623] Num frames 6300... +[2023-02-24 10:27:30,898][01623] Num frames 6400... +[2023-02-24 10:27:31,007][01623] Num frames 6500... +[2023-02-24 10:27:31,138][01623] Num frames 6600... +[2023-02-24 10:27:31,249][01623] Num frames 6700... +[2023-02-24 10:27:31,367][01623] Num frames 6800... +[2023-02-24 10:27:31,479][01623] Num frames 6900... +[2023-02-24 10:27:31,594][01623] Num frames 7000... +[2023-02-24 10:27:31,705][01623] Num frames 7100... +[2023-02-24 10:27:31,815][01623] Num frames 7200... +[2023-02-24 10:27:31,933][01623] Num frames 7300... +[2023-02-24 10:27:32,054][01623] Num frames 7400... +[2023-02-24 10:27:32,163][01623] Num frames 7500... +[2023-02-24 10:27:32,282][01623] Num frames 7600... +[2023-02-24 10:27:32,393][01623] Num frames 7700... +[2023-02-24 10:27:32,505][01623] Num frames 7800... +[2023-02-24 10:27:32,617][01623] Num frames 7900... +[2023-02-24 10:27:32,738][01623] Num frames 8000... +[2023-02-24 10:27:32,830][01623] Avg episode rewards: #0: 33.223, true rewards: #0: 13.390 +[2023-02-24 10:27:32,831][01623] Avg episode reward: 33.223, avg true_objective: 13.390 +[2023-02-24 10:27:32,907][01623] Num frames 8100... +[2023-02-24 10:27:33,025][01623] Num frames 8200... +[2023-02-24 10:27:33,144][01623] Num frames 8300... +[2023-02-24 10:27:33,254][01623] Num frames 8400... +[2023-02-24 10:27:33,365][01623] Num frames 8500... +[2023-02-24 10:27:33,478][01623] Num frames 8600... +[2023-02-24 10:27:33,593][01623] Num frames 8700... +[2023-02-24 10:27:33,708][01623] Num frames 8800... +[2023-02-24 10:27:33,819][01623] Num frames 8900... +[2023-02-24 10:27:33,935][01623] Num frames 9000... +[2023-02-24 10:27:34,043][01623] Num frames 9100... +[2023-02-24 10:27:34,142][01623] Avg episode rewards: #0: 31.904, true rewards: #0: 13.047 +[2023-02-24 10:27:34,144][01623] Avg episode reward: 31.904, avg true_objective: 13.047 +[2023-02-24 10:27:34,246][01623] Num frames 9200... 
+[2023-02-24 10:27:34,409][01623] Num frames 9300... +[2023-02-24 10:27:34,566][01623] Num frames 9400... +[2023-02-24 10:27:34,726][01623] Num frames 9500... +[2023-02-24 10:27:34,883][01623] Num frames 9600... +[2023-02-24 10:27:35,037][01623] Num frames 9700... +[2023-02-24 10:27:35,201][01623] Num frames 9800... +[2023-02-24 10:27:35,357][01623] Num frames 9900... +[2023-02-24 10:27:35,510][01623] Num frames 10000... +[2023-02-24 10:27:35,668][01623] Num frames 10100... +[2023-02-24 10:27:35,836][01623] Num frames 10200... +[2023-02-24 10:27:35,995][01623] Num frames 10300... +[2023-02-24 10:27:36,160][01623] Num frames 10400... +[2023-02-24 10:27:36,323][01623] Num frames 10500... +[2023-02-24 10:27:36,399][01623] Avg episode rewards: #0: 32.011, true rewards: #0: 13.136 +[2023-02-24 10:27:36,402][01623] Avg episode reward: 32.011, avg true_objective: 13.136 +[2023-02-24 10:27:36,561][01623] Num frames 10600... +[2023-02-24 10:27:36,720][01623] Num frames 10700... +[2023-02-24 10:27:36,880][01623] Num frames 10800... +[2023-02-24 10:27:37,031][01623] Num frames 10900... +[2023-02-24 10:27:37,192][01623] Num frames 11000... +[2023-02-24 10:27:37,499][01623] Num frames 11100... +[2023-02-24 10:27:37,666][01623] Num frames 11200... +[2023-02-24 10:27:37,805][01623] Num frames 11300... +[2023-02-24 10:27:37,923][01623] Num frames 11400... +[2023-02-24 10:27:38,035][01623] Num frames 11500... +[2023-02-24 10:27:38,148][01623] Num frames 11600... +[2023-02-24 10:27:38,266][01623] Num frames 11700... +[2023-02-24 10:27:38,381][01623] Num frames 11800... +[2023-02-24 10:27:38,505][01623] Num frames 11900... +[2023-02-24 10:27:38,621][01623] Num frames 12000... +[2023-02-24 10:27:38,739][01623] Num frames 12100... +[2023-02-24 10:27:38,858][01623] Num frames 12200... +[2023-02-24 10:27:38,971][01623] Num frames 12300... +[2023-02-24 10:27:39,082][01623] Num frames 12400... +[2023-02-24 10:27:39,202][01623] Num frames 12500... 
+[2023-02-24 10:27:39,291][01623] Avg episode rewards: #0: 34.583, true rewards: #0: 13.917 +[2023-02-24 10:27:39,293][01623] Avg episode reward: 34.583, avg true_objective: 13.917 +[2023-02-24 10:27:39,383][01623] Num frames 12600... +[2023-02-24 10:27:39,504][01623] Num frames 12700... +[2023-02-24 10:27:39,620][01623] Num frames 12800... +[2023-02-24 10:27:39,732][01623] Num frames 12900... +[2023-02-24 10:27:39,843][01623] Num frames 13000... +[2023-02-24 10:27:39,955][01623] Num frames 13100... +[2023-02-24 10:27:40,070][01623] Num frames 13200... +[2023-02-24 10:27:40,181][01623] Num frames 13300... +[2023-02-24 10:27:40,305][01623] Num frames 13400... +[2023-02-24 10:27:40,418][01623] Num frames 13500... +[2023-02-24 10:27:40,533][01623] Num frames 13600... +[2023-02-24 10:27:40,647][01623] Num frames 13700... +[2023-02-24 10:27:40,762][01623] Num frames 13800... +[2023-02-24 10:27:40,880][01623] Num frames 13900... +[2023-02-24 10:27:40,989][01623] Num frames 14000... +[2023-02-24 10:27:41,078][01623] Avg episode rewards: #0: 34.829, true rewards: #0: 14.029 +[2023-02-24 10:27:41,079][01623] Avg episode reward: 34.829, avg true_objective: 14.029 +[2023-02-24 10:29:09,913][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4! +[2023-02-24 10:34:07,056][01623] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 10:34:07,057][01623] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 10:34:07,060][01623] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 10:34:07,062][01623] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 10:34:07,064][01623] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 10:34:07,065][01623] Adding new argument 'video_name'=None that is not in the saved config file! 
+[2023-02-24 10:34:07,067][01623] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2023-02-24 10:34:07,068][01623] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 10:34:07,069][01623] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2023-02-24 10:34:07,070][01623] Adding new argument 'hf_repository'='dbaibak/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2023-02-24 10:34:07,072][01623] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 10:34:07,073][01623] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 10:34:07,074][01623] Adding new argument 'train_script'=None that is not in the saved config file! +[2023-02-24 10:34:07,075][01623] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2023-02-24 10:34:07,076][01623] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 10:34:07,106][01623] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 10:34:07,108][01623] RunningMeanStd input shape: (1,) +[2023-02-24 10:34:07,122][01623] ConvEncoder: input_channels=3 +[2023-02-24 10:34:07,156][01623] Conv encoder output size: 512 +[2023-02-24 10:34:07,158][01623] Policy head output size: 512 +[2023-02-24 10:34:07,178][01623] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_4505600.pth... +[2023-02-24 10:34:07,615][01623] Num frames 100... +[2023-02-24 10:34:07,729][01623] Num frames 200... +[2023-02-24 10:34:07,843][01623] Num frames 300... +[2023-02-24 10:34:07,956][01623] Num frames 400... +[2023-02-24 10:34:08,075][01623] Num frames 500... +[2023-02-24 10:34:08,185][01623] Num frames 600... +[2023-02-24 10:34:08,302][01623] Num frames 700... +[2023-02-24 10:34:08,435][01623] Num frames 800... 
+[2023-02-24 10:34:08,562][01623] Avg episode rewards: #0: 17.640, true rewards: #0: 8.640 +[2023-02-24 10:34:08,565][01623] Avg episode reward: 17.640, avg true_objective: 8.640 +[2023-02-24 10:34:08,612][01623] Num frames 900... +[2023-02-24 10:34:08,725][01623] Num frames 1000... +[2023-02-24 10:34:08,838][01623] Num frames 1100... +[2023-02-24 10:34:08,954][01623] Num frames 1200... +[2023-02-24 10:34:09,066][01623] Num frames 1300... +[2023-02-24 10:34:09,180][01623] Num frames 1400... +[2023-02-24 10:34:09,304][01623] Num frames 1500... +[2023-02-24 10:34:09,417][01623] Num frames 1600... +[2023-02-24 10:34:09,534][01623] Num frames 1700... +[2023-02-24 10:34:09,643][01623] Num frames 1800... +[2023-02-24 10:34:09,784][01623] Avg episode rewards: #0: 19.890, true rewards: #0: 9.390 +[2023-02-24 10:34:09,786][01623] Avg episode reward: 19.890, avg true_objective: 9.390 +[2023-02-24 10:34:09,817][01623] Num frames 1900... +[2023-02-24 10:34:09,941][01623] Num frames 2000... +[2023-02-24 10:34:10,060][01623] Num frames 2100... +[2023-02-24 10:34:10,182][01623] Num frames 2200... +[2023-02-24 10:34:10,294][01623] Num frames 2300... +[2023-02-24 10:34:10,406][01623] Num frames 2400... +[2023-02-24 10:34:10,517][01623] Num frames 2500... +[2023-02-24 10:34:10,624][01623] Num frames 2600... +[2023-02-24 10:34:10,735][01623] Num frames 2700... +[2023-02-24 10:34:10,845][01623] Num frames 2800... +[2023-02-24 10:34:10,973][01623] Num frames 2900... +[2023-02-24 10:34:11,084][01623] Num frames 3000... +[2023-02-24 10:34:11,204][01623] Num frames 3100... +[2023-02-24 10:34:11,338][01623] Num frames 3200... +[2023-02-24 10:34:11,456][01623] Num frames 3300... +[2023-02-24 10:34:11,574][01623] Num frames 3400... +[2023-02-24 10:34:11,690][01623] Num frames 3500... +[2023-02-24 10:34:11,810][01623] Num frames 3600... +[2023-02-24 10:34:11,930][01623] Num frames 3700... +[2023-02-24 10:34:12,054][01623] Num frames 3800... +[2023-02-24 10:34:12,167][01623] Num frames 3900... 
+[2023-02-24 10:34:12,355][01623] Avg episode rewards: #0: 32.926, true rewards: #0: 13.260 +[2023-02-24 10:34:12,357][01623] Avg episode reward: 32.926, avg true_objective: 13.260 +[2023-02-24 10:34:12,398][01623] Num frames 4000... +[2023-02-24 10:34:12,561][01623] Num frames 4100... +[2023-02-24 10:34:12,718][01623] Num frames 4200... +[2023-02-24 10:34:12,871][01623] Num frames 4300... +[2023-02-24 10:34:13,044][01623] Num frames 4400... +[2023-02-24 10:34:13,205][01623] Num frames 4500... +[2023-02-24 10:34:13,368][01623] Num frames 4600... +[2023-02-24 10:34:13,533][01623] Num frames 4700... +[2023-02-24 10:34:13,702][01623] Num frames 4800... +[2023-02-24 10:34:13,863][01623] Num frames 4900... +[2023-02-24 10:34:14,026][01623] Num frames 5000... +[2023-02-24 10:34:14,184][01623] Num frames 5100... +[2023-02-24 10:34:14,344][01623] Num frames 5200... +[2023-02-24 10:34:14,445][01623] Avg episode rewards: #0: 32.065, true rewards: #0: 13.065 +[2023-02-24 10:34:14,447][01623] Avg episode reward: 32.065, avg true_objective: 13.065 +[2023-02-24 10:34:14,573][01623] Num frames 5300... +[2023-02-24 10:34:14,731][01623] Num frames 5400... +[2023-02-24 10:34:14,886][01623] Num frames 5500... +[2023-02-24 10:34:15,050][01623] Num frames 5600... +[2023-02-24 10:34:15,177][01623] Avg episode rewards: #0: 26.884, true rewards: #0: 11.284 +[2023-02-24 10:34:15,180][01623] Avg episode reward: 26.884, avg true_objective: 11.284 +[2023-02-24 10:34:15,284][01623] Num frames 5700... +[2023-02-24 10:34:15,447][01623] Num frames 5800... +[2023-02-24 10:34:15,610][01623] Num frames 5900... +[2023-02-24 10:34:15,717][01623] Avg episode rewards: #0: 23.217, true rewards: #0: 9.883 +[2023-02-24 10:34:15,720][01623] Avg episode reward: 23.217, avg true_objective: 9.883 +[2023-02-24 10:34:15,812][01623] Num frames 6000... +[2023-02-24 10:34:15,927][01623] Num frames 6100... +[2023-02-24 10:34:16,040][01623] Num frames 6200... +[2023-02-24 10:34:16,155][01623] Num frames 6300... 
+[2023-02-24 10:34:16,268][01623] Num frames 6400...
+[2023-02-24 10:34:16,401][01623] Avg episode rewards: #0: 21.100, true rewards: #0: 9.243
+[2023-02-24 10:34:16,403][01623] Avg episode reward: 21.100, avg true_objective: 9.243
+[2023-02-24 10:34:16,440][01623] Num frames 6500...
+[2023-02-24 10:34:16,571][01623] Num frames 6600...
+[2023-02-24 10:34:16,685][01623] Num frames 6700...
+[2023-02-24 10:34:16,796][01623] Num frames 6800...
+[2023-02-24 10:34:16,905][01623] Num frames 6900...
+[2023-02-24 10:34:17,018][01623] Num frames 7000...
+[2023-02-24 10:34:17,190][01623] Avg episode rewards: #0: 19.874, true rewards: #0: 8.874
+[2023-02-24 10:34:17,193][01623] Avg episode reward: 19.874, avg true_objective: 8.874
+[2023-02-24 10:34:17,197][01623] Num frames 7100...
+[2023-02-24 10:34:17,318][01623] Num frames 7200...
+[2023-02-24 10:34:17,441][01623] Num frames 7300...
+[2023-02-24 10:34:17,553][01623] Num frames 7400...
+[2023-02-24 10:34:17,662][01623] Num frames 7500...
+[2023-02-24 10:34:17,775][01623] Num frames 7600...
+[2023-02-24 10:34:17,886][01623] Num frames 7700...
+[2023-02-24 10:34:17,997][01623] Num frames 7800...
+[2023-02-24 10:34:18,115][01623] Num frames 7900...
+[2023-02-24 10:34:18,225][01623] Num frames 8000...
+[2023-02-24 10:34:18,307][01623] Avg episode rewards: #0: 19.909, true rewards: #0: 8.909
+[2023-02-24 10:34:18,310][01623] Avg episode reward: 19.909, avg true_objective: 8.909
+[2023-02-24 10:34:18,408][01623] Num frames 8100...
+[2023-02-24 10:34:18,523][01623] Num frames 8200...
+[2023-02-24 10:34:18,637][01623] Num frames 8300...
+[2023-02-24 10:34:18,755][01623] Num frames 8400...
+[2023-02-24 10:34:18,863][01623] Num frames 8500...
+[2023-02-24 10:34:18,969][01623] Num frames 8600...
+[2023-02-24 10:34:19,082][01623] Num frames 8700...
+[2023-02-24 10:34:19,203][01623] Num frames 8800...
+[2023-02-24 10:34:19,314][01623] Num frames 8900...
+[2023-02-24 10:34:19,426][01623] Num frames 9000...
+[2023-02-24 10:34:19,537][01623] Num frames 9100...
+[2023-02-24 10:34:19,662][01623] Num frames 9200...
+[2023-02-24 10:34:19,773][01623] Num frames 9300...
+[2023-02-24 10:34:19,881][01623] Num frames 9400...
+[2023-02-24 10:34:19,965][01623] Avg episode rewards: #0: 21.426, true rewards: #0: 9.426
+[2023-02-24 10:34:19,967][01623] Avg episode reward: 21.426, avg true_objective: 9.426
+[2023-02-24 10:35:20,550][01623] Replay video saved to /content/train_dir/default_experiment/replay.mp4!