[2023-09-14 12:25:43,303][06576] Saving configuration to ./PPO-VizDoom/train_dir/default_experiment/config.json... [2023-09-14 12:25:43,304][06576] Rollout worker 0 uses device cpu [2023-09-14 12:25:43,304][06576] Rollout worker 1 uses device cpu [2023-09-14 12:25:43,305][06576] Rollout worker 2 uses device cpu [2023-09-14 12:25:43,305][06576] Rollout worker 3 uses device cpu [2023-09-14 12:25:43,305][06576] Rollout worker 4 uses device cpu [2023-09-14 12:25:43,305][06576] Rollout worker 5 uses device cpu [2023-09-14 12:25:43,306][06576] Rollout worker 6 uses device cpu [2023-09-14 12:25:43,306][06576] Rollout worker 7 uses device cpu [2023-09-14 12:25:43,344][06576] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:25:43,345][06576] InferenceWorker_p0-w0: min num requests: 2 [2023-09-14 12:25:43,362][06576] Starting all processes... [2023-09-14 12:25:43,362][06576] Starting process learner_proc0 [2023-09-14 12:25:44,369][06576] Starting all processes... [2023-09-14 12:25:44,372][06576] Starting process inference_proc0-0 [2023-09-14 12:25:44,373][06576] Starting process rollout_proc0 [2023-09-14 12:25:44,374][06635] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:25:44,374][06635] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-09-14 12:25:44,373][06576] Starting process rollout_proc1 [2023-09-14 12:25:44,373][06576] Starting process rollout_proc2 [2023-09-14 12:25:44,373][06576] Starting process rollout_proc3 [2023-09-14 12:25:44,373][06576] Starting process rollout_proc4 [2023-09-14 12:25:44,374][06576] Starting process rollout_proc5 [2023-09-14 12:25:44,377][06576] Starting process rollout_proc6 [2023-09-14 12:25:44,378][06576] Starting process rollout_proc7 [2023-09-14 12:25:44,385][06635] Num visible devices: 1 [2023-09-14 12:25:44,435][06635] Starting seed is not provided [2023-09-14 12:25:44,436][06635] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:25:44,437][06635] Initializing actor-critic model on device cuda:0 [2023-09-14 12:25:44,438][06635] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 12:25:44,440][06635] RunningMeanStd input shape: (1,) [2023-09-14 12:25:44,453][06635] ConvEncoder: input_channels=3 [2023-09-14 12:25:44,634][06635] Conv encoder output size: 512 [2023-09-14 12:25:44,635][06635] Policy head output size: 512 [2023-09-14 12:25:44,665][06635] Created Actor Critic model with architecture: [2023-09-14 12:25:44,668][06635] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): VizdoomEncoder( (basic_encoder): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ELU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ELU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, 
bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) ) [2023-09-14 12:25:44,911][06635] Using optimizer [2023-09-14 12:25:44,914][06635] No checkpoints found [2023-09-14 12:25:44,914][06635] Did not load from checkpoint, starting from scratch! [2023-09-14 12:25:44,915][06635] Initialized policy 0 weights for model version 0 [2023-09-14 12:25:44,919][06635] LearnerWorker_p0 finished initialization! [2023-09-14 12:25:44,919][06635] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:25:45,992][06654] Worker 1 uses CPU cores [1] [2023-09-14 12:25:46,044][06657] Worker 2 uses CPU cores [2] [2023-09-14 12:25:46,072][06664] Worker 6 uses CPU cores [0, 1, 2] [2023-09-14 12:25:46,138][06665] Worker 7 uses CPU cores [3, 4, 5] [2023-09-14 12:25:46,275][06653] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:25:46,275][06653] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-09-14 12:25:46,286][06653] Num visible devices: 1 [2023-09-14 12:25:46,344][06653] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 12:25:46,344][06653] RunningMeanStd input shape: (1,) [2023-09-14 12:25:46,353][06653] ConvEncoder: input_channels=3 [2023-09-14 12:25:46,360][06655] Worker 0 uses CPU cores [0] [2023-09-14 12:25:46,374][06656] Worker 3 uses CPU cores [3] [2023-09-14 12:25:46,395][06658] Worker 4 uses CPU cores [4] [2023-09-14 12:25:46,424][06653] Conv encoder output size: 512 [2023-09-14 12:25:46,424][06653] Policy head output size: 512 [2023-09-14 12:25:46,426][06576] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-09-14 12:25:46,428][06666] Worker 5 uses CPU cores [5] [2023-09-14 12:25:46,469][06576] Inference worker 0-0 is ready! [2023-09-14 12:25:46,469][06576] All inference workers are ready! Signal rollout workers to start! [2023-09-14 12:25:46,497][06655] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,497][06657] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,498][06658] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,499][06666] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,512][06656] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,516][06665] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,520][06664] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,523][06654] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:25:46,851][06657] Decorrelating experience for 0 frames... [2023-09-14 12:25:46,851][06664] Decorrelating experience for 0 frames... [2023-09-14 12:25:46,852][06655] Decorrelating experience for 0 frames... [2023-09-14 12:25:46,851][06665] Decorrelating experience for 0 frames... [2023-09-14 12:25:46,857][06666] Decorrelating experience for 0 frames... [2023-09-14 12:25:46,859][06656] Decorrelating experience for 0 frames... [2023-09-14 12:25:47,000][06657] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,000][06655] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,005][06656] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,096][06664] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,115][06654] Decorrelating experience for 0 frames... 
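The ActorCriticSharedWeights summary printed above reduces to a conv encoder producing a 512-dim feature ("Conv encoder output size: 512"), a single-layer GRU core of width 512, a scalar value head, and a 5-way action head. Below is a minimal PyTorch sketch of just those head shapes, for orientation while reading the dump; it is not Sample Factory's actual module, and the conv encoder is stubbed out because its kernel sizes and strides are not shown in the log.

import torch
from torch import nn

# Sketch of the head shapes printed in the model summary above.
# NOT Sample Factory's ActorCriticSharedWeights; the encoder is represented
# only by its logged 512-dim output.
class TinyActorCriticSketch(nn.Module):
    def __init__(self, feature_dim: int = 512, num_actions: int = 5):
        super().__init__()
        self.core = nn.GRU(feature_dim, feature_dim)                     # GRU(512, 512)
        self.critic_linear = nn.Linear(feature_dim, 1)                   # value head
        self.distribution_linear = nn.Linear(feature_dim, num_actions)   # action logits

    def forward(self, features, rnn_state=None):
        # features: (seq_len, batch, 512), i.e. the encoder output size from the log
        core_out, rnn_state = self.core(features, rnn_state)
        return self.critic_linear(core_out), self.distribution_linear(core_out), rnn_state

values, action_logits, _ = TinyActorCriticSketch()(torch.zeros(1, 4, 512))
print(values.shape, action_logits.shape)  # torch.Size([1, 4, 1]) torch.Size([1, 4, 5])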
[2023-09-14 12:25:47,159][06665] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,207][06656] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,250][06657] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,265][06654] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,365][06665] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,421][06666] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,426][06657] Decorrelating experience for 96 frames... [2023-09-14 12:25:47,464][06654] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,464][06655] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,533][06656] Decorrelating experience for 96 frames... [2023-09-14 12:25:47,617][06658] Decorrelating experience for 0 frames... [2023-09-14 12:25:47,619][06666] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,624][06664] Decorrelating experience for 64 frames... [2023-09-14 12:25:47,639][06654] Decorrelating experience for 96 frames... [2023-09-14 12:25:47,791][06666] Decorrelating experience for 96 frames... [2023-09-14 12:25:47,820][06658] Decorrelating experience for 32 frames... [2023-09-14 12:25:47,860][06665] Decorrelating experience for 96 frames... [2023-09-14 12:25:48,043][06664] Decorrelating experience for 96 frames... [2023-09-14 12:25:48,121][06655] Decorrelating experience for 96 frames... [2023-09-14 12:25:48,399][06635] Signal inference workers to stop experience collection... [2023-09-14 12:25:48,416][06653] InferenceWorker_p0-w0: stopping experience collection [2023-09-14 12:25:48,531][06658] Decorrelating experience for 64 frames... [2023-09-14 12:25:48,699][06658] Decorrelating experience for 96 frames... [2023-09-14 12:25:50,766][06635] Signal inference workers to resume experience collection... [2023-09-14 12:25:50,767][06653] InferenceWorker_p0-w0: resuming experience collection [2023-09-14 12:25:50,834][06576] Fps is (10 sec: 929.3, 60 sec: 929.3, 300 sec: 929.3). Total num frames: 4096. Throughput: 0: 566.3. Samples: 2496. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-14 12:25:50,834][06576] Avg episode reward: [(0, '2.615')] [2023-09-14 12:25:53,006][06653] Updated weights for policy 0, policy_version 10 (0.0249) [2023-09-14 12:25:54,964][06653] Updated weights for policy 0, policy_version 20 (0.0006) [2023-09-14 12:25:55,834][06576] Fps is (10 sec: 10449.5, 60 sec: 10449.5, 300 sec: 10449.5). Total num frames: 98304. Throughput: 0: 2508.8. Samples: 23602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:25:55,834][06576] Avg episode reward: [(0, '4.295')] [2023-09-14 12:25:57,003][06653] Updated weights for policy 0, policy_version 30 (0.0011) [2023-09-14 12:25:59,017][06653] Updated weights for policy 0, policy_version 40 (0.0009) [2023-09-14 12:26:00,834][06576] Fps is (10 sec: 19251.1, 60 sec: 13646.1, 300 sec: 13646.1). Total num frames: 196608. Throughput: 0: 2688.4. Samples: 38734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:26:00,834][06576] Avg episode reward: [(0, '4.394')] [2023-09-14 12:26:00,857][06635] Saving new best policy, reward=4.394! 
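The recurring "Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)" entries are trailing-window throughput figures derived from the "Total num frames" counter. A minimal sketch of that arithmetic, using two frame counts copied from the log above (the helper function is hypothetical, added only for illustration):

# Reproduces the windowed-FPS arithmetic behind the "Fps is (10 sec: ...)" entries.
def window_fps(frames_now: int, frames_then: int, window_sec: float) -> float:
    """Average environment frames per second over a trailing window."""
    return (frames_now - frames_then) / window_sec

# 12:25:50 -> 4,096 total frames; 12:26:00 -> 196,608 total frames.
print(window_fps(196_608, 4_096, 10.0))  # 19251.2, agreeing with the reported "10 sec: 19251.1" up to window timing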
[2023-09-14 12:26:01,042][06653] Updated weights for policy 0, policy_version 50 (0.0005) [2023-09-14 12:26:03,082][06653] Updated weights for policy 0, policy_version 60 (0.0005) [2023-09-14 12:26:03,340][06576] Heartbeat connected on Batcher_0 [2023-09-14 12:26:03,342][06576] Heartbeat connected on LearnerWorker_p0 [2023-09-14 12:26:03,347][06576] Heartbeat connected on InferenceWorker_p0-w0 [2023-09-14 12:26:03,348][06576] Heartbeat connected on RolloutWorker_w0 [2023-09-14 12:26:03,351][06576] Heartbeat connected on RolloutWorker_w1 [2023-09-14 12:26:03,353][06576] Heartbeat connected on RolloutWorker_w2 [2023-09-14 12:26:03,356][06576] Heartbeat connected on RolloutWorker_w3 [2023-09-14 12:26:03,358][06576] Heartbeat connected on RolloutWorker_w5 [2023-09-14 12:26:03,361][06576] Heartbeat connected on RolloutWorker_w6 [2023-09-14 12:26:03,362][06576] Heartbeat connected on RolloutWorker_w7 [2023-09-14 12:26:03,362][06576] Heartbeat connected on RolloutWorker_w4 [2023-09-14 12:26:05,119][06653] Updated weights for policy 0, policy_version 70 (0.0006) [2023-09-14 12:26:05,834][06576] Fps is (10 sec: 20070.2, 60 sec: 15406.7, 300 sec: 15406.7). Total num frames: 299008. Throughput: 0: 3558.0. Samples: 69052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:26:05,834][06576] Avg episode reward: [(0, '4.466')] [2023-09-14 12:26:05,834][06635] Saving new best policy, reward=4.466! [2023-09-14 12:26:07,182][06653] Updated weights for policy 0, policy_version 80 (0.0006) [2023-09-14 12:26:09,224][06653] Updated weights for policy 0, policy_version 90 (0.0008) [2023-09-14 12:26:10,834][06576] Fps is (10 sec: 20070.4, 60 sec: 16278.2, 300 sec: 16278.2). Total num frames: 397312. Throughput: 0: 4059.3. Samples: 99078. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:26:10,834][06576] Avg episode reward: [(0, '4.546')] [2023-09-14 12:26:10,856][06635] Saving new best policy, reward=4.546! [2023-09-14 12:26:11,282][06653] Updated weights for policy 0, policy_version 100 (0.0005) [2023-09-14 12:26:13,309][06653] Updated weights for policy 0, policy_version 110 (0.0006) [2023-09-14 12:26:15,416][06653] Updated weights for policy 0, policy_version 120 (0.0008) [2023-09-14 12:26:15,834][06576] Fps is (10 sec: 20070.5, 60 sec: 16992.6, 300 sec: 16992.6). Total num frames: 499712. Throughput: 0: 3878.9. Samples: 114068. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:26:15,834][06576] Avg episode reward: [(0, '4.450')] [2023-09-14 12:26:17,465][06653] Updated weights for policy 0, policy_version 130 (0.0008) [2023-09-14 12:26:19,464][06653] Updated weights for policy 0, policy_version 140 (0.0005) [2023-09-14 12:26:20,834][06576] Fps is (10 sec: 20479.9, 60 sec: 17499.4, 300 sec: 17499.4). Total num frames: 602112. Throughput: 0: 4189.8. Samples: 144162. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:26:20,834][06576] Avg episode reward: [(0, '4.689')] [2023-09-14 12:26:20,837][06635] Saving new best policy, reward=4.689! [2023-09-14 12:26:21,323][06653] Updated weights for policy 0, policy_version 150 (0.0005) [2023-09-14 12:26:23,199][06653] Updated weights for policy 0, policy_version 160 (0.0005) [2023-09-14 12:26:25,064][06653] Updated weights for policy 0, policy_version 170 (0.0010) [2023-09-14 12:26:25,834][06576] Fps is (10 sec: 21299.2, 60 sec: 18085.4, 300 sec: 18085.4). Total num frames: 712704. Throughput: 0: 4490.8. Samples: 176970. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:26:25,834][06576] Avg episode reward: [(0, '5.129')] [2023-09-14 12:26:25,834][06635] Saving new best policy, reward=5.129! [2023-09-14 12:26:26,944][06653] Updated weights for policy 0, policy_version 180 (0.0007) [2023-09-14 12:26:28,828][06653] Updated weights for policy 0, policy_version 190 (0.0005) [2023-09-14 12:26:30,834][06576] Fps is (10 sec: 21299.3, 60 sec: 18355.1, 300 sec: 18355.1). Total num frames: 815104. Throughput: 0: 4357.5. Samples: 193504. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:26:30,834][06576] Avg episode reward: [(0, '4.724')] [2023-09-14 12:26:30,904][06653] Updated weights for policy 0, policy_version 200 (0.0006) [2023-09-14 12:26:32,979][06653] Updated weights for policy 0, policy_version 210 (0.0006) [2023-09-14 12:26:34,963][06653] Updated weights for policy 0, policy_version 220 (0.0005) [2023-09-14 12:26:35,834][06576] Fps is (10 sec: 20480.1, 60 sec: 18570.1, 300 sec: 18570.1). Total num frames: 917504. Throughput: 0: 4910.2. Samples: 223454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:26:35,834][06576] Avg episode reward: [(0, '4.892')] [2023-09-14 12:26:36,998][06653] Updated weights for policy 0, policy_version 230 (0.0011) [2023-09-14 12:26:39,067][06653] Updated weights for policy 0, policy_version 240 (0.0008) [2023-09-14 12:26:40,834][06576] Fps is (10 sec: 20070.4, 60 sec: 18670.3, 300 sec: 18670.3). Total num frames: 1015808. Throughput: 0: 5108.0. Samples: 253462. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:26:40,834][06576] Avg episode reward: [(0, '4.566')] [2023-09-14 12:26:41,129][06653] Updated weights for policy 0, policy_version 250 (0.0005) [2023-09-14 12:26:43,119][06653] Updated weights for policy 0, policy_version 260 (0.0005) [2023-09-14 12:26:45,163][06653] Updated weights for policy 0, policy_version 270 (0.0008) [2023-09-14 12:26:45,834][06576] Fps is (10 sec: 20070.2, 60 sec: 18822.6, 300 sec: 18822.6). Total num frames: 1118208. Throughput: 0: 5111.0. Samples: 268730. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:26:45,834][06576] Avg episode reward: [(0, '5.056')] [2023-09-14 12:26:47,208][06653] Updated weights for policy 0, policy_version 280 (0.0006) [2023-09-14 12:26:49,253][06653] Updated weights for policy 0, policy_version 290 (0.0006) [2023-09-14 12:26:50,834][06576] Fps is (10 sec: 20070.4, 60 sec: 20206.9, 300 sec: 18887.7). Total num frames: 1216512. Throughput: 0: 5102.8. Samples: 298678. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:26:50,834][06576] Avg episode reward: [(0, '4.752')] [2023-09-14 12:26:51,311][06653] Updated weights for policy 0, policy_version 300 (0.0005) [2023-09-14 12:26:53,394][06653] Updated weights for policy 0, policy_version 310 (0.0008) [2023-09-14 12:26:55,469][06653] Updated weights for policy 0, policy_version 320 (0.0012) [2023-09-14 12:26:55,834][06576] Fps is (10 sec: 19661.1, 60 sec: 20275.2, 300 sec: 18943.4). Total num frames: 1314816. Throughput: 0: 5094.2. Samples: 328316. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:26:55,834][06576] Avg episode reward: [(0, '4.645')] [2023-09-14 12:26:57,507][06653] Updated weights for policy 0, policy_version 330 (0.0005) [2023-09-14 12:26:59,548][06653] Updated weights for policy 0, policy_version 340 (0.0008) [2023-09-14 12:27:00,834][06576] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 19046.7). Total num frames: 1417216. Throughput: 0: 5099.6. Samples: 343552. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:27:00,834][06576] Avg episode reward: [(0, '4.578')] [2023-09-14 12:27:01,624][06653] Updated weights for policy 0, policy_version 350 (0.0008) [2023-09-14 12:27:03,661][06653] Updated weights for policy 0, policy_version 360 (0.0011) [2023-09-14 12:27:05,693][06653] Updated weights for policy 0, policy_version 370 (0.0008) [2023-09-14 12:27:05,834][06576] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 19085.3). Total num frames: 1515520. Throughput: 0: 5094.8. Samples: 373426. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:27:05,834][06576] Avg episode reward: [(0, '4.732')] [2023-09-14 12:27:07,682][06653] Updated weights for policy 0, policy_version 380 (0.0006) [2023-09-14 12:27:09,751][06653] Updated weights for policy 0, policy_version 390 (0.0008) [2023-09-14 12:27:10,834][06576] Fps is (10 sec: 20070.3, 60 sec: 20343.4, 300 sec: 19167.9). Total num frames: 1617920. Throughput: 0: 5036.8. Samples: 403624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:27:10,834][06576] Avg episode reward: [(0, '4.663')] [2023-09-14 12:27:11,798][06653] Updated weights for policy 0, policy_version 400 (0.0011) [2023-09-14 12:27:13,862][06653] Updated weights for policy 0, policy_version 410 (0.0005) [2023-09-14 12:27:15,834][06576] Fps is (10 sec: 20069.6, 60 sec: 20275.1, 300 sec: 19195.4). Total num frames: 1716224. Throughput: 0: 5000.9. Samples: 418548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:27:15,835][06576] Avg episode reward: [(0, '4.644')] [2023-09-14 12:27:15,891][06653] Updated weights for policy 0, policy_version 420 (0.0006) [2023-09-14 12:27:17,939][06653] Updated weights for policy 0, policy_version 430 (0.0005) [2023-09-14 12:27:19,959][06653] Updated weights for policy 0, policy_version 440 (0.0005) [2023-09-14 12:27:20,834][06576] Fps is (10 sec: 20070.3, 60 sec: 20275.2, 300 sec: 19263.5). Total num frames: 1818624. Throughput: 0: 5010.0. Samples: 448906. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:27:20,834][06576] Avg episode reward: [(0, '4.523')] [2023-09-14 12:27:22,031][06653] Updated weights for policy 0, policy_version 450 (0.0011) [2023-09-14 12:27:24,132][06653] Updated weights for policy 0, policy_version 460 (0.0008) [2023-09-14 12:27:25,834][06576] Fps is (10 sec: 20071.3, 60 sec: 20070.4, 300 sec: 19283.5). Total num frames: 1916928. Throughput: 0: 5003.8. Samples: 478634. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:27:25,834][06576] Avg episode reward: [(0, '4.872')] [2023-09-14 12:27:26,173][06653] Updated weights for policy 0, policy_version 470 (0.0008) [2023-09-14 12:27:28,220][06653] Updated weights for policy 0, policy_version 480 (0.0005) [2023-09-14 12:27:30,250][06653] Updated weights for policy 0, policy_version 490 (0.0008) [2023-09-14 12:27:30,834][06576] Fps is (10 sec: 19660.8, 60 sec: 20002.1, 300 sec: 19301.6). Total num frames: 2015232. Throughput: 0: 4997.3. Samples: 493610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:27:30,834][06576] Avg episode reward: [(0, '4.654')] [2023-09-14 12:27:32,277][06653] Updated weights for policy 0, policy_version 500 (0.0005) [2023-09-14 12:27:34,282][06653] Updated weights for policy 0, policy_version 510 (0.0008) [2023-09-14 12:27:35,834][06576] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19355.4). Total num frames: 2117632. Throughput: 0: 5004.0. Samples: 523858. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:27:35,834][06576] Avg episode reward: [(0, '4.554')] [2023-09-14 12:27:36,335][06653] Updated weights for policy 0, policy_version 520 (0.0011) [2023-09-14 12:27:38,363][06653] Updated weights for policy 0, policy_version 530 (0.0005) [2023-09-14 12:27:40,360][06653] Updated weights for policy 0, policy_version 540 (0.0009) [2023-09-14 12:27:40,834][06576] Fps is (10 sec: 20480.1, 60 sec: 20070.4, 300 sec: 19404.6). Total num frames: 2220032. Throughput: 0: 5018.4. Samples: 554144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:27:40,834][06576] Avg episode reward: [(0, '4.863')] [2023-09-14 12:27:40,837][06635] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000542_2220032.pth... [2023-09-14 12:27:42,473][06653] Updated weights for policy 0, policy_version 550 (0.0006) [2023-09-14 12:27:44,519][06653] Updated weights for policy 0, policy_version 560 (0.0005) [2023-09-14 12:27:45,834][06576] Fps is (10 sec: 20070.4, 60 sec: 20002.2, 300 sec: 19415.3). Total num frames: 2318336. Throughput: 0: 5008.8. Samples: 568950. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:27:45,834][06576] Avg episode reward: [(0, '4.700')] [2023-09-14 12:27:46,607][06653] Updated weights for policy 0, policy_version 570 (0.0008) [2023-09-14 12:27:48,664][06653] Updated weights for policy 0, policy_version 580 (0.0006) [2023-09-14 12:27:50,712][06653] Updated weights for policy 0, policy_version 590 (0.0008) [2023-09-14 12:27:50,834][06576] Fps is (10 sec: 19660.9, 60 sec: 20002.1, 300 sec: 19425.2). Total num frames: 2416640. Throughput: 0: 5004.9. Samples: 598646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:27:50,834][06576] Avg episode reward: [(0, '4.579')] [2023-09-14 12:27:52,776][06653] Updated weights for policy 0, policy_version 600 (0.0008) [2023-09-14 12:27:54,794][06653] Updated weights for policy 0, policy_version 610 (0.0008) [2023-09-14 12:27:55,834][06576] Fps is (10 sec: 20070.2, 60 sec: 20070.4, 300 sec: 19465.9). Total num frames: 2519040. Throughput: 0: 5001.6. Samples: 628694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:27:55,834][06576] Avg episode reward: [(0, '4.892')] [2023-09-14 12:27:56,849][06653] Updated weights for policy 0, policy_version 620 (0.0005) [2023-09-14 12:27:58,867][06653] Updated weights for policy 0, policy_version 630 (0.0005) [2023-09-14 12:28:00,843][06576] Fps is (10 sec: 20051.2, 60 sec: 19998.9, 300 sec: 19471.8). Total num frames: 2617344. Throughput: 0: 5004.5. Samples: 643796. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:28:00,845][06576] Avg episode reward: [(0, '4.644')] [2023-09-14 12:28:00,897][06653] Updated weights for policy 0, policy_version 640 (0.0006) [2023-09-14 12:28:02,915][06653] Updated weights for policy 0, policy_version 650 (0.0013) [2023-09-14 12:28:04,933][06653] Updated weights for policy 0, policy_version 660 (0.0008) [2023-09-14 12:28:05,834][06576] Fps is (10 sec: 20070.7, 60 sec: 20070.4, 300 sec: 19509.3). Total num frames: 2719744. Throughput: 0: 5008.5. Samples: 674286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:28:05,834][06576] Avg episode reward: [(0, '4.578')] [2023-09-14 12:28:07,004][06653] Updated weights for policy 0, policy_version 670 (0.0005) [2023-09-14 12:28:09,058][06653] Updated weights for policy 0, policy_version 680 (0.0008) [2023-09-14 12:28:10,834][06576] Fps is (10 sec: 20089.6, 60 sec: 20002.2, 300 sec: 19514.5). 
Total num frames: 2818048. Throughput: 0: 5006.2. Samples: 703914. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:28:10,834][06576] Avg episode reward: [(0, '4.521')] [2023-09-14 12:28:11,159][06653] Updated weights for policy 0, policy_version 690 (0.0006) [2023-09-14 12:28:13,189][06653] Updated weights for policy 0, policy_version 700 (0.0005) [2023-09-14 12:28:15,220][06653] Updated weights for policy 0, policy_version 710 (0.0005) [2023-09-14 12:28:15,834][06576] Fps is (10 sec: 19660.8, 60 sec: 20002.3, 300 sec: 19519.4). Total num frames: 2916352. Throughput: 0: 5006.8. Samples: 718916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:28:15,834][06576] Avg episode reward: [(0, '4.649')] [2023-09-14 12:28:17,296][06653] Updated weights for policy 0, policy_version 720 (0.0005) [2023-09-14 12:28:19,364][06653] Updated weights for policy 0, policy_version 730 (0.0005) [2023-09-14 12:28:20,834][06576] Fps is (10 sec: 20070.5, 60 sec: 20002.2, 300 sec: 19550.5). Total num frames: 3018752. Throughput: 0: 4997.8. Samples: 748758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:28:20,834][06576] Avg episode reward: [(0, '4.472')] [2023-09-14 12:28:21,430][06653] Updated weights for policy 0, policy_version 740 (0.0008) [2023-09-14 12:28:23,465][06653] Updated weights for policy 0, policy_version 750 (0.0008) [2023-09-14 12:28:25,521][06653] Updated weights for policy 0, policy_version 760 (0.0008) [2023-09-14 12:28:25,834][06576] Fps is (10 sec: 20070.3, 60 sec: 20002.1, 300 sec: 19554.0). Total num frames: 3117056. Throughput: 0: 4991.0. Samples: 778738. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:28:25,834][06576] Avg episode reward: [(0, '4.460')] [2023-09-14 12:28:27,552][06653] Updated weights for policy 0, policy_version 770 (0.0011) [2023-09-14 12:28:29,571][06653] Updated weights for policy 0, policy_version 780 (0.0008) [2023-09-14 12:28:30,834][06576] Fps is (10 sec: 20070.3, 60 sec: 20070.4, 300 sec: 19582.2). Total num frames: 3219456. Throughput: 0: 5000.0. Samples: 793952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:28:30,834][06576] Avg episode reward: [(0, '4.624')] [2023-09-14 12:28:31,623][06653] Updated weights for policy 0, policy_version 790 (0.0008) [2023-09-14 12:28:33,642][06653] Updated weights for policy 0, policy_version 800 (0.0005) [2023-09-14 12:28:35,649][06653] Updated weights for policy 0, policy_version 810 (0.0005) [2023-09-14 12:28:35,834][06576] Fps is (10 sec: 20070.3, 60 sec: 20002.1, 300 sec: 19584.5). Total num frames: 3317760. Throughput: 0: 5010.8. Samples: 824132. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:28:35,834][06576] Avg episode reward: [(0, '4.725')] [2023-09-14 12:28:37,656][06653] Updated weights for policy 0, policy_version 820 (0.0005) [2023-09-14 12:28:39,723][06653] Updated weights for policy 0, policy_version 830 (0.0011) [2023-09-14 12:28:40,834][06576] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19610.2). Total num frames: 3420160. Throughput: 0: 5020.1. Samples: 854596. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:28:40,834][06576] Avg episode reward: [(0, '4.742')] [2023-09-14 12:28:41,716][06653] Updated weights for policy 0, policy_version 840 (0.0005) [2023-09-14 12:28:43,728][06653] Updated weights for policy 0, policy_version 850 (0.0008) [2023-09-14 12:28:45,711][06653] Updated weights for policy 0, policy_version 860 (0.0008) [2023-09-14 12:28:45,834][06576] Fps is (10 sec: 20480.1, 60 sec: 20070.4, 300 sec: 19634.4). Total num frames: 3522560. Throughput: 0: 5026.4. Samples: 869934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:28:45,834][06576] Avg episode reward: [(0, '4.809')] [2023-09-14 12:28:47,739][06653] Updated weights for policy 0, policy_version 870 (0.0005) [2023-09-14 12:28:49,766][06653] Updated weights for policy 0, policy_version 880 (0.0006) [2023-09-14 12:28:50,834][06576] Fps is (10 sec: 20479.3, 60 sec: 20138.6, 300 sec: 19657.3). Total num frames: 3624960. Throughput: 0: 5026.0. Samples: 900460. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:28:50,834][06576] Avg episode reward: [(0, '4.951')] [2023-09-14 12:28:51,815][06653] Updated weights for policy 0, policy_version 890 (0.0008) [2023-09-14 12:28:53,883][06653] Updated weights for policy 0, policy_version 900 (0.0008) [2023-09-14 12:28:55,838][06576] Fps is (10 sec: 20061.8, 60 sec: 20069.0, 300 sec: 19657.0). Total num frames: 3723264. Throughput: 0: 5031.9. Samples: 930370. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:28:55,838][06576] Avg episode reward: [(0, '4.430')] [2023-09-14 12:28:55,890][06653] Updated weights for policy 0, policy_version 910 (0.0008) [2023-09-14 12:28:57,989][06653] Updated weights for policy 0, policy_version 920 (0.0006) [2023-09-14 12:29:00,045][06653] Updated weights for policy 0, policy_version 930 (0.0008) [2023-09-14 12:29:00,834][06576] Fps is (10 sec: 19661.5, 60 sec: 20073.6, 300 sec: 19657.5). Total num frames: 3821568. Throughput: 0: 5029.6. Samples: 945250. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:29:00,834][06576] Avg episode reward: [(0, '4.560')] [2023-09-14 12:29:02,094][06653] Updated weights for policy 0, policy_version 940 (0.0008) [2023-09-14 12:29:04,098][06653] Updated weights for policy 0, policy_version 950 (0.0008) [2023-09-14 12:29:05,834][06576] Fps is (10 sec: 20078.8, 60 sec: 20070.3, 300 sec: 19678.1). Total num frames: 3923968. Throughput: 0: 5033.4. Samples: 975264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-09-14 12:29:05,834][06576] Avg episode reward: [(0, '4.922')] [2023-09-14 12:29:06,175][06653] Updated weights for policy 0, policy_version 960 (0.0011) [2023-09-14 12:29:08,206][06653] Updated weights for policy 0, policy_version 970 (0.0008) [2023-09-14 12:29:09,825][06635] Stopping Batcher_0... [2023-09-14 12:29:09,825][06576] Component Batcher_0 stopped! [2023-09-14 12:29:09,826][06635] Loop batcher_evt_loop terminating... [2023-09-14 12:29:09,827][06635] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-09-14 12:29:09,831][06655] Stopping RolloutWorker_w0... [2023-09-14 12:29:09,831][06654] Stopping RolloutWorker_w1... [2023-09-14 12:29:09,831][06655] Loop rollout_proc0_evt_loop terminating... [2023-09-14 12:29:09,832][06654] Loop rollout_proc1_evt_loop terminating... [2023-09-14 12:29:09,831][06576] Component RolloutWorker_w0 stopped! [2023-09-14 12:29:09,832][06576] Component RolloutWorker_w1 stopped! 
[2023-09-14 12:29:09,832][06664] Stopping RolloutWorker_w6... [2023-09-14 12:29:09,832][06576] Component RolloutWorker_w2 stopped! [2023-09-14 12:29:09,832][06657] Stopping RolloutWorker_w2... [2023-09-14 12:29:09,832][06576] Component RolloutWorker_w6 stopped! [2023-09-14 12:29:09,832][06664] Loop rollout_proc6_evt_loop terminating... [2023-09-14 12:29:09,833][06657] Loop rollout_proc2_evt_loop terminating... [2023-09-14 12:29:09,835][06576] Component RolloutWorker_w3 stopped! [2023-09-14 12:29:09,835][06656] Stopping RolloutWorker_w3... [2023-09-14 12:29:09,835][06656] Loop rollout_proc3_evt_loop terminating... [2023-09-14 12:29:09,841][06658] Stopping RolloutWorker_w4... [2023-09-14 12:29:09,841][06576] Component RolloutWorker_w4 stopped! [2023-09-14 12:29:09,841][06658] Loop rollout_proc4_evt_loop terminating... [2023-09-14 12:29:09,854][06666] Stopping RolloutWorker_w5... [2023-09-14 12:29:09,854][06576] Component RolloutWorker_w5 stopped! [2023-09-14 12:29:09,854][06666] Loop rollout_proc5_evt_loop terminating... [2023-09-14 12:29:09,856][06653] Weights refcount: 2 0 [2023-09-14 12:29:09,857][06653] Stopping InferenceWorker_p0-w0... [2023-09-14 12:29:09,857][06653] Loop inference_proc0-0_evt_loop terminating... [2023-09-14 12:29:09,857][06576] Component InferenceWorker_p0-w0 stopped! [2023-09-14 12:29:09,862][06576] Component RolloutWorker_w7 stopped! [2023-09-14 12:29:09,863][06665] Stopping RolloutWorker_w7... [2023-09-14 12:29:09,863][06665] Loop rollout_proc7_evt_loop terminating... [2023-09-14 12:29:09,879][06635] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-09-14 12:29:09,999][06635] Stopping LearnerWorker_p0... [2023-09-14 12:29:10,000][06635] Loop learner_proc0_evt_loop terminating... [2023-09-14 12:29:10,000][06576] Component LearnerWorker_p0 stopped! [2023-09-14 12:29:10,000][06576] Waiting for process learner_proc0 to stop... [2023-09-14 12:29:10,633][06576] Waiting for process inference_proc0-0 to join... [2023-09-14 12:29:10,634][06576] Waiting for process rollout_proc0 to join... [2023-09-14 12:29:10,664][06576] Waiting for process rollout_proc1 to join... [2023-09-14 12:29:10,665][06576] Waiting for process rollout_proc2 to join... [2023-09-14 12:29:10,665][06576] Waiting for process rollout_proc3 to join... [2023-09-14 12:29:10,782][06576] Waiting for process rollout_proc4 to join... [2023-09-14 12:29:10,782][06576] Waiting for process rollout_proc5 to join... [2023-09-14 12:29:10,783][06576] Waiting for process rollout_proc6 to join... [2023-09-14 12:29:10,783][06576] Waiting for process rollout_proc7 to join... 
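On shutdown the learner writes checkpoint_000000978_4005888.pth (the "Saving ..." entries above). The filename encodes the policy version and the environment-step count, which is what the resumed run later reports as self.train_step=978, self.env_steps=4005888. A small decoding sketch with the name copied from this log; the function is illustrative, not Sample Factory's own loading code:

import re

# Decoder for checkpoint names of the observed form
# "checkpoint_{policy_version}_{env_steps}.pth".
def parse_checkpoint_name(filename: str) -> tuple[int, int]:
    m = re.fullmatch(r"checkpoint_(\d+)_(\d+)\.pth", filename)
    if m is None:
        raise ValueError(f"unexpected checkpoint name: {filename}")
    return int(m.group(1)), int(m.group(2))

print(parse_checkpoint_name("checkpoint_000000978_4005888.pth"))  # (978, 4005888)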
[2023-09-14 12:29:10,799][06576] Batcher 0 profile tree view: batching: 11.7404, releasing_batches: 0.0154 [2023-09-14 12:29:10,800][06576] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0000 wait_policy_total: 4.8395 update_model: 2.6808 weight_update: 0.0006 one_step: 0.0014 handle_policy_step: 185.4878 deserialize: 6.9683, stack: 0.8104, obs_to_device_normalize: 41.3711, forward: 94.6897, send_messages: 9.7327 prepare_outputs: 25.7261 to_cpu: 17.2904 [2023-09-14 12:29:10,800][06576] Learner 0 profile tree view: misc: 0.0031, prepare_batch: 11.7497 train: 42.9377 epoch_init: 0.0038, minibatch_init: 0.0039, losses_postprocess: 0.1687, kl_divergence: 0.1839, after_optimizer: 1.2763 calculate_losses: 12.7532 losses_init: 0.0016, forward_head: 0.5567, bptt_initial: 8.7798, tail: 0.4275, advantages_returns: 0.1318, losses: 1.9805 bptt: 0.7602 bptt_forward_core: 0.7322 update: 28.3008 clip: 23.3343 [2023-09-14 12:29:10,801][06576] RolloutWorker_w0 profile tree view: wait_for_trajectories: 0.1243, enqueue_policy_requests: 4.7432, env_step: 87.7123, overhead: 5.2583, complete_rollouts: 0.6657 save_policy_outputs: 5.2237 split_output_tensors: 2.5254 [2023-09-14 12:29:10,801][06576] RolloutWorker_w7 profile tree view: wait_for_trajectories: 0.0675, enqueue_policy_requests: 4.0386, env_step: 111.8632, overhead: 4.3531, complete_rollouts: 0.2009 save_policy_outputs: 3.8566 split_output_tensors: 1.8811 [2023-09-14 12:29:10,802][06576] Loop Runner_EvtLoop terminating... [2023-09-14 12:29:10,802][06576] Runner profile tree view: main_loop: 207.4403 [2023-09-14 12:29:10,802][06576] Collected {0: 4005888}, FPS: 19311.0 [2023-09-14 12:29:10,873][06576] Loading existing experiment configuration from ./PPO-VizDoom/train_dir/default_experiment/config.json [2023-09-14 12:29:10,873][06576] Overriding arg 'num_workers' with value 1 passed from command line [2023-09-14 12:29:10,873][06576] Adding new argument 'no_render'=True that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'save_video'=True that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'video_name'=None that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'train_script'=None that is not in the saved config file! [2023-09-14 12:29:10,873][06576] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
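A few entries above, the first run closes with "Runner profile tree view: main_loop: 207.4403" and "Collected {0: 4005888}, FPS: 19311.0". That overall figure is consistent with total collected environment frames divided by the main loop's wall-clock time; a one-line check with the numbers copied from the log:

# Consistency check for the run summary above (numbers taken from the log).
collected_frames = 4_005_888
main_loop_seconds = 207.4403
print(round(collected_frames / main_loop_seconds, 1))  # 19311.0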
[2023-09-14 12:29:10,873][06576] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-09-14 12:29:10,892][06576] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:29:10,893][06576] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 12:29:10,893][06576] RunningMeanStd input shape: (1,) [2023-09-14 12:29:10,901][06576] ConvEncoder: input_channels=3 [2023-09-14 12:29:10,978][06576] Conv encoder output size: 512 [2023-09-14 12:29:10,978][06576] Policy head output size: 512 [2023-09-14 12:29:11,057][06576] Loading state from checkpoint ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-09-14 12:29:11,430][06576] Num frames 100... [2023-09-14 12:29:11,499][06576] Num frames 200... [2023-09-14 12:29:11,568][06576] Num frames 300... [2023-09-14 12:29:11,679][06576] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840 [2023-09-14 12:29:11,679][06576] Avg episode reward: 3.840, avg true_objective: 3.840 [2023-09-14 12:29:11,690][06576] Num frames 400... [2023-09-14 12:29:11,758][06576] Num frames 500... [2023-09-14 12:29:11,826][06576] Num frames 600... [2023-09-14 12:29:11,894][06576] Num frames 700... [2023-09-14 12:29:11,992][06576] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840 [2023-09-14 12:29:11,992][06576] Avg episode reward: 3.840, avg true_objective: 3.840 [2023-09-14 12:29:12,014][06576] Num frames 800... [2023-09-14 12:29:12,082][06576] Num frames 900... [2023-09-14 12:29:12,151][06576] Num frames 1000... [2023-09-14 12:29:12,220][06576] Num frames 1100... [2023-09-14 12:29:12,308][06576] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840 [2023-09-14 12:29:12,309][06576] Avg episode reward: 3.840, avg true_objective: 3.840 [2023-09-14 12:29:12,342][06576] Num frames 1200... [2023-09-14 12:29:12,410][06576] Num frames 1300... [2023-09-14 12:29:12,479][06576] Num frames 1400... [2023-09-14 12:29:12,546][06576] Num frames 1500... [2023-09-14 12:29:12,619][06576] Num frames 1600... [2023-09-14 12:29:12,669][06576] Avg episode rewards: #0: 4.250, true rewards: #0: 4.000 [2023-09-14 12:29:12,669][06576] Avg episode reward: 4.250, avg true_objective: 4.000 [2023-09-14 12:29:12,738][06576] Num frames 1700... [2023-09-14 12:29:12,806][06576] Num frames 1800... [2023-09-14 12:29:12,874][06576] Num frames 1900... [2023-09-14 12:29:12,983][06576] Avg episode rewards: #0: 4.168, true rewards: #0: 3.968 [2023-09-14 12:29:12,984][06576] Avg episode reward: 4.168, avg true_objective: 3.968 [2023-09-14 12:29:12,995][06576] Num frames 2000... [2023-09-14 12:29:13,063][06576] Num frames 2100... [2023-09-14 12:29:13,131][06576] Num frames 2200... [2023-09-14 12:29:13,199][06576] Num frames 2300... [2023-09-14 12:29:13,298][06576] Avg episode rewards: #0: 4.113, true rewards: #0: 3.947 [2023-09-14 12:29:13,298][06576] Avg episode reward: 4.113, avg true_objective: 3.947 [2023-09-14 12:29:13,320][06576] Num frames 2400... [2023-09-14 12:29:13,389][06576] Num frames 2500... [2023-09-14 12:29:13,458][06576] Num frames 2600... [2023-09-14 12:29:13,526][06576] Num frames 2700... [2023-09-14 12:29:13,614][06576] Avg episode rewards: #0: 4.074, true rewards: #0: 3.931 [2023-09-14 12:29:13,615][06576] Avg episode reward: 4.074, avg true_objective: 3.931 [2023-09-14 12:29:13,648][06576] Num frames 2800... [2023-09-14 12:29:13,717][06576] Num frames 2900... [2023-09-14 12:29:13,786][06576] Num frames 3000... [2023-09-14 12:29:13,855][06576] Num frames 3100... 
[2023-09-14 12:29:13,933][06576] Avg episode rewards: #0: 4.045, true rewards: #0: 3.920 [2023-09-14 12:29:13,933][06576] Avg episode reward: 4.045, avg true_objective: 3.920 [2023-09-14 12:29:13,978][06576] Num frames 3200... [2023-09-14 12:29:14,046][06576] Num frames 3300... [2023-09-14 12:29:14,115][06576] Num frames 3400... [2023-09-14 12:29:14,186][06576] Num frames 3500... [2023-09-14 12:29:14,275][06576] Avg episode rewards: #0: 4.169, true rewards: #0: 3.947 [2023-09-14 12:29:14,275][06576] Avg episode reward: 4.169, avg true_objective: 3.947 [2023-09-14 12:29:14,309][06576] Num frames 3600... [2023-09-14 12:29:14,379][06576] Num frames 3700... [2023-09-14 12:29:14,448][06576] Num frames 3800... [2023-09-14 12:29:14,519][06576] Num frames 3900... [2023-09-14 12:29:14,591][06576] Num frames 4000... [2023-09-14 12:29:14,641][06576] Avg episode rewards: #0: 4.300, true rewards: #0: 4.000 [2023-09-14 12:29:14,642][06576] Avg episode reward: 4.300, avg true_objective: 4.000 [2023-09-14 12:29:19,909][06576] Replay video saved to ./PPO-VizDoom/train_dir/default_experiment/replay.mp4! [2023-09-14 12:41:47,695][13933] Saving configuration to ./PPO-VizDoom/train_dir/default_experiment/config.json... [2023-09-14 12:41:47,695][13933] Rollout worker 0 uses device cpu [2023-09-14 12:41:47,695][13933] Rollout worker 1 uses device cpu [2023-09-14 12:41:47,695][13933] Rollout worker 2 uses device cpu [2023-09-14 12:41:47,696][13933] Rollout worker 3 uses device cpu [2023-09-14 12:41:47,696][13933] Rollout worker 4 uses device cpu [2023-09-14 12:41:47,696][13933] Rollout worker 5 uses device cpu [2023-09-14 12:41:47,696][13933] Rollout worker 6 uses device cpu [2023-09-14 12:41:47,696][13933] Rollout worker 7 uses device cpu [2023-09-14 12:41:47,729][13933] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:41:47,729][13933] InferenceWorker_p0-w0: min num requests: 2 [2023-09-14 12:41:47,747][13933] Starting all processes... [2023-09-14 12:41:47,747][13933] Starting process learner_proc0 [2023-09-14 12:41:48,801][13933] Starting all processes... 
[2023-09-14 12:41:48,804][13933] Starting process inference_proc0-0 [2023-09-14 12:41:48,805][13933] Starting process rollout_proc0 [2023-09-14 12:41:48,806][13971] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:41:48,806][13971] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-09-14 12:41:48,805][13933] Starting process rollout_proc1 [2023-09-14 12:41:48,805][13933] Starting process rollout_proc2 [2023-09-14 12:41:48,805][13933] Starting process rollout_proc3 [2023-09-14 12:41:48,816][13971] Num visible devices: 1 [2023-09-14 12:41:48,808][13933] Starting process rollout_proc4 [2023-09-14 12:41:48,809][13933] Starting process rollout_proc5 [2023-09-14 12:41:48,809][13933] Starting process rollout_proc6 [2023-09-14 12:41:48,813][13933] Starting process rollout_proc7 [2023-09-14 12:41:48,876][13971] Starting seed is not provided [2023-09-14 12:41:48,877][13971] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:41:48,877][13971] Initializing actor-critic model on device cuda:0 [2023-09-14 12:41:48,877][13971] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 12:41:48,878][13971] RunningMeanStd input shape: (1,) [2023-09-14 12:41:48,887][13971] ConvEncoder: input_channels=3 [2023-09-14 12:41:49,059][13971] Conv encoder output size: 512 [2023-09-14 12:41:49,066][13971] Policy head output size: 512 [2023-09-14 12:41:49,089][13971] Created Actor Critic model with architecture: [2023-09-14 12:41:49,089][13971] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): VizdoomEncoder( (basic_encoder): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ELU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ELU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) ) [2023-09-14 12:41:49,233][13971] Using optimizer [2023-09-14 12:41:49,234][13971] Loading state from checkpoint ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-09-14 12:41:49,277][13971] Loading model from checkpoint [2023-09-14 12:41:49,281][13971] Loaded experiment state at self.train_step=978, self.env_steps=4005888 [2023-09-14 12:41:49,282][13971] Initialized policy 0 weights for model version 978 [2023-09-14 12:41:49,285][13971] LearnerWorker_p0 finished initialization! 
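This second run resumes the same experiment: the learner finds checkpoint_000000978_4005888.pth in the experiment directory and reports self.train_step=978, self.env_steps=4005888. The checkpoint names in this log are also consistent with each policy version corresponding to 4,096 environment frames; that factor is inferred from the logged numbers, not a configuration value stated anywhere in the log. A small check over the three checkpoint names that appear in this file:

# Inferred from the log, not a stated config value: every checkpoint name here
# satisfies env_steps == policy_version * 4096.
checkpoints = {
    "checkpoint_000000542_2220032.pth": (542, 2_220_032),
    "checkpoint_000000978_4005888.pth": (978, 4_005_888),
    "checkpoint_000001514_6201344.pth": (1514, 6_201_344),
}
for name, (version, env_steps) in checkpoints.items():
    assert env_steps == version * 4096, name
print("all checkpoints consistent with 4096 env frames per policy version")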
[2023-09-14 12:41:49,285][13971] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:41:50,412][13991] Worker 1 uses CPU cores [1] [2023-09-14 12:41:50,592][13990] Worker 0 uses CPU cores [0] [2023-09-14 12:41:50,657][13992] Worker 2 uses CPU cores [2] [2023-09-14 12:41:50,736][14000] Worker 5 uses CPU cores [5] [2023-09-14 12:41:50,736][14001] Worker 6 uses CPU cores [0, 1, 2] [2023-09-14 12:41:50,935][13998] Worker 4 uses CPU cores [4] [2023-09-14 12:41:50,964][13999] Worker 3 uses CPU cores [3] [2023-09-14 12:41:51,036][14003] Worker 7 uses CPU cores [3, 4, 5] [2023-09-14 12:41:51,045][13933] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-09-14 12:41:51,048][13989] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-14 12:41:51,048][13989] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-09-14 12:41:51,058][13989] Num visible devices: 1 [2023-09-14 12:41:51,087][13989] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 12:41:51,088][13989] RunningMeanStd input shape: (1,) [2023-09-14 12:41:51,095][13989] ConvEncoder: input_channels=3 [2023-09-14 12:41:51,159][13989] Conv encoder output size: 512 [2023-09-14 12:41:51,159][13989] Policy head output size: 512 [2023-09-14 12:41:51,202][13933] Inference worker 0-0 is ready! [2023-09-14 12:41:51,202][13933] All inference workers are ready! Signal rollout workers to start! [2023-09-14 12:41:51,231][13992] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,231][13998] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,232][13999] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,232][13991] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,249][14003] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,253][13990] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,254][14001] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,254][14000] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 12:41:51,417][13999] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,420][13992] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,421][13998] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,453][13990] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,453][14000] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,492][13991] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,552][14001] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,569][13999] Decorrelating experience for 32 frames... [2023-09-14 12:41:51,603][14000] Decorrelating experience for 32 frames... [2023-09-14 12:41:51,693][13992] Decorrelating experience for 32 frames... [2023-09-14 12:41:51,747][14003] Decorrelating experience for 0 frames... [2023-09-14 12:41:51,754][13998] Decorrelating experience for 32 frames... [2023-09-14 12:41:51,760][13990] Decorrelating experience for 32 frames... [2023-09-14 12:41:51,779][13999] Decorrelating experience for 64 frames... [2023-09-14 12:41:51,799][13991] Decorrelating experience for 32 frames... [2023-09-14 12:41:51,808][14000] Decorrelating experience for 64 frames... [2023-09-14 12:41:51,955][13999] Decorrelating experience for 96 frames... 
[2023-09-14 12:41:51,968][13990] Decorrelating experience for 64 frames... [2023-09-14 12:41:52,000][14003] Decorrelating experience for 32 frames... [2023-09-14 12:41:52,006][13991] Decorrelating experience for 64 frames... [2023-09-14 12:41:52,079][14001] Decorrelating experience for 32 frames... [2023-09-14 12:41:52,143][13990] Decorrelating experience for 96 frames... [2023-09-14 12:41:52,152][13992] Decorrelating experience for 64 frames... [2023-09-14 12:41:52,178][13991] Decorrelating experience for 96 frames... [2023-09-14 12:41:52,255][14000] Decorrelating experience for 96 frames... [2023-09-14 12:41:52,289][14003] Decorrelating experience for 64 frames... [2023-09-14 12:41:52,334][13992] Decorrelating experience for 96 frames... [2023-09-14 12:41:52,474][14001] Decorrelating experience for 64 frames... [2023-09-14 12:41:52,592][14003] Decorrelating experience for 96 frames... [2023-09-14 12:41:52,750][14001] Decorrelating experience for 96 frames... [2023-09-14 12:41:52,963][13971] Signal inference workers to stop experience collection... [2023-09-14 12:41:52,966][13989] InferenceWorker_p0-w0: stopping experience collection [2023-09-14 12:41:53,048][13998] Decorrelating experience for 64 frames... [2023-09-14 12:41:53,222][13998] Decorrelating experience for 96 frames... [2023-09-14 12:41:54,915][13971] Signal inference workers to resume experience collection... [2023-09-14 12:41:54,916][13989] InferenceWorker_p0-w0: resuming experience collection [2023-09-14 12:41:55,160][13933] Fps is (10 sec: 995.6, 60 sec: 995.6, 300 sec: 995.6). Total num frames: 4009984. Throughput: 0: 631.5. Samples: 2598. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-14 12:41:55,160][13933] Avg episode reward: [(0, '2.998')] [2023-09-14 12:41:57,140][13989] Updated weights for policy 0, policy_version 988 (0.0239) [2023-09-14 12:41:59,243][13989] Updated weights for policy 0, policy_version 998 (0.0008) [2023-09-14 12:42:00,160][13933] Fps is (10 sec: 10785.9, 60 sec: 10785.9, 300 sec: 10785.9). Total num frames: 4104192. Throughput: 0: 1224.3. Samples: 11158. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:42:00,160][13933] Avg episode reward: [(0, '4.761')] [2023-09-14 12:42:01,346][13989] Updated weights for policy 0, policy_version 1008 (0.0008) [2023-09-14 12:42:03,435][13989] Updated weights for policy 0, policy_version 1018 (0.0006) [2023-09-14 12:42:05,160][13933] Fps is (10 sec: 19251.2, 60 sec: 13929.9, 300 sec: 13929.9). Total num frames: 4202496. Throughput: 0: 2871.2. Samples: 40524. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-09-14 12:42:05,160][13933] Avg episode reward: [(0, '4.730')] [2023-09-14 12:42:05,507][13989] Updated weights for policy 0, policy_version 1028 (0.0006) [2023-09-14 12:42:07,622][13989] Updated weights for policy 0, policy_version 1038 (0.0008) [2023-09-14 12:42:07,725][13933] Heartbeat connected on Batcher_0 [2023-09-14 12:42:07,727][13933] Heartbeat connected on LearnerWorker_p0 [2023-09-14 12:42:07,731][13933] Heartbeat connected on InferenceWorker_p0-w0 [2023-09-14 12:42:07,734][13933] Heartbeat connected on RolloutWorker_w1 [2023-09-14 12:42:07,735][13933] Heartbeat connected on RolloutWorker_w0 [2023-09-14 12:42:07,737][13933] Heartbeat connected on RolloutWorker_w2 [2023-09-14 12:42:07,739][13933] Heartbeat connected on RolloutWorker_w3 [2023-09-14 12:42:07,744][13933] Heartbeat connected on RolloutWorker_w5 [2023-09-14 12:42:07,745][13933] Heartbeat connected on RolloutWorker_w6 [2023-09-14 12:42:07,746][13933] Heartbeat connected on RolloutWorker_w4 [2023-09-14 12:42:07,747][13933] Heartbeat connected on RolloutWorker_w7 [2023-09-14 12:42:09,679][13989] Updated weights for policy 0, policy_version 1048 (0.0005) [2023-09-14 12:42:10,160][13933] Fps is (10 sec: 19660.8, 60 sec: 15429.0, 300 sec: 15429.0). Total num frames: 4300800. Throughput: 0: 3667.2. Samples: 70096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:42:10,160][13933] Avg episode reward: [(0, '4.578')] [2023-09-14 12:42:11,757][13989] Updated weights for policy 0, policy_version 1058 (0.0008) [2023-09-14 12:42:13,801][13989] Updated weights for policy 0, policy_version 1068 (0.0006) [2023-09-14 12:42:15,160][13933] Fps is (10 sec: 19660.8, 60 sec: 16306.5, 300 sec: 16306.5). Total num frames: 4399104. Throughput: 0: 3521.8. Samples: 84924. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:42:15,160][13933] Avg episode reward: [(0, '4.779')] [2023-09-14 12:42:15,851][13989] Updated weights for policy 0, policy_version 1078 (0.0013) [2023-09-14 12:42:17,974][13989] Updated weights for policy 0, policy_version 1088 (0.0012) [2023-09-14 12:42:20,005][13989] Updated weights for policy 0, policy_version 1098 (0.0008) [2023-09-14 12:42:20,160][13933] Fps is (10 sec: 19660.7, 60 sec: 16882.5, 300 sec: 16882.5). Total num frames: 4497408. Throughput: 0: 3935.4. Samples: 114576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:42:20,160][13933] Avg episode reward: [(0, '4.483')] [2023-09-14 12:42:22,083][13989] Updated weights for policy 0, policy_version 1108 (0.0008) [2023-09-14 12:42:24,156][13989] Updated weights for policy 0, policy_version 1118 (0.0008) [2023-09-14 12:42:25,160][13933] Fps is (10 sec: 19660.7, 60 sec: 17289.7, 300 sec: 17289.7). Total num frames: 4595712. Throughput: 0: 4230.1. Samples: 144306. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:42:25,160][13933] Avg episode reward: [(0, '4.788')] [2023-09-14 12:42:26,205][13989] Updated weights for policy 0, policy_version 1128 (0.0005) [2023-09-14 12:42:28,280][13989] Updated weights for policy 0, policy_version 1138 (0.0008) [2023-09-14 12:42:30,160][13933] Fps is (10 sec: 19661.0, 60 sec: 17592.8, 300 sec: 17592.8). Total num frames: 4694016. Throughput: 0: 4070.9. Samples: 159230. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:42:30,160][13933] Avg episode reward: [(0, '4.555')] [2023-09-14 12:42:30,380][13989] Updated weights for policy 0, policy_version 1148 (0.0006) [2023-09-14 12:42:32,440][13989] Updated weights for policy 0, policy_version 1158 (0.0006) [2023-09-14 12:42:34,500][13989] Updated weights for policy 0, policy_version 1168 (0.0005) [2023-09-14 12:42:35,160][13933] Fps is (10 sec: 20070.7, 60 sec: 17920.1, 300 sec: 17920.1). Total num frames: 4796416. Throughput: 0: 4280.9. Samples: 188848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:42:35,160][13933] Avg episode reward: [(0, '4.537')] [2023-09-14 12:42:36,604][13989] Updated weights for policy 0, policy_version 1178 (0.0008) [2023-09-14 12:42:38,671][13989] Updated weights for policy 0, policy_version 1188 (0.0006) [2023-09-14 12:42:40,160][13933] Fps is (10 sec: 20070.3, 60 sec: 18097.3, 300 sec: 18097.3). Total num frames: 4894720. Throughput: 0: 4795.4. Samples: 218392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:42:40,160][13933] Avg episode reward: [(0, '4.904')] [2023-09-14 12:42:40,759][13989] Updated weights for policy 0, policy_version 1198 (0.0006) [2023-09-14 12:42:42,816][13989] Updated weights for policy 0, policy_version 1208 (0.0006) [2023-09-14 12:42:44,859][13989] Updated weights for policy 0, policy_version 1218 (0.0006) [2023-09-14 12:42:45,160][13933] Fps is (10 sec: 19660.6, 60 sec: 18241.7, 300 sec: 18241.7). Total num frames: 4993024. Throughput: 0: 4933.6. Samples: 233170. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:42:45,160][13933] Avg episode reward: [(0, '4.695')] [2023-09-14 12:42:46,943][13989] Updated weights for policy 0, policy_version 1228 (0.0008) [2023-09-14 12:42:49,011][13989] Updated weights for policy 0, policy_version 1238 (0.0008) [2023-09-14 12:42:50,160][13933] Fps is (10 sec: 19660.8, 60 sec: 18361.8, 300 sec: 18361.8). Total num frames: 5091328. Throughput: 0: 4940.5. Samples: 262846. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:42:50,160][13933] Avg episode reward: [(0, '4.864')] [2023-09-14 12:42:51,083][13989] Updated weights for policy 0, policy_version 1248 (0.0005) [2023-09-14 12:42:53,137][13989] Updated weights for policy 0, policy_version 1258 (0.0008) [2023-09-14 12:42:55,160][13933] Fps is (10 sec: 19660.7, 60 sec: 19660.8, 300 sec: 18463.1). Total num frames: 5189632. Throughput: 0: 4943.9. Samples: 292572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 3.0) [2023-09-14 12:42:55,160][13933] Avg episode reward: [(0, '4.679')] [2023-09-14 12:42:55,237][13989] Updated weights for policy 0, policy_version 1268 (0.0008) [2023-09-14 12:42:57,312][13989] Updated weights for policy 0, policy_version 1278 (0.0006) [2023-09-14 12:42:59,360][13989] Updated weights for policy 0, policy_version 1288 (0.0008) [2023-09-14 12:43:00,160][13933] Fps is (10 sec: 19660.9, 60 sec: 19729.1, 300 sec: 18549.7). Total num frames: 5287936. Throughput: 0: 4945.5. Samples: 307472. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:43:00,160][13933] Avg episode reward: [(0, '4.568')] [2023-09-14 12:43:01,448][13989] Updated weights for policy 0, policy_version 1298 (0.0006) [2023-09-14 12:43:03,496][13989] Updated weights for policy 0, policy_version 1308 (0.0005) [2023-09-14 12:43:05,160][13933] Fps is (10 sec: 20070.5, 60 sec: 19797.3, 300 sec: 18679.9). Total num frames: 5390336. Throughput: 0: 4950.4. Samples: 337344. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:43:05,160][13933] Avg episode reward: [(0, '4.631')] [2023-09-14 12:43:05,548][13989] Updated weights for policy 0, policy_version 1318 (0.0008) [2023-09-14 12:43:07,584][13989] Updated weights for policy 0, policy_version 1328 (0.0008) [2023-09-14 12:43:09,580][13989] Updated weights for policy 0, policy_version 1338 (0.0005) [2023-09-14 12:43:10,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19797.3, 300 sec: 18741.9). Total num frames: 5488640. Throughput: 0: 4957.6. Samples: 367398. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:43:10,160][13933] Avg episode reward: [(0, '4.729')] [2023-09-14 12:43:11,637][13989] Updated weights for policy 0, policy_version 1348 (0.0008) [2023-09-14 12:43:13,648][13989] Updated weights for policy 0, policy_version 1358 (0.0008) [2023-09-14 12:43:15,159][13933] Fps is (10 sec: 20070.6, 60 sec: 19865.6, 300 sec: 18845.3). Total num frames: 5591040. Throughput: 0: 4962.0. Samples: 382520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:43:15,160][13933] Avg episode reward: [(0, '4.801')] [2023-09-14 12:43:15,715][13989] Updated weights for policy 0, policy_version 1368 (0.0008) [2023-09-14 12:43:17,776][13989] Updated weights for policy 0, policy_version 1378 (0.0005) [2023-09-14 12:43:19,818][13989] Updated weights for policy 0, policy_version 1388 (0.0011) [2023-09-14 12:43:20,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19865.6, 300 sec: 18891.0). Total num frames: 5689344. Throughput: 0: 4967.6. Samples: 412390. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:43:20,160][13933] Avg episode reward: [(0, '4.770')] [2023-09-14 12:43:21,875][13989] Updated weights for policy 0, policy_version 1398 (0.0011) [2023-09-14 12:43:23,915][13989] Updated weights for policy 0, policy_version 1408 (0.0005) [2023-09-14 12:43:25,160][13933] Fps is (10 sec: 20070.2, 60 sec: 19933.9, 300 sec: 18975.4). Total num frames: 5791744. Throughput: 0: 4980.4. Samples: 442508. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:43:25,160][13933] Avg episode reward: [(0, '4.716')] [2023-09-14 12:43:25,969][13989] Updated weights for policy 0, policy_version 1418 (0.0006) [2023-09-14 12:43:27,947][13989] Updated weights for policy 0, policy_version 1428 (0.0005) [2023-09-14 12:43:29,959][13989] Updated weights for policy 0, policy_version 1438 (0.0005) [2023-09-14 12:43:30,160][13933] Fps is (10 sec: 20070.5, 60 sec: 19933.9, 300 sec: 19010.0). Total num frames: 5890048. Throughput: 0: 4990.3. Samples: 457734. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:43:30,160][13933] Avg episode reward: [(0, '5.154')] [2023-09-14 12:43:30,166][13971] Saving new best policy, reward=5.154! [2023-09-14 12:43:31,967][13989] Updated weights for policy 0, policy_version 1448 (0.0005) [2023-09-14 12:43:33,982][13989] Updated weights for policy 0, policy_version 1458 (0.0005) [2023-09-14 12:43:35,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.8, 300 sec: 19080.6). Total num frames: 5992448. Throughput: 0: 5006.4. Samples: 488132. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:43:35,160][13933] Avg episode reward: [(0, '5.077')] [2023-09-14 12:43:36,007][13989] Updated weights for policy 0, policy_version 1468 (0.0008) [2023-09-14 12:43:37,987][13989] Updated weights for policy 0, policy_version 1478 (0.0008) [2023-09-14 12:43:39,993][13989] Updated weights for policy 0, policy_version 1488 (0.0005) [2023-09-14 12:43:40,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20002.1, 300 sec: 19144.7). Total num frames: 6094848. Throughput: 0: 5026.7. Samples: 518772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:43:40,160][13933] Avg episode reward: [(0, '4.918')] [2023-09-14 12:43:42,030][13989] Updated weights for policy 0, policy_version 1498 (0.0008) [2023-09-14 12:43:43,979][13989] Updated weights for policy 0, policy_version 1508 (0.0008) [2023-09-14 12:43:45,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20070.4, 300 sec: 19203.2). Total num frames: 6197248. Throughput: 0: 5038.7. Samples: 534214. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:43:45,160][13933] Avg episode reward: [(0, '4.780')] [2023-09-14 12:43:45,195][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000001514_6201344.pth... [2023-09-14 12:43:45,239][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000542_2220032.pth [2023-09-14 12:43:45,947][13989] Updated weights for policy 0, policy_version 1518 (0.0005) [2023-09-14 12:43:47,927][13989] Updated weights for policy 0, policy_version 1528 (0.0006) [2023-09-14 12:43:49,932][13989] Updated weights for policy 0, policy_version 1538 (0.0008) [2023-09-14 12:43:50,160][13933] Fps is (10 sec: 20889.8, 60 sec: 20206.9, 300 sec: 19291.2). Total num frames: 6303744. Throughput: 0: 5065.5. Samples: 565290. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:43:50,160][13933] Avg episode reward: [(0, '4.916')] [2023-09-14 12:43:51,934][13989] Updated weights for policy 0, policy_version 1548 (0.0006) [2023-09-14 12:43:53,920][13989] Updated weights for policy 0, policy_version 1558 (0.0006) [2023-09-14 12:43:55,160][13933] Fps is (10 sec: 20889.6, 60 sec: 20275.2, 300 sec: 19339.1). Total num frames: 6406144. Throughput: 0: 5084.4. Samples: 596194. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:43:55,160][13933] Avg episode reward: [(0, '4.447')] [2023-09-14 12:43:55,892][13989] Updated weights for policy 0, policy_version 1568 (0.0005) [2023-09-14 12:43:57,863][13989] Updated weights for policy 0, policy_version 1578 (0.0010) [2023-09-14 12:43:59,836][13989] Updated weights for policy 0, policy_version 1588 (0.0008) [2023-09-14 12:44:00,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 19383.3). Total num frames: 6508544. Throughput: 0: 5093.8. Samples: 611740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:00,160][13933] Avg episode reward: [(0, '4.891')] [2023-09-14 12:44:01,836][13989] Updated weights for policy 0, policy_version 1598 (0.0005) [2023-09-14 12:44:03,789][13989] Updated weights for policy 0, policy_version 1608 (0.0005) [2023-09-14 12:44:05,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 19424.2). Total num frames: 6610944. Throughput: 0: 5120.0. Samples: 642792. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:44:05,160][13933] Avg episode reward: [(0, '5.277')] [2023-09-14 12:44:05,215][13971] Saving new best policy, reward=5.277! 
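The checkpoint entries above follow the naming pattern `checkpoint_<policy_version>_<env_frames>.pth`; for example `checkpoint_000001514_6201344.pth` encodes policy version 1514 at 6201344 environment frames, and 1514 × 4096 = 6201344, so in this run each policy version corresponds to 4096 collected frames. A small sketch, under that reading of the filenames, for decoding them:

```python
import re
from pathlib import Path

# Filenames as they appear in the log, e.g.
#   checkpoint_000001514_6201344.pth -> policy_version 1514, env frames 6201344
CKPT_RE = re.compile(r"checkpoint_(\d+)_(\d+)\.pth$")

def decode_checkpoint_name(path: str) -> tuple[int, int]:
    """Return (policy_version, env_frames) encoded in a checkpoint filename."""
    m = CKPT_RE.search(Path(path).name)
    if m is None:
        raise ValueError(f"not a checkpoint filename: {path}")
    return int(m.group(1)), int(m.group(2))

version, frames = decode_checkpoint_name(
    "./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000001514_6201344.pth"
)
print(version, frames, frames // version)  # 1514 6201344 4096 in this run
```

The same 4096 frames-per-version ratio holds for every checkpoint named later in the log (2111 → 8646656, 2700 → 11059200, 3344 → 13697024, 3932 → 16105472).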
[2023-09-14 12:44:05,815][13989] Updated weights for policy 0, policy_version 1618 (0.0006) [2023-09-14 12:44:07,786][13989] Updated weights for policy 0, policy_version 1628 (0.0010) [2023-09-14 12:44:09,803][13989] Updated weights for policy 0, policy_version 1638 (0.0006) [2023-09-14 12:44:10,160][13933] Fps is (10 sec: 20479.6, 60 sec: 20411.7, 300 sec: 19462.1). Total num frames: 6713344. Throughput: 0: 5130.8. Samples: 673396. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:10,160][13933] Avg episode reward: [(0, '5.319')] [2023-09-14 12:44:10,160][13971] Saving new best policy, reward=5.319! [2023-09-14 12:44:11,821][13989] Updated weights for policy 0, policy_version 1648 (0.0005) [2023-09-14 12:44:13,819][13989] Updated weights for policy 0, policy_version 1658 (0.0008) [2023-09-14 12:44:15,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20411.7, 300 sec: 19497.4). Total num frames: 6815744. Throughput: 0: 5132.9. Samples: 688714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:15,160][13933] Avg episode reward: [(0, '6.353')] [2023-09-14 12:44:15,203][13971] Saving new best policy, reward=6.353! [2023-09-14 12:44:15,843][13989] Updated weights for policy 0, policy_version 1668 (0.0011) [2023-09-14 12:44:17,862][13989] Updated weights for policy 0, policy_version 1678 (0.0008) [2023-09-14 12:44:19,849][13989] Updated weights for policy 0, policy_version 1688 (0.0008) [2023-09-14 12:44:20,160][13933] Fps is (10 sec: 20480.3, 60 sec: 20480.0, 300 sec: 19530.4). Total num frames: 6918144. Throughput: 0: 5138.1. Samples: 719348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:20,160][13933] Avg episode reward: [(0, '6.127')] [2023-09-14 12:44:21,897][13989] Updated weights for policy 0, policy_version 1698 (0.0008) [2023-09-14 12:44:23,927][13989] Updated weights for policy 0, policy_version 1708 (0.0005) [2023-09-14 12:44:25,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20480.0, 300 sec: 19561.2). Total num frames: 7020544. Throughput: 0: 5132.8. Samples: 749748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:44:25,160][13933] Avg episode reward: [(0, '6.781')] [2023-09-14 12:44:25,162][13971] Saving new best policy, reward=6.781! [2023-09-14 12:44:25,956][13989] Updated weights for policy 0, policy_version 1718 (0.0006) [2023-09-14 12:44:28,004][13989] Updated weights for policy 0, policy_version 1728 (0.0006) [2023-09-14 12:44:29,992][13989] Updated weights for policy 0, policy_version 1738 (0.0005) [2023-09-14 12:44:30,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20480.0, 300 sec: 19564.3). Total num frames: 7118848. Throughput: 0: 5123.2. Samples: 764758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:30,160][13933] Avg episode reward: [(0, '8.299')] [2023-09-14 12:44:30,160][13971] Saving new best policy, reward=8.299! [2023-09-14 12:44:32,057][13989] Updated weights for policy 0, policy_version 1748 (0.0008) [2023-09-14 12:44:34,056][13989] Updated weights for policy 0, policy_version 1758 (0.0005) [2023-09-14 12:44:35,160][13933] Fps is (10 sec: 20070.1, 60 sec: 20480.0, 300 sec: 19592.2). Total num frames: 7221248. Throughput: 0: 5107.4. Samples: 795124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:35,161][13933] Avg episode reward: [(0, '8.570')] [2023-09-14 12:44:35,164][13971] Saving new best policy, reward=8.570! 
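Each `Saving new best policy, reward=...!` entry fires when the reported average episode reward exceeds the best value seen so far (5.154, 5.277, 5.319, 6.353, 6.781, 8.299, 8.570 across this stretch, while non-improving reports such as 5.077 or 6.127 trigger nothing). A minimal sketch of that bookkeeping as the log implies it, not Sample Factory's actual implementation; `save_fn` is a hypothetical stand-in for whatever writes the best-policy checkpoint:

```python
from typing import Callable, Optional

class BestPolicyTracker:
    """Track the best observed average episode reward and save on improvement."""

    def __init__(self, save_fn: Callable[[float], None]):
        self.best_reward: Optional[float] = None
        self.save_fn = save_fn

    def report(self, avg_episode_reward: float) -> bool:
        """Return True (and trigger a save) if this report beats the running best."""
        if self.best_reward is None or avg_episode_reward > self.best_reward:
            self.best_reward = avg_episode_reward
            self.save_fn(avg_episode_reward)
            return True
        return False

tracker = BestPolicyTracker(lambda r: print(f"Saving new best policy, reward={r:.3f}!"))
for r in [4.730, 5.154, 5.077, 5.277, 5.319, 6.353]:  # values reported in the log above
    tracker.report(r)
```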
[2023-09-14 12:44:36,075][13989] Updated weights for policy 0, policy_version 1768 (0.0011) [2023-09-14 12:44:38,091][13989] Updated weights for policy 0, policy_version 1778 (0.0006) [2023-09-14 12:44:40,123][13989] Updated weights for policy 0, policy_version 1788 (0.0011) [2023-09-14 12:44:40,160][13933] Fps is (10 sec: 20479.9, 60 sec: 20480.0, 300 sec: 19618.5). Total num frames: 7323648. Throughput: 0: 5103.3. Samples: 825842. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:44:40,161][13933] Avg episode reward: [(0, '9.939')] [2023-09-14 12:44:40,162][13971] Saving new best policy, reward=9.939! [2023-09-14 12:44:42,135][13989] Updated weights for policy 0, policy_version 1798 (0.0005) [2023-09-14 12:44:44,161][13989] Updated weights for policy 0, policy_version 1808 (0.0011) [2023-09-14 12:44:45,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20411.7, 300 sec: 19619.7). Total num frames: 7421952. Throughput: 0: 5089.6. Samples: 840772. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:44:45,160][13933] Avg episode reward: [(0, '10.736')] [2023-09-14 12:44:45,191][13971] Saving new best policy, reward=10.736! [2023-09-14 12:44:46,199][13989] Updated weights for policy 0, policy_version 1818 (0.0006) [2023-09-14 12:44:48,284][13989] Updated weights for policy 0, policy_version 1828 (0.0008) [2023-09-14 12:44:50,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20343.4, 300 sec: 19643.7). Total num frames: 7524352. Throughput: 0: 5068.0. Samples: 870852. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:44:50,160][13933] Avg episode reward: [(0, '12.469')] [2023-09-14 12:44:50,160][13971] Saving new best policy, reward=12.469! [2023-09-14 12:44:50,298][13989] Updated weights for policy 0, policy_version 1838 (0.0006) [2023-09-14 12:44:52,308][13989] Updated weights for policy 0, policy_version 1848 (0.0005) [2023-09-14 12:44:54,323][13989] Updated weights for policy 0, policy_version 1858 (0.0005) [2023-09-14 12:44:55,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20343.5, 300 sec: 19666.4). Total num frames: 7626752. Throughput: 0: 5063.8. Samples: 901266. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:44:55,160][13933] Avg episode reward: [(0, '15.134')] [2023-09-14 12:44:55,163][13971] Saving new best policy, reward=15.134! [2023-09-14 12:44:56,357][13989] Updated weights for policy 0, policy_version 1868 (0.0005) [2023-09-14 12:44:58,369][13989] Updated weights for policy 0, policy_version 1878 (0.0013) [2023-09-14 12:45:00,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 19666.3). Total num frames: 7725056. Throughput: 0: 5059.7. Samples: 916398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:45:00,160][13933] Avg episode reward: [(0, '13.500')] [2023-09-14 12:45:00,392][13989] Updated weights for policy 0, policy_version 1888 (0.0005) [2023-09-14 12:45:02,415][13989] Updated weights for policy 0, policy_version 1898 (0.0005) [2023-09-14 12:45:04,386][13989] Updated weights for policy 0, policy_version 1908 (0.0005) [2023-09-14 12:45:05,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 19687.2). Total num frames: 7827456. Throughput: 0: 5056.4. Samples: 946888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:45:05,160][13933] Avg episode reward: [(0, '15.391')] [2023-09-14 12:45:05,177][13971] Saving new best policy, reward=15.391! 
[2023-09-14 12:45:06,430][13989] Updated weights for policy 0, policy_version 1918 (0.0011) [2023-09-14 12:45:08,388][13989] Updated weights for policy 0, policy_version 1928 (0.0010) [2023-09-14 12:45:10,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20275.3, 300 sec: 19707.1). Total num frames: 7929856. Throughput: 0: 5063.9. Samples: 977622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:45:10,160][13933] Avg episode reward: [(0, '15.530')] [2023-09-14 12:45:10,160][13971] Saving new best policy, reward=15.530! [2023-09-14 12:45:10,454][13989] Updated weights for policy 0, policy_version 1938 (0.0005) [2023-09-14 12:45:12,503][13989] Updated weights for policy 0, policy_version 1948 (0.0005) [2023-09-14 12:45:14,479][13989] Updated weights for policy 0, policy_version 1958 (0.0005) [2023-09-14 12:45:15,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 19726.1). Total num frames: 8032256. Throughput: 0: 5060.9. Samples: 992500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:45:15,160][13933] Avg episode reward: [(0, '17.821')] [2023-09-14 12:45:15,163][13971] Saving new best policy, reward=17.821! [2023-09-14 12:45:16,491][13989] Updated weights for policy 0, policy_version 1968 (0.0005) [2023-09-14 12:45:18,575][13989] Updated weights for policy 0, policy_version 1978 (0.0005) [2023-09-14 12:45:20,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20206.9, 300 sec: 19724.5). Total num frames: 8130560. Throughput: 0: 5064.5. Samples: 1023024. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:45:20,160][13933] Avg episode reward: [(0, '18.788')] [2023-09-14 12:45:20,162][13971] Saving new best policy, reward=18.788! [2023-09-14 12:45:20,553][13989] Updated weights for policy 0, policy_version 1988 (0.0008) [2023-09-14 12:45:22,587][13989] Updated weights for policy 0, policy_version 1998 (0.0010) [2023-09-14 12:45:24,580][13989] Updated weights for policy 0, policy_version 2008 (0.0005) [2023-09-14 12:45:25,161][13933] Fps is (10 sec: 20067.3, 60 sec: 20206.4, 300 sec: 19742.0). Total num frames: 8232960. Throughput: 0: 5059.7. Samples: 1053538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:45:25,161][13933] Avg episode reward: [(0, '17.400')] [2023-09-14 12:45:26,599][13989] Updated weights for policy 0, policy_version 2018 (0.0008) [2023-09-14 12:45:28,570][13989] Updated weights for policy 0, policy_version 2028 (0.0005) [2023-09-14 12:45:30,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20275.2, 300 sec: 19759.0). Total num frames: 8335360. Throughput: 0: 5068.5. Samples: 1068854. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:45:30,160][13933] Avg episode reward: [(0, '18.872')] [2023-09-14 12:45:30,181][13971] Saving new best policy, reward=18.872! [2023-09-14 12:45:30,549][13989] Updated weights for policy 0, policy_version 2038 (0.0008) [2023-09-14 12:45:32,584][13989] Updated weights for policy 0, policy_version 2048 (0.0008) [2023-09-14 12:45:34,575][13989] Updated weights for policy 0, policy_version 2058 (0.0005) [2023-09-14 12:45:35,160][13933] Fps is (10 sec: 20483.1, 60 sec: 20275.2, 300 sec: 19775.1). Total num frames: 8437760. Throughput: 0: 5082.9. Samples: 1099584. 
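The `Updated weights for policy 0, policy_version N (...)` entries advance the version by 10 roughly every two seconds (e.g. version 1918 at 12:45:06,430 and 1928 at 12:45:08,388 above). Combined with the 4096 frames per policy version inferred from the checkpoint names, that cadence should reproduce the reported throughput; a quick cross-check using two timestamp/version pairs copied from the log:

```python
from datetime import datetime

FRAMES_PER_VERSION = 4096  # inferred from the checkpoint filenames in this run

def fps_between(t0: str, v0: int, t1: str, v1: int) -> float:
    """Frames/sec implied by two 'Updated weights ... policy_version' entries."""
    fmt = "%Y-%m-%d %H:%M:%S,%f"
    dt = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
    return (v1 - v0) * FRAMES_PER_VERSION / dt

# Timestamps and versions taken verbatim from the entries above.
print(fps_between("2023-09-14 12:45:06,430", 1918,
                  "2023-09-14 12:45:08,388", 1928))  # ~20900, close to the reported FPS
```

The result (about 20.9k frames/sec over that 1.96 s interval) is consistent with the 10-second FPS figures of roughly 20,070-20,480 reported around the same time.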
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:45:35,160][13933] Avg episode reward: [(0, '18.008')] [2023-09-14 12:45:36,617][13989] Updated weights for policy 0, policy_version 2068 (0.0006) [2023-09-14 12:45:38,578][13989] Updated weights for policy 0, policy_version 2078 (0.0006) [2023-09-14 12:45:40,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 19790.5). Total num frames: 8540160. Throughput: 0: 5090.4. Samples: 1130336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:45:40,160][13933] Avg episode reward: [(0, '19.428')] [2023-09-14 12:45:40,200][13971] Saving new best policy, reward=19.428! [2023-09-14 12:45:40,574][13989] Updated weights for policy 0, policy_version 2088 (0.0005) [2023-09-14 12:45:42,617][13989] Updated weights for policy 0, policy_version 2098 (0.0006) [2023-09-14 12:45:44,599][13989] Updated weights for policy 0, policy_version 2108 (0.0008) [2023-09-14 12:45:45,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20343.5, 300 sec: 19805.2). Total num frames: 8642560. Throughput: 0: 5092.4. Samples: 1145558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:45:45,160][13933] Avg episode reward: [(0, '17.791')] [2023-09-14 12:45:45,197][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002111_8646656.pth... [2023-09-14 12:45:45,244][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth [2023-09-14 12:45:46,670][13989] Updated weights for policy 0, policy_version 2118 (0.0008) [2023-09-14 12:45:48,708][13989] Updated weights for policy 0, policy_version 2128 (0.0006) [2023-09-14 12:45:50,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 19819.3). Total num frames: 8744960. Throughput: 0: 5084.3. Samples: 1175680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:45:50,160][13933] Avg episode reward: [(0, '20.462')] [2023-09-14 12:45:50,160][13971] Saving new best policy, reward=20.462! [2023-09-14 12:45:50,802][13989] Updated weights for policy 0, policy_version 2138 (0.0005) [2023-09-14 12:45:52,836][13989] Updated weights for policy 0, policy_version 2148 (0.0008) [2023-09-14 12:45:54,907][13989] Updated weights for policy 0, policy_version 2158 (0.0006) [2023-09-14 12:45:55,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 19816.0). Total num frames: 8843264. Throughput: 0: 5069.4. Samples: 1205746. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:45:55,160][13933] Avg episode reward: [(0, '18.738')] [2023-09-14 12:45:56,936][13989] Updated weights for policy 0, policy_version 2168 (0.0006) [2023-09-14 12:45:59,003][13989] Updated weights for policy 0, policy_version 2178 (0.0008) [2023-09-14 12:46:00,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20275.2, 300 sec: 19812.9). Total num frames: 8941568. Throughput: 0: 5069.0. Samples: 1220604. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:46:00,160][13933] Avg episode reward: [(0, '17.951')] [2023-09-14 12:46:01,095][13989] Updated weights for policy 0, policy_version 2188 (0.0014) [2023-09-14 12:46:03,204][13989] Updated weights for policy 0, policy_version 2198 (0.0006) [2023-09-14 12:46:05,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20206.9, 300 sec: 19809.9). Total num frames: 9039872. Throughput: 0: 5043.3. Samples: 1249972. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:46:05,160][13933] Avg episode reward: [(0, '20.827')] [2023-09-14 12:46:05,162][13971] Saving new best policy, reward=20.827! 
[2023-09-14 12:46:05,271][13989] Updated weights for policy 0, policy_version 2208 (0.0006) [2023-09-14 12:46:07,359][13989] Updated weights for policy 0, policy_version 2218 (0.0007) [2023-09-14 12:46:09,419][13989] Updated weights for policy 0, policy_version 2228 (0.0005) [2023-09-14 12:46:10,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20138.7, 300 sec: 19807.1). Total num frames: 9138176. Throughput: 0: 5024.1. Samples: 1279616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:46:10,160][13933] Avg episode reward: [(0, '22.436')] [2023-09-14 12:46:10,160][13971] Saving new best policy, reward=22.436! [2023-09-14 12:46:11,519][13989] Updated weights for policy 0, policy_version 2238 (0.0006) [2023-09-14 12:46:13,575][13989] Updated weights for policy 0, policy_version 2248 (0.0006) [2023-09-14 12:46:15,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20070.4, 300 sec: 19804.3). Total num frames: 9236480. Throughput: 0: 5013.0. Samples: 1294440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:46:15,160][13933] Avg episode reward: [(0, '20.159')] [2023-09-14 12:46:15,637][13989] Updated weights for policy 0, policy_version 2258 (0.0008) [2023-09-14 12:46:17,641][13989] Updated weights for policy 0, policy_version 2268 (0.0005) [2023-09-14 12:46:19,683][13989] Updated weights for policy 0, policy_version 2278 (0.0005) [2023-09-14 12:46:20,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 19816.8). Total num frames: 9338880. Throughput: 0: 5000.1. Samples: 1324590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:46:20,160][13933] Avg episode reward: [(0, '20.733')] [2023-09-14 12:46:21,700][13989] Updated weights for policy 0, policy_version 2288 (0.0005) [2023-09-14 12:46:23,746][13989] Updated weights for policy 0, policy_version 2298 (0.0005) [2023-09-14 12:46:25,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20070.9, 300 sec: 19814.0). Total num frames: 9437184. Throughput: 0: 4985.6. Samples: 1354686. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:46:25,160][13933] Avg episode reward: [(0, '20.427')] [2023-09-14 12:46:25,795][13989] Updated weights for policy 0, policy_version 2308 (0.0011) [2023-09-14 12:46:27,805][13989] Updated weights for policy 0, policy_version 2318 (0.0005) [2023-09-14 12:46:29,869][13989] Updated weights for policy 0, policy_version 2328 (0.0007) [2023-09-14 12:46:30,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20070.4, 300 sec: 19825.9). Total num frames: 9539584. Throughput: 0: 4982.5. Samples: 1369770. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:46:30,160][13933] Avg episode reward: [(0, '22.064')] [2023-09-14 12:46:31,888][13989] Updated weights for policy 0, policy_version 2338 (0.0008) [2023-09-14 12:46:33,917][13989] Updated weights for policy 0, policy_version 2348 (0.0006) [2023-09-14 12:46:35,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20070.4, 300 sec: 19837.4). Total num frames: 9641984. Throughput: 0: 4985.9. Samples: 1400046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:46:35,160][13933] Avg episode reward: [(0, '23.641')] [2023-09-14 12:46:35,162][13971] Saving new best policy, reward=23.641! 
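The recurring `Policy #0 lag: (min, avg, max)` statistic describes how many policy versions behind the learner's current policy the consumed experience was collected, which is why it stays in the 0-3 range throughout this run. A hedged sketch of that statistic under this interpretation; the exact aggregation inside Sample Factory may differ:

```python
def lag_stats(current_version: int, rollout_versions: list[int]) -> dict[str, float]:
    """Min/avg/max policy lag for a batch of rollouts.

    `rollout_versions` holds the policy_version each rollout was collected with;
    lag is measured against the learner's current policy_version. This mirrors
    the shape of the 'Policy #0 lag: (min, avg, max)' entries, not necessarily
    the exact bookkeeping used internally.
    """
    lags = [current_version - v for v in rollout_versions]
    return {"min": float(min(lags)),
            "avg": sum(lags) / len(lags),
            "max": float(max(lags))}

print(lag_stats(2208, [2208, 2207, 2208, 2206, 2208, 2207]))
# {'min': 0.0, 'avg': 0.666..., 'max': 2.0} -- the same shape as the log entries
```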
[2023-09-14 12:46:35,941][13989] Updated weights for policy 0, policy_version 2358 (0.0005) [2023-09-14 12:46:37,993][13989] Updated weights for policy 0, policy_version 2368 (0.0006) [2023-09-14 12:46:40,013][13989] Updated weights for policy 0, policy_version 2378 (0.0008) [2023-09-14 12:46:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19834.4). Total num frames: 9740288. Throughput: 0: 4986.3. Samples: 1430128. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:46:40,160][13933] Avg episode reward: [(0, '20.172')] [2023-09-14 12:46:42,024][13989] Updated weights for policy 0, policy_version 2388 (0.0008) [2023-09-14 12:46:44,052][13989] Updated weights for policy 0, policy_version 2398 (0.0006) [2023-09-14 12:46:45,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19845.4). Total num frames: 9842688. Throughput: 0: 4996.4. Samples: 1445442. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:46:45,160][13933] Avg episode reward: [(0, '20.903')] [2023-09-14 12:46:46,041][13989] Updated weights for policy 0, policy_version 2408 (0.0005) [2023-09-14 12:46:48,099][13989] Updated weights for policy 0, policy_version 2418 (0.0005) [2023-09-14 12:46:50,130][13989] Updated weights for policy 0, policy_version 2428 (0.0008) [2023-09-14 12:46:50,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20002.1, 300 sec: 20119.0). Total num frames: 9945088. Throughput: 0: 5020.3. Samples: 1475884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:46:50,160][13933] Avg episode reward: [(0, '21.077')] [2023-09-14 12:46:52,209][13989] Updated weights for policy 0, policy_version 2438 (0.0008) [2023-09-14 12:46:54,267][13989] Updated weights for policy 0, policy_version 2448 (0.0005) [2023-09-14 12:46:55,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 20132.9). Total num frames: 10043392. Throughput: 0: 5025.7. Samples: 1505772. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:46:55,160][13933] Avg episode reward: [(0, '21.279')] [2023-09-14 12:46:56,259][13989] Updated weights for policy 0, policy_version 2458 (0.0005) [2023-09-14 12:46:58,299][13989] Updated weights for policy 0, policy_version 2468 (0.0005) [2023-09-14 12:47:00,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20070.4, 300 sec: 20146.8). Total num frames: 10145792. Throughput: 0: 5033.7. Samples: 1520956. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:47:00,160][13933] Avg episode reward: [(0, '19.694')] [2023-09-14 12:47:00,323][13989] Updated weights for policy 0, policy_version 2478 (0.0005) [2023-09-14 12:47:02,316][13989] Updated weights for policy 0, policy_version 2488 (0.0005) [2023-09-14 12:47:04,346][13989] Updated weights for policy 0, policy_version 2498 (0.0008) [2023-09-14 12:47:05,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20070.4, 300 sec: 20146.8). Total num frames: 10244096. Throughput: 0: 5040.8. Samples: 1551426. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:47:05,160][13933] Avg episode reward: [(0, '24.829')] [2023-09-14 12:47:05,200][13971] Saving new best policy, reward=24.829! [2023-09-14 12:47:06,380][13989] Updated weights for policy 0, policy_version 2508 (0.0008) [2023-09-14 12:47:08,392][13989] Updated weights for policy 0, policy_version 2518 (0.0005) [2023-09-14 12:47:10,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20138.7, 300 sec: 20160.6). Total num frames: 10346496. Throughput: 0: 5047.8. Samples: 1581838. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:47:10,160][13933] Avg episode reward: [(0, '20.766')] [2023-09-14 12:47:10,396][13989] Updated weights for policy 0, policy_version 2528 (0.0008) [2023-09-14 12:47:12,428][13989] Updated weights for policy 0, policy_version 2538 (0.0005) [2023-09-14 12:47:14,433][13989] Updated weights for policy 0, policy_version 2548 (0.0005) [2023-09-14 12:47:15,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20206.9, 300 sec: 20174.5). Total num frames: 10448896. Throughput: 0: 5050.6. Samples: 1597046. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:47:15,160][13933] Avg episode reward: [(0, '24.985')] [2023-09-14 12:47:15,162][13971] Saving new best policy, reward=24.985! [2023-09-14 12:47:16,426][13989] Updated weights for policy 0, policy_version 2558 (0.0010) [2023-09-14 12:47:18,455][13989] Updated weights for policy 0, policy_version 2568 (0.0006) [2023-09-14 12:47:20,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20206.9, 300 sec: 20188.4). Total num frames: 10551296. Throughput: 0: 5058.7. Samples: 1627686. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:47:20,160][13933] Avg episode reward: [(0, '21.502')] [2023-09-14 12:47:20,472][13989] Updated weights for policy 0, policy_version 2578 (0.0011) [2023-09-14 12:47:22,516][13989] Updated weights for policy 0, policy_version 2588 (0.0010) [2023-09-14 12:47:24,527][13989] Updated weights for policy 0, policy_version 2598 (0.0008) [2023-09-14 12:47:25,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20202.3). Total num frames: 10653696. Throughput: 0: 5062.3. Samples: 1657932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:47:25,160][13933] Avg episode reward: [(0, '22.026')] [2023-09-14 12:47:26,540][13989] Updated weights for policy 0, policy_version 2608 (0.0010) [2023-09-14 12:47:28,527][13989] Updated weights for policy 0, policy_version 2618 (0.0005) [2023-09-14 12:47:30,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20275.2, 300 sec: 20202.3). Total num frames: 10756096. Throughput: 0: 5063.9. Samples: 1673318. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:47:30,160][13933] Avg episode reward: [(0, '22.278')] [2023-09-14 12:47:30,518][13989] Updated weights for policy 0, policy_version 2628 (0.0010) [2023-09-14 12:47:32,529][13989] Updated weights for policy 0, policy_version 2638 (0.0006) [2023-09-14 12:47:34,534][13989] Updated weights for policy 0, policy_version 2648 (0.0006) [2023-09-14 12:47:35,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20216.2). Total num frames: 10858496. Throughput: 0: 5069.6. Samples: 1704016. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:47:35,160][13933] Avg episode reward: [(0, '23.066')] [2023-09-14 12:47:36,559][13989] Updated weights for policy 0, policy_version 2658 (0.0011) [2023-09-14 12:47:38,584][13989] Updated weights for policy 0, policy_version 2668 (0.0008) [2023-09-14 12:47:40,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20275.2, 300 sec: 20216.2). Total num frames: 10956800. Throughput: 0: 5081.1. Samples: 1734420. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:47:40,160][13933] Avg episode reward: [(0, '25.416')] [2023-09-14 12:47:40,197][13971] Saving new best policy, reward=25.416! 
[2023-09-14 12:47:40,608][13989] Updated weights for policy 0, policy_version 2678 (0.0005) [2023-09-14 12:47:42,636][13989] Updated weights for policy 0, policy_version 2688 (0.0006) [2023-09-14 12:47:44,671][13989] Updated weights for policy 0, policy_version 2698 (0.0005) [2023-09-14 12:47:45,160][13933] Fps is (10 sec: 20070.1, 60 sec: 20275.1, 300 sec: 20230.1). Total num frames: 11059200. Throughput: 0: 5081.4. Samples: 1749618. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:47:45,160][13933] Avg episode reward: [(0, '22.503')] [2023-09-14 12:47:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002700_11059200.pth... [2023-09-14 12:47:45,208][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000001514_6201344.pth [2023-09-14 12:47:46,763][13989] Updated weights for policy 0, policy_version 2708 (0.0006) [2023-09-14 12:47:48,863][13989] Updated weights for policy 0, policy_version 2718 (0.0006) [2023-09-14 12:47:50,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20206.9, 300 sec: 20230.1). Total num frames: 11157504. Throughput: 0: 5064.4. Samples: 1779322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:47:50,160][13933] Avg episode reward: [(0, '23.540')] [2023-09-14 12:47:50,913][13989] Updated weights for policy 0, policy_version 2728 (0.0005) [2023-09-14 12:47:52,844][13989] Updated weights for policy 0, policy_version 2738 (0.0010) [2023-09-14 12:47:54,634][13989] Updated weights for policy 0, policy_version 2748 (0.0008) [2023-09-14 12:47:55,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20343.5, 300 sec: 20257.8). Total num frames: 11264000. Throughput: 0: 5087.6. Samples: 1810780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:47:55,160][13933] Avg episode reward: [(0, '22.861')] [2023-09-14 12:47:56,484][13989] Updated weights for policy 0, policy_version 2758 (0.0005) [2023-09-14 12:47:58,323][13989] Updated weights for policy 0, policy_version 2768 (0.0005) [2023-09-14 12:48:00,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20480.0, 300 sec: 20285.6). Total num frames: 11374592. Throughput: 0: 5121.5. Samples: 1827512. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:48:00,160][13933] Avg episode reward: [(0, '21.409')] [2023-09-14 12:48:00,177][13989] Updated weights for policy 0, policy_version 2778 (0.0007) [2023-09-14 12:48:02,041][13989] Updated weights for policy 0, policy_version 2788 (0.0007) [2023-09-14 12:48:03,866][13989] Updated weights for policy 0, policy_version 2798 (0.0005) [2023-09-14 12:48:05,160][13933] Fps is (10 sec: 22118.4, 60 sec: 20684.8, 300 sec: 20327.3). Total num frames: 11485184. Throughput: 0: 5179.1. Samples: 1860746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:48:05,160][13933] Avg episode reward: [(0, '22.399')] [2023-09-14 12:48:05,782][13989] Updated weights for policy 0, policy_version 2808 (0.0008) [2023-09-14 12:48:07,681][13989] Updated weights for policy 0, policy_version 2818 (0.0008) [2023-09-14 12:48:09,519][13989] Updated weights for policy 0, policy_version 2828 (0.0005) [2023-09-14 12:48:10,160][13933] Fps is (10 sec: 22118.4, 60 sec: 20821.4, 300 sec: 20355.0). Total num frames: 11595776. Throughput: 0: 5234.2. Samples: 1893470. 
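The `.pth` files written above are presumably ordinary `torch.save` artifacts (the extension and the PyTorch-based learner suggest as much), so they can be opened outside the training run for inspection. A cautious sketch that only lists what the file contains rather than assuming any particular dictionary keys, since those depend on the Sample Factory version:

```python
import torch

# Path copied from the 'Saving ...' entry above.
CKPT = "./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002700_11059200.pth"

# map_location="cpu" lets the file be inspected on a machine without the training GPU.
checkpoint = torch.load(CKPT, map_location="cpu")

# Only report what is actually stored; the exact keys are version-dependent.
if isinstance(checkpoint, dict):
    for key, value in checkpoint.items():
        print(key, type(value).__name__)
else:
    print(type(checkpoint).__name__)
```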
Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:48:10,160][13933] Avg episode reward: [(0, '21.363')] [2023-09-14 12:48:11,395][13989] Updated weights for policy 0, policy_version 2838 (0.0008) [2023-09-14 12:48:13,242][13989] Updated weights for policy 0, policy_version 2848 (0.0005) [2023-09-14 12:48:15,103][13989] Updated weights for policy 0, policy_version 2858 (0.0005) [2023-09-14 12:48:15,160][13933] Fps is (10 sec: 22118.4, 60 sec: 20957.9, 300 sec: 20396.7). Total num frames: 11706368. Throughput: 0: 5256.9. Samples: 1909878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:48:15,160][13933] Avg episode reward: [(0, '19.477')] [2023-09-14 12:48:16,987][13989] Updated weights for policy 0, policy_version 2868 (0.0005) [2023-09-14 12:48:18,858][13989] Updated weights for policy 0, policy_version 2878 (0.0005) [2023-09-14 12:48:20,160][13933] Fps is (10 sec: 22118.4, 60 sec: 21094.4, 300 sec: 20424.5). Total num frames: 11816960. Throughput: 0: 5305.0. Samples: 1942740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:48:20,160][13933] Avg episode reward: [(0, '23.911')] [2023-09-14 12:48:20,714][13989] Updated weights for policy 0, policy_version 2888 (0.0005) [2023-09-14 12:48:22,580][13989] Updated weights for policy 0, policy_version 2898 (0.0005) [2023-09-14 12:48:24,443][13989] Updated weights for policy 0, policy_version 2908 (0.0005) [2023-09-14 12:48:25,160][13933] Fps is (10 sec: 21708.7, 60 sec: 21162.7, 300 sec: 20452.2). Total num frames: 11923456. Throughput: 0: 5363.0. Samples: 1975756. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:48:25,160][13933] Avg episode reward: [(0, '22.728')] [2023-09-14 12:48:26,335][13989] Updated weights for policy 0, policy_version 2918 (0.0008) [2023-09-14 12:48:28,191][13989] Updated weights for policy 0, policy_version 2928 (0.0010) [2023-09-14 12:48:30,015][13989] Updated weights for policy 0, policy_version 2938 (0.0007) [2023-09-14 12:48:30,160][13933] Fps is (10 sec: 21708.8, 60 sec: 21299.2, 300 sec: 20480.0). Total num frames: 12034048. Throughput: 0: 5387.9. Samples: 1992074. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:48:30,160][13933] Avg episode reward: [(0, '24.630')] [2023-09-14 12:48:31,882][13989] Updated weights for policy 0, policy_version 2948 (0.0005) [2023-09-14 12:48:33,740][13989] Updated weights for policy 0, policy_version 2958 (0.0005) [2023-09-14 12:48:35,160][13933] Fps is (10 sec: 22118.0, 60 sec: 21435.7, 300 sec: 20507.8). Total num frames: 12144640. Throughput: 0: 5468.6. Samples: 2025410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:48:35,160][13933] Avg episode reward: [(0, '22.508')] [2023-09-14 12:48:35,568][13989] Updated weights for policy 0, policy_version 2968 (0.0005) [2023-09-14 12:48:37,415][13989] Updated weights for policy 0, policy_version 2978 (0.0005) [2023-09-14 12:48:39,241][13989] Updated weights for policy 0, policy_version 2988 (0.0005) [2023-09-14 12:48:40,160][13933] Fps is (10 sec: 22527.5, 60 sec: 21708.7, 300 sec: 20549.4). Total num frames: 12259328. Throughput: 0: 5512.2. Samples: 2058828. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:48:40,160][13933] Avg episode reward: [(0, '23.844')] [2023-09-14 12:48:41,067][13989] Updated weights for policy 0, policy_version 2998 (0.0005) [2023-09-14 12:48:42,911][13989] Updated weights for policy 0, policy_version 3008 (0.0005) [2023-09-14 12:48:44,767][13989] Updated weights for policy 0, policy_version 3018 (0.0005) [2023-09-14 12:48:45,160][13933] Fps is (10 sec: 22528.5, 60 sec: 21845.4, 300 sec: 20563.3). Total num frames: 12369920. Throughput: 0: 5511.4. Samples: 2075524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:48:45,160][13933] Avg episode reward: [(0, '24.043')] [2023-09-14 12:48:46,606][13989] Updated weights for policy 0, policy_version 3028 (0.0005) [2023-09-14 12:48:48,464][13989] Updated weights for policy 0, policy_version 3038 (0.0005) [2023-09-14 12:48:50,160][13933] Fps is (10 sec: 22118.9, 60 sec: 22050.1, 300 sec: 20591.1). Total num frames: 12480512. Throughput: 0: 5512.9. Samples: 2108828. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:48:50,160][13933] Avg episode reward: [(0, '20.217')] [2023-09-14 12:48:50,310][13989] Updated weights for policy 0, policy_version 3048 (0.0005) [2023-09-14 12:48:52,162][13989] Updated weights for policy 0, policy_version 3058 (0.0005) [2023-09-14 12:48:54,026][13989] Updated weights for policy 0, policy_version 3068 (0.0008) [2023-09-14 12:48:55,160][13933] Fps is (10 sec: 22118.4, 60 sec: 22118.4, 300 sec: 20618.8). Total num frames: 12591104. Throughput: 0: 5518.7. Samples: 2141810. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:48:55,160][13933] Avg episode reward: [(0, '26.129')] [2023-09-14 12:48:55,162][13971] Saving new best policy, reward=26.129! [2023-09-14 12:48:55,884][13989] Updated weights for policy 0, policy_version 3078 (0.0005) [2023-09-14 12:48:57,711][13989] Updated weights for policy 0, policy_version 3088 (0.0008) [2023-09-14 12:48:59,547][13989] Updated weights for policy 0, policy_version 3098 (0.0005) [2023-09-14 12:49:00,160][13933] Fps is (10 sec: 22118.2, 60 sec: 22118.4, 300 sec: 20646.6). Total num frames: 12701696. Throughput: 0: 5524.7. Samples: 2158490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:00,160][13933] Avg episode reward: [(0, '23.979')] [2023-09-14 12:49:01,425][13989] Updated weights for policy 0, policy_version 3108 (0.0005) [2023-09-14 12:49:03,305][13989] Updated weights for policy 0, policy_version 3118 (0.0005) [2023-09-14 12:49:05,160][13933] Fps is (10 sec: 21708.7, 60 sec: 22050.1, 300 sec: 20660.5). Total num frames: 12808192. Throughput: 0: 5529.7. Samples: 2191578. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:49:05,160][13933] Avg episode reward: [(0, '21.992')] [2023-09-14 12:49:05,197][13989] Updated weights for policy 0, policy_version 3128 (0.0005) [2023-09-14 12:49:07,103][13989] Updated weights for policy 0, policy_version 3138 (0.0005) [2023-09-14 12:49:08,914][13989] Updated weights for policy 0, policy_version 3148 (0.0005) [2023-09-14 12:49:10,160][13933] Fps is (10 sec: 21708.9, 60 sec: 22050.1, 300 sec: 20688.3). Total num frames: 12918784. Throughput: 0: 5525.9. Samples: 2224420. 
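The `Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)` entries are windowed rates over the total frame counter: frames accumulated within roughly the last 10, 60, and 300 seconds divided by the corresponding elapsed time, which is why the 300-second column lags the 10-second column early in the run and converges toward it later. A minimal sketch of such a reporter; the window lengths come from the log, everything else is a stand-in:

```python
import time
from collections import deque

class FpsReporter:
    """Windowed FPS over a monotonically increasing frame counter."""

    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.history = deque()  # (timestamp, total_frames) samples

    def record(self, total_frames, now=None):
        """Add a sample and return {window_seconds: frames_per_second}."""
        now = time.monotonic() if now is None else now
        self.history.append((now, total_frames))
        # Drop samples older than the largest window.
        while self.history and now - self.history[0][0] > max(self.windows):
            self.history.popleft()
        rates = {}
        for w in self.windows:
            # Oldest retained sample that still falls inside this window.
            past = next(((t, f) for t, f in self.history if now - t <= w), None)
            if past is None or now == past[0]:
                rates[w] = 0.0
            else:
                rates[w] = (total_frames - past[1]) / (now - past[0])
        return rates

reporter = FpsReporter()
for step, frames in enumerate(range(0, 102400, 20480)):  # ~20k frames per second
    print(reporter.record(frames, now=float(step)))
```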
Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:49:10,160][13933] Avg episode reward: [(0, '22.875')] [2023-09-14 12:49:10,772][13989] Updated weights for policy 0, policy_version 3158 (0.0005) [2023-09-14 12:49:12,603][13989] Updated weights for policy 0, policy_version 3168 (0.0005) [2023-09-14 12:49:14,411][13989] Updated weights for policy 0, policy_version 3178 (0.0007) [2023-09-14 12:49:15,159][13933] Fps is (10 sec: 22528.2, 60 sec: 22118.4, 300 sec: 20729.9). Total num frames: 13033472. Throughput: 0: 5536.3. Samples: 2241208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:15,160][13933] Avg episode reward: [(0, '23.778')] [2023-09-14 12:49:16,265][13989] Updated weights for policy 0, policy_version 3188 (0.0005) [2023-09-14 12:49:18,094][13989] Updated weights for policy 0, policy_version 3198 (0.0010) [2023-09-14 12:49:19,923][13989] Updated weights for policy 0, policy_version 3208 (0.0005) [2023-09-14 12:49:20,160][13933] Fps is (10 sec: 22528.0, 60 sec: 22118.4, 300 sec: 20757.7). Total num frames: 13144064. Throughput: 0: 5538.9. Samples: 2274660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:20,160][13933] Avg episode reward: [(0, '22.404')] [2023-09-14 12:49:21,820][13989] Updated weights for policy 0, policy_version 3218 (0.0005) [2023-09-14 12:49:23,684][13989] Updated weights for policy 0, policy_version 3228 (0.0005) [2023-09-14 12:49:25,160][13933] Fps is (10 sec: 21708.5, 60 sec: 22118.4, 300 sec: 20785.5). Total num frames: 13250560. Throughput: 0: 5529.4. Samples: 2307650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:25,160][13933] Avg episode reward: [(0, '23.199')] [2023-09-14 12:49:25,563][13989] Updated weights for policy 0, policy_version 3238 (0.0005) [2023-09-14 12:49:27,448][13989] Updated weights for policy 0, policy_version 3248 (0.0005) [2023-09-14 12:49:29,304][13989] Updated weights for policy 0, policy_version 3258 (0.0007) [2023-09-14 12:49:30,160][13933] Fps is (10 sec: 21708.8, 60 sec: 22118.4, 300 sec: 20813.2). Total num frames: 13361152. Throughput: 0: 5522.0. Samples: 2324014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:30,160][13933] Avg episode reward: [(0, '26.322')] [2023-09-14 12:49:30,199][13971] Saving new best policy, reward=26.322! [2023-09-14 12:49:31,131][13989] Updated weights for policy 0, policy_version 3268 (0.0005) [2023-09-14 12:49:32,967][13989] Updated weights for policy 0, policy_version 3278 (0.0007) [2023-09-14 12:49:34,808][13989] Updated weights for policy 0, policy_version 3288 (0.0005) [2023-09-14 12:49:35,160][13933] Fps is (10 sec: 22118.5, 60 sec: 22118.5, 300 sec: 20841.0). Total num frames: 13471744. Throughput: 0: 5523.0. Samples: 2357362. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:49:35,160][13933] Avg episode reward: [(0, '22.571')] [2023-09-14 12:49:36,648][13989] Updated weights for policy 0, policy_version 3298 (0.0005) [2023-09-14 12:49:38,470][13989] Updated weights for policy 0, policy_version 3308 (0.0008) [2023-09-14 12:49:40,160][13933] Fps is (10 sec: 22528.0, 60 sec: 22118.5, 300 sec: 20896.5). Total num frames: 13586432. Throughput: 0: 5529.9. Samples: 2390654. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:49:40,160][13933] Avg episode reward: [(0, '24.325')] [2023-09-14 12:49:40,335][13989] Updated weights for policy 0, policy_version 3318 (0.0005) [2023-09-14 12:49:42,203][13989] Updated weights for policy 0, policy_version 3328 (0.0005) [2023-09-14 12:49:44,063][13989] Updated weights for policy 0, policy_version 3338 (0.0005) [2023-09-14 12:49:45,160][13933] Fps is (10 sec: 22118.2, 60 sec: 22050.1, 300 sec: 20910.4). Total num frames: 13692928. Throughput: 0: 5524.2. Samples: 2407080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:45,160][13933] Avg episode reward: [(0, '24.158')] [2023-09-14 12:49:45,182][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000003344_13697024.pth... [2023-09-14 12:49:45,224][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002111_8646656.pth [2023-09-14 12:49:45,947][13989] Updated weights for policy 0, policy_version 3348 (0.0005) [2023-09-14 12:49:47,812][13989] Updated weights for policy 0, policy_version 3358 (0.0005) [2023-09-14 12:49:49,672][13989] Updated weights for policy 0, policy_version 3368 (0.0005) [2023-09-14 12:49:50,160][13933] Fps is (10 sec: 21708.7, 60 sec: 22050.1, 300 sec: 20938.2). Total num frames: 13803520. Throughput: 0: 5523.1. Samples: 2440118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:49:50,160][13933] Avg episode reward: [(0, '24.816')] [2023-09-14 12:49:51,693][13989] Updated weights for policy 0, policy_version 3378 (0.0005) [2023-09-14 12:49:53,711][13989] Updated weights for policy 0, policy_version 3388 (0.0006) [2023-09-14 12:49:55,160][13933] Fps is (10 sec: 21299.4, 60 sec: 21913.6, 300 sec: 20952.1). Total num frames: 13905920. Throughput: 0: 5481.5. Samples: 2471088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:49:55,160][13933] Avg episode reward: [(0, '23.662')] [2023-09-14 12:49:55,744][13989] Updated weights for policy 0, policy_version 3398 (0.0005) [2023-09-14 12:49:57,747][13989] Updated weights for policy 0, policy_version 3408 (0.0005) [2023-09-14 12:49:59,762][13989] Updated weights for policy 0, policy_version 3418 (0.0006) [2023-09-14 12:50:00,160][13933] Fps is (10 sec: 20070.6, 60 sec: 21708.8, 300 sec: 20938.2). Total num frames: 14004224. Throughput: 0: 5445.9. Samples: 2486272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:50:00,160][13933] Avg episode reward: [(0, '21.593')] [2023-09-14 12:50:01,777][13989] Updated weights for policy 0, policy_version 3428 (0.0005) [2023-09-14 12:50:03,792][13989] Updated weights for policy 0, policy_version 3438 (0.0005) [2023-09-14 12:50:05,160][13933] Fps is (10 sec: 20070.4, 60 sec: 21640.5, 300 sec: 20938.2). Total num frames: 14106624. Throughput: 0: 5379.8. Samples: 2516750. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:50:05,160][13933] Avg episode reward: [(0, '21.333')] [2023-09-14 12:50:05,820][13989] Updated weights for policy 0, policy_version 3448 (0.0005) [2023-09-14 12:50:07,843][13989] Updated weights for policy 0, policy_version 3458 (0.0005) [2023-09-14 12:50:09,825][13989] Updated weights for policy 0, policy_version 3468 (0.0008) [2023-09-14 12:50:10,160][13933] Fps is (10 sec: 20479.9, 60 sec: 21504.0, 300 sec: 20938.2). Total num frames: 14209024. Throughput: 0: 5326.6. Samples: 2547348. 
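Each periodic `Saving .../checkpoint_...pth` entry above is paired with a `Removing` of an older numbered checkpoint, so only the most recent few stay on disk; in this run every save removes exactly one file, which is consistent with keeping the two latest numbered checkpoints while best-policy checkpoints are tracked separately. A sketch of that retention policy, with `keep_last` and the byte-blob payload as assumed stand-ins for the real serialization:

```python
from pathlib import Path

def save_and_prune(ckpt_dir: str, filename: str, blob: bytes, keep_last: int = 2) -> None:
    """Write a new checkpoint and delete the oldest ones beyond `keep_last`."""
    directory = Path(ckpt_dir)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / filename).write_bytes(blob)  # stand-in for torch.save
    # Zero-padded version numbers make lexicographic order equal to version order.
    checkpoints = sorted(directory.glob("checkpoint_*.pth"))
    for old in checkpoints[:-keep_last]:
        print(f"Removing {old}")
        old.unlink()

# Usage sketch with a dummy payload and a scratch directory.
save_and_prune("/tmp/ckpt_demo", "checkpoint_000000003_12288.pth", b"demo")
```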
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:50:10,160][13933] Avg episode reward: [(0, '21.717')] [2023-09-14 12:50:11,883][13989] Updated weights for policy 0, policy_version 3478 (0.0005) [2023-09-14 12:50:13,864][13989] Updated weights for policy 0, policy_version 3488 (0.0011) [2023-09-14 12:50:15,160][13933] Fps is (10 sec: 20479.9, 60 sec: 21299.2, 300 sec: 20952.1). Total num frames: 14311424. Throughput: 0: 5297.3. Samples: 2562394. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:50:15,161][13933] Avg episode reward: [(0, '20.889')] [2023-09-14 12:50:15,911][13989] Updated weights for policy 0, policy_version 3498 (0.0005) [2023-09-14 12:50:18,023][13989] Updated weights for policy 0, policy_version 3508 (0.0008) [2023-09-14 12:50:20,049][13989] Updated weights for policy 0, policy_version 3518 (0.0008) [2023-09-14 12:50:20,160][13933] Fps is (10 sec: 20070.4, 60 sec: 21094.4, 300 sec: 20938.3). Total num frames: 14409728. Throughput: 0: 5223.4. Samples: 2592414. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:50:20,160][13933] Avg episode reward: [(0, '23.253')] [2023-09-14 12:50:22,063][13989] Updated weights for policy 0, policy_version 3528 (0.0005) [2023-09-14 12:50:24,119][13989] Updated weights for policy 0, policy_version 3538 (0.0005) [2023-09-14 12:50:25,160][13933] Fps is (10 sec: 20070.3, 60 sec: 21026.1, 300 sec: 20938.2). Total num frames: 14512128. Throughput: 0: 5156.0. Samples: 2622676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:50:25,160][13933] Avg episode reward: [(0, '23.339')] [2023-09-14 12:50:26,164][13989] Updated weights for policy 0, policy_version 3548 (0.0009) [2023-09-14 12:50:28,200][13989] Updated weights for policy 0, policy_version 3558 (0.0005) [2023-09-14 12:50:30,160][13933] Fps is (10 sec: 20069.7, 60 sec: 20821.2, 300 sec: 20924.3). Total num frames: 14610432. Throughput: 0: 5121.2. Samples: 2637536. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:50:30,160][13933] Avg episode reward: [(0, '24.445')] [2023-09-14 12:50:30,252][13989] Updated weights for policy 0, policy_version 3568 (0.0006) [2023-09-14 12:50:32,311][13989] Updated weights for policy 0, policy_version 3578 (0.0005) [2023-09-14 12:50:34,310][13989] Updated weights for policy 0, policy_version 3588 (0.0005) [2023-09-14 12:50:35,160][13933] Fps is (10 sec: 20070.7, 60 sec: 20684.8, 300 sec: 20924.3). Total num frames: 14712832. Throughput: 0: 5058.6. Samples: 2667756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:50:35,160][13933] Avg episode reward: [(0, '22.641')] [2023-09-14 12:50:36,392][13989] Updated weights for policy 0, policy_version 3598 (0.0008) [2023-09-14 12:50:38,466][13989] Updated weights for policy 0, policy_version 3608 (0.0005) [2023-09-14 12:50:40,160][13933] Fps is (10 sec: 20071.0, 60 sec: 20411.7, 300 sec: 20910.4). Total num frames: 14811136. Throughput: 0: 5032.7. Samples: 2697560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:50:40,160][13933] Avg episode reward: [(0, '21.929')] [2023-09-14 12:50:40,547][13989] Updated weights for policy 0, policy_version 3618 (0.0006) [2023-09-14 12:50:42,565][13989] Updated weights for policy 0, policy_version 3628 (0.0005) [2023-09-14 12:50:44,571][13989] Updated weights for policy 0, policy_version 3638 (0.0005) [2023-09-14 12:50:45,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20275.2, 300 sec: 20896.5). Total num frames: 14909440. Throughput: 0: 5027.0. Samples: 2712488. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:50:45,160][13933] Avg episode reward: [(0, '22.082')] [2023-09-14 12:50:46,590][13989] Updated weights for policy 0, policy_version 3648 (0.0008) [2023-09-14 12:50:48,648][13989] Updated weights for policy 0, policy_version 3658 (0.0005) [2023-09-14 12:50:50,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20138.6, 300 sec: 20910.4). Total num frames: 15011840. Throughput: 0: 5024.6. Samples: 2742858. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:50:50,160][13933] Avg episode reward: [(0, '24.449')] [2023-09-14 12:50:50,720][13989] Updated weights for policy 0, policy_version 3668 (0.0006) [2023-09-14 12:50:52,781][13989] Updated weights for policy 0, policy_version 3678 (0.0008) [2023-09-14 12:50:54,847][13989] Updated weights for policy 0, policy_version 3688 (0.0008) [2023-09-14 12:50:55,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20070.4, 300 sec: 20910.4). Total num frames: 15110144. Throughput: 0: 5005.9. Samples: 2772612. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:50:55,160][13933] Avg episode reward: [(0, '25.569')] [2023-09-14 12:50:56,877][13989] Updated weights for policy 0, policy_version 3698 (0.0008) [2023-09-14 12:50:58,858][13989] Updated weights for policy 0, policy_version 3708 (0.0005) [2023-09-14 12:51:00,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20138.6, 300 sec: 20924.3). Total num frames: 15212544. Throughput: 0: 5008.2. Samples: 2787764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:51:00,160][13933] Avg episode reward: [(0, '24.038')] [2023-09-14 12:51:00,930][13989] Updated weights for policy 0, policy_version 3718 (0.0011) [2023-09-14 12:51:03,011][13989] Updated weights for policy 0, policy_version 3728 (0.0005) [2023-09-14 12:51:05,030][13989] Updated weights for policy 0, policy_version 3738 (0.0008) [2023-09-14 12:51:05,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20070.4, 300 sec: 20924.3). Total num frames: 15310848. Throughput: 0: 5006.8. Samples: 2817718. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:51:05,160][13933] Avg episode reward: [(0, '28.088')] [2023-09-14 12:51:05,163][13971] Saving new best policy, reward=28.088! [2023-09-14 12:51:07,089][13989] Updated weights for policy 0, policy_version 3748 (0.0006) [2023-09-14 12:51:09,119][13989] Updated weights for policy 0, policy_version 3758 (0.0006) [2023-09-14 12:51:10,160][13933] Fps is (10 sec: 19661.2, 60 sec: 20002.1, 300 sec: 20924.3). Total num frames: 15409152. Throughput: 0: 5001.4. Samples: 2847736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:51:10,160][13933] Avg episode reward: [(0, '26.846')] [2023-09-14 12:51:11,195][13989] Updated weights for policy 0, policy_version 3768 (0.0006) [2023-09-14 12:51:13,201][13989] Updated weights for policy 0, policy_version 3778 (0.0008) [2023-09-14 12:51:15,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 20924.3). Total num frames: 15511552. Throughput: 0: 5005.5. Samples: 2862782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:51:15,160][13933] Avg episode reward: [(0, '25.048')] [2023-09-14 12:51:15,232][13989] Updated weights for policy 0, policy_version 3788 (0.0005) [2023-09-14 12:51:17,249][13989] Updated weights for policy 0, policy_version 3798 (0.0005) [2023-09-14 12:51:19,255][13989] Updated weights for policy 0, policy_version 3808 (0.0005) [2023-09-14 12:51:20,160][13933] Fps is (10 sec: 20479.9, 60 sec: 20070.4, 300 sec: 20938.2). Total num frames: 15613952. 
Throughput: 0: 5011.5. Samples: 2893272. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:51:20,160][13933] Avg episode reward: [(0, '25.806')] [2023-09-14 12:51:21,343][13989] Updated weights for policy 0, policy_version 3818 (0.0005) [2023-09-14 12:51:23,491][13989] Updated weights for policy 0, policy_version 3828 (0.0008) [2023-09-14 12:51:25,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20002.1, 300 sec: 20924.3). Total num frames: 15712256. Throughput: 0: 5005.7. Samples: 2922816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:51:25,160][13933] Avg episode reward: [(0, '24.120')] [2023-09-14 12:51:25,534][13989] Updated weights for policy 0, policy_version 3838 (0.0008) [2023-09-14 12:51:27,637][13989] Updated weights for policy 0, policy_version 3848 (0.0006) [2023-09-14 12:51:29,687][13989] Updated weights for policy 0, policy_version 3858 (0.0006) [2023-09-14 12:51:30,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20002.3, 300 sec: 20910.4). Total num frames: 15810560. Throughput: 0: 5000.8. Samples: 2937522. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:51:30,160][13933] Avg episode reward: [(0, '24.479')] [2023-09-14 12:51:31,799][13989] Updated weights for policy 0, policy_version 3868 (0.0008) [2023-09-14 12:51:33,821][13989] Updated weights for policy 0, policy_version 3878 (0.0005) [2023-09-14 12:51:35,160][13933] Fps is (10 sec: 19661.0, 60 sec: 19933.9, 300 sec: 20910.4). Total num frames: 15908864. Throughput: 0: 4986.9. Samples: 2967270. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:51:35,160][13933] Avg episode reward: [(0, '24.627')] [2023-09-14 12:51:35,871][13989] Updated weights for policy 0, policy_version 3888 (0.0006) [2023-09-14 12:51:37,957][13989] Updated weights for policy 0, policy_version 3898 (0.0006) [2023-09-14 12:51:40,029][13989] Updated weights for policy 0, policy_version 3908 (0.0006) [2023-09-14 12:51:40,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19933.9, 300 sec: 20896.5). Total num frames: 16007168. Throughput: 0: 4984.4. Samples: 2996910. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:51:40,160][13933] Avg episode reward: [(0, '23.050')] [2023-09-14 12:51:42,105][13989] Updated weights for policy 0, policy_version 3918 (0.0008) [2023-09-14 12:51:44,216][13989] Updated weights for policy 0, policy_version 3928 (0.0008) [2023-09-14 12:51:45,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19933.9, 300 sec: 20882.7). Total num frames: 16105472. Throughput: 0: 4977.5. Samples: 3011752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:51:45,160][13933] Avg episode reward: [(0, '26.253')] [2023-09-14 12:51:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000003932_16105472.pth... [2023-09-14 12:51:45,211][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002700_11059200.pth [2023-09-14 12:51:46,274][13989] Updated weights for policy 0, policy_version 3938 (0.0006) [2023-09-14 12:51:48,395][13989] Updated weights for policy 0, policy_version 3948 (0.0008) [2023-09-14 12:51:50,160][13933] Fps is (10 sec: 19660.3, 60 sec: 19865.6, 300 sec: 20882.6). Total num frames: 16203776. Throughput: 0: 4965.2. Samples: 3041152. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:51:50,160][13933] Avg episode reward: [(0, '25.416')] [2023-09-14 12:51:50,479][13989] Updated weights for policy 0, policy_version 3958 (0.0006) [2023-09-14 12:51:52,551][13989] Updated weights for policy 0, policy_version 3968 (0.0006) [2023-09-14 12:51:54,637][13989] Updated weights for policy 0, policy_version 3978 (0.0006) [2023-09-14 12:51:55,160][13933] Fps is (10 sec: 19660.6, 60 sec: 19865.6, 300 sec: 20868.8). Total num frames: 16302080. Throughput: 0: 4953.4. Samples: 3070638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:51:55,160][13933] Avg episode reward: [(0, '27.765')] [2023-09-14 12:51:56,734][13989] Updated weights for policy 0, policy_version 3988 (0.0005) [2023-09-14 12:51:58,821][13989] Updated weights for policy 0, policy_version 3998 (0.0008) [2023-09-14 12:52:00,160][13933] Fps is (10 sec: 19661.3, 60 sec: 19797.4, 300 sec: 20868.8). Total num frames: 16400384. Throughput: 0: 4944.6. Samples: 3085288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:52:00,160][13933] Avg episode reward: [(0, '25.304')] [2023-09-14 12:52:00,926][13989] Updated weights for policy 0, policy_version 4008 (0.0005) [2023-09-14 12:52:02,984][13989] Updated weights for policy 0, policy_version 4018 (0.0006) [2023-09-14 12:52:05,037][13989] Updated weights for policy 0, policy_version 4028 (0.0006) [2023-09-14 12:52:05,160][13933] Fps is (10 sec: 19661.1, 60 sec: 19797.3, 300 sec: 20854.9). Total num frames: 16498688. Throughput: 0: 4920.1. Samples: 3114676. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:52:05,161][13933] Avg episode reward: [(0, '24.272')] [2023-09-14 12:52:07,147][13989] Updated weights for policy 0, policy_version 4038 (0.0006) [2023-09-14 12:52:09,273][13989] Updated weights for policy 0, policy_version 4048 (0.0006) [2023-09-14 12:52:10,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19797.3, 300 sec: 20841.0). Total num frames: 16596992. Throughput: 0: 4916.5. Samples: 3144056. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:52:10,160][13933] Avg episode reward: [(0, '25.178')] [2023-09-14 12:52:11,383][13989] Updated weights for policy 0, policy_version 4058 (0.0006) [2023-09-14 12:52:13,467][13989] Updated weights for policy 0, policy_version 4068 (0.0006) [2023-09-14 12:52:15,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19729.1, 300 sec: 20827.1). Total num frames: 16695296. Throughput: 0: 4915.2. Samples: 3158708. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:52:15,161][13933] Avg episode reward: [(0, '23.370')] [2023-09-14 12:52:15,577][13989] Updated weights for policy 0, policy_version 4078 (0.0006) [2023-09-14 12:52:17,634][13989] Updated weights for policy 0, policy_version 4088 (0.0008) [2023-09-14 12:52:19,719][13989] Updated weights for policy 0, policy_version 4098 (0.0008) [2023-09-14 12:52:20,160][13933] Fps is (10 sec: 19660.3, 60 sec: 19660.7, 300 sec: 20813.2). Total num frames: 16793600. Throughput: 0: 4911.2. Samples: 3188276. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:52:20,160][13933] Avg episode reward: [(0, '26.573')] [2023-09-14 12:52:21,768][13989] Updated weights for policy 0, policy_version 4108 (0.0006) [2023-09-14 12:52:23,861][13989] Updated weights for policy 0, policy_version 4118 (0.0006) [2023-09-14 12:52:25,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19660.8, 300 sec: 20799.4). Total num frames: 16891904. Throughput: 0: 4911.5. Samples: 3217928. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:52:25,161][13933] Avg episode reward: [(0, '25.823')] [2023-09-14 12:52:25,911][13989] Updated weights for policy 0, policy_version 4128 (0.0008) [2023-09-14 12:52:27,783][13989] Updated weights for policy 0, policy_version 4138 (0.0005) [2023-09-14 12:52:29,658][13989] Updated weights for policy 0, policy_version 4148 (0.0005) [2023-09-14 12:52:30,160][13933] Fps is (10 sec: 20480.4, 60 sec: 19797.3, 300 sec: 20813.2). Total num frames: 16998400. Throughput: 0: 4923.1. Samples: 3233290. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:52:30,160][13933] Avg episode reward: [(0, '22.464')] [2023-09-14 12:52:31,503][13989] Updated weights for policy 0, policy_version 4158 (0.0009) [2023-09-14 12:52:33,355][13989] Updated weights for policy 0, policy_version 4168 (0.0005) [2023-09-14 12:52:35,160][13933] Fps is (10 sec: 21708.7, 60 sec: 20002.1, 300 sec: 20854.9). Total num frames: 17108992. Throughput: 0: 5006.9. Samples: 3266462. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:52:35,160][13933] Avg episode reward: [(0, '26.554')] [2023-09-14 12:52:35,183][13989] Updated weights for policy 0, policy_version 4178 (0.0005) [2023-09-14 12:52:37,121][13989] Updated weights for policy 0, policy_version 4188 (0.0005) [2023-09-14 12:52:39,171][13989] Updated weights for policy 0, policy_version 4198 (0.0005) [2023-09-14 12:52:40,160][13933] Fps is (10 sec: 21299.3, 60 sec: 20070.4, 300 sec: 20854.9). Total num frames: 17211392. Throughput: 0: 5055.6. Samples: 3298140. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:52:40,160][13933] Avg episode reward: [(0, '26.966')] [2023-09-14 12:52:41,217][13989] Updated weights for policy 0, policy_version 4208 (0.0007) [2023-09-14 12:52:43,227][13989] Updated weights for policy 0, policy_version 4218 (0.0008) [2023-09-14 12:52:45,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20138.7, 300 sec: 20868.8). Total num frames: 17313792. Throughput: 0: 5064.4. Samples: 3313186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:52:45,160][13933] Avg episode reward: [(0, '24.531')] [2023-09-14 12:52:45,250][13989] Updated weights for policy 0, policy_version 4228 (0.0005) [2023-09-14 12:52:47,260][13989] Updated weights for policy 0, policy_version 4238 (0.0008) [2023-09-14 12:52:49,302][13989] Updated weights for policy 0, policy_version 4248 (0.0005) [2023-09-14 12:52:50,160][13933] Fps is (10 sec: 20479.9, 60 sec: 20207.0, 300 sec: 20854.9). Total num frames: 17416192. Throughput: 0: 5092.5. Samples: 3343838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:52:50,160][13933] Avg episode reward: [(0, '24.197')] [2023-09-14 12:52:51,332][13989] Updated weights for policy 0, policy_version 4258 (0.0005) [2023-09-14 12:52:53,332][13989] Updated weights for policy 0, policy_version 4268 (0.0005) [2023-09-14 12:52:55,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20827.1). Total num frames: 17518592. Throughput: 0: 5112.0. Samples: 3374094. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:52:55,160][13933] Avg episode reward: [(0, '23.768')] [2023-09-14 12:52:55,339][13989] Updated weights for policy 0, policy_version 4278 (0.0005) [2023-09-14 12:52:57,384][13989] Updated weights for policy 0, policy_version 4288 (0.0008) [2023-09-14 12:52:59,416][13989] Updated weights for policy 0, policy_version 4298 (0.0011) [2023-09-14 12:53:00,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20785.5). 
Total num frames: 17616896. Throughput: 0: 5122.8. Samples: 3389232. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:53:00,160][13933] Avg episode reward: [(0, '25.226')] [2023-09-14 12:53:01,500][13989] Updated weights for policy 0, policy_version 4308 (0.0005) [2023-09-14 12:53:03,503][13989] Updated weights for policy 0, policy_version 4318 (0.0008) [2023-09-14 12:53:05,160][13933] Fps is (10 sec: 20070.2, 60 sec: 20343.4, 300 sec: 20757.7). Total num frames: 17719296. Throughput: 0: 5137.2. Samples: 3419450. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:53:05,160][13933] Avg episode reward: [(0, '26.607')] [2023-09-14 12:53:05,551][13989] Updated weights for policy 0, policy_version 4328 (0.0005) [2023-09-14 12:53:07,589][13989] Updated weights for policy 0, policy_version 4338 (0.0005) [2023-09-14 12:53:09,611][13989] Updated weights for policy 0, policy_version 4348 (0.0005) [2023-09-14 12:53:10,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 20716.0). Total num frames: 17817600. Throughput: 0: 5146.7. Samples: 3449530. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:53:10,160][13933] Avg episode reward: [(0, '23.775')] [2023-09-14 12:53:11,704][13989] Updated weights for policy 0, policy_version 4358 (0.0009) [2023-09-14 12:53:13,722][13989] Updated weights for policy 0, policy_version 4368 (0.0005) [2023-09-14 12:53:15,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20411.7, 300 sec: 20688.3). Total num frames: 17920000. Throughput: 0: 5138.5. Samples: 3464522. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:53:15,160][13933] Avg episode reward: [(0, '25.306')] [2023-09-14 12:53:15,753][13989] Updated weights for policy 0, policy_version 4378 (0.0008) [2023-09-14 12:53:17,812][13989] Updated weights for policy 0, policy_version 4388 (0.0005) [2023-09-14 12:53:19,826][13989] Updated weights for policy 0, policy_version 4398 (0.0005) [2023-09-14 12:53:20,160][13933] Fps is (10 sec: 20070.1, 60 sec: 20411.8, 300 sec: 20660.5). Total num frames: 18018304. Throughput: 0: 5068.8. Samples: 3494558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:53:20,160][13933] Avg episode reward: [(0, '27.542')] [2023-09-14 12:53:21,860][13989] Updated weights for policy 0, policy_version 4408 (0.0006) [2023-09-14 12:53:23,905][13989] Updated weights for policy 0, policy_version 4418 (0.0010) [2023-09-14 12:53:25,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20480.0, 300 sec: 20632.7). Total num frames: 18120704. Throughput: 0: 5035.3. Samples: 3524730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:53:25,160][13933] Avg episode reward: [(0, '25.682')] [2023-09-14 12:53:25,942][13989] Updated weights for policy 0, policy_version 4428 (0.0005) [2023-09-14 12:53:27,987][13989] Updated weights for policy 0, policy_version 4438 (0.0011) [2023-09-14 12:53:30,015][13989] Updated weights for policy 0, policy_version 4448 (0.0007) [2023-09-14 12:53:30,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20343.5, 300 sec: 20591.1). Total num frames: 18219008. Throughput: 0: 5037.2. Samples: 3539862. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:53:30,160][13933] Avg episode reward: [(0, '25.426')] [2023-09-14 12:53:32,036][13989] Updated weights for policy 0, policy_version 4458 (0.0005) [2023-09-14 12:53:34,012][13989] Updated weights for policy 0, policy_version 4468 (0.0007) [2023-09-14 12:53:35,160][13933] Fps is (10 sec: 20070.0, 60 sec: 20206.9, 300 sec: 20549.4). Total num frames: 18321408. 
Throughput: 0: 5032.0. Samples: 3570278. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:53:35,160][13933] Avg episode reward: [(0, '22.852')] [2023-09-14 12:53:36,097][13989] Updated weights for policy 0, policy_version 4478 (0.0005) [2023-09-14 12:53:38,095][13989] Updated weights for policy 0, policy_version 4488 (0.0011) [2023-09-14 12:53:40,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20138.7, 300 sec: 20507.8). Total num frames: 18419712. Throughput: 0: 5034.6. Samples: 3600650. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:53:40,160][13933] Avg episode reward: [(0, '25.895')] [2023-09-14 12:53:40,165][13989] Updated weights for policy 0, policy_version 4498 (0.0005) [2023-09-14 12:53:42,201][13989] Updated weights for policy 0, policy_version 4508 (0.0008) [2023-09-14 12:53:44,212][13989] Updated weights for policy 0, policy_version 4518 (0.0005) [2023-09-14 12:53:45,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20138.6, 300 sec: 20480.0). Total num frames: 18522112. Throughput: 0: 5027.1. Samples: 3615450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:53:45,160][13933] Avg episode reward: [(0, '25.243')] [2023-09-14 12:53:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000004522_18522112.pth... [2023-09-14 12:53:45,207][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000003344_13697024.pth [2023-09-14 12:53:46,231][13989] Updated weights for policy 0, policy_version 4528 (0.0005) [2023-09-14 12:53:48,307][13989] Updated weights for policy 0, policy_version 4538 (0.0006) [2023-09-14 12:53:50,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20138.6, 300 sec: 20452.2). Total num frames: 18624512. Throughput: 0: 5029.2. Samples: 3645766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:53:50,160][13933] Avg episode reward: [(0, '24.630')] [2023-09-14 12:53:50,376][13989] Updated weights for policy 0, policy_version 4548 (0.0006) [2023-09-14 12:53:52,423][13989] Updated weights for policy 0, policy_version 4558 (0.0005) [2023-09-14 12:53:54,470][13989] Updated weights for policy 0, policy_version 4568 (0.0006) [2023-09-14 12:53:55,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20070.4, 300 sec: 20410.6). Total num frames: 18722816. Throughput: 0: 5024.4. Samples: 3675630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:53:55,160][13933] Avg episode reward: [(0, '24.866')] [2023-09-14 12:53:56,506][13989] Updated weights for policy 0, policy_version 4578 (0.0005) [2023-09-14 12:53:58,551][13989] Updated weights for policy 0, policy_version 4588 (0.0007) [2023-09-14 12:54:00,160][13933] Fps is (10 sec: 20070.7, 60 sec: 20138.7, 300 sec: 20396.7). Total num frames: 18825216. Throughput: 0: 5023.6. Samples: 3690582. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:54:00,160][13933] Avg episode reward: [(0, '25.735')] [2023-09-14 12:54:00,581][13989] Updated weights for policy 0, policy_version 4598 (0.0008) [2023-09-14 12:54:02,575][13989] Updated weights for policy 0, policy_version 4608 (0.0005) [2023-09-14 12:54:04,580][13989] Updated weights for policy 0, policy_version 4618 (0.0005) [2023-09-14 12:54:05,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20070.4, 300 sec: 20355.0). Total num frames: 18923520. Throughput: 0: 5032.0. Samples: 3720996. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:54:05,160][13933] Avg episode reward: [(0, '25.588')] [2023-09-14 12:54:06,611][13989] Updated weights for policy 0, policy_version 4628 (0.0008) [2023-09-14 12:54:08,639][13989] Updated weights for policy 0, policy_version 4638 (0.0005) [2023-09-14 12:54:10,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 20313.4). Total num frames: 19025920. Throughput: 0: 5038.8. Samples: 3751476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:54:10,160][13933] Avg episode reward: [(0, '22.150')] [2023-09-14 12:54:10,680][13989] Updated weights for policy 0, policy_version 4648 (0.0005) [2023-09-14 12:54:12,728][13989] Updated weights for policy 0, policy_version 4658 (0.0006) [2023-09-14 12:54:14,721][13989] Updated weights for policy 0, policy_version 4668 (0.0005) [2023-09-14 12:54:15,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20138.7, 300 sec: 20285.6). Total num frames: 19128320. Throughput: 0: 5036.3. Samples: 3766496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:54:15,160][13933] Avg episode reward: [(0, '25.284')] [2023-09-14 12:54:16,771][13989] Updated weights for policy 0, policy_version 4678 (0.0005) [2023-09-14 12:54:18,784][13989] Updated weights for policy 0, policy_version 4688 (0.0005) [2023-09-14 12:54:20,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 20257.8). Total num frames: 19226624. Throughput: 0: 5034.1. Samples: 3796812. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:54:20,160][13933] Avg episode reward: [(0, '27.598')] [2023-09-14 12:54:20,843][13989] Updated weights for policy 0, policy_version 4698 (0.0008) [2023-09-14 12:54:22,865][13989] Updated weights for policy 0, policy_version 4708 (0.0005) [2023-09-14 12:54:24,859][13989] Updated weights for policy 0, policy_version 4718 (0.0005) [2023-09-14 12:54:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 20230.1). Total num frames: 19329024. Throughput: 0: 5033.9. Samples: 3827176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:54:25,160][13933] Avg episode reward: [(0, '28.071')] [2023-09-14 12:54:26,871][13989] Updated weights for policy 0, policy_version 4728 (0.0006) [2023-09-14 12:54:28,823][13989] Updated weights for policy 0, policy_version 4738 (0.0005) [2023-09-14 12:54:30,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20207.0, 300 sec: 20202.3). Total num frames: 19431424. Throughput: 0: 5046.4. Samples: 3842536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:54:30,160][13933] Avg episode reward: [(0, '25.982')] [2023-09-14 12:54:30,886][13989] Updated weights for policy 0, policy_version 4748 (0.0008) [2023-09-14 12:54:32,867][13989] Updated weights for policy 0, policy_version 4758 (0.0008) [2023-09-14 12:54:34,884][13989] Updated weights for policy 0, policy_version 4768 (0.0005) [2023-09-14 12:54:35,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20207.0, 300 sec: 20160.7). Total num frames: 19533824. Throughput: 0: 5053.1. Samples: 3873154. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:54:35,160][13933] Avg episode reward: [(0, '26.582')] [2023-09-14 12:54:36,991][13989] Updated weights for policy 0, policy_version 4778 (0.0008) [2023-09-14 12:54:38,972][13989] Updated weights for policy 0, policy_version 4788 (0.0008) [2023-09-14 12:54:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20206.9, 300 sec: 20132.9). Total num frames: 19632128. Throughput: 0: 5060.2. Samples: 3903338. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:54:40,160][13933] Avg episode reward: [(0, '27.337')] [2023-09-14 12:54:40,963][13989] Updated weights for policy 0, policy_version 4798 (0.0005) [2023-09-14 12:54:43,006][13989] Updated weights for policy 0, policy_version 4808 (0.0008) [2023-09-14 12:54:44,995][13989] Updated weights for policy 0, policy_version 4818 (0.0006) [2023-09-14 12:54:45,160][13933] Fps is (10 sec: 20069.5, 60 sec: 20206.8, 300 sec: 20105.1). Total num frames: 19734528. Throughput: 0: 5067.2. Samples: 3918608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:54:45,160][13933] Avg episode reward: [(0, '26.871')] [2023-09-14 12:54:46,990][13989] Updated weights for policy 0, policy_version 4828 (0.0008) [2023-09-14 12:54:49,001][13989] Updated weights for policy 0, policy_version 4838 (0.0010) [2023-09-14 12:54:50,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20207.0, 300 sec: 20105.1). Total num frames: 19836928. Throughput: 0: 5071.2. Samples: 3949198. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:54:50,160][13933] Avg episode reward: [(0, '22.169')] [2023-09-14 12:54:50,991][13989] Updated weights for policy 0, policy_version 4848 (0.0008) [2023-09-14 12:54:52,979][13989] Updated weights for policy 0, policy_version 4858 (0.0005) [2023-09-14 12:54:54,991][13989] Updated weights for policy 0, policy_version 4868 (0.0008) [2023-09-14 12:54:55,160][13933] Fps is (10 sec: 20480.8, 60 sec: 20275.2, 300 sec: 20119.0). Total num frames: 19939328. Throughput: 0: 5078.8. Samples: 3980022. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:54:55,160][13933] Avg episode reward: [(0, '25.373')] [2023-09-14 12:54:57,021][13989] Updated weights for policy 0, policy_version 4878 (0.0005) [2023-09-14 12:54:58,988][13989] Updated weights for policy 0, policy_version 4888 (0.0005) [2023-09-14 12:55:00,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20275.2, 300 sec: 20119.0). Total num frames: 20041728. Throughput: 0: 5085.4. Samples: 3995340. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:55:00,160][13933] Avg episode reward: [(0, '26.559')] [2023-09-14 12:55:01,000][13989] Updated weights for policy 0, policy_version 4898 (0.0008) [2023-09-14 12:55:03,029][13989] Updated weights for policy 0, policy_version 4908 (0.0005) [2023-09-14 12:55:05,062][13989] Updated weights for policy 0, policy_version 4918 (0.0008) [2023-09-14 12:55:05,160][13933] Fps is (10 sec: 20479.7, 60 sec: 20343.4, 300 sec: 20119.0). Total num frames: 20144128. Throughput: 0: 5091.4. Samples: 4025924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:55:05,160][13933] Avg episode reward: [(0, '23.801')] [2023-09-14 12:55:07,055][13989] Updated weights for policy 0, policy_version 4928 (0.0005) [2023-09-14 12:55:09,109][13989] Updated weights for policy 0, policy_version 4938 (0.0006) [2023-09-14 12:55:10,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20343.5, 300 sec: 20119.0). Total num frames: 20246528. Throughput: 0: 5092.8. Samples: 4056354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:55:10,160][13933] Avg episode reward: [(0, '27.637')] [2023-09-14 12:55:11,147][13989] Updated weights for policy 0, policy_version 4948 (0.0005) [2023-09-14 12:55:13,152][13989] Updated weights for policy 0, policy_version 4958 (0.0005) [2023-09-14 12:55:15,160][13933] Fps is (10 sec: 20070.7, 60 sec: 20275.2, 300 sec: 20119.0). Total num frames: 20344832. Throughput: 0: 5086.7. Samples: 4071438. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:55:15,160][13933] Avg episode reward: [(0, '25.316')] [2023-09-14 12:55:15,167][13989] Updated weights for policy 0, policy_version 4968 (0.0005) [2023-09-14 12:55:17,185][13989] Updated weights for policy 0, policy_version 4978 (0.0005) [2023-09-14 12:55:19,184][13989] Updated weights for policy 0, policy_version 4988 (0.0005) [2023-09-14 12:55:20,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 20119.0). Total num frames: 20447232. Throughput: 0: 5082.8. Samples: 4101882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:55:20,160][13933] Avg episode reward: [(0, '25.221')] [2023-09-14 12:55:21,206][13989] Updated weights for policy 0, policy_version 4998 (0.0007) [2023-09-14 12:55:23,219][13989] Updated weights for policy 0, policy_version 5008 (0.0010) [2023-09-14 12:55:25,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20132.9). Total num frames: 20549632. Throughput: 0: 5091.0. Samples: 4132434. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:55:25,160][13933] Avg episode reward: [(0, '26.832')] [2023-09-14 12:55:25,264][13989] Updated weights for policy 0, policy_version 5018 (0.0005) [2023-09-14 12:55:27,292][13989] Updated weights for policy 0, policy_version 5028 (0.0006) [2023-09-14 12:55:29,304][13989] Updated weights for policy 0, policy_version 5038 (0.0005) [2023-09-14 12:55:30,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20132.9). Total num frames: 20652032. Throughput: 0: 5087.7. Samples: 4147554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:55:30,160][13933] Avg episode reward: [(0, '29.019')] [2023-09-14 12:55:30,160][13971] Saving new best policy, reward=29.019! [2023-09-14 12:55:31,358][13989] Updated weights for policy 0, policy_version 5048 (0.0005) [2023-09-14 12:55:33,418][13989] Updated weights for policy 0, policy_version 5058 (0.0005) [2023-09-14 12:55:35,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20132.9). Total num frames: 20750336. Throughput: 0: 5074.9. Samples: 4177568. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:55:35,160][13933] Avg episode reward: [(0, '26.604')] [2023-09-14 12:55:35,431][13989] Updated weights for policy 0, policy_version 5068 (0.0005) [2023-09-14 12:55:37,484][13989] Updated weights for policy 0, policy_version 5078 (0.0005) [2023-09-14 12:55:39,517][13989] Updated weights for policy 0, policy_version 5088 (0.0008) [2023-09-14 12:55:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 20146.8). Total num frames: 20852736. Throughput: 0: 5065.6. Samples: 4207972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:55:40,160][13933] Avg episode reward: [(0, '26.133')] [2023-09-14 12:55:41,544][13989] Updated weights for policy 0, policy_version 5098 (0.0009) [2023-09-14 12:55:43,579][13989] Updated weights for policy 0, policy_version 5108 (0.0008) [2023-09-14 12:55:45,160][13933] Fps is (10 sec: 20070.2, 60 sec: 20275.3, 300 sec: 20132.9). Total num frames: 20951040. Throughput: 0: 5057.8. Samples: 4222940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:55:45,160][13933] Avg episode reward: [(0, '24.573')] [2023-09-14 12:55:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000005116_20955136.pth... 
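Every five seconds the run logs one "Fps is (10 sec / 60 sec / 300 sec)" record together with the total frame count, sampler throughput and policy lag, followed by the current average episode reward, while the learner process periodically writes a checkpoint and deletes an older one. The sketch below extracts the reward and frame-rate series from such a log so the learning curve can be inspected offline; the regular expressions, the parse_training_log helper and the sf_log.txt path are assumptions made for this sketch from the line format visible above, not part of Sample Factory's API.

import re
from datetime import datetime

# Illustrative patterns derived from the record formats visible in this log;
# they are an assumption of this sketch, not something Sample Factory exposes.
TS_FMT = "%Y-%m-%d %H:%M:%S,%f"
REWARD_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:,]+)\]\[\d+\] "
    r"Avg episode reward: \[\(0, '(?P<reward>[\d.]+)'\)\]"
)
FPS_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:,]+)\]\[\d+\] Fps is "
    r"\(10 sec: (?P<fps10>[\d.]+), 60 sec: (?P<fps60>[\d.]+), 300 sec: (?P<fps300>[\d.]+)\)\. "
    r"Total num frames: (?P<frames>\d+)"
)

def parse_training_log(path):
    """Return (timestamp, avg episode reward) and (timestamp, total frames, 10 s FPS) series."""
    with open(path) as fh:
        text = fh.read()
    rewards = [
        (datetime.strptime(m["ts"], TS_FMT), float(m["reward"]))
        for m in REWARD_RE.finditer(text)
    ]
    progress = [
        (datetime.strptime(m["ts"], TS_FMT), int(m["frames"]), float(m["fps10"]))
        for m in FPS_RE.finditer(text)
    ]
    return rewards, progress

if __name__ == "__main__":
    rewards, progress = parse_training_log("sf_log.txt")  # hypothetical log path
    best_reward = max(r for _, r in rewards)
    total_frames = progress[-1][1]
    print(f"{len(rewards)} reports, best avg reward {best_reward:.3f}, {total_frames} frames")

Matching against the whole file rather than line by line keeps the sketch insensitive to line wrapping, which matters here because several records can share one physical line.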
[2023-09-14 12:55:45,209][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000003932_16105472.pth [2023-09-14 12:55:45,602][13989] Updated weights for policy 0, policy_version 5118 (0.0006) [2023-09-14 12:55:47,604][13989] Updated weights for policy 0, policy_version 5128 (0.0006) [2023-09-14 12:55:49,637][13989] Updated weights for policy 0, policy_version 5138 (0.0005) [2023-09-14 12:55:50,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20146.8). Total num frames: 21053440. Throughput: 0: 5053.7. Samples: 4253340. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:55:50,160][13933] Avg episode reward: [(0, '22.762')] [2023-09-14 12:55:51,711][13989] Updated weights for policy 0, policy_version 5148 (0.0006) [2023-09-14 12:55:53,751][13989] Updated weights for policy 0, policy_version 5158 (0.0005) [2023-09-14 12:55:55,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20206.9, 300 sec: 20132.9). Total num frames: 21151744. Throughput: 0: 5044.8. Samples: 4283370. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:55:55,160][13933] Avg episode reward: [(0, '24.459')] [2023-09-14 12:55:55,738][13989] Updated weights for policy 0, policy_version 5168 (0.0005) [2023-09-14 12:55:57,787][13989] Updated weights for policy 0, policy_version 5178 (0.0005) [2023-09-14 12:55:59,804][13989] Updated weights for policy 0, policy_version 5188 (0.0005) [2023-09-14 12:56:00,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20206.9, 300 sec: 20146.8). Total num frames: 21254144. Throughput: 0: 5047.0. Samples: 4298552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:56:00,160][13933] Avg episode reward: [(0, '27.476')] [2023-09-14 12:56:01,900][13989] Updated weights for policy 0, policy_version 5198 (0.0005) [2023-09-14 12:56:03,932][13989] Updated weights for policy 0, policy_version 5208 (0.0011) [2023-09-14 12:56:05,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 20146.8). Total num frames: 21352448. Throughput: 0: 5038.7. Samples: 4328624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:56:05,160][13933] Avg episode reward: [(0, '28.085')] [2023-09-14 12:56:06,047][13989] Updated weights for policy 0, policy_version 5218 (0.0006) [2023-09-14 12:56:08,067][13989] Updated weights for policy 0, policy_version 5228 (0.0008) [2023-09-14 12:56:10,036][13989] Updated weights for policy 0, policy_version 5238 (0.0008) [2023-09-14 12:56:10,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20138.7, 300 sec: 20146.8). Total num frames: 21454848. Throughput: 0: 5027.6. Samples: 4358676. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:56:10,160][13933] Avg episode reward: [(0, '27.361')] [2023-09-14 12:56:12,104][13989] Updated weights for policy 0, policy_version 5248 (0.0006) [2023-09-14 12:56:14,099][13989] Updated weights for policy 0, policy_version 5258 (0.0011) [2023-09-14 12:56:15,160][13933] Fps is (10 sec: 20479.7, 60 sec: 20206.9, 300 sec: 20146.8). Total num frames: 21557248. Throughput: 0: 5027.6. Samples: 4373796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:56:15,160][13933] Avg episode reward: [(0, '24.878')] [2023-09-14 12:56:16,119][13989] Updated weights for policy 0, policy_version 5268 (0.0008) [2023-09-14 12:56:18,143][13989] Updated weights for policy 0, policy_version 5278 (0.0005) [2023-09-14 12:56:20,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 20146.8). Total num frames: 21655552. Throughput: 0: 5036.2. Samples: 4404198. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:56:20,160][13933] Avg episode reward: [(0, '28.088')] [2023-09-14 12:56:20,167][13989] Updated weights for policy 0, policy_version 5288 (0.0005) [2023-09-14 12:56:22,198][13989] Updated weights for policy 0, policy_version 5298 (0.0005) [2023-09-14 12:56:24,201][13989] Updated weights for policy 0, policy_version 5308 (0.0009) [2023-09-14 12:56:25,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20138.6, 300 sec: 20160.6). Total num frames: 21757952. Throughput: 0: 5036.8. Samples: 4434628. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:56:25,160][13933] Avg episode reward: [(0, '28.534')] [2023-09-14 12:56:26,218][13989] Updated weights for policy 0, policy_version 5318 (0.0008) [2023-09-14 12:56:28,204][13989] Updated weights for policy 0, policy_version 5328 (0.0006) [2023-09-14 12:56:30,160][13933] Fps is (10 sec: 20479.9, 60 sec: 20138.7, 300 sec: 20174.5). Total num frames: 21860352. Throughput: 0: 5045.3. Samples: 4449976. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:56:30,160][13933] Avg episode reward: [(0, '25.595')] [2023-09-14 12:56:30,215][13989] Updated weights for policy 0, policy_version 5338 (0.0011) [2023-09-14 12:56:32,250][13989] Updated weights for policy 0, policy_version 5348 (0.0005) [2023-09-14 12:56:34,247][13989] Updated weights for policy 0, policy_version 5358 (0.0005) [2023-09-14 12:56:35,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20206.9, 300 sec: 20188.4). Total num frames: 21962752. Throughput: 0: 5048.9. Samples: 4480542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:56:35,160][13933] Avg episode reward: [(0, '23.777')] [2023-09-14 12:56:36,300][13989] Updated weights for policy 0, policy_version 5368 (0.0008) [2023-09-14 12:56:38,316][13989] Updated weights for policy 0, policy_version 5378 (0.0005) [2023-09-14 12:56:40,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20206.9, 300 sec: 20202.3). Total num frames: 22065152. Throughput: 0: 5057.5. Samples: 4510958. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:56:40,160][13933] Avg episode reward: [(0, '23.462')] [2023-09-14 12:56:40,344][13989] Updated weights for policy 0, policy_version 5388 (0.0008) [2023-09-14 12:56:42,364][13989] Updated weights for policy 0, policy_version 5398 (0.0005) [2023-09-14 12:56:44,371][13989] Updated weights for policy 0, policy_version 5408 (0.0005) [2023-09-14 12:56:45,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20275.3, 300 sec: 20216.2). Total num frames: 22167552. Throughput: 0: 5054.1. Samples: 4525984. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:56:45,160][13933] Avg episode reward: [(0, '26.899')] [2023-09-14 12:56:46,384][13989] Updated weights for policy 0, policy_version 5418 (0.0005) [2023-09-14 12:56:48,427][13989] Updated weights for policy 0, policy_version 5428 (0.0005) [2023-09-14 12:56:50,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20206.9, 300 sec: 20216.2). Total num frames: 22265856. Throughput: 0: 5062.4. Samples: 4556430. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:56:50,160][13933] Avg episode reward: [(0, '24.264')] [2023-09-14 12:56:50,458][13989] Updated weights for policy 0, policy_version 5438 (0.0008) [2023-09-14 12:56:52,460][13989] Updated weights for policy 0, policy_version 5448 (0.0005) [2023-09-14 12:56:54,485][13989] Updated weights for policy 0, policy_version 5458 (0.0008) [2023-09-14 12:56:55,160][13933] Fps is (10 sec: 20070.2, 60 sec: 20275.2, 300 sec: 20230.1). 
Total num frames: 22368256. Throughput: 0: 5071.6. Samples: 4586898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:56:55,160][13933] Avg episode reward: [(0, '24.378')] [2023-09-14 12:56:56,508][13989] Updated weights for policy 0, policy_version 5468 (0.0008) [2023-09-14 12:56:58,528][13989] Updated weights for policy 0, policy_version 5478 (0.0005) [2023-09-14 12:57:00,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20244.0). Total num frames: 22470656. Throughput: 0: 5073.4. Samples: 4602100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:57:00,160][13933] Avg episode reward: [(0, '23.935')] [2023-09-14 12:57:00,522][13989] Updated weights for policy 0, policy_version 5488 (0.0005) [2023-09-14 12:57:02,548][13989] Updated weights for policy 0, policy_version 5498 (0.0008) [2023-09-14 12:57:04,550][13989] Updated weights for policy 0, policy_version 5508 (0.0008) [2023-09-14 12:57:05,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20343.5, 300 sec: 20257.8). Total num frames: 22573056. Throughput: 0: 5073.4. Samples: 4632502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:57:05,160][13933] Avg episode reward: [(0, '24.751')] [2023-09-14 12:57:06,572][13989] Updated weights for policy 0, policy_version 5518 (0.0005) [2023-09-14 12:57:08,593][13989] Updated weights for policy 0, policy_version 5528 (0.0005) [2023-09-14 12:57:10,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20257.8). Total num frames: 22671360. Throughput: 0: 5075.9. Samples: 4663044. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:57:10,160][13933] Avg episode reward: [(0, '25.958')] [2023-09-14 12:57:10,609][13989] Updated weights for policy 0, policy_version 5538 (0.0008) [2023-09-14 12:57:12,634][13989] Updated weights for policy 0, policy_version 5548 (0.0005) [2023-09-14 12:57:14,618][13989] Updated weights for policy 0, policy_version 5558 (0.0005) [2023-09-14 12:57:15,160][13933] Fps is (10 sec: 20070.1, 60 sec: 20275.2, 300 sec: 20271.7). Total num frames: 22773760. Throughput: 0: 5073.9. Samples: 4678300. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:57:15,160][13933] Avg episode reward: [(0, '29.244')] [2023-09-14 12:57:15,162][13971] Saving new best policy, reward=29.244! [2023-09-14 12:57:16,663][13989] Updated weights for policy 0, policy_version 5568 (0.0008) [2023-09-14 12:57:18,642][13989] Updated weights for policy 0, policy_version 5578 (0.0005) [2023-09-14 12:57:20,160][13933] Fps is (10 sec: 20479.8, 60 sec: 20343.4, 300 sec: 20285.6). Total num frames: 22876160. Throughput: 0: 5074.7. Samples: 4708904. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:57:20,160][13933] Avg episode reward: [(0, '23.831')] [2023-09-14 12:57:20,708][13989] Updated weights for policy 0, policy_version 5588 (0.0005) [2023-09-14 12:57:22,716][13989] Updated weights for policy 0, policy_version 5598 (0.0008) [2023-09-14 12:57:24,739][13989] Updated weights for policy 0, policy_version 5608 (0.0008) [2023-09-14 12:57:25,160][13933] Fps is (10 sec: 20070.7, 60 sec: 20275.2, 300 sec: 20257.8). Total num frames: 22974464. Throughput: 0: 5069.4. Samples: 4739080. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:57:25,160][13933] Avg episode reward: [(0, '25.904')] [2023-09-14 12:57:26,794][13989] Updated weights for policy 0, policy_version 5618 (0.0005) [2023-09-14 12:57:28,800][13989] Updated weights for policy 0, policy_version 5628 (0.0005) [2023-09-14 12:57:30,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20275.2, 300 sec: 20230.1). Total num frames: 23076864. Throughput: 0: 5070.2. Samples: 4754142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:57:30,160][13933] Avg episode reward: [(0, '25.663')] [2023-09-14 12:57:30,829][13989] Updated weights for policy 0, policy_version 5638 (0.0005) [2023-09-14 12:57:32,856][13989] Updated weights for policy 0, policy_version 5648 (0.0006) [2023-09-14 12:57:34,828][13989] Updated weights for policy 0, policy_version 5658 (0.0008) [2023-09-14 12:57:35,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20230.1). Total num frames: 23179264. Throughput: 0: 5073.0. Samples: 4784714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:57:35,160][13933] Avg episode reward: [(0, '25.163')] [2023-09-14 12:57:36,872][13989] Updated weights for policy 0, policy_version 5668 (0.0005) [2023-09-14 12:57:38,894][13989] Updated weights for policy 0, policy_version 5678 (0.0005) [2023-09-14 12:57:40,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20230.1). Total num frames: 23281664. Throughput: 0: 5075.9. Samples: 4815312. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:57:40,160][13933] Avg episode reward: [(0, '25.153')] [2023-09-14 12:57:40,890][13989] Updated weights for policy 0, policy_version 5688 (0.0008) [2023-09-14 12:57:42,895][13989] Updated weights for policy 0, policy_version 5698 (0.0006) [2023-09-14 12:57:44,897][13989] Updated weights for policy 0, policy_version 5708 (0.0008) [2023-09-14 12:57:45,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20230.1). Total num frames: 23384064. Throughput: 0: 5075.5. Samples: 4830498. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:57:45,160][13933] Avg episode reward: [(0, '25.460')] [2023-09-14 12:57:45,162][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000005709_23384064.pth... [2023-09-14 12:57:45,209][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000004522_18522112.pth [2023-09-14 12:57:46,935][13989] Updated weights for policy 0, policy_version 5718 (0.0006) [2023-09-14 12:57:48,927][13989] Updated weights for policy 0, policy_version 5728 (0.0005) [2023-09-14 12:57:50,160][13933] Fps is (10 sec: 20479.9, 60 sec: 20343.5, 300 sec: 20230.1). Total num frames: 23486464. Throughput: 0: 5077.0. Samples: 4860968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:57:50,160][13933] Avg episode reward: [(0, '25.027')] [2023-09-14 12:57:50,970][13989] Updated weights for policy 0, policy_version 5738 (0.0006) [2023-09-14 12:57:52,949][13989] Updated weights for policy 0, policy_version 5748 (0.0008) [2023-09-14 12:57:54,968][13989] Updated weights for policy 0, policy_version 5758 (0.0005) [2023-09-14 12:57:55,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20230.1). Total num frames: 23584768. Throughput: 0: 5078.5. Samples: 4891578. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:57:55,160][13933] Avg episode reward: [(0, '27.863')] [2023-09-14 12:57:56,989][13989] Updated weights for policy 0, policy_version 5768 (0.0005) [2023-09-14 12:57:58,984][13989] Updated weights for policy 0, policy_version 5778 (0.0005) [2023-09-14 12:58:00,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20230.1). Total num frames: 23687168. Throughput: 0: 5078.5. Samples: 4906830. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:58:00,160][13933] Avg episode reward: [(0, '26.215')] [2023-09-14 12:58:00,987][13989] Updated weights for policy 0, policy_version 5788 (0.0005) [2023-09-14 12:58:02,995][13989] Updated weights for policy 0, policy_version 5798 (0.0005) [2023-09-14 12:58:05,028][13989] Updated weights for policy 0, policy_version 5808 (0.0005) [2023-09-14 12:58:05,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20244.0). Total num frames: 23789568. Throughput: 0: 5077.6. Samples: 4937394. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:58:05,161][13933] Avg episode reward: [(0, '27.264')] [2023-09-14 12:58:07,060][13989] Updated weights for policy 0, policy_version 5818 (0.0005) [2023-09-14 12:58:09,028][13989] Updated weights for policy 0, policy_version 5828 (0.0005) [2023-09-14 12:58:10,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20244.0). Total num frames: 23891968. Throughput: 0: 5085.5. Samples: 4967926. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:58:10,161][13933] Avg episode reward: [(0, '25.887')] [2023-09-14 12:58:11,051][13989] Updated weights for policy 0, policy_version 5838 (0.0005) [2023-09-14 12:58:13,072][13989] Updated weights for policy 0, policy_version 5848 (0.0005) [2023-09-14 12:58:15,107][13989] Updated weights for policy 0, policy_version 5858 (0.0011) [2023-09-14 12:58:15,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20257.9). Total num frames: 23994368. Throughput: 0: 5091.9. Samples: 4983278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:58:15,160][13933] Avg episode reward: [(0, '23.447')] [2023-09-14 12:58:17,139][13989] Updated weights for policy 0, policy_version 5868 (0.0005) [2023-09-14 12:58:19,149][13989] Updated weights for policy 0, policy_version 5878 (0.0005) [2023-09-14 12:58:20,160][13933] Fps is (10 sec: 20479.3, 60 sec: 20343.4, 300 sec: 20257.8). Total num frames: 24096768. Throughput: 0: 5084.6. Samples: 5013524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 12:58:20,160][13933] Avg episode reward: [(0, '24.457')] [2023-09-14 12:58:21,178][13989] Updated weights for policy 0, policy_version 5888 (0.0008) [2023-09-14 12:58:23,217][13989] Updated weights for policy 0, policy_version 5898 (0.0005) [2023-09-14 12:58:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 20257.8). Total num frames: 24195072. Throughput: 0: 5078.0. Samples: 5043822. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:58:25,160][13933] Avg episode reward: [(0, '24.395')] [2023-09-14 12:58:25,232][13989] Updated weights for policy 0, policy_version 5908 (0.0005) [2023-09-14 12:58:27,231][13989] Updated weights for policy 0, policy_version 5918 (0.0005) [2023-09-14 12:58:29,205][13989] Updated weights for policy 0, policy_version 5928 (0.0005) [2023-09-14 12:58:30,160][13933] Fps is (10 sec: 20071.1, 60 sec: 20343.5, 300 sec: 20257.9). Total num frames: 24297472. Throughput: 0: 5081.8. Samples: 5059180. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:58:30,160][13933] Avg episode reward: [(0, '27.337')] [2023-09-14 12:58:31,260][13989] Updated weights for policy 0, policy_version 5938 (0.0006) [2023-09-14 12:58:33,331][13989] Updated weights for policy 0, policy_version 5948 (0.0006) [2023-09-14 12:58:35,160][13933] Fps is (10 sec: 20479.7, 60 sec: 20343.4, 300 sec: 20271.7). Total num frames: 24399872. Throughput: 0: 5077.4. Samples: 5089450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 12:58:35,160][13933] Avg episode reward: [(0, '26.000')] [2023-09-14 12:58:35,342][13989] Updated weights for policy 0, policy_version 5958 (0.0006) [2023-09-14 12:58:37,389][13989] Updated weights for policy 0, policy_version 5968 (0.0008) [2023-09-14 12:58:39,378][13989] Updated weights for policy 0, policy_version 5978 (0.0005) [2023-09-14 12:58:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20257.9). Total num frames: 24498176. Throughput: 0: 5072.2. Samples: 5119826. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:58:40,160][13933] Avg episode reward: [(0, '26.964')] [2023-09-14 12:58:41,403][13989] Updated weights for policy 0, policy_version 5988 (0.0006) [2023-09-14 12:58:43,451][13989] Updated weights for policy 0, policy_version 5998 (0.0008) [2023-09-14 12:58:45,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20275.2, 300 sec: 20257.8). Total num frames: 24600576. Throughput: 0: 5068.3. Samples: 5134904. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 12:58:45,160][13933] Avg episode reward: [(0, '25.656')] [2023-09-14 12:58:45,491][13989] Updated weights for policy 0, policy_version 6008 (0.0011) [2023-09-14 12:58:47,493][13989] Updated weights for policy 0, policy_version 6018 (0.0005) [2023-09-14 12:58:49,498][13989] Updated weights for policy 0, policy_version 6028 (0.0005) [2023-09-14 12:58:50,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20271.7). Total num frames: 24702976. Throughput: 0: 5063.1. Samples: 5165234. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:58:50,160][13933] Avg episode reward: [(0, '25.719')] [2023-09-14 12:58:51,546][13989] Updated weights for policy 0, policy_version 6038 (0.0005) [2023-09-14 12:58:53,538][13989] Updated weights for policy 0, policy_version 6048 (0.0005) [2023-09-14 12:58:55,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20271.7). Total num frames: 24805376. Throughput: 0: 5066.3. Samples: 5195908. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 12:58:55,160][13933] Avg episode reward: [(0, '29.664')] [2023-09-14 12:58:55,163][13971] Saving new best policy, reward=29.664! [2023-09-14 12:58:55,570][13989] Updated weights for policy 0, policy_version 6058 (0.0005) [2023-09-14 12:58:57,587][13989] Updated weights for policy 0, policy_version 6068 (0.0006) [2023-09-14 12:58:59,604][13989] Updated weights for policy 0, policy_version 6078 (0.0005) [2023-09-14 12:59:00,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20275.2, 300 sec: 20271.7). Total num frames: 24903680. Throughput: 0: 5057.4. Samples: 5210862. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-09-14 12:59:00,160][13933] Avg episode reward: [(0, '27.339')] [2023-09-14 12:59:01,607][13989] Updated weights for policy 0, policy_version 6088 (0.0005) [2023-09-14 12:59:03,595][13989] Updated weights for policy 0, policy_version 6098 (0.0005) [2023-09-14 12:59:05,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20271.7). Total num frames: 25006080. 
Throughput: 0: 5068.5. Samples: 5241606. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-09-14 12:59:05,160][13933] Avg episode reward: [(0, '27.668')] [2023-09-14 12:59:05,590][13989] Updated weights for policy 0, policy_version 6108 (0.0005) [2023-09-14 12:59:07,623][13989] Updated weights for policy 0, policy_version 6118 (0.0005) [2023-09-14 12:59:09,603][13989] Updated weights for policy 0, policy_version 6128 (0.0005) [2023-09-14 12:59:10,160][13933] Fps is (10 sec: 20479.7, 60 sec: 20275.2, 300 sec: 20271.7). Total num frames: 25108480. Throughput: 0: 5074.9. Samples: 5272194. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:10,160][13933] Avg episode reward: [(0, '27.399')] [2023-09-14 12:59:11,667][13989] Updated weights for policy 0, policy_version 6138 (0.0006) [2023-09-14 12:59:13,680][13989] Updated weights for policy 0, policy_version 6148 (0.0005) [2023-09-14 12:59:15,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20275.2, 300 sec: 20285.6). Total num frames: 25210880. Throughput: 0: 5071.7. Samples: 5287406. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:15,160][13933] Avg episode reward: [(0, '23.526')] [2023-09-14 12:59:15,729][13989] Updated weights for policy 0, policy_version 6158 (0.0005) [2023-09-14 12:59:17,760][13989] Updated weights for policy 0, policy_version 6168 (0.0005) [2023-09-14 12:59:19,830][13989] Updated weights for policy 0, policy_version 6178 (0.0006) [2023-09-14 12:59:20,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20207.0, 300 sec: 20271.7). Total num frames: 25309184. Throughput: 0: 5064.1. Samples: 5317334. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:20,160][13933] Avg episode reward: [(0, '27.760')] [2023-09-14 12:59:21,845][13989] Updated weights for policy 0, policy_version 6188 (0.0005) [2023-09-14 12:59:23,714][13989] Updated weights for policy 0, policy_version 6198 (0.0005) [2023-09-14 12:59:25,160][13933] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20285.6). Total num frames: 25415680. Throughput: 0: 5090.9. Samples: 5348918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:25,160][13933] Avg episode reward: [(0, '27.148')] [2023-09-14 12:59:25,552][13989] Updated weights for policy 0, policy_version 6208 (0.0005) [2023-09-14 12:59:27,423][13989] Updated weights for policy 0, policy_version 6218 (0.0009) [2023-09-14 12:59:29,299][13989] Updated weights for policy 0, policy_version 6228 (0.0005) [2023-09-14 12:59:30,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20480.0, 300 sec: 20313.4). Total num frames: 25526272. Throughput: 0: 5122.5. Samples: 5365416. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:30,160][13933] Avg episode reward: [(0, '25.963')] [2023-09-14 12:59:31,200][13989] Updated weights for policy 0, policy_version 6238 (0.0005) [2023-09-14 12:59:33,052][13989] Updated weights for policy 0, policy_version 6248 (0.0008) [2023-09-14 12:59:34,903][13989] Updated weights for policy 0, policy_version 6258 (0.0005) [2023-09-14 12:59:35,160][13933] Fps is (10 sec: 22118.4, 60 sec: 20616.6, 300 sec: 20355.0). Total num frames: 25636864. Throughput: 0: 5175.0. Samples: 5398108. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:59:35,160][13933] Avg episode reward: [(0, '26.836')] [2023-09-14 12:59:36,759][13989] Updated weights for policy 0, policy_version 6268 (0.0005) [2023-09-14 12:59:38,657][13989] Updated weights for policy 0, policy_version 6278 (0.0005) [2023-09-14 12:59:40,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20753.1, 300 sec: 20369.0). Total num frames: 25743360. Throughput: 0: 5222.1. Samples: 5430904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:40,160][13933] Avg episode reward: [(0, '27.851')] [2023-09-14 12:59:40,520][13989] Updated weights for policy 0, policy_version 6288 (0.0005) [2023-09-14 12:59:42,425][13989] Updated weights for policy 0, policy_version 6298 (0.0005) [2023-09-14 12:59:44,293][13989] Updated weights for policy 0, policy_version 6308 (0.0005) [2023-09-14 12:59:45,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20889.6, 300 sec: 20396.7). Total num frames: 25853952. Throughput: 0: 5253.2. Samples: 5447258. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 12:59:45,160][13933] Avg episode reward: [(0, '25.807')] [2023-09-14 12:59:45,164][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000006312_25853952.pth... [2023-09-14 12:59:45,210][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000005116_20955136.pth [2023-09-14 12:59:46,174][13989] Updated weights for policy 0, policy_version 6318 (0.0005) [2023-09-14 12:59:48,107][13989] Updated weights for policy 0, policy_version 6328 (0.0005) [2023-09-14 12:59:49,963][13989] Updated weights for policy 0, policy_version 6338 (0.0005) [2023-09-14 12:59:50,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20957.9, 300 sec: 20410.6). Total num frames: 25960448. Throughput: 0: 5293.4. Samples: 5479808. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 12:59:50,161][13933] Avg episode reward: [(0, '26.915')] [2023-09-14 12:59:51,848][13989] Updated weights for policy 0, policy_version 6348 (0.0008) [2023-09-14 12:59:53,737][13989] Updated weights for policy 0, policy_version 6358 (0.0005) [2023-09-14 12:59:55,160][13933] Fps is (10 sec: 21708.7, 60 sec: 21094.4, 300 sec: 20438.4). Total num frames: 26071040. Throughput: 0: 5340.1. Samples: 5512496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 12:59:55,160][13933] Avg episode reward: [(0, '28.652')] [2023-09-14 12:59:55,780][13989] Updated weights for policy 0, policy_version 6368 (0.0005) [2023-09-14 12:59:57,919][13989] Updated weights for policy 0, policy_version 6378 (0.0006) [2023-09-14 13:00:00,072][13989] Updated weights for policy 0, policy_version 6388 (0.0006) [2023-09-14 13:00:00,160][13933] Fps is (10 sec: 20479.8, 60 sec: 21026.1, 300 sec: 20410.6). Total num frames: 26165248. Throughput: 0: 5322.8. Samples: 5526932. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:00:00,160][13933] Avg episode reward: [(0, '29.775')] [2023-09-14 13:00:00,160][13971] Saving new best policy, reward=29.775! [2023-09-14 13:00:02,205][13989] Updated weights for policy 0, policy_version 6398 (0.0009) [2023-09-14 13:00:04,311][13989] Updated weights for policy 0, policy_version 6408 (0.0009) [2023-09-14 13:00:05,160][13933] Fps is (10 sec: 19251.1, 60 sec: 20957.8, 300 sec: 20396.7). Total num frames: 26263552. Throughput: 0: 5300.2. Samples: 5555844. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:00:05,160][13933] Avg episode reward: [(0, '27.599')] [2023-09-14 13:00:06,366][13989] Updated weights for policy 0, policy_version 6418 (0.0008) [2023-09-14 13:00:08,430][13989] Updated weights for policy 0, policy_version 6428 (0.0008) [2023-09-14 13:00:10,160][13933] Fps is (10 sec: 19660.4, 60 sec: 20889.6, 300 sec: 20396.7). Total num frames: 26361856. Throughput: 0: 5262.3. Samples: 5585724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:00:10,160][13933] Avg episode reward: [(0, '26.156')] [2023-09-14 13:00:10,553][13989] Updated weights for policy 0, policy_version 6438 (0.0006) [2023-09-14 13:00:12,639][13989] Updated weights for policy 0, policy_version 6448 (0.0007) [2023-09-14 13:00:14,586][13989] Updated weights for policy 0, policy_version 6458 (0.0006) [2023-09-14 13:00:15,160][13933] Fps is (10 sec: 19661.0, 60 sec: 20821.3, 300 sec: 20382.8). Total num frames: 26460160. Throughput: 0: 5209.0. Samples: 5599822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:00:15,160][13933] Avg episode reward: [(0, '26.450')] [2023-09-14 13:00:16,480][13989] Updated weights for policy 0, policy_version 6468 (0.0005) [2023-09-14 13:00:18,347][13989] Updated weights for policy 0, policy_version 6478 (0.0005) [2023-09-14 13:00:20,160][13933] Fps is (10 sec: 20890.1, 60 sec: 21026.1, 300 sec: 20410.6). Total num frames: 26570752. Throughput: 0: 5196.5. Samples: 5631952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:00:20,160][13933] Avg episode reward: [(0, '25.598')] [2023-09-14 13:00:20,184][13989] Updated weights for policy 0, policy_version 6488 (0.0005) [2023-09-14 13:00:22,067][13989] Updated weights for policy 0, policy_version 6498 (0.0005) [2023-09-14 13:00:23,909][13989] Updated weights for policy 0, policy_version 6508 (0.0007) [2023-09-14 13:00:25,160][13933] Fps is (10 sec: 22118.4, 60 sec: 21094.4, 300 sec: 20438.3). Total num frames: 26681344. Throughput: 0: 5207.2. Samples: 5665226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:00:25,160][13933] Avg episode reward: [(0, '25.890')] [2023-09-14 13:00:25,759][13989] Updated weights for policy 0, policy_version 6518 (0.0005) [2023-09-14 13:00:27,590][13989] Updated weights for policy 0, policy_version 6528 (0.0005) [2023-09-14 13:00:29,446][13989] Updated weights for policy 0, policy_version 6538 (0.0008) [2023-09-14 13:00:30,160][13933] Fps is (10 sec: 22118.4, 60 sec: 21094.4, 300 sec: 20480.0). Total num frames: 26791936. Throughput: 0: 5211.7. Samples: 5681784. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:00:30,160][13933] Avg episode reward: [(0, '25.054')] [2023-09-14 13:00:31,446][13989] Updated weights for policy 0, policy_version 6548 (0.0008) [2023-09-14 13:00:33,488][13989] Updated weights for policy 0, policy_version 6558 (0.0005) [2023-09-14 13:00:35,160][13933] Fps is (10 sec: 21299.2, 60 sec: 20957.9, 300 sec: 20480.0). Total num frames: 26894336. Throughput: 0: 5187.1. Samples: 5713226. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:00:35,160][13933] Avg episode reward: [(0, '26.512')] [2023-09-14 13:00:35,529][13989] Updated weights for policy 0, policy_version 6568 (0.0005) [2023-09-14 13:00:37,576][13989] Updated weights for policy 0, policy_version 6578 (0.0006) [2023-09-14 13:00:39,594][13989] Updated weights for policy 0, policy_version 6588 (0.0008) [2023-09-14 13:00:40,160][13933] Fps is (10 sec: 20070.2, 60 sec: 20821.3, 300 sec: 20480.0). 
Total num frames: 26992640. Throughput: 0: 5129.2. Samples: 5743312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:00:40,160][13933] Avg episode reward: [(0, '26.378')] [2023-09-14 13:00:41,644][13989] Updated weights for policy 0, policy_version 6598 (0.0005) [2023-09-14 13:00:43,641][13989] Updated weights for policy 0, policy_version 6608 (0.0008) [2023-09-14 13:00:45,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20684.8, 300 sec: 20480.0). Total num frames: 27095040. Throughput: 0: 5148.0. Samples: 5758590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:00:45,160][13933] Avg episode reward: [(0, '26.192')] [2023-09-14 13:00:45,654][13989] Updated weights for policy 0, policy_version 6618 (0.0008) [2023-09-14 13:00:47,670][13989] Updated weights for policy 0, policy_version 6628 (0.0005) [2023-09-14 13:00:49,695][13989] Updated weights for policy 0, policy_version 6638 (0.0006) [2023-09-14 13:00:50,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20616.5, 300 sec: 20493.9). Total num frames: 27197440. Throughput: 0: 5181.3. Samples: 5789000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:00:50,160][13933] Avg episode reward: [(0, '28.681')] [2023-09-14 13:00:51,734][13989] Updated weights for policy 0, policy_version 6648 (0.0006) [2023-09-14 13:00:53,776][13989] Updated weights for policy 0, policy_version 6658 (0.0010) [2023-09-14 13:00:55,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20411.7, 300 sec: 20480.0). Total num frames: 27295744. Throughput: 0: 5186.6. Samples: 5819122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:00:55,160][13933] Avg episode reward: [(0, '25.500')] [2023-09-14 13:00:55,804][13989] Updated weights for policy 0, policy_version 6668 (0.0006) [2023-09-14 13:00:57,884][13989] Updated weights for policy 0, policy_version 6678 (0.0005) [2023-09-14 13:00:59,963][13989] Updated weights for policy 0, policy_version 6688 (0.0008) [2023-09-14 13:01:00,160][13933] Fps is (10 sec: 19660.7, 60 sec: 20480.0, 300 sec: 20480.0). Total num frames: 27394048. Throughput: 0: 5208.4. Samples: 5834200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:01:00,160][13933] Avg episode reward: [(0, '27.989')] [2023-09-14 13:01:02,073][13989] Updated weights for policy 0, policy_version 6698 (0.0006) [2023-09-14 13:01:04,139][13989] Updated weights for policy 0, policy_version 6708 (0.0006) [2023-09-14 13:01:05,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20480.0, 300 sec: 20466.1). Total num frames: 27492352. Throughput: 0: 5147.0. Samples: 5863568. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:01:05,160][13933] Avg episode reward: [(0, '23.594')] [2023-09-14 13:01:06,225][13989] Updated weights for policy 0, policy_version 6718 (0.0006) [2023-09-14 13:01:08,331][13989] Updated weights for policy 0, policy_version 6728 (0.0008) [2023-09-14 13:01:10,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20480.1, 300 sec: 20452.2). Total num frames: 27590656. Throughput: 0: 5064.0. Samples: 5893108. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:01:10,160][13933] Avg episode reward: [(0, '26.170')] [2023-09-14 13:01:10,395][13989] Updated weights for policy 0, policy_version 6738 (0.0006) [2023-09-14 13:01:12,504][13989] Updated weights for policy 0, policy_version 6748 (0.0006) [2023-09-14 13:01:14,634][13989] Updated weights for policy 0, policy_version 6758 (0.0005) [2023-09-14 13:01:15,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20480.0, 300 sec: 20452.2). Total num frames: 27688960. 
Throughput: 0: 5021.1. Samples: 5907734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:01:15,160][13933] Avg episode reward: [(0, '25.200')] [2023-09-14 13:01:16,703][13989] Updated weights for policy 0, policy_version 6768 (0.0006) [2023-09-14 13:01:18,840][13989] Updated weights for policy 0, policy_version 6778 (0.0006) [2023-09-14 13:01:20,160][13933] Fps is (10 sec: 19660.9, 60 sec: 20275.2, 300 sec: 20438.4). Total num frames: 27787264. Throughput: 0: 4970.4. Samples: 5936894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:01:20,160][13933] Avg episode reward: [(0, '24.164')] [2023-09-14 13:01:20,898][13989] Updated weights for policy 0, policy_version 6788 (0.0006) [2023-09-14 13:01:23,014][13989] Updated weights for policy 0, policy_version 6798 (0.0006) [2023-09-14 13:01:25,127][13989] Updated weights for policy 0, policy_version 6808 (0.0006) [2023-09-14 13:01:25,160][13933] Fps is (10 sec: 19660.6, 60 sec: 20070.4, 300 sec: 20424.5). Total num frames: 27885568. Throughput: 0: 4954.2. Samples: 5966252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:01:25,160][13933] Avg episode reward: [(0, '26.644')] [2023-09-14 13:01:27,226][13989] Updated weights for policy 0, policy_version 6818 (0.0008) [2023-09-14 13:01:29,279][13989] Updated weights for policy 0, policy_version 6828 (0.0005) [2023-09-14 13:01:30,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19865.6, 300 sec: 20410.6). Total num frames: 27983872. Throughput: 0: 4939.0. Samples: 5980846. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-09-14 13:01:30,160][13933] Avg episode reward: [(0, '27.517')] [2023-09-14 13:01:31,336][13989] Updated weights for policy 0, policy_version 6838 (0.0005) [2023-09-14 13:01:33,411][13989] Updated weights for policy 0, policy_version 6848 (0.0006) [2023-09-14 13:01:35,160][13933] Fps is (10 sec: 19660.9, 60 sec: 19797.3, 300 sec: 20396.7). Total num frames: 28082176. Throughput: 0: 4923.3. Samples: 6010548. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:01:35,160][13933] Avg episode reward: [(0, '29.812')] [2023-09-14 13:01:35,163][13971] Saving new best policy, reward=29.812! [2023-09-14 13:01:35,450][13989] Updated weights for policy 0, policy_version 6858 (0.0005) [2023-09-14 13:01:37,504][13989] Updated weights for policy 0, policy_version 6868 (0.0009) [2023-09-14 13:01:39,576][13989] Updated weights for policy 0, policy_version 6878 (0.0006) [2023-09-14 13:01:40,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19797.4, 300 sec: 20382.8). Total num frames: 28180480. Throughput: 0: 4919.0. Samples: 6040476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:01:40,161][13933] Avg episode reward: [(0, '26.615')] [2023-09-14 13:01:41,620][13989] Updated weights for policy 0, policy_version 6888 (0.0006) [2023-09-14 13:01:43,676][13989] Updated weights for policy 0, policy_version 6898 (0.0006) [2023-09-14 13:01:45,160][13933] Fps is (10 sec: 20070.6, 60 sec: 19797.3, 300 sec: 20396.7). Total num frames: 28282880. Throughput: 0: 4917.5. Samples: 6055486. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:01:45,160][13933] Avg episode reward: [(0, '26.068')] [2023-09-14 13:01:45,162][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000006905_28282880.pth... 
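Two kinds of save events recur in this section: periodic "Saving .../checkpoint_p0/checkpoint_<policy_version>_<env_frames>.pth" records, each paired with the removal of one older checkpoint so that only the newest few periodic files survive, and "Saving new best policy, reward=...!" records that fire only when the reported average episode reward beats the previous best. A minimal sketch of that rotation and best-policy rule follows; the save_checkpoint function, the model_state_saver callable, the keep_latest=2 default and the best_* file name are assumptions of the sketch, not Sample Factory's own implementation (only the checkpoint_<version>_<frames>.pth naming is taken from the log).

import shutil
from pathlib import Path

def save_checkpoint(model_state_saver, ckpt_dir, policy_version, env_frames,
                    avg_reward, best_reward, keep_latest=2):
    """Illustrative checkpoint rotation and best-policy tracking.

    model_state_saver(path) is a hypothetical callable that writes the learner
    state to `path` (e.g. via torch.save); none of this is Sample Factory code.
    Returns the possibly updated best reward.
    """
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)

    # Periodic checkpoint, named checkpoint_<policy_version>_<env_frames>.pth as in the log.
    path = ckpt_dir / f"checkpoint_{policy_version:09d}_{env_frames}.pth"
    model_state_saver(path)
    print(f"Saving {path}...")

    # Keep only the newest `keep_latest` periodic checkpoints; zero-padded
    # version numbers make lexicographic order equal to chronological order.
    periodic = sorted(ckpt_dir.glob("checkpoint_*.pth"))
    for old in periodic[:-keep_latest]:
        print(f"Removing {old}")
        old.unlink()

    # Separate best-policy checkpoint, refreshed only when the reward improves.
    if avg_reward > best_reward:
        best_reward = avg_reward
        print(f"Saving new best policy, reward={best_reward:.3f}!")
        shutil.copy(path, ckpt_dir / f"best_{policy_version:09d}_{env_frames}.pth")
    return best_reward

Applied once per save interval with the running best reward, this improvement-only rule yields exactly the monotone sequence 29.019, 29.244, 29.664, 29.775, 29.812 seen in the "Saving new best policy" lines above.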
[2023-09-14 13:01:45,206][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000005709_23384064.pth [2023-09-14 13:01:45,759][13989] Updated weights for policy 0, policy_version 6908 (0.0005) [2023-09-14 13:01:47,811][13989] Updated weights for policy 0, policy_version 6918 (0.0006) [2023-09-14 13:01:49,847][13989] Updated weights for policy 0, policy_version 6928 (0.0005) [2023-09-14 13:01:50,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19729.1, 300 sec: 20382.8). Total num frames: 28381184. Throughput: 0: 4926.7. Samples: 6085270. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:01:50,160][13933] Avg episode reward: [(0, '26.706')] [2023-09-14 13:01:51,973][13989] Updated weights for policy 0, policy_version 6938 (0.0005) [2023-09-14 13:01:53,967][13989] Updated weights for policy 0, policy_version 6948 (0.0005) [2023-09-14 13:01:55,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19729.1, 300 sec: 20368.9). Total num frames: 28479488. Throughput: 0: 4933.5. Samples: 6115116. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-09-14 13:01:55,160][13933] Avg episode reward: [(0, '27.341')] [2023-09-14 13:01:56,013][13989] Updated weights for policy 0, policy_version 6958 (0.0005) [2023-09-14 13:01:58,047][13989] Updated weights for policy 0, policy_version 6968 (0.0008) [2023-09-14 13:02:00,115][13989] Updated weights for policy 0, policy_version 6978 (0.0006) [2023-09-14 13:02:00,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19797.3, 300 sec: 20368.9). Total num frames: 28581888. Throughput: 0: 4944.1. Samples: 6130218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:02:00,160][13933] Avg episode reward: [(0, '27.392')] [2023-09-14 13:02:02,243][13989] Updated weights for policy 0, policy_version 6988 (0.0011) [2023-09-14 13:02:04,291][13989] Updated weights for policy 0, policy_version 6998 (0.0005) [2023-09-14 13:02:05,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19797.3, 300 sec: 20368.9). Total num frames: 28680192. Throughput: 0: 4953.8. Samples: 6159814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:02:05,160][13933] Avg episode reward: [(0, '29.386')] [2023-09-14 13:02:06,399][13989] Updated weights for policy 0, policy_version 7008 (0.0005) [2023-09-14 13:02:08,492][13989] Updated weights for policy 0, policy_version 7018 (0.0006) [2023-09-14 13:02:10,160][13933] Fps is (10 sec: 19251.3, 60 sec: 19729.1, 300 sec: 20341.2). Total num frames: 28774400. Throughput: 0: 4950.8. Samples: 6189038. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:02:10,160][13933] Avg episode reward: [(0, '28.349')] [2023-09-14 13:02:10,614][13989] Updated weights for policy 0, policy_version 7028 (0.0006) [2023-09-14 13:02:12,703][13989] Updated weights for policy 0, policy_version 7038 (0.0008) [2023-09-14 13:02:14,770][13989] Updated weights for policy 0, policy_version 7048 (0.0006) [2023-09-14 13:02:15,160][13933] Fps is (10 sec: 19251.3, 60 sec: 19729.1, 300 sec: 20327.3). Total num frames: 28872704. Throughput: 0: 4952.9. Samples: 6203728. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:02:15,160][13933] Avg episode reward: [(0, '29.142')] [2023-09-14 13:02:16,810][13989] Updated weights for policy 0, policy_version 7058 (0.0011) [2023-09-14 13:02:18,883][13989] Updated weights for policy 0, policy_version 7068 (0.0006) [2023-09-14 13:02:20,160][13933] Fps is (10 sec: 20070.6, 60 sec: 19797.4, 300 sec: 20341.2). Total num frames: 28975104. Throughput: 0: 4955.8. Samples: 6233560. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:02:20,160][13933] Avg episode reward: [(0, '29.838')] [2023-09-14 13:02:20,160][13971] Saving new best policy, reward=29.838! [2023-09-14 13:02:20,951][13989] Updated weights for policy 0, policy_version 7078 (0.0005) [2023-09-14 13:02:23,012][13989] Updated weights for policy 0, policy_version 7088 (0.0011) [2023-09-14 13:02:25,059][13989] Updated weights for policy 0, policy_version 7098 (0.0008) [2023-09-14 13:02:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19797.4, 300 sec: 20327.3). Total num frames: 29073408. Throughput: 0: 4954.0. Samples: 6263408. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:02:25,160][13933] Avg episode reward: [(0, '28.757')] [2023-09-14 13:02:27,150][13989] Updated weights for policy 0, policy_version 7108 (0.0006) [2023-09-14 13:02:29,195][13989] Updated weights for policy 0, policy_version 7118 (0.0006) [2023-09-14 13:02:30,160][13933] Fps is (10 sec: 19660.6, 60 sec: 19797.3, 300 sec: 20313.4). Total num frames: 29171712. Throughput: 0: 4948.1. Samples: 6278150. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:02:30,160][13933] Avg episode reward: [(0, '28.815')] [2023-09-14 13:02:31,223][13989] Updated weights for policy 0, policy_version 7128 (0.0006) [2023-09-14 13:02:33,330][13989] Updated weights for policy 0, policy_version 7138 (0.0006) [2023-09-14 13:02:35,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19865.6, 300 sec: 20313.4). Total num frames: 29274112. Throughput: 0: 4948.5. Samples: 6307954. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:02:35,160][13933] Avg episode reward: [(0, '28.645')] [2023-09-14 13:02:35,392][13989] Updated weights for policy 0, policy_version 7148 (0.0005) [2023-09-14 13:02:37,446][13989] Updated weights for policy 0, policy_version 7158 (0.0005) [2023-09-14 13:02:39,523][13989] Updated weights for policy 0, policy_version 7168 (0.0005) [2023-09-14 13:02:40,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19797.3, 300 sec: 20285.6). Total num frames: 29368320. Throughput: 0: 4946.5. Samples: 6337710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:02:40,160][13933] Avg episode reward: [(0, '28.573')] [2023-09-14 13:02:41,601][13989] Updated weights for policy 0, policy_version 7178 (0.0006) [2023-09-14 13:02:43,644][13989] Updated weights for policy 0, policy_version 7188 (0.0005) [2023-09-14 13:02:45,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19797.3, 300 sec: 20285.6). Total num frames: 29470720. Throughput: 0: 4940.9. Samples: 6352560. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:02:45,160][13933] Avg episode reward: [(0, '28.565')] [2023-09-14 13:02:45,706][13989] Updated weights for policy 0, policy_version 7198 (0.0008) [2023-09-14 13:02:47,759][13989] Updated weights for policy 0, policy_version 7208 (0.0011) [2023-09-14 13:02:49,817][13989] Updated weights for policy 0, policy_version 7218 (0.0006) [2023-09-14 13:02:50,160][13933] Fps is (10 sec: 20070.5, 60 sec: 19797.3, 300 sec: 20285.6). Total num frames: 29569024. Throughput: 0: 4945.6. Samples: 6382366. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:02:50,160][13933] Avg episode reward: [(0, '28.983')] [2023-09-14 13:02:51,864][13989] Updated weights for policy 0, policy_version 7228 (0.0006) [2023-09-14 13:02:53,934][13989] Updated weights for policy 0, policy_version 7238 (0.0005) [2023-09-14 13:02:55,160][13933] Fps is (10 sec: 19660.9, 60 sec: 19797.4, 300 sec: 20271.7). Total num frames: 29667328. 
Throughput: 0: 4963.0. Samples: 6412372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:02:55,160][13933] Avg episode reward: [(0, '28.322')] [2023-09-14 13:02:55,995][13989] Updated weights for policy 0, policy_version 7248 (0.0005) [2023-09-14 13:02:58,038][13989] Updated weights for policy 0, policy_version 7258 (0.0006) [2023-09-14 13:03:00,098][13989] Updated weights for policy 0, policy_version 7268 (0.0006) [2023-09-14 13:03:00,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19797.3, 300 sec: 20271.7). Total num frames: 29769728. Throughput: 0: 4966.4. Samples: 6427218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:03:00,160][13933] Avg episode reward: [(0, '25.118')] [2023-09-14 13:03:02,198][13989] Updated weights for policy 0, policy_version 7278 (0.0005) [2023-09-14 13:03:04,227][13989] Updated weights for policy 0, policy_version 7288 (0.0006) [2023-09-14 13:03:05,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19797.3, 300 sec: 20257.8). Total num frames: 29868032. Throughput: 0: 4964.3. Samples: 6456956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:03:05,160][13933] Avg episode reward: [(0, '27.009')] [2023-09-14 13:03:06,268][13989] Updated weights for policy 0, policy_version 7298 (0.0008) [2023-09-14 13:03:08,348][13989] Updated weights for policy 0, policy_version 7308 (0.0006) [2023-09-14 13:03:10,160][13933] Fps is (10 sec: 19660.4, 60 sec: 19865.5, 300 sec: 20243.9). Total num frames: 29966336. Throughput: 0: 4965.8. Samples: 6486868. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:03:10,160][13933] Avg episode reward: [(0, '25.640')] [2023-09-14 13:03:10,390][13989] Updated weights for policy 0, policy_version 7318 (0.0006) [2023-09-14 13:03:12,463][13989] Updated weights for policy 0, policy_version 7328 (0.0005) [2023-09-14 13:03:14,478][13989] Updated weights for policy 0, policy_version 7338 (0.0008) [2023-09-14 13:03:15,159][13933] Fps is (10 sec: 20070.6, 60 sec: 19933.9, 300 sec: 20244.0). Total num frames: 30068736. Throughput: 0: 4972.8. Samples: 6501924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:03:15,160][13933] Avg episode reward: [(0, '24.738')] [2023-09-14 13:03:16,537][13989] Updated weights for policy 0, policy_version 7348 (0.0006) [2023-09-14 13:03:18,548][13989] Updated weights for policy 0, policy_version 7358 (0.0005) [2023-09-14 13:03:20,160][13933] Fps is (10 sec: 20070.8, 60 sec: 19865.6, 300 sec: 20244.0). Total num frames: 30167040. Throughput: 0: 4979.3. Samples: 6532024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:03:20,160][13933] Avg episode reward: [(0, '26.361')] [2023-09-14 13:03:20,643][13989] Updated weights for policy 0, policy_version 7368 (0.0006) [2023-09-14 13:03:22,660][13989] Updated weights for policy 0, policy_version 7378 (0.0005) [2023-09-14 13:03:24,701][13989] Updated weights for policy 0, policy_version 7388 (0.0008) [2023-09-14 13:03:25,160][13933] Fps is (10 sec: 20070.1, 60 sec: 19933.9, 300 sec: 20244.0). Total num frames: 30269440. Throughput: 0: 4986.3. Samples: 6562094. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:03:25,161][13933] Avg episode reward: [(0, '27.471')] [2023-09-14 13:03:26,756][13989] Updated weights for policy 0, policy_version 7398 (0.0011) [2023-09-14 13:03:28,808][13989] Updated weights for policy 0, policy_version 7408 (0.0006) [2023-09-14 13:03:30,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 20230.1). Total num frames: 30367744. Throughput: 0: 4987.9. Samples: 6577016. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:03:30,160][13933] Avg episode reward: [(0, '27.368')] [2023-09-14 13:03:30,862][13989] Updated weights for policy 0, policy_version 7418 (0.0008) [2023-09-14 13:03:32,899][13989] Updated weights for policy 0, policy_version 7428 (0.0006) [2023-09-14 13:03:34,962][13989] Updated weights for policy 0, policy_version 7438 (0.0012) [2023-09-14 13:03:35,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19865.6, 300 sec: 20230.1). Total num frames: 30466048. Throughput: 0: 4992.4. Samples: 6607026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:03:35,160][13933] Avg episode reward: [(0, '27.947')] [2023-09-14 13:03:37,031][13989] Updated weights for policy 0, policy_version 7448 (0.0008) [2023-09-14 13:03:39,094][13989] Updated weights for policy 0, policy_version 7458 (0.0008) [2023-09-14 13:03:40,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20002.2, 300 sec: 20230.1). Total num frames: 30568448. Throughput: 0: 4987.0. Samples: 6636786. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:03:40,160][13933] Avg episode reward: [(0, '32.212')] [2023-09-14 13:03:40,160][13971] Saving new best policy, reward=32.212! [2023-09-14 13:03:41,200][13989] Updated weights for policy 0, policy_version 7468 (0.0009) [2023-09-14 13:03:43,230][13989] Updated weights for policy 0, policy_version 7478 (0.0006) [2023-09-14 13:03:45,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 20216.2). Total num frames: 30666752. Throughput: 0: 4983.7. Samples: 6651484. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:03:45,160][13933] Avg episode reward: [(0, '27.949')] [2023-09-14 13:03:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000007487_30666752.pth... [2023-09-14 13:03:45,207][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000006312_25853952.pth [2023-09-14 13:03:45,364][13989] Updated weights for policy 0, policy_version 7488 (0.0005) [2023-09-14 13:03:47,431][13989] Updated weights for policy 0, policy_version 7498 (0.0009) [2023-09-14 13:03:49,517][13989] Updated weights for policy 0, policy_version 7508 (0.0005) [2023-09-14 13:03:50,160][13933] Fps is (10 sec: 19660.7, 60 sec: 19933.9, 300 sec: 20202.3). Total num frames: 30765056. Throughput: 0: 4980.0. Samples: 6681056. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:03:50,160][13933] Avg episode reward: [(0, '28.684')] [2023-09-14 13:03:51,576][13989] Updated weights for policy 0, policy_version 7518 (0.0005) [2023-09-14 13:03:53,611][13989] Updated weights for policy 0, policy_version 7528 (0.0009) [2023-09-14 13:03:55,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19933.8, 300 sec: 20202.3). Total num frames: 30863360. Throughput: 0: 4975.5. Samples: 6710766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:03:55,160][13933] Avg episode reward: [(0, '28.611')] [2023-09-14 13:03:55,732][13989] Updated weights for policy 0, policy_version 7538 (0.0008) [2023-09-14 13:03:57,807][13989] Updated weights for policy 0, policy_version 7548 (0.0008) [2023-09-14 13:03:59,871][13989] Updated weights for policy 0, policy_version 7558 (0.0008) [2023-09-14 13:04:00,160][13933] Fps is (10 sec: 19660.7, 60 sec: 19865.6, 300 sec: 20188.4). Total num frames: 30961664. Throughput: 0: 4965.4. Samples: 6725366. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:04:00,160][13933] Avg episode reward: [(0, '27.264')] [2023-09-14 13:04:01,985][13989] Updated weights for policy 0, policy_version 7568 (0.0005) [2023-09-14 13:04:04,029][13989] Updated weights for policy 0, policy_version 7578 (0.0005) [2023-09-14 13:04:05,160][13933] Fps is (10 sec: 19660.5, 60 sec: 19865.5, 300 sec: 20174.5). Total num frames: 31059968. Throughput: 0: 4955.8. Samples: 6755036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:04:05,160][13933] Avg episode reward: [(0, '28.701')] [2023-09-14 13:04:06,047][13989] Updated weights for policy 0, policy_version 7588 (0.0006) [2023-09-14 13:04:08,104][13989] Updated weights for policy 0, policy_version 7598 (0.0008) [2023-09-14 13:04:10,146][13989] Updated weights for policy 0, policy_version 7608 (0.0006) [2023-09-14 13:04:10,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 20174.5). Total num frames: 31162368. Throughput: 0: 4956.1. Samples: 6785120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:04:10,160][13933] Avg episode reward: [(0, '27.253')] [2023-09-14 13:04:12,225][13989] Updated weights for policy 0, policy_version 7618 (0.0006) [2023-09-14 13:04:14,294][13989] Updated weights for policy 0, policy_version 7628 (0.0005) [2023-09-14 13:04:15,160][13933] Fps is (10 sec: 20070.7, 60 sec: 19865.6, 300 sec: 20174.5). Total num frames: 31260672. Throughput: 0: 4953.2. Samples: 6799912. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:04:15,160][13933] Avg episode reward: [(0, '26.117')] [2023-09-14 13:04:16,340][13989] Updated weights for policy 0, policy_version 7638 (0.0005) [2023-09-14 13:04:18,365][13989] Updated weights for policy 0, policy_version 7648 (0.0008) [2023-09-14 13:04:20,160][13933] Fps is (10 sec: 19660.9, 60 sec: 19865.6, 300 sec: 20146.8). Total num frames: 31358976. Throughput: 0: 4954.6. Samples: 6829982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:04:20,160][13933] Avg episode reward: [(0, '25.849')] [2023-09-14 13:04:20,386][13989] Updated weights for policy 0, policy_version 7658 (0.0005) [2023-09-14 13:04:22,445][13989] Updated weights for policy 0, policy_version 7668 (0.0008) [2023-09-14 13:04:24,459][13989] Updated weights for policy 0, policy_version 7678 (0.0006) [2023-09-14 13:04:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19865.6, 300 sec: 20119.0). Total num frames: 31461376. Throughput: 0: 4965.8. Samples: 6860246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:04:25,160][13933] Avg episode reward: [(0, '26.478')] [2023-09-14 13:04:26,503][13989] Updated weights for policy 0, policy_version 7688 (0.0008) [2023-09-14 13:04:28,579][13989] Updated weights for policy 0, policy_version 7698 (0.0008) [2023-09-14 13:04:30,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19865.6, 300 sec: 20077.3). Total num frames: 31559680. Throughput: 0: 4971.7. Samples: 6875212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:04:30,160][13933] Avg episode reward: [(0, '30.650')] [2023-09-14 13:04:30,634][13989] Updated weights for policy 0, policy_version 7708 (0.0008) [2023-09-14 13:04:32,692][13989] Updated weights for policy 0, policy_version 7718 (0.0010) [2023-09-14 13:04:34,732][13989] Updated weights for policy 0, policy_version 7728 (0.0006) [2023-09-14 13:04:35,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 20063.5). Total num frames: 31662080. Throughput: 0: 4978.0. Samples: 6905068. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:04:35,160][13933] Avg episode reward: [(0, '29.604')] [2023-09-14 13:04:36,793][13989] Updated weights for policy 0, policy_version 7738 (0.0006) [2023-09-14 13:04:38,799][13989] Updated weights for policy 0, policy_version 7748 (0.0005) [2023-09-14 13:04:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19865.6, 300 sec: 20021.8). Total num frames: 31760384. Throughput: 0: 4987.9. Samples: 6935222. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:04:40,160][13933] Avg episode reward: [(0, '27.049')] [2023-09-14 13:04:40,846][13989] Updated weights for policy 0, policy_version 7758 (0.0005) [2023-09-14 13:04:42,891][13989] Updated weights for policy 0, policy_version 7768 (0.0006) [2023-09-14 13:04:44,907][13989] Updated weights for policy 0, policy_version 7778 (0.0005) [2023-09-14 13:04:45,161][13933] Fps is (10 sec: 20067.8, 60 sec: 19933.4, 300 sec: 20007.8). Total num frames: 31862784. Throughput: 0: 4999.7. Samples: 6950360. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:04:45,161][13933] Avg episode reward: [(0, '28.374')] [2023-09-14 13:04:46,961][13989] Updated weights for policy 0, policy_version 7788 (0.0006) [2023-09-14 13:04:48,998][13989] Updated weights for policy 0, policy_version 7798 (0.0011) [2023-09-14 13:04:50,160][13933] Fps is (10 sec: 20070.2, 60 sec: 19933.8, 300 sec: 19966.3). Total num frames: 31961088. Throughput: 0: 5007.1. Samples: 6980354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:04:50,160][13933] Avg episode reward: [(0, '28.122')] [2023-09-14 13:04:51,043][13989] Updated weights for policy 0, policy_version 7808 (0.0006) [2023-09-14 13:04:53,079][13989] Updated weights for policy 0, policy_version 7818 (0.0006) [2023-09-14 13:04:55,116][13989] Updated weights for policy 0, policy_version 7828 (0.0005) [2023-09-14 13:04:55,160][13933] Fps is (10 sec: 20072.9, 60 sec: 20002.1, 300 sec: 19994.0). Total num frames: 32063488. Throughput: 0: 5008.9. Samples: 7010522. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:04:55,160][13933] Avg episode reward: [(0, '27.012')] [2023-09-14 13:04:57,213][13989] Updated weights for policy 0, policy_version 7838 (0.0005) [2023-09-14 13:04:59,258][13989] Updated weights for policy 0, policy_version 7848 (0.0005) [2023-09-14 13:05:00,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19994.0). Total num frames: 32161792. Throughput: 0: 5009.0. Samples: 7025318. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:05:00,160][13933] Avg episode reward: [(0, '28.857')] [2023-09-14 13:05:01,348][13989] Updated weights for policy 0, policy_version 7858 (0.0009) [2023-09-14 13:05:03,427][13989] Updated weights for policy 0, policy_version 7868 (0.0011) [2023-09-14 13:05:05,160][13933] Fps is (10 sec: 19660.6, 60 sec: 20002.1, 300 sec: 19994.0). Total num frames: 32260096. Throughput: 0: 4999.9. Samples: 7054980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:05:05,160][13933] Avg episode reward: [(0, '29.707')] [2023-09-14 13:05:05,463][13989] Updated weights for policy 0, policy_version 7878 (0.0005) [2023-09-14 13:05:07,459][13989] Updated weights for policy 0, policy_version 7888 (0.0005) [2023-09-14 13:05:09,477][13989] Updated weights for policy 0, policy_version 7898 (0.0011) [2023-09-14 13:05:10,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20002.2, 300 sec: 20007.9). Total num frames: 32362496. Throughput: 0: 5004.8. Samples: 7085464. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:05:10,160][13933] Avg episode reward: [(0, '27.323')] [2023-09-14 13:05:11,527][13989] Updated weights for policy 0, policy_version 7908 (0.0005) [2023-09-14 13:05:13,535][13989] Updated weights for policy 0, policy_version 7918 (0.0005) [2023-09-14 13:05:15,160][13933] Fps is (10 sec: 20070.6, 60 sec: 20002.1, 300 sec: 19966.3). Total num frames: 32460800. Throughput: 0: 5005.0. Samples: 7100436. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:05:15,160][13933] Avg episode reward: [(0, '28.639')] [2023-09-14 13:05:15,567][13989] Updated weights for policy 0, policy_version 7928 (0.0008) [2023-09-14 13:05:17,615][13989] Updated weights for policy 0, policy_version 7938 (0.0005) [2023-09-14 13:05:19,692][13989] Updated weights for policy 0, policy_version 7948 (0.0008) [2023-09-14 13:05:20,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20070.4, 300 sec: 19938.5). Total num frames: 32563200. Throughput: 0: 5012.7. Samples: 7130638. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:05:20,160][13933] Avg episode reward: [(0, '26.945')] [2023-09-14 13:05:21,749][13989] Updated weights for policy 0, policy_version 7958 (0.0008) [2023-09-14 13:05:23,780][13989] Updated weights for policy 0, policy_version 7968 (0.0006) [2023-09-14 13:05:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19896.8). Total num frames: 32661504. Throughput: 0: 5007.5. Samples: 7160558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:05:25,160][13933] Avg episode reward: [(0, '28.024')] [2023-09-14 13:05:25,863][13989] Updated weights for policy 0, policy_version 7978 (0.0005) [2023-09-14 13:05:27,881][13989] Updated weights for policy 0, policy_version 7988 (0.0008) [2023-09-14 13:05:29,951][13989] Updated weights for policy 0, policy_version 7998 (0.0006) [2023-09-14 13:05:30,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20002.1, 300 sec: 19883.0). Total num frames: 32759808. Throughput: 0: 5004.4. Samples: 7175554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:05:30,167][13933] Avg episode reward: [(0, '27.604')] [2023-09-14 13:05:31,974][13989] Updated weights for policy 0, policy_version 8008 (0.0008) [2023-09-14 13:05:34,003][13989] Updated weights for policy 0, policy_version 8018 (0.0005) [2023-09-14 13:05:35,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19896.8). Total num frames: 32862208. Throughput: 0: 5007.1. Samples: 7205672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:05:35,160][13933] Avg episode reward: [(0, '26.749')] [2023-09-14 13:05:36,053][13989] Updated weights for policy 0, policy_version 8028 (0.0008) [2023-09-14 13:05:38,066][13989] Updated weights for policy 0, policy_version 8038 (0.0005) [2023-09-14 13:05:40,084][13989] Updated weights for policy 0, policy_version 8048 (0.0008) [2023-09-14 13:05:40,160][13933] Fps is (10 sec: 20480.2, 60 sec: 20070.4, 300 sec: 19896.8). Total num frames: 32964608. Throughput: 0: 5011.3. Samples: 7236030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:05:40,160][13933] Avg episode reward: [(0, '29.817')] [2023-09-14 13:05:42,168][13989] Updated weights for policy 0, policy_version 8058 (0.0006) [2023-09-14 13:05:44,180][13989] Updated weights for policy 0, policy_version 8068 (0.0011) [2023-09-14 13:05:45,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.6, 300 sec: 19883.0). Total num frames: 33062912. Throughput: 0: 5009.5. Samples: 7250746. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:05:45,160][13933] Avg episode reward: [(0, '29.297')] [2023-09-14 13:05:45,200][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000008073_33067008.pth... [2023-09-14 13:05:45,249][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000006905_28282880.pth [2023-09-14 13:05:46,208][13989] Updated weights for policy 0, policy_version 8078 (0.0005) [2023-09-14 13:05:48,287][13989] Updated weights for policy 0, policy_version 8088 (0.0005) [2023-09-14 13:05:50,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20070.4, 300 sec: 19896.8). Total num frames: 33165312. Throughput: 0: 5025.2. Samples: 7281112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:05:50,173][13933] Avg episode reward: [(0, '28.176')] [2023-09-14 13:05:50,318][13989] Updated weights for policy 0, policy_version 8098 (0.0006) [2023-09-14 13:05:52,321][13989] Updated weights for policy 0, policy_version 8108 (0.0005) [2023-09-14 13:05:54,366][13989] Updated weights for policy 0, policy_version 8118 (0.0008) [2023-09-14 13:05:55,160][13933] Fps is (10 sec: 20479.9, 60 sec: 20070.4, 300 sec: 19910.7). Total num frames: 33267712. Throughput: 0: 5020.5. Samples: 7311386. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:05:55,160][13933] Avg episode reward: [(0, '30.585')] [2023-09-14 13:05:56,388][13989] Updated weights for policy 0, policy_version 8128 (0.0006) [2023-09-14 13:05:58,383][13989] Updated weights for policy 0, policy_version 8138 (0.0005) [2023-09-14 13:06:00,164][13933] Fps is (10 sec: 20061.6, 60 sec: 20069.0, 300 sec: 19910.4). Total num frames: 33366016. Throughput: 0: 5022.9. Samples: 7326488. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:06:00,164][13933] Avg episode reward: [(0, '27.222')] [2023-09-14 13:06:00,421][13989] Updated weights for policy 0, policy_version 8148 (0.0005) [2023-09-14 13:06:02,480][13989] Updated weights for policy 0, policy_version 8158 (0.0011) [2023-09-14 13:06:04,474][13989] Updated weights for policy 0, policy_version 8168 (0.0005) [2023-09-14 13:06:05,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20138.7, 300 sec: 19924.6). Total num frames: 33468416. Throughput: 0: 5026.3. Samples: 7356822. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:06:05,160][13933] Avg episode reward: [(0, '28.190')] [2023-09-14 13:06:06,481][13989] Updated weights for policy 0, policy_version 8178 (0.0008) [2023-09-14 13:06:08,492][13989] Updated weights for policy 0, policy_version 8188 (0.0005) [2023-09-14 13:06:10,160][13933] Fps is (10 sec: 20488.7, 60 sec: 20138.6, 300 sec: 19938.5). Total num frames: 33570816. Throughput: 0: 5039.5. Samples: 7387338. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:06:10,160][13933] Avg episode reward: [(0, '29.750')] [2023-09-14 13:06:10,554][13989] Updated weights for policy 0, policy_version 8198 (0.0006) [2023-09-14 13:06:12,619][13989] Updated weights for policy 0, policy_version 8208 (0.0005) [2023-09-14 13:06:14,656][13989] Updated weights for policy 0, policy_version 8218 (0.0006) [2023-09-14 13:06:15,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 19938.5). Total num frames: 33669120. Throughput: 0: 5035.6. Samples: 7402154. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:06:15,160][13933] Avg episode reward: [(0, '28.481')] [2023-09-14 13:06:16,686][13989] Updated weights for policy 0, policy_version 8228 (0.0005) [2023-09-14 13:06:18,744][13989] Updated weights for policy 0, policy_version 8238 (0.0005) [2023-09-14 13:06:20,160][13933] Fps is (10 sec: 19660.6, 60 sec: 20070.3, 300 sec: 19938.5). Total num frames: 33767424. Throughput: 0: 5033.3. Samples: 7432170. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:06:20,160][13933] Avg episode reward: [(0, '26.932')] [2023-09-14 13:06:20,804][13989] Updated weights for policy 0, policy_version 8248 (0.0006) [2023-09-14 13:06:22,836][13989] Updated weights for policy 0, policy_version 8258 (0.0005) [2023-09-14 13:06:24,830][13989] Updated weights for policy 0, policy_version 8268 (0.0008) [2023-09-14 13:06:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 19952.4). Total num frames: 33869824. Throughput: 0: 5031.9. Samples: 7462466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:06:25,160][13933] Avg episode reward: [(0, '27.078')] [2023-09-14 13:06:26,908][13989] Updated weights for policy 0, policy_version 8278 (0.0008) [2023-09-14 13:06:28,924][13989] Updated weights for policy 0, policy_version 8288 (0.0005) [2023-09-14 13:06:30,160][13933] Fps is (10 sec: 20070.9, 60 sec: 20138.7, 300 sec: 19952.4). Total num frames: 33968128. Throughput: 0: 5039.0. Samples: 7477502. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:06:30,160][13933] Avg episode reward: [(0, '27.726')] [2023-09-14 13:06:31,000][13989] Updated weights for policy 0, policy_version 8298 (0.0007) [2023-09-14 13:06:33,003][13989] Updated weights for policy 0, policy_version 8308 (0.0008) [2023-09-14 13:06:35,071][13989] Updated weights for policy 0, policy_version 8318 (0.0005) [2023-09-14 13:06:35,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20138.7, 300 sec: 19966.3). Total num frames: 34070528. Throughput: 0: 5031.7. Samples: 7507538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:06:35,160][13933] Avg episode reward: [(0, '26.275')] [2023-09-14 13:06:37,146][13989] Updated weights for policy 0, policy_version 8328 (0.0005) [2023-09-14 13:06:39,195][13989] Updated weights for policy 0, policy_version 8338 (0.0006) [2023-09-14 13:06:40,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20070.4, 300 sec: 19952.4). Total num frames: 34168832. Throughput: 0: 5020.3. Samples: 7537300. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:06:40,160][13933] Avg episode reward: [(0, '27.517')] [2023-09-14 13:06:41,281][13989] Updated weights for policy 0, policy_version 8348 (0.0006) [2023-09-14 13:06:43,380][13989] Updated weights for policy 0, policy_version 8358 (0.0005) [2023-09-14 13:06:45,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20070.4, 300 sec: 19952.4). Total num frames: 34267136. Throughput: 0: 5014.8. Samples: 7552130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:06:45,160][13933] Avg episode reward: [(0, '28.397')] [2023-09-14 13:06:45,419][13989] Updated weights for policy 0, policy_version 8368 (0.0006) [2023-09-14 13:06:47,475][13989] Updated weights for policy 0, policy_version 8378 (0.0009) [2023-09-14 13:06:49,546][13989] Updated weights for policy 0, policy_version 8388 (0.0005) [2023-09-14 13:06:50,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20070.4, 300 sec: 19966.3). Total num frames: 34369536. Throughput: 0: 5002.6. Samples: 7581938. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:06:50,160][13933] Avg episode reward: [(0, '28.693')] [2023-09-14 13:06:51,599][13989] Updated weights for policy 0, policy_version 8398 (0.0008) [2023-09-14 13:06:53,701][13989] Updated weights for policy 0, policy_version 8408 (0.0006) [2023-09-14 13:06:55,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20002.2, 300 sec: 19952.4). Total num frames: 34467840. Throughput: 0: 4983.4. Samples: 7611588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:06:55,160][13933] Avg episode reward: [(0, '27.747')] [2023-09-14 13:06:55,791][13989] Updated weights for policy 0, policy_version 8418 (0.0008) [2023-09-14 13:06:57,867][13989] Updated weights for policy 0, policy_version 8428 (0.0008) [2023-09-14 13:06:59,912][13989] Updated weights for policy 0, policy_version 8438 (0.0008) [2023-09-14 13:07:00,160][13933] Fps is (10 sec: 19660.5, 60 sec: 20003.6, 300 sec: 19952.4). Total num frames: 34566144. Throughput: 0: 4981.5. Samples: 7626324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:00,160][13933] Avg episode reward: [(0, '29.583')] [2023-09-14 13:07:02,015][13989] Updated weights for policy 0, policy_version 8448 (0.0008) [2023-09-14 13:07:04,065][13989] Updated weights for policy 0, policy_version 8458 (0.0008) [2023-09-14 13:07:05,160][13933] Fps is (10 sec: 19660.5, 60 sec: 19933.8, 300 sec: 19966.3). Total num frames: 34664448. Throughput: 0: 4974.8. Samples: 7656034. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:05,160][13933] Avg episode reward: [(0, '28.314')] [2023-09-14 13:07:06,147][13989] Updated weights for policy 0, policy_version 8468 (0.0008) [2023-09-14 13:07:08,233][13989] Updated weights for policy 0, policy_version 8478 (0.0008) [2023-09-14 13:07:10,160][13933] Fps is (10 sec: 19660.9, 60 sec: 19865.6, 300 sec: 19966.3). Total num frames: 34762752. Throughput: 0: 4961.3. Samples: 7685724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:07:10,161][13933] Avg episode reward: [(0, '26.791')] [2023-09-14 13:07:10,260][13989] Updated weights for policy 0, policy_version 8488 (0.0006) [2023-09-14 13:07:12,330][13989] Updated weights for policy 0, policy_version 8498 (0.0005) [2023-09-14 13:07:14,387][13989] Updated weights for policy 0, policy_version 8508 (0.0006) [2023-09-14 13:07:15,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19865.6, 300 sec: 19952.4). Total num frames: 34861056. Throughput: 0: 4958.0. Samples: 7700614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:07:15,161][13933] Avg episode reward: [(0, '28.066')] [2023-09-14 13:07:16,452][13989] Updated weights for policy 0, policy_version 8518 (0.0006) [2023-09-14 13:07:18,557][13989] Updated weights for policy 0, policy_version 8528 (0.0008) [2023-09-14 13:07:20,160][13933] Fps is (10 sec: 20070.7, 60 sec: 19934.0, 300 sec: 19966.3). Total num frames: 34963456. Throughput: 0: 4950.3. Samples: 7730300. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:07:20,160][13933] Avg episode reward: [(0, '28.864')] [2023-09-14 13:07:20,572][13989] Updated weights for policy 0, policy_version 8538 (0.0005) [2023-09-14 13:07:22,570][13989] Updated weights for policy 0, policy_version 8548 (0.0005) [2023-09-14 13:07:24,596][13989] Updated weights for policy 0, policy_version 8558 (0.0008) [2023-09-14 13:07:25,160][13933] Fps is (10 sec: 20070.6, 60 sec: 19865.6, 300 sec: 19966.3). Total num frames: 35061760. Throughput: 0: 4963.9. Samples: 7760674. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:25,161][13933] Avg episode reward: [(0, '27.899')] [2023-09-14 13:07:26,615][13989] Updated weights for policy 0, policy_version 8568 (0.0008) [2023-09-14 13:07:28,635][13989] Updated weights for policy 0, policy_version 8578 (0.0005) [2023-09-14 13:07:30,160][13933] Fps is (10 sec: 20070.3, 60 sec: 19933.9, 300 sec: 19966.3). Total num frames: 35164160. Throughput: 0: 4971.3. Samples: 7775838. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:30,173][13933] Avg episode reward: [(0, '28.055')] [2023-09-14 13:07:30,678][13989] Updated weights for policy 0, policy_version 8588 (0.0008) [2023-09-14 13:07:32,702][13989] Updated weights for policy 0, policy_version 8598 (0.0006) [2023-09-14 13:07:34,691][13989] Updated weights for policy 0, policy_version 8608 (0.0008) [2023-09-14 13:07:35,160][13933] Fps is (10 sec: 20480.1, 60 sec: 19933.9, 300 sec: 19994.0). Total num frames: 35266560. Throughput: 0: 4983.1. Samples: 7806176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 3.0) [2023-09-14 13:07:35,161][13933] Avg episode reward: [(0, '29.583')] [2023-09-14 13:07:36,766][13989] Updated weights for policy 0, policy_version 8618 (0.0006) [2023-09-14 13:07:38,729][13989] Updated weights for policy 0, policy_version 8628 (0.0008) [2023-09-14 13:07:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 19980.1). Total num frames: 35364864. Throughput: 0: 4999.2. Samples: 7836554. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:40,161][13933] Avg episode reward: [(0, '29.291')] [2023-09-14 13:07:40,776][13989] Updated weights for policy 0, policy_version 8638 (0.0005) [2023-09-14 13:07:42,839][13989] Updated weights for policy 0, policy_version 8648 (0.0005) [2023-09-14 13:07:44,911][13989] Updated weights for policy 0, policy_version 8658 (0.0005) [2023-09-14 13:07:45,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20002.1, 300 sec: 19994.0). Total num frames: 35467264. Throughput: 0: 5006.2. Samples: 7851604. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:45,161][13933] Avg episode reward: [(0, '29.378')] [2023-09-14 13:07:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000008659_35467264.pth... [2023-09-14 13:07:45,211][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000007487_30666752.pth [2023-09-14 13:07:46,935][13989] Updated weights for policy 0, policy_version 8668 (0.0011) [2023-09-14 13:07:48,963][13989] Updated weights for policy 0, policy_version 8678 (0.0005) [2023-09-14 13:07:50,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 19994.0). Total num frames: 35565568. Throughput: 0: 5012.7. Samples: 7881606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:07:50,160][13933] Avg episode reward: [(0, '30.029')] [2023-09-14 13:07:51,030][13989] Updated weights for policy 0, policy_version 8688 (0.0005) [2023-09-14 13:07:53,068][13989] Updated weights for policy 0, policy_version 8698 (0.0005) [2023-09-14 13:07:55,150][13989] Updated weights for policy 0, policy_version 8708 (0.0006) [2023-09-14 13:07:55,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20002.1, 300 sec: 19994.0). Total num frames: 35667968. Throughput: 0: 5019.7. Samples: 7911612. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:07:55,160][13933] Avg episode reward: [(0, '28.776')] [2023-09-14 13:07:57,161][13989] Updated weights for policy 0, policy_version 8718 (0.0005) [2023-09-14 13:07:59,218][13989] Updated weights for policy 0, policy_version 8728 (0.0011) [2023-09-14 13:08:00,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20002.2, 300 sec: 19994.0). Total num frames: 35766272. Throughput: 0: 5022.9. Samples: 7926642. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:08:00,160][13933] Avg episode reward: [(0, '28.097')] [2023-09-14 13:08:01,290][13989] Updated weights for policy 0, policy_version 8738 (0.0011) [2023-09-14 13:08:03,339][13989] Updated weights for policy 0, policy_version 8748 (0.0008) [2023-09-14 13:08:05,160][13933] Fps is (10 sec: 19660.7, 60 sec: 20002.2, 300 sec: 19994.0). Total num frames: 35864576. Throughput: 0: 5026.0. Samples: 7956472. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:08:05,160][13933] Avg episode reward: [(0, '26.821')] [2023-09-14 13:08:05,391][13989] Updated weights for policy 0, policy_version 8758 (0.0005) [2023-09-14 13:08:07,463][13989] Updated weights for policy 0, policy_version 8768 (0.0006) [2023-09-14 13:08:09,500][13989] Updated weights for policy 0, policy_version 8778 (0.0005) [2023-09-14 13:08:10,160][13933] Fps is (10 sec: 20070.5, 60 sec: 20070.4, 300 sec: 19994.0). Total num frames: 35966976. Throughput: 0: 5018.3. Samples: 7986496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:08:10,160][13933] Avg episode reward: [(0, '28.351')] [2023-09-14 13:08:11,605][13989] Updated weights for policy 0, policy_version 8788 (0.0005) [2023-09-14 13:08:13,716][13989] Updated weights for policy 0, policy_version 8798 (0.0006) [2023-09-14 13:08:15,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20070.4, 300 sec: 19994.0). Total num frames: 36065280. Throughput: 0: 5005.6. Samples: 8001092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:08:15,160][13933] Avg episode reward: [(0, '26.786')] [2023-09-14 13:08:15,729][13989] Updated weights for policy 0, policy_version 8808 (0.0005) [2023-09-14 13:08:17,781][13989] Updated weights for policy 0, policy_version 8818 (0.0005) [2023-09-14 13:08:19,823][13989] Updated weights for policy 0, policy_version 8828 (0.0006) [2023-09-14 13:08:20,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20002.1, 300 sec: 19980.2). Total num frames: 36163584. Throughput: 0: 4996.6. Samples: 8031024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:08:20,160][13933] Avg episode reward: [(0, '27.559')] [2023-09-14 13:08:21,868][13989] Updated weights for policy 0, policy_version 8838 (0.0006) [2023-09-14 13:08:23,887][13989] Updated weights for policy 0, policy_version 8848 (0.0005) [2023-09-14 13:08:25,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20070.4, 300 sec: 19994.0). Total num frames: 36265984. Throughput: 0: 4992.5. Samples: 8061216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:08:25,160][13933] Avg episode reward: [(0, '30.458')] [2023-09-14 13:08:25,926][13989] Updated weights for policy 0, policy_version 8858 (0.0008) [2023-09-14 13:08:28,011][13989] Updated weights for policy 0, policy_version 8868 (0.0013) [2023-09-14 13:08:30,065][13989] Updated weights for policy 0, policy_version 8878 (0.0006) [2023-09-14 13:08:30,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20002.1, 300 sec: 19994.0). Total num frames: 36364288. Throughput: 0: 4990.2. Samples: 8076162. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:08:30,160][13933] Avg episode reward: [(0, '27.602')] [2023-09-14 13:08:32,191][13989] Updated weights for policy 0, policy_version 8888 (0.0006) [2023-09-14 13:08:34,247][13989] Updated weights for policy 0, policy_version 8898 (0.0005) [2023-09-14 13:08:35,160][13933] Fps is (10 sec: 19660.8, 60 sec: 19933.9, 300 sec: 19980.1). Total num frames: 36462592. Throughput: 0: 4977.5. Samples: 8105592. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:08:35,160][13933] Avg episode reward: [(0, '27.293')] [2023-09-14 13:08:36,313][13989] Updated weights for policy 0, policy_version 8908 (0.0010) [2023-09-14 13:08:38,360][13989] Updated weights for policy 0, policy_version 8918 (0.0011) [2023-09-14 13:08:40,160][13933] Fps is (10 sec: 19660.6, 60 sec: 19933.8, 300 sec: 19980.1). Total num frames: 36560896. Throughput: 0: 4976.9. Samples: 8135572. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:08:40,160][13933] Avg episode reward: [(0, '29.882')] [2023-09-14 13:08:40,412][13989] Updated weights for policy 0, policy_version 8928 (0.0005) [2023-09-14 13:08:42,418][13989] Updated weights for policy 0, policy_version 8938 (0.0005) [2023-09-14 13:08:44,456][13989] Updated weights for policy 0, policy_version 8948 (0.0010) [2023-09-14 13:08:45,160][13933] Fps is (10 sec: 20070.4, 60 sec: 19933.9, 300 sec: 19994.0). Total num frames: 36663296. Throughput: 0: 4976.9. Samples: 8150602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:08:45,160][13933] Avg episode reward: [(0, '27.124')] [2023-09-14 13:08:46,495][13989] Updated weights for policy 0, policy_version 8958 (0.0005) [2023-09-14 13:08:48,491][13989] Updated weights for policy 0, policy_version 8968 (0.0008) [2023-09-14 13:08:50,160][13933] Fps is (10 sec: 20480.1, 60 sec: 20002.1, 300 sec: 20007.9). Total num frames: 36765696. Throughput: 0: 4991.9. Samples: 8181108. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:08:50,160][13933] Avg episode reward: [(0, '29.140')] [2023-09-14 13:08:50,537][13989] Updated weights for policy 0, policy_version 8978 (0.0005) [2023-09-14 13:08:52,547][13989] Updated weights for policy 0, policy_version 8988 (0.0005) [2023-09-14 13:08:54,568][13989] Updated weights for policy 0, policy_version 8998 (0.0008) [2023-09-14 13:08:55,160][13933] Fps is (10 sec: 20070.2, 60 sec: 19933.8, 300 sec: 20007.9). Total num frames: 36864000. Throughput: 0: 4995.0. Samples: 8211272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:08:55,160][13933] Avg episode reward: [(0, '28.288')] [2023-09-14 13:08:56,486][13989] Updated weights for policy 0, policy_version 9008 (0.0005) [2023-09-14 13:08:58,346][13989] Updated weights for policy 0, policy_version 9018 (0.0005) [2023-09-14 13:09:00,160][13933] Fps is (10 sec: 20889.7, 60 sec: 20138.7, 300 sec: 20049.6). Total num frames: 36974592. Throughput: 0: 5032.0. Samples: 8227532. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:09:00,160][13933] Avg episode reward: [(0, '27.311')] [2023-09-14 13:09:00,201][13989] Updated weights for policy 0, policy_version 9028 (0.0005) [2023-09-14 13:09:02,087][13989] Updated weights for policy 0, policy_version 9038 (0.0005) [2023-09-14 13:09:04,001][13989] Updated weights for policy 0, policy_version 9048 (0.0005) [2023-09-14 13:09:05,160][13933] Fps is (10 sec: 22118.7, 60 sec: 20343.5, 300 sec: 20077.3). Total num frames: 37085184. Throughput: 0: 5092.9. Samples: 8260204. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:09:05,160][13933] Avg episode reward: [(0, '29.143')] [2023-09-14 13:09:05,875][13989] Updated weights for policy 0, policy_version 9058 (0.0005) [2023-09-14 13:09:07,746][13989] Updated weights for policy 0, policy_version 9068 (0.0005) [2023-09-14 13:09:09,616][13989] Updated weights for policy 0, policy_version 9078 (0.0005) [2023-09-14 13:09:10,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20411.7, 300 sec: 20105.1). Total num frames: 37191680. Throughput: 0: 5149.2. Samples: 8292928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:09:10,160][13933] Avg episode reward: [(0, '29.458')] [2023-09-14 13:09:11,506][13989] Updated weights for policy 0, policy_version 9088 (0.0005) [2023-09-14 13:09:13,351][13989] Updated weights for policy 0, policy_version 9098 (0.0008) [2023-09-14 13:09:15,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20616.5, 300 sec: 20146.8). Total num frames: 37302272. Throughput: 0: 5181.8. Samples: 8309344. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:09:15,160][13933] Avg episode reward: [(0, '25.489')] [2023-09-14 13:09:15,259][13989] Updated weights for policy 0, policy_version 9108 (0.0008) [2023-09-14 13:09:17,139][13989] Updated weights for policy 0, policy_version 9118 (0.0005) [2023-09-14 13:09:19,022][13989] Updated weights for policy 0, policy_version 9128 (0.0005) [2023-09-14 13:09:20,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20753.1, 300 sec: 20160.7). Total num frames: 37408768. Throughput: 0: 5253.1. Samples: 8341982. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:09:20,160][13933] Avg episode reward: [(0, '29.504')] [2023-09-14 13:09:20,920][13989] Updated weights for policy 0, policy_version 9138 (0.0005) [2023-09-14 13:09:22,827][13989] Updated weights for policy 0, policy_version 9148 (0.0005) [2023-09-14 13:09:24,699][13989] Updated weights for policy 0, policy_version 9158 (0.0005) [2023-09-14 13:09:25,160][13933] Fps is (10 sec: 21708.6, 60 sec: 20889.6, 300 sec: 20202.3). Total num frames: 37519360. Throughput: 0: 5306.4. Samples: 8374358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:09:25,160][13933] Avg episode reward: [(0, '30.236')] [2023-09-14 13:09:26,594][13989] Updated weights for policy 0, policy_version 9168 (0.0006) [2023-09-14 13:09:28,472][13989] Updated weights for policy 0, policy_version 9178 (0.0005) [2023-09-14 13:09:30,160][13933] Fps is (10 sec: 21708.7, 60 sec: 21026.1, 300 sec: 20216.2). Total num frames: 37625856. Throughput: 0: 5334.7. Samples: 8390662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:09:30,160][13933] Avg episode reward: [(0, '27.687')] [2023-09-14 13:09:30,376][13989] Updated weights for policy 0, policy_version 9188 (0.0010) [2023-09-14 13:09:32,279][13989] Updated weights for policy 0, policy_version 9198 (0.0005) [2023-09-14 13:09:34,108][13989] Updated weights for policy 0, policy_version 9208 (0.0005) [2023-09-14 13:09:35,160][13933] Fps is (10 sec: 21709.1, 60 sec: 21231.0, 300 sec: 20257.8). Total num frames: 37736448. Throughput: 0: 5380.1. Samples: 8423210. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:09:35,160][13933] Avg episode reward: [(0, '26.012')] [2023-09-14 13:09:35,980][13989] Updated weights for policy 0, policy_version 9218 (0.0005) [2023-09-14 13:09:37,847][13989] Updated weights for policy 0, policy_version 9228 (0.0008) [2023-09-14 13:09:39,715][13989] Updated weights for policy 0, policy_version 9238 (0.0012) [2023-09-14 13:09:40,160][13933] Fps is (10 sec: 22118.4, 60 sec: 21435.8, 300 sec: 20285.7). Total num frames: 37847040. Throughput: 0: 5441.6. Samples: 8456142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:09:40,160][13933] Avg episode reward: [(0, '30.124')] [2023-09-14 13:09:41,578][13989] Updated weights for policy 0, policy_version 9248 (0.0010) [2023-09-14 13:09:43,443][13989] Updated weights for policy 0, policy_version 9258 (0.0005) [2023-09-14 13:09:45,160][13933] Fps is (10 sec: 22118.3, 60 sec: 21572.3, 300 sec: 20327.3). Total num frames: 37957632. Throughput: 0: 5448.6. Samples: 8472718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:09:45,160][13933] Avg episode reward: [(0, '29.366')] [2023-09-14 13:09:45,163][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009267_37957632.pth... [2023-09-14 13:09:45,210][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000008073_33067008.pth [2023-09-14 13:09:45,352][13989] Updated weights for policy 0, policy_version 9268 (0.0008) [2023-09-14 13:09:47,211][13989] Updated weights for policy 0, policy_version 9278 (0.0005) [2023-09-14 13:09:49,116][13989] Updated weights for policy 0, policy_version 9288 (0.0005) [2023-09-14 13:09:50,160][13933] Fps is (10 sec: 21708.8, 60 sec: 21640.6, 300 sec: 20341.2). Total num frames: 38064128. Throughput: 0: 5444.8. Samples: 8505222. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:09:50,160][13933] Avg episode reward: [(0, '27.500')] [2023-09-14 13:09:51,016][13989] Updated weights for policy 0, policy_version 9298 (0.0005) [2023-09-14 13:09:52,854][13989] Updated weights for policy 0, policy_version 9308 (0.0008) [2023-09-14 13:09:54,754][13989] Updated weights for policy 0, policy_version 9318 (0.0008) [2023-09-14 13:09:55,160][13933] Fps is (10 sec: 21708.8, 60 sec: 21845.4, 300 sec: 20382.8). Total num frames: 38174720. Throughput: 0: 5447.0. Samples: 8538042. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:09:55,160][13933] Avg episode reward: [(0, '29.365')] [2023-09-14 13:09:56,608][13989] Updated weights for policy 0, policy_version 9328 (0.0005) [2023-09-14 13:09:58,484][13989] Updated weights for policy 0, policy_version 9338 (0.0005) [2023-09-14 13:10:00,160][13933] Fps is (10 sec: 21708.8, 60 sec: 21777.1, 300 sec: 20410.6). Total num frames: 38281216. Throughput: 0: 5444.3. Samples: 8554336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:10:00,160][13933] Avg episode reward: [(0, '32.442')] [2023-09-14 13:10:00,161][13971] Saving new best policy, reward=32.442! [2023-09-14 13:10:00,469][13989] Updated weights for policy 0, policy_version 9348 (0.0006) [2023-09-14 13:10:02,527][13989] Updated weights for policy 0, policy_version 9358 (0.0008) [2023-09-14 13:10:04,622][13989] Updated weights for policy 0, policy_version 9368 (0.0005) [2023-09-14 13:10:05,160][13933] Fps is (10 sec: 20480.0, 60 sec: 21572.3, 300 sec: 20396.7). Total num frames: 38379520. Throughput: 0: 5400.7. Samples: 8585014. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:10:05,160][13933] Avg episode reward: [(0, '29.751')] [2023-09-14 13:10:06,728][13989] Updated weights for policy 0, policy_version 9378 (0.0006) [2023-09-14 13:10:08,764][13989] Updated weights for policy 0, policy_version 9388 (0.0005) [2023-09-14 13:10:10,160][13933] Fps is (10 sec: 19660.9, 60 sec: 21435.7, 300 sec: 20396.7). Total num frames: 38477824. Throughput: 0: 5344.5. Samples: 8614858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 3.0) [2023-09-14 13:10:10,161][13933] Avg episode reward: [(0, '28.860')] [2023-09-14 13:10:10,780][13989] Updated weights for policy 0, policy_version 9398 (0.0006) [2023-09-14 13:10:12,890][13989] Updated weights for policy 0, policy_version 9408 (0.0008) [2023-09-14 13:10:14,932][13989] Updated weights for policy 0, policy_version 9418 (0.0006) [2023-09-14 13:10:15,160][13933] Fps is (10 sec: 20070.4, 60 sec: 21299.2, 300 sec: 20396.7). Total num frames: 38580224. Throughput: 0: 5313.3. Samples: 8629762. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:10:15,160][13933] Avg episode reward: [(0, '28.846')] [2023-09-14 13:10:16,931][13989] Updated weights for policy 0, policy_version 9428 (0.0006) [2023-09-14 13:10:18,980][13989] Updated weights for policy 0, policy_version 9438 (0.0005) [2023-09-14 13:10:20,160][13933] Fps is (10 sec: 20070.3, 60 sec: 21162.7, 300 sec: 20396.7). Total num frames: 38678528. Throughput: 0: 5256.3. Samples: 8659742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-09-14 13:10:20,160][13933] Avg episode reward: [(0, '28.994')] [2023-09-14 13:10:21,091][13989] Updated weights for policy 0, policy_version 9448 (0.0009) [2023-09-14 13:10:23,181][13989] Updated weights for policy 0, policy_version 9458 (0.0008) [2023-09-14 13:10:25,160][13933] Fps is (10 sec: 19660.8, 60 sec: 20957.9, 300 sec: 20396.7). Total num frames: 38776832. Throughput: 0: 5183.1. Samples: 8689382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:10:25,160][13933] Avg episode reward: [(0, '27.076')] [2023-09-14 13:10:25,200][13989] Updated weights for policy 0, policy_version 9468 (0.0005) [2023-09-14 13:10:27,255][13989] Updated weights for policy 0, policy_version 9478 (0.0008) [2023-09-14 13:10:29,259][13989] Updated weights for policy 0, policy_version 9488 (0.0005) [2023-09-14 13:10:30,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20889.6, 300 sec: 20396.7). Total num frames: 38879232. Throughput: 0: 5153.1. Samples: 8704606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:10:30,160][13933] Avg episode reward: [(0, '27.999')] [2023-09-14 13:10:31,312][13989] Updated weights for policy 0, policy_version 9498 (0.0005) [2023-09-14 13:10:33,335][13989] Updated weights for policy 0, policy_version 9508 (0.0005) [2023-09-14 13:10:35,160][13933] Fps is (10 sec: 20070.3, 60 sec: 20684.8, 300 sec: 20382.8). Total num frames: 38977536. Throughput: 0: 5100.8. Samples: 8734758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:10:35,160][13933] Avg episode reward: [(0, '26.992')] [2023-09-14 13:10:35,419][13989] Updated weights for policy 0, policy_version 9518 (0.0005) [2023-09-14 13:10:37,443][13989] Updated weights for policy 0, policy_version 9528 (0.0006) [2023-09-14 13:10:39,479][13989] Updated weights for policy 0, policy_version 9538 (0.0005) [2023-09-14 13:10:40,160][13933] Fps is (10 sec: 20070.4, 60 sec: 20548.3, 300 sec: 20396.7). Total num frames: 39079936. Throughput: 0: 5040.2. Samples: 8764850. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:10:40,160][13933] Avg episode reward: [(0, '27.407')] [2023-09-14 13:10:41,491][13989] Updated weights for policy 0, policy_version 9548 (0.0008) [2023-09-14 13:10:43,301][13989] Updated weights for policy 0, policy_version 9558 (0.0005) [2023-09-14 13:10:45,160][13933] Fps is (10 sec: 20889.7, 60 sec: 20480.0, 300 sec: 20410.6). Total num frames: 39186432. Throughput: 0: 5024.4. Samples: 8780432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:10:45,160][13933] Avg episode reward: [(0, '29.995')] [2023-09-14 13:10:45,192][13989] Updated weights for policy 0, policy_version 9568 (0.0005) [2023-09-14 13:10:47,055][13989] Updated weights for policy 0, policy_version 9578 (0.0005) [2023-09-14 13:10:48,904][13989] Updated weights for policy 0, policy_version 9588 (0.0005) [2023-09-14 13:10:50,160][13933] Fps is (10 sec: 21708.8, 60 sec: 20548.3, 300 sec: 20438.3). Total num frames: 39297024. Throughput: 0: 5076.1. Samples: 8813438. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:10:50,161][13933] Avg episode reward: [(0, '28.208')] [2023-09-14 13:10:50,796][13989] Updated weights for policy 0, policy_version 9598 (0.0008) [2023-09-14 13:10:52,649][13989] Updated weights for policy 0, policy_version 9608 (0.0005) [2023-09-14 13:10:54,540][13989] Updated weights for policy 0, policy_version 9618 (0.0005) [2023-09-14 13:10:55,160][13933] Fps is (10 sec: 22118.2, 60 sec: 20548.2, 300 sec: 20480.3). Total num frames: 39407616. Throughput: 0: 5143.6. Samples: 8846322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:10:55,160][13933] Avg episode reward: [(0, '29.370')] [2023-09-14 13:10:56,391][13989] Updated weights for policy 0, policy_version 9628 (0.0007) [2023-09-14 13:10:58,245][13989] Updated weights for policy 0, policy_version 9638 (0.0005) [2023-09-14 13:11:00,135][13989] Updated weights for policy 0, policy_version 9648 (0.0005) [2023-09-14 13:11:00,160][13933] Fps is (10 sec: 22118.2, 60 sec: 20616.5, 300 sec: 20507.8). Total num frames: 39518208. Throughput: 0: 5178.6. Samples: 8862800. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:11:00,160][13933] Avg episode reward: [(0, '27.988')] [2023-09-14 13:11:02,034][13989] Updated weights for policy 0, policy_version 9658 (0.0005) [2023-09-14 13:11:03,896][13989] Updated weights for policy 0, policy_version 9668 (0.0007) [2023-09-14 13:11:05,160][13933] Fps is (10 sec: 22118.7, 60 sec: 20821.4, 300 sec: 20535.6). Total num frames: 39628800. Throughput: 0: 5239.7. Samples: 8895526. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-09-14 13:11:05,160][13933] Avg episode reward: [(0, '27.762')] [2023-09-14 13:11:05,731][13989] Updated weights for policy 0, policy_version 9678 (0.0005) [2023-09-14 13:11:07,572][13989] Updated weights for policy 0, policy_version 9688 (0.0008) [2023-09-14 13:11:09,426][13989] Updated weights for policy 0, policy_version 9698 (0.0005) [2023-09-14 13:11:10,160][13933] Fps is (10 sec: 21709.0, 60 sec: 20957.9, 300 sec: 20563.3). Total num frames: 39735296. Throughput: 0: 5322.1. Samples: 8928878. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-09-14 13:11:10,160][13933] Avg episode reward: [(0, '29.682')] [2023-09-14 13:11:11,266][13989] Updated weights for policy 0, policy_version 9708 (0.0005) [2023-09-14 13:11:13,137][13989] Updated weights for policy 0, policy_version 9718 (0.0005) [2023-09-14 13:11:15,008][13989] Updated weights for policy 0, policy_version 9728 (0.0005) [2023-09-14 13:11:15,160][13933] Fps is (10 sec: 21708.6, 60 sec: 21094.4, 300 sec: 20605.0). Total num frames: 39845888. Throughput: 0: 5348.7. Samples: 8945298. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:11:15,160][13933] Avg episode reward: [(0, '30.240')] [2023-09-14 13:11:16,871][13989] Updated weights for policy 0, policy_version 9738 (0.0008) [2023-09-14 13:11:18,731][13989] Updated weights for policy 0, policy_version 9748 (0.0005) [2023-09-14 13:11:20,160][13933] Fps is (10 sec: 22118.4, 60 sec: 21299.2, 300 sec: 20632.7). Total num frames: 39956480. Throughput: 0: 5411.9. Samples: 8978294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-09-14 13:11:20,160][13933] Avg episode reward: [(0, '29.384')] [2023-09-14 13:11:20,580][13989] Updated weights for policy 0, policy_version 9758 (0.0005) [2023-09-14 13:11:22,238][13992] Stopping RolloutWorker_w2... [2023-09-14 13:11:22,238][13992] Loop rollout_proc2_evt_loop terminating... [2023-09-14 13:11:22,239][13999] Stopping RolloutWorker_w3... [2023-09-14 13:11:22,239][13933] Component RolloutWorker_w2 stopped! [2023-09-14 13:11:22,239][13999] Loop rollout_proc3_evt_loop terminating... [2023-09-14 13:11:22,239][13933] Component RolloutWorker_w3 stopped! [2023-09-14 13:11:22,239][13933] Component RolloutWorker_w0 stopped! [2023-09-14 13:11:22,239][13990] Stopping RolloutWorker_w0... [2023-09-14 13:11:22,240][13990] Loop rollout_proc0_evt_loop terminating... [2023-09-14 13:11:22,240][13971] Stopping Batcher_0... [2023-09-14 13:11:22,240][13971] Loop batcher_evt_loop terminating... [2023-09-14 13:11:22,241][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-09-14 13:11:22,241][14000] Stopping RolloutWorker_w5... [2023-09-14 13:11:22,241][13933] Component Batcher_0 stopped! [2023-09-14 13:11:22,243][14000] Loop rollout_proc5_evt_loop terminating... [2023-09-14 13:11:22,243][13933] Component RolloutWorker_w5 stopped! [2023-09-14 13:11:22,243][13933] Component RolloutWorker_w6 stopped! [2023-09-14 13:11:22,243][13933] Component RolloutWorker_w1 stopped! [2023-09-14 13:11:22,242][14001] Stopping RolloutWorker_w6... [2023-09-14 13:11:22,244][13933] Component RolloutWorker_w4 stopped! [2023-09-14 13:11:22,242][13991] Stopping RolloutWorker_w1... [2023-09-14 13:11:22,243][13998] Stopping RolloutWorker_w4... [2023-09-14 13:11:22,244][13991] Loop rollout_proc1_evt_loop terminating... [2023-09-14 13:11:22,244][13998] Loop rollout_proc4_evt_loop terminating... [2023-09-14 13:11:22,244][14001] Loop rollout_proc6_evt_loop terminating... [2023-09-14 13:11:22,265][14003] Stopping RolloutWorker_w7... [2023-09-14 13:11:22,265][13933] Component RolloutWorker_w7 stopped! [2023-09-14 13:11:22,265][14003] Loop rollout_proc7_evt_loop terminating... [2023-09-14 13:11:22,268][13989] Weights refcount: 2 0 [2023-09-14 13:11:22,269][13989] Stopping InferenceWorker_p0-w0... [2023-09-14 13:11:22,270][13989] Loop inference_proc0-0_evt_loop terminating... [2023-09-14 13:11:22,270][13933] Component InferenceWorker_p0-w0 stopped! 
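Note: the batcher's final save above is followed, in the lines that come next, by removal of the oldest checkpoint, so only the most recent checkpoint files stay on disk. Below is a minimal sketch of that keep-only-the-newest rotation. It is written independently of Sample Factory's actual implementation; the function name rotate_checkpoints and the keep_latest parameter are illustrative assumptions, while the directory layout and filename pattern are taken from the log.

import glob
import os

def rotate_checkpoints(checkpoint_dir: str, keep_latest: int = 2) -> None:
    """Delete all but the `keep_latest` newest checkpoint_*.pth files.

    Filenames embed a zero-padded policy_version plus the frame count
    (e.g. checkpoint_000009767_40005632.pth), so sorting the names
    lexicographically also sorts them chronologically.
    """
    checkpoints = sorted(glob.glob(os.path.join(checkpoint_dir, "checkpoint_*.pth")))
    for stale in checkpoints[:-keep_latest]:
        print(f"Removing {stale}")  # mirrors the "Removing ..." line in the log
        os.remove(stale)

# Hypothetical usage with the directory from the log:
# rotate_checkpoints("./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0", keep_latest=2)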
[2023-09-14 13:11:22,293][13971] Removing ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000008659_35467264.pth
[2023-09-14 13:11:22,298][13971] Saving ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth...
[2023-09-14 13:11:22,420][13971] Stopping LearnerWorker_p0...
[2023-09-14 13:11:22,421][13971] Loop learner_proc0_evt_loop terminating...
[2023-09-14 13:11:22,420][13933] Component LearnerWorker_p0 stopped!
[2023-09-14 13:11:22,421][13933] Waiting for process learner_proc0 to stop...
[2023-09-14 13:11:23,032][13933] Waiting for process inference_proc0-0 to join...
[2023-09-14 13:11:23,032][13933] Waiting for process rollout_proc0 to join...
[2023-09-14 13:11:23,033][13933] Waiting for process rollout_proc1 to join...
[2023-09-14 13:11:23,059][13933] Waiting for process rollout_proc2 to join...
[2023-09-14 13:11:23,059][13933] Waiting for process rollout_proc3 to join...
[2023-09-14 13:11:23,142][13933] Waiting for process rollout_proc4 to join...
[2023-09-14 13:11:23,197][13933] Waiting for process rollout_proc5 to join...
[2023-09-14 13:11:23,197][13933] Waiting for process rollout_proc6 to join...
[2023-09-14 13:11:23,197][13933] Waiting for process rollout_proc7 to join...
[2023-09-14 13:11:23,201][13933] Batcher 0 profile tree view:
batching: 108.5303, releasing_batches: 0.1377
[2023-09-14 13:11:23,201][13933] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 32.8843
update_model: 21.3338
  weight_update: 0.0007
one_step: 0.0038
  handle_policy_step: 1645.0826
    deserialize: 62.9726, stack: 7.4605, obs_to_device_normalize: 366.7744, forward: 837.9830, send_messages: 86.4006
    prepare_outputs: 228.7535
      to_cpu: 151.6978
[2023-09-14 13:11:23,201][13933] Learner 0 profile tree view:
misc: 0.0323, prepare_batch: 91.8599
train: 365.6640
  epoch_init: 0.0313, minibatch_init: 0.0336, losses_postprocess: 1.6467, kl_divergence: 1.6459, after_optimizer: 6.5903
  calculate_losses: 110.2351
    losses_init: 0.0159, forward_head: 3.6102, bptt_initial: 75.7316, tail: 3.9951, advantages_returns: 1.1093, losses: 17.8592
    bptt: 7.0209
      bptt_forward_core: 6.7788
  update: 243.2517
    clip: 203.7871
[2023-09-14 13:11:23,201][13933] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.9303, enqueue_policy_requests: 38.2952, env_step: 702.9638, overhead: 40.9634, complete_rollouts: 4.1382
save_policy_outputs: 40.3524
  split_output_tensors: 19.0828
[2023-09-14 13:11:23,202][13933] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.5956, enqueue_policy_requests: 35.3910, env_step: 974.9872, overhead: 38.2093, complete_rollouts: 1.7157
save_policy_outputs: 33.9563
  split_output_tensors: 16.3422
[2023-09-14 13:11:23,202][13933] Loop Runner_EvtLoop terminating...
[2023-09-14 13:11:23,202][13933] Runner profile tree view:
main_loop: 1775.4556
[2023-09-14 13:11:23,202][13933] Collected {0: 40005632}, FPS: 20276.3
[2023-09-14 13:11:23,207][13933] Loading existing experiment configuration from ./PPO-VizDoom/train_dir/default_experiment/config.json
[2023-09-14 13:11:23,207][13933] Overriding arg 'num_workers' with value 1 passed from command line
[2023-09-14 13:11:23,207][13933] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-09-14 13:11:23,207][13933] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-09-14 13:11:23,208][13933] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-09-14 13:11:23,208][13933] Adding new argument 'video_name'=None that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'train_script'=None that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-09-14 13:11:23,208][13933] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-09-14 13:11:23,226][13933] Doom resolution: 160x120, resize resolution: (128, 72) [2023-09-14 13:11:23,228][13933] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 13:11:23,228][13933] RunningMeanStd input shape: (1,) [2023-09-14 13:11:23,236][13933] ConvEncoder: input_channels=3 [2023-09-14 13:11:23,316][13933] Conv encoder output size: 512 [2023-09-14 13:11:23,317][13933] Policy head output size: 512 [2023-09-14 13:11:23,395][13933] Loading state from checkpoint ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-09-14 13:11:23,747][13933] Num frames 100... [2023-09-14 13:11:23,821][13933] Num frames 200... [2023-09-14 13:11:23,888][13933] Num frames 300... [2023-09-14 13:11:23,956][13933] Num frames 400... [2023-09-14 13:11:24,022][13933] Num frames 500... [2023-09-14 13:11:24,089][13933] Num frames 600... [2023-09-14 13:11:24,155][13933] Num frames 700... [2023-09-14 13:11:24,220][13933] Num frames 800... [2023-09-14 13:11:24,286][13933] Num frames 900... [2023-09-14 13:11:24,352][13933] Num frames 1000... [2023-09-14 13:11:24,438][13933] Avg episode rewards: #0: 23.510, true rewards: #0: 10.510 [2023-09-14 13:11:24,439][13933] Avg episode reward: 23.510, avg true_objective: 10.510 [2023-09-14 13:11:24,471][13933] Num frames 1100... [2023-09-14 13:11:24,538][13933] Num frames 1200... [2023-09-14 13:11:24,605][13933] Num frames 1300... [2023-09-14 13:11:24,672][13933] Num frames 1400... [2023-09-14 13:11:24,738][13933] Num frames 1500... [2023-09-14 13:11:24,804][13933] Num frames 1600... [2023-09-14 13:11:24,870][13933] Num frames 1700... [2023-09-14 13:11:24,937][13933] Num frames 1800... [2023-09-14 13:11:25,012][13933] Num frames 1900... [2023-09-14 13:11:25,082][13933] Num frames 2000... [2023-09-14 13:11:25,152][13933] Num frames 2100... [2023-09-14 13:11:25,221][13933] Num frames 2200... [2023-09-14 13:11:25,291][13933] Num frames 2300... [2023-09-14 13:11:25,360][13933] Num frames 2400... [2023-09-14 13:11:25,453][13933] Avg episode rewards: #0: 28.295, true rewards: #0: 12.295 [2023-09-14 13:11:25,453][13933] Avg episode reward: 28.295, avg true_objective: 12.295 [2023-09-14 13:11:25,482][13933] Num frames 2500... [2023-09-14 13:11:25,551][13933] Num frames 2600... [2023-09-14 13:11:25,622][13933] Num frames 2700... [2023-09-14 13:11:25,691][13933] Num frames 2800... [2023-09-14 13:11:25,761][13933] Num frames 2900... 
[2023-09-14 13:11:25,830][13933] Num frames 3000... [2023-09-14 13:11:25,899][13933] Num frames 3100... [2023-09-14 13:11:25,967][13933] Num frames 3200... [2023-09-14 13:11:26,037][13933] Num frames 3300... [2023-09-14 13:11:26,107][13933] Num frames 3400... [2023-09-14 13:11:26,177][13933] Num frames 3500... [2023-09-14 13:11:26,238][13933] Avg episode rewards: #0: 28.040, true rewards: #0: 11.707 [2023-09-14 13:11:26,239][13933] Avg episode reward: 28.040, avg true_objective: 11.707 [2023-09-14 13:11:26,299][13933] Num frames 3600... [2023-09-14 13:11:26,368][13933] Num frames 3700... [2023-09-14 13:11:26,467][13933] Avg episode rewards: #0: 21.670, true rewards: #0: 9.420 [2023-09-14 13:11:26,468][13933] Avg episode reward: 21.670, avg true_objective: 9.420 [2023-09-14 13:11:26,491][13933] Num frames 3800... [2023-09-14 13:11:26,561][13933] Num frames 3900... [2023-09-14 13:11:26,632][13933] Num frames 4000... [2023-09-14 13:11:26,702][13933] Num frames 4100... [2023-09-14 13:11:26,768][13933] Avg episode rewards: #0: 18.240, true rewards: #0: 8.240 [2023-09-14 13:11:26,768][13933] Avg episode reward: 18.240, avg true_objective: 8.240 [2023-09-14 13:11:26,827][13933] Num frames 4200... [2023-09-14 13:11:26,898][13933] Num frames 4300... [2023-09-14 13:11:26,968][13933] Num frames 4400... [2023-09-14 13:11:27,038][13933] Num frames 4500... [2023-09-14 13:11:27,108][13933] Num frames 4600... [2023-09-14 13:11:27,178][13933] Num frames 4700... [2023-09-14 13:11:27,249][13933] Num frames 4800... [2023-09-14 13:11:27,320][13933] Num frames 4900... [2023-09-14 13:11:27,393][13933] Num frames 5000... [2023-09-14 13:11:27,462][13933] Num frames 5100... [2023-09-14 13:11:27,547][13933] Avg episode rewards: #0: 20.080, true rewards: #0: 8.580 [2023-09-14 13:11:27,547][13933] Avg episode reward: 20.080, avg true_objective: 8.580 [2023-09-14 13:11:27,584][13933] Num frames 5200... [2023-09-14 13:11:27,653][13933] Num frames 5300... [2023-09-14 13:11:27,721][13933] Num frames 5400... [2023-09-14 13:11:27,790][13933] Num frames 5500... [2023-09-14 13:11:27,859][13933] Num frames 5600... [2023-09-14 13:11:27,927][13933] Num frames 5700... [2023-09-14 13:11:27,995][13933] Num frames 5800... [2023-09-14 13:11:28,063][13933] Num frames 5900... [2023-09-14 13:11:28,132][13933] Num frames 6000... [2023-09-14 13:11:28,201][13933] Num frames 6100... [2023-09-14 13:11:28,281][13933] Avg episode rewards: #0: 19.914, true rewards: #0: 8.771 [2023-09-14 13:11:28,281][13933] Avg episode reward: 19.914, avg true_objective: 8.771 [2023-09-14 13:11:28,323][13933] Num frames 6200... [2023-09-14 13:11:28,391][13933] Num frames 6300... [2023-09-14 13:11:28,460][13933] Num frames 6400... [2023-09-14 13:11:28,529][13933] Num frames 6500... [2023-09-14 13:11:28,600][13933] Num frames 6600... [2023-09-14 13:11:28,680][13933] Avg episode rewards: #0: 18.548, true rewards: #0: 8.297 [2023-09-14 13:11:28,681][13933] Avg episode reward: 18.548, avg true_objective: 8.297 [2023-09-14 13:11:28,723][13933] Num frames 6700... [2023-09-14 13:11:28,792][13933] Num frames 6800... [2023-09-14 13:11:28,861][13933] Num frames 6900... [2023-09-14 13:11:28,929][13933] Num frames 7000... [2023-09-14 13:11:28,998][13933] Num frames 7100... [2023-09-14 13:11:29,067][13933] Num frames 7200... [2023-09-14 13:11:29,136][13933] Num frames 7300... [2023-09-14 13:11:29,205][13933] Num frames 7400... [2023-09-14 13:11:29,274][13933] Num frames 7500... [2023-09-14 13:11:29,346][13933] Num frames 7600... 
[2023-09-14 13:11:29,416][13933] Num frames 7700... [2023-09-14 13:11:29,534][13933] Avg episode rewards: #0: 19.438, true rewards: #0: 8.660 [2023-09-14 13:11:29,534][13933] Avg episode reward: 19.438, avg true_objective: 8.660 [2023-09-14 13:11:29,539][13933] Num frames 7800... [2023-09-14 13:11:29,609][13933] Num frames 7900... [2023-09-14 13:11:29,678][13933] Num frames 8000... [2023-09-14 13:11:29,749][13933] Num frames 8100... [2023-09-14 13:11:29,820][13933] Num frames 8200... [2023-09-14 13:11:29,889][13933] Num frames 8300... [2023-09-14 13:11:29,965][13933] Num frames 8400... [2023-09-14 13:11:30,048][13933] Num frames 8500... [2023-09-14 13:11:30,157][13933] Avg episode rewards: #0: 19.080, true rewards: #0: 8.580 [2023-09-14 13:11:30,157][13933] Avg episode reward: 19.080, avg true_objective: 8.580 [2023-09-14 13:11:42,161][13933] Replay video saved to ./PPO-VizDoom/train_dir/default_experiment/replay.mp4! [2023-09-14 13:11:42,170][13933] Loading existing experiment configuration from ./PPO-VizDoom/train_dir/default_experiment/config.json [2023-09-14 13:11:42,171][13933] Overriding arg 'num_workers' with value 1 passed from command line [2023-09-14 13:11:42,171][13933] Adding new argument 'no_render'=True that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'save_video'=True that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'video_name'=None that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'hf_repository'='Lethargus/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-09-14 13:11:42,171][13933] Adding new argument 'train_script'=None that is not in the saved config file! [2023-09-14 13:11:42,172][13933] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-09-14 13:11:42,172][13933] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-09-14 13:11:42,189][13933] RunningMeanStd input shape: (3, 72, 128) [2023-09-14 13:11:42,189][13933] RunningMeanStd input shape: (1,) [2023-09-14 13:11:42,195][13933] ConvEncoder: input_channels=3 [2023-09-14 13:11:42,217][13933] Conv encoder output size: 512 [2023-09-14 13:11:42,217][13933] Policy head output size: 512 [2023-09-14 13:11:42,227][13933] Loading state from checkpoint ./PPO-VizDoom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-09-14 13:11:42,481][13933] Num frames 100... [2023-09-14 13:11:42,548][13933] Num frames 200... [2023-09-14 13:11:42,618][13933] Num frames 300... [2023-09-14 13:11:42,688][13933] Num frames 400... [2023-09-14 13:11:42,757][13933] Num frames 500... [2023-09-14 13:11:42,824][13933] Num frames 600... 
[2023-09-14 13:11:42,882][13933] Avg episode rewards: #0: 12.080, true rewards: #0: 6.080 [2023-09-14 13:11:42,882][13933] Avg episode reward: 12.080, avg true_objective: 6.080 [2023-09-14 13:11:42,944][13933] Num frames 700... [2023-09-14 13:11:43,012][13933] Num frames 800... [2023-09-14 13:11:43,080][13933] Num frames 900... [2023-09-14 13:11:43,148][13933] Num frames 1000... [2023-09-14 13:11:43,215][13933] Num frames 1100... [2023-09-14 13:11:43,284][13933] Num frames 1200... [2023-09-14 13:11:43,352][13933] Num frames 1300... [2023-09-14 13:11:43,421][13933] Num frames 1400... [2023-09-14 13:11:43,490][13933] Num frames 1500... [2023-09-14 13:11:43,558][13933] Num frames 1600... [2023-09-14 13:11:43,628][13933] Num frames 1700... [2023-09-14 13:11:43,698][13933] Num frames 1800... [2023-09-14 13:11:43,767][13933] Num frames 1900... [2023-09-14 13:11:43,836][13933] Num frames 2000... [2023-09-14 13:11:43,905][13933] Num frames 2100... [2023-09-14 13:11:43,975][13933] Num frames 2200... [2023-09-14 13:11:44,043][13933] Num frames 2300... [2023-09-14 13:11:44,142][13933] Avg episode rewards: #0: 28.840, true rewards: #0: 11.840 [2023-09-14 13:11:44,142][13933] Avg episode reward: 28.840, avg true_objective: 11.840 [2023-09-14 13:11:44,165][13933] Num frames 2400... [2023-09-14 13:11:44,232][13933] Num frames 2500... [2023-09-14 13:11:44,300][13933] Num frames 2600... [2023-09-14 13:11:44,369][13933] Num frames 2700... [2023-09-14 13:11:44,438][13933] Num frames 2800... [2023-09-14 13:11:44,545][13933] Avg episode rewards: #0: 21.933, true rewards: #0: 9.600 [2023-09-14 13:11:44,545][13933] Avg episode reward: 21.933, avg true_objective: 9.600 [2023-09-14 13:11:44,559][13933] Num frames 2900... [2023-09-14 13:11:44,632][13933] Num frames 3000... [2023-09-14 13:11:44,701][13933] Num frames 3100... [2023-09-14 13:11:44,770][13933] Num frames 3200... [2023-09-14 13:11:44,839][13933] Num frames 3300... [2023-09-14 13:11:44,908][13933] Num frames 3400... [2023-09-14 13:11:44,977][13933] Num frames 3500... [2023-09-14 13:11:45,048][13933] Num frames 3600... [2023-09-14 13:11:45,155][13933] Avg episode rewards: #0: 20.450, true rewards: #0: 9.200 [2023-09-14 13:11:45,156][13933] Avg episode reward: 20.450, avg true_objective: 9.200 [2023-09-14 13:11:45,172][13933] Num frames 3700... [2023-09-14 13:11:45,240][13933] Num frames 3800... [2023-09-14 13:11:45,308][13933] Num frames 3900... [2023-09-14 13:11:45,377][13933] Num frames 4000... [2023-09-14 13:11:45,446][13933] Num frames 4100... [2023-09-14 13:11:45,513][13933] Num frames 4200... [2023-09-14 13:11:45,582][13933] Num frames 4300... [2023-09-14 13:11:45,651][13933] Num frames 4400... [2023-09-14 13:11:45,721][13933] Num frames 4500... [2023-09-14 13:11:45,790][13933] Num frames 4600... [2023-09-14 13:11:45,857][13933] Num frames 4700... [2023-09-14 13:11:45,925][13933] Num frames 4800... [2023-09-14 13:11:45,993][13933] Num frames 4900... [2023-09-14 13:11:46,061][13933] Num frames 5000... [2023-09-14 13:11:46,130][13933] Num frames 5100... [2023-09-14 13:11:46,198][13933] Num frames 5200... [2023-09-14 13:11:46,267][13933] Num frames 5300... [2023-09-14 13:11:46,336][13933] Num frames 5400... [2023-09-14 13:11:46,405][13933] Num frames 5500... [2023-09-14 13:11:46,473][13933] Num frames 5600... [2023-09-14 13:11:46,542][13933] Num frames 5700... 
[2023-09-14 13:11:46,649][13933] Avg episode rewards: #0: 28.760, true rewards: #0: 11.560 [2023-09-14 13:11:46,649][13933] Avg episode reward: 28.760, avg true_objective: 11.560 [2023-09-14 13:11:46,664][13933] Num frames 5800... [2023-09-14 13:11:46,732][13933] Num frames 5900... [2023-09-14 13:11:46,801][13933] Num frames 6000... [2023-09-14 13:11:46,869][13933] Num frames 6100... [2023-09-14 13:11:46,938][13933] Num frames 6200... [2023-09-14 13:11:47,005][13933] Num frames 6300... [2023-09-14 13:11:47,116][13933] Avg episode rewards: #0: 25.646, true rewards: #0: 10.647 [2023-09-14 13:11:47,117][13933] Avg episode reward: 25.646, avg true_objective: 10.647 [2023-09-14 13:11:47,125][13933] Num frames 6400... [2023-09-14 13:11:47,194][13933] Num frames 6500... [2023-09-14 13:11:47,261][13933] Num frames 6600... [2023-09-14 13:11:47,329][13933] Num frames 6700... [2023-09-14 13:11:47,397][13933] Num frames 6800... [2023-09-14 13:11:47,464][13933] Num frames 6900... [2023-09-14 13:11:47,532][13933] Avg episode rewards: #0: 23.174, true rewards: #0: 9.889 [2023-09-14 13:11:47,532][13933] Avg episode reward: 23.174, avg true_objective: 9.889 [2023-09-14 13:11:47,585][13933] Num frames 7000... [2023-09-14 13:11:47,655][13933] Num frames 7100... [2023-09-14 13:11:47,724][13933] Num frames 7200... [2023-09-14 13:11:47,794][13933] Num frames 7300... [2023-09-14 13:11:47,864][13933] Num frames 7400... [2023-09-14 13:11:47,964][13933] Avg episode rewards: #0: 21.335, true rewards: #0: 9.335 [2023-09-14 13:11:47,964][13933] Avg episode reward: 21.335, avg true_objective: 9.335 [2023-09-14 13:11:47,988][13933] Num frames 7500... [2023-09-14 13:11:48,055][13933] Num frames 7600... [2023-09-14 13:11:48,123][13933] Num frames 7700... [2023-09-14 13:11:48,190][13933] Num frames 7800... [2023-09-14 13:11:48,258][13933] Num frames 7900... [2023-09-14 13:11:48,371][13933] Avg episode rewards: #0: 19.990, true rewards: #0: 8.879 [2023-09-14 13:11:48,371][13933] Avg episode reward: 19.990, avg true_objective: 8.879 [2023-09-14 13:11:48,377][13933] Num frames 8000... [2023-09-14 13:11:48,444][13933] Num frames 8100... [2023-09-14 13:11:48,512][13933] Num frames 8200... [2023-09-14 13:11:48,579][13933] Num frames 8300... [2023-09-14 13:11:48,648][13933] Num frames 8400... [2023-09-14 13:11:48,716][13933] Num frames 8500... [2023-09-14 13:11:48,784][13933] Num frames 8600... [2023-09-14 13:11:48,852][13933] Num frames 8700... [2023-09-14 13:11:48,921][13933] Num frames 8800... [2023-09-14 13:11:48,989][13933] Num frames 8900... [2023-09-14 13:11:49,058][13933] Num frames 9000... [2023-09-14 13:11:49,128][13933] Num frames 9100... [2023-09-14 13:11:49,199][13933] Num frames 9200... [2023-09-14 13:11:49,270][13933] Num frames 9300... [2023-09-14 13:11:49,341][13933] Num frames 9400... [2023-09-14 13:11:49,413][13933] Num frames 9500... [2023-09-14 13:11:49,483][13933] Num frames 9600... [2023-09-14 13:11:49,552][13933] Num frames 9700... [2023-09-14 13:11:49,625][13933] Num frames 9800... [2023-09-14 13:11:49,743][13933] Avg episode rewards: #0: 22.492, true rewards: #0: 9.892 [2023-09-14 13:11:49,743][13933] Avg episode reward: 22.492, avg true_objective: 9.892 [2023-09-14 13:12:03,682][13933] Replay video saved to ./PPO-VizDoom/train_dir/default_experiment/replay.mp4! [2023-09-14 13:13:30,890][13933] The model has been pushed to https://huggingface.co/Lethargus/rl_course_vizdoom_health_gathering_supreme
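Note: the run finishes by uploading the experiment artifacts (the checkpoint, config.json and, typically, the recorded replay.mp4) to the Hub repository named in the last line. As a rough illustration, not part of the log itself, the pushed files could be pulled back down with the standard huggingface_hub client; the local path printout is only for convenience.

from huggingface_hub import snapshot_download

# Download whatever the run uploaded (checkpoint, config, replay video, ...)
# from the repository reported at the end of the log.
local_dir = snapshot_download(repo_id="Lethargus/rl_course_vizdoom_health_gathering_supreme")
print(f"Artifacts downloaded to: {local_dir}")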