[2023-02-26 09:18:53,350][00216] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-26 09:18:53,353][00216] Rollout worker 0 uses device cpu
[2023-02-26 09:18:53,356][00216] Rollout worker 1 uses device cpu
[2023-02-26 09:18:53,364][00216] Rollout worker 2 uses device cpu
[2023-02-26 09:18:53,372][00216] Rollout worker 3 uses device cpu
[2023-02-26 09:18:53,377][00216] Rollout worker 4 uses device cpu
[2023-02-26 09:18:53,379][00216] Rollout worker 5 uses device cpu
[2023-02-26 09:18:53,387][00216] Rollout worker 6 uses device cpu
[2023-02-26 09:18:53,388][00216] Rollout worker 7 uses device cpu
[2023-02-26 09:18:53,875][00216] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 09:18:53,881][00216] InferenceWorker_p0-w0: min num requests: 2
[2023-02-26 09:18:54,011][00216] Starting all processes...
[2023-02-26 09:18:54,013][00216] Starting process learner_proc0
[2023-02-26 09:18:54,144][00216] Starting all processes...
[2023-02-26 09:18:54,195][00216] Starting process inference_proc0-0
[2023-02-26 09:18:54,195][00216] Starting process rollout_proc0
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc1
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc2
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc3
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc4
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc5
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc6
[2023-02-26 09:18:54,197][00216] Starting process rollout_proc7
[2023-02-26 09:19:05,810][13460] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 09:19:05,815][13460] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-26 09:19:05,981][13476] Worker 1 uses CPU cores [1]
[2023-02-26 09:19:06,161][13479] Worker 3 uses CPU cores [1]
[2023-02-26 09:19:06,176][13474] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 09:19:06,182][13474] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-26 09:19:06,286][13478] Worker 4 uses CPU cores [0]
[2023-02-26 09:19:06,661][13481] Worker 6 uses CPU cores [0]
[2023-02-26 09:19:06,662][13475] Worker 0 uses CPU cores [0]
[2023-02-26 09:19:06,711][13482] Worker 7 uses CPU cores [1]
[2023-02-26 09:19:06,724][13477] Worker 2 uses CPU cores [0]
[2023-02-26 09:19:06,787][13460] Num visible devices: 1
[2023-02-26 09:19:06,790][13474] Num visible devices: 1
[2023-02-26 09:19:06,818][13460] Starting seed is not provided
[2023-02-26 09:19:06,818][13460] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 09:19:06,819][13460] Initializing actor-critic model on device cuda:0
[2023-02-26 09:19:06,820][13460] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 09:19:06,823][13460] RunningMeanStd input shape: (1,)
[2023-02-26 09:19:06,841][13480] Worker 5 uses CPU cores [1]
[2023-02-26 09:19:06,847][13460] ConvEncoder: input_channels=3
[2023-02-26 09:19:07,436][13460] Conv encoder output size: 512
[2023-02-26 09:19:07,437][13460] Policy head output size: 512
[2023-02-26 09:19:07,514][13460] Created Actor Critic model with architecture:
[2023-02-26 09:19:07,515][13460] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-26 09:19:13,832][00216] Heartbeat connected on Batcher_0
[2023-02-26 09:19:13,875][00216] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-26 09:19:13,902][00216] Heartbeat connected on RolloutWorker_w0
[2023-02-26 09:19:13,919][00216] Heartbeat connected on RolloutWorker_w1
[2023-02-26 09:19:13,924][00216] Heartbeat connected on RolloutWorker_w2
[2023-02-26 09:19:13,942][00216] Heartbeat connected on RolloutWorker_w3
[2023-02-26 09:19:13,968][00216] Heartbeat connected on RolloutWorker_w4
[2023-02-26 09:19:13,974][00216] Heartbeat connected on RolloutWorker_w5
[2023-02-26 09:19:13,995][00216] Heartbeat connected on RolloutWorker_w6
[2023-02-26 09:19:14,010][00216] Heartbeat connected on RolloutWorker_w7
[2023-02-26 09:19:16,307][13460] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-26 09:19:16,308][13460] No checkpoints found
[2023-02-26 09:19:16,309][13460] Did not load from checkpoint, starting from scratch!
[2023-02-26 09:19:16,309][13460] Initialized policy 0 weights for model version 0
[2023-02-26 09:19:16,314][13460] LearnerWorker_p0 finished initialization!
[2023-02-26 09:19:16,315][13460] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 09:19:16,315][00216] Heartbeat connected on LearnerWorker_p0
[2023-02-26 09:19:16,515][13474] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 09:19:16,516][13474] RunningMeanStd input shape: (1,)
[2023-02-26 09:19:16,528][13474] ConvEncoder: input_channels=3
[2023-02-26 09:19:16,624][13474] Conv encoder output size: 512
[2023-02-26 09:19:16,624][13474] Policy head output size: 512
[2023-02-26 09:19:18,226][00216] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 09:19:18,865][00216] Inference worker 0-0 is ready!
[2023-02-26 09:19:18,867][00216] All inference workers are ready! Signal rollout workers to start!
[2023-02-26 09:19:18,997][13477] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:18,998][13475] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,012][13479] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,022][13476] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,033][13480] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,039][13478] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,039][13482] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,044][13481] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:19:19,846][13475] Decorrelating experience for 0 frames...
[2023-02-26 09:19:19,848][13477] Decorrelating experience for 0 frames...
[2023-02-26 09:19:19,847][13480] Decorrelating experience for 0 frames...
[2023-02-26 09:19:19,848][13482] Decorrelating experience for 0 frames...
[2023-02-26 09:19:20,845][13476] Decorrelating experience for 0 frames...
[2023-02-26 09:19:20,882][13475] Decorrelating experience for 32 frames...
[2023-02-26 09:19:20,887][13482] Decorrelating experience for 32 frames...
[2023-02-26 09:19:20,890][13477] Decorrelating experience for 32 frames...
[2023-02-26 09:19:20,894][13480] Decorrelating experience for 32 frames...
[2023-02-26 09:19:20,909][13481] Decorrelating experience for 0 frames...
[2023-02-26 09:19:22,007][13479] Decorrelating experience for 0 frames...
[2023-02-26 09:19:22,119][13476] Decorrelating experience for 32 frames...
[2023-02-26 09:19:22,390][13482] Decorrelating experience for 64 frames...
[2023-02-26 09:19:22,408][13481] Decorrelating experience for 32 frames...
[2023-02-26 09:19:22,434][13478] Decorrelating experience for 0 frames...
[2023-02-26 09:19:22,643][13477] Decorrelating experience for 64 frames...
[2023-02-26 09:19:22,667][13475] Decorrelating experience for 64 frames...
[2023-02-26 09:19:23,226][00216] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 09:19:23,865][13479] Decorrelating experience for 32 frames...
[2023-02-26 09:19:24,090][13478] Decorrelating experience for 32 frames...
[2023-02-26 09:19:24,104][13481] Decorrelating experience for 64 frames...
[2023-02-26 09:19:24,225][13476] Decorrelating experience for 64 frames...
[2023-02-26 09:19:24,258][13480] Decorrelating experience for 64 frames...
[2023-02-26 09:19:24,501][13482] Decorrelating experience for 96 frames...
[2023-02-26 09:19:25,574][13477] Decorrelating experience for 96 frames...
[2023-02-26 09:19:25,651][13481] Decorrelating experience for 96 frames...
[2023-02-26 09:19:25,731][13475] Decorrelating experience for 96 frames...
[2023-02-26 09:19:26,443][13479] Decorrelating experience for 64 frames...
[2023-02-26 09:19:26,621][13480] Decorrelating experience for 96 frames...
[2023-02-26 09:19:26,627][13476] Decorrelating experience for 96 frames...
[2023-02-26 09:19:27,068][13478] Decorrelating experience for 64 frames...
[2023-02-26 09:19:27,576][13478] Decorrelating experience for 96 frames...
[2023-02-26 09:19:27,899][13479] Decorrelating experience for 96 frames...
[2023-02-26 09:19:28,226][00216] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 09:19:32,410][13460] Signal inference workers to stop experience collection...
[2023-02-26 09:19:32,442][13474] InferenceWorker_p0-w0: stopping experience collection
[2023-02-26 09:19:33,225][00216] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 74.3. Samples: 1114. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 09:19:33,229][00216] Avg episode reward: [(0, '2.045')]
[2023-02-26 09:19:34,958][13460] Signal inference workers to resume experience collection...
[2023-02-26 09:19:34,959][13474] InferenceWorker_p0-w0: resuming experience collection
[2023-02-26 09:19:38,226][00216] Fps is (10 sec: 1638.4, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 179.3. Samples: 3586. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:19:38,233][00216] Avg episode reward: [(0, '3.448')]
[2023-02-26 09:19:43,226][00216] Fps is (10 sec: 3276.7, 60 sec: 1310.7, 300 sec: 1310.7). Total num frames: 32768. Throughput: 0: 362.7. Samples: 9068. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:19:43,230][00216] Avg episode reward: [(0, '3.881')]
[2023-02-26 09:19:45,343][13474] Updated weights for policy 0, policy_version 10 (0.0012)
[2023-02-26 09:19:48,226][00216] Fps is (10 sec: 2867.2, 60 sec: 1501.9, 300 sec: 1501.9). Total num frames: 45056. Throughput: 0: 372.3. Samples: 11170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:19:48,231][00216] Avg episode reward: [(0, '4.367')]
[2023-02-26 09:19:53,226][00216] Fps is (10 sec: 3686.5, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 470.9. Samples: 16480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:19:53,231][00216] Avg episode reward: [(0, '4.432')]
[2023-02-26 09:19:55,848][13474] Updated weights for policy 0, policy_version 20 (0.0027)
[2023-02-26 09:19:58,226][00216] Fps is (10 sec: 4505.6, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 90112. Throughput: 0: 586.5. Samples: 23462. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:19:58,231][00216] Avg episode reward: [(0, '4.338')]
[2023-02-26 09:20:03,226][00216] Fps is (10 sec: 3686.3, 60 sec: 2366.6, 300 sec: 2366.6). Total num frames: 106496. Throughput: 0: 586.4. Samples: 26386. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:20:03,230][00216] Avg episode reward: [(0, '4.326')]
[2023-02-26 09:20:03,238][13460] Saving new best policy, reward=4.326!
[2023-02-26 09:20:07,830][13474] Updated weights for policy 0, policy_version 30 (0.0013)
[2023-02-26 09:20:08,226][00216] Fps is (10 sec: 3276.7, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 122880. Throughput: 0: 681.8. Samples: 30680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:20:08,232][00216] Avg episode reward: [(0, '4.373')]
[2023-02-26 09:20:08,245][13460] Saving new best policy, reward=4.373!
[2023-02-26 09:20:13,226][00216] Fps is (10 sec: 3686.5, 60 sec: 2606.5, 300 sec: 2606.5). Total num frames: 143360. Throughput: 0: 811.8. Samples: 36532. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:20:13,228][00216] Avg episode reward: [(0, '4.361')]
[2023-02-26 09:20:17,113][13474] Updated weights for policy 0, policy_version 40 (0.0021)
[2023-02-26 09:20:18,226][00216] Fps is (10 sec: 4505.7, 60 sec: 2798.9, 300 sec: 2798.9). Total num frames: 167936. Throughput: 0: 864.7. Samples: 40026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:20:18,228][00216] Avg episode reward: [(0, '4.414')]
[2023-02-26 09:20:18,243][13460] Saving new best policy, reward=4.414!
[2023-02-26 09:20:23,230][00216] Fps is (10 sec: 3684.8, 60 sec: 3003.5, 300 sec: 2772.5). Total num frames: 180224. Throughput: 0: 936.2. Samples: 45720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:20:23,232][00216] Avg episode reward: [(0, '4.308')]
[2023-02-26 09:20:28,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 2808.7). Total num frames: 196608. Throughput: 0: 905.4. Samples: 49810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:20:28,234][00216] Avg episode reward: [(0, '4.305')]
[2023-02-26 09:20:29,987][13474] Updated weights for policy 0, policy_version 50 (0.0011)
[2023-02-26 09:20:33,227][00216] Fps is (10 sec: 3687.6, 60 sec: 3618.1, 300 sec: 2894.5). Total num frames: 217088. Throughput: 0: 926.2. Samples: 52848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:20:33,229][00216] Avg episode reward: [(0, '4.354')]
[2023-02-26 09:20:38,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3020.8). Total num frames: 241664. Throughput: 0: 967.6. Samples: 60024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:20:38,232][00216] Avg episode reward: [(0, '4.371')]
[2023-02-26 09:20:38,597][13474] Updated weights for policy 0, policy_version 60 (0.0017)
[2023-02-26 09:20:43,231][00216] Fps is (10 sec: 4094.0, 60 sec: 3754.3, 300 sec: 3035.7). Total num frames: 258048. Throughput: 0: 931.1. Samples: 65366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:20:43,234][00216] Avg episode reward: [(0, '4.388')]
[2023-02-26 09:20:48,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3049.2). Total num frames: 274432. Throughput: 0: 914.7. Samples: 67546. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:20:48,229][00216] Avg episode reward: [(0, '4.385')]
[2023-02-26 09:20:48,239][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000067_274432.pth...
[2023-02-26 09:20:51,206][13474] Updated weights for policy 0, policy_version 70 (0.0040)
[2023-02-26 09:20:53,226][00216] Fps is (10 sec: 3688.6, 60 sec: 3754.7, 300 sec: 3104.3). Total num frames: 294912. Throughput: 0: 945.6. Samples: 73234. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:20:53,229][00216] Avg episode reward: [(0, '4.372')]
[2023-02-26 09:20:58,231][00216] Fps is (10 sec: 4503.1, 60 sec: 3822.6, 300 sec: 3194.7). Total num frames: 319488. Throughput: 0: 970.5. Samples: 80208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:20:58,238][00216] Avg episode reward: [(0, '4.474')]
[2023-02-26 09:20:58,253][13460] Saving new best policy, reward=4.474!
[2023-02-26 09:21:00,598][13474] Updated weights for policy 0, policy_version 80 (0.0033)
[2023-02-26 09:21:03,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3159.8). Total num frames: 331776. Throughput: 0: 946.7. Samples: 82628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:21:03,236][00216] Avg episode reward: [(0, '4.475')]
[2023-02-26 09:21:03,247][13460] Saving new best policy, reward=4.475!
[2023-02-26 09:21:08,226][00216] Fps is (10 sec: 2868.7, 60 sec: 3754.7, 300 sec: 3165.1). Total num frames: 348160. Throughput: 0: 914.6. Samples: 86874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:21:08,233][00216] Avg episode reward: [(0, '4.389')]
[2023-02-26 09:21:12,624][13474] Updated weights for policy 0, policy_version 90 (0.0016)
[2023-02-26 09:21:13,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3205.6). Total num frames: 368640. Throughput: 0: 959.2. Samples: 92974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:21:13,228][00216] Avg episode reward: [(0, '4.375')]
[2023-02-26 09:21:18,226][00216] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3242.7). Total num frames: 389120. Throughput: 0: 963.2. Samples: 96192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:21:18,234][00216] Avg episode reward: [(0, '4.565')]
[2023-02-26 09:21:18,246][13460] Saving new best policy, reward=4.565!
[2023-02-26 09:21:23,227][00216] Fps is (10 sec: 2866.9, 60 sec: 3618.3, 300 sec: 3178.5). Total num frames: 397312. Throughput: 0: 889.0. Samples: 100028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:21:23,232][00216] Avg episode reward: [(0, '4.496')]
[2023-02-26 09:21:26,954][13474] Updated weights for policy 0, policy_version 100 (0.0030)
[2023-02-26 09:21:28,226][00216] Fps is (10 sec: 2048.0, 60 sec: 3549.9, 300 sec: 3150.8). Total num frames: 409600. Throughput: 0: 848.2. Samples: 103530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:21:28,229][00216] Avg episode reward: [(0, '4.451')]
[2023-02-26 09:21:33,226][00216] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3185.8). Total num frames: 430080. Throughput: 0: 850.4. Samples: 105812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:21:33,233][00216] Avg episode reward: [(0, '4.377')]
[2023-02-26 09:21:37,524][13474] Updated weights for policy 0, policy_version 110 (0.0013)
[2023-02-26 09:21:38,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3218.3). Total num frames: 450560. Throughput: 0: 873.3. Samples: 112534. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:21:38,232][00216] Avg episode reward: [(0, '4.530')]
[2023-02-26 09:21:43,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3550.2, 300 sec: 3248.6). Total num frames: 471040. Throughput: 0: 863.6. Samples: 119066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:21:43,233][00216] Avg episode reward: [(0, '4.510')]
[2023-02-26 09:21:48,226][00216] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3249.5). Total num frames: 487424. Throughput: 0: 858.4. Samples: 121256. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:21:48,235][00216] Avg episode reward: [(0, '4.405')]
[2023-02-26 09:21:48,841][13474] Updated weights for policy 0, policy_version 120 (0.0011)
[2023-02-26 09:21:53,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3250.4). Total num frames: 503808. Throughput: 0: 869.4. Samples: 125998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:21:53,238][00216] Avg episode reward: [(0, '4.500')]
[2023-02-26 09:21:58,226][00216] Fps is (10 sec: 4096.1, 60 sec: 3481.9, 300 sec: 3302.4). Total num frames: 528384. Throughput: 0: 888.8. Samples: 132972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:21:58,230][00216] Avg episode reward: [(0, '4.559')]
[2023-02-26 09:21:58,580][13474] Updated weights for policy 0, policy_version 130 (0.0026)
[2023-02-26 09:22:03,231][00216] Fps is (10 sec: 4503.3, 60 sec: 3617.8, 300 sec: 3326.3). Total num frames: 548864. Throughput: 0: 895.0. Samples: 136470. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:22:03,238][00216] Avg episode reward: [(0, '4.591')]
[2023-02-26 09:22:03,240][13460] Saving new best policy, reward=4.591!
[2023-02-26 09:22:08,227][00216] Fps is (10 sec: 3276.5, 60 sec: 3549.8, 300 sec: 3300.9). Total num frames: 561152. Throughput: 0: 908.5. Samples: 140910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:22:08,232][00216] Avg episode reward: [(0, '4.427')]
[2023-02-26 09:22:11,123][13474] Updated weights for policy 0, policy_version 140 (0.0020)
[2023-02-26 09:22:13,226][00216] Fps is (10 sec: 3278.5, 60 sec: 3549.9, 300 sec: 3323.6). Total num frames: 581632. Throughput: 0: 943.0. Samples: 145966. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:22:13,233][00216] Avg episode reward: [(0, '4.403')]
[2023-02-26 09:22:18,226][00216] Fps is (10 sec: 4506.0, 60 sec: 3618.1, 300 sec: 3367.8). Total num frames: 606208. Throughput: 0: 969.2. Samples: 149426. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:22:18,229][00216] Avg episode reward: [(0, '4.305')]
[2023-02-26 09:22:19,842][13474] Updated weights for policy 0, policy_version 150 (0.0012)
[2023-02-26 09:22:23,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3365.4). Total num frames: 622592. Throughput: 0: 967.4. Samples: 156066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:22:23,228][00216] Avg episode reward: [(0, '4.562')]
[2023-02-26 09:22:28,228][00216] Fps is (10 sec: 3276.0, 60 sec: 3822.8, 300 sec: 3363.0). Total num frames: 638976. Throughput: 0: 918.5. Samples: 160402. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:22:28,233][00216] Avg episode reward: [(0, '4.746')]
[2023-02-26 09:22:28,248][13460] Saving new best policy, reward=4.746!
[2023-02-26 09:22:32,440][13474] Updated weights for policy 0, policy_version 160 (0.0021)
[2023-02-26 09:22:33,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3360.8). Total num frames: 655360. Throughput: 0: 917.4. Samples: 162538. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:22:33,228][00216] Avg episode reward: [(0, '4.591')]
[2023-02-26 09:22:38,226][00216] Fps is (10 sec: 4097.0, 60 sec: 3822.9, 300 sec: 3399.7). Total num frames: 679936. Throughput: 0: 967.7. Samples: 169544. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:22:38,229][00216] Avg episode reward: [(0, '4.551')]
[2023-02-26 09:22:41,289][13474] Updated weights for policy 0, policy_version 170 (0.0015)
[2023-02-26 09:22:43,227][00216] Fps is (10 sec: 4505.1, 60 sec: 3822.9, 300 sec: 3416.6). Total num frames: 700416. Throughput: 0: 952.5. Samples: 175836. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:22:43,232][00216] Avg episode reward: [(0, '4.382')]
[2023-02-26 09:22:48,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3393.8). Total num frames: 712704. Throughput: 0: 922.9. Samples: 177996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:22:48,234][00216] Avg episode reward: [(0, '4.597')]
[2023-02-26 09:22:48,253][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000175_716800.pth...
[2023-02-26 09:22:53,226][00216] Fps is (10 sec: 3277.2, 60 sec: 3822.9, 300 sec: 3410.2). Total num frames: 733184. Throughput: 0: 933.8. Samples: 182930. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:22:53,233][00216] Avg episode reward: [(0, '4.687')]
[2023-02-26 09:22:53,655][13474] Updated weights for policy 0, policy_version 180 (0.0029)
[2023-02-26 09:22:58,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3444.4). Total num frames: 757760. Throughput: 0: 977.6. Samples: 189958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:22:58,228][00216] Avg episode reward: [(0, '4.697')]
[2023-02-26 09:23:03,054][13474] Updated weights for policy 0, policy_version 190 (0.0022)
[2023-02-26 09:23:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3823.3, 300 sec: 3458.8). Total num frames: 778240. Throughput: 0: 977.2. Samples: 193398. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 09:23:03,236][00216] Avg episode reward: [(0, '4.774')]
[2023-02-26 09:23:03,240][13460] Saving new best policy, reward=4.774!
[2023-02-26 09:23:08,226][00216] Fps is (10 sec: 3276.6, 60 sec: 3823.0, 300 sec: 3437.1). Total num frames: 790528. Throughput: 0: 926.7. Samples: 197770. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 09:23:08,229][00216] Avg episode reward: [(0, '4.823')]
[2023-02-26 09:23:08,252][13460] Saving new best policy, reward=4.823!
[2023-02-26 09:23:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3451.1). Total num frames: 811008. Throughput: 0: 951.2. Samples: 203202. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:23:13,233][00216] Avg episode reward: [(0, '4.659')]
[2023-02-26 09:23:14,735][13474] Updated weights for policy 0, policy_version 200 (0.0020)
[2023-02-26 09:23:18,226][00216] Fps is (10 sec: 4096.3, 60 sec: 3754.7, 300 sec: 3464.5). Total num frames: 831488. Throughput: 0: 980.2. Samples: 206648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:23:18,228][00216] Avg episode reward: [(0, '4.712')]
[2023-02-26 09:23:23,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3477.4). Total num frames: 851968. Throughput: 0: 969.7. Samples: 213180. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:23:23,229][00216] Avg episode reward: [(0, '4.637')]
[2023-02-26 09:23:25,176][13474] Updated weights for policy 0, policy_version 210 (0.0013)
[2023-02-26 09:23:28,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3473.4). Total num frames: 868352. Throughput: 0: 926.8. Samples: 217542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:23:28,232][00216] Avg episode reward: [(0, '4.679')]
[2023-02-26 09:23:33,236][00216] Fps is (10 sec: 3273.3, 60 sec: 3822.3, 300 sec: 3469.4). Total num frames: 884736. Throughput: 0: 932.9. Samples: 219986. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:23:33,244][00216] Avg episode reward: [(0, '4.665')]
[2023-02-26 09:23:36,098][13474] Updated weights for policy 0, policy_version 220 (0.0019)
[2023-02-26 09:23:38,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3497.4). Total num frames: 909312. Throughput: 0: 975.7. Samples: 226838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:23:38,234][00216] Avg episode reward: [(0, '4.586')]
[2023-02-26 09:23:43,226][00216] Fps is (10 sec: 4100.3, 60 sec: 3754.7, 300 sec: 3493.2). Total num frames: 925696. Throughput: 0: 948.0. Samples: 232620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:23:43,231][00216] Avg episode reward: [(0, '4.537')]
[2023-02-26 09:23:47,817][13474] Updated weights for policy 0, policy_version 230 (0.0018)
[2023-02-26 09:23:48,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3489.2). Total num frames: 942080. Throughput: 0: 918.0. Samples: 234710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:23:48,231][00216] Avg episode reward: [(0, '4.453')]
[2023-02-26 09:23:53,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3485.3). Total num frames: 958464. Throughput: 0: 932.6. Samples: 239736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:23:53,231][00216] Avg episode reward: [(0, '4.556')]
[2023-02-26 09:23:57,537][13474] Updated weights for policy 0, policy_version 240 (0.0024)
[2023-02-26 09:23:58,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3510.9). Total num frames: 983040. Throughput: 0: 965.8. Samples: 246664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:23:58,233][00216] Avg episode reward: [(0, '4.762')]
[2023-02-26 09:24:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3521.1). Total num frames: 1003520. Throughput: 0: 956.9. Samples: 249710. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 09:24:03,231][00216] Avg episode reward: [(0, '4.707')]
[2023-02-26 09:24:08,228][00216] Fps is (10 sec: 3276.1, 60 sec: 3754.6, 300 sec: 3502.8). Total num frames: 1015808. Throughput: 0: 908.0. Samples: 254044. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:24:08,230][00216] Avg episode reward: [(0, '4.537')]
[2023-02-26 09:24:10,026][13474] Updated weights for policy 0, policy_version 250 (0.0022)
[2023-02-26 09:24:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3512.8). Total num frames: 1036288. Throughput: 0: 936.7. Samples: 259692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:24:13,230][00216] Avg episode reward: [(0, '4.457')]
[2023-02-26 09:24:18,226][00216] Fps is (10 sec: 4096.8, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 1056768. Throughput: 0: 957.7. Samples: 263074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:18,228][00216] Avg episode reward: [(0, '4.693')]
[2023-02-26 09:24:19,349][13474] Updated weights for policy 0, policy_version 260 (0.0020)
[2023-02-26 09:24:23,230][00216] Fps is (10 sec: 3684.9, 60 sec: 3686.1, 300 sec: 3637.8). Total num frames: 1073152. Throughput: 0: 931.8. Samples: 268772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:23,238][00216] Avg episode reward: [(0, '4.536')]
[2023-02-26 09:24:28,227][00216] Fps is (10 sec: 2866.7, 60 sec: 3618.0, 300 sec: 3679.4). Total num frames: 1085440. Throughput: 0: 873.7. Samples: 271936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:24:28,231][00216] Avg episode reward: [(0, '4.543')]
[2023-02-26 09:24:33,226][00216] Fps is (10 sec: 2458.6, 60 sec: 3550.5, 300 sec: 3665.6). Total num frames: 1097728. Throughput: 0: 866.9. Samples: 273722. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:33,230][00216] Avg episode reward: [(0, '4.571')]
[2023-02-26 09:24:35,653][13474] Updated weights for policy 0, policy_version 270 (0.0012)
[2023-02-26 09:24:38,226][00216] Fps is (10 sec: 2867.6, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 1114112. Throughput: 0: 862.5. Samples: 278548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:38,229][00216] Avg episode reward: [(0, '4.833')]
[2023-02-26 09:24:38,241][13460] Saving new best policy, reward=4.833!
[2023-02-26 09:24:43,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 1138688. Throughput: 0: 856.6. Samples: 285212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:43,228][00216] Avg episode reward: [(0, '4.914')]
[2023-02-26 09:24:43,236][13460] Saving new best policy, reward=4.914!
[2023-02-26 09:24:45,541][13474] Updated weights for policy 0, policy_version 280 (0.0012)
[2023-02-26 09:24:48,226][00216] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 1150976. Throughput: 0: 838.5. Samples: 287444. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:24:48,229][00216] Avg episode reward: [(0, '5.062')]
[2023-02-26 09:24:48,247][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000281_1150976.pth...
[2023-02-26 09:24:48,396][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000067_274432.pth
[2023-02-26 09:24:48,411][13460] Saving new best policy, reward=5.062!
[2023-02-26 09:24:53,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 1167360. Throughput: 0: 834.6. Samples: 291600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:53,232][00216] Avg episode reward: [(0, '5.176')]
[2023-02-26 09:24:53,237][13460] Saving new best policy, reward=5.176!
[2023-02-26 09:24:57,549][13474] Updated weights for policy 0, policy_version 290 (0.0032)
[2023-02-26 09:24:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 1187840. Throughput: 0: 852.0. Samples: 298030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:24:58,234][00216] Avg episode reward: [(0, '5.084')]
[2023-02-26 09:25:03,225][00216] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 1212416. Throughput: 0: 855.7. Samples: 301580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:25:03,228][00216] Avg episode reward: [(0, '5.305')]
[2023-02-26 09:25:03,235][13460] Saving new best policy, reward=5.305!
[2023-02-26 09:25:07,757][13474] Updated weights for policy 0, policy_version 300 (0.0012)
[2023-02-26 09:25:08,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3679.5). Total num frames: 1228800. Throughput: 0: 851.7. Samples: 307094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:25:08,232][00216] Avg episode reward: [(0, '5.334')]
[2023-02-26 09:25:08,247][13460] Saving new best policy, reward=5.334!
[2023-02-26 09:25:13,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 1241088. Throughput: 0: 876.5. Samples: 311376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:25:13,230][00216] Avg episode reward: [(0, '5.201')]
[2023-02-26 09:25:18,226][00216] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 1265664. Throughput: 0: 910.3. Samples: 314684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:25:18,228][00216] Avg episode reward: [(0, '5.289')]
[2023-02-26 09:25:18,869][13474] Updated weights for policy 0, policy_version 310 (0.0015)
[2023-02-26 09:25:23,230][00216] Fps is (10 sec: 4503.7, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 1286144. Throughput: 0: 954.9. Samples: 321524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:25:23,232][00216] Avg episode reward: [(0, '5.435')]
[2023-02-26 09:25:23,241][13460] Saving new best policy, reward=5.435!
[2023-02-26 09:25:28,228][00216] Fps is (10 sec: 3685.6, 60 sec: 3618.1, 300 sec: 3679.4). Total num frames: 1302528. Throughput: 0: 917.2. Samples: 326488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:25:28,231][00216] Avg episode reward: [(0, '5.516')]
[2023-02-26 09:25:28,252][13460] Saving new best policy, reward=5.516!
[2023-02-26 09:25:30,271][13474] Updated weights for policy 0, policy_version 320 (0.0019)
[2023-02-26 09:25:33,226][00216] Fps is (10 sec: 3278.2, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 1318912. Throughput: 0: 914.7. Samples: 328606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:25:33,233][00216] Avg episode reward: [(0, '5.423')]
[2023-02-26 09:25:38,227][00216] Fps is (10 sec: 3686.8, 60 sec: 3754.6, 300 sec: 3665.6). Total num frames: 1339392. Throughput: 0: 957.3. Samples: 334678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:25:38,232][00216] Avg episode reward: [(0, '5.275')]
[2023-02-26 09:25:40,324][13474] Updated weights for policy 0, policy_version 330 (0.0016)
[2023-02-26 09:25:43,227][00216] Fps is (10 sec: 4505.1, 60 sec: 3754.6, 300 sec: 3693.3). Total num frames: 1363968. Throughput: 0: 963.9. Samples: 341408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:25:43,229][00216] Avg episode reward: [(0, '5.337')]
[2023-02-26 09:25:48,226][00216] Fps is (10 sec: 3686.8, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 1376256. Throughput: 0: 933.0. Samples: 343566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:25:48,228][00216] Avg episode reward: [(0, '5.293')]
[2023-02-26 09:25:52,990][13474] Updated weights for policy 0, policy_version 340 (0.0026)
[2023-02-26 09:25:53,226][00216] Fps is (10 sec: 2867.5, 60 sec: 3754.7, 300 sec: 3637.9). Total num frames: 1392640. Throughput: 0: 904.1. Samples: 347780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:25:53,227][00216] Avg episode reward: [(0, '5.256')]
[2023-02-26 09:25:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 1413120. Throughput: 0: 952.4. Samples: 354236. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 09:25:58,229][00216] Avg episode reward: [(0, '5.181')]
[2023-02-26 09:26:02,133][13474] Updated weights for policy 0, policy_version 350 (0.0020)
[2023-02-26 09:26:03,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 1433600. Throughput: 0: 954.8. Samples: 357648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:26:03,231][00216] Avg episode reward: [(0, '5.417')]
[2023-02-26 09:26:08,229][00216] Fps is (10 sec: 3685.2, 60 sec: 3686.2, 300 sec: 3665.5). Total num frames: 1449984. Throughput: 0: 916.7. Samples: 362774. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:26:08,232][00216] Avg episode reward: [(0, '5.374')]
[2023-02-26 09:26:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 1466368. Throughput: 0: 907.9. Samples: 367342. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:26:13,230][00216] Avg episode reward: [(0, '5.660')]
[2023-02-26 09:26:13,233][13460] Saving new best policy, reward=5.660!
[2023-02-26 09:26:14,460][13474] Updated weights for policy 0, policy_version 360 (0.0013)
[2023-02-26 09:26:18,226][00216] Fps is (10 sec: 4097.3, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 1490944. Throughput: 0: 936.6. Samples: 370754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:26:18,228][00216] Avg episode reward: [(0, '5.510')]
[2023-02-26 09:26:23,232][00216] Fps is (10 sec: 4502.8, 60 sec: 3754.5, 300 sec: 3734.9). Total num frames: 1511424. Throughput: 0: 957.9. Samples: 377790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:26:23,235][00216] Avg episode reward: [(0, '5.276')]
[2023-02-26 09:26:23,625][13474] Updated weights for policy 0, policy_version 370 (0.0031)
[2023-02-26 09:26:28,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3721.1). Total num frames: 1527808. Throughput: 0: 915.2. Samples: 382590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 09:26:28,232][00216] Avg episode reward: [(0, '5.612')]
[2023-02-26 09:26:33,226][00216] Fps is (10 sec: 3278.9, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 1544192. Throughput: 0: 916.3. Samples: 384798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:26:33,232][00216] Avg episode reward: [(0, '5.876')]
[2023-02-26 09:26:33,235][13460] Saving new best policy, reward=5.876!
[2023-02-26 09:26:35,611][13474] Updated weights for policy 0, policy_version 380 (0.0032)
[2023-02-26 09:26:38,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3721.1). Total num frames: 1568768. Throughput: 0: 967.4. Samples: 391312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:26:38,228][00216] Avg episode reward: [(0, '5.991')]
[2023-02-26 09:26:38,238][13460] Saving new best policy, reward=5.991!
[2023-02-26 09:26:43,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 1589248. Throughput: 0: 973.2. Samples: 398028. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:26:43,234][00216] Avg episode reward: [(0, '5.915')]
[2023-02-26 09:26:45,492][13474] Updated weights for policy 0, policy_version 390 (0.0019)
[2023-02-26 09:26:48,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 1605632. Throughput: 0: 948.8. Samples: 400346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:26:48,230][00216] Avg episode reward: [(0, '5.760')]
[2023-02-26 09:26:48,237][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000392_1605632.pth...
[2023-02-26 09:26:48,366][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000175_716800.pth
[2023-02-26 09:26:53,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 1622016. Throughput: 0: 934.2. Samples: 404812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:26:53,228][00216] Avg episode reward: [(0, '6.108')]
[2023-02-26 09:26:53,233][13460] Saving new best policy, reward=6.108!
[2023-02-26 09:26:56,499][13474] Updated weights for policy 0, policy_version 400 (0.0018)
[2023-02-26 09:26:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 1642496. Throughput: 0: 987.5. Samples: 411780. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:26:58,228][00216] Avg episode reward: [(0, '6.231')]
[2023-02-26 09:26:58,244][13460] Saving new best policy, reward=6.231!
[2023-02-26 09:27:03,229][00216] Fps is (10 sec: 4504.0, 60 sec: 3891.0, 300 sec: 3748.8). Total num frames: 1667072. Throughput: 0: 989.2. Samples: 415270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:27:03,235][00216] Avg episode reward: [(0, '6.173')]
[2023-02-26 09:27:07,235][13474] Updated weights for policy 0, policy_version 410 (0.0018)
[2023-02-26 09:27:08,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3721.1). Total num frames: 1679360. Throughput: 0: 942.3. Samples: 420186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:27:08,235][00216] Avg episode reward: [(0, '6.699')]
[2023-02-26 09:27:08,251][13460] Saving new best policy, reward=6.699!
[2023-02-26 09:27:13,226][00216] Fps is (10 sec: 2868.2, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 1695744. Throughput: 0: 939.4. Samples: 424864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:27:13,232][00216] Avg episode reward: [(0, '7.311')]
[2023-02-26 09:27:13,235][13460] Saving new best policy, reward=7.311!
[2023-02-26 09:27:18,038][13474] Updated weights for policy 0, policy_version 420 (0.0023)
[2023-02-26 09:27:18,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 1720320. Throughput: 0: 965.6. Samples: 428248. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:27:18,232][00216] Avg episode reward: [(0, '7.888')]
[2023-02-26 09:27:18,242][13460] Saving new best policy, reward=7.888!
[2023-02-26 09:27:23,226][00216] Fps is (10 sec: 4505.7, 60 sec: 3823.3, 300 sec: 3735.0). Total num frames: 1740800. Throughput: 0: 976.6. Samples: 435258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:27:23,228][00216] Avg episode reward: [(0, '7.251')]
[2023-02-26 09:27:28,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1753088. Throughput: 0: 917.8. Samples: 439330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 09:27:28,228][00216] Avg episode reward: [(0, '7.411')]
[2023-02-26 09:27:30,521][13474] Updated weights for policy 0, policy_version 430 (0.0016)
[2023-02-26 09:27:33,226][00216] Fps is (10 sec: 2457.6, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 1765376. Throughput: 0: 903.9. Samples: 441022. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:27:33,232][00216] Avg episode reward: [(0, '7.307')]
[2023-02-26 09:27:38,229][00216] Fps is (10 sec: 2866.3, 60 sec: 3549.7, 300 sec: 3665.5). Total num frames: 1781760. Throughput: 0: 892.0. Samples: 444954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:27:38,234][00216] Avg episode reward: [(0, '7.298')]
[2023-02-26 09:27:42,718][13474] Updated weights for policy 0, policy_version 440 (0.0022)
[2023-02-26 09:27:43,226][00216] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 1802240. Throughput: 0: 882.8. Samples: 451508. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:27:43,231][00216] Avg episode reward: [(0, '7.609')]
[2023-02-26 09:27:48,227][00216] Fps is (10 sec: 3687.2, 60 sec: 3549.8, 300 sec: 3679.4). Total num frames: 1818624. Throughput: 0: 870.1. Samples: 454422. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:27:48,230][00216] Avg episode reward: [(0, '7.923')]
[2023-02-26 09:27:48,247][13460] Saving new best policy, reward=7.923!
[2023-02-26 09:27:53,233][00216] Fps is (10 sec: 3274.6, 60 sec: 3549.5, 300 sec: 3651.6). Total num frames: 1835008. Throughput: 0: 858.9. Samples: 458844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:27:53,238][00216] Avg episode reward: [(0, '8.439')]
[2023-02-26 09:27:53,243][13460] Saving new best policy, reward=8.439!
[2023-02-26 09:27:55,046][13474] Updated weights for policy 0, policy_version 450 (0.0036)
[2023-02-26 09:27:58,226][00216] Fps is (10 sec: 3686.8, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1855488. Throughput: 0: 889.3. Samples: 464882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:27:58,235][00216] Avg episode reward: [(0, '9.050')]
[2023-02-26 09:27:58,249][13460] Saving new best policy, reward=9.050!
[2023-02-26 09:28:03,226][00216] Fps is (10 sec: 4508.7, 60 sec: 3550.1, 300 sec: 3693.4). Total num frames: 1880064. Throughput: 0: 889.7. Samples: 468286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:28:03,228][00216] Avg episode reward: [(0, '9.371')]
[2023-02-26 09:28:03,232][13460] Saving new best policy, reward=9.371!
[2023-02-26 09:28:03,815][13474] Updated weights for policy 0, policy_version 460 (0.0028)
[2023-02-26 09:28:08,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 1896448. Throughput: 0: 864.6. Samples: 474164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:28:08,231][00216] Avg episode reward: [(0, '9.103')]
[2023-02-26 09:28:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 1912832. Throughput: 0: 872.4. Samples: 478588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:28:13,228][00216] Avg episode reward: [(0, '9.286')]
[2023-02-26 09:28:16,175][13474] Updated weights for policy 0, policy_version 470 (0.0012)
[2023-02-26 09:28:18,227][00216] Fps is (10 sec: 3686.0, 60 sec: 3549.8, 300 sec: 3665.6). Total num frames: 1933312. Throughput: 0: 900.6. Samples: 481552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:28:18,232][00216] Avg episode reward: [(0, '9.031')]
[2023-02-26 09:28:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 1957888. Throughput: 0: 967.8. Samples: 488502. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:28:23,228][00216] Avg episode reward: [(0, '9.141')]
[2023-02-26 09:28:25,365][13474] Updated weights for policy 0, policy_version 480 (0.0014)
[2023-02-26 09:28:28,226][00216] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3693.5). Total num frames: 1974272. Throughput: 0: 944.1. Samples: 493994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:28:28,235][00216] Avg episode reward: [(0, '9.003')]
[2023-02-26 09:28:33,227][00216] Fps is (10 sec: 2866.8, 60 sec: 3686.3, 300 sec: 3651.7). Total num frames: 1986560. Throughput: 0: 927.8. Samples: 496174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:28:33,233][00216] Avg episode reward: [(0, '8.939')]
[2023-02-26 09:28:37,442][13474] Updated weights for policy 0, policy_version 490 (0.0013)
[2023-02-26 09:28:38,226][00216] Fps is (10 sec: 3276.9, 60 sec: 3754.9, 300 sec: 3665.6). Total num frames: 2007040. Throughput: 0: 956.5. Samples: 501880. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:28:38,228][00216] Avg episode reward: [(0, '8.846')]
[2023-02-26 09:28:43,226][00216] Fps is (10 sec: 4506.2, 60 sec: 3823.0, 300 sec: 3693.3). Total num frames: 2031616. Throughput: 0: 978.6. Samples: 508918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:28:43,229][00216] Avg episode reward: [(0, '9.300')]
[2023-02-26 09:28:47,263][13474] Updated weights for policy 0, policy_version 500 (0.0017)
[2023-02-26 09:28:48,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3693.3). Total num frames: 2048000. Throughput: 0: 965.0. Samples: 511710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:28:48,233][00216] Avg episode reward: [(0, '9.482')]
[2023-02-26 09:28:48,247][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000500_2048000.pth...
[2023-02-26 09:28:48,397][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000281_1150976.pth
[2023-02-26 09:28:48,409][13460] Saving new best policy, reward=9.482!
[2023-02-26 09:28:53,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3823.4, 300 sec: 3665.6). Total num frames: 2064384. Throughput: 0: 928.6. Samples: 515950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:28:53,228][00216] Avg episode reward: [(0, '9.634')]
[2023-02-26 09:28:53,231][13460] Saving new best policy, reward=9.634!
[2023-02-26 09:28:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 2084864. Throughput: 0: 967.5. Samples: 522126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:28:58,233][00216] Avg episode reward: [(0, '9.609')]
[2023-02-26 09:28:58,566][13474] Updated weights for policy 0, policy_version 510 (0.0021)
[2023-02-26 09:29:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 2109440. Throughput: 0: 980.4. Samples: 525670. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:29:03,228][00216] Avg episode reward: [(0, '9.504')]
[2023-02-26 09:29:08,226][00216] Fps is (10 sec: 4095.7, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 2125824. Throughput: 0: 954.2. Samples: 531442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:29:08,231][00216] Avg episode reward: [(0, '9.927')]
[2023-02-26 09:29:08,248][13460] Saving new best policy, reward=9.927!
[2023-02-26 09:29:09,217][13474] Updated weights for policy 0, policy_version 520 (0.0018)
[2023-02-26 09:29:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 2142208. Throughput: 0: 929.9. Samples: 535840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:29:13,229][00216] Avg episode reward: [(0, '10.053')]
[2023-02-26 09:29:13,231][13460] Saving new best policy, reward=10.053!
[2023-02-26 09:29:18,226][00216] Fps is (10 sec: 3686.7, 60 sec: 3823.0, 300 sec: 3693.4). Total num frames: 2162688. Throughput: 0: 950.3. Samples: 538936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:29:18,232][00216] Avg episode reward: [(0, '10.755')]
[2023-02-26 09:29:18,242][13460] Saving new best policy, reward=10.755!
[2023-02-26 09:29:19,657][13474] Updated weights for policy 0, policy_version 530 (0.0014)
[2023-02-26 09:29:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 2187264. Throughput: 0: 978.2. Samples: 545898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:29:23,227][00216] Avg episode reward: [(0, '11.304')]
[2023-02-26 09:29:23,234][13460] Saving new best policy, reward=11.304!
[2023-02-26 09:29:28,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 2199552. Throughput: 0: 937.5. Samples: 551104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:29:28,228][00216] Avg episode reward: [(0, '11.625')]
[2023-02-26 09:29:28,245][13460] Saving new best policy, reward=11.625!
[2023-02-26 09:29:31,298][13474] Updated weights for policy 0, policy_version 540 (0.0011)
[2023-02-26 09:29:33,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3735.0). Total num frames: 2215936. Throughput: 0: 922.7. Samples: 553230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:29:33,229][00216] Avg episode reward: [(0, '11.423')]
[2023-02-26 09:29:38,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 2240512. Throughput: 0: 962.5. Samples: 559262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:29:38,228][00216] Avg episode reward: [(0, '11.195')]
[2023-02-26 09:29:40,784][13474] Updated weights for policy 0, policy_version 550 (0.0011)
[2023-02-26 09:29:43,226][00216] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 2260992. Throughput: 0: 983.0. Samples: 566360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:29:43,228][00216] Avg episode reward: [(0, '11.257')]
[2023-02-26 09:29:48,229][00216] Fps is (10 sec: 3685.2, 60 sec: 3822.7, 300 sec: 3762.7). Total num frames: 2277376. Throughput: 0: 961.4. Samples: 568934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:29:48,231][00216] Avg episode reward: [(0, '11.194')]
[2023-02-26 09:29:52,590][13474] Updated weights for policy 0, policy_version 560 (0.0019)
[2023-02-26 09:29:53,227][00216] Fps is (10 sec: 3276.3, 60 sec: 3822.8, 300 sec: 3748.9). Total num frames: 2293760. Throughput: 0: 934.2. Samples: 573480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:29:53,237][00216] Avg episode reward: [(0, '11.864')]
[2023-02-26 09:29:53,243][13460] Saving new best policy, reward=11.864!
[2023-02-26 09:29:58,226][00216] Fps is (10 sec: 4097.3, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 2318336. Throughput: 0: 980.6. Samples: 579966. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:29:58,229][00216] Avg episode reward: [(0, '11.984')]
[2023-02-26 09:29:58,243][13460] Saving new best policy, reward=11.984!
[2023-02-26 09:30:01,761][13474] Updated weights for policy 0, policy_version 570 (0.0014)
[2023-02-26 09:30:03,226][00216] Fps is (10 sec: 4506.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 2338816. Throughput: 0: 988.2. Samples: 583406. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:30:03,234][00216] Avg episode reward: [(0, '12.548')]
[2023-02-26 09:30:03,237][13460] Saving new best policy, reward=12.548!
[2023-02-26 09:30:08,229][00216] Fps is (10 sec: 3685.2, 60 sec: 3822.8, 300 sec: 3776.6). Total num frames: 2355200. Throughput: 0: 954.3. Samples: 588844. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:30:08,232][00216] Avg episode reward: [(0, '11.949')]
[2023-02-26 09:30:13,227][00216] Fps is (10 sec: 3276.4, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2371584. Throughput: 0: 938.3. Samples: 593330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:30:13,231][00216] Avg episode reward: [(0, '11.476')]
[2023-02-26 09:30:14,102][13474] Updated weights for policy 0, policy_version 580 (0.0017)
[2023-02-26 09:30:18,226][00216] Fps is (10 sec: 3687.6, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2392064. Throughput: 0: 965.1. Samples: 596658. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:30:18,234][00216] Avg episode reward: [(0, '11.146')]
[2023-02-26 09:30:22,760][13474] Updated weights for policy 0, policy_version 590 (0.0013)
[2023-02-26 09:30:23,226][00216] Fps is (10 sec: 4505.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2416640. Throughput: 0: 988.5. Samples: 603746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:30:23,228][00216] Avg episode reward: [(0, '11.839')]
[2023-02-26 09:30:28,226][00216] Fps is (10 sec: 4095.8, 60 sec: 3891.2, 300 sec: 3776.6). Total num frames: 2433024. Throughput: 0: 943.6. Samples: 608824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:30:28,229][00216] Avg episode reward: [(0, '12.059')]
[2023-02-26 09:30:33,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2445312. Throughput: 0: 931.9. Samples: 610868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:30:33,229][00216] Avg episode reward: [(0, '12.442')]
[2023-02-26 09:30:36,871][13474] Updated weights for policy 0, policy_version 600 (0.0015)
[2023-02-26 09:30:38,226][00216] Fps is (10 sec: 2457.7, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 2457600. Throughput: 0: 918.0. Samples: 614788. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:30:38,232][00216] Avg episode reward: [(0, '13.385')]
[2023-02-26 09:30:38,250][13460] Saving new best policy, reward=13.385!
[2023-02-26 09:30:43,228][00216] Fps is (10 sec: 3276.3, 60 sec: 3618.0, 300 sec: 3735.0). Total num frames: 2478080. Throughput: 0: 894.8. Samples: 620232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 09:30:43,234][00216] Avg episode reward: [(0, '13.486')]
[2023-02-26 09:30:43,237][13460] Saving new best policy, reward=13.486!
[2023-02-26 09:30:48,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3735.0). Total num frames: 2494464. Throughput: 0: 883.0. Samples: 623140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:30:48,231][00216] Avg episode reward: [(0, '14.082')]
[2023-02-26 09:30:48,245][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000609_2494464.pth...
[2023-02-26 09:30:48,351][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000392_1605632.pth
[2023-02-26 09:30:48,363][13460] Saving new best policy, reward=14.082!
[2023-02-26 09:30:48,845][13474] Updated weights for policy 0, policy_version 610 (0.0014)
[2023-02-26 09:30:53,230][00216] Fps is (10 sec: 3276.1, 60 sec: 3618.0, 300 sec: 3721.1). Total num frames: 2510848. Throughput: 0: 860.3. Samples: 627558. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:30:53,232][00216] Avg episode reward: [(0, '14.645')]
[2023-02-26 09:30:53,241][13460] Saving new best policy, reward=14.645!
[2023-02-26 09:30:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3721.1). Total num frames: 2531328. Throughput: 0: 898.5. Samples: 633762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:30:58,233][00216] Avg episode reward: [(0, '15.440')]
[2023-02-26 09:30:58,246][13460] Saving new best policy, reward=15.440!
[2023-02-26 09:30:59,413][13474] Updated weights for policy 0, policy_version 620 (0.0015)
[2023-02-26 09:31:03,226][00216] Fps is (10 sec: 4507.4, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 2555904. Throughput: 0: 902.2. Samples: 637256. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:31:03,234][00216] Avg episode reward: [(0, '14.876')]
[2023-02-26 09:31:08,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3748.9). Total num frames: 2572288. Throughput: 0: 880.4. Samples: 643362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:31:08,229][00216] Avg episode reward: [(0, '14.582')]
[2023-02-26 09:31:09,953][13474] Updated weights for policy 0, policy_version 630 (0.0038)
[2023-02-26 09:31:13,226][00216] Fps is (10 sec: 3276.9, 60 sec: 3618.2, 300 sec: 3721.1). Total num frames: 2588672. Throughput: 0: 866.8. Samples: 647830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:31:13,231][00216] Avg episode reward: [(0, '14.313')]
[2023-02-26 09:31:18,227][00216] Fps is (10 sec: 3686.0, 60 sec: 3618.1, 300 sec: 3721.2). Total num frames: 2609152. Throughput: 0: 888.7. Samples: 650860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:31:18,230][00216] Avg episode reward: [(0, '14.172')]
[2023-02-26 09:31:20,256][13474] Updated weights for policy 0, policy_version 640 (0.0028)
[2023-02-26 09:31:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3748.9). Total num frames: 2633728. Throughput: 0: 959.3. Samples: 657958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:31:23,232][00216] Avg episode reward: [(0, '15.906')]
[2023-02-26 09:31:23,236][13460] Saving new best policy, reward=15.906!
[2023-02-26 09:31:28,226][00216] Fps is (10 sec: 4096.5, 60 sec: 3618.2, 300 sec: 3748.9). Total num frames: 2650112. Throughput: 0: 956.9. Samples: 663290. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:31:28,231][00216] Avg episode reward: [(0, '16.873')]
[2023-02-26 09:31:28,245][13460] Saving new best policy, reward=16.873!
[2023-02-26 09:31:31,847][13474] Updated weights for policy 0, policy_version 650 (0.0018)
[2023-02-26 09:31:33,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 2666496. Throughput: 0: 940.8. Samples: 665476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:31:33,231][00216] Avg episode reward: [(0, '16.678')]
[2023-02-26 09:31:38,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 2686976. Throughput: 0: 970.5. Samples: 671226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:31:38,228][00216] Avg episode reward: [(0, '16.084')]
[2023-02-26 09:31:41,168][13474] Updated weights for policy 0, policy_version 660 (0.0019)
[2023-02-26 09:31:43,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3748.9). Total num frames: 2711552. Throughput: 0: 993.1. Samples: 678452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:31:43,233][00216] Avg episode reward: [(0, '15.074')]
[2023-02-26 09:31:48,228][00216] Fps is (10 sec: 4095.1, 60 sec: 3891.1, 300 sec: 3748.9). Total num frames: 2727936. Throughput: 0: 976.3. Samples: 681192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:31:48,230][00216] Avg episode reward: [(0, '14.117')]
[2023-02-26 09:31:53,192][13474] Updated weights for policy 0, policy_version 670 (0.0014)
[2023-02-26 09:31:53,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3891.5, 300 sec: 3735.0). Total num frames: 2744320. Throughput: 0: 938.8. Samples: 685610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:31:53,233][00216] Avg episode reward: [(0, '15.521')]
[2023-02-26 09:31:58,226][00216] Fps is (10 sec: 3687.3, 60 sec: 3891.2, 300 sec: 3721.2). Total num frames: 2764800. Throughput: 0: 979.5. Samples: 691908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:31:58,232][00216] Avg episode reward: [(0, '17.436')]
[2023-02-26 09:31:58,242][13460] Saving new best policy, reward=17.436!
[2023-02-26 09:32:02,188][13474] Updated weights for policy 0, policy_version 680 (0.0013)
[2023-02-26 09:32:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 2789376. Throughput: 0: 989.3. Samples: 695376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:32:03,233][00216] Avg episode reward: [(0, '16.932')]
[2023-02-26 09:32:08,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 2805760. Throughput: 0: 959.2. Samples: 701120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:32:08,229][00216] Avg episode reward: [(0, '17.583')]
[2023-02-26 09:32:08,248][13460] Saving new best policy, reward=17.583!
[2023-02-26 09:32:13,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 2818048. Throughput: 0: 939.3. Samples: 705558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:32:13,234][00216] Avg episode reward: [(0, '16.656')]
[2023-02-26 09:32:14,468][13474] Updated weights for policy 0, policy_version 690 (0.0027)
[2023-02-26 09:32:18,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3735.0). Total num frames: 2842624. Throughput: 0: 961.2. Samples: 708730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:32:18,228][00216] Avg episode reward: [(0, '16.884')]
[2023-02-26 09:32:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 2863104. Throughput: 0: 990.0. Samples: 715774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:32:23,228][00216] Avg episode reward: [(0, '17.133')]
[2023-02-26 09:32:23,261][13474] Updated weights for policy 0, policy_version 700 (0.0017)
[2023-02-26 09:32:28,226][00216] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2879488. Throughput: 0: 945.7. Samples: 721010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:32:28,228][00216] Avg episode reward: [(0, '16.495')]
[2023-02-26 09:32:33,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2895872. Throughput: 0: 933.4. Samples: 723194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:32:33,231][00216] Avg episode reward: [(0, '16.505')]
[2023-02-26 09:32:35,532][13474] Updated weights for policy 0, policy_version 710 (0.0023)
[2023-02-26 09:32:38,226][00216] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 2920448. Throughput: 0: 970.7. Samples: 729290. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:32:38,228][00216] Avg episode reward: [(0, '16.442')]
[2023-02-26 09:32:43,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2940928. Throughput: 0: 986.8. Samples: 736314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:32:43,230][00216] Avg episode reward: [(0, '17.443')]
[2023-02-26 09:32:44,797][13474] Updated weights for policy 0, policy_version 720 (0.0015)
[2023-02-26 09:32:48,226][00216] Fps is (10 sec: 3686.3, 60 sec: 3823.1, 300 sec: 3804.5). Total num frames: 2957312. Throughput: 0: 961.2. Samples: 738630. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:32:48,230][00216] Avg episode reward: [(0, '18.086')]
[2023-02-26 09:32:48,246][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000722_2957312.pth...
[2023-02-26 09:32:48,374][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000500_2048000.pth
[2023-02-26 09:32:48,382][13460] Saving new best policy, reward=18.086!
[2023-02-26 09:32:53,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2973696. Throughput: 0: 927.6. Samples: 742862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:32:53,235][00216] Avg episode reward: [(0, '18.974')]
[2023-02-26 09:32:53,237][13460] Saving new best policy, reward=18.974!
[2023-02-26 09:32:56,765][13474] Updated weights for policy 0, policy_version 730 (0.0039)
[2023-02-26 09:32:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2994176. Throughput: 0: 975.5. Samples: 749456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:32:58,228][00216] Avg episode reward: [(0, '20.149')]
[2023-02-26 09:32:58,247][13460] Saving new best policy, reward=20.149!
[2023-02-26 09:33:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3018752. Throughput: 0: 982.8. Samples: 752956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:33:03,230][00216] Avg episode reward: [(0, '19.032')]
[2023-02-26 09:33:07,053][13474] Updated weights for policy 0, policy_version 740 (0.0014)
[2023-02-26 09:33:08,226][00216] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 3031040. Throughput: 0: 942.9. Samples: 758204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:33:08,231][00216] Avg episode reward: [(0, '18.793')]
[2023-02-26 09:33:13,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3047424. Throughput: 0: 930.4. Samples: 762878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:33:13,229][00216] Avg episode reward: [(0, '18.609')]
[2023-02-26 09:33:17,738][13474] Updated weights for policy 0, policy_version 750 (0.0014)
[2023-02-26 09:33:18,226][00216] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3072000. Throughput: 0: 961.6. Samples: 766464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:33:18,233][00216] Avg episode reward: [(0, '18.087')]
[2023-02-26 09:33:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3092480. Throughput: 0: 981.4. Samples: 773452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:33:23,230][00216] Avg episode reward: [(0, '16.883')]
[2023-02-26 09:33:28,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3108864. Throughput: 0: 929.7. Samples: 778152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:33:28,233][00216] Avg episode reward: [(0, '17.660')]
[2023-02-26 09:33:28,781][13474] Updated weights for policy 0, policy_version 760 (0.0020)
[2023-02-26 09:33:33,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3125248. Throughput: 0: 927.1. Samples: 780350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:33:33,227][00216] Avg episode reward: [(0, '16.969')]
[2023-02-26 09:33:38,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 3145728. Throughput: 0: 978.4. Samples: 786890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:33:38,232][00216] Avg episode reward: [(0, '16.538')]
[2023-02-26 09:33:39,947][13474] Updated weights for policy 0, policy_version 770 (0.0024)
[2023-02-26 09:33:43,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 3162112. Throughput: 0: 928.3. Samples: 791230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:33:43,228][00216] Avg episode reward: [(0, '17.020')]
[2023-02-26 09:33:48,228][00216] Fps is (10 sec: 2866.6, 60 sec: 3618.0, 300 sec: 3762.7). Total num frames: 3174400. Throughput: 0: 890.6. Samples: 793036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:33:48,231][00216] Avg episode reward: [(0, '16.985')]
[2023-02-26 09:33:53,226][00216] Fps is (10 sec: 2457.5, 60 sec: 3549.8, 300 sec: 3735.0). Total num frames: 3186688. Throughput: 0: 862.8. Samples: 797032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:33:53,228][00216] Avg episode reward: [(0, '19.298')]
[2023-02-26 09:33:54,426][13474] Updated weights for policy 0, policy_version 780 (0.0020)
[2023-02-26 09:33:58,226][00216] Fps is (10 sec: 3687.1, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3211264. Throughput: 0: 901.5. Samples: 803444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:33:58,229][00216] Avg episode reward: [(0, '18.446')]
[2023-02-26 09:34:03,166][13474] Updated weights for policy 0, policy_version 790 (0.0019)
[2023-02-26 09:34:03,226][00216] Fps is (10 sec: 4915.3, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3235840. Throughput: 0: 897.9. Samples: 806868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:03,234][00216] Avg episode reward: [(0, '21.317')]
[2023-02-26 09:34:03,237][13460] Saving new best policy, reward=21.317!
[2023-02-26 09:34:08,252][00216] Fps is (10 sec: 3676.9, 60 sec: 3616.6, 300 sec: 3748.5). Total num frames: 3248128. Throughput: 0: 869.4. Samples: 812596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:08,256][00216] Avg episode reward: [(0, '21.325')]
[2023-02-26 09:34:08,344][13460] Saving new best policy, reward=21.325!
[2023-02-26 09:34:13,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3264512. Throughput: 0: 863.1. Samples: 816990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:13,233][00216] Avg episode reward: [(0, '21.111')]
[2023-02-26 09:34:15,516][13474] Updated weights for policy 0, policy_version 800 (0.0024)
[2023-02-26 09:34:18,226][00216] Fps is (10 sec: 4106.7, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3289088. Throughput: 0: 886.8. Samples: 820258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:18,231][00216] Avg episode reward: [(0, '19.757')]
[2023-02-26 09:34:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3309568. Throughput: 0: 895.7. Samples: 827196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:23,232][00216] Avg episode reward: [(0, '17.580')]
[2023-02-26 09:34:24,879][13474] Updated weights for policy 0, policy_version 810 (0.0037)
[2023-02-26 09:34:28,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3325952. Throughput: 0: 915.8. Samples: 832442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:34:28,228][00216] Avg episode reward: [(0, '16.973')]
[2023-02-26 09:34:33,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3342336. Throughput: 0: 926.3. Samples: 834718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:33,231][00216] Avg episode reward: [(0, '16.238')]
[2023-02-26 09:34:36,376][13474] Updated weights for policy 0, policy_version 820 (0.0014)
[2023-02-26 09:34:38,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 3366912. Throughput: 0: 974.8. Samples: 840896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:38,228][00216] Avg episode reward: [(0, '15.791')]
[2023-02-26 09:34:43,226][00216] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3391488. Throughput: 0: 992.4. Samples: 848102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:43,231][00216] Avg episode reward: [(0, '16.458')]
[2023-02-26 09:34:45,794][13474] Updated weights for policy 0, policy_version 830 (0.0011)
[2023-02-26 09:34:48,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3762.8). Total num frames: 3403776. Throughput: 0: 969.4. Samples: 850490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:34:48,236][00216] Avg episode reward: [(0, '17.738')]
[2023-02-26 09:34:48,256][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000831_3403776.pth...
[2023-02-26 09:34:48,407][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000609_2494464.pth
[2023-02-26 09:34:53,226][00216] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 3420160. Throughput: 0: 939.8. Samples: 854864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:34:53,231][00216] Avg episode reward: [(0, '18.051')]
[2023-02-26 09:34:57,047][13474] Updated weights for policy 0, policy_version 840 (0.0021)
[2023-02-26 09:34:58,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3444736. Throughput: 0: 994.3. Samples: 861734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:34:58,228][00216] Avg episode reward: [(0, '19.342')]
[2023-02-26 09:35:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3465216. Throughput: 0: 1000.4. Samples: 865278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:35:03,228][00216] Avg episode reward: [(0, '19.225')]
[2023-02-26 09:35:07,555][13474] Updated weights for policy 0, policy_version 850 (0.0012)
[2023-02-26 09:35:08,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3892.9, 300 sec: 3762.8). Total num frames: 3481600. Throughput: 0: 964.3. Samples: 870588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:35:08,229][00216] Avg episode reward: [(0, '19.059')]
[2023-02-26 09:35:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3497984. Throughput: 0: 947.6. Samples: 875084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:35:13,231][00216] Avg episode reward: [(0, '19.283')]
[2023-02-26 09:35:18,143][13474] Updated weights for policy 0, policy_version 860 (0.0013)
[2023-02-26 09:35:18,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3522560. Throughput: 0: 975.5. Samples: 878616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:35:18,228][00216] Avg episode reward: [(0, '18.462')]
[2023-02-26 09:35:23,227][00216] Fps is (10 sec: 4505.1, 60 sec: 3891.1, 300 sec: 3762.8). Total num frames: 3543040. Throughput: 0: 989.7. Samples: 885432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:35:23,231][00216] Avg episode reward: [(0, '18.448')]
[2023-02-26 09:35:28,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 3559424. Throughput: 0: 938.4. Samples: 890328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:35:28,231][00216] Avg episode reward: [(0, '19.081')]
[2023-02-26 09:35:29,453][13474] Updated weights for policy 0, policy_version 870 (0.0011)
[2023-02-26 09:35:33,226][00216] Fps is (10 sec: 3277.2, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3575808. Throughput: 0: 934.2. Samples: 892528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:35:33,229][00216] Avg episode reward: [(0, '19.582')]
[2023-02-26 09:35:38,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 3596288. Throughput: 0: 979.3. Samples: 898932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:35:38,227][00216] Avg episode reward: [(0, '19.699')]
[2023-02-26 09:35:39,250][13474] Updated weights for policy 0, policy_version 880 (0.0019)
[2023-02-26 09:35:43,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3620864. Throughput: 0: 980.8. Samples: 905872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:35:43,233][00216] Avg episode reward: [(0, '19.967')]
[2023-02-26 09:35:48,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.4). Total num frames: 3637248. Throughput: 0: 952.3. Samples: 908130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:35:48,228][00216] Avg episode reward: [(0, '19.150')]
[2023-02-26 09:35:50,978][13474] Updated weights for policy 0, policy_version 890 (0.0014)
[2023-02-26 09:35:53,225][00216] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3653632. Throughput: 0: 933.4. Samples: 912590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:35:53,228][00216] Avg episode reward: [(0, '18.615')]
[2023-02-26 09:35:58,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3674112. Throughput: 0: 991.2. Samples: 919690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:35:58,231][00216] Avg episode reward: [(0, '18.432')]
[2023-02-26 09:36:00,028][13474] Updated weights for policy 0, policy_version 900 (0.0012)
[2023-02-26 09:36:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3698688. Throughput: 0: 991.2. Samples: 923218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:36:03,232][00216] Avg episode reward: [(0, '18.462')]
[2023-02-26 09:36:08,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3710976. Throughput: 0: 952.6. Samples: 928298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:36:08,230][00216] Avg episode reward: [(0, '18.432')]
[2023-02-26 09:36:12,258][13474] Updated weights for policy 0, policy_version 910 (0.0014)
[2023-02-26 09:36:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3731456. Throughput: 0: 955.1. Samples: 933306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:36:13,234][00216] Avg episode reward: [(0, '19.106')]
[2023-02-26 09:36:18,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3751936. Throughput: 0: 983.4. Samples: 936782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:36:18,228][00216] Avg episode reward: [(0, '20.129')]
[2023-02-26 09:36:20,864][13474] Updated weights for policy 0, policy_version 920 (0.0020)
[2023-02-26 09:36:23,226][00216] Fps is (10 sec: 4505.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3776512. Throughput: 0: 997.9. Samples: 943838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 09:36:23,231][00216] Avg episode reward: [(0, '20.616')]
[2023-02-26 09:36:28,233][00216] Fps is (10 sec: 3683.8, 60 sec: 3822.5, 300 sec: 3804.3). Total num frames: 3788800. Throughput: 0: 944.4. Samples: 948376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:36:28,235][00216] Avg episode reward: [(0, '21.902')]
[2023-02-26 09:36:28,254][13460] Saving new best policy, reward=21.902!
[2023-02-26 09:36:32,883][13474] Updated weights for policy 0, policy_version 930 (0.0048)
[2023-02-26 09:36:33,226][00216] Fps is (10 sec: 3277.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3809280. Throughput: 0: 943.1. Samples: 950568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:36:33,229][00216] Avg episode reward: [(0, '21.855')]
[2023-02-26 09:36:38,226][00216] Fps is (10 sec: 4508.8, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 3833856. Throughput: 0: 997.5. Samples: 957478. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:36:38,235][00216] Avg episode reward: [(0, '21.639')]
[2023-02-26 09:36:43,175][13474] Updated weights for policy 0, policy_version 940 (0.0013)
[2023-02-26 09:36:43,226][00216] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 3850240. Throughput: 0: 963.2. Samples: 963036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:36:43,236][00216] Avg episode reward: [(0, '21.340')]
[2023-02-26 09:36:48,230][00216] Fps is (10 sec: 2456.4, 60 sec: 3686.1, 300 sec: 3776.6). Total num frames: 3858432. Throughput: 0: 924.3. Samples: 964814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:36:48,233][00216] Avg episode reward: [(0, '20.290')]
[2023-02-26 09:36:48,350][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000943_3862528.pth...
[2023-02-26 09:36:48,593][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000722_2957312.pth
[2023-02-26 09:36:53,226][00216] Fps is (10 sec: 2048.0, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3870720. Throughput: 0: 887.4. Samples: 968232. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 09:36:53,233][00216] Avg episode reward: [(0, '20.981')]
[2023-02-26 09:36:57,394][13474] Updated weights for policy 0, policy_version 950 (0.0019)
[2023-02-26 09:36:58,226][00216] Fps is (10 sec: 3278.4, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3891200. Throughput: 0: 903.7. Samples: 973972. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 09:36:58,227][00216] Avg episode reward: [(0, '19.445')]
[2023-02-26 09:37:03,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3915776. Throughput: 0: 904.7. Samples: 977494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:37:03,227][00216] Avg episode reward: [(0, '20.813')]
[2023-02-26 09:37:06,358][13474] Updated weights for policy 0, policy_version 960 (0.0013)
[2023-02-26 09:37:08,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 3936256. Throughput: 0: 889.0. Samples: 983844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:37:08,229][00216] Avg episode reward: [(0, '22.096')]
[2023-02-26 09:37:08,243][13460] Saving new best policy, reward=22.096!
[2023-02-26 09:37:13,226][00216] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3948544. Throughput: 0: 889.9. Samples: 988414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 09:37:13,234][00216] Avg episode reward: [(0, '22.623')]
[2023-02-26 09:37:13,239][13460] Saving new best policy, reward=22.623!
[2023-02-26 09:37:18,149][13474] Updated weights for policy 0, policy_version 970 (0.0034)
[2023-02-26 09:37:18,226][00216] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 3973120. Throughput: 0: 900.4. Samples: 991086. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:37:18,232][00216] Avg episode reward: [(0, '22.831')]
[2023-02-26 09:37:18,247][13460] Saving new best policy, reward=22.831!
[2023-02-26 09:37:23,226][00216] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3776.7). Total num frames: 3993600. Throughput: 0: 903.2. Samples: 998120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 09:37:23,228][00216] Avg episode reward: [(0, '23.140')]
[2023-02-26 09:37:23,230][13460] Saving new best policy, reward=23.140!
[2023-02-26 09:37:25,483][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 09:37:25,493][13460] Stopping Batcher_0...
[2023-02-26 09:37:25,493][13460] Loop batcher_evt_loop terminating...
[2023-02-26 09:37:25,494][00216] Component Batcher_0 stopped!
[2023-02-26 09:37:25,516][13475] Stopping RolloutWorker_w0...
[2023-02-26 09:37:25,516][13475] Loop rollout_proc0_evt_loop terminating...
[2023-02-26 09:37:25,517][00216] Component RolloutWorker_w0 stopped!
[2023-02-26 09:37:25,557][13481] Stopping RolloutWorker_w6...
[2023-02-26 09:37:25,557][13481] Loop rollout_proc6_evt_loop terminating...
[2023-02-26 09:37:25,555][13478] Stopping RolloutWorker_w4...
[2023-02-26 09:37:25,559][00216] Component RolloutWorker_w4 stopped!
[2023-02-26 09:37:25,559][13478] Loop rollout_proc4_evt_loop terminating...
[2023-02-26 09:37:25,561][00216] Component RolloutWorker_w6 stopped!
[2023-02-26 09:37:25,575][13477] Stopping RolloutWorker_w2...
[2023-02-26 09:37:25,579][13477] Loop rollout_proc2_evt_loop terminating...
[2023-02-26 09:37:25,575][00216] Component RolloutWorker_w2 stopped!
[2023-02-26 09:37:25,666][13474] Weights refcount: 2 0
[2023-02-26 09:37:25,682][00216] Component InferenceWorker_p0-w0 stopped!
[2023-02-26 09:37:25,684][13474] Stopping InferenceWorker_p0-w0...
[2023-02-26 09:37:25,685][13474] Loop inference_proc0-0_evt_loop terminating...
[2023-02-26 09:37:25,757][00216] Component RolloutWorker_w5 stopped!
[2023-02-26 09:37:25,769][13480] Stopping RolloutWorker_w5...
[2023-02-26 09:37:25,770][13480] Loop rollout_proc5_evt_loop terminating...
[2023-02-26 09:37:25,772][00216] Component RolloutWorker_w3 stopped!
[2023-02-26 09:37:25,774][13479] Stopping RolloutWorker_w3...
[2023-02-26 09:37:25,753][13460] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000831_3403776.pth
[2023-02-26 09:37:25,775][13479] Loop rollout_proc3_evt_loop terminating...
[2023-02-26 09:37:25,793][13460] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 09:37:25,808][13476] Stopping RolloutWorker_w1...
[2023-02-26 09:37:25,808][00216] Component RolloutWorker_w1 stopped!
[2023-02-26 09:37:25,818][13482] Stopping RolloutWorker_w7...
[2023-02-26 09:37:25,818][00216] Component RolloutWorker_w7 stopped!
[2023-02-26 09:37:25,826][13476] Loop rollout_proc1_evt_loop terminating...
[2023-02-26 09:37:25,846][13482] Loop rollout_proc7_evt_loop terminating...
[2023-02-26 09:37:26,121][13460] Stopping LearnerWorker_p0...
[2023-02-26 09:37:26,122][00216] Component LearnerWorker_p0 stopped!
[2023-02-26 09:37:26,123][13460] Loop learner_proc0_evt_loop terminating...
[2023-02-26 09:37:26,123][00216] Waiting for process learner_proc0 to stop...
[2023-02-26 09:37:28,253][00216] Waiting for process inference_proc0-0 to join...
[2023-02-26 09:37:29,031][00216] Waiting for process rollout_proc0 to join...
[2023-02-26 09:37:29,081][00216] Waiting for process rollout_proc1 to join...
[2023-02-26 09:37:29,845][00216] Waiting for process rollout_proc2 to join...
[2023-02-26 09:37:29,847][00216] Waiting for process rollout_proc3 to join...
[2023-02-26 09:37:29,856][00216] Waiting for process rollout_proc4 to join...
[2023-02-26 09:37:29,861][00216] Waiting for process rollout_proc5 to join...
[2023-02-26 09:37:29,866][00216] Waiting for process rollout_proc6 to join...
[2023-02-26 09:37:29,868][00216] Waiting for process rollout_proc7 to join...
[2023-02-26 09:37:29,875][00216] Batcher 0 profile tree view:
batching: 25.9250, releasing_batches: 0.0241
[2023-02-26 09:37:29,877][00216] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 534.9245
update_model: 7.6956
  weight_update: 0.0036
one_step: 0.0167
  handle_policy_step: 502.2466
    deserialize: 14.6894, stack: 2.8199, obs_to_device_normalize: 113.4075, forward: 239.1676, send_messages: 25.3623
    prepare_outputs: 81.2103
      to_cpu: 50.4486
[2023-02-26 09:37:29,879][00216] Learner 0 profile tree view:
misc: 0.0062, prepare_batch: 15.6359
train: 75.1877
  epoch_init: 0.0058, minibatch_init: 0.0171, losses_postprocess: 0.5741, kl_divergence: 0.5399, after_optimizer: 32.9012
  calculate_losses: 26.7405
    losses_init: 0.0037, forward_head: 1.6479, bptt_initial: 17.7252, tail: 1.0197, advantages_returns: 0.3107, losses: 3.6813
    bptt: 2.0454
      bptt_forward_core: 1.9500
  update: 13.7707
    clip: 1.3561
[2023-02-26 09:37:29,881][00216] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3931, enqueue_policy_requests: 147.6515, env_step: 811.2446, overhead: 20.7207, complete_rollouts: 6.2950
save_policy_outputs: 20.2096
  split_output_tensors: 9.7217
[2023-02-26 09:37:29,884][00216] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.2951, enqueue_policy_requests: 144.5329, env_step: 815.4259, overhead: 20.3001, complete_rollouts: 7.1259
save_policy_outputs: 19.9807
  split_output_tensors: 9.4458
[2023-02-26 09:37:29,887][00216] Loop Runner_EvtLoop terminating...
[2023-02-26 09:37:29,889][00216] Runner profile tree view:
main_loop: 1115.8789
[2023-02-26 09:37:29,891][00216] Collected {0: 4005888}, FPS: 3589.9
[2023-02-26 09:49:35,609][00216] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 09:49:35,611][00216] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 09:49:35,614][00216] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 09:49:35,617][00216] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 09:49:35,619][00216] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 09:49:35,620][00216] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 09:49:35,623][00216] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 09:49:35,625][00216] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 09:49:35,627][00216] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-26 09:49:35,630][00216] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-26 09:49:35,631][00216] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 09:49:35,633][00216] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 09:49:35,634][00216] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 09:49:35,635][00216] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 09:49:35,636][00216] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 09:49:35,665][00216] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 09:49:35,669][00216] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 09:49:35,673][00216] RunningMeanStd input shape: (1,)
[2023-02-26 09:49:35,691][00216] ConvEncoder: input_channels=3
[2023-02-26 09:49:36,352][00216] Conv encoder output size: 512
[2023-02-26 09:49:36,354][00216] Policy head output size: 512
[2023-02-26 09:49:38,877][00216] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 09:49:40,155][00216] Num frames 100...
[2023-02-26 09:49:40,270][00216] Num frames 200...
[2023-02-26 09:49:40,384][00216] Num frames 300...
[2023-02-26 09:49:40,501][00216] Avg episode rewards: #0: 4.520, true rewards: #0: 3.520
[2023-02-26 09:49:40,503][00216] Avg episode reward: 4.520, avg true_objective: 3.520
[2023-02-26 09:49:40,567][00216] Num frames 400...
[2023-02-26 09:49:40,684][00216] Num frames 500...
[2023-02-26 09:49:40,811][00216] Num frames 600...
[2023-02-26 09:49:40,930][00216] Num frames 700...
[2023-02-26 09:49:41,081][00216] Num frames 800...
[2023-02-26 09:49:41,241][00216] Num frames 900...
[2023-02-26 09:49:41,401][00216] Num frames 1000...
[2023-02-26 09:49:41,547][00216] Avg episode rewards: #0: 9.780, true rewards: #0: 5.280
[2023-02-26 09:49:41,549][00216] Avg episode reward: 9.780, avg true_objective: 5.280
[2023-02-26 09:49:41,622][00216] Num frames 1100...
[2023-02-26 09:49:41,781][00216] Num frames 1200...
[2023-02-26 09:49:41,967][00216] Num frames 1300...
[2023-02-26 09:49:42,136][00216] Num frames 1400...
[2023-02-26 09:49:42,304][00216] Num frames 1500...
[2023-02-26 09:49:42,466][00216] Num frames 1600...
[2023-02-26 09:49:42,632][00216] Avg episode rewards: #0: 9.547, true rewards: #0: 5.547
[2023-02-26 09:49:42,635][00216] Avg episode reward: 9.547, avg true_objective: 5.547
[2023-02-26 09:49:42,709][00216] Num frames 1700...
[2023-02-26 09:49:42,879][00216] Num frames 1800...
[2023-02-26 09:49:43,056][00216] Num frames 1900...
[2023-02-26 09:49:43,224][00216] Num frames 2000...
[2023-02-26 09:49:43,394][00216] Num frames 2100...
[2023-02-26 09:49:43,566][00216] Num frames 2200...
[2023-02-26 09:49:43,742][00216] Num frames 2300...
[2023-02-26 09:49:43,923][00216] Num frames 2400...
[2023-02-26 09:49:44,092][00216] Num frames 2500...
[2023-02-26 09:49:44,259][00216] Num frames 2600...
[2023-02-26 09:49:44,431][00216] Num frames 2700...
[2023-02-26 09:49:44,524][00216] Avg episode rewards: #0: 13.800, true rewards: #0: 6.800
[2023-02-26 09:49:44,526][00216] Avg episode reward: 13.800, avg true_objective: 6.800
[2023-02-26 09:49:44,669][00216] Num frames 2800...
[2023-02-26 09:49:44,807][00216] Num frames 2900...
[2023-02-26 09:49:44,945][00216] Num frames 3000...
[2023-02-26 09:49:45,077][00216] Num frames 3100...
[2023-02-26 09:49:45,194][00216] Num frames 3200...
[2023-02-26 09:49:45,312][00216] Num frames 3300...
[2023-02-26 09:49:45,429][00216] Num frames 3400...
[2023-02-26 09:49:45,549][00216] Num frames 3500...
[2023-02-26 09:49:45,676][00216] Num frames 3600...
[2023-02-26 09:49:45,800][00216] Num frames 3700...
[2023-02-26 09:49:45,924][00216] Num frames 3800...
[2023-02-26 09:49:46,055][00216] Num frames 3900...
[2023-02-26 09:49:46,177][00216] Num frames 4000...
[2023-02-26 09:49:46,301][00216] Num frames 4100...
[2023-02-26 09:49:46,419][00216] Num frames 4200...
[2023-02-26 09:49:46,574][00216] Avg episode rewards: #0: 18.568, true rewards: #0: 8.568
[2023-02-26 09:49:46,577][00216] Avg episode reward: 18.568, avg true_objective: 8.568
[2023-02-26 09:49:46,599][00216] Num frames 4300...
[2023-02-26 09:49:46,727][00216] Num frames 4400...
[2023-02-26 09:49:46,854][00216] Num frames 4500...
[2023-02-26 09:49:46,978][00216] Num frames 4600...
[2023-02-26 09:49:47,106][00216] Num frames 4700...
[2023-02-26 09:49:47,224][00216] Num frames 4800...
[2023-02-26 09:49:47,352][00216] Num frames 4900...
[2023-02-26 09:49:47,471][00216] Num frames 5000...
[2023-02-26 09:49:47,598][00216] Num frames 5100...
[2023-02-26 09:49:47,735][00216] Num frames 5200...
[2023-02-26 09:49:47,811][00216] Avg episode rewards: #0: 18.525, true rewards: #0: 8.692
[2023-02-26 09:49:47,812][00216] Avg episode reward: 18.525, avg true_objective: 8.692
[2023-02-26 09:49:47,915][00216] Num frames 5300...
[2023-02-26 09:49:48,032][00216] Num frames 5400...
[2023-02-26 09:49:48,148][00216] Num frames 5500...
[2023-02-26 09:49:48,271][00216] Num frames 5600...
[2023-02-26 09:49:48,388][00216] Num frames 5700...
[2023-02-26 09:49:48,507][00216] Num frames 5800...
[2023-02-26 09:49:48,623][00216] Num frames 5900...
[2023-02-26 09:49:48,790][00216] Avg episode rewards: #0: 17.690, true rewards: #0: 8.547
[2023-02-26 09:49:48,792][00216] Avg episode reward: 17.690, avg true_objective: 8.547
[2023-02-26 09:49:48,818][00216] Num frames 6000...
[2023-02-26 09:49:48,940][00216] Num frames 6100...
[2023-02-26 09:49:49,068][00216] Num frames 6200...
[2023-02-26 09:49:49,187][00216] Num frames 6300...
[2023-02-26 09:49:49,316][00216] Num frames 6400...
[2023-02-26 09:49:49,437][00216] Num frames 6500...
[2023-02-26 09:49:49,572][00216] Num frames 6600...
[2023-02-26 09:49:49,701][00216] Num frames 6700...
[2023-02-26 09:49:49,789][00216] Avg episode rewards: #0: 17.399, true rewards: #0: 8.399
[2023-02-26 09:49:49,791][00216] Avg episode reward: 17.399, avg true_objective: 8.399
[2023-02-26 09:49:49,889][00216] Num frames 6800...
[2023-02-26 09:49:50,008][00216] Num frames 6900...
[2023-02-26 09:49:50,123][00216] Num frames 7000...
[2023-02-26 09:49:50,239][00216] Num frames 7100...
[2023-02-26 09:49:50,364][00216] Num frames 7200...
[2023-02-26 09:49:50,493][00216] Num frames 7300...
[2023-02-26 09:49:50,619][00216] Num frames 7400...
[2023-02-26 09:49:50,743][00216] Num frames 7500...
[2023-02-26 09:49:50,869][00216] Num frames 7600...
[2023-02-26 09:49:51,003][00216] Num frames 7700...
[2023-02-26 09:49:51,127][00216] Num frames 7800...
[2023-02-26 09:49:51,252][00216] Num frames 7900...
[2023-02-26 09:49:51,370][00216] Num frames 8000...
[2023-02-26 09:49:51,489][00216] Num frames 8100...
[2023-02-26 09:49:51,617][00216] Num frames 8200...
[2023-02-26 09:49:51,737][00216] Num frames 8300...
[2023-02-26 09:49:51,872][00216] Num frames 8400...
[2023-02-26 09:49:51,990][00216] Num frames 8500...
[2023-02-26 09:49:52,114][00216] Num frames 8600...
[2023-02-26 09:49:52,233][00216] Num frames 8700...
[2023-02-26 09:49:52,361][00216] Num frames 8800...
[2023-02-26 09:49:52,441][00216] Avg episode rewards: #0: 21.132, true rewards: #0: 9.799
[2023-02-26 09:49:52,442][00216] Avg episode reward: 21.132, avg true_objective: 9.799
[2023-02-26 09:49:52,544][00216] Num frames 8900...
[2023-02-26 09:49:52,660][00216] Num frames 9000...
[2023-02-26 09:49:52,779][00216] Num frames 9100...
[2023-02-26 09:49:52,893][00216] Num frames 9200...
[2023-02-26 09:49:53,017][00216] Num frames 9300...
[2023-02-26 09:49:53,137][00216] Num frames 9400...
[2023-02-26 09:49:53,264][00216] Num frames 9500...
[2023-02-26 09:49:53,380][00216] Num frames 9600...
[2023-02-26 09:49:53,503][00216] Num frames 9700...
[2023-02-26 09:49:53,595][00216] Avg episode rewards: #0: 20.632, true rewards: #0: 9.732
[2023-02-26 09:49:53,597][00216] Avg episode reward: 20.632, avg true_objective: 9.732
[2023-02-26 09:50:58,109][00216] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-26 09:52:41,498][00216] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 09:52:41,501][00216] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 09:52:41,504][00216] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 09:52:41,507][00216] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 09:52:41,508][00216] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 09:52:41,510][00216] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 09:52:41,512][00216] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-26 09:52:41,514][00216] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 09:52:41,515][00216] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-26 09:52:41,517][00216] Adding new argument 'hf_repository'='numan966/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-26 09:52:41,519][00216] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 09:52:41,520][00216] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 09:52:41,522][00216] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 09:52:41,523][00216] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 09:52:41,525][00216] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 09:52:41,557][00216] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 09:52:41,560][00216] RunningMeanStd input shape: (1,)
[2023-02-26 09:52:41,577][00216] ConvEncoder: input_channels=3
[2023-02-26 09:52:41,619][00216] Conv encoder output size: 512
[2023-02-26 09:52:41,621][00216] Policy head output size: 512
[2023-02-26 09:52:41,644][00216] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 09:52:42,163][00216] Num frames 100...
[2023-02-26 09:52:42,297][00216] Num frames 200...
[2023-02-26 09:52:42,435][00216] Num frames 300...
[2023-02-26 09:52:42,570][00216] Num frames 400...
[2023-02-26 09:52:42,708][00216] Num frames 500...
[2023-02-26 09:52:42,845][00216] Num frames 600...
[2023-02-26 09:52:42,978][00216] Num frames 700...
[2023-02-26 09:52:43,110][00216] Num frames 800...
[2023-02-26 09:52:43,247][00216] Num frames 900...
[2023-02-26 09:52:43,385][00216] Num frames 1000...
[2023-02-26 09:52:43,519][00216] Num frames 1100...
[2023-02-26 09:52:43,646][00216] Num frames 1200...
[2023-02-26 09:52:43,779][00216] Num frames 1300...
[2023-02-26 09:52:43,913][00216] Num frames 1400...
[2023-02-26 09:52:44,047][00216] Num frames 1500...
[2023-02-26 09:52:44,181][00216] Num frames 1600...
[2023-02-26 09:52:44,315][00216] Num frames 1700...
[2023-02-26 09:52:44,460][00216] Num frames 1800...
[2023-02-26 09:52:44,588][00216] Num frames 1900...
[2023-02-26 09:52:44,727][00216] Num frames 2000...
[2023-02-26 09:52:44,862][00216] Num frames 2100...
[2023-02-26 09:52:44,922][00216] Avg episode rewards: #0: 58.999, true rewards: #0: 21.000
[2023-02-26 09:52:44,926][00216] Avg episode reward: 58.999, avg true_objective: 21.000
[2023-02-26 09:52:45,054][00216] Num frames 2200...
[2023-02-26 09:52:45,189][00216] Num frames 2300...
[2023-02-26 09:52:45,328][00216] Num frames 2400...
[2023-02-26 09:52:45,464][00216] Num frames 2500...
[2023-02-26 09:52:45,600][00216] Num frames 2600...
[2023-02-26 09:52:45,728][00216] Num frames 2700...
[2023-02-26 09:52:45,864][00216] Num frames 2800...
[2023-02-26 09:52:45,998][00216] Num frames 2900...
[2023-02-26 09:52:46,129][00216] Num frames 3000...
[2023-02-26 09:52:46,271][00216] Num frames 3100...
[2023-02-26 09:52:46,402][00216] Num frames 3200...
[2023-02-26 09:52:46,545][00216] Num frames 3300...
[2023-02-26 09:52:46,663][00216] Avg episode rewards: #0: 42.240, true rewards: #0: 16.740
[2023-02-26 09:52:46,668][00216] Avg episode reward: 42.240, avg true_objective: 16.740
[2023-02-26 09:52:46,739][00216] Num frames 3400...
[2023-02-26 09:52:46,868][00216] Num frames 3500...
[2023-02-26 09:52:47,004][00216] Num frames 3600...
[2023-02-26 09:52:47,143][00216] Num frames 3700...
[2023-02-26 09:52:47,278][00216] Num frames 3800...
[2023-02-26 09:52:47,416][00216] Num frames 3900...
[2023-02-26 09:52:47,559][00216] Num frames 4000...
[2023-02-26 09:52:47,686][00216] Num frames 4100...
[2023-02-26 09:52:47,821][00216] Num frames 4200...
[2023-02-26 09:52:47,953][00216] Num frames 4300...
[2023-02-26 09:52:48,086][00216] Num frames 4400...
[2023-02-26 09:52:48,254][00216] Num frames 4500...
[2023-02-26 09:52:48,487][00216] Avg episode rewards: #0: 36.986, true rewards: #0: 15.320
[2023-02-26 09:52:48,490][00216] Avg episode reward: 36.986, avg true_objective: 15.320
[2023-02-26 09:52:48,503][00216] Num frames 4600...
[2023-02-26 09:52:48,693][00216] Num frames 4700...
[2023-02-26 09:52:48,882][00216] Num frames 4800...
[2023-02-26 09:52:49,069][00216] Num frames 4900...
[2023-02-26 09:52:49,269][00216] Num frames 5000...
[2023-02-26 09:52:49,456][00216] Num frames 5100...
[2023-02-26 09:52:49,657][00216] Num frames 5200...
[2023-02-26 09:52:49,846][00216] Num frames 5300...
[2023-02-26 09:52:50,038][00216] Num frames 5400...
[2023-02-26 09:52:50,221][00216] Num frames 5500...
[2023-02-26 09:52:50,424][00216] Num frames 5600...
[2023-02-26 09:52:50,603][00216] Num frames 5700...
[2023-02-26 09:52:50,784][00216] Num frames 5800...
[2023-02-26 09:52:50,988][00216] Avg episode rewards: #0: 34.190, true rewards: #0: 14.690
[2023-02-26 09:52:50,990][00216] Avg episode reward: 34.190, avg true_objective: 14.690
[2023-02-26 09:52:51,041][00216] Num frames 5900...
[2023-02-26 09:52:51,225][00216] Num frames 6000...
[2023-02-26 09:52:51,420][00216] Num frames 6100...
[2023-02-26 09:52:51,611][00216] Num frames 6200...
[2023-02-26 09:52:51,816][00216] Num frames 6300...
[2023-02-26 09:52:51,922][00216] Avg episode rewards: #0: 28.448, true rewards: #0: 12.648
[2023-02-26 09:52:51,925][00216] Avg episode reward: 28.448, avg true_objective: 12.648
[2023-02-26 09:52:52,072][00216] Num frames 6400...
[2023-02-26 09:52:52,269][00216] Num frames 6500...
[2023-02-26 09:52:52,459][00216] Num frames 6600...
[2023-02-26 09:52:52,591][00216] Num frames 6700...
[2023-02-26 09:52:52,740][00216] Num frames 6800...
[2023-02-26 09:52:52,870][00216] Num frames 6900...
[2023-02-26 09:52:53,014][00216] Num frames 7000...
[2023-02-26 09:52:53,145][00216] Num frames 7100...
[2023-02-26 09:52:53,277][00216] Num frames 7200...
[2023-02-26 09:52:53,411][00216] Num frames 7300...
[2023-02-26 09:52:53,539][00216] Num frames 7400...
[2023-02-26 09:52:53,678][00216] Num frames 7500...
[2023-02-26 09:52:53,813][00216] Num frames 7600...
[2023-02-26 09:52:53,955][00216] Num frames 7700...
[2023-02-26 09:52:54,084][00216] Num frames 7800...
[2023-02-26 09:52:54,260][00216] Avg episode rewards: #0: 30.153, true rewards: #0: 13.153
[2023-02-26 09:52:54,264][00216] Avg episode reward: 30.153, avg true_objective: 13.153
[2023-02-26 09:52:54,283][00216] Num frames 7900...
[2023-02-26 09:52:54,414][00216] Num frames 8000...
[2023-02-26 09:52:54,550][00216] Num frames 8100...
[2023-02-26 09:52:54,681][00216] Num frames 8200...
[2023-02-26 09:52:54,826][00216] Num frames 8300...
[2023-02-26 09:52:54,966][00216] Num frames 8400...
[2023-02-26 09:52:55,102][00216] Num frames 8500...
[2023-02-26 09:52:55,243][00216] Num frames 8600...
[2023-02-26 09:52:55,379][00216] Num frames 8700...
[2023-02-26 09:52:55,517][00216] Num frames 8800...
[2023-02-26 09:52:55,646][00216] Num frames 8900...
[2023-02-26 09:52:55,789][00216] Num frames 9000...
[2023-02-26 09:52:55,928][00216] Num frames 9100...
[2023-02-26 09:52:56,065][00216] Num frames 9200...
[2023-02-26 09:52:56,196][00216] Num frames 9300...
[2023-02-26 09:52:56,334][00216] Num frames 9400...
[2023-02-26 09:52:56,465][00216] Num frames 9500...
[2023-02-26 09:52:56,645][00216] Avg episode rewards: #0: 30.839, true rewards: #0: 13.696
[2023-02-26 09:52:56,647][00216] Avg episode reward: 30.839, avg true_objective: 13.696
[2023-02-26 09:52:56,670][00216] Num frames 9600...
[2023-02-26 09:52:56,810][00216] Num frames 9700...
[2023-02-26 09:52:56,940][00216] Num frames 9800...
[2023-02-26 09:52:57,073][00216] Num frames 9900...
[2023-02-26 09:52:57,221][00216] Avg episode rewards: #0: 27.714, true rewards: #0: 12.464
[2023-02-26 09:52:57,223][00216] Avg episode reward: 27.714, avg true_objective: 12.464
[2023-02-26 09:52:57,266][00216] Num frames 10000...
[2023-02-26 09:52:57,394][00216] Num frames 10100...
[2023-02-26 09:52:57,523][00216] Num frames 10200...
[2023-02-26 09:52:57,648][00216] Num frames 10300...
[2023-02-26 09:52:57,776][00216] Avg episode rewards: #0: 25.283, true rewards: #0: 11.506
[2023-02-26 09:52:57,778][00216] Avg episode reward: 25.283, avg true_objective: 11.506
[2023-02-26 09:52:57,842][00216] Num frames 10400...
[2023-02-26 09:52:57,979][00216] Num frames 10500...
[2023-02-26 09:52:58,109][00216] Num frames 10600...
[2023-02-26 09:52:58,245][00216] Num frames 10700...
[2023-02-26 09:52:58,390][00216] Num frames 10800...
[2023-02-26 09:52:58,524][00216] Num frames 10900...
[2023-02-26 09:52:58,667][00216] Num frames 11000...
[2023-02-26 09:52:58,805][00216] Num frames 11100...
[2023-02-26 09:52:58,943][00216] Num frames 11200...
[2023-02-26 09:52:59,078][00216] Num frames 11300...
[2023-02-26 09:52:59,210][00216] Num frames 11400...
[2023-02-26 09:52:59,348][00216] Num frames 11500...
[2023-02-26 09:52:59,482][00216] Num frames 11600...
[2023-02-26 09:52:59,615][00216] Avg episode rewards: #0: 25.756, true rewards: #0: 11.656
[2023-02-26 09:52:59,618][00216] Avg episode reward: 25.756, avg true_objective: 11.656
[2023-02-26 09:54:22,274][00216] Replay video saved to /content/train_dir/default_experiment/replay.mp4!