[2023-02-26 17:13:02,693][00199] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-26 17:13:02,698][00199] Rollout worker 0 uses device cpu
[2023-02-26 17:13:02,699][00199] Rollout worker 1 uses device cpu
[2023-02-26 17:13:02,702][00199] Rollout worker 2 uses device cpu
[2023-02-26 17:13:02,703][00199] Rollout worker 3 uses device cpu
[2023-02-26 17:13:02,706][00199] Rollout worker 4 uses device cpu
[2023-02-26 17:13:02,707][00199] Rollout worker 5 uses device cpu
[2023-02-26 17:13:02,708][00199] Rollout worker 6 uses device cpu
[2023-02-26 17:13:02,711][00199] Rollout worker 7 uses device cpu
[2023-02-26 17:13:02,934][00199] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:13:02,939][00199] InferenceWorker_p0-w0: min num requests: 2
[2023-02-26 17:13:02,981][00199] Starting all processes...
[2023-02-26 17:13:02,984][00199] Starting process learner_proc0
[2023-02-26 17:13:03,064][00199] Starting all processes...
[2023-02-26 17:13:03,081][00199] Starting process inference_proc0-0
[2023-02-26 17:13:03,084][00199] Starting process rollout_proc0
[2023-02-26 17:13:03,084][00199] Starting process rollout_proc1
[2023-02-26 17:13:03,084][00199] Starting process rollout_proc2
[2023-02-26 17:13:03,118][00199] Starting process rollout_proc3
[2023-02-26 17:13:03,118][00199] Starting process rollout_proc4
[2023-02-26 17:13:03,138][00199] Starting process rollout_proc5
[2023-02-26 17:13:03,143][00199] Starting process rollout_proc6
[2023-02-26 17:13:03,145][00199] Starting process rollout_proc7
[2023-02-26 17:13:12,509][11467] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:13:12,509][11467] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-26 17:13:12,878][11484] Worker 1 uses CPU cores [1]
[2023-02-26 17:13:13,129][11482] Worker 0 uses CPU cores [0]
[2023-02-26 17:13:13,146][11489] Worker 7 uses CPU cores [1]
[2023-02-26 17:13:13,193][11483] Worker 2 uses CPU cores [0]
[2023-02-26 17:13:13,213][11487] Worker 5 uses CPU cores [1]
[2023-02-26 17:13:13,253][11467] Num visible devices: 1
[2023-02-26 17:13:13,269][11480] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:13:13,269][11480] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-26 17:13:13,277][11467] Starting seed is not provided
[2023-02-26 17:13:13,278][11467] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:13:13,278][11467] Initializing actor-critic model on device cuda:0
[2023-02-26 17:13:13,279][11467] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 17:13:13,282][11467] RunningMeanStd input shape: (1,)
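The two RunningMeanStd modules above are the observation normalizer (input shape (3, 72, 128)) and the returns normalizer (shape (1,)). A minimal sketch of the running-statistics update such a module performs, assuming a standard Welford/Chan parallel-variance scheme (Sample Factory's real version is in-place and TorchScripted):

```python
import torch

class RunningMeanStd:
    """Tracks running mean/variance of a stream of tensors (Chan et al. update)."""
    def __init__(self, shape, eps=1e-4):
        self.mean = torch.zeros(shape)
        self.var = torch.ones(shape)
        self.count = eps  # avoids division by zero before the first update

    def update(self, batch):
        batch_mean = batch.mean(dim=0)
        batch_var = batch.var(dim=0, unbiased=False)
        batch_count = batch.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        # combine the variances of the two partitions (parallel-variance formula)
        self.var = (self.var * self.count + batch_var * batch_count
                    + delta ** 2 * self.count * batch_count / total) / total
        self.mean = self.mean + delta * batch_count / total
        self.count = total

    def normalize(self, x):
        return (x - self.mean) / torch.sqrt(self.var + 1e-8)
```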
[2023-02-26 17:13:13,297][11480] Num visible devices: 1
[2023-02-26 17:13:13,304][11488] Worker 6 uses CPU cores [0]
[2023-02-26 17:13:13,318][11486] Worker 3 uses CPU cores [1]
[2023-02-26 17:13:13,331][11467] ConvEncoder: input_channels=3
[2023-02-26 17:13:13,335][11485] Worker 4 uses CPU cores [0]
[2023-02-26 17:13:13,620][11467] Conv encoder output size: 512
[2023-02-26 17:13:13,620][11467] Policy head output size: 512
[2023-02-26 17:13:13,674][11467] Created Actor Critic model with architecture:
[2023-02-26 17:13:13,674][11467] ActorCriticSharedWeights(
(obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace(
(running_mean_std): ModuleDict(
(obs): RunningMeanStdInPlace()
)
)
)
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
(encoder): VizdoomEncoder(
(basic_encoder): ConvEncoder(
(enc): RecursiveScriptModule(
original_name=ConvEncoderImpl
(conv_head): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=ELU)
(2): RecursiveScriptModule(original_name=Conv2d)
(3): RecursiveScriptModule(original_name=ELU)
(4): RecursiveScriptModule(original_name=Conv2d)
(5): RecursiveScriptModule(original_name=ELU)
)
(mlp_layers): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Linear)
(1): RecursiveScriptModule(original_name=ELU)
)
)
)
)
(core): ModelCoreRNN(
(core): GRU(512, 512)
)
(decoder): MlpDecoder(
(mlp): Identity()
)
(critic_linear): Linear(in_features=512, out_features=1, bias=True)
(action_parameterization): ActionParameterizationDefault(
(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
)
)
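The printed module tree can be reproduced almost verbatim in plain PyTorch. Below is a sketch: the conv kernel sizes and strides are assumptions (the dump only shows three Conv2d+ELU pairs feeding a 512-unit MLP), while GRU(512, 512), the 512-to-1 critic head, and the 512-to-5 action logits are taken directly from the log above.

```python
import torch
import torch.nn as nn

class ActorCriticSharedWeights(nn.Module):
    """Sketch of the logged model: shared conv encoder + GRU core, with
    separate linear heads for the value and the 5-way action distribution."""
    def __init__(self, obs_shape=(3, 72, 128), num_actions=5, hidden=512):
        super().__init__()
        # assumed kernel/stride config; the log only confirms 3 Conv2d+ELU pairs
        self.conv_head = nn.Sequential(
            nn.Conv2d(obs_shape[0], 32, 8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, 3, stride=2), nn.ELU(),
        )
        # infer the flattened conv output size with a dummy forward pass
        with torch.no_grad():
            n_flat = self.conv_head(torch.zeros(1, *obs_shape)).flatten(1).shape[1]
        self.mlp_layers = nn.Sequential(nn.Linear(n_flat, hidden), nn.ELU())
        self.core = nn.GRU(hidden, hidden)                        # ModelCoreRNN
        self.critic_linear = nn.Linear(hidden, 1)                 # value head
        self.distribution_linear = nn.Linear(hidden, num_actions) # action logits

    def forward(self, obs, rnn_state=None):
        x = self.mlp_layers(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)  # seq length 1
        x = x.squeeze(0)
        return self.distribution_linear(x), self.critic_linear(x), rnn_state
```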
[2023-02-26 17:13:21,711][11467] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-26 17:13:21,712][11467] No checkpoints found
[2023-02-26 17:13:21,712][11467] Did not load from checkpoint, starting from scratch!
[2023-02-26 17:13:21,713][11467] Initialized policy 0 weights for model version 0
[2023-02-26 17:13:21,721][11467] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:13:21,728][11467] LearnerWorker_p0 finished initialization!
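"No checkpoints found" means the learner scanned the experiment's checkpoint directory before falling back to freshly initialized weights. A minimal sketch of that resume-or-scratch logic, assuming a file layout like the checkpoint_p0 paths that appear later in this log (the function name and state-dict keys are hypothetical):

```python
import glob
import os
import torch

def load_latest_checkpoint(checkpoint_dir, model, optimizer):
    """Resume from the newest checkpoint_*.pth if one exists, else start fresh."""
    paths = sorted(glob.glob(os.path.join(checkpoint_dir, "checkpoint_*.pth")))
    if not paths:
        # corresponds to "Did not load from checkpoint, starting from scratch!"
        return 0
    state = torch.load(paths[-1], map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["policy_version"]
```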
[2023-02-26 17:13:21,922][11480] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 17:13:21,923][11480] RunningMeanStd input shape: (1,)
[2023-02-26 17:13:21,938][11480] ConvEncoder: input_channels=3
[2023-02-26 17:13:22,038][11480] Conv encoder output size: 512
[2023-02-26 17:13:22,038][11480] Policy head output size: 512
[2023-02-26 17:13:22,564][00199] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
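The recurring "Fps is (10 sec: …, 60 sec: …, 300 sec: …)" lines report throughput over three trailing windows; nan just means a window does not yet contain two samples. One way to produce such numbers, sketched with a timestamped deque (an assumption for illustration, not Sample Factory's actual bookkeeping):

```python
import time
from collections import deque

class FpsTracker:
    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.samples = deque()  # (timestamp, total_env_frames) pairs

    def record(self, total_frames):
        now = time.time()
        self.samples.append((now, total_frames))
        # keep only as much history as the largest window needs
        while self.samples and now - self.samples[0][0] > max(self.windows):
            self.samples.popleft()

    def fps(self, window):
        now = time.time()
        recent = [(t, f) for t, f in self.samples if now - t <= window]
        if len(recent) < 2:
            return float("nan")  # matches the very first report in this log
        (t0, f0), (t1, f1) = recent[0], recent[-1]
        return (f1 - f0) / max(t1 - t0, 1e-6)
```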
[2023-02-26 17:13:22,922][00199] Heartbeat connected on Batcher_0
[2023-02-26 17:13:22,936][00199] Heartbeat connected on LearnerWorker_p0
[2023-02-26 17:13:22,948][00199] Heartbeat connected on RolloutWorker_w0
[2023-02-26 17:13:22,955][00199] Heartbeat connected on RolloutWorker_w1
[2023-02-26 17:13:22,960][00199] Heartbeat connected on RolloutWorker_w2
[2023-02-26 17:13:22,963][00199] Heartbeat connected on RolloutWorker_w3
[2023-02-26 17:13:22,968][00199] Heartbeat connected on RolloutWorker_w4
[2023-02-26 17:13:22,972][00199] Heartbeat connected on RolloutWorker_w5
[2023-02-26 17:13:22,976][00199] Heartbeat connected on RolloutWorker_w6
[2023-02-26 17:13:22,981][00199] Heartbeat connected on RolloutWorker_w7
[2023-02-26 17:13:24,382][00199] Inference worker 0-0 is ready!
[2023-02-26 17:13:24,384][00199] All inference workers are ready! Signal rollout workers to start!
[2023-02-26 17:13:24,386][00199] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-26 17:13:24,533][11488] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,534][11482] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,541][11485] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,542][11483] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,544][11486] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,558][11484] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,568][11489] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:13:24,581][11487] Doom resolution: 160x120, resize resolution: (128, 72)
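Each worker renders Doom at 160x120 and downsamples to 128x72 before the frame reaches the encoder's (3, 72, 128) input (the HWC-to-CHW transpose happens elsewhere in the pipeline). A hedged sketch of such a resize step as a classic gym observation wrapper, using OpenCV (the wrapper itself is an illustration, not the framework's code):

```python
import cv2
import gym
import numpy as np

class ResizeObservation(gym.ObservationWrapper):
    """Downsample raw Doom frames (e.g. 160x120) to the training resolution."""
    def __init__(self, env, width=128, height=72):
        super().__init__(env)
        self.width, self.height = width, height
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(height, width, 3), dtype=np.uint8)

    def observation(self, frame):
        # cv2.resize takes (width, height); INTER_AREA suits downsampling
        return cv2.resize(frame, (self.width, self.height),
                          interpolation=cv2.INTER_AREA)
```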
[2023-02-26 17:13:25,780][11484] Decorrelating experience for 0 frames...
[2023-02-26 17:13:25,782][11487] Decorrelating experience for 0 frames...
[2023-02-26 17:13:25,784][11486] Decorrelating experience for 0 frames...
[2023-02-26 17:13:26,073][11483] Decorrelating experience for 0 frames...
[2023-02-26 17:13:26,075][11488] Decorrelating experience for 0 frames...
[2023-02-26 17:13:26,071][11485] Decorrelating experience for 0 frames...
[2023-02-26 17:13:26,081][11482] Decorrelating experience for 0 frames...
[2023-02-26 17:13:26,278][11487] Decorrelating experience for 32 frames...
[2023-02-26 17:13:27,357][11488] Decorrelating experience for 32 frames...
[2023-02-26 17:13:27,383][11482] Decorrelating experience for 32 frames...
[2023-02-26 17:13:27,516][11485] Decorrelating experience for 32 frames...
[2023-02-26 17:13:27,540][11486] Decorrelating experience for 32 frames...
[2023-02-26 17:13:27,546][11484] Decorrelating experience for 32 frames...
[2023-02-26 17:13:27,564][00199] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 17:13:27,614][11489] Decorrelating experience for 0 frames...
[2023-02-26 17:13:27,792][11483] Decorrelating experience for 32 frames...
[2023-02-26 17:13:28,267][11487] Decorrelating experience for 64 frames...
[2023-02-26 17:13:29,427][11488] Decorrelating experience for 64 frames...
[2023-02-26 17:13:29,510][11482] Decorrelating experience for 64 frames...
[2023-02-26 17:13:29,800][11489] Decorrelating experience for 32 frames...
[2023-02-26 17:13:29,953][11483] Decorrelating experience for 64 frames...
[2023-02-26 17:13:30,126][11486] Decorrelating experience for 64 frames...
[2023-02-26 17:13:30,150][11484] Decorrelating experience for 64 frames...
[2023-02-26 17:13:30,472][11487] Decorrelating experience for 96 frames...
[2023-02-26 17:13:30,915][11485] Decorrelating experience for 64 frames...
[2023-02-26 17:13:31,587][11488] Decorrelating experience for 96 frames...
[2023-02-26 17:13:31,810][11483] Decorrelating experience for 96 frames...
[2023-02-26 17:13:32,036][11482] Decorrelating experience for 96 frames...
[2023-02-26 17:13:32,564][00199] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 17:13:32,594][11489] Decorrelating experience for 64 frames...
[2023-02-26 17:13:32,678][11486] Decorrelating experience for 96 frames...
[2023-02-26 17:13:32,711][11484] Decorrelating experience for 96 frames...
[2023-02-26 17:13:33,214][11485] Decorrelating experience for 96 frames...
[2023-02-26 17:13:33,781][11489] Decorrelating experience for 96 frames...
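The "Decorrelating experience for N frames" phase (0, 32, 64, 96 above) has the parallel environments burn a staggered number of frames before collection begins, so they do not all hit episode boundaries in lockstep. A toy sketch of the idea under the classic gym step API (the helper and its parameters are assumptions):

```python
import random

def decorrelate_experience(env, max_frames=96, block=32):
    """Burn a staggered number of random-action frames so parallel
    environments start experience collection out of phase."""
    env.reset()
    frames = random.randrange(0, max_frames + block, block)  # 0, 32, 64 or 96
    for _ in range(frames):
        obs, reward, done, info = env.step(env.action_space.sample())
        if done:
            env.reset()
```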
[2023-02-26 17:13:37,564][00199] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 31.9. Samples: 478. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 17:13:37,569][00199] Avg episode reward: [(0, '1.368')]
[2023-02-26 17:13:38,462][11467] Signal inference workers to stop experience collection...
[2023-02-26 17:13:38,485][11480] InferenceWorker_p0-w0: stopping experience collection
[2023-02-26 17:13:41,246][11467] Signal inference workers to resume experience collection...
[2023-02-26 17:13:41,247][11480] InferenceWorker_p0-w0: resuming experience collection
[2023-02-26 17:13:42,563][00199] Fps is (10 sec: 409.6, 60 sec: 204.8, 300 sec: 204.8). Total num frames: 4096. Throughput: 0: 141.7. Samples: 2834. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-26 17:13:42,570][00199] Avg episode reward: [(0, '2.409')]
[2023-02-26 17:13:47,567][00199] Fps is (10 sec: 2456.8, 60 sec: 982.9, 300 sec: 982.9). Total num frames: 24576. Throughput: 0: 234.3. Samples: 5858. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:13:47,575][00199] Avg episode reward: [(0, '3.569')]
[2023-02-26 17:13:52,574][00199] Fps is (10 sec: 2455.0, 60 sec: 955.4, 300 sec: 955.4). Total num frames: 28672. Throughput: 0: 263.5. Samples: 7908. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:13:52,580][00199] Avg episode reward: [(0, '3.694')]
[2023-02-26 17:13:56,125][11480] Updated weights for policy 0, policy_version 10 (0.0387)
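"Updated weights for policy 0, policy_version 10 (0.0387)" marks the inference worker pulling the learner's latest parameters after the version counter advanced; the parenthesized number is presumably the time the swap took (an assumption). Conceptually, in a sketch (the real transfer goes through shared memory rather than a dict):

```python
import time

def maybe_update_weights(inference_model, shared_state, local_version):
    """Copy learner weights into the inference model when the version advances."""
    latest = shared_state["policy_version"]
    if latest > local_version:
        t0 = time.time()
        inference_model.load_state_dict(shared_state["state_dict"])
        print(f"Updated weights for policy 0, policy_version {latest} "
              f"({time.time() - t0:.4f})")
    return latest
```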
[2023-02-26 17:13:57,564][00199] Fps is (10 sec: 2048.7, 60 sec: 1287.3, 300 sec: 1287.3). Total num frames: 45056. Throughput: 0: 340.3. Samples: 11910. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:13:57,566][00199] Avg episode reward: [(0, '4.140')]
[2023-02-26 17:14:02,564][00199] Fps is (10 sec: 3690.3, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 65536. Throughput: 0: 380.9. Samples: 15234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:14:02,572][00199] Avg episode reward: [(0, '4.519')]
[2023-02-26 17:14:05,747][11480] Updated weights for policy 0, policy_version 20 (0.0021)
[2023-02-26 17:14:07,564][00199] Fps is (10 sec: 4096.0, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 86016. Throughput: 0: 477.2. Samples: 21472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:14:07,571][00199] Avg episode reward: [(0, '4.398')]
[2023-02-26 17:14:12,564][00199] Fps is (10 sec: 3276.8, 60 sec: 1966.1, 300 sec: 1966.1). Total num frames: 98304. Throughput: 0: 562.4. Samples: 25310. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-26 17:14:12,567][00199] Avg episode reward: [(0, '4.325')]
[2023-02-26 17:14:17,564][00199] Fps is (10 sec: 2867.1, 60 sec: 2085.2, 300 sec: 2085.2). Total num frames: 114688. Throughput: 0: 607.0. Samples: 27314. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-26 17:14:17,566][00199] Avg episode reward: [(0, '4.240')]
[2023-02-26 17:14:17,576][11467] Saving new best policy, reward=4.240!
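The "Saving new best policy, reward=…" lines come from tracking the best average episode reward seen so far and snapshotting the model whenever it improves, which is why they fire repeatedly early in training as the reward climbs. A minimal sketch (the helper and the output path are hypothetical; only the logic is taken from the log):

```python
import torch

def save_if_best(model, avg_reward, best_so_far, path="best_policy.pth"):
    """Snapshot the model whenever the average episode reward improves."""
    if best_so_far is None or avg_reward > best_so_far:
        print(f"Saving new best policy, reward={avg_reward:.3f}!")
        torch.save({"model": model.state_dict(), "reward": avg_reward}, path)
        return avg_reward
    return best_so_far
```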
[2023-02-26 17:14:19,438][11480] Updated weights for policy 0, policy_version 30 (0.0014)
[2023-02-26 17:14:22,564][00199] Fps is (10 sec: 3686.4, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 135168. Throughput: 0: 727.9. Samples: 33232. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-26 17:14:22,566][00199] Avg episode reward: [(0, '4.228')]
[2023-02-26 17:14:27,571][00199] Fps is (10 sec: 3683.9, 60 sec: 2525.6, 300 sec: 2331.3). Total num frames: 151552. Throughput: 0: 800.3. Samples: 38852. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:14:27,573][00199] Avg episode reward: [(0, '4.374')]
[2023-02-26 17:14:27,587][11467] Saving new best policy, reward=4.374!
[2023-02-26 17:14:31,524][11480] Updated weights for policy 0, policy_version 40 (0.0032)
[2023-02-26 17:14:32,566][00199] Fps is (10 sec: 2866.5, 60 sec: 2730.6, 300 sec: 2340.5). Total num frames: 163840. Throughput: 0: 773.5. Samples: 40666. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:14:32,574][00199] Avg episode reward: [(0, '4.494')]
[2023-02-26 17:14:32,576][11467] Saving new best policy, reward=4.494!
[2023-02-26 17:14:37,564][00199] Fps is (10 sec: 2869.3, 60 sec: 3003.7, 300 sec: 2403.0). Total num frames: 180224. Throughput: 0: 813.7. Samples: 44516. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:14:37,566][00199] Avg episode reward: [(0, '4.517')]
[2023-02-26 17:14:37,581][11467] Saving new best policy, reward=4.517!
[2023-02-26 17:14:42,564][00199] Fps is (10 sec: 3687.3, 60 sec: 3276.8, 300 sec: 2508.8). Total num frames: 200704. Throughput: 0: 863.1. Samples: 50750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:14:42,566][00199] Avg episode reward: [(0, '4.479')]
[2023-02-26 17:14:43,480][11480] Updated weights for policy 0, policy_version 50 (0.0022)
[2023-02-26 17:14:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3208.7, 300 sec: 2554.0). Total num frames: 217088. Throughput: 0: 859.1. Samples: 53892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:14:47,572][00199] Avg episode reward: [(0, '4.686')]
[2023-02-26 17:14:47,592][11467] Saving new best policy, reward=4.686!
[2023-02-26 17:14:52,571][00199] Fps is (10 sec: 2865.2, 60 sec: 3345.3, 300 sec: 2548.4). Total num frames: 229376. Throughput: 0: 808.8. Samples: 57874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:14:52,574][00199] Avg episode reward: [(0, '4.658')]
[2023-02-26 17:14:57,035][11480] Updated weights for policy 0, policy_version 60 (0.0040)
[2023-02-26 17:14:57,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 2586.9). Total num frames: 245760. Throughput: 0: 825.4. Samples: 62454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:14:57,569][00199] Avg episode reward: [(0, '4.506')]
[2023-02-26 17:14:57,579][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000060_245760.pth...
[2023-02-26 17:15:02,563][00199] Fps is (10 sec: 3689.0, 60 sec: 3345.1, 300 sec: 2662.4). Total num frames: 266240. Throughput: 0: 852.3. Samples: 65666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:15:02,572][00199] Avg episode reward: [(0, '4.595')]
[2023-02-26 17:15:06,837][11480] Updated weights for policy 0, policy_version 70 (0.0012)
[2023-02-26 17:15:07,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 2730.7). Total num frames: 286720. Throughput: 0: 860.4. Samples: 71950. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:15:07,569][00199] Avg episode reward: [(0, '4.685')]
[2023-02-26 17:15:12,566][00199] Fps is (10 sec: 3275.9, 60 sec: 3344.9, 300 sec: 2718.2). Total num frames: 299008. Throughput: 0: 824.3. Samples: 75942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:15:12,572][00199] Avg episode reward: [(0, '4.497')]
[2023-02-26 17:15:17,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 2742.5). Total num frames: 315392. Throughput: 0: 829.5. Samples: 77990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:15:17,566][00199] Avg episode reward: [(0, '4.561')]
[2023-02-26 17:15:19,874][11480] Updated weights for policy 0, policy_version 80 (0.0014)
[2023-02-26 17:15:22,564][00199] Fps is (10 sec: 3687.3, 60 sec: 3345.1, 300 sec: 2798.9). Total num frames: 335872. Throughput: 0: 882.4. Samples: 84222. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:15:22,566][00199] Avg episode reward: [(0, '4.478')]
[2023-02-26 17:15:27,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3413.7, 300 sec: 2850.8). Total num frames: 356352. Throughput: 0: 871.7. Samples: 89978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:15:27,566][00199] Avg episode reward: [(0, '4.361')]
[2023-02-26 17:15:31,533][11480] Updated weights for policy 0, policy_version 90 (0.0019)
[2023-02-26 17:15:32,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 2835.7). Total num frames: 368640. Throughput: 0: 846.0. Samples: 91962. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:15:32,571][00199] Avg episode reward: [(0, '4.341')]
[2023-02-26 17:15:37,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 2852.0). Total num frames: 385024. Throughput: 0: 850.1. Samples: 96124. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:15:37,566][00199] Avg episode reward: [(0, '4.396')]
[2023-02-26 17:15:42,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2896.5). Total num frames: 405504. Throughput: 0: 883.6. Samples: 102218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:15:42,572][00199] Avg episode reward: [(0, '4.649')]
[2023-02-26 17:15:43,185][11480] Updated weights for policy 0, policy_version 100 (0.0027)
[2023-02-26 17:15:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2909.6). Total num frames: 421888. Throughput: 0: 879.6. Samples: 105250. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:15:47,568][00199] Avg episode reward: [(0, '4.670')]
[2023-02-26 17:15:52,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.7, 300 sec: 2894.5). Total num frames: 434176. Throughput: 0: 827.7. Samples: 109198. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:15:52,570][00199] Avg episode reward: [(0, '4.551')]
[2023-02-26 17:15:56,914][11480] Updated weights for policy 0, policy_version 110 (0.0027)
[2023-02-26 17:15:57,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 2906.8). Total num frames: 450560. Throughput: 0: 842.7. Samples: 113862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:15:57,566][00199] Avg episode reward: [(0, '4.452')]
[2023-02-26 17:16:02,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2969.6). Total num frames: 475136. Throughput: 0: 871.4. Samples: 117204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:16:02,566][00199] Avg episode reward: [(0, '4.423')]
[2023-02-26 17:16:06,253][11480] Updated weights for policy 0, policy_version 120 (0.0022)
[2023-02-26 17:16:07,566][00199] Fps is (10 sec: 4094.9, 60 sec: 3413.2, 300 sec: 2978.9). Total num frames: 491520. Throughput: 0: 874.7. Samples: 123586. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:16:07,571][00199] Avg episode reward: [(0, '4.333')]
[2023-02-26 17:16:12,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3481.8, 300 sec: 2987.7). Total num frames: 507904. Throughput: 0: 834.2. Samples: 127518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:16:12,573][00199] Avg episode reward: [(0, '4.428')]
[2023-02-26 17:16:17,564][00199] Fps is (10 sec: 3277.7, 60 sec: 3481.6, 300 sec: 2995.9). Total num frames: 524288. Throughput: 0: 834.1. Samples: 129496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:16:17,570][00199] Avg episode reward: [(0, '4.437')]
[2023-02-26 17:16:19,479][11480] Updated weights for policy 0, policy_version 130 (0.0019)
[2023-02-26 17:16:22,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3026.5). Total num frames: 544768. Throughput: 0: 882.3. Samples: 135828. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:16:22,566][00199] Avg episode reward: [(0, '4.486')]
[2023-02-26 17:16:27,564][00199] Fps is (10 sec: 3686.2, 60 sec: 3413.3, 300 sec: 3033.2). Total num frames: 561152. Throughput: 0: 874.2. Samples: 141558. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:16:27,567][00199] Avg episode reward: [(0, '4.454')]
[2023-02-26 17:16:31,012][11480] Updated weights for policy 0, policy_version 140 (0.0032)
[2023-02-26 17:16:32,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3018.1). Total num frames: 573440. Throughput: 0: 850.8. Samples: 143534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:16:32,571][00199] Avg episode reward: [(0, '4.484')]
[2023-02-26 17:16:37,564][00199] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3024.7). Total num frames: 589824. Throughput: 0: 850.4. Samples: 147466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:16:37,570][00199] Avg episode reward: [(0, '4.426')]
[2023-02-26 17:16:42,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3051.5). Total num frames: 610304. Throughput: 0: 877.5. Samples: 153350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:16:42,571][00199] Avg episode reward: [(0, '4.328')]
[2023-02-26 17:16:43,385][11480] Updated weights for policy 0, policy_version 150 (0.0017)
[2023-02-26 17:16:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3057.0). Total num frames: 626688. Throughput: 0: 867.6. Samples: 156248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:16:47,569][00199] Avg episode reward: [(0, '4.427')]
[2023-02-26 17:16:52,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3042.7). Total num frames: 638976. Throughput: 0: 813.7. Samples: 160198. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:16:52,565][00199] Avg episode reward: [(0, '4.391')]
[2023-02-26 17:16:57,173][11480] Updated weights for policy 0, policy_version 160 (0.0036)
[2023-02-26 17:16:57,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3048.2). Total num frames: 655360. Throughput: 0: 823.9. Samples: 164594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:16:57,571][00199] Avg episode reward: [(0, '4.475')]
[2023-02-26 17:16:57,583][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000160_655360.pth...
[2023-02-26 17:17:02,565][00199] Fps is (10 sec: 3276.2, 60 sec: 3276.7, 300 sec: 3053.4). Total num frames: 671744. Throughput: 0: 844.6. Samples: 167506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:17:02,568][00199] Avg episode reward: [(0, '4.471')]
[2023-02-26 17:17:07,567][00199] Fps is (10 sec: 3685.1, 60 sec: 3345.0, 300 sec: 3076.5). Total num frames: 692224. Throughput: 0: 829.8. Samples: 173174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:17:07,570][00199] Avg episode reward: [(0, '4.563')]
[2023-02-26 17:17:08,886][11480] Updated weights for policy 0, policy_version 170 (0.0023)
[2023-02-26 17:17:12,564][00199] Fps is (10 sec: 3277.4, 60 sec: 3276.8, 300 sec: 3063.1). Total num frames: 704512. Throughput: 0: 784.5. Samples: 176860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:17:12,569][00199] Avg episode reward: [(0, '4.578')]
[2023-02-26 17:17:17,564][00199] Fps is (10 sec: 2458.5, 60 sec: 3208.5, 300 sec: 3050.2). Total num frames: 716800. Throughput: 0: 781.4. Samples: 178698. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:17:17,571][00199] Avg episode reward: [(0, '4.638')]
[2023-02-26 17:17:21,863][11480] Updated weights for policy 0, policy_version 180 (0.0034)
[2023-02-26 17:17:22,564][00199] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3072.0). Total num frames: 737280. Throughput: 0: 823.1. Samples: 184506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:17:22,567][00199] Avg episode reward: [(0, '4.566')]
[2023-02-26 17:17:27,565][00199] Fps is (10 sec: 3685.9, 60 sec: 3208.5, 300 sec: 3076.2). Total num frames: 753664. Throughput: 0: 801.7. Samples: 189428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:17:27,572][00199] Avg episode reward: [(0, '4.530')]
[2023-02-26 17:17:32,564][00199] Fps is (10 sec: 2457.7, 60 sec: 3140.3, 300 sec: 3047.4). Total num frames: 761856. Throughput: 0: 768.5. Samples: 190830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:17:32,571][00199] Avg episode reward: [(0, '4.470')]
[2023-02-26 17:17:37,564][00199] Fps is (10 sec: 2048.2, 60 sec: 3072.0, 300 sec: 3035.9). Total num frames: 774144. Throughput: 0: 743.9. Samples: 193674. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:17:37,568][00199] Avg episode reward: [(0, '4.592')]
[2023-02-26 17:17:39,102][11480] Updated weights for policy 0, policy_version 190 (0.0028)
[2023-02-26 17:17:42,564][00199] Fps is (10 sec: 2457.6, 60 sec: 2935.5, 300 sec: 3024.7). Total num frames: 786432. Throughput: 0: 733.0. Samples: 197578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:17:42,566][00199] Avg episode reward: [(0, '4.565')]
[2023-02-26 17:17:47,564][00199] Fps is (10 sec: 3277.0, 60 sec: 3003.7, 300 sec: 3045.0). Total num frames: 806912. Throughput: 0: 733.2. Samples: 200500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:17:47,566][00199] Avg episode reward: [(0, '4.386')]
[2023-02-26 17:17:50,308][11480] Updated weights for policy 0, policy_version 200 (0.0016)
[2023-02-26 17:17:52,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3072.0, 300 sec: 3049.2). Total num frames: 823296. Throughput: 0: 737.0. Samples: 206338. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:17:52,571][00199] Avg episode reward: [(0, '4.328')]
[2023-02-26 17:17:57,571][00199] Fps is (10 sec: 2865.1, 60 sec: 3003.4, 300 sec: 3038.4). Total num frames: 835584. Throughput: 0: 736.5. Samples: 210010. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:17:57,574][00199] Avg episode reward: [(0, '4.458')]
[2023-02-26 17:18:02,566][00199] Fps is (10 sec: 2866.5, 60 sec: 3003.7, 300 sec: 3042.7). Total num frames: 851968. Throughput: 0: 737.4. Samples: 211882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:18:02,574][00199] Avg episode reward: [(0, '4.419')]
[2023-02-26 17:18:04,213][11480] Updated weights for policy 0, policy_version 210 (0.0029)
[2023-02-26 17:18:07,564][00199] Fps is (10 sec: 3689.1, 60 sec: 3003.9, 300 sec: 3061.2). Total num frames: 872448. Throughput: 0: 739.0. Samples: 217760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:18:07,568][00199] Avg episode reward: [(0, '4.296')]
[2023-02-26 17:18:12,564][00199] Fps is (10 sec: 3687.3, 60 sec: 3072.0, 300 sec: 3064.9). Total num frames: 888832. Throughput: 0: 748.5. Samples: 223110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:18:12,569][00199] Avg episode reward: [(0, '4.374')]
[2023-02-26 17:18:16,810][11480] Updated weights for policy 0, policy_version 220 (0.0028)
[2023-02-26 17:18:17,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3054.6). Total num frames: 901120. Throughput: 0: 757.6. Samples: 224924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:18:17,572][00199] Avg episode reward: [(0, '4.390')]
[2023-02-26 17:18:22,564][00199] Fps is (10 sec: 2457.6, 60 sec: 2935.5, 300 sec: 3096.3). Total num frames: 913408. Throughput: 0: 779.6. Samples: 228754. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:18:22,572][00199] Avg episode reward: [(0, '4.407')]
[2023-02-26 17:18:27,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3003.8, 300 sec: 3165.7). Total num frames: 933888. Throughput: 0: 825.2. Samples: 234712. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:18:27,570][00199] Avg episode reward: [(0, '4.298')]
[2023-02-26 17:18:28,772][11480] Updated weights for policy 0, policy_version 230 (0.0021)
[2023-02-26 17:18:32,564][00199] Fps is (10 sec: 4095.7, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 954368. Throughput: 0: 826.4. Samples: 237690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:18:32,576][00199] Avg episode reward: [(0, '4.312')]
[2023-02-26 17:18:37,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3262.9). Total num frames: 966656. Throughput: 0: 786.1. Samples: 241714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:18:37,570][00199] Avg episode reward: [(0, '4.355')]
[2023-02-26 17:18:42,509][11480] Updated weights for policy 0, policy_version 240 (0.0034)
[2023-02-26 17:18:42,564][00199] Fps is (10 sec: 2867.4, 60 sec: 3276.8, 300 sec: 3249.1). Total num frames: 983040. Throughput: 0: 803.2. Samples: 246150. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:18:42,575][00199] Avg episode reward: [(0, '4.325')]
[2023-02-26 17:18:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.7). Total num frames: 1003520. Throughput: 0: 828.0. Samples: 249140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:18:47,568][00199] Avg episode reward: [(0, '4.423')]
[2023-02-26 17:18:52,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 1019904. Throughput: 0: 828.3. Samples: 255034. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:18:52,568][00199] Avg episode reward: [(0, '4.471')]
[2023-02-26 17:18:53,673][11480] Updated weights for policy 0, policy_version 250 (0.0025)
[2023-02-26 17:18:57,567][00199] Fps is (10 sec: 2866.2, 60 sec: 3277.0, 300 sec: 3276.8). Total num frames: 1032192. Throughput: 0: 791.8. Samples: 258742. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:18:57,572][00199] Avg episode reward: [(0, '4.509')]
[2023-02-26 17:18:57,581][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000252_1032192.pth...
[2023-02-26 17:18:57,711][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000060_245760.pth
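The Saving/Removing pairs above implement simple checkpoint rotation: periodically write a new checkpoint_<version>_<frames>.pth, then delete the oldest so only the last few survive (here the 252-version checkpoint replaces the 60-version one). A sketch of that rotation, with the keep-count as an assumption; the filename pattern matches the paths in this log:

```python
import glob
import os
import torch

def save_checkpoint(model, optimizer, version, frames, ckpt_dir, keep=2):
    """Write a new checkpoint and prune all but the newest `keep` files."""
    path = os.path.join(ckpt_dir, f"checkpoint_{version:09d}_{frames}.pth")
    print(f"Saving {path}...")
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "policy_version": version}, path)
    stale = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))[:-keep]
    for old in stale:
        print(f"Removing {old}")
        os.remove(old)
```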
[2023-02-26 17:19:02,564][00199] Fps is (10 sec: 2457.6, 60 sec: 3208.7, 300 sec: 3249.0). Total num frames: 1044480. Throughput: 0: 792.0. Samples: 260564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:19:02,566][00199] Avg episode reward: [(0, '4.408')]
[2023-02-26 17:19:06,990][11480] Updated weights for policy 0, policy_version 260 (0.0021)
[2023-02-26 17:19:07,564][00199] Fps is (10 sec: 3277.9, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 1064960. Throughput: 0: 831.9. Samples: 266190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:19:07,566][00199] Avg episode reward: [(0, '4.626')]
[2023-02-26 17:19:12,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 1081344. Throughput: 0: 821.7. Samples: 271688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:19:12,569][00199] Avg episode reward: [(0, '4.721')]
[2023-02-26 17:19:12,571][11467] Saving new best policy, reward=4.721!
[2023-02-26 17:19:17,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1093632. Throughput: 0: 795.2. Samples: 273474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:19:17,568][00199] Avg episode reward: [(0, '4.554')]
[2023-02-26 17:19:20,991][11480] Updated weights for policy 0, policy_version 270 (0.0034)
[2023-02-26 17:19:22,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.1). Total num frames: 1110016. Throughput: 0: 789.2. Samples: 277226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:19:22,571][00199] Avg episode reward: [(0, '4.544')]
[2023-02-26 17:19:27,564][00199] Fps is (10 sec: 3686.3, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 1130496. Throughput: 0: 822.9. Samples: 283180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:19:27,567][00199] Avg episode reward: [(0, '4.589')]
[2023-02-26 17:19:31,665][11480] Updated weights for policy 0, policy_version 280 (0.0014)
[2023-02-26 17:19:32,565][00199] Fps is (10 sec: 3685.8, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 1146880. Throughput: 0: 824.6. Samples: 286248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:19:32,568][00199] Avg episode reward: [(0, '4.733')]
[2023-02-26 17:19:32,569][11467] Saving new best policy, reward=4.733!
[2023-02-26 17:19:37,571][00199] Fps is (10 sec: 2865.2, 60 sec: 3208.2, 300 sec: 3249.0). Total num frames: 1159168. Throughput: 0: 781.9. Samples: 290226. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-26 17:19:37,584][00199] Avg episode reward: [(0, '4.635')]
[2023-02-26 17:19:42,564][00199] Fps is (10 sec: 2867.6, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1175552. Throughput: 0: 790.9. Samples: 294328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:19:42,566][00199] Avg episode reward: [(0, '4.748')]
[2023-02-26 17:19:42,571][11467] Saving new best policy, reward=4.748!
[2023-02-26 17:19:45,651][11480] Updated weights for policy 0, policy_version 290 (0.0022)
[2023-02-26 17:19:47,564][00199] Fps is (10 sec: 3279.1, 60 sec: 3140.3, 300 sec: 3263.0). Total num frames: 1191936. Throughput: 0: 815.4. Samples: 297258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:19:47,566][00199] Avg episode reward: [(0, '4.733')]
[2023-02-26 17:19:52,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 1212416. Throughput: 0: 820.0. Samples: 303090. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:19:52,570][00199] Avg episode reward: [(0, '4.763')]
[2023-02-26 17:19:52,572][11467] Saving new best policy, reward=4.763!
[2023-02-26 17:19:57,571][00199] Fps is (10 sec: 3274.4, 60 sec: 3208.3, 300 sec: 3248.9). Total num frames: 1224704. Throughput: 0: 778.7. Samples: 306736. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:19:57,574][00199] Avg episode reward: [(0, '4.758')]
[2023-02-26 17:19:59,042][11480] Updated weights for policy 0, policy_version 300 (0.0037)
[2023-02-26 17:20:02,565][00199] Fps is (10 sec: 2457.3, 60 sec: 3208.5, 300 sec: 3221.2). Total num frames: 1236992. Throughput: 0: 781.5. Samples: 308642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:02,577][00199] Avg episode reward: [(0, '4.852')]
[2023-02-26 17:20:02,582][11467] Saving new best policy, reward=4.852!
[2023-02-26 17:20:07,564][00199] Fps is (10 sec: 3279.2, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 1257472. Throughput: 0: 819.6. Samples: 314106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:07,566][00199] Avg episode reward: [(0, '4.701')]
[2023-02-26 17:20:10,285][11480] Updated weights for policy 0, policy_version 310 (0.0016)
[2023-02-26 17:20:12,564][00199] Fps is (10 sec: 3686.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1273856. Throughput: 0: 809.8. Samples: 319620. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:20:12,569][00199] Avg episode reward: [(0, '4.640')]
[2023-02-26 17:20:17,569][00199] Fps is (10 sec: 2865.7, 60 sec: 3208.2, 300 sec: 3221.2). Total num frames: 1286144. Throughput: 0: 783.8. Samples: 321522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:20:17,572][00199] Avg episode reward: [(0, '4.699')]
[2023-02-26 17:20:22,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3207.4). Total num frames: 1302528. Throughput: 0: 778.7. Samples: 325262. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:20:22,566][00199] Avg episode reward: [(0, '4.760')]
[2023-02-26 17:20:24,575][11480] Updated weights for policy 0, policy_version 320 (0.0023)
[2023-02-26 17:20:27,564][00199] Fps is (10 sec: 3278.6, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 1318912. Throughput: 0: 818.6. Samples: 331164. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:20:27,566][00199] Avg episode reward: [(0, '4.748')]
[2023-02-26 17:20:32,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3208.6, 300 sec: 3235.1). Total num frames: 1339392. Throughput: 0: 821.7. Samples: 334234. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:32,566][00199] Avg episode reward: [(0, '4.627')]
[2023-02-26 17:20:35,950][11480] Updated weights for policy 0, policy_version 330 (0.0014)
[2023-02-26 17:20:37,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3208.9, 300 sec: 3207.4). Total num frames: 1351680. Throughput: 0: 791.6. Samples: 338710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:37,569][00199] Avg episode reward: [(0, '4.565')]
[2023-02-26 17:20:42,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3207.4). Total num frames: 1368064. Throughput: 0: 807.1. Samples: 343050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:42,565][00199] Avg episode reward: [(0, '4.594')]
[2023-02-26 17:20:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 1388544. Throughput: 0: 837.0. Samples: 346306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:47,565][00199] Avg episode reward: [(0, '4.614')]
[2023-02-26 17:20:47,701][11480] Updated weights for policy 0, policy_version 340 (0.0020)
[2023-02-26 17:20:52,565][00199] Fps is (10 sec: 4095.5, 60 sec: 3276.7, 300 sec: 3249.0). Total num frames: 1409024. Throughput: 0: 858.6. Samples: 352744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:20:52,568][00199] Avg episode reward: [(0, '4.504')]
[2023-02-26 17:20:57,564][00199] Fps is (10 sec: 3276.7, 60 sec: 3277.2, 300 sec: 3207.4). Total num frames: 1421312. Throughput: 0: 828.2. Samples: 356888. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:20:57,570][00199] Avg episode reward: [(0, '4.380')]
[2023-02-26 17:20:57,586][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000348_1425408.pth...
[2023-02-26 17:20:57,731][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000160_655360.pth
[2023-02-26 17:21:00,587][11480] Updated weights for policy 0, policy_version 350 (0.0047)
[2023-02-26 17:21:02,564][00199] Fps is (10 sec: 2867.6, 60 sec: 3345.1, 300 sec: 3207.4). Total num frames: 1437696. Throughput: 0: 829.2. Samples: 358830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:21:02,570][00199] Avg episode reward: [(0, '4.670')]
[2023-02-26 17:21:07,564][00199] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 1458176. Throughput: 0: 876.7. Samples: 364712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:21:07,571][00199] Avg episode reward: [(0, '4.830')]
[2023-02-26 17:21:10,468][11480] Updated weights for policy 0, policy_version 360 (0.0012)
[2023-02-26 17:21:12,563][00199] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 1478656. Throughput: 0: 886.9. Samples: 371076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:21:12,566][00199] Avg episode reward: [(0, '4.702')]
[2023-02-26 17:21:17,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3481.9, 300 sec: 3221.3). Total num frames: 1495040. Throughput: 0: 864.0. Samples: 373116. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:21:17,566][00199] Avg episode reward: [(0, '4.915')]
[2023-02-26 17:21:17,581][11467] Saving new best policy, reward=4.915!
[2023-02-26 17:21:22,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3207.4). Total num frames: 1507328. Throughput: 0: 853.4. Samples: 377114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:21:22,571][00199] Avg episode reward: [(0, '5.018')]
[2023-02-26 17:21:22,574][11467] Saving new best policy, reward=5.018!
[2023-02-26 17:21:23,896][11480] Updated weights for policy 0, policy_version 370 (0.0014)
[2023-02-26 17:21:27,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3235.1). Total num frames: 1527808. Throughput: 0: 890.5. Samples: 383124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:21:27,570][00199] Avg episode reward: [(0, '4.911')]
[2023-02-26 17:21:32,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3249.0). Total num frames: 1548288. Throughput: 0: 885.8. Samples: 386168. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:21:32,568][00199] Avg episode reward: [(0, '5.095')]
[2023-02-26 17:21:32,573][11467] Saving new best policy, reward=5.095!
[2023-02-26 17:21:34,950][11480] Updated weights for policy 0, policy_version 380 (0.0015)
[2023-02-26 17:21:37,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3221.3). Total num frames: 1560576. Throughput: 0: 843.2. Samples: 390688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:21:37,575][00199] Avg episode reward: [(0, '5.097')]
[2023-02-26 17:21:37,593][11467] Saving new best policy, reward=5.097!
[2023-02-26 17:21:42,564][00199] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3221.3). Total num frames: 1576960. Throughput: 0: 842.1. Samples: 394782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:21:42,566][00199] Avg episode reward: [(0, '5.305')]
[2023-02-26 17:21:42,576][11467] Saving new best policy, reward=5.305!
[2023-02-26 17:21:47,173][11480] Updated weights for policy 0, policy_version 390 (0.0012)
[2023-02-26 17:21:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3249.0). Total num frames: 1597440. Throughput: 0: 870.6. Samples: 398008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:21:47,570][00199] Avg episode reward: [(0, '5.069')]
[2023-02-26 17:21:52,564][00199] Fps is (10 sec: 4096.1, 60 sec: 3481.7, 300 sec: 3262.9). Total num frames: 1617920. Throughput: 0: 884.8. Samples: 404528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:21:52,567][00199] Avg episode reward: [(0, '5.358')]
[2023-02-26 17:21:52,572][11467] Saving new best policy, reward=5.358!
[2023-02-26 17:21:57,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3249.0). Total num frames: 1630208. Throughput: 0: 840.7. Samples: 408906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:21:57,567][00199] Avg episode reward: [(0, '5.278')]
[2023-02-26 17:21:59,366][11480] Updated weights for policy 0, policy_version 400 (0.0018)
[2023-02-26 17:22:02,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3235.2). Total num frames: 1646592. Throughput: 0: 839.7. Samples: 410904. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:22:02,566][00199] Avg episode reward: [(0, '5.367')]
[2023-02-26 17:22:02,573][11467] Saving new best policy, reward=5.367!
[2023-02-26 17:22:07,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3262.9). Total num frames: 1667072. Throughput: 0: 878.0. Samples: 416626. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:22:07,565][00199] Avg episode reward: [(0, '5.468')]
[2023-02-26 17:22:07,579][11467] Saving new best policy, reward=5.468!
[2023-02-26 17:22:09,910][11480] Updated weights for policy 0, policy_version 410 (0.0012)
[2023-02-26 17:22:12,564][00199] Fps is (10 sec: 4095.8, 60 sec: 3481.6, 300 sec: 3290.7). Total num frames: 1687552. Throughput: 0: 886.0. Samples: 422996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:22:12,568][00199] Avg episode reward: [(0, '5.375')]
[2023-02-26 17:22:17,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3276.8). Total num frames: 1703936. Throughput: 0: 865.4. Samples: 425112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:22:17,570][00199] Avg episode reward: [(0, '5.231')]
[2023-02-26 17:22:22,563][00199] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3262.9). Total num frames: 1716224. Throughput: 0: 855.2. Samples: 429170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:22:22,572][00199] Avg episode reward: [(0, '5.382')]
[2023-02-26 17:22:23,027][11480] Updated weights for policy 0, policy_version 420 (0.0012)
[2023-02-26 17:22:27,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3304.6). Total num frames: 1736704. Throughput: 0: 904.9. Samples: 435504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:22:27,570][00199] Avg episode reward: [(0, '5.358')]
[2023-02-26 17:22:32,385][11480] Updated weights for policy 0, policy_version 430 (0.0024)
[2023-02-26 17:22:32,563][00199] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3346.2). Total num frames: 1761280. Throughput: 0: 905.6. Samples: 438760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:22:32,567][00199] Avg episode reward: [(0, '5.430')]
[2023-02-26 17:22:37,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3346.2). Total num frames: 1773568. Throughput: 0: 866.4. Samples: 443516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:22:37,570][00199] Avg episode reward: [(0, '5.744')]
[2023-02-26 17:22:37,587][11467] Saving new best policy, reward=5.744!
[2023-02-26 17:22:42,567][00199] Fps is (10 sec: 2456.8, 60 sec: 3481.4, 300 sec: 3318.4). Total num frames: 1785856. Throughput: 0: 864.2. Samples: 447796. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-26 17:22:42,569][00199] Avg episode reward: [(0, '5.719')]
[2023-02-26 17:22:45,627][11480] Updated weights for policy 0, policy_version 440 (0.0019)
[2023-02-26 17:22:47,564][00199] Fps is (10 sec: 3276.6, 60 sec: 3481.6, 300 sec: 3332.3). Total num frames: 1806336. Throughput: 0: 889.2. Samples: 450918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:22:47,570][00199] Avg episode reward: [(0, '5.696')]
[2023-02-26 17:22:52,570][00199] Fps is (10 sec: 4504.3, 60 sec: 3549.5, 300 sec: 3374.0). Total num frames: 1830912. Throughput: 0: 899.1. Samples: 457092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:22:52,572][00199] Avg episode reward: [(0, '5.665')]
[2023-02-26 17:22:57,512][11480] Updated weights for policy 0, policy_version 450 (0.0023)
[2023-02-26 17:22:57,568][00199] Fps is (10 sec: 3685.0, 60 sec: 3549.6, 300 sec: 3360.1). Total num frames: 1843200. Throughput: 0: 850.8. Samples: 461286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:22:57,573][00199] Avg episode reward: [(0, '5.944')]
[2023-02-26 17:22:57,585][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000450_1843200.pth...
[2023-02-26 17:22:57,723][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000252_1032192.pth
[2023-02-26 17:22:57,741][11467] Saving new best policy, reward=5.944!
[2023-02-26 17:23:02,564][00199] Fps is (10 sec: 2459.2, 60 sec: 3481.6, 300 sec: 3332.3). Total num frames: 1855488. Throughput: 0: 842.5. Samples: 463024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:23:02,567][00199] Avg episode reward: [(0, '6.265')]
[2023-02-26 17:23:02,571][11467] Saving new best policy, reward=6.265!
[2023-02-26 17:23:07,564][00199] Fps is (10 sec: 2868.4, 60 sec: 3413.3, 300 sec: 3332.3). Total num frames: 1871872. Throughput: 0: 867.4. Samples: 468204. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:23:07,566][00199] Avg episode reward: [(0, '6.313')]
[2023-02-26 17:23:07,583][11467] Saving new best policy, reward=6.313!
[2023-02-26 17:23:09,860][11480] Updated weights for policy 0, policy_version 460 (0.0026)
[2023-02-26 17:23:12,569][00199] Fps is (10 sec: 3684.3, 60 sec: 3413.0, 300 sec: 3360.0). Total num frames: 1892352. Throughput: 0: 856.5. Samples: 474050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:23:12,577][00199] Avg episode reward: [(0, '6.675')]
[2023-02-26 17:23:12,580][11467] Saving new best policy, reward=6.675!
[2023-02-26 17:23:17,564][00199] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 1904640. Throughput: 0: 824.8. Samples: 475876. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:23:17,574][00199] Avg episode reward: [(0, '6.681')]
[2023-02-26 17:23:17,588][11467] Saving new best policy, reward=6.681!
[2023-02-26 17:23:22,563][00199] Fps is (10 sec: 2868.8, 60 sec: 3413.3, 300 sec: 3346.2). Total num frames: 1921024. Throughput: 0: 808.2. Samples: 479884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:23:22,566][00199] Avg episode reward: [(0, '6.745')]
[2023-02-26 17:23:22,573][11467] Saving new best policy, reward=6.745!
[2023-02-26 17:23:23,536][11480] Updated weights for policy 0, policy_version 470 (0.0024)
[2023-02-26 17:23:27,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3346.2). Total num frames: 1941504. Throughput: 0: 848.0. Samples: 485952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:23:27,565][00199] Avg episode reward: [(0, '6.748')]
[2023-02-26 17:23:27,577][11467] Saving new best policy, reward=6.748!
[2023-02-26 17:23:32,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 1961984. Throughput: 0: 849.8. Samples: 489160. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:23:32,569][00199] Avg episode reward: [(0, '7.432')]
[2023-02-26 17:23:32,576][11467] Saving new best policy, reward=7.432!
[2023-02-26 17:23:33,746][11480] Updated weights for policy 0, policy_version 480 (0.0013)
[2023-02-26 17:23:37,565][00199] Fps is (10 sec: 3276.3, 60 sec: 3345.0, 300 sec: 3360.1). Total num frames: 1974272. Throughput: 0: 819.6. Samples: 493972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:23:37,567][00199] Avg episode reward: [(0, '8.123')]
[2023-02-26 17:23:37,581][11467] Saving new best policy, reward=8.123!
[2023-02-26 17:23:42,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3413.5, 300 sec: 3346.2). Total num frames: 1990656. Throughput: 0: 818.0. Samples: 498092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:23:42,566][00199] Avg episode reward: [(0, '8.152')]
[2023-02-26 17:23:42,569][11467] Saving new best policy, reward=8.152!
[2023-02-26 17:23:46,347][11480] Updated weights for policy 0, policy_version 490 (0.0012)
[2023-02-26 17:23:47,564][00199] Fps is (10 sec: 3686.8, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 2011136. Throughput: 0: 847.6. Samples: 501168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:23:47,567][00199] Avg episode reward: [(0, '7.859')]
[2023-02-26 17:23:52,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3345.4, 300 sec: 3387.9). Total num frames: 2031616. Throughput: 0: 876.3. Samples: 507638. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:23:52,569][00199] Avg episode reward: [(0, '6.811')]
[2023-02-26 17:23:57,564][00199] Fps is (10 sec: 3276.9, 60 sec: 3345.3, 300 sec: 3387.9). Total num frames: 2043904. Throughput: 0: 843.1. Samples: 511986. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 17:23:57,570][00199] Avg episode reward: [(0, '7.537')]
[2023-02-26 17:23:58,548][11480] Updated weights for policy 0, policy_version 500 (0.0015)
[2023-02-26 17:24:02,564][00199] Fps is (10 sec: 2457.5, 60 sec: 3345.0, 300 sec: 3360.1). Total num frames: 2056192. Throughput: 0: 839.1. Samples: 513638. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:24:02,567][00199] Avg episode reward: [(0, '7.977')]
[2023-02-26 17:24:07,569][00199] Fps is (10 sec: 2456.3, 60 sec: 3276.5, 300 sec: 3346.2). Total num frames: 2068480. Throughput: 0: 827.9. Samples: 517144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:24:07,573][00199] Avg episode reward: [(0, '8.071')]
[2023-02-26 17:24:12,564][00199] Fps is (10 sec: 2867.3, 60 sec: 3208.8, 300 sec: 3360.1). Total num frames: 2084864. Throughput: 0: 807.7. Samples: 522300. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:24:12,570][00199] Avg episode reward: [(0, '8.508')]
[2023-02-26 17:24:12,573][11467] Saving new best policy, reward=8.508!
[2023-02-26 17:24:12,967][11480] Updated weights for policy 0, policy_version 510 (0.0047)
[2023-02-26 17:24:17,565][00199] Fps is (10 sec: 3278.6, 60 sec: 3276.8, 300 sec: 3360.1). Total num frames: 2101248. Throughput: 0: 794.8. Samples: 524926. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:24:17,568][00199] Avg episode reward: [(0, '8.184')]
[2023-02-26 17:24:22,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3332.3). Total num frames: 2113536. Throughput: 0: 777.0. Samples: 528934. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:24:22,568][00199] Avg episode reward: [(0, '8.401')]
[2023-02-26 17:24:26,361][11480] Updated weights for policy 0, policy_version 520 (0.0022)
[2023-02-26 17:24:27,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3346.2). Total num frames: 2134016. Throughput: 0: 799.7. Samples: 534078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:24:27,571][00199] Avg episode reward: [(0, '9.069')]
[2023-02-26 17:24:27,582][11467] Saving new best policy, reward=9.069!
[2023-02-26 17:24:32,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3374.1). Total num frames: 2154496. Throughput: 0: 798.6. Samples: 537104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:24:32,569][00199] Avg episode reward: [(0, '9.233')]
[2023-02-26 17:24:32,574][11467] Saving new best policy, reward=9.233!
[2023-02-26 17:24:37,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3360.1). Total num frames: 2166784. Throughput: 0: 774.0. Samples: 542470. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:24:37,568][11480] Updated weights for policy 0, policy_version 530 (0.0017)
[2023-02-26 17:24:37,566][00199] Avg episode reward: [(0, '8.823')]
[2023-02-26 17:24:42,568][00199] Fps is (10 sec: 2865.9, 60 sec: 3208.3, 300 sec: 3360.1). Total num frames: 2183168. Throughput: 0: 766.9. Samples: 546500. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:24:42,576][00199] Avg episode reward: [(0, '8.376')]
[2023-02-26 17:24:47,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3346.2). Total num frames: 2199552. Throughput: 0: 784.5. Samples: 548940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:24:47,569][00199] Avg episode reward: [(0, '8.148')]
[2023-02-26 17:24:49,821][11480] Updated weights for policy 0, policy_version 540 (0.0027)
[2023-02-26 17:24:52,564][00199] Fps is (10 sec: 3688.0, 60 sec: 3140.3, 300 sec: 3374.1). Total num frames: 2220032. Throughput: 0: 841.6. Samples: 555010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:24:52,566][00199] Avg episode reward: [(0, '8.417')]
[2023-02-26 17:24:57,565][00199] Fps is (10 sec: 3685.8, 60 sec: 3208.5, 300 sec: 3387.9). Total num frames: 2236416. Throughput: 0: 833.9. Samples: 559826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:24:57,570][00199] Avg episode reward: [(0, '8.833')]
[2023-02-26 17:24:57,584][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000546_2236416.pth...
[2023-02-26 17:24:57,724][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000348_1425408.pth
[2023-02-26 17:25:02,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3360.1). Total num frames: 2248704. Throughput: 0: 816.2. Samples: 561654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:25:02,568][00199] Avg episode reward: [(0, '9.597')]
[2023-02-26 17:25:02,578][11467] Saving new best policy, reward=9.597!
[2023-02-26 17:25:03,885][11480] Updated weights for policy 0, policy_version 550 (0.0026)
[2023-02-26 17:25:07,563][00199] Fps is (10 sec: 2867.7, 60 sec: 3277.1, 300 sec: 3360.1). Total num frames: 2265088. Throughput: 0: 828.3. Samples: 566208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:25:07,565][00199] Avg episode reward: [(0, '10.009')]
[2023-02-26 17:25:07,579][11467] Saving new best policy, reward=10.009!
[2023-02-26 17:25:12,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2285568. Throughput: 0: 862.8. Samples: 572902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:25:12,566][00199] Avg episode reward: [(0, '11.100')]
[2023-02-26 17:25:12,573][11467] Saving new best policy, reward=11.100!
[2023-02-26 17:25:13,540][11480] Updated weights for policy 0, policy_version 560 (0.0021)
[2023-02-26 17:25:17,567][00199] Fps is (10 sec: 3685.1, 60 sec: 3344.9, 300 sec: 3387.8). Total num frames: 2301952. Throughput: 0: 856.3. Samples: 575642. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:25:17,573][00199] Avg episode reward: [(0, '11.281')]
[2023-02-26 17:25:17,608][11467] Saving new best policy, reward=11.281!
[2023-02-26 17:25:22,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2318336. Throughput: 0: 828.8. Samples: 579768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:25:22,567][00199] Avg episode reward: [(0, '10.840')]
[2023-02-26 17:25:26,501][11480] Updated weights for policy 0, policy_version 570 (0.0020)
[2023-02-26 17:25:27,564][00199] Fps is (10 sec: 3687.6, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2338816. Throughput: 0: 863.0. Samples: 585330. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:25:27,566][00199] Avg episode reward: [(0, '11.483')]
[2023-02-26 17:25:27,572][11467] Saving new best policy, reward=11.483!
[2023-02-26 17:25:32,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2359296. Throughput: 0: 880.4. Samples: 588558. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:25:32,570][00199] Avg episode reward: [(0, '12.509')]
[2023-02-26 17:25:32,576][11467] Saving new best policy, reward=12.509!
[2023-02-26 17:25:36,880][11480] Updated weights for policy 0, policy_version 580 (0.0025)
[2023-02-26 17:25:37,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 2375680. Throughput: 0: 871.2. Samples: 594216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:25:37,568][00199] Avg episode reward: [(0, '12.621')]
[2023-02-26 17:25:37,588][11467] Saving new best policy, reward=12.621!
[2023-02-26 17:25:42,564][00199] Fps is (10 sec: 2866.9, 60 sec: 3413.5, 300 sec: 3387.9). Total num frames: 2387968. Throughput: 0: 854.6. Samples: 598284. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:25:42,572][00199] Avg episode reward: [(0, '13.135')]
[2023-02-26 17:25:42,579][11467] Saving new best policy, reward=13.135!
[2023-02-26 17:25:47,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 2408448. Throughput: 0: 873.0. Samples: 600940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:25:47,568][00199] Avg episode reward: [(0, '12.280')]
[2023-02-26 17:25:48,843][11480] Updated weights for policy 0, policy_version 590 (0.0018)
[2023-02-26 17:25:52,564][00199] Fps is (10 sec: 4505.9, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2433024. Throughput: 0: 920.4. Samples: 607626. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:25:52,571][00199] Avg episode reward: [(0, '12.204')]
[2023-02-26 17:25:57,571][00199] Fps is (10 sec: 3683.7, 60 sec: 3481.3, 300 sec: 3415.6). Total num frames: 2445312. Throughput: 0: 884.7. Samples: 612720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:25:57,573][00199] Avg episode reward: [(0, '11.946')]
[2023-02-26 17:26:00,693][11480] Updated weights for policy 0, policy_version 600 (0.0012)
[2023-02-26 17:26:02,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3401.8). Total num frames: 2461696. Throughput: 0: 870.0. Samples: 614790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:26:02,576][00199] Avg episode reward: [(0, '12.636')]
[2023-02-26 17:26:07,564][00199] Fps is (10 sec: 3689.1, 60 sec: 3618.1, 300 sec: 3401.8). Total num frames: 2482176. Throughput: 0: 894.0. Samples: 620000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:26:07,571][00199] Avg episode reward: [(0, '13.330')]
[2023-02-26 17:26:07,581][11467] Saving new best policy, reward=13.330!
[2023-02-26 17:26:11,046][11480] Updated weights for policy 0, policy_version 610 (0.0019)
[2023-02-26 17:26:12,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3415.6). Total num frames: 2502656. Throughput: 0: 918.8. Samples: 626674. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:26:12,565][00199] Avg episode reward: [(0, '13.453')]
[2023-02-26 17:26:12,572][11467] Saving new best policy, reward=13.453!
[2023-02-26 17:26:17,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3429.5). Total num frames: 2519040. Throughput: 0: 904.1. Samples: 629242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:26:17,570][00199] Avg episode reward: [(0, '14.019')]
[2023-02-26 17:26:17,577][11467] Saving new best policy, reward=14.019!
[2023-02-26 17:26:22,565][00199] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3401.7). Total num frames: 2531328. Throughput: 0: 870.3. Samples: 633380. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:26:22,569][00199] Avg episode reward: [(0, '14.254')]
[2023-02-26 17:26:22,576][11467] Saving new best policy, reward=14.254!
[2023-02-26 17:26:24,088][11480] Updated weights for policy 0, policy_version 620 (0.0013)
[2023-02-26 17:26:27,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3401.8). Total num frames: 2551808. Throughput: 0: 903.3. Samples: 638930. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:26:27,569][00199] Avg episode reward: [(0, '14.012')]
[2023-02-26 17:26:32,563][00199] Fps is (10 sec: 4096.6, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2572288. Throughput: 0: 914.2. Samples: 642078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:26:32,571][00199] Avg episode reward: [(0, '13.723')]
[2023-02-26 17:26:33,998][11480] Updated weights for policy 0, policy_version 630 (0.0013)
[2023-02-26 17:26:37,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2588672. Throughput: 0: 882.8. Samples: 647352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:26:37,570][00199] Avg episode reward: [(0, '14.092')]
[2023-02-26 17:26:42,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3401.8). Total num frames: 2600960. Throughput: 0: 862.8. Samples: 651540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:26:42,569][00199] Avg episode reward: [(0, '14.162')]
[2023-02-26 17:26:46,568][11480] Updated weights for policy 0, policy_version 640 (0.0023)
[2023-02-26 17:26:47,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3415.6). Total num frames: 2625536. Throughput: 0: 886.1. Samples: 654664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:26:47,571][00199] Avg episode reward: [(0, '13.983')]
[2023-02-26 17:26:52,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2646016. Throughput: 0: 919.1. Samples: 661358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:26:52,566][00199] Avg episode reward: [(0, '14.994')]
[2023-02-26 17:26:52,577][11467] Saving new best policy, reward=14.994!
[2023-02-26 17:26:57,567][00199] Fps is (10 sec: 3275.7, 60 sec: 3550.1, 300 sec: 3429.5). Total num frames: 2658304. Throughput: 0: 875.9. Samples: 666094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 17:26:57,569][00199] Avg episode reward: [(0, '14.416')]
[2023-02-26 17:26:57,626][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000650_2662400.pth...
[2023-02-26 17:26:57,630][11480] Updated weights for policy 0, policy_version 650 (0.0021)
[2023-02-26 17:26:57,786][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000450_1843200.pth
[2023-02-26 17:27:02,566][00199] Fps is (10 sec: 2866.5, 60 sec: 3549.7, 300 sec: 3415.6). Total num frames: 2674688. Throughput: 0: 865.2. Samples: 668176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:27:02,572][00199] Avg episode reward: [(0, '14.069')]
[2023-02-26 17:27:07,563][00199] Fps is (10 sec: 4097.4, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 2699264. Throughput: 0: 909.6. Samples: 674312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:27:07,569][00199] Avg episode reward: [(0, '14.509')]
[2023-02-26 17:27:08,171][11480] Updated weights for policy 0, policy_version 660 (0.0013)
[2023-02-26 17:27:12,564][00199] Fps is (10 sec: 4916.4, 60 sec: 3686.4, 300 sec: 3457.3). Total num frames: 2723840. Throughput: 0: 948.4. Samples: 681606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:27:12,565][00199] Avg episode reward: [(0, '15.406')]
[2023-02-26 17:27:12,574][11467] Saving new best policy, reward=15.406!
[2023-02-26 17:27:17,566][00199] Fps is (10 sec: 4095.0, 60 sec: 3686.3, 300 sec: 3471.2). Total num frames: 2740224. Throughput: 0: 928.6. Samples: 683866. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:27:17,570][00199] Avg episode reward: [(0, '15.296')]
[2023-02-26 17:27:18,979][11480] Updated weights for policy 0, policy_version 670 (0.0019)
[2023-02-26 17:27:22,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3443.4). Total num frames: 2752512. Throughput: 0: 911.3. Samples: 688362. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:27:22,570][00199] Avg episode reward: [(0, '15.737')]
[2023-02-26 17:27:22,574][11467] Saving new best policy, reward=15.737!
[2023-02-26 17:27:27,564][00199] Fps is (10 sec: 3687.3, 60 sec: 3754.7, 300 sec: 3443.4). Total num frames: 2777088. Throughput: 0: 963.1. Samples: 694880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:27:27,566][00199] Avg episode reward: [(0, '16.249')]
[2023-02-26 17:27:27,574][11467] Saving new best policy, reward=16.249!
[2023-02-26 17:27:29,327][11480] Updated weights for policy 0, policy_version 680 (0.0030)
[2023-02-26 17:27:32,565][00199] Fps is (10 sec: 4914.4, 60 sec: 3822.8, 300 sec: 3485.1). Total num frames: 2801664. Throughput: 0: 969.6. Samples: 698296. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:27:32,570][00199] Avg episode reward: [(0, '15.640')]
[2023-02-26 17:27:37,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3485.1). Total num frames: 2813952. Throughput: 0: 941.2. Samples: 703714. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:27:37,570][00199] Avg episode reward: [(0, '16.506')]
[2023-02-26 17:27:37,587][11467] Saving new best policy, reward=16.506!
[2023-02-26 17:27:40,946][11480] Updated weights for policy 0, policy_version 690 (0.0049)
[2023-02-26 17:27:42,564][00199] Fps is (10 sec: 2867.6, 60 sec: 3822.9, 300 sec: 3471.2). Total num frames: 2830336. Throughput: 0: 933.9. Samples: 708116. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:27:42,574][00199] Avg episode reward: [(0, '16.273')]
[2023-02-26 17:27:47,563][00199] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3457.4). Total num frames: 2850816. Throughput: 0: 965.7. Samples: 711630. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:27:47,565][00199] Avg episode reward: [(0, '16.382')]
[2023-02-26 17:27:50,237][11480] Updated weights for policy 0, policy_version 700 (0.0021)
[2023-02-26 17:27:52,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3499.0). Total num frames: 2875392. Throughput: 0: 987.4. Samples: 718746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:27:52,566][00199] Avg episode reward: [(0, '16.450')]
[2023-02-26 17:27:57,565][00199] Fps is (10 sec: 4095.4, 60 sec: 3891.3, 300 sec: 3512.8). Total num frames: 2891776. Throughput: 0: 932.9. Samples: 723586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:27:57,568][00199] Avg episode reward: [(0, '16.567')]
[2023-02-26 17:27:57,585][11467] Saving new best policy, reward=16.567!
[2023-02-26 17:28:02,483][11480] Updated weights for policy 0, policy_version 710 (0.0012)
[2023-02-26 17:28:02,563][00199] Fps is (10 sec: 3276.8, 60 sec: 3891.4, 300 sec: 3512.8). Total num frames: 2908160. Throughput: 0: 931.0. Samples: 725758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:28:02,566][00199] Avg episode reward: [(0, '16.690')]
[2023-02-26 17:28:02,570][11467] Saving new best policy, reward=16.690!
[2023-02-26 17:28:07,564][00199] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3512.9). Total num frames: 2928640. Throughput: 0: 974.0. Samples: 732194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:28:07,565][00199] Avg episode reward: [(0, '16.210')]
[2023-02-26 17:28:11,021][11480] Updated weights for policy 0, policy_version 720 (0.0012)
[2023-02-26 17:28:12,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3554.5). Total num frames: 2953216. Throughput: 0: 985.8. Samples: 739240. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:28:12,570][00199] Avg episode reward: [(0, '16.300')]
[2023-02-26 17:28:17,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3554.5). Total num frames: 2969600. Throughput: 0: 959.6. Samples: 741476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:28:17,570][00199] Avg episode reward: [(0, '15.972')]
[2023-02-26 17:28:22,563][00199] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3540.6). Total num frames: 2985984. Throughput: 0: 940.0. Samples: 746014. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-26 17:28:22,566][00199] Avg episode reward: [(0, '16.504')]
[2023-02-26 17:28:23,299][11480] Updated weights for policy 0, policy_version 730 (0.0022)
[2023-02-26 17:28:27,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3540.6). Total num frames: 3006464. Throughput: 0: 994.4. Samples: 752862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:28:27,570][00199] Avg episode reward: [(0, '17.079')]
[2023-02-26 17:28:27,614][11467] Saving new best policy, reward=17.079!
[2023-02-26 17:28:32,171][11480] Updated weights for policy 0, policy_version 740 (0.0015)
[2023-02-26 17:28:32,563][00199] Fps is (10 sec: 4505.6, 60 sec: 3823.0, 300 sec: 3582.3). Total num frames: 3031040. Throughput: 0: 992.5. Samples: 756292. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:28:32,567][00199] Avg episode reward: [(0, '16.873')]
[2023-02-26 17:28:37,569][00199] Fps is (10 sec: 3684.4, 60 sec: 3822.6, 300 sec: 3568.3). Total num frames: 3043328. Throughput: 0: 948.9. Samples: 761452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:28:37,572][00199] Avg episode reward: [(0, '16.787')]
[2023-02-26 17:28:42,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3568.4). Total num frames: 3063808. Throughput: 0: 953.3. Samples: 766484. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:28:42,569][00199] Avg episode reward: [(0, '17.747')]
[2023-02-26 17:28:42,574][11467] Saving new best policy, reward=17.747!
[2023-02-26 17:28:44,257][11480] Updated weights for policy 0, policy_version 750 (0.0027)
[2023-02-26 17:28:47,564][00199] Fps is (10 sec: 4098.2, 60 sec: 3891.2, 300 sec: 3568.4). Total num frames: 3084288. Throughput: 0: 982.9. Samples: 769988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:28:47,569][00199] Avg episode reward: [(0, '18.550')]
[2023-02-26 17:28:47,577][11467] Saving new best policy, reward=18.550!
[2023-02-26 17:28:52,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3610.0). Total num frames: 3108864. Throughput: 0: 998.5. Samples: 777128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:28:52,565][00199] Avg episode reward: [(0, '18.675')]
[2023-02-26 17:28:52,569][11467] Saving new best policy, reward=18.675!
[2023-02-26 17:28:53,728][11480] Updated weights for policy 0, policy_version 760 (0.0019)
[2023-02-26 17:28:57,564][00199] Fps is (10 sec: 3686.3, 60 sec: 3823.0, 300 sec: 3610.0). Total num frames: 3121152. Throughput: 0: 940.6. Samples: 781566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:28:57,566][00199] Avg episode reward: [(0, '19.849')]
[2023-02-26 17:28:57,579][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000762_3121152.pth...
[2023-02-26 17:28:57,748][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000546_2236416.pth
[2023-02-26 17:28:57,757][11467] Saving new best policy, reward=19.849!
[2023-02-26 17:29:02,563][00199] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3624.0). Total num frames: 3137536. Throughput: 0: 937.7. Samples: 783674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:29:02,569][00199] Avg episode reward: [(0, '20.017')]
[2023-02-26 17:29:02,572][11467] Saving new best policy, reward=20.017!
[2023-02-26 17:29:05,300][11480] Updated weights for policy 0, policy_version 770 (0.0011)
[2023-02-26 17:29:07,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3651.7). Total num frames: 3162112. Throughput: 0: 986.7. Samples: 790416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:29:07,566][00199] Avg episode reward: [(0, '18.593')]
[2023-02-26 17:29:12,569][00199] Fps is (10 sec: 4912.6, 60 sec: 3890.9, 300 sec: 3679.4). Total num frames: 3186688. Throughput: 0: 985.5. Samples: 797216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:29:12,572][00199] Avg episode reward: [(0, '18.529')]
[2023-02-26 17:29:15,197][11480] Updated weights for policy 0, policy_version 780 (0.0012)
[2023-02-26 17:29:17,568][00199] Fps is (10 sec: 3684.8, 60 sec: 3822.7, 300 sec: 3679.4). Total num frames: 3198976. Throughput: 0: 959.5. Samples: 799474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:29:17,570][00199] Avg episode reward: [(0, '18.983')]
[2023-02-26 17:29:22,564][00199] Fps is (10 sec: 3278.6, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3219456. Throughput: 0: 948.9. Samples: 804148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:29:22,566][00199] Avg episode reward: [(0, '19.416')]
[2023-02-26 17:29:25,869][11480] Updated weights for policy 0, policy_version 790 (0.0036)
[2023-02-26 17:29:27,564][00199] Fps is (10 sec: 4507.6, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 3244032. Throughput: 0: 999.4. Samples: 811456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:29:27,566][00199] Avg episode reward: [(0, '18.963')]
[2023-02-26 17:29:32,566][00199] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3707.2). Total num frames: 3260416. Throughput: 0: 995.1. Samples: 814768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:29:32,571][00199] Avg episode reward: [(0, '20.028')]
[2023-02-26 17:29:32,594][11467] Saving new best policy, reward=20.028!
[2023-02-26 17:29:36,571][11480] Updated weights for policy 0, policy_version 800 (0.0046)
[2023-02-26 17:29:37,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3891.6, 300 sec: 3707.3). Total num frames: 3276800. Throughput: 0: 944.6. Samples: 819634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:29:37,571][00199] Avg episode reward: [(0, '20.067')]
[2023-02-26 17:29:37,594][11467] Saving new best policy, reward=20.067!
[2023-02-26 17:29:42,564][00199] Fps is (10 sec: 3687.3, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 3297280. Throughput: 0: 961.6. Samples: 824836. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:29:42,570][00199] Avg episode reward: [(0, '19.490')]
[2023-02-26 17:29:46,660][11480] Updated weights for policy 0, policy_version 810 (0.0012)
[2023-02-26 17:29:47,563][00199] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 3321856. Throughput: 0: 994.7. Samples: 828436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:29:47,570][00199] Avg episode reward: [(0, '19.339')]
[2023-02-26 17:29:52,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3342336. Throughput: 0: 999.2. Samples: 835378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:29:52,575][00199] Avg episode reward: [(0, '19.781')]
[2023-02-26 17:29:57,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3354624. Throughput: 0: 949.0. Samples: 839918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:29:57,569][00199] Avg episode reward: [(0, '20.212')]
[2023-02-26 17:29:57,576][11467] Saving new best policy, reward=20.212!
[2023-02-26 17:29:57,892][11480] Updated weights for policy 0, policy_version 820 (0.0030)
[2023-02-26 17:30:02,563][00199] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 3375104. Throughput: 0: 945.2. Samples: 842006. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:30:02,573][00199] Avg episode reward: [(0, '20.679')]
[2023-02-26 17:30:02,576][11467] Saving new best policy, reward=20.679!
[2023-02-26 17:30:07,520][11480] Updated weights for policy 0, policy_version 830 (0.0017)
[2023-02-26 17:30:07,563][00199] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3399680. Throughput: 0: 1000.0. Samples: 849146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:30:07,566][00199] Avg episode reward: [(0, '19.741')]
[2023-02-26 17:30:12,563][00199] Fps is (10 sec: 4096.0, 60 sec: 3823.3, 300 sec: 3776.7). Total num frames: 3416064. Throughput: 0: 977.2. Samples: 855428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:30:12,568][00199] Avg episode reward: [(0, '20.550')]
[2023-02-26 17:30:17,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3891.5, 300 sec: 3776.7). Total num frames: 3432448. Throughput: 0: 953.3. Samples: 857666. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:30:17,569][00199] Avg episode reward: [(0, '19.446')]
[2023-02-26 17:30:19,416][11480] Updated weights for policy 0, policy_version 840 (0.0023)
[2023-02-26 17:30:22,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 3452928. Throughput: 0: 958.7. Samples: 862774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:30:22,566][00199] Avg episode reward: [(0, '18.737')]
[2023-02-26 17:30:27,563][00199] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3469312. Throughput: 0: 960.1. Samples: 868040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:30:27,571][00199] Avg episode reward: [(0, '18.506')]
[2023-02-26 17:30:31,713][11480] Updated weights for policy 0, policy_version 850 (0.0027)
[2023-02-26 17:30:32,564][00199] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3748.9). Total num frames: 3481600. Throughput: 0: 927.8. Samples: 870188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:30:32,575][00199] Avg episode reward: [(0, '18.787')]
[2023-02-26 17:30:37,563][00199] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3493888. Throughput: 0: 862.8. Samples: 874202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:30:37,567][00199] Avg episode reward: [(0, '19.152')]
[2023-02-26 17:30:42,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3514368. Throughput: 0: 880.9. Samples: 879560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:30:42,566][00199] Avg episode reward: [(0, '18.487')]
[2023-02-26 17:30:43,829][11480] Updated weights for policy 0, policy_version 860 (0.0020)
[2023-02-26 17:30:47,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3538944. Throughput: 0: 915.7. Samples: 883214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:30:47,573][00199] Avg episode reward: [(0, '21.089')]
[2023-02-26 17:30:47,582][11467] Saving new best policy, reward=21.089!
[2023-02-26 17:30:52,564][00199] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3776.7). Total num frames: 3559424. Throughput: 0: 907.6. Samples: 889990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:30:52,571][00199] Avg episode reward: [(0, '21.586')]
[2023-02-26 17:30:52,575][11467] Saving new best policy, reward=21.586!
[2023-02-26 17:30:53,516][11480] Updated weights for policy 0, policy_version 870 (0.0038)
[2023-02-26 17:30:57,567][00199] Fps is (10 sec: 3685.3, 60 sec: 3686.2, 300 sec: 3776.6). Total num frames: 3575808. Throughput: 0: 865.4. Samples: 894374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:30:57,570][00199] Avg episode reward: [(0, '21.804')]
[2023-02-26 17:30:57,585][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000873_3575808.pth...
[2023-02-26 17:30:57,791][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000650_2662400.pth
[2023-02-26 17:30:57,804][11467] Saving new best policy, reward=21.804!
[2023-02-26 17:31:02,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3592192. Throughput: 0: 860.8. Samples: 896402. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:31:02,566][00199] Avg episode reward: [(0, '20.585')]
[2023-02-26 17:31:04,990][11480] Updated weights for policy 0, policy_version 880 (0.0015)
[2023-02-26 17:31:07,563][00199] Fps is (10 sec: 4097.3, 60 sec: 3618.1, 300 sec: 3776.7). Total num frames: 3616768. Throughput: 0: 903.8. Samples: 903444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:31:07,571][00199] Avg episode reward: [(0, '20.586')]
[2023-02-26 17:31:12,564][00199] Fps is (10 sec: 4095.8, 60 sec: 3618.1, 300 sec: 3776.6). Total num frames: 3633152. Throughput: 0: 920.7. Samples: 909474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:31:12,567][00199] Avg episode reward: [(0, '19.491')]
[2023-02-26 17:31:15,703][11480] Updated weights for policy 0, policy_version 890 (0.0029)
[2023-02-26 17:31:17,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3790.6). Total num frames: 3649536. Throughput: 0: 923.7. Samples: 911756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:31:17,566][00199] Avg episode reward: [(0, '19.747')]
[2023-02-26 17:31:22,563][00199] Fps is (10 sec: 3686.6, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 3670016. Throughput: 0: 946.0. Samples: 916772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:31:22,569][00199] Avg episode reward: [(0, '20.820')]
[2023-02-26 17:31:26,077][11480] Updated weights for policy 0, policy_version 900 (0.0023)
[2023-02-26 17:31:27,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 3690496. Throughput: 0: 981.0. Samples: 923704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 17:31:27,569][00199] Avg episode reward: [(0, '22.454')]
[2023-02-26 17:31:27,580][11467] Saving new best policy, reward=22.454!
[2023-02-26 17:31:32,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3710976. Throughput: 0: 968.2. Samples: 926784. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:31:32,576][00199] Avg episode reward: [(0, '22.026')]
[2023-02-26 17:31:37,564][00199] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3723264. Throughput: 0: 912.8. Samples: 931066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:31:37,567][00199] Avg episode reward: [(0, '21.734')]
[2023-02-26 17:31:38,123][11480] Updated weights for policy 0, policy_version 910 (0.0039)
[2023-02-26 17:31:42,563][00199] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3743744. Throughput: 0: 942.1. Samples: 936766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:31:42,571][00199] Avg episode reward: [(0, '21.082')]
[2023-02-26 17:31:47,257][11480] Updated weights for policy 0, policy_version 920 (0.0020)
[2023-02-26 17:31:47,563][00199] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3768320. Throughput: 0: 976.1. Samples: 940326. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:31:47,566][00199] Avg episode reward: [(0, '19.309')]
[2023-02-26 17:31:52,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3784704. Throughput: 0: 965.1. Samples: 946872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:31:52,571][00199] Avg episode reward: [(0, '19.220')]
[2023-02-26 17:31:57,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3818.3). Total num frames: 3801088. Throughput: 0: 931.5. Samples: 951390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:31:57,566][00199] Avg episode reward: [(0, '18.740')]
[2023-02-26 17:31:59,324][11480] Updated weights for policy 0, policy_version 930 (0.0021)
[2023-02-26 17:32:02,564][00199] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3821568. Throughput: 0: 938.0. Samples: 953968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 17:32:02,566][00199] Avg episode reward: [(0, '19.361')]
[2023-02-26 17:32:07,564][00199] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3846144. Throughput: 0: 987.9. Samples: 961228. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 17:32:07,566][00199] Avg episode reward: [(0, '18.780')]
[2023-02-26 17:32:08,010][11480] Updated weights for policy 0, policy_version 940 (0.0028)
[2023-02-26 17:32:12,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3804.5). Total num frames: 3862528. Throughput: 0: 964.1. Samples: 967088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:32:12,569][00199] Avg episode reward: [(0, '19.730')]
[2023-02-26 17:32:17,564][00199] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3878912. Throughput: 0: 944.8. Samples: 969298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:32:17,571][00199] Avg episode reward: [(0, '20.156')]
[2023-02-26 17:32:20,236][11480] Updated weights for policy 0, policy_version 950 (0.0019)
[2023-02-26 17:32:22,564][00199] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3899392. Throughput: 0: 972.0. Samples: 974806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:32:22,566][00199] Avg episode reward: [(0, '20.760')]
[2023-02-26 17:32:27,564][00199] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3923968. Throughput: 0: 1005.5. Samples: 982014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:32:27,568][00199] Avg episode reward: [(0, '22.527')]
[2023-02-26 17:32:27,587][11467] Saving new best policy, reward=22.527!
[2023-02-26 17:32:28,920][11480] Updated weights for policy 0, policy_version 960 (0.0022)
[2023-02-26 17:32:32,564][00199] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3940352. Throughput: 0: 988.6. Samples: 984814. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:32:32,566][00199] Avg episode reward: [(0, '22.963')]
[2023-02-26 17:32:32,569][11467] Saving new best policy, reward=22.963!
[2023-02-26 17:32:37,564][00199] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3956736. Throughput: 0: 940.0. Samples: 989172. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 17:32:37,566][00199] Avg episode reward: [(0, '22.753')]
[2023-02-26 17:32:41,215][11480] Updated weights for policy 0, policy_version 970 (0.0025)
[2023-02-26 17:32:42,564][00199] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3977216. Throughput: 0: 977.6. Samples: 995384. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 17:32:42,572][00199] Avg episode reward: [(0, '22.261')]
[2023-02-26 17:32:47,563][00199] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 4001792. Throughput: 0: 1000.3. Samples: 998982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 17:32:47,565][00199] Avg episode reward: [(0, '22.201')]
[2023-02-26 17:32:48,059][11467] Stopping Batcher_0...
[2023-02-26 17:32:48,059][11467] Loop batcher_evt_loop terminating...
[2023-02-26 17:32:48,067][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 17:32:48,066][00199] Component Batcher_0 stopped!
[2023-02-26 17:32:48,102][11480] Weights refcount: 2 0
[2023-02-26 17:32:48,120][00199] Component InferenceWorker_p0-w0 stopped!
[2023-02-26 17:32:48,127][11486] Stopping RolloutWorker_w3...
[2023-02-26 17:32:48,128][00199] Component RolloutWorker_w3 stopped!
[2023-02-26 17:32:48,133][11480] Stopping InferenceWorker_p0-w0...
[2023-02-26 17:32:48,135][11480] Loop inference_proc0-0_evt_loop terminating...
[2023-02-26 17:32:48,127][11486] Loop rollout_proc3_evt_loop terminating...
[2023-02-26 17:32:48,150][00199] Component RolloutWorker_w0 stopped!
[2023-02-26 17:32:48,150][11482] Stopping RolloutWorker_w0...
[2023-02-26 17:32:48,156][00199] Component RolloutWorker_w7 stopped!
[2023-02-26 17:32:48,162][11484] Stopping RolloutWorker_w1...
[2023-02-26 17:32:48,161][00199] Component RolloutWorker_w1 stopped!
[2023-02-26 17:32:48,160][11489] Stopping RolloutWorker_w7...
[2023-02-26 17:32:48,167][11484] Loop rollout_proc1_evt_loop terminating...
[2023-02-26 17:32:48,157][11482] Loop rollout_proc0_evt_loop terminating...
[2023-02-26 17:32:48,172][11488] Stopping RolloutWorker_w6...
[2023-02-26 17:32:48,172][00199] Component RolloutWorker_w5 stopped!
[2023-02-26 17:32:48,175][00199] Component RolloutWorker_w6 stopped!
[2023-02-26 17:32:48,179][11487] Stopping RolloutWorker_w5...
[2023-02-26 17:32:48,181][11487] Loop rollout_proc5_evt_loop terminating...
[2023-02-26 17:32:48,167][11489] Loop rollout_proc7_evt_loop terminating...
[2023-02-26 17:32:48,180][11488] Loop rollout_proc6_evt_loop terminating...
[2023-02-26 17:32:48,204][11483] Stopping RolloutWorker_w2...
[2023-02-26 17:32:48,203][00199] Component RolloutWorker_w4 stopped!
[2023-02-26 17:32:48,207][00199] Component RolloutWorker_w2 stopped!
[2023-02-26 17:32:48,203][11485] Stopping RolloutWorker_w4...
[2023-02-26 17:32:48,205][11483] Loop rollout_proc2_evt_loop terminating...
[2023-02-26 17:32:48,212][11485] Loop rollout_proc4_evt_loop terminating...
[2023-02-26 17:32:48,237][11467] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000762_3121152.pth
[2023-02-26 17:32:48,251][11467] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 17:32:48,400][00199] Component LearnerWorker_p0 stopped!
[2023-02-26 17:32:48,402][00199] Waiting for process learner_proc0 to stop...
[2023-02-26 17:32:48,408][11467] Stopping LearnerWorker_p0...
[2023-02-26 17:32:48,413][11467] Loop learner_proc0_evt_loop terminating...
[2023-02-26 17:32:50,823][00199] Waiting for process inference_proc0-0 to join...
[2023-02-26 17:32:51,481][00199] Waiting for process rollout_proc0 to join...
[2023-02-26 17:32:52,263][00199] Waiting for process rollout_proc1 to join...
[2023-02-26 17:32:52,266][00199] Waiting for process rollout_proc2 to join...
[2023-02-26 17:32:52,269][00199] Waiting for process rollout_proc3 to join...
[2023-02-26 17:32:52,272][00199] Waiting for process rollout_proc4 to join...
[2023-02-26 17:32:52,275][00199] Waiting for process rollout_proc5 to join...
[2023-02-26 17:32:52,277][00199] Waiting for process rollout_proc6 to join...
[2023-02-26 17:32:52,278][00199] Waiting for process rollout_proc7 to join...
[2023-02-26 17:32:52,279][00199] Batcher 0 profile tree view:
batching: 25.6100, releasing_batches: 0.0279
[2023-02-26 17:32:52,282][00199] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0005
  wait_policy_total: 567.5688
update_model: 8.7660
  weight_update: 0.0025
one_step: 0.0023
  handle_policy_step: 542.1298
    deserialize: 16.1339, stack: 3.2202, obs_to_device_normalize: 120.3623, forward: 262.3361, send_messages: 27.7685
    prepare_outputs: 85.4096
      to_cpu: 51.7935
[2023-02-26 17:32:52,286][00199] Learner 0 profile tree view:
misc: 0.0089, prepare_batch: 15.9065
train: 76.6477
  epoch_init: 0.0056, minibatch_init: 0.0252, losses_postprocess: 0.5744, kl_divergence: 0.6138, after_optimizer: 33.3847
  calculate_losses: 26.9826
    losses_init: 0.0140, forward_head: 1.7360, bptt_initial: 17.7814, tail: 1.0877, advantages_returns: 0.3198, losses: 3.4533
    bptt: 2.2495
      bptt_forward_core: 2.1591
  update: 14.3656
    clip: 1.4218
[2023-02-26 17:32:52,287][00199] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3614, enqueue_policy_requests: 160.2085, env_step: 865.5381, overhead: 24.1571, complete_rollouts: 7.4646
save_policy_outputs: 21.3844
  split_output_tensors: 10.4584
[2023-02-26 17:32:52,288][00199] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4036, enqueue_policy_requests: 162.0929, env_step: 864.8531, overhead: 23.6291, complete_rollouts: 7.0879
save_policy_outputs: 20.7680
  split_output_tensors: 9.9009
[2023-02-26 17:32:52,290][00199] Loop Runner_EvtLoop terminating...
[2023-02-26 17:32:52,291][00199] Runner profile tree view:
main_loop: 1189.3108
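Note: the profile tree views above are hierarchical wall-clock timers: each entry is the total seconds spent inside that scope over the whole run, with child scopes indented under their parent (e.g. to_cpu under prepare_outputs). A toy version of such a timer (illustrative only; Sample Factory's own Timing utility differs):

    import time
    from contextlib import contextmanager

    class ProfileTree:
        def __init__(self):
            self.totals = {}  # scope path (tuple of names) -> accumulated seconds
            self.stack = []

        @contextmanager
        def scope(self, name):
            self.stack.append(name)
            path = tuple(self.stack)
            start = time.perf_counter()
            try:
                yield
            finally:
                self.totals[path] = self.totals.get(path, 0.0) + time.perf_counter() - start
                self.stack.pop()

        def report(self):
            # tuple sort prints each parent directly before its children
            for path, total in sorted(self.totals.items()):
                print(f"{'  ' * (len(path) - 1)}{path[-1]}: {total:.4f}")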
[2023-02-26 17:32:52,292][00199] Collected {0: 4005888}, FPS: 3368.2
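Note: a run like the one summarized above (APPO with 8 rollout workers, stopping just past the 4,000,000-frame target) matches what the Hugging Face Deep RL course notebook launches for this environment. Roughly, assuming the notebook's register_vizdoom_components()/parse_vizdoom_cfg() helpers, which are not defined in this log:

    from sample_factory.train import run_rl

    # register_vizdoom_components() and parse_vizdoom_cfg() are course-notebook
    # helpers (env/model registration and config parsing), assumed here.
    register_vizdoom_components()

    env = "doom_health_gathering_supreme"
    cfg = parse_vizdoom_cfg(
        argv=[f"--env={env}", "--num_workers=8",
              "--num_envs_per_worker=4", "--train_for_env_steps=4000000"]
    )
    status = run_rl(cfg)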
[2023-02-26 17:32:52,471][00199] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 17:32:52,474][00199] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 17:32:52,476][00199] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 17:32:52,479][00199] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 17:32:52,481][00199] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 17:32:52,483][00199] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 17:32:52,484][00199] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 17:32:52,486][00199] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 17:32:52,488][00199] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-26 17:32:52,493][00199] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-26 17:32:52,494][00199] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 17:32:52,495][00199] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 17:32:52,496][00199] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 17:32:52,497][00199] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 17:32:52,499][00199] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 17:32:52,539][00199] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:32:52,543][00199] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 17:32:52,549][00199] RunningMeanStd input shape: (1,)
[2023-02-26 17:32:52,571][00199] ConvEncoder: input_channels=3
[2023-02-26 17:32:53,319][00199] Conv encoder output size: 512
[2023-02-26 17:32:53,321][00199] Policy head output size: 512
[2023-02-26 17:32:55,808][00199] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 17:32:57,059][00199] Num frames 100...
[2023-02-26 17:32:57,173][00199] Num frames 200...
[2023-02-26 17:32:57,292][00199] Num frames 300...
[2023-02-26 17:32:57,405][00199] Num frames 400...
[2023-02-26 17:32:57,516][00199] Num frames 500...
[2023-02-26 17:32:57,649][00199] Num frames 600...
[2023-02-26 17:32:57,766][00199] Num frames 700...
[2023-02-26 17:32:57,883][00199] Num frames 800...
[2023-02-26 17:32:57,997][00199] Num frames 900...
[2023-02-26 17:32:58,115][00199] Num frames 1000...
[2023-02-26 17:32:58,228][00199] Num frames 1100...
[2023-02-26 17:32:58,346][00199] Num frames 1200...
[2023-02-26 17:32:58,469][00199] Num frames 1300...
[2023-02-26 17:32:58,585][00199] Num frames 1400...
[2023-02-26 17:32:58,675][00199] Avg episode rewards: #0: 29.300, true rewards: #0: 14.300
[2023-02-26 17:32:58,677][00199] Avg episode reward: 29.300, avg true_objective: 14.300
[2023-02-26 17:32:58,777][00199] Num frames 1500...
[2023-02-26 17:32:58,905][00199] Num frames 1600...
[2023-02-26 17:32:59,029][00199] Num frames 1700...
[2023-02-26 17:32:59,140][00199] Num frames 1800...
[2023-02-26 17:32:59,255][00199] Num frames 1900...
[2023-02-26 17:32:59,368][00199] Num frames 2000...
[2023-02-26 17:32:59,432][00199] Avg episode rewards: #0: 19.530, true rewards: #0: 10.030
[2023-02-26 17:32:59,434][00199] Avg episode reward: 19.530, avg true_objective: 10.030
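Note: "Avg episode rewards" is a running mean over the episodes evaluated so far, and "true rewards" appears to track the same running mean for the environment's raw objective (as opposed to the shaped training reward). Individual episode scores can be recovered by differencing, e.g. from the two report lines above:

    ep1_avg, ep2_avg = 29.300, 19.530       # logged averages after episodes 1 and 2
    ep2_reward = 2 * ep2_avg - ep1_avg      # episode 2 alone scored 9.760
    ep1_true, ep2_true_avg = 14.300, 10.030
    ep2_true = 2 * ep2_true_avg - ep1_true  # episode 2's true objective: 5.760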
[2023-02-26 17:32:59,558][00199] Num frames 2100...
[2023-02-26 17:32:59,673][00199] Num frames 2200...
[2023-02-26 17:32:59,848][00199] Num frames 2300...
[2023-02-26 17:33:00,017][00199] Num frames 2400...
[2023-02-26 17:33:00,312][00199] Num frames 2500...
[2023-02-26 17:33:00,595][00199] Num frames 2600...
[2023-02-26 17:33:00,814][00199] Num frames 2700...
[2023-02-26 17:33:00,894][00199] Avg episode rewards: #0: 17.367, true rewards: #0: 9.033
[2023-02-26 17:33:00,899][00199] Avg episode reward: 17.367, avg true_objective: 9.033
[2023-02-26 17:33:01,132][00199] Num frames 2800...
[2023-02-26 17:33:01,306][00199] Num frames 2900...
[2023-02-26 17:33:01,467][00199] Num frames 3000...
[2023-02-26 17:33:01,630][00199] Num frames 3100...
[2023-02-26 17:33:01,862][00199] Num frames 3200...
[2023-02-26 17:33:01,988][00199] Num frames 3300...
[2023-02-26 17:33:02,109][00199] Avg episode rewards: #0: 16.390, true rewards: #0: 8.390
[2023-02-26 17:33:02,111][00199] Avg episode reward: 16.390, avg true_objective: 8.390
[2023-02-26 17:33:02,163][00199] Num frames 3400...
[2023-02-26 17:33:02,281][00199] Num frames 3500...
[2023-02-26 17:33:02,393][00199] Num frames 3600...
[2023-02-26 17:33:02,506][00199] Num frames 3700...
[2023-02-26 17:33:02,623][00199] Num frames 3800...
[2023-02-26 17:33:02,737][00199] Num frames 3900...
[2023-02-26 17:33:02,861][00199] Num frames 4000...
[2023-02-26 17:33:02,983][00199] Num frames 4100...
[2023-02-26 17:33:03,104][00199] Num frames 4200...
[2023-02-26 17:33:03,213][00199] Num frames 4300...
[2023-02-26 17:33:03,325][00199] Num frames 4400...
[2023-02-26 17:33:03,435][00199] Num frames 4500...
[2023-02-26 17:33:03,551][00199] Num frames 4600...
[2023-02-26 17:33:03,667][00199] Num frames 4700...
[2023-02-26 17:33:03,779][00199] Num frames 4800...
[2023-02-26 17:33:03,909][00199] Num frames 4900...
[2023-02-26 17:33:04,085][00199] Avg episode rewards: #0: 21.550, true rewards: #0: 9.950
[2023-02-26 17:33:04,087][00199] Avg episode reward: 21.550, avg true_objective: 9.950
[2023-02-26 17:33:04,129][00199] Num frames 5000...
[2023-02-26 17:33:04,279][00199] Num frames 5100...
[2023-02-26 17:33:04,430][00199] Num frames 5200...
[2023-02-26 17:33:04,588][00199] Num frames 5300...
[2023-02-26 17:33:04,745][00199] Num frames 5400...
[2023-02-26 17:33:04,910][00199] Num frames 5500...
[2023-02-26 17:33:05,075][00199] Num frames 5600...
[2023-02-26 17:33:05,156][00199] Avg episode rewards: #0: 19.858, true rewards: #0: 9.358
[2023-02-26 17:33:05,159][00199] Avg episode reward: 19.858, avg true_objective: 9.358
[2023-02-26 17:33:05,287][00199] Num frames 5700...
[2023-02-26 17:33:05,436][00199] Num frames 5800...
[2023-02-26 17:33:05,588][00199] Num frames 5900...
[2023-02-26 17:33:05,749][00199] Num frames 6000...
[2023-02-26 17:33:05,904][00199] Num frames 6100...
[2023-02-26 17:33:06,071][00199] Num frames 6200...
[2023-02-26 17:33:06,228][00199] Num frames 6300...
[2023-02-26 17:33:06,389][00199] Num frames 6400...
[2023-02-26 17:33:06,548][00199] Num frames 6500...
[2023-02-26 17:33:06,713][00199] Num frames 6600...
[2023-02-26 17:33:06,879][00199] Num frames 6700...
[2023-02-26 17:33:07,048][00199] Num frames 6800...
[2023-02-26 17:33:07,158][00199] Avg episode rewards: #0: 20.759, true rewards: #0: 9.759
[2023-02-26 17:33:07,160][00199] Avg episode reward: 20.759, avg true_objective: 9.759
[2023-02-26 17:33:07,267][00199] Num frames 6900...
[2023-02-26 17:33:07,410][00199] Num frames 7000...
[2023-02-26 17:33:07,520][00199] Num frames 7100...
[2023-02-26 17:33:07,630][00199] Num frames 7200...
[2023-02-26 17:33:07,744][00199] Num frames 7300...
[2023-02-26 17:33:07,865][00199] Num frames 7400...
[2023-02-26 17:33:07,993][00199] Num frames 7500...
[2023-02-26 17:33:08,105][00199] Num frames 7600...
[2023-02-26 17:33:08,218][00199] Num frames 7700...
[2023-02-26 17:33:08,328][00199] Num frames 7800...
[2023-02-26 17:33:08,457][00199] Avg episode rewards: #0: 21.085, true rewards: #0: 9.835
[2023-02-26 17:33:08,459][00199] Avg episode reward: 21.085, avg true_objective: 9.835
[2023-02-26 17:33:08,497][00199] Num frames 7900...
[2023-02-26 17:33:08,607][00199] Num frames 8000...
[2023-02-26 17:33:08,723][00199] Num frames 8100...
[2023-02-26 17:33:08,836][00199] Num frames 8200...
[2023-02-26 17:33:08,954][00199] Num frames 8300...
[2023-02-26 17:33:09,074][00199] Num frames 8400...
[2023-02-26 17:33:09,187][00199] Num frames 8500...
[2023-02-26 17:33:09,300][00199] Num frames 8600...
[2023-02-26 17:33:09,416][00199] Num frames 8700...
[2023-02-26 17:33:09,468][00199] Avg episode rewards: #0: 20.556, true rewards: #0: 9.667
[2023-02-26 17:33:09,470][00199] Avg episode reward: 20.556, avg true_objective: 9.667
[2023-02-26 17:33:09,582][00199] Num frames 8800...
[2023-02-26 17:33:09,702][00199] Num frames 8900...
[2023-02-26 17:33:09,815][00199] Num frames 9000...
[2023-02-26 17:33:09,929][00199] Num frames 9100...
[2023-02-26 17:33:10,047][00199] Num frames 9200...
[2023-02-26 17:33:10,162][00199] Num frames 9300...
[2023-02-26 17:33:10,272][00199] Num frames 9400...
[2023-02-26 17:33:10,368][00199] Avg episode rewards: #0: 20.136, true rewards: #0: 9.436
[2023-02-26 17:33:10,369][00199] Avg episode reward: 20.136, avg true_objective: 9.436
[2023-02-26 17:34:07,804][00199] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
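Note: the evaluation pass above (10 episodes, no rendering, replay written to replay.mp4) corresponds to Sample Factory's enjoy entry point with the overrides logged at 17:32:52. Roughly, reusing the hypothetical parse_vizdoom_cfg helper from the training sketch:

    from sample_factory.enjoy import enjoy

    cfg = parse_vizdoom_cfg(
        argv=["--env=doom_health_gathering_supreme", "--num_workers=1",
              "--save_video", "--no_render", "--max_num_episodes=10"],
        evaluation=True,
    )
    status = enjoy(cfg)  # loads the newest checkpoint and saves replay.mp4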
[2023-02-26 17:35:32,168][00199] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 17:35:32,171][00199] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 17:35:32,173][00199] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 17:35:32,175][00199] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 17:35:32,177][00199] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 17:35:32,181][00199] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 17:35:32,182][00199] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-26 17:35:32,183][00199] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 17:35:32,187][00199] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-26 17:35:32,188][00199] Adding new argument 'hf_repository'='zipbomb/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-26 17:35:32,189][00199] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 17:35:32,191][00199] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 17:35:32,192][00199] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 17:35:32,194][00199] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 17:35:32,195][00199] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 17:35:32,225][00199] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 17:35:32,226][00199] RunningMeanStd input shape: (1,)
[2023-02-26 17:35:32,242][00199] ConvEncoder: input_channels=3
[2023-02-26 17:35:32,281][00199] Conv encoder output size: 512
[2023-02-26 17:35:32,282][00199] Policy head output size: 512
[2023-02-26 17:35:32,301][00199] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 17:35:33,371][00199] Num frames 100...
[2023-02-26 17:35:33,500][00199] Num frames 200...
[2023-02-26 17:35:33,622][00199] Num frames 300...
[2023-02-26 17:35:33,747][00199] Num frames 400...
[2023-02-26 17:35:33,859][00199] Num frames 500...
[2023-02-26 17:35:34,002][00199] Avg episode rewards: #0: 12.760, true rewards: #0: 5.760
[2023-02-26 17:35:34,004][00199] Avg episode reward: 12.760, avg true_objective: 5.760
[2023-02-26 17:35:34,037][00199] Num frames 600...
[2023-02-26 17:35:34,157][00199] Num frames 700...
[2023-02-26 17:35:34,269][00199] Num frames 800...
[2023-02-26 17:35:34,382][00199] Num frames 900...
[2023-02-26 17:35:34,499][00199] Num frames 1000...
[2023-02-26 17:35:34,612][00199] Num frames 1100...
[2023-02-26 17:35:34,734][00199] Num frames 1200...
[2023-02-26 17:35:34,803][00199] Avg episode rewards: #0: 12.555, true rewards: #0: 6.055
[2023-02-26 17:35:34,804][00199] Avg episode reward: 12.555, avg true_objective: 6.055
[2023-02-26 17:35:34,909][00199] Num frames 1300...
[2023-02-26 17:35:35,025][00199] Num frames 1400...
[2023-02-26 17:35:35,138][00199] Num frames 1500...
[2023-02-26 17:35:35,249][00199] Num frames 1600...
[2023-02-26 17:35:35,361][00199] Num frames 1700...
[2023-02-26 17:35:35,477][00199] Num frames 1800...
[2023-02-26 17:35:35,590][00199] Num frames 1900...
[2023-02-26 17:35:35,714][00199] Num frames 2000...
[2023-02-26 17:35:35,786][00199] Avg episode rewards: #0: 13.370, true rewards: #0: 6.703
[2023-02-26 17:35:35,789][00199] Avg episode reward: 13.370, avg true_objective: 6.703
[2023-02-26 17:35:35,895][00199] Num frames 2100...
[2023-02-26 17:35:36,014][00199] Num frames 2200...
[2023-02-26 17:35:36,128][00199] Num frames 2300...
[2023-02-26 17:35:36,251][00199] Num frames 2400...
[2023-02-26 17:35:36,377][00199] Num frames 2500...
[2023-02-26 17:35:36,499][00199] Num frames 2600...
[2023-02-26 17:35:36,619][00199] Num frames 2700...
[2023-02-26 17:35:36,695][00199] Avg episode rewards: #0: 13.038, true rewards: #0: 6.787
[2023-02-26 17:35:36,697][00199] Avg episode reward: 13.038, avg true_objective: 6.787
[2023-02-26 17:35:36,805][00199] Num frames 2800...
[2023-02-26 17:35:36,915][00199] Num frames 2900...
[2023-02-26 17:35:37,036][00199] Num frames 3000...
[2023-02-26 17:35:37,145][00199] Num frames 3100...
[2023-02-26 17:35:37,258][00199] Num frames 3200...
[2023-02-26 17:35:37,378][00199] Num frames 3300...
[2023-02-26 17:35:37,490][00199] Num frames 3400...
[2023-02-26 17:35:37,616][00199] Num frames 3500...
[2023-02-26 17:35:37,736][00199] Num frames 3600...
[2023-02-26 17:35:37,855][00199] Num frames 3700...
[2023-02-26 17:35:37,975][00199] Num frames 3800...
[2023-02-26 17:35:38,142][00199] Avg episode rewards: #0: 15.398, true rewards: #0: 7.798
[2023-02-26 17:35:38,144][00199] Avg episode reward: 15.398, avg true_objective: 7.798
[2023-02-26 17:35:38,148][00199] Num frames 3900...
[2023-02-26 17:35:38,260][00199] Num frames 4000...
[2023-02-26 17:35:38,378][00199] Num frames 4100...
[2023-02-26 17:35:38,488][00199] Num frames 4200...
[2023-02-26 17:35:38,608][00199] Num frames 4300...
[2023-02-26 17:35:38,720][00199] Num frames 4400...
[2023-02-26 17:35:38,840][00199] Num frames 4500...
[2023-02-26 17:35:38,954][00199] Num frames 4600...
[2023-02-26 17:35:39,070][00199] Num frames 4700...
[2023-02-26 17:35:39,181][00199] Num frames 4800...
[2023-02-26 17:35:39,292][00199] Num frames 4900...
[2023-02-26 17:35:39,408][00199] Num frames 5000...
[2023-02-26 17:35:39,521][00199] Num frames 5100...
[2023-02-26 17:35:39,587][00199] Avg episode rewards: #0: 17.013, true rewards: #0: 8.513
[2023-02-26 17:35:39,589][00199] Avg episode reward: 17.013, avg true_objective: 8.513
[2023-02-26 17:35:39,693][00199] Num frames 5200...
[2023-02-26 17:35:39,809][00199] Num frames 5300...
[2023-02-26 17:35:39,917][00199] Num frames 5400...
[2023-02-26 17:35:40,036][00199] Num frames 5500...
[2023-02-26 17:35:40,144][00199] Num frames 5600...
[2023-02-26 17:35:40,260][00199] Num frames 5700...
[2023-02-26 17:35:40,404][00199] Avg episode rewards: #0: 16.257, true rewards: #0: 8.257
[2023-02-26 17:35:40,407][00199] Avg episode reward: 16.257, avg true_objective: 8.257
[2023-02-26 17:35:40,432][00199] Num frames 5800...
[2023-02-26 17:35:40,539][00199] Num frames 5900...
[2023-02-26 17:35:40,656][00199] Num frames 6000...
[2023-02-26 17:35:40,798][00199] Num frames 6100...
[2023-02-26 17:35:40,973][00199] Num frames 6200...
[2023-02-26 17:35:41,128][00199] Num frames 6300...
[2023-02-26 17:35:41,286][00199] Num frames 6400...
[2023-02-26 17:35:41,443][00199] Num frames 6500...
[2023-02-26 17:35:41,601][00199] Num frames 6600...
[2023-02-26 17:35:41,757][00199] Num frames 6700...
[2023-02-26 17:35:41,914][00199] Num frames 6800...
[2023-02-26 17:35:42,068][00199] Num frames 6900...
[2023-02-26 17:35:42,220][00199] Num frames 7000...
[2023-02-26 17:35:42,384][00199] Num frames 7100...
[2023-02-26 17:35:42,545][00199] Num frames 7200...
[2023-02-26 17:35:42,704][00199] Num frames 7300...
[2023-02-26 17:35:42,862][00199] Num frames 7400...
[2023-02-26 17:35:43,028][00199] Num frames 7500...
[2023-02-26 17:35:43,098][00199] Avg episode rewards: #0: 19.010, true rewards: #0: 9.385
[2023-02-26 17:35:43,100][00199] Avg episode reward: 19.010, avg true_objective: 9.385
[2023-02-26 17:35:43,246][00199] Num frames 7600...
[2023-02-26 17:35:43,418][00199] Num frames 7700...
[2023-02-26 17:35:43,575][00199] Num frames 7800...
[2023-02-26 17:35:43,740][00199] Num frames 7900...
[2023-02-26 17:35:43,897][00199] Num frames 8000...
[2023-02-26 17:35:44,064][00199] Num frames 8100...
[2023-02-26 17:35:44,223][00199] Num frames 8200...
[2023-02-26 17:35:44,328][00199] Avg episode rewards: #0: 18.271, true rewards: #0: 9.160
[2023-02-26 17:35:44,331][00199] Avg episode reward: 18.271, avg true_objective: 9.160
[2023-02-26 17:35:44,398][00199] Num frames 8300...
[2023-02-26 17:35:44,507][00199] Num frames 8400...
[2023-02-26 17:35:44,627][00199] Num frames 8500...
[2023-02-26 17:35:44,739][00199] Num frames 8600...
[2023-02-26 17:35:44,857][00199] Num frames 8700...
[2023-02-26 17:35:44,978][00199] Num frames 8800...
[2023-02-26 17:35:45,133][00199] Avg episode rewards: #0: 17.484, true rewards: #0: 8.884
[2023-02-26 17:35:45,135][00199] Avg episode reward: 17.484, avg true_objective: 8.884
[2023-02-26 17:36:35,866][00199] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
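Note: the second evaluation pass differs from the first only in the overrides logged at 17:35:32 (max_num_frames=100000, push_to_hub=True and an hf_repository target); after the replay is saved, the model, config and video are uploaded to that repository. Roughly, under the same assumptions as the sketches above:

    from sample_factory.enjoy import enjoy

    cfg = parse_vizdoom_cfg(
        argv=["--env=doom_health_gathering_supreme", "--num_workers=1",
              "--save_video", "--no_render", "--max_num_episodes=10",
              "--max_num_frames=100000", "--push_to_hub",
              "--hf_repository=zipbomb/rl_course_vizdoom_health_gathering_supreme"],
        evaluation=True,
    )
    status = enjoy(cfg)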