[2022-12-04 20:47:56,451][04266] Saving configuration to /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/config.json...
[2022-12-04 20:47:56,464][04266] Rollout worker 0 uses device cpu
[2022-12-04 20:47:56,464][04266] Rollout worker 1 uses device cpu
[2022-12-04 20:47:56,464][04266] Rollout worker 2 uses device cpu
[2022-12-04 20:47:56,465][04266] Rollout worker 3 uses device cpu
[2022-12-04 20:47:56,465][04266] Rollout worker 4 uses device cpu
[2022-12-04 20:47:56,465][04266] Rollout worker 5 uses device cpu
[2022-12-04 20:47:56,465][04266] Rollout worker 6 uses device cpu
[2022-12-04 20:47:56,465][04266] Rollout worker 7 uses device cpu
[2022-12-04 20:47:56,465][04266] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2022-12-04 20:47:56,487][04266] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2022-12-04 20:47:56,487][04266] InferenceWorker_p0-w0: min num requests: 2
[2022-12-04 20:47:56,519][04266] Starting all processes...
[2022-12-04 20:47:56,520][04266] Starting process learner_proc0
[2022-12-04 20:47:56,570][04266] Starting all processes...
[2022-12-04 20:47:56,577][04266] Starting process inference_proc0-0
[2022-12-04 20:47:56,577][04266] Starting process rollout_proc0
[2022-12-04 20:47:56,578][04266] Starting process rollout_proc1
[2022-12-04 20:47:56,578][04266] Starting process rollout_proc2
[2022-12-04 20:47:56,578][04266] Starting process rollout_proc3
[2022-12-04 20:47:56,579][04266] Starting process rollout_proc4
[2022-12-04 20:47:56,579][04266] Starting process rollout_proc5
[2022-12-04 20:47:56,584][04266] Starting process rollout_proc6
[2022-12-04 20:47:56,591][04266] Starting process rollout_proc7
[2022-12-04 20:47:58,489][04366] Worker 5 uses CPU cores [5]
[2022-12-04 20:47:58,561][04361] Worker 0 uses CPU cores [0]
[2022-12-04 20:47:58,611][04360] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2022-12-04 20:47:58,612][04360] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2022-12-04 20:47:58,705][04367] Worker 4 uses CPU cores [4]
[2022-12-04 20:47:58,733][04363] Worker 6 uses CPU cores [6]
[2022-12-04 20:47:58,765][04368] Worker 2 uses CPU cores [2]
[2022-12-04 20:47:58,779][04365] Worker 3 uses CPU cores [3]
[2022-12-04 20:47:58,824][04340] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2022-12-04 20:47:58,825][04340] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2022-12-04 20:47:58,834][04364] Worker 7 uses CPU cores [7]
[2022-12-04 20:47:58,885][04362] Worker 1 uses CPU cores [1]
[2022-12-04 20:47:59,427][04360] Num visible devices: 1
[2022-12-04 20:47:59,428][04340] Num visible devices: 1
[2022-12-04 20:47:59,446][04340] Starting seed is not provided
[2022-12-04 20:47:59,446][04340] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2022-12-04 20:47:59,446][04340] Initializing actor-critic model on device cuda:0
[2022-12-04 20:47:59,446][04340] RunningMeanStd input shape: (27,)
[2022-12-04 20:47:59,447][04340] RunningMeanStd input shape: (1,)
[2022-12-04 20:47:59,522][04340] Created Actor Critic model with architecture:
[2022-12-04 20:47:59,522][04340] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): MlpEncoder(
        (mlp_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=Tanh)
          (2): RecursiveScriptModule(original_name=Linear)
          (3): RecursiveScriptModule(original_name=Tanh)
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=64, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev(
    (distribution_linear): Linear(in_features=64, out_features=8, bias=True)
  )
)
[2022-12-04 20:48:03,416][04340] Using optimizer <class 'torch.optim.adam.Adam'>
[2022-12-04 20:48:03,417][04340] No checkpoints found
[2022-12-04 20:48:03,417][04340] Did not load from checkpoint, starting from scratch!
[2022-12-04 20:48:03,417][04340] Initialized policy 0 weights for model version 0
[2022-12-04 20:48:03,422][04340] LearnerWorker_p0 finished initialization!
[2022-12-04 20:48:03,424][04340] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2022-12-04 20:48:03,551][04360] RunningMeanStd input shape: (27,)
[2022-12-04 20:48:03,552][04360] RunningMeanStd input shape: (1,)
[2022-12-04 20:48:03,650][04266] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2022-12-04 20:48:07,105][04266] Inference worker 0-0 is ready!
[2022-12-04 20:48:07,105][04266] All inference workers are ready! Signal rollout workers to start!
[2022-12-04 20:48:07,303][04364] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,303][04362] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,305][04363] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,306][04362] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,305][04367] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,305][04364] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,305][04361] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,305][04368] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,306][04366] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,307][04367] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,307][04363] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,307][04365] Decorrelating experience for 0 frames...
[2022-12-04 20:48:07,308][04368] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,308][04366] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,308][04361] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,309][04365] Decorrelating experience for 64 frames...
[2022-12-04 20:48:07,359][04364] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,360][04363] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,362][04366] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,361][04362] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,362][04361] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,362][04365] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,362][04367] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,362][04368] Decorrelating experience for 128 frames...
[2022-12-04 20:48:07,467][04363] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,467][04364] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,469][04367] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,469][04365] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,470][04366] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,471][04361] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,472][04362] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,474][04368] Decorrelating experience for 192 frames...
[2022-12-04 20:48:07,650][04364] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,658][04363] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,658][04365] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,659][04367] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,659][04362] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,661][04361] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,662][04366] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,664][04368] Decorrelating experience for 256 frames...
[2022-12-04 20:48:07,856][04364] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,863][04363] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,864][04365] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,866][04362] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,866][04361] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,871][04366] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,872][04367] Decorrelating experience for 320 frames...
[2022-12-04 20:48:07,877][04368] Decorrelating experience for 320 frames...
[2022-12-04 20:48:08,114][04364] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,119][04363] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,121][04365] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,123][04361] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,128][04362] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,129][04366] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,131][04367] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,144][04368] Decorrelating experience for 384 frames...
[2022-12-04 20:48:08,431][04364] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,433][04363] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,437][04365] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,437][04361] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,440][04362] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,444][04367] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,452][04366] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,466][04368] Decorrelating experience for 448 frames...
[2022-12-04 20:48:08,650][04266] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2022-12-04 20:48:08,652][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000000_0.pth...
[2022-12-04 20:48:13,650][04266] Fps is (10 sec: 819.2, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 8192. Throughput: 0: 846.4. Samples: 8464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:13,650][04266] Avg episode reward: [(0, '-160.026')]
[2022-12-04 20:48:16,478][04266] Heartbeat connected on Batcher_0
[2022-12-04 20:48:16,482][04266] Heartbeat connected on LearnerWorker_p0
[2022-12-04 20:48:16,492][04266] Heartbeat connected on InferenceWorker_p0-w0
[2022-12-04 20:48:16,493][04266] Heartbeat connected on RolloutWorker_w0
[2022-12-04 20:48:16,503][04266] Heartbeat connected on RolloutWorker_w2
[2022-12-04 20:48:16,503][04266] Heartbeat connected on RolloutWorker_w1
[2022-12-04 20:48:16,510][04266] Heartbeat connected on RolloutWorker_w4
[2022-12-04 20:48:16,511][04266] Heartbeat connected on RolloutWorker_w3
[2022-12-04 20:48:16,516][04266] Heartbeat connected on RolloutWorker_w5
[2022-12-04 20:48:16,521][04266] Heartbeat connected on RolloutWorker_w6
[2022-12-04 20:48:16,529][04266] Heartbeat connected on RolloutWorker_w7
[2022-12-04 20:48:18,650][04266] Fps is (10 sec: 3686.4, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 36864. Throughput: 0: 1698.1. Samples: 25472. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:18,651][04266] Avg episode reward: [(0, '-169.308')]
[2022-12-04 20:48:18,924][04360] Updated weights for policy 0, policy_version 80 (0.0006)
[2022-12-04 20:48:23,650][04266] Fps is (10 sec: 5734.3, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 65536. Throughput: 0: 2930.0. Samples: 58600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:23,651][04266] Avg episode reward: [(0, '-249.723')]
[2022-12-04 20:48:23,656][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000128_65536.pth...
[2022-12-04 20:48:26,260][04360] Updated weights for policy 0, policy_version 160 (0.0007)
[2022-12-04 20:48:28,650][04266] Fps is (10 sec: 5734.4, 60 sec: 3768.3, 300 sec: 3768.3). Total num frames: 94208. Throughput: 0: 3705.3. Samples: 92632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:28,651][04266] Avg episode reward: [(0, '-89.994')]
[2022-12-04 20:48:33,559][04360] Updated weights for policy 0, policy_version 240 (0.0006)
[2022-12-04 20:48:33,650][04266] Fps is (10 sec: 5734.4, 60 sec: 4096.0, 300 sec: 4096.0). Total num frames: 122880. Throughput: 0: 3641.5. Samples: 109244. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:33,651][04266] Avg episode reward: [(0, '-153.751')]
[2022-12-04 20:48:33,651][04340] Saving new best policy, reward=-153.751!
[2022-12-04 20:48:38,650][04266] Fps is (10 sec: 5324.8, 60 sec: 4213.0, 300 sec: 4213.0). Total num frames: 147456. Throughput: 0: 4093.4. Samples: 143268. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:38,650][04266] Avg episode reward: [(0, '-137.350')]
[2022-12-04 20:48:38,669][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000296_151552.pth...
[2022-12-04 20:48:38,675][04340] Removing /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000000_0.pth
[2022-12-04 20:48:38,675][04340] Saving new best policy, reward=-137.350!
[2022-12-04 20:48:40,889][04360] Updated weights for policy 0, policy_version 320 (0.0006)
[2022-12-04 20:48:43,650][04266] Fps is (10 sec: 5324.8, 60 sec: 4403.2, 300 sec: 4403.2). Total num frames: 176128. Throughput: 0: 4415.1. Samples: 176604. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:43,651][04266] Avg episode reward: [(0, '-69.206')]
[2022-12-04 20:48:43,651][04340] Saving new best policy, reward=-69.206!
[2022-12-04 20:48:48,177][04360] Updated weights for policy 0, policy_version 400 (0.0006)
[2022-12-04 20:48:48,650][04266] Fps is (10 sec: 5734.4, 60 sec: 4551.1, 300 sec: 4551.1). Total num frames: 204800. Throughput: 0: 4290.7. Samples: 193080. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:48,651][04266] Avg episode reward: [(0, '-52.726')]
[2022-12-04 20:48:48,651][04340] Saving new best policy, reward=-52.726!
[2022-12-04 20:48:53,650][04266] Fps is (10 sec: 5734.4, 60 sec: 4669.5, 300 sec: 4669.5). Total num frames: 233472. Throughput: 0: 5054.2. Samples: 227440. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2022-12-04 20:48:53,650][04266] Avg episode reward: [(0, '-33.694')]
[2022-12-04 20:48:53,657][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000456_233472.pth...
[2022-12-04 20:48:53,664][04340] Removing /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000128_65536.pth
[2022-12-04 20:48:53,664][04340] Saving new best policy, reward=-33.694!
[2022-12-04 20:48:55,518][04360] Updated weights for policy 0, policy_version 480 (0.0006)
[2022-12-04 20:48:58,650][04266] Fps is (10 sec: 5734.4, 60 sec: 4766.3, 300 sec: 4766.3). Total num frames: 262144. Throughput: 0: 5586.5. Samples: 259856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:48:58,651][04266] Avg episode reward: [(0, '-45.611')]
[2022-12-04 20:49:03,653][04266] Fps is (10 sec: 4913.5, 60 sec: 4710.1, 300 sec: 4710.1). Total num frames: 282624. Throughput: 0: 5596.9. Samples: 277352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:03,654][04266] Avg episode reward: [(0, '-29.953')]
[2022-12-04 20:49:03,655][04340] Saving new best policy, reward=-29.953!
[2022-12-04 20:49:04,937][04360] Updated weights for policy 0, policy_version 560 (0.0008)
[2022-12-04 20:49:08,650][04266] Fps is (10 sec: 4096.0, 60 sec: 5051.7, 300 sec: 4663.1). Total num frames: 303104. Throughput: 0: 5336.0. Samples: 298720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:08,650][04266] Avg episode reward: [(0, '-29.014')]
[2022-12-04 20:49:08,678][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000600_307200.pth...
[2022-12-04 20:49:08,686][04340] Removing /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000296_151552.pth
[2022-12-04 20:49:08,686][04340] Saving new best policy, reward=-29.014!
[2022-12-04 20:49:12,321][04360] Updated weights for policy 0, policy_version 640 (0.0007)
[2022-12-04 20:49:13,650][04266] Fps is (10 sec: 4916.9, 60 sec: 5393.1, 300 sec: 4739.7). Total num frames: 331776. Throughput: 0: 5326.1. Samples: 332308. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2022-12-04 20:49:13,650][04266] Avg episode reward: [(0, '-0.035')]
[2022-12-04 20:49:13,651][04340] Saving new best policy, reward=-0.035!
[2022-12-04 20:49:18,650][04266] Fps is (10 sec: 5734.4, 60 sec: 5393.1, 300 sec: 4806.0). Total num frames: 360448. Throughput: 0: 5338.0. Samples: 349452. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:18,650][04266] Avg episode reward: [(0, '26.827')]
[2022-12-04 20:49:18,651][04340] Saving new best policy, reward=26.827!
[2022-12-04 20:49:19,490][04360] Updated weights for policy 0, policy_version 720 (0.0006)
[2022-12-04 20:49:23,650][04266] Fps is (10 sec: 5734.3, 60 sec: 5393.1, 300 sec: 4864.0). Total num frames: 389120. Throughput: 0: 5356.0. Samples: 384288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:23,651][04266] Avg episode reward: [(0, '75.358')]
[2022-12-04 20:49:23,656][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000760_389120.pth...
[2022-12-04 20:49:23,665][04340] Removing /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000456_233472.pth
[2022-12-04 20:49:23,665][04340] Saving new best policy, reward=75.358!
[2022-12-04 20:49:26,586][04360] Updated weights for policy 0, policy_version 800 (0.0006)
[2022-12-04 20:49:28,650][04266] Fps is (10 sec: 5734.4, 60 sec: 5393.1, 300 sec: 4915.2). Total num frames: 417792. Throughput: 0: 5375.7. Samples: 418512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:28,650][04266] Avg episode reward: [(0, '153.991')]
[2022-12-04 20:49:28,651][04340] Saving new best policy, reward=153.991!
[2022-12-04 20:49:33,650][04266] Fps is (10 sec: 5734.5, 60 sec: 5393.1, 300 sec: 4960.7). Total num frames: 446464. Throughput: 0: 5396.6. Samples: 435928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:33,650][04266] Avg episode reward: [(0, '231.230')]
[2022-12-04 20:49:33,671][04340] Saving new best policy, reward=231.230!
[2022-12-04 20:49:33,672][04360] Updated weights for policy 0, policy_version 880 (0.0006)
[2022-12-04 20:49:38,650][04266] Fps is (10 sec: 5734.3, 60 sec: 5461.3, 300 sec: 5001.4). Total num frames: 475136. Throughput: 0: 5398.1. Samples: 470356. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2022-12-04 20:49:38,651][04266] Avg episode reward: [(0, '321.313')]
[2022-12-04 20:49:38,656][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000928_475136.pth...
[2022-12-04 20:49:38,664][04340] Removing /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000600_307200.pth
[2022-12-04 20:49:38,665][04340] Saving new best policy, reward=321.313!
[2022-12-04 20:49:40,419][04266] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 4266], exiting...
[2022-12-04 20:49:40,420][04266] Runner profile tree view:
main_loop: 103.9009
[2022-12-04 20:49:40,421][04266] Collected {0: 487424}, FPS: 4691.2
[2022-12-04 20:49:40,421][04340] Stopping Batcher_0...
[2022-12-04 20:49:40,421][04340] Loop batcher_evt_loop terminating...
[2022-12-04 20:49:40,421][04365] Stopping RolloutWorker_w3...
[2022-12-04 20:49:40,422][04365] Loop rollout_proc3_evt_loop terminating...
[2022-12-04 20:49:40,422][04340] Saving /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000952_487424.pth...
[2022-12-04 20:49:40,424][04366] Stopping RolloutWorker_w5...
[2022-12-04 20:49:40,424][04366] Loop rollout_proc5_evt_loop terminating...
[2022-12-04 20:49:40,425][04361] Stopping RolloutWorker_w0...
[2022-12-04 20:49:40,425][04362] Stopping RolloutWorker_w1...
[2022-12-04 20:49:40,426][04363] Stopping RolloutWorker_w6...
[2022-12-04 20:49:40,426][04361] Loop rollout_proc0_evt_loop terminating...
[2022-12-04 20:49:40,426][04362] Loop rollout_proc1_evt_loop terminating...
[2022-12-04 20:49:40,426][04368] Stopping RolloutWorker_w2...
[2022-12-04 20:49:40,426][04363] Loop rollout_proc6_evt_loop terminating...
[2022-12-04 20:49:40,426][04368] Loop rollout_proc2_evt_loop terminating...
[2022-12-04 20:49:40,429][04340] Removing /home/andrew_huggingface_co/sample-factory/train_dir/ant_test/checkpoint_p0/checkpoint_000000760_389120.pth
[2022-12-04 20:49:40,429][04340] Stopping LearnerWorker_p0...
[2022-12-04 20:49:40,430][04340] Loop learner_proc0_evt_loop terminating...
[2022-12-04 20:49:40,436][04360] Weights refcount: 2 0
[2022-12-04 20:49:40,437][04360] Stopping InferenceWorker_p0-w0...
[2022-12-04 20:49:40,438][04360] Loop inference_proc0-0_evt_loop terminating...
[2022-12-04 20:49:40,474][04364] Stopping RolloutWorker_w7...
[2022-12-04 20:49:40,475][04364] Loop rollout_proc7_evt_loop terminating...
[2022-12-04 20:49:40,498][04367] Stopping RolloutWorker_w4...
[2022-12-04 20:49:40,521][04367] Loop rollout_proc4_evt_loop terminating...