ItchyB committed on
Commit 10887b4
1 Parent(s): 3a0c27d

Upload folder using huggingface_hub

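The commit message matches the default used by `huggingface_hub` when a local folder is pushed in a single call. Below is a minimal sketch of what such an upload can look like, assuming the local Sample Factory experiment directory referenced throughout `sf_log.txt` (`train_dir/default_experiment`) and the repo id that appears in the log; the exact invocation that produced this commit is not recorded here.

```python
from huggingface_hub import HfApi

# Illustrative sketch only: push a local Sample Factory experiment folder to the Hub.
# Assumes an authenticated session (e.g. via `huggingface-cli login`) and that the
# folder path below matches the experiment directory seen in sf_log.txt.
api = HfApi()
api.upload_folder(
    folder_path="train_dir/default_experiment",  # assumed local path
    repo_id="ItchyB/rl_course_vizdoom_health_gathering_supreme",
    repo_type="model",
    commit_message="Upload folder using huggingface_hub",
)
```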
.summary/0/events.out.tfevents.1682809515.CAPTAIN-AMERICA ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:359eb8396a006e3b13e3a02a40ea08a7eb499a2a8095355685f5f4a28936eae9
+ size 2612
.summary/0/events.out.tfevents.1682810224.CAPTAIN-AMERICA ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b77fb55b78546d12e0bb5f78c39f4ddfa2ec99cf80e5ec49a22aaca16b5f4ef
+ size 2158
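The two files under `.summary/0/` are TensorBoard event files written during training; only their Git LFS pointers (hash and size) appear in the diff. One way to inspect them after downloading from this repo is sketched below; the `EventAccumulator` usage is illustrative and not part of this commit.

```python
from huggingface_hub import hf_hub_download
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Sketch: fetch one of the event files added in this commit and list its scalar tags.
path = hf_hub_download(
    repo_id="ItchyB/rl_course_vizdoom_health_gathering_supreme",
    filename=".summary/0/events.out.tfevents.1682809515.CAPTAIN-AMERICA",
)
acc = EventAccumulator(path)
acc.Reload()                   # parse the event file
print(acc.Tags()["scalars"])   # names of the logged scalar series
```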
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
  type: doom_health_gathering_supreme
  metrics:
  - type: mean_reward
- value: 3.90 +/- 0.60
+ value: 4.10 +/- 0.49
  name: mean_reward
  verified: false
  ---
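The only model-card change is the reported evaluation metric: `mean_reward` moves from 3.90 +/- 0.60 to 4.10 +/- 0.49, i.e. the mean and standard deviation of per-episode reward over the evaluation run logged in `sf_log.txt` below (10 episodes, `verified: false`). A small illustration of how such a number is formed, using made-up episode returns rather than the actual ones:

```python
import numpy as np

# Hypothetical per-episode returns for a 10-episode evaluation (NOT the real values).
episode_returns = np.array([4.7, 3.8, 4.1, 3.9, 4.6, 3.6, 4.3, 4.0, 4.8, 3.2])
print(f"mean_reward: {episode_returns.mean():.2f} +/- {episode_returns.std():.2f}")
```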
checkpoint_p0/checkpoint_000000980_4014080.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:79287a35aa9bcd16dd35958ecb8ffd006a97ba97324a7a148d07dab54b8b40ea
+ size 34929220
checkpoint_p0/checkpoint_000000982_4022272.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9a379b3012b3684e426e9c3a17dfef0fc6ac13d78a67db96f496ef202ab5f916
+ size 34929220
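The `checkpoint_p0/*.pth` files are the Sample Factory learner checkpoints saved at the end of the runs logged below (train steps 980 and 982); as with the other binaries, only their LFS pointers are shown here. A sketch of how one might pull and inspect the newer checkpoint, assuming nothing about its internal key names:

```python
import torch
from huggingface_hub import hf_hub_download

# Sketch: fetch the real checkpoint behind the LFS pointer and peek at its contents.
# The key names inside the checkpoint dict are not guaranteed; we only list whatever is there.
path = hf_hub_download(
    repo_id="ItchyB/rl_course_vizdoom_health_gathering_supreme",
    filename="checkpoint_p0/checkpoint_000000982_4022272.pth",
)
checkpoint = torch.load(path, map_location="cpu")
print(sorted(checkpoint.keys()))
```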
replay.mp4 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2e1acb924a30b27425c5dcc73ae4ba44ec05f28cea902cafff951ef665970113
- size 5397393
+ oid sha256:d032c87a2efab0e06ac7ac1ce54c8ae2332a4fa365b19ca4bd69561d097bdd8b
+ size 6150210
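The replay video was regenerated by the evaluation run at the end of `sf_log.txt`, so its LFS pointer now records a new hash and size (6,150,210 bytes). Since the `oid` is simply the SHA-256 of the file contents, a downloaded copy can be checked against the pointer; a small sketch:

```python
import hashlib
from huggingface_hub import hf_hub_download

# Sketch: verify a downloaded file against the sha256 oid recorded in its LFS pointer.
path = hf_hub_download(
    repo_id="ItchyB/rl_course_vizdoom_health_gathering_supreme",
    filename="replay.mp4",
)
with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print(digest == "d032c87a2efab0e06ac7ac1ce54c8ae2332a4fa365b19ca4bd69561d097bdd8b")
```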
sf_log.txt CHANGED
@@ -622,3 +622,582 @@ main_loop: 217.2086
622
  [2023-04-27 22:36:26,842][19320] Avg episode rewards: #0: 4.304, true rewards: #0: 3.904
623
  [2023-04-27 22:36:26,842][19320] Avg episode reward: 4.304, avg true_objective: 3.904
624
  [2023-04-27 22:36:30,880][19320] Replay video saved to /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/replay.mp4!
625
+ [2023-04-27 22:36:34,016][19320] The model has been pushed to https://huggingface.co/ItchyB/rl_course_vizdoom_health_gathering_supreme
626
+ [2023-04-29 19:05:17,493][108205] Saving configuration to /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/config.json...
627
+ [2023-04-29 19:05:17,495][108205] Rollout worker 0 uses device cpu
628
+ [2023-04-29 19:05:17,495][108205] Rollout worker 1 uses device cpu
629
+ [2023-04-29 19:05:17,496][108205] Rollout worker 2 uses device cpu
630
+ [2023-04-29 19:05:17,496][108205] Rollout worker 3 uses device cpu
631
+ [2023-04-29 19:05:17,497][108205] Rollout worker 4 uses device cpu
632
+ [2023-04-29 19:05:17,498][108205] Rollout worker 5 uses device cpu
633
+ [2023-04-29 19:05:17,498][108205] Rollout worker 6 uses device cpu
634
+ [2023-04-29 19:05:17,499][108205] Rollout worker 7 uses device cpu
635
+ [2023-04-29 19:05:17,524][108205] Using GPUs [0] for process 0 (actually maps to GPUs [0])
636
+ [2023-04-29 19:05:17,525][108205] InferenceWorker_p0-w0: min num requests: 2
637
+ [2023-04-29 19:05:17,540][108205] Starting all processes...
638
+ [2023-04-29 19:05:17,540][108205] Starting process learner_proc0
639
+ [2023-04-29 19:05:17,634][108205] Starting all processes...
640
+ [2023-04-29 19:05:17,637][108205] Starting process inference_proc0-0
641
+ [2023-04-29 19:05:17,638][108205] Starting process rollout_proc0
642
+ [2023-04-29 19:05:17,638][108205] Starting process rollout_proc1
643
+ [2023-04-29 19:05:17,638][108205] Starting process rollout_proc2
644
+ [2023-04-29 19:05:17,639][108205] Starting process rollout_proc3
645
+ [2023-04-29 19:05:17,639][108205] Starting process rollout_proc4
646
+ [2023-04-29 19:05:17,639][108205] Starting process rollout_proc5
647
+ [2023-04-29 19:05:17,640][108205] Starting process rollout_proc6
648
+ [2023-04-29 19:05:17,640][108205] Starting process rollout_proc7
649
+ [2023-04-29 19:05:18,549][133597] Using GPUs [0] for process 0 (actually maps to GPUs [0])
650
+ [2023-04-29 19:05:18,549][133597] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
651
+ [2023-04-29 19:05:18,577][133617] Worker 6 uses CPU cores [18, 19, 20]
652
+ [2023-04-29 19:05:18,590][133597] Num visible devices: 1
653
+ [2023-04-29 19:05:18,596][133612] Worker 1 uses CPU cores [3, 4, 5]
654
+ [2023-04-29 19:05:18,600][133610] Using GPUs [0] for process 0 (actually maps to GPUs [0])
655
+ [2023-04-29 19:05:18,600][133610] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
656
+ [2023-04-29 19:05:18,614][133615] Worker 4 uses CPU cores [12, 13, 14]
657
+ [2023-04-29 19:05:18,615][133610] Num visible devices: 1
658
+ [2023-04-29 19:05:18,623][133611] Worker 0 uses CPU cores [0, 1, 2]
659
+ [2023-04-29 19:05:18,626][133616] Worker 5 uses CPU cores [15, 16, 17]
660
+ [2023-04-29 19:05:18,626][133613] Worker 3 uses CPU cores [9, 10, 11]
661
+ [2023-04-29 19:05:18,637][133597] Starting seed is not provided
662
+ [2023-04-29 19:05:18,637][133597] Using GPUs [0] for process 0 (actually maps to GPUs [0])
663
+ [2023-04-29 19:05:18,637][133597] Initializing actor-critic model on device cuda:0
664
+ [2023-04-29 19:05:18,638][133597] RunningMeanStd input shape: (3, 72, 128)
665
+ [2023-04-29 19:05:18,638][133597] RunningMeanStd input shape: (1,)
666
+ [2023-04-29 19:05:18,640][133618] Worker 7 uses CPU cores [21, 22, 23]
667
+ [2023-04-29 19:05:18,642][133614] Worker 2 uses CPU cores [6, 7, 8]
668
+ [2023-04-29 19:05:18,648][133597] ConvEncoder: input_channels=3
669
+ [2023-04-29 19:05:18,758][133597] Conv encoder output size: 512
670
+ [2023-04-29 19:05:18,758][133597] Policy head output size: 512
671
+ [2023-04-29 19:05:18,790][133597] Created Actor Critic model with architecture:
672
+ [2023-04-29 19:05:18,790][133597] ActorCriticSharedWeights(
673
+ (obs_normalizer): ObservationNormalizer(
674
+ (running_mean_std): RunningMeanStdDictInPlace(
675
+ (running_mean_std): ModuleDict(
676
+ (obs): RunningMeanStdInPlace()
677
+ )
678
+ )
679
+ )
680
+ (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
681
+ (encoder): VizdoomEncoder(
682
+ (basic_encoder): ConvEncoder(
683
+ (enc): RecursiveScriptModule(
684
+ original_name=ConvEncoderImpl
685
+ (conv_head): RecursiveScriptModule(
686
+ original_name=Sequential
687
+ (0): RecursiveScriptModule(original_name=Conv2d)
688
+ (1): RecursiveScriptModule(original_name=ELU)
689
+ (2): RecursiveScriptModule(original_name=Conv2d)
690
+ (3): RecursiveScriptModule(original_name=ELU)
691
+ (4): RecursiveScriptModule(original_name=Conv2d)
692
+ (5): RecursiveScriptModule(original_name=ELU)
693
+ )
694
+ (mlp_layers): RecursiveScriptModule(
695
+ original_name=Sequential
696
+ (0): RecursiveScriptModule(original_name=Linear)
697
+ (1): RecursiveScriptModule(original_name=ELU)
698
+ )
699
+ )
700
+ )
701
+ )
702
+ (core): ModelCoreRNN(
703
+ (core): GRU(512, 512)
704
+ )
705
+ (decoder): MlpDecoder(
706
+ (mlp): Identity()
707
+ )
708
+ (critic_linear): Linear(in_features=512, out_features=1, bias=True)
709
+ (action_parameterization): ActionParameterizationDefault(
710
+ (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
711
+ )
712
+ )
713
+ [2023-04-29 19:05:20,571][133597] Using optimizer <class 'torch.optim.adam.Adam'>
714
+ [2023-04-29 19:05:20,572][133597] Loading state from checkpoint /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
715
+ [2023-04-29 19:05:20,603][133597] Loading model from checkpoint
716
+ [2023-04-29 19:05:20,607][133597] Loaded experiment state at self.train_step=978, self.env_steps=4005888
717
+ [2023-04-29 19:05:20,607][133597] Initialized policy 0 weights for model version 978
718
+ [2023-04-29 19:05:20,610][133597] LearnerWorker_p0 finished initialization!
719
+ [2023-04-29 19:05:20,610][133597] Using GPUs [0] for process 0 (actually maps to GPUs [0])
720
+ [2023-04-29 19:05:20,727][133610] RunningMeanStd input shape: (3, 72, 128)
721
+ [2023-04-29 19:05:20,728][133610] RunningMeanStd input shape: (1,)
722
+ [2023-04-29 19:05:20,735][133610] ConvEncoder: input_channels=3
723
+ [2023-04-29 19:05:20,794][133610] Conv encoder output size: 512
724
+ [2023-04-29 19:05:20,794][133610] Policy head output size: 512
725
+ [2023-04-29 19:05:20,912][108205] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
726
+ [2023-04-29 19:05:21,937][108205] Inference worker 0-0 is ready!
727
+ [2023-04-29 19:05:21,938][108205] All inference workers are ready! Signal rollout workers to start!
728
+ [2023-04-29 19:05:21,984][133613] Doom resolution: 160x120, resize resolution: (128, 72)
729
+ [2023-04-29 19:05:21,985][133618] Doom resolution: 160x120, resize resolution: (128, 72)
730
+ [2023-04-29 19:05:21,988][133615] Doom resolution: 160x120, resize resolution: (128, 72)
731
+ [2023-04-29 19:05:21,990][133611] Doom resolution: 160x120, resize resolution: (128, 72)
732
+ [2023-04-29 19:05:21,991][133612] Doom resolution: 160x120, resize resolution: (128, 72)
733
+ [2023-04-29 19:05:21,993][133617] Doom resolution: 160x120, resize resolution: (128, 72)
734
+ [2023-04-29 19:05:21,993][133616] Doom resolution: 160x120, resize resolution: (128, 72)
735
+ [2023-04-29 19:05:22,000][133614] Doom resolution: 160x120, resize resolution: (128, 72)
736
+ [2023-04-29 19:05:22,396][133611] Decorrelating experience for 0 frames...
737
+ [2023-04-29 19:05:22,396][133614] Decorrelating experience for 0 frames...
738
+ [2023-04-29 19:05:22,396][133613] Decorrelating experience for 0 frames...
739
+ [2023-04-29 19:05:22,397][133615] Decorrelating experience for 0 frames...
740
+ [2023-04-29 19:05:22,397][133612] Decorrelating experience for 0 frames...
741
+ [2023-04-29 19:05:22,398][133617] Decorrelating experience for 0 frames...
742
+ [2023-04-29 19:05:22,581][133612] Decorrelating experience for 32 frames...
743
+ [2023-04-29 19:05:22,582][133615] Decorrelating experience for 32 frames...
744
+ [2023-04-29 19:05:22,583][133613] Decorrelating experience for 32 frames...
745
+ [2023-04-29 19:05:22,616][133611] Decorrelating experience for 32 frames...
746
+ [2023-04-29 19:05:22,632][133614] Decorrelating experience for 32 frames...
747
+ [2023-04-29 19:05:22,807][133617] Decorrelating experience for 32 frames...
748
+ [2023-04-29 19:05:22,809][133616] Decorrelating experience for 0 frames...
749
+ [2023-04-29 19:05:22,836][133615] Decorrelating experience for 64 frames...
750
+ [2023-04-29 19:05:22,857][133612] Decorrelating experience for 64 frames...
751
+ [2023-04-29 19:05:22,872][133611] Decorrelating experience for 64 frames...
752
+ [2023-04-29 19:05:22,893][133618] Decorrelating experience for 0 frames...
753
+ [2023-04-29 19:05:23,027][133616] Decorrelating experience for 32 frames...
754
+ [2023-04-29 19:05:23,051][133617] Decorrelating experience for 64 frames...
755
+ [2023-04-29 19:05:23,083][133613] Decorrelating experience for 64 frames...
756
+ [2023-04-29 19:05:23,101][133612] Decorrelating experience for 96 frames...
757
+ [2023-04-29 19:05:23,280][133618] Decorrelating experience for 32 frames...
758
+ [2023-04-29 19:05:23,298][133616] Decorrelating experience for 64 frames...
759
+ [2023-04-29 19:05:23,324][133617] Decorrelating experience for 96 frames...
760
+ [2023-04-29 19:05:23,348][133613] Decorrelating experience for 96 frames...
761
+ [2023-04-29 19:05:23,508][133618] Decorrelating experience for 64 frames...
762
+ [2023-04-29 19:05:23,508][133615] Decorrelating experience for 96 frames...
763
+ [2023-04-29 19:05:23,563][133616] Decorrelating experience for 96 frames...
764
+ [2023-04-29 19:05:23,581][133614] Decorrelating experience for 64 frames...
765
+ [2023-04-29 19:05:23,773][133618] Decorrelating experience for 96 frames...
766
+ [2023-04-29 19:05:23,829][133611] Decorrelating experience for 96 frames...
767
+ [2023-04-29 19:05:23,845][133614] Decorrelating experience for 96 frames...
768
+ [2023-04-29 19:05:24,309][133597] Signal inference workers to stop experience collection...
769
+ [2023-04-29 19:05:24,312][133610] InferenceWorker_p0-w0: stopping experience collection
770
+ [2023-04-29 19:05:25,912][108205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 507.2. Samples: 2536. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
771
+ [2023-04-29 19:05:25,912][108205] Avg episode reward: [(0, '2.019')]
772
+ [2023-04-29 19:05:26,586][133597] Signal inference workers to resume experience collection...
773
+ [2023-04-29 19:05:26,587][133610] InferenceWorker_p0-w0: resuming experience collection
774
+ [2023-04-29 19:05:26,587][133597] Stopping Batcher_0...
775
+ [2023-04-29 19:05:26,587][133597] Loop batcher_evt_loop terminating...
776
+ [2023-04-29 19:05:26,593][133613] Stopping RolloutWorker_w3...
777
+ [2023-04-29 19:05:26,593][133616] Stopping RolloutWorker_w5...
778
+ [2023-04-29 19:05:26,594][133613] Loop rollout_proc3_evt_loop terminating...
779
+ [2023-04-29 19:05:26,594][133616] Loop rollout_proc5_evt_loop terminating...
780
+ [2023-04-29 19:05:26,594][133612] Stopping RolloutWorker_w1...
781
+ [2023-04-29 19:05:26,595][133617] Stopping RolloutWorker_w6...
782
+ [2023-04-29 19:05:26,595][133615] Stopping RolloutWorker_w4...
783
+ [2023-04-29 19:05:26,595][133618] Stopping RolloutWorker_w7...
784
+ [2023-04-29 19:05:26,595][133612] Loop rollout_proc1_evt_loop terminating...
785
+ [2023-04-29 19:05:26,595][133614] Stopping RolloutWorker_w2...
786
+ [2023-04-29 19:05:26,595][133615] Loop rollout_proc4_evt_loop terminating...
787
+ [2023-04-29 19:05:26,595][133617] Loop rollout_proc6_evt_loop terminating...
788
+ [2023-04-29 19:05:26,595][133618] Loop rollout_proc7_evt_loop terminating...
789
+ [2023-04-29 19:05:26,595][133611] Stopping RolloutWorker_w0...
790
+ [2023-04-29 19:05:26,595][133614] Loop rollout_proc2_evt_loop terminating...
791
+ [2023-04-29 19:05:26,595][133611] Loop rollout_proc0_evt_loop terminating...
792
+ [2023-04-29 19:05:26,596][133610] Weights refcount: 2 0
793
+ [2023-04-29 19:05:26,597][133610] Stopping InferenceWorker_p0-w0...
794
+ [2023-04-29 19:05:26,598][133610] Loop inference_proc0-0_evt_loop terminating...
795
+ [2023-04-29 19:05:26,598][108205] Component Batcher_0 stopped!
796
+ [2023-04-29 19:05:26,601][108205] Component RolloutWorker_w3 stopped!
797
+ [2023-04-29 19:05:26,602][108205] Component RolloutWorker_w5 stopped!
798
+ [2023-04-29 19:05:26,603][108205] Component RolloutWorker_w1 stopped!
799
+ [2023-04-29 19:05:26,604][108205] Component RolloutWorker_w6 stopped!
800
+ [2023-04-29 19:05:26,604][108205] Component RolloutWorker_w4 stopped!
801
+ [2023-04-29 19:05:26,605][108205] Component RolloutWorker_w7 stopped!
802
+ [2023-04-29 19:05:26,606][108205] Component RolloutWorker_w2 stopped!
803
+ [2023-04-29 19:05:26,607][108205] Component RolloutWorker_w0 stopped!
804
+ [2023-04-29 19:05:26,607][108205] Component InferenceWorker_p0-w0 stopped!
805
+ [2023-04-29 19:05:26,740][108205] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 108205], exiting...
806
+ [2023-04-29 19:05:26,741][108205] Runner profile tree view:
807
+ main_loop: 9.2016
808
+ [2023-04-29 19:05:26,742][108205] Collected {0: 4009984}, FPS: 445.1
809
+ [2023-04-29 19:05:27,185][133597] Saving /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth...
810
+ [2023-04-29 19:05:27,217][133597] Removing /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000517_2117632.pth
811
+ [2023-04-29 19:05:27,218][133597] Saving /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth...
812
+ [2023-04-29 19:05:27,250][133597] Stopping LearnerWorker_p0...
813
+ [2023-04-29 19:05:27,250][133597] Loop learner_proc0_evt_loop terminating...
814
+ [2023-04-29 19:17:05,974][139883] Saving configuration to /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/config.json...
815
+ [2023-04-29 19:17:05,975][139883] Rollout worker 0 uses device cpu
816
+ [2023-04-29 19:17:05,975][139883] Rollout worker 1 uses device cpu
817
+ [2023-04-29 19:17:05,976][139883] Rollout worker 2 uses device cpu
818
+ [2023-04-29 19:17:05,976][139883] Rollout worker 3 uses device cpu
819
+ [2023-04-29 19:17:05,977][139883] Rollout worker 4 uses device cpu
820
+ [2023-04-29 19:17:05,977][139883] Rollout worker 5 uses device cpu
821
+ [2023-04-29 19:17:05,978][139883] Rollout worker 6 uses device cpu
822
+ [2023-04-29 19:17:05,978][139883] Rollout worker 7 uses device cpu
823
+ [2023-04-29 19:17:06,002][139883] Using GPUs [0] for process 0 (actually maps to GPUs [0])
824
+ [2023-04-29 19:17:06,002][139883] InferenceWorker_p0-w0: min num requests: 2
825
+ [2023-04-29 19:17:06,060][139883] Starting all processes...
826
+ [2023-04-29 19:17:06,061][139883] Starting process learner_proc0
827
+ [2023-04-29 19:17:06,110][139883] Starting all processes...
828
+ [2023-04-29 19:17:06,114][139883] Starting process inference_proc0-0
829
+ [2023-04-29 19:17:06,114][139883] Starting process rollout_proc0
830
+ [2023-04-29 19:17:06,114][139883] Starting process rollout_proc1
831
+ [2023-04-29 19:17:06,114][139883] Starting process rollout_proc2
832
+ [2023-04-29 19:17:06,115][139883] Starting process rollout_proc3
833
+ [2023-04-29 19:17:06,115][139883] Starting process rollout_proc4
834
+ [2023-04-29 19:17:06,115][139883] Starting process rollout_proc5
835
+ [2023-04-29 19:17:06,116][139883] Starting process rollout_proc6
836
+ [2023-04-29 19:17:06,116][139883] Starting process rollout_proc7
837
+ [2023-04-29 19:17:06,979][141009] Using GPUs [0] for process 0 (actually maps to GPUs [0])
838
+ [2023-04-29 19:17:06,979][141009] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
839
+ [2023-04-29 19:17:06,997][141024] Worker 1 uses CPU cores [3, 4, 5]
840
+ [2023-04-29 19:17:07,016][141009] Num visible devices: 1
841
+ [2023-04-29 19:17:07,024][141026] Worker 3 uses CPU cores [9, 10, 11]
842
+ [2023-04-29 19:17:07,027][141030] Worker 7 uses CPU cores [21, 22, 23]
843
+ [2023-04-29 19:17:07,038][141028] Worker 4 uses CPU cores [12, 13, 14]
844
+ [2023-04-29 19:17:07,039][141022] Using GPUs [0] for process 0 (actually maps to GPUs [0])
845
+ [2023-04-29 19:17:07,039][141022] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
846
+ [2023-04-29 19:17:07,045][141023] Worker 0 uses CPU cores [0, 1, 2]
847
+ [2023-04-29 19:17:07,053][141022] Num visible devices: 1
848
+ [2023-04-29 19:17:07,054][141009] Starting seed is not provided
849
+ [2023-04-29 19:17:07,054][141009] Using GPUs [0] for process 0 (actually maps to GPUs [0])
850
+ [2023-04-29 19:17:07,054][141009] Initializing actor-critic model on device cuda:0
851
+ [2023-04-29 19:17:07,054][141009] RunningMeanStd input shape: (3, 72, 128)
852
+ [2023-04-29 19:17:07,055][141009] RunningMeanStd input shape: (1,)
853
+ [2023-04-29 19:17:07,059][141027] Worker 5 uses CPU cores [15, 16, 17]
854
+ [2023-04-29 19:17:07,065][141009] ConvEncoder: input_channels=3
855
+ [2023-04-29 19:17:07,066][141025] Worker 2 uses CPU cores [6, 7, 8]
856
+ [2023-04-29 19:17:07,085][141029] Worker 6 uses CPU cores [18, 19, 20]
857
+ [2023-04-29 19:17:07,208][141009] Conv encoder output size: 512
858
+ [2023-04-29 19:17:07,208][141009] Policy head output size: 512
859
+ [2023-04-29 19:17:07,247][141009] Created Actor Critic model with architecture:
860
+ [2023-04-29 19:17:07,247][141009] ActorCriticSharedWeights(
861
+ (obs_normalizer): ObservationNormalizer(
862
+ (running_mean_std): RunningMeanStdDictInPlace(
863
+ (running_mean_std): ModuleDict(
864
+ (obs): RunningMeanStdInPlace()
865
+ )
866
+ )
867
+ )
868
+ (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
869
+ (encoder): VizdoomEncoder(
870
+ (basic_encoder): ConvEncoder(
871
+ (enc): RecursiveScriptModule(
872
+ original_name=ConvEncoderImpl
873
+ (conv_head): RecursiveScriptModule(
874
+ original_name=Sequential
875
+ (0): RecursiveScriptModule(original_name=Conv2d)
876
+ (1): RecursiveScriptModule(original_name=ELU)
877
+ (2): RecursiveScriptModule(original_name=Conv2d)
878
+ (3): RecursiveScriptModule(original_name=ELU)
879
+ (4): RecursiveScriptModule(original_name=Conv2d)
880
+ (5): RecursiveScriptModule(original_name=ELU)
881
+ )
882
+ (mlp_layers): RecursiveScriptModule(
883
+ original_name=Sequential
884
+ (0): RecursiveScriptModule(original_name=Linear)
885
+ (1): RecursiveScriptModule(original_name=ELU)
886
+ )
887
+ )
888
+ )
889
+ )
890
+ (core): ModelCoreRNN(
891
+ (core): GRU(512, 512)
892
+ )
893
+ (decoder): MlpDecoder(
894
+ (mlp): Identity()
895
+ )
896
+ (critic_linear): Linear(in_features=512, out_features=1, bias=True)
897
+ (action_parameterization): ActionParameterizationDefault(
898
+ (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
899
+ )
900
+ )
901
+ [2023-04-29 19:17:08,982][141009] Using optimizer <class 'torch.optim.adam.Adam'>
902
+ [2023-04-29 19:17:08,983][141009] Loading state from checkpoint /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth...
903
+ [2023-04-29 19:17:08,999][141009] Loading model from checkpoint
904
+ [2023-04-29 19:17:09,002][141009] Loaded experiment state at self.train_step=980, self.env_steps=4014080
905
+ [2023-04-29 19:17:09,002][141009] Initialized policy 0 weights for model version 980
906
+ [2023-04-29 19:17:09,005][141009] LearnerWorker_p0 finished initialization!
907
+ [2023-04-29 19:17:09,006][141009] Using GPUs [0] for process 0 (actually maps to GPUs [0])
908
+ [2023-04-29 19:17:09,125][141022] RunningMeanStd input shape: (3, 72, 128)
909
+ [2023-04-29 19:17:09,126][141022] RunningMeanStd input shape: (1,)
910
+ [2023-04-29 19:17:09,133][141022] ConvEncoder: input_channels=3
911
+ [2023-04-29 19:17:09,191][141022] Conv encoder output size: 512
912
+ [2023-04-29 19:17:09,191][141022] Policy head output size: 512
913
+ [2023-04-29 19:17:09,422][139883] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4014080. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
914
+ [2023-04-29 19:17:10,104][139883] Inference worker 0-0 is ready!
915
+ [2023-04-29 19:17:10,104][139883] All inference workers are ready! Signal rollout workers to start!
916
+ [2023-04-29 19:17:10,122][141028] Doom resolution: 160x120, resize resolution: (128, 72)
917
+ [2023-04-29 19:17:10,123][141024] Doom resolution: 160x120, resize resolution: (128, 72)
918
+ [2023-04-29 19:17:10,123][141029] Doom resolution: 160x120, resize resolution: (128, 72)
919
+ [2023-04-29 19:17:10,123][141030] Doom resolution: 160x120, resize resolution: (128, 72)
920
+ [2023-04-29 19:17:10,123][141025] Doom resolution: 160x120, resize resolution: (128, 72)
921
+ [2023-04-29 19:17:10,124][141026] Doom resolution: 160x120, resize resolution: (128, 72)
922
+ [2023-04-29 19:17:10,124][141027] Doom resolution: 160x120, resize resolution: (128, 72)
923
+ [2023-04-29 19:17:10,124][141023] Doom resolution: 160x120, resize resolution: (128, 72)
924
+ [2023-04-29 19:17:10,323][141026] Decorrelating experience for 0 frames...
925
+ [2023-04-29 19:17:10,324][141029] Decorrelating experience for 0 frames...
926
+ [2023-04-29 19:17:10,324][141025] Decorrelating experience for 0 frames...
927
+ [2023-04-29 19:17:10,325][141023] Decorrelating experience for 0 frames...
928
+ [2023-04-29 19:17:10,327][141028] Decorrelating experience for 0 frames...
929
+ [2023-04-29 19:17:10,332][141024] Decorrelating experience for 0 frames...
930
+ [2023-04-29 19:17:10,495][141026] Decorrelating experience for 32 frames...
931
+ [2023-04-29 19:17:10,495][141025] Decorrelating experience for 32 frames...
932
+ [2023-04-29 19:17:10,519][141024] Decorrelating experience for 32 frames...
933
+ [2023-04-29 19:17:10,520][141030] Decorrelating experience for 0 frames...
934
+ [2023-04-29 19:17:10,557][141029] Decorrelating experience for 32 frames...
935
+ [2023-04-29 19:17:10,699][141026] Decorrelating experience for 64 frames...
936
+ [2023-04-29 19:17:10,700][141030] Decorrelating experience for 32 frames...
937
+ [2023-04-29 19:17:10,754][141027] Decorrelating experience for 0 frames...
938
+ [2023-04-29 19:17:10,784][141029] Decorrelating experience for 64 frames...
939
+ [2023-04-29 19:17:10,909][141023] Decorrelating experience for 32 frames...
940
+ [2023-04-29 19:17:10,909][141025] Decorrelating experience for 64 frames...
941
+ [2023-04-29 19:17:10,911][141026] Decorrelating experience for 96 frames...
942
+ [2023-04-29 19:17:10,944][141030] Decorrelating experience for 64 frames...
943
+ [2023-04-29 19:17:10,978][141027] Decorrelating experience for 32 frames...
944
+ [2023-04-29 19:17:11,007][141024] Decorrelating experience for 64 frames...
945
+ [2023-04-29 19:17:11,118][141029] Decorrelating experience for 96 frames...
946
+ [2023-04-29 19:17:11,144][141025] Decorrelating experience for 96 frames...
947
+ [2023-04-29 19:17:11,193][141030] Decorrelating experience for 96 frames...
948
+ [2023-04-29 19:17:11,214][141024] Decorrelating experience for 96 frames...
949
+ [2023-04-29 19:17:11,214][141027] Decorrelating experience for 64 frames...
950
+ [2023-04-29 19:17:11,327][141028] Decorrelating experience for 32 frames...
951
+ [2023-04-29 19:17:11,522][141027] Decorrelating experience for 96 frames...
952
+ [2023-04-29 19:17:11,550][141023] Decorrelating experience for 64 frames...
953
+ [2023-04-29 19:17:11,812][141028] Decorrelating experience for 64 frames...
954
+ [2023-04-29 19:17:11,844][141023] Decorrelating experience for 96 frames...
955
+ [2023-04-29 19:17:11,880][141009] Signal inference workers to stop experience collection...
956
+ [2023-04-29 19:17:11,883][141022] InferenceWorker_p0-w0: stopping experience collection
957
+ [2023-04-29 19:17:12,064][141028] Decorrelating experience for 96 frames...
958
+ [2023-04-29 19:17:13,307][141009] Signal inference workers to resume experience collection...
959
+ [2023-04-29 19:17:13,307][141022] InferenceWorker_p0-w0: resuming experience collection
960
+ [2023-04-29 19:17:13,308][141009] Stopping Batcher_0...
961
+ [2023-04-29 19:17:13,308][141009] Loop batcher_evt_loop terminating...
962
+ [2023-04-29 19:17:13,313][141026] Stopping RolloutWorker_w3...
963
+ [2023-04-29 19:17:13,313][141027] Stopping RolloutWorker_w5...
964
+ [2023-04-29 19:17:13,313][141023] Stopping RolloutWorker_w0...
965
+ [2023-04-29 19:17:13,313][141026] Loop rollout_proc3_evt_loop terminating...
966
+ [2023-04-29 19:17:13,313][141027] Loop rollout_proc5_evt_loop terminating...
967
+ [2023-04-29 19:17:13,313][141023] Loop rollout_proc0_evt_loop terminating...
968
+ [2023-04-29 19:17:13,313][141024] Stopping RolloutWorker_w1...
969
+ [2023-04-29 19:17:13,313][141030] Stopping RolloutWorker_w7...
970
+ [2023-04-29 19:17:13,313][141024] Loop rollout_proc1_evt_loop terminating...
971
+ [2023-04-29 19:17:13,314][141030] Loop rollout_proc7_evt_loop terminating...
972
+ [2023-04-29 19:17:13,313][141025] Stopping RolloutWorker_w2...
973
+ [2023-04-29 19:17:13,314][141025] Loop rollout_proc2_evt_loop terminating...
974
+ [2023-04-29 19:17:13,314][141028] Stopping RolloutWorker_w4...
975
+ [2023-04-29 19:17:13,314][141029] Stopping RolloutWorker_w6...
976
+ [2023-04-29 19:17:13,314][141028] Loop rollout_proc4_evt_loop terminating...
977
+ [2023-04-29 19:17:13,314][141029] Loop rollout_proc6_evt_loop terminating...
978
+ [2023-04-29 19:17:13,314][141022] Weights refcount: 2 0
979
+ [2023-04-29 19:17:13,316][141022] Stopping InferenceWorker_p0-w0...
980
+ [2023-04-29 19:17:13,316][141022] Loop inference_proc0-0_evt_loop terminating...
981
+ [2023-04-29 19:17:13,319][139883] Component Batcher_0 stopped!
982
+ [2023-04-29 19:17:13,322][139883] Component RolloutWorker_w3 stopped!
983
+ [2023-04-29 19:17:13,322][139883] Component RolloutWorker_w5 stopped!
984
+ [2023-04-29 19:17:13,323][139883] Component RolloutWorker_w0 stopped!
985
+ [2023-04-29 19:17:13,324][139883] Component RolloutWorker_w1 stopped!
986
+ [2023-04-29 19:17:13,325][139883] Component RolloutWorker_w7 stopped!
987
+ [2023-04-29 19:17:13,325][139883] Component RolloutWorker_w2 stopped!
988
+ [2023-04-29 19:17:13,326][139883] Component RolloutWorker_w4 stopped!
989
+ [2023-04-29 19:17:13,327][139883] Component RolloutWorker_w6 stopped!
990
+ [2023-04-29 19:17:13,328][139883] Component InferenceWorker_p0-w0 stopped!
991
+ [2023-04-29 19:17:13,811][141009] Saving /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000982_4022272.pth...
992
+ [2023-04-29 19:17:13,840][141009] Removing /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth
993
+ [2023-04-29 19:17:13,842][141009] Saving /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000982_4022272.pth...
994
+ [2023-04-29 19:17:13,873][141009] Stopping LearnerWorker_p0...
995
+ [2023-04-29 19:17:13,874][141009] Loop learner_proc0_evt_loop terminating...
996
+ [2023-04-29 19:17:13,874][139883] Component LearnerWorker_p0 stopped!
997
+ [2023-04-29 19:17:13,874][139883] Waiting for process learner_proc0 to stop...
998
+ [2023-04-29 19:17:14,380][139883] Waiting for process inference_proc0-0 to join...
999
+ [2023-04-29 19:17:14,381][139883] Waiting for process rollout_proc0 to join...
1000
+ [2023-04-29 19:17:14,382][139883] Waiting for process rollout_proc1 to join...
1001
+ [2023-04-29 19:17:14,382][139883] Waiting for process rollout_proc2 to join...
1002
+ [2023-04-29 19:17:14,383][139883] Waiting for process rollout_proc3 to join...
1003
+ [2023-04-29 19:17:14,383][139883] Waiting for process rollout_proc4 to join...
1004
+ [2023-04-29 19:17:14,384][139883] Waiting for process rollout_proc5 to join...
1005
+ [2023-04-29 19:17:14,385][139883] Waiting for process rollout_proc6 to join...
1006
+ [2023-04-29 19:17:14,385][139883] Waiting for process rollout_proc7 to join...
1007
+ [2023-04-29 19:17:14,386][139883] Batcher 0 profile tree view:
1008
+ batching: 0.0301, releasing_batches: 0.0006
1009
+ [2023-04-29 19:17:14,386][139883] InferenceWorker_p0-w0 profile tree view:
1010
+ update_model: 0.0040
1011
+ wait_policy: 0.0000
1012
+ wait_policy_total: 0.8549
1013
+ one_step: 0.0023
1014
+ handle_policy_step: 0.8927
1015
+ deserialize: 0.0172, stack: 0.0021, obs_to_device_normalize: 0.1624, forward: 0.5764, send_messages: 0.0347
1016
+ prepare_outputs: 0.0846
1017
+ to_cpu: 0.0663
1018
+ [2023-04-29 19:17:14,387][139883] Learner 0 profile tree view:
1019
+ misc: 0.0000, prepare_batch: 1.5413
1020
+ train: 0.6188
1021
+ epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0004, kl_divergence: 0.0005, after_optimizer: 0.0074
1022
+ calculate_losses: 0.0568
1023
+ losses_init: 0.0000, forward_head: 0.0480, bptt_initial: 0.0040, tail: 0.0008, advantages_returns: 0.0005, losses: 0.0018
1024
+ bptt: 0.0015
1025
+ bptt_forward_core: 0.0014
1026
+ update: 0.5533
1027
+ clip: 0.0038
1028
+ [2023-04-29 19:17:14,387][139883] RolloutWorker_w0 profile tree view:
1029
+ wait_for_trajectories: 0.0001, enqueue_policy_requests: 0.0003
1030
+ [2023-04-29 19:17:14,388][139883] RolloutWorker_w7 profile tree view:
1031
+ wait_for_trajectories: 0.0004, enqueue_policy_requests: 0.0197, env_step: 0.3010, overhead: 0.0222, complete_rollouts: 0.0005
1032
+ save_policy_outputs: 0.0155
1033
+ split_output_tensors: 0.0077
1034
+ [2023-04-29 19:17:14,389][139883] Loop Runner_EvtLoop terminating...
1035
+ [2023-04-29 19:17:14,389][139883] Runner profile tree view:
1036
+ main_loop: 8.3291
1037
+ [2023-04-29 19:17:14,390][139883] Collected {0: 4022272}, FPS: 983.5
1038
+ [2023-04-29 19:17:14,480][139883] Loading existing experiment configuration from /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/config.json
1039
+ [2023-04-29 19:17:14,481][139883] Overriding arg 'num_workers' with value 1 passed from command line
1040
+ [2023-04-29 19:17:14,482][139883] Adding new argument 'no_render'=True that is not in the saved config file!
1041
+ [2023-04-29 19:17:14,482][139883] Adding new argument 'save_video'=True that is not in the saved config file!
1042
+ [2023-04-29 19:17:14,483][139883] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
1043
+ [2023-04-29 19:17:14,483][139883] Adding new argument 'video_name'=None that is not in the saved config file!
1044
+ [2023-04-29 19:17:14,484][139883] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
1045
+ [2023-04-29 19:17:14,485][139883] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
1046
+ [2023-04-29 19:17:14,485][139883] Adding new argument 'push_to_hub'=False that is not in the saved config file!
1047
+ [2023-04-29 19:17:14,486][139883] Adding new argument 'hf_repository'=None that is not in the saved config file!
1048
+ [2023-04-29 19:17:14,486][139883] Adding new argument 'policy_index'=0 that is not in the saved config file!
1049
+ [2023-04-29 19:17:14,487][139883] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
1050
+ [2023-04-29 19:17:14,487][139883] Adding new argument 'train_script'=None that is not in the saved config file!
1051
+ [2023-04-29 19:17:14,488][139883] Adding new argument 'enjoy_script'=None that is not in the saved config file!
1052
+ [2023-04-29 19:17:14,488][139883] Using frameskip 1 and render_action_repeat=4 for evaluation
1053
+ [2023-04-29 19:17:14,494][139883] Doom resolution: 160x120, resize resolution: (128, 72)
1054
+ [2023-04-29 19:17:14,495][139883] RunningMeanStd input shape: (3, 72, 128)
1055
+ [2023-04-29 19:17:14,496][139883] RunningMeanStd input shape: (1,)
1056
+ [2023-04-29 19:17:14,503][139883] ConvEncoder: input_channels=3
1057
+ [2023-04-29 19:17:14,588][139883] Conv encoder output size: 512
1058
+ [2023-04-29 19:17:14,589][139883] Policy head output size: 512
1059
+ [2023-04-29 19:17:16,265][139883] Loading state from checkpoint /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000982_4022272.pth...
1060
+ [2023-04-29 19:17:17,030][139883] Num frames 100...
1061
+ [2023-04-29 19:17:17,120][139883] Num frames 200...
1062
+ [2023-04-29 19:17:17,212][139883] Num frames 300...
1063
+ [2023-04-29 19:17:17,344][139883] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
1064
+ [2023-04-29 19:17:17,345][139883] Avg episode reward: 3.840, avg true_objective: 3.840
1065
+ [2023-04-29 19:17:17,361][139883] Num frames 400...
1066
+ [2023-04-29 19:17:17,454][139883] Num frames 500...
1067
+ [2023-04-29 19:17:17,541][139883] Num frames 600...
1068
+ [2023-04-29 19:17:17,628][139883] Num frames 700...
1069
+ [2023-04-29 19:17:17,715][139883] Num frames 800...
1070
+ [2023-04-29 19:17:17,815][139883] Num frames 900...
1071
+ [2023-04-29 19:17:17,902][139883] Avg episode rewards: #0: 5.640, true rewards: #0: 4.640
1072
+ [2023-04-29 19:17:17,903][139883] Avg episode reward: 5.640, avg true_objective: 4.640
1073
+ [2023-04-29 19:17:17,978][139883] Num frames 1000...
1074
+ [2023-04-29 19:17:18,072][139883] Num frames 1100...
1075
+ [2023-04-29 19:17:18,162][139883] Num frames 1200...
1076
+ [2023-04-29 19:17:18,256][139883] Num frames 1300...
1077
+ [2023-04-29 19:17:18,380][139883] Avg episode rewards: #0: 5.587, true rewards: #0: 4.587
1078
+ [2023-04-29 19:17:18,381][139883] Avg episode reward: 5.587, avg true_objective: 4.587
1079
+ [2023-04-29 19:17:18,406][139883] Num frames 1400...
1080
+ [2023-04-29 19:17:18,494][139883] Num frames 1500...
1081
+ [2023-04-29 19:17:18,586][139883] Num frames 1600...
1082
+ [2023-04-29 19:17:18,682][139883] Num frames 1700...
1083
+ [2023-04-29 19:17:18,797][139883] Avg episode rewards: #0: 5.150, true rewards: #0: 4.400
1084
+ [2023-04-29 19:17:18,797][139883] Avg episode reward: 5.150, avg true_objective: 4.400
1085
+ [2023-04-29 19:17:18,838][139883] Num frames 1800...
1086
+ [2023-04-29 19:17:18,947][139883] Num frames 1900...
1087
+ [2023-04-29 19:17:19,053][139883] Num frames 2000...
1088
+ [2023-04-29 19:17:19,150][139883] Num frames 2100...
1089
+ [2023-04-29 19:17:19,247][139883] Avg episode rewards: #0: 4.888, true rewards: #0: 4.288
1090
+ [2023-04-29 19:17:19,248][139883] Avg episode reward: 4.888, avg true_objective: 4.288
1091
+ [2023-04-29 19:17:19,305][139883] Num frames 2200...
1092
+ [2023-04-29 19:17:19,397][139883] Num frames 2300...
1093
+ [2023-04-29 19:17:19,486][139883] Num frames 2400...
1094
+ [2023-04-29 19:17:19,573][139883] Num frames 2500...
1095
+ [2023-04-29 19:17:19,653][139883] Avg episode rewards: #0: 4.713, true rewards: #0: 4.213
1096
+ [2023-04-29 19:17:19,654][139883] Avg episode reward: 4.713, avg true_objective: 4.213
1097
+ [2023-04-29 19:17:19,722][139883] Num frames 2600...
1098
+ [2023-04-29 19:17:19,815][139883] Num frames 2700...
1099
+ [2023-04-29 19:17:19,907][139883] Num frames 2800...
1100
+ [2023-04-29 19:17:20,017][139883] Num frames 2900...
1101
+ [2023-04-29 19:17:20,095][139883] Avg episode rewards: #0: 4.589, true rewards: #0: 4.160
1102
+ [2023-04-29 19:17:20,096][139883] Avg episode reward: 4.589, avg true_objective: 4.160
1103
+ [2023-04-29 19:17:20,184][139883] Num frames 3000...
1104
+ [2023-04-29 19:17:20,285][139883] Num frames 3100...
1105
+ [2023-04-29 19:17:20,374][139883] Num frames 3200...
1106
+ [2023-04-29 19:17:20,466][139883] Num frames 3300...
1107
+ [2023-04-29 19:17:20,601][139883] Avg episode rewards: #0: 4.865, true rewards: #0: 4.240
1108
+ [2023-04-29 19:17:20,602][139883] Avg episode reward: 4.865, avg true_objective: 4.240
1109
+ [2023-04-29 19:17:20,610][139883] Num frames 3400...
1110
+ [2023-04-29 19:17:20,698][139883] Num frames 3500...
1111
+ [2023-04-29 19:17:20,787][139883] Num frames 3600...
1112
+ [2023-04-29 19:17:20,883][139883] Num frames 3700...
1113
+ [2023-04-29 19:17:20,973][139883] Num frames 3800...
1114
+ [2023-04-29 19:17:21,035][139883] Avg episode rewards: #0: 4.898, true rewards: #0: 4.231
1115
+ [2023-04-29 19:17:21,036][139883] Avg episode reward: 4.898, avg true_objective: 4.231
1116
+ [2023-04-29 19:17:21,129][139883] Num frames 3900...
1117
+ [2023-04-29 19:17:21,226][139883] Num frames 4000...
1118
+ [2023-04-29 19:17:21,322][139883] Num frames 4100...
1119
+ [2023-04-29 19:17:21,464][139883] Avg episode rewards: #0: 4.992, true rewards: #0: 4.192
1120
+ [2023-04-29 19:17:21,465][139883] Avg episode reward: 4.992, avg true_objective: 4.192
1121
+ [2023-04-29 19:17:26,026][139883] Replay video saved to /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/replay.mp4!
1122
+ [2023-04-29 19:19:04,743][139883] Loading existing experiment configuration from /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/config.json
1123
+ [2023-04-29 19:19:04,743][139883] Overriding arg 'num_workers' with value 1 passed from command line
1124
+ [2023-04-29 19:19:04,744][139883] Adding new argument 'no_render'=True that is not in the saved config file!
1125
+ [2023-04-29 19:19:04,744][139883] Adding new argument 'save_video'=True that is not in the saved config file!
1126
+ [2023-04-29 19:19:04,745][139883] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
1127
+ [2023-04-29 19:19:04,746][139883] Adding new argument 'video_name'=None that is not in the saved config file!
1128
+ [2023-04-29 19:19:04,746][139883] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
1129
+ [2023-04-29 19:19:04,747][139883] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
1130
+ [2023-04-29 19:19:04,747][139883] Adding new argument 'push_to_hub'=True that is not in the saved config file!
1131
+ [2023-04-29 19:19:04,748][139883] Adding new argument 'hf_repository'='ItchyB/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
1132
+ [2023-04-29 19:19:04,748][139883] Adding new argument 'policy_index'=0 that is not in the saved config file!
1133
+ [2023-04-29 19:19:04,749][139883] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
1134
+ [2023-04-29 19:19:04,750][139883] Adding new argument 'train_script'=None that is not in the saved config file!
1135
+ [2023-04-29 19:19:04,750][139883] Adding new argument 'enjoy_script'=None that is not in the saved config file!
1136
+ [2023-04-29 19:19:04,751][139883] Using frameskip 1 and render_action_repeat=4 for evaluation
1137
+ [2023-04-29 19:19:04,754][139883] RunningMeanStd input shape: (3, 72, 128)
1138
+ [2023-04-29 19:19:04,755][139883] RunningMeanStd input shape: (1,)
1139
+ [2023-04-29 19:19:04,761][139883] ConvEncoder: input_channels=3
1140
+ [2023-04-29 19:19:04,782][139883] Conv encoder output size: 512
1141
+ [2023-04-29 19:19:04,783][139883] Policy head output size: 512
1142
+ [2023-04-29 19:19:04,801][139883] Loading state from checkpoint /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/checkpoint_p0/checkpoint_000000982_4022272.pth...
1143
+ [2023-04-29 19:19:05,173][139883] Num frames 100...
1144
+ [2023-04-29 19:19:05,315][139883] Num frames 200...
1145
+ [2023-04-29 19:19:05,452][139883] Num frames 300...
1146
+ [2023-04-29 19:19:05,634][139883] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
1147
+ [2023-04-29 19:19:05,635][139883] Avg episode reward: 3.840, avg true_objective: 3.840
1148
+ [2023-04-29 19:19:05,658][139883] Num frames 400...
1149
+ [2023-04-29 19:19:05,797][139883] Num frames 500...
1150
+ [2023-04-29 19:19:05,955][139883] Num frames 600...
1151
+ [2023-04-29 19:19:06,130][139883] Num frames 700...
1152
+ [2023-04-29 19:19:06,262][139883] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840
1153
+ [2023-04-29 19:19:06,263][139883] Avg episode reward: 3.840, avg true_objective: 3.840
1154
+ [2023-04-29 19:19:06,308][139883] Num frames 800...
1155
+ [2023-04-29 19:19:06,440][139883] Num frames 900...
1156
+ [2023-04-29 19:19:06,579][139883] Num frames 1000...
1157
+ [2023-04-29 19:19:06,697][139883] Num frames 1100...
1158
+ [2023-04-29 19:19:06,855][139883] Avg episode rewards: #0: 3.947, true rewards: #0: 3.947
1159
+ [2023-04-29 19:19:06,856][139883] Avg episode reward: 3.947, avg true_objective: 3.947
1160
+ [2023-04-29 19:19:06,874][139883] Num frames 1200...
1161
+ [2023-04-29 19:19:06,994][139883] Num frames 1300...
1162
+ [2023-04-29 19:19:07,103][139883] Num frames 1400...
1163
+ [2023-04-29 19:19:07,216][139883] Num frames 1500...
1164
+ [2023-04-29 19:19:07,344][139883] Avg episode rewards: #0: 3.920, true rewards: #0: 3.920
1165
+ [2023-04-29 19:19:07,345][139883] Avg episode reward: 3.920, avg true_objective: 3.920
1166
+ [2023-04-29 19:19:07,381][139883] Num frames 1600...
1167
+ [2023-04-29 19:19:07,495][139883] Num frames 1700...
1168
+ [2023-04-29 19:19:07,612][139883] Num frames 1800...
1169
+ [2023-04-29 19:19:07,728][139883] Num frames 1900...
1170
+ [2023-04-29 19:19:07,843][139883] Avg episode rewards: #0: 3.904, true rewards: #0: 3.904
1171
+ [2023-04-29 19:19:07,844][139883] Avg episode reward: 3.904, avg true_objective: 3.904
1172
+ [2023-04-29 19:19:07,913][139883] Num frames 2000...
1173
+ [2023-04-29 19:19:08,031][139883] Num frames 2100...
1174
+ [2023-04-29 19:19:08,140][139883] Num frames 2200...
1175
+ [2023-04-29 19:19:08,252][139883] Num frames 2300...
1176
+ [2023-04-29 19:19:08,373][139883] Num frames 2400...
1177
+ [2023-04-29 19:19:08,531][139883] Avg episode rewards: #0: 4.493, true rewards: #0: 4.160
1178
+ [2023-04-29 19:19:08,532][139883] Avg episode reward: 4.493, avg true_objective: 4.160
1179
+ [2023-04-29 19:19:08,539][139883] Num frames 2500...
1180
+ [2023-04-29 19:19:08,658][139883] Num frames 2600...
1181
+ [2023-04-29 19:19:08,778][139883] Num frames 2700...
1182
+ [2023-04-29 19:19:08,915][139883] Num frames 2800...
1183
+ [2023-04-29 19:19:09,064][139883] Avg episode rewards: #0: 4.400, true rewards: #0: 4.114
1184
+ [2023-04-29 19:19:09,064][139883] Avg episode reward: 4.400, avg true_objective: 4.114
1185
+ [2023-04-29 19:19:09,089][139883] Num frames 2900...
1186
+ [2023-04-29 19:19:09,216][139883] Num frames 3000...
1187
+ [2023-04-29 19:19:09,334][139883] Num frames 3100...
1188
+ [2023-04-29 19:19:09,449][139883] Num frames 3200...
1189
+ [2023-04-29 19:19:09,580][139883] Num frames 3300...
1190
+ [2023-04-29 19:19:09,671][139883] Avg episode rewards: #0: 4.535, true rewards: #0: 4.160
1191
+ [2023-04-29 19:19:09,671][139883] Avg episode reward: 4.535, avg true_objective: 4.160
1192
+ [2023-04-29 19:19:09,763][139883] Num frames 3400...
1193
+ [2023-04-29 19:19:09,885][139883] Num frames 3500...
1194
+ [2023-04-29 19:19:10,010][139883] Num frames 3600...
1195
+ [2023-04-29 19:19:10,129][139883] Num frames 3700...
1196
+ [2023-04-29 19:19:10,196][139883] Avg episode rewards: #0: 4.458, true rewards: #0: 4.124
1197
+ [2023-04-29 19:19:10,197][139883] Avg episode reward: 4.458, avg true_objective: 4.124
1198
+ [2023-04-29 19:19:10,307][139883] Num frames 3800...
1199
+ [2023-04-29 19:19:10,426][139883] Num frames 3900...
1200
+ [2023-04-29 19:19:10,533][139883] Num frames 4000...
1201
+ [2023-04-29 19:19:10,685][139883] Avg episode rewards: #0: 4.396, true rewards: #0: 4.096
1202
+ [2023-04-29 19:19:10,686][139883] Avg episode reward: 4.396, avg true_objective: 4.096
1203
+ [2023-04-29 19:19:15,388][139883] Replay video saved to /home/byron/projects/rl-learning-course/unit-08/train_dir/default_experiment/replay.mp4!