2023-05-18 17:32:42 - SimpleLog - INFO: - General Configs:
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Name Value Type
2023-05-18 17:32:42 - SimpleLog - INFO: - env_name gym <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - algo_name NoisyDQN <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - mode train <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - device cpu <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - seed 1 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - max_episode 100 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - max_step 200 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - collect_traj 0 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - mp_backend single <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - n_workers 2 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - n_learners 1 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - share_buffer 1 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - online_eval 1 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - online_eval_episode 10 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - model_save_fre 500 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - load_checkpoint 0 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - load_path Train_single_CartPole-v1_NoisyDQN_20230518-133737 <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - load_model_step best <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Algo Configs:
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Name Value Type
2023-05-18 17:32:42 - SimpleLog - INFO: - epsilon_start 0.95 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - epsilon_end 0.01 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - epsilon_decay 500 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - gamma 0.99 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - lr 0.0001 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - buffer_size 100000 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - batch_size 64 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - target_update 4 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - value_layers [{'layer_type': 'noisy_linear', 'layer_size': [256], 'activation': 'relu', 'std_init': 0.4}, {'layer_type': 'noisy_linear', 'layer_size': [256], 'activation': 'relu', 'std_init': 0.4}] <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - buffer_type REPLAY_QUE <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Env Configs:
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Name Value Type
2023-05-18 17:32:42 - SimpleLog - INFO: - id CartPole-v1 <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - render_mode None <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - wrapper None <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ignore_params ['wrapper', 'ignore_params'] <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - obs_space: Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32), n_actions: Discrete(2)
2023-05-18 17:32:42 - SimpleLog - INFO: - Start training!
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 0, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 1, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 2, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 3, ep_reward: 18.0, ep_step: 18
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 4, ep_reward: 36.0, ep_step: 36
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 5, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 6, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 7, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 8, ep_reward: 16.0, ep_step: 16
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 9, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 10, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 11, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 12, ep_reward: 22.0, ep_step: 22
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 13, ep_reward: 18.0, ep_step: 18
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 14, ep_reward: 20.0, ep_step: 20
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 15, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 16, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 17, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 18, ep_reward: 9.0, ep_step: 9
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 19, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 20, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 21, ep_reward: 17.0, ep_step: 17
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 22, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 23, ep_reward: 25.0, ep_step: 25
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 24, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 25, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 26, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 27, ep_reward: 22.0, ep_step: 22
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 28, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 29, ep_reward: 15.0, ep_step: 15
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 30, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 31, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 32, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 33, ep_reward: 15.0, ep_step: 15
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 34, ep_reward: 16.0, ep_step: 16
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 35, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 36, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 37, ep_reward: 20.0, ep_step: 20
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 38, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 39, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:44 - SimpleLog - INFO: - update_step: 500, online_eval_reward: 10.000
2023-05-18 17:32:44 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 10.000, save the best model!
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 40, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 41, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 42, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 43, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 44, ep_reward: 19.0, ep_step: 19
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 45, ep_reward: 21.0, ep_step: 21
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 46, ep_reward: 24.0, ep_step: 24
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 47, ep_reward: 15.0, ep_step: 15
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 48, ep_reward: 74.0, ep_step: 74
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 49, ep_reward: 37.0, ep_step: 37
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 50, ep_reward: 29.0, ep_step: 29
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 51, ep_reward: 51.0, ep_step: 51
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 52, ep_reward: 62.0, ep_step: 62
2023-05-18 17:32:47 - SimpleLog - INFO: - episode: 53, ep_reward: 75.0, ep_step: 75
2023-05-18 17:32:47 - SimpleLog - INFO: - update_step: 1000, online_eval_reward: 48.000
2023-05-18 17:32:47 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 48.000, save the best model!
2023-05-18 17:32:48 - SimpleLog - INFO: - episode: 54, ep_reward: 150.0, ep_step: 150
2023-05-18 17:32:48 - SimpleLog - INFO: - episode: 55, ep_reward: 118.0, ep_step: 118
2023-05-18 17:32:49 - SimpleLog - INFO: - episode: 56, ep_reward: 154.0, ep_step: 154
2023-05-18 17:32:50 - SimpleLog - INFO: - update_step: 1500, online_eval_reward: 125.000
2023-05-18 17:32:50 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 125.000, save the best model!
2023-05-18 17:32:50 - SimpleLog - INFO: - episode: 57, ep_reward: 157.0, ep_step: 157
2023-05-18 17:32:51 - SimpleLog - INFO: - episode: 58, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:51 - SimpleLog - INFO: - episode: 59, ep_reward: 98.0, ep_step: 98
2023-05-18 17:32:52 - SimpleLog - INFO: - update_step: 2000, online_eval_reward: 146.000
2023-05-18 17:32:52 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 146.000, save the best model!
2023-05-18 17:32:52 - SimpleLog - INFO: - episode: 60, ep_reward: 175.0, ep_step: 175
2023-05-18 17:32:53 - SimpleLog - INFO: - episode: 61, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:54 - SimpleLog - INFO: - episode: 62, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:55 - SimpleLog - INFO: - update_step: 2500, online_eval_reward: 200.000
2023-05-18 17:32:55 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:32:55 - SimpleLog - INFO: - episode: 63, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:56 - SimpleLog - INFO: - episode: 64, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:58 - SimpleLog - INFO: - update_step: 3000, online_eval_reward: 200.000
2023-05-18 17:32:58 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:32:58 - SimpleLog - INFO: - episode: 65, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:59 - SimpleLog - INFO: - episode: 66, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:00 - SimpleLog - INFO: - episode: 67, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:01 - SimpleLog - INFO: - update_step: 3500, online_eval_reward: 200.000
2023-05-18 17:33:01 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:01 - SimpleLog - INFO: - episode: 68, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:02 - SimpleLog - INFO: - episode: 69, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:03 - SimpleLog - INFO: - update_step: 4000, online_eval_reward: 200.000
2023-05-18 17:33:03 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:03 - SimpleLog - INFO: - episode: 70, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:04 - SimpleLog - INFO: - episode: 71, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:05 - SimpleLog - INFO: - episode: 72, ep_reward: 118.0, ep_step: 118
2023-05-18 17:33:06 - SimpleLog - INFO: - update_step: 4500, online_eval_reward: 108.000
2023-05-18 17:33:06 - SimpleLog - INFO: - episode: 73, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:07 - SimpleLog - INFO: - episode: 74, ep_reward: 120.0, ep_step: 120
2023-05-18 17:33:07 - SimpleLog - INFO: - episode: 75, ep_reward: 99.0, ep_step: 99
2023-05-18 17:33:08 - SimpleLog - INFO: - episode: 76, ep_reward: 100.0, ep_step: 100
2023-05-18 17:33:08 - SimpleLog - INFO: - episode: 77, ep_reward: 94.0, ep_step: 94
2023-05-18 17:33:09 - SimpleLog - INFO: - update_step: 5000, online_eval_reward: 99.000
2023-05-18 17:33:09 - SimpleLog - INFO: - episode: 78, ep_reward: 183.0, ep_step: 183
2023-05-18 17:33:10 - SimpleLog - INFO: - episode: 79, ep_reward: 81.0, ep_step: 81
2023-05-18 17:33:10 - SimpleLog - INFO: - episode: 80, ep_reward: 96.0, ep_step: 96
2023-05-18 17:33:11 - SimpleLog - INFO: - episode: 81, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:12 - SimpleLog - INFO: - update_step: 5500, online_eval_reward: 200.000
2023-05-18 17:33:12 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:12 - SimpleLog - INFO: - episode: 82, ep_reward: 125.0, ep_step: 125
2023-05-18 17:33:13 - SimpleLog - INFO: - episode: 83, ep_reward: 101.0, ep_step: 101
2023-05-18 17:33:13 - SimpleLog - INFO: - episode: 84, ep_reward: 72.0, ep_step: 72
2023-05-18 17:33:14 - SimpleLog - INFO: - episode: 85, ep_reward: 65.0, ep_step: 65
2023-05-18 17:33:14 - SimpleLog - INFO: - episode: 86, ep_reward: 82.0, ep_step: 82
2023-05-18 17:33:15 - SimpleLog - INFO: - update_step: 6000, online_eval_reward: 92.000
2023-05-18 17:33:15 - SimpleLog - INFO: - episode: 87, ep_reward: 97.0, ep_step: 97
2023-05-18 17:33:16 - SimpleLog - INFO: - episode: 88, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:17 - SimpleLog - INFO: - episode: 89, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:18 - SimpleLog - INFO: - update_step: 6500, online_eval_reward: 200.000
2023-05-18 17:33:18 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:18 - SimpleLog - INFO: - episode: 90, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:19 - SimpleLog - INFO: - episode: 91, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:21 - SimpleLog - INFO: - update_step: 7000, online_eval_reward: 200.000
2023-05-18 17:33:21 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:21 - SimpleLog - INFO: - episode: 92, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:22 - SimpleLog - INFO: - episode: 93, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:23 - SimpleLog - INFO: - episode: 94, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:24 - SimpleLog - INFO: - update_step: 7500, online_eval_reward: 200.000
2023-05-18 17:33:24 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:24 - SimpleLog - INFO: - episode: 95, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:26 - SimpleLog - INFO: - episode: 96, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:27 - SimpleLog - INFO: - update_step: 8000, online_eval_reward: 200.000
2023-05-18 17:33:27 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:27 - SimpleLog - INFO: - episode: 97, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:28 - SimpleLog - INFO: - episode: 98, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:29 - SimpleLog - INFO: - episode: 99, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:29 - SimpleLog - INFO: - Finish training! total time consumed: 47.23s