2023-05-18 23:22:15 - SimpleLog - INFO: - General Configs:
2023-05-18 23:22:15 - SimpleLog - INFO: - ================================================================================
2023-05-18 23:22:15 - SimpleLog - INFO: - Name Value Type
2023-05-18 23:22:15 - SimpleLog - INFO: - env_name gym <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - algo_name PER_DQN <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - mode train <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - device cuda <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - seed 1 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - max_episode 100 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - max_step 200 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - collect_traj 0 <class 'bool'>
2023-05-18 23:22:15 - SimpleLog - INFO: - mp_backend single <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - n_workers 2 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - n_learners 1 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - share_buffer 1 <class 'bool'>
2023-05-18 23:22:15 - SimpleLog - INFO: - online_eval 1 <class 'bool'>
2023-05-18 23:22:15 - SimpleLog - INFO: - online_eval_episode 10 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - model_save_fre 500 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - load_checkpoint 0 <class 'bool'>
2023-05-18 23:22:15 - SimpleLog - INFO: - load_path Train_single_CartPole-v1_DQN_20230515-211721 <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - load_model_step best <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - ================================================================================
2023-05-18 23:22:15 - SimpleLog - INFO: - Algo Configs:
2023-05-18 23:22:15 - SimpleLog - INFO: - ================================================================================
2023-05-18 23:22:15 - SimpleLog - INFO: - Name Value Type
2023-05-18 23:22:15 - SimpleLog - INFO: - epsilon_start 0.95 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - epsilon_end 0.01 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - epsilon_decay 1000 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - gamma 0.99 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - lr 0.0001 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - buffer_type PER_QUE <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - buffer_size 100000 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - per_alpha 0.6 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - per_beta 0.4 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - per_beta_annealing 0.001 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - per_epsilon 0.01 <class 'float'>
2023-05-18 23:22:15 - SimpleLog - INFO: - batch_size 64 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - target_update 4 <class 'int'>
2023-05-18 23:22:15 - SimpleLog - INFO: - value_layers [{'layer_type': 'linear', 'layer_size': [256], 'activation': 'relu'}, {'layer_type': 'linear', 'layer_size': [256], 'activation': 'relu'}] <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - ================================================================================
2023-05-18 23:22:15 - SimpleLog - INFO: - Env Configs:
2023-05-18 23:22:15 - SimpleLog - INFO: - ================================================================================
2023-05-18 23:22:15 - SimpleLog - INFO: - Name Value Type
2023-05-18 23:22:15 - SimpleLog - INFO: - id CartPole-v1 <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - render_mode None <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - wrapper None <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - ignore_params ['wrapper', 'ignore_params'] <class 'str'>
2023-05-18 23:22:15 - SimpleLog - INFO: - ================================================================================
2023-05-18 23:22:15 - SimpleLog - INFO: - obs_space: Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32), n_actions: Discrete(2)
2023-05-18 23:22:16 - SimpleLog - INFO: - Start training!
2023-05-18 23:22:17 - SimpleLog - INFO: - episode: 0, ep_reward: 34.0, ep_step: 34
2023-05-18 23:22:17 - SimpleLog - INFO: - episode: 1, ep_reward: 14.0, ep_step: 14
2023-05-18 23:22:17 - SimpleLog - INFO: - episode: 2, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:17 - SimpleLog - INFO: - episode: 3, ep_reward: 17.0, ep_step: 17
2023-05-18 23:22:17 - SimpleLog - INFO: - episode: 4, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:17 - SimpleLog - INFO: - episode: 5, ep_reward: 39.0, ep_step: 39
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 6, ep_reward: 28.0, ep_step: 28
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 7, ep_reward: 33.0, ep_step: 33
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 8, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 9, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 10, ep_reward: 20.0, ep_step: 20
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 11, ep_reward: 20.0, ep_step: 20
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 12, ep_reward: 13.0, ep_step: 13
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 13, ep_reward: 19.0, ep_step: 19
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 14, ep_reward: 30.0, ep_step: 30
2023-05-18 23:22:18 - SimpleLog - INFO: - episode: 15, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 16, ep_reward: 20.0, ep_step: 20
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 17, ep_reward: 14.0, ep_step: 14
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 18, ep_reward: 11.0, ep_step: 11
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 19, ep_reward: 21.0, ep_step: 21
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 20, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 21, ep_reward: 18.0, ep_step: 18
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 22, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 23, ep_reward: 24.0, ep_step: 24
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 24, ep_reward: 23.0, ep_step: 23
2023-05-18 23:22:19 - SimpleLog - INFO: - episode: 25, ep_reward: 25.0, ep_step: 25
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 26, ep_reward: 17.0, ep_step: 17
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 27, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:20 - SimpleLog - INFO: - update_step: 500, online_eval_reward: 9.000
2023-05-18 23:22:20 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 9.000, save the best model!
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 28, ep_reward: 17.0, ep_step: 17
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 29, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 30, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 31, ep_reward: 16.0, ep_step: 16
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 32, ep_reward: 11.0, ep_step: 11
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 33, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 34, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 35, ep_reward: 18.0, ep_step: 18
2023-05-18 23:22:20 - SimpleLog - INFO: - episode: 36, ep_reward: 9.0, ep_step: 9
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 37, ep_reward: 50.0, ep_step: 50
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 38, ep_reward: 12.0, ep_step: 12
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 39, ep_reward: 14.0, ep_step: 14
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 40, ep_reward: 15.0, ep_step: 15
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 41, ep_reward: 10.0, ep_step: 10
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 42, ep_reward: 19.0, ep_step: 19
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 43, ep_reward: 13.0, ep_step: 13
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 44, ep_reward: 16.0, ep_step: 16
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 45, ep_reward: 20.0, ep_step: 20
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 46, ep_reward: 16.0, ep_step: 16
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 47, ep_reward: 32.0, ep_step: 32
2023-05-18 23:22:21 - SimpleLog - INFO: - episode: 48, ep_reward: 25.0, ep_step: 25
2023-05-18 23:22:22 - SimpleLog - INFO: - episode: 49, ep_reward: 73.0, ep_step: 73
2023-05-18 23:22:22 - SimpleLog - INFO: - episode: 50, ep_reward: 28.0, ep_step: 28
2023-05-18 23:22:22 - SimpleLog - INFO: - update_step: 1000, online_eval_reward: 75.000
2023-05-18 23:22:22 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 75.000, save the best model!
2023-05-18 23:22:23 - SimpleLog - INFO: - episode: 51, ep_reward: 88.0, ep_step: 88
2023-05-18 23:22:23 - SimpleLog - INFO: - episode: 52, ep_reward: 58.0, ep_step: 58
2023-05-18 23:22:23 - SimpleLog - INFO: - episode: 53, ep_reward: 53.0, ep_step: 53
2023-05-18 23:22:24 - SimpleLog - INFO: - episode: 54, ep_reward: 77.0, ep_step: 77
2023-05-18 23:22:24 - SimpleLog - INFO: - episode: 55, ep_reward: 48.0, ep_step: 48
2023-05-18 23:22:25 - SimpleLog - INFO: - episode: 56, ep_reward: 150.0, ep_step: 150
2023-05-18 23:22:25 - SimpleLog - INFO: - episode: 57, ep_reward: 45.0, ep_step: 45
2023-05-18 23:22:25 - SimpleLog - INFO: - update_step: 1500, online_eval_reward: 68.000
2023-05-18 23:22:25 - SimpleLog - INFO: - episode: 58, ep_reward: 53.0, ep_step: 53
2023-05-18 23:22:26 - SimpleLog - INFO: - episode: 59, ep_reward: 75.0, ep_step: 75
2023-05-18 23:22:26 - SimpleLog - INFO: - episode: 60, ep_reward: 49.0, ep_step: 49
2023-05-18 23:22:27 - SimpleLog - INFO: - episode: 61, ep_reward: 127.0, ep_step: 127
2023-05-18 23:22:27 - SimpleLog - INFO: - episode: 62, ep_reward: 107.0, ep_step: 107
2023-05-18 23:22:28 - SimpleLog - INFO: - episode: 63, ep_reward: 72.0, ep_step: 72
2023-05-18 23:22:28 - SimpleLog - INFO: - update_step: 2000, online_eval_reward: 59.000
2023-05-18 23:22:28 - SimpleLog - INFO: - episode: 64, ep_reward: 70.0, ep_step: 70
2023-05-18 23:22:29 - SimpleLog - INFO: - episode: 65, ep_reward: 54.0, ep_step: 54
2023-05-18 23:22:29 - SimpleLog - INFO: - episode: 66, ep_reward: 49.0, ep_step: 49
2023-05-18 23:22:29 - SimpleLog - INFO: - episode: 67, ep_reward: 56.0, ep_step: 56
2023-05-18 23:22:29 - SimpleLog - INFO: - episode: 68, ep_reward: 69.0, ep_step: 69
2023-05-18 23:22:30 - SimpleLog - INFO: - episode: 69, ep_reward: 70.0, ep_step: 70
2023-05-18 23:22:30 - SimpleLog - INFO: - episode: 70, ep_reward: 65.0, ep_step: 65
2023-05-18 23:22:30 - SimpleLog - INFO: - episode: 71, ep_reward: 57.0, ep_step: 57
2023-05-18 23:22:31 - SimpleLog - INFO: - episode: 72, ep_reward: 50.0, ep_step: 50
2023-05-18 23:22:31 - SimpleLog - INFO: - update_step: 2500, online_eval_reward: 124.000
2023-05-18 23:22:31 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 124.000, save the best model!
2023-05-18 23:22:32 - SimpleLog - INFO: - episode: 73, ep_reward: 82.0, ep_step: 82
2023-05-18 23:22:32 - SimpleLog - INFO: - episode: 74, ep_reward: 74.0, ep_step: 74
2023-05-18 23:22:33 - SimpleLog - INFO: - episode: 75, ep_reward: 93.0, ep_step: 93
2023-05-18 23:22:33 - SimpleLog - INFO: - episode: 76, ep_reward: 80.0, ep_step: 80
2023-05-18 23:22:33 - SimpleLog - INFO: - episode: 77, ep_reward: 56.0, ep_step: 56
2023-05-18 23:22:34 - SimpleLog - INFO: - episode: 78, ep_reward: 87.0, ep_step: 87
2023-05-18 23:22:34 - SimpleLog - INFO: - update_step: 3000, online_eval_reward: 68.000
2023-05-18 23:22:35 - SimpleLog - INFO: - episode: 79, ep_reward: 67.0, ep_step: 67
2023-05-18 23:22:35 - SimpleLog - INFO: - episode: 80, ep_reward: 80.0, ep_step: 80
2023-05-18 23:22:35 - SimpleLog - INFO: - episode: 81, ep_reward: 65.0, ep_step: 65
2023-05-18 23:22:36 - SimpleLog - INFO: - episode: 82, ep_reward: 79.0, ep_step: 79
2023-05-18 23:22:36 - SimpleLog - INFO: - episode: 83, ep_reward: 66.0, ep_step: 66
2023-05-18 23:22:37 - SimpleLog - INFO: - episode: 84, ep_reward: 90.0, ep_step: 90
2023-05-18 23:22:38 - SimpleLog - INFO: - update_step: 3500, online_eval_reward: 146.000
2023-05-18 23:22:38 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 146.000, save the best model!
2023-05-18 23:22:38 - SimpleLog - INFO: - episode: 85, ep_reward: 134.0, ep_step: 134
2023-05-18 23:22:39 - SimpleLog - INFO: - episode: 86, ep_reward: 156.0, ep_step: 156
2023-05-18 23:22:40 - SimpleLog - INFO: - episode: 87, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:41 - SimpleLog - INFO: - update_step: 4000, online_eval_reward: 185.000
2023-05-18 23:22:41 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 185.000, save the best model!
2023-05-18 23:22:42 - SimpleLog - INFO: - episode: 88, ep_reward: 196.0, ep_step: 196
2023-05-18 23:22:43 - SimpleLog - INFO: - episode: 89, ep_reward: 190.0, ep_step: 190
2023-05-18 23:22:44 - SimpleLog - INFO: - episode: 90, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:45 - SimpleLog - INFO: - update_step: 4500, online_eval_reward: 200.000
2023-05-18 23:22:45 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 23:22:46 - SimpleLog - INFO: - episode: 91, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:47 - SimpleLog - INFO: - episode: 92, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:48 - SimpleLog - INFO: - update_step: 5000, online_eval_reward: 200.000
2023-05-18 23:22:48 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 23:22:49 - SimpleLog - INFO: - episode: 93, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:50 - SimpleLog - INFO: - episode: 94, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:51 - SimpleLog - INFO: - episode: 95, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:52 - SimpleLog - INFO: - update_step: 5500, online_eval_reward: 200.000
2023-05-18 23:22:52 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 23:22:53 - SimpleLog - INFO: - episode: 96, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:54 - SimpleLog - INFO: - episode: 97, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:55 - SimpleLog - INFO: - update_step: 6000, online_eval_reward: 200.000
2023-05-18 23:22:55 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 23:22:56 - SimpleLog - INFO: - episode: 98, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:57 - SimpleLog - INFO: - episode: 99, ep_reward: 200.0, ep_step: 200
2023-05-18 23:22:57 - SimpleLog - INFO: - Finish training! total time consumed: 41.45s