2023-05-18 17:32:42 - SimpleLog - INFO: - General Configs:
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Name Value Type
2023-05-18 17:32:42 - SimpleLog - INFO: - env_name gym <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - algo_name NoisyDQN <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - mode train <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - device cpu <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - seed 1 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - max_episode 100 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - max_step 200 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - collect_traj 0 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - mp_backend single <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - n_workers 2 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - n_learners 1 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - share_buffer 1 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - online_eval 1 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - online_eval_episode 10 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - model_save_fre 500 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - load_checkpoint 0 <class 'bool'>
2023-05-18 17:32:42 - SimpleLog - INFO: - load_path Train_single_CartPole-v1_NoisyDQN_20230518-133737 <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - load_model_step best <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Algo Configs:
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Name Value Type
2023-05-18 17:32:42 - SimpleLog - INFO: - epsilon_start 0.95 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - epsilon_end 0.01 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - epsilon_decay 500 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - gamma 0.99 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - lr 0.0001 <class 'float'>
2023-05-18 17:32:42 - SimpleLog - INFO: - buffer_size 100000 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - batch_size 64 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - target_update 4 <class 'int'>
2023-05-18 17:32:42 - SimpleLog - INFO: - value_layers [{'layer_type': 'noisy_linear', 'layer_size': [256], 'activation': 'relu', 'std_init': 0.4}, {'layer_type': 'noisy_linear', 'layer_size': [256], 'activation': 'relu', 'std_init': 0.4}] <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - buffer_type REPLAY_QUE <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Env Configs:
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - Name Value Type
2023-05-18 17:32:42 - SimpleLog - INFO: - id CartPole-v1 <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - render_mode None <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - wrapper None <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ignore_params ['wrapper', 'ignore_params'] <class 'str'>
2023-05-18 17:32:42 - SimpleLog - INFO: - ================================================================================
2023-05-18 17:32:42 - SimpleLog - INFO: - obs_space: Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32), n_actions: Discrete(2)
2023-05-18 17:32:42 - SimpleLog - INFO: - Start training!
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 0, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 1, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 2, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 3, ep_reward: 18.0, ep_step: 18
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 4, ep_reward: 36.0, ep_step: 36
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 5, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 6, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 7, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 8, ep_reward: 16.0, ep_step: 16
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 9, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 10, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:42 - SimpleLog - INFO: - episode: 11, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 12, ep_reward: 22.0, ep_step: 22
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 13, ep_reward: 18.0, ep_step: 18
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 14, ep_reward: 20.0, ep_step: 20
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 15, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 16, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 17, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 18, ep_reward: 9.0, ep_step: 9
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 19, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 20, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 21, ep_reward: 17.0, ep_step: 17
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 22, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 23, ep_reward: 25.0, ep_step: 25
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 24, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 25, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:43 - SimpleLog - INFO: - episode: 26, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 27, ep_reward: 22.0, ep_step: 22
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 28, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 29, ep_reward: 15.0, ep_step: 15
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 30, ep_reward: 10.0, ep_step: 10
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 31, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 32, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 33, ep_reward: 15.0, ep_step: 15
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 34, ep_reward: 16.0, ep_step: 16
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 35, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 36, ep_reward: 13.0, ep_step: 13
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 37, ep_reward: 20.0, ep_step: 20
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 38, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 39, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:44 - SimpleLog - INFO: - update_step: 500, online_eval_reward: 10.000
2023-05-18 17:32:44 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 10.000, save the best model!
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 40, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:44 - SimpleLog - INFO: - episode: 41, ep_reward: 12.0, ep_step: 12
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 42, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 43, ep_reward: 11.0, ep_step: 11
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 44, ep_reward: 19.0, ep_step: 19
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 45, ep_reward: 21.0, ep_step: 21
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 46, ep_reward: 24.0, ep_step: 24
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 47, ep_reward: 15.0, ep_step: 15
2023-05-18 17:32:45 - SimpleLog - INFO: - episode: 48, ep_reward: 74.0, ep_step: 74
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 49, ep_reward: 37.0, ep_step: 37
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 50, ep_reward: 29.0, ep_step: 29
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 51, ep_reward: 51.0, ep_step: 51
2023-05-18 17:32:46 - SimpleLog - INFO: - episode: 52, ep_reward: 62.0, ep_step: 62
2023-05-18 17:32:47 - SimpleLog - INFO: - episode: 53, ep_reward: 75.0, ep_step: 75
2023-05-18 17:32:47 - SimpleLog - INFO: - update_step: 1000, online_eval_reward: 48.000
2023-05-18 17:32:47 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 48.000, save the best model!
2023-05-18 17:32:48 - SimpleLog - INFO: - episode: 54, ep_reward: 150.0, ep_step: 150
2023-05-18 17:32:48 - SimpleLog - INFO: - episode: 55, ep_reward: 118.0, ep_step: 118
2023-05-18 17:32:49 - SimpleLog - INFO: - episode: 56, ep_reward: 154.0, ep_step: 154
2023-05-18 17:32:50 - SimpleLog - INFO: - update_step: 1500, online_eval_reward: 125.000
2023-05-18 17:32:50 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 125.000, save the best model!
2023-05-18 17:32:50 - SimpleLog - INFO: - episode: 57, ep_reward: 157.0, ep_step: 157
2023-05-18 17:32:51 - SimpleLog - INFO: - episode: 58, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:51 - SimpleLog - INFO: - episode: 59, ep_reward: 98.0, ep_step: 98
2023-05-18 17:32:52 - SimpleLog - INFO: - update_step: 2000, online_eval_reward: 146.000
2023-05-18 17:32:52 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 146.000, save the best model!
2023-05-18 17:32:52 - SimpleLog - INFO: - episode: 60, ep_reward: 175.0, ep_step: 175
2023-05-18 17:32:53 - SimpleLog - INFO: - episode: 61, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:54 - SimpleLog - INFO: - episode: 62, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:55 - SimpleLog - INFO: - update_step: 2500, online_eval_reward: 200.000
2023-05-18 17:32:55 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:32:55 - SimpleLog - INFO: - episode: 63, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:56 - SimpleLog - INFO: - episode: 64, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:58 - SimpleLog - INFO: - update_step: 3000, online_eval_reward: 200.000
2023-05-18 17:32:58 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:32:58 - SimpleLog - INFO: - episode: 65, ep_reward: 200.0, ep_step: 200
2023-05-18 17:32:59 - SimpleLog - INFO: - episode: 66, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:00 - SimpleLog - INFO: - episode: 67, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:01 - SimpleLog - INFO: - update_step: 3500, online_eval_reward: 200.000
2023-05-18 17:33:01 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:01 - SimpleLog - INFO: - episode: 68, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:02 - SimpleLog - INFO: - episode: 69, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:03 - SimpleLog - INFO: - update_step: 4000, online_eval_reward: 200.000
2023-05-18 17:33:03 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:03 - SimpleLog - INFO: - episode: 70, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:04 - SimpleLog - INFO: - episode: 71, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:05 - SimpleLog - INFO: - episode: 72, ep_reward: 118.0, ep_step: 118
2023-05-18 17:33:06 - SimpleLog - INFO: - update_step: 4500, online_eval_reward: 108.000
2023-05-18 17:33:06 - SimpleLog - INFO: - episode: 73, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:07 - SimpleLog - INFO: - episode: 74, ep_reward: 120.0, ep_step: 120
2023-05-18 17:33:07 - SimpleLog - INFO: - episode: 75, ep_reward: 99.0, ep_step: 99
2023-05-18 17:33:08 - SimpleLog - INFO: - episode: 76, ep_reward: 100.0, ep_step: 100
2023-05-18 17:33:08 - SimpleLog - INFO: - episode: 77, ep_reward: 94.0, ep_step: 94
2023-05-18 17:33:09 - SimpleLog - INFO: - update_step: 5000, online_eval_reward: 99.000
2023-05-18 17:33:09 - SimpleLog - INFO: - episode: 78, ep_reward: 183.0, ep_step: 183
2023-05-18 17:33:10 - SimpleLog - INFO: - episode: 79, ep_reward: 81.0, ep_step: 81
2023-05-18 17:33:10 - SimpleLog - INFO: - episode: 80, ep_reward: 96.0, ep_step: 96
2023-05-18 17:33:11 - SimpleLog - INFO: - episode: 81, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:12 - SimpleLog - INFO: - update_step: 5500, online_eval_reward: 200.000
2023-05-18 17:33:12 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:12 - SimpleLog - INFO: - episode: 82, ep_reward: 125.0, ep_step: 125
2023-05-18 17:33:13 - SimpleLog - INFO: - episode: 83, ep_reward: 101.0, ep_step: 101
2023-05-18 17:33:13 - SimpleLog - INFO: - episode: 84, ep_reward: 72.0, ep_step: 72
2023-05-18 17:33:14 - SimpleLog - INFO: - episode: 85, ep_reward: 65.0, ep_step: 65
2023-05-18 17:33:14 - SimpleLog - INFO: - episode: 86, ep_reward: 82.0, ep_step: 82
2023-05-18 17:33:15 - SimpleLog - INFO: - update_step: 6000, online_eval_reward: 92.000
2023-05-18 17:33:15 - SimpleLog - INFO: - episode: 87, ep_reward: 97.0, ep_step: 97
2023-05-18 17:33:16 - SimpleLog - INFO: - episode: 88, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:17 - SimpleLog - INFO: - episode: 89, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:18 - SimpleLog - INFO: - update_step: 6500, online_eval_reward: 200.000
2023-05-18 17:33:18 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:18 - SimpleLog - INFO: - episode: 90, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:19 - SimpleLog - INFO: - episode: 91, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:21 - SimpleLog - INFO: - update_step: 7000, online_eval_reward: 200.000
2023-05-18 17:33:21 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:21 - SimpleLog - INFO: - episode: 92, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:22 - SimpleLog - INFO: - episode: 93, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:23 - SimpleLog - INFO: - episode: 94, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:24 - SimpleLog - INFO: - update_step: 7500, online_eval_reward: 200.000
2023-05-18 17:33:24 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:24 - SimpleLog - INFO: - episode: 95, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:26 - SimpleLog - INFO: - episode: 96, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:27 - SimpleLog - INFO: - update_step: 8000, online_eval_reward: 200.000
2023-05-18 17:33:27 - SimpleLog - INFO: - current update step obtain a better online_eval_reward: 200.000, save the best model!
2023-05-18 17:33:27 - SimpleLog - INFO: - episode: 97, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:28 - SimpleLog - INFO: - episode: 98, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:29 - SimpleLog - INFO: - episode: 99, ep_reward: 200.0, ep_step: 200
2023-05-18 17:33:29 - SimpleLog - INFO: - Finish training! total time consumed: 47.23s