igorcheb committed on
Commit efb8a0a
1 Parent(s): 0a2c366

Update README.md

  1. README.md +19 -0
README.md CHANGED
@@ -29,3 +29,22 @@ Training progress:
  Numbers on the X axis are averages over 40 episodes, each lasting about 500 timesteps on average. So in total the agent was trained for about 5e6 timesteps.
  Learning rate decay schedule: <code>torch.optim.lr_scheduler.StepLR(opt, step_size=4000, gamma=0.7)</code>
 
+ Minimal code to use the agent:
+ <pre><code>
+ import gym
+ import torch
+ env_name = 'LunarLanderContinuous-v2'
+ env = gym.make(env_name)
+ agent = torch.load('best_models/best_reinforce_lunar_lander_cont_model_269.402.pt')  # unpickles the whole agent object
+ render = True
+ observation = env.reset()  # classic gym API (versions before 0.26): reset() returns only the observation
+ while True:
+     if render:
+         env.render()
+     action = agent.act(observation)
+     observation, reward, done, info = env.step(action)  # classic 4-tuple step() result
+ 
+     if done:
+         break
+ env.close()
+ </code></pre>
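
The learning rate decay schedule quoted in the README, `StepLR(opt, step_size=4000, gamma=0.7)`, multiplies the learning rate by `gamma` once every `step_size` calls to the scheduler. The sketch below shows the closed form of that schedule in plain Python. The helper name `step_lr` and the initial learning rate `BASE_LR = 1e-3` are assumptions for illustration only; the README does not state the initial learning rate or how often the scheduler was stepped.

```python
def step_lr(base_lr: float, scheduler_steps: int,
            step_size: int = 4000, gamma: float = 0.7) -> float:
    """Learning rate after `scheduler_steps` scheduler steps under a StepLR-style decay.

    The rate is multiplied by `gamma` once for every full `step_size` steps.
    """
    return base_lr * gamma ** (scheduler_steps // step_size)


# Hypothetical initial learning rate, not taken from the README.
BASE_LR = 1e-3

for steps in (0, 3999, 4000, 8000, 40000):
    print(steps, step_lr(BASE_LR, steps))
```

With these parameters the rate stays constant for the first 3999 scheduler steps and drops to 70% of its previous value at each multiple of 4000.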