sonny-dev committed on
Commit
7f844d4
·
1 Parent(s): d0fc446

First Push

Files changed (3)
  1. README.md +29 -0
  2. config.json +1 -0
  3. configuration.yaml +27 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ library_name: ml-agents
+ tags:
+ - SnowballTarget
+ - deep-reinforcement-learning
+ - reinforcement-learning
+ - ML-Agents-SnowballTarget
+ ---
+
+ # **ppo** Agent playing **SnowballTarget**
+ This is a trained model of a **ppo** agent playing **SnowballTarget** using the [Unity ML-Agents Library](https://github.com/Unity-Technologies/ml-agents).
+
+ ## Usage (with ML-Agents)
+ The Documentation: https://github.com/huggingface/ml-agents#get-started
+ We wrote a complete tutorial to learn to train your first agent using ML-Agents and publish it to the Hub:
+
+
+ ### Resume the training
+ ```
+ mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
+ ```
+ ### Watch your Agent play
+ You can watch your agent **playing directly in your browser:**.
+
+ 1. Go to https://huggingface.co/spaces/unity/ML-Agents-SnowballTarget
+ 2. Step 1: Find your model_id: sonny-dev/ppo-SnowballTargetTESTCOLAB
+ 3. Step 2: Select your *.nn /*.onnx file
+ 4. Click on Watch the agent play 👀
+
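Note on the resume command in the README above: a minimal Python sketch of how that command line could be assembled before handing it to a process launcher. The helper name `build_resume_command` and the example values `configuration.yaml` and `SnowballTarget1` are hypothetical placeholders; only the `mlagents-learn` executable and the `--run-id` and `--resume` flags come from the README itself.

```python
import shlex


def build_resume_command(config_path: str, run_id: str) -> list:
    """Return the argv list for resuming an ML-Agents run (hypothetical helper)."""
    # Mirrors the README: mlagents-learn <config.yaml> --run-id=<run_id> --resume
    return ["mlagents-learn", config_path, f"--run-id={run_id}", "--resume"]


# Placeholder values; substitute your own config path and run id.
cmd = build_resume_command("configuration.yaml", "SnowballTarget1")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` on a machine where `mlagents-learn` is installed.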
config.json ADDED
@@ -0,0 +1 @@
+ {"behaviors": {"SnowballTarget": {"trainer_type": "ppo", "summary_freq": 10000, "keep_checkpoints": 10, "checkpoint_interval": 50000, "max_steps": 200000, "time_horizon": 64, "threaded": true, "hyperparameters": {"learning_rate": 0.0003, "learning_rate_schedule": "linear", "batch_size": 128, "buffer_size": 2048, "beta": 0.005, "epsilon": 0.2, "lambd": 0.95, "num_epoch": 3}, "network_settings": {"normalize": false, "hidden_units": 256, "num_layers": 2, "vis_encode_type": "simple"}, "reward_signals": {"extrinsic": {"gamma": 0.99, "strength": 1.0}}}}}
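Note on config.json: the values above imply a few derived quantities that are easy to check by hand or in code. The sketch below parses the same JSON (reflowed over several lines) and computes them. It assumes the usual PPO loop in which each filled buffer is split into minibatches of `batch_size` and reused for `num_epoch` passes; the variable names are illustrative and not part of ML-Agents.

```python
import json

# Same content as config.json in this commit, reflowed for readability.
CONFIG_JSON = """
{"behaviors": {"SnowballTarget": {
  "trainer_type": "ppo", "summary_freq": 10000, "keep_checkpoints": 10,
  "checkpoint_interval": 50000, "max_steps": 200000, "time_horizon": 64,
  "threaded": true,
  "hyperparameters": {"learning_rate": 0.0003, "learning_rate_schedule": "linear",
                      "batch_size": 128, "buffer_size": 2048, "beta": 0.005,
                      "epsilon": 0.2, "lambd": 0.95, "num_epoch": 3},
  "network_settings": {"normalize": false, "hidden_units": 256,
                       "num_layers": 2, "vis_encode_type": "simple"},
  "reward_signals": {"extrinsic": {"gamma": 0.99, "strength": 1.0}}}}}
"""

cfg = json.loads(CONFIG_JSON)["behaviors"]["SnowballTarget"]
hp = cfg["hyperparameters"]

# Minibatches drawn from one full buffer, and gradient steps per policy update
# (assumes the standard PPO minibatch loop described in the note above).
minibatches_per_epoch = hp["buffer_size"] // hp["batch_size"]
gradient_steps_per_update = minibatches_per_epoch * hp["num_epoch"]

# Checkpoints and summary reports written over the whole run.
checkpoints = cfg["max_steps"] // cfg["checkpoint_interval"]
summaries = cfg["max_steps"] // cfg["summary_freq"]

print(minibatches_per_epoch, gradient_steps_per_update, checkpoints, summaries)
```

Since `keep_checkpoints` is 10 and only 4 checkpoint intervals fit into `max_steps`, this configuration should not need to discard older checkpoints.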
configuration.yaml ADDED
@@ -0,0 +1,27 @@
+ behaviors:
+   SnowballTarget:
+     trainer_type: ppo
+     summary_freq: 10000
+     keep_checkpoints: 10
+     checkpoint_interval: 50000
+     max_steps: 200000
+     time_horizon: 64
+     threaded: true
+     hyperparameters:
+       learning_rate: 0.0003
+       learning_rate_schedule: linear
+       batch_size: 128
+       buffer_size: 2048
+       beta: 0.005
+       epsilon: 0.2
+       lambd: 0.95
+       num_epoch: 3
+     network_settings:
+       normalize: false
+       hidden_units: 256
+       num_layers: 2
+       vis_encode_type: simple
+     reward_signals:
+       extrinsic:
+         gamma: 0.99
+         strength: 1.0
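Note on `learning_rate_schedule: linear`: with this setting the learning rate is expected to shrink linearly from `learning_rate` toward zero as training approaches `max_steps`. The function below is a minimal sketch of that idea using the values from this configuration. It is not the ML-Agents implementation; the function name `linear_lr` and the `floor` parameter are assumptions made for illustration.

```python
def linear_lr(step: int, initial_lr: float = 0.0003,
              max_steps: int = 200000, floor: float = 0.0) -> float:
    """Linearly decayed learning rate (illustrative sketch, not ML-Agents code)."""
    # Fraction of training still remaining, clamped so it never goes negative.
    fraction_left = max(0.0, 1.0 - step / max_steps)
    # `floor` is an assumed lower bound; the library's actual minimum may differ.
    return max(floor, initial_lr * fraction_left)


# Sample the schedule at each checkpoint_interval boundary (every 50000 steps).
for step in range(0, 200001, 50000):
    print(step, linear_lr(step))
```

Under this sketch the rate at the halfway point (step 100000) is half the initial value, and it reaches the floor at `max_steps`.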