metadata
library_name: stable-baselines3
tags:
- PandaReachDense-v3
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
model-index:
- name: A2C
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: PandaReachDense-v3
      type: PandaReachDense-v3
    metrics:
    - type: mean_reward
      value: '-0.20 +/- 0.09'
      name: mean_reward
      verified: false
A2C Agent playing PandaReachDense-v3
This is a trained model of an A2C agent playing PandaReachDense-v3, built with the stable-baselines3 library.
Usage (with Stable-baselines3)
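from stable_baselines3 import A2C
from huggingface_sb3 import load_from_hub

# load_from_hub downloads the checkpoint and returns its local file path
checkpoint = load_from_hub(repo_id='Francesco-A/a2c-PandaReachDense-v3',
                           filename='a2c-PandaReachDense-v3.zip')
# Build the agent from the downloaded archive
model = A2C.load(checkpoint)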
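Once an `A2C` model has been loaded from the downloaded checkpoint, a single evaluation episode might look like the sketch below. `run_episode` and its `max_steps` cap are hypothetical names introduced here for illustration, not part of stable-baselines3. The sketch assumes a Gymnasium-style environment, where `reset()` returns `(obs, info)` and `step()` returns five values. This card does not say whether observations were normalized during training; if they were, the same normalization statistics would be needed at evaluation time for the rewards to be meaningful.

```python
def run_episode(model, env, max_steps=50):
    """Roll out one episode deterministically and return the summed reward.

    `model` needs a predict(obs, deterministic=...) method and `env` must
    follow the Gymnasium reset/step API (both are assumptions of this sketch).
    """
    obs, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # predict() returns (action, recurrent_state); the state is unused here
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward


# Typical call, assuming panda_gym and gymnasium are installed:
#   import gymnasium as gym
#   import panda_gym  # importing registers the Panda environments
#   env = gym.make("PandaReachDense-v3")
#   print(run_episode(model, env))
```

Averaging the returned value over many episodes gives a figure comparable in kind to the `mean_reward` reported in the metadata.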
Training details (last output)
Metric | Value |
---|---|
rollout/ep_len_mean | 4.05 |
rollout/ep_rew_mean | -0.317 |
time/fps | 378 |
time/iterations | 50000 |
time/time_elapsed | 2641 |
time/total_timesteps | 1000000 |
train/entropy_loss | 1.25 |
train/explained_variance | 0.975 |
train/learning_rate | 0.0007 |
train/n_updates | 49999 |
train/policy_loss | -0.0935 |
train/std | 0.185 |
train/value_loss | 0.0306 |
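
The logged values can be cross-checked against each other with a little arithmetic. The sketch below copies the numbers from the table above and assumes `time/time_elapsed` is in seconds. It shows that each iteration collected 20 timesteps. With the stable-baselines3 default of `n_steps=5` for A2C, that would correspond to 4 parallel environments, but the hyperparameters are not stated in this card, so treat that reading as an assumption.

```python
# Values copied from the final training log
log = {
    "time/fps": 378,
    "time/iterations": 50_000,
    "time/time_elapsed": 2641,  # assumed to be seconds
    "time/total_timesteps": 1_000_000,
    "train/n_updates": 49_999,
}

# Timesteps collected per iteration (n_steps * n_envs)
steps_per_iteration = log["time/total_timesteps"] // log["time/iterations"]

# Throughput implied by total timesteps and elapsed time, rounded down
implied_fps = log["time/total_timesteps"] // log["time/time_elapsed"]

# Gap between logged iterations and gradient updates
update_gap = log["time/iterations"] - log["train/n_updates"]

print(steps_per_iteration)  # 20
print(implied_fps)          # 378
print(update_gap)           # 1
```

The implied throughput matches the logged `time/fps`, and the update counter trails the iteration counter by one in this final log line.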