Trouble reproducing results

#1
by Stephanehk - opened

Hello,

I am trying to use the ppo-Walker2d-v3.zip model but cannot get near the reported mean return of 3571. Instead, I consistently get a return of ~1 for episodes of 1000 steps. The snippet below shows how I set up the environment; perhaps the issue is there?

from sb3_contrib.common.wrappers import TimeFeatureWrapper
from gym.wrappers import NormalizeObservation
from stable_baselines3 import PPO
from huggingface_sb3 import load_from_hub
import gym

checkpoint = load_from_hub(
    repo_id="sb3/ppo-Walker2d-v3",
    filename="ppo-Walker2d-v3.zip",
)
custom_objects = {
    "learning_rate": 0.0,
    "lr_schedule": lambda _: 0.0,
    "clip_range": lambda _: 0.0,
}
expert_model = PPO.load(checkpoint, custom_objects=custom_objects)

env_name = "Walker2d-v3"
venv = gym.make(env_name)
venv = NormalizeObservation(venv)
venv = TimeFeatureWrapper(venv)

Stable-Baselines3 org

Hello,
Please follow the instructions and use the RL Zoo. You are not loading the normalizing statistics.
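
For readers who hit the same problem, here is a rough sketch of what "loading the normalizing statistics" could look like outside the RL Zoo. The snippet in the question wraps the env in gym's NormalizeObservation, which starts its running mean and variance from scratch, so the policy sees observations scaled differently from training. The sketch instead restores saved statistics with VecNormalize.load and freezes them. Several things here are assumptions, not confirmed in this thread: the statistics filename "vec_normalize.pkl", the helper names normalize_with_saved_stats and make_eval_env (both hypothetical), the default clip and epsilon values, and that the installed SB3 version still accepts gym environments. Whether TimeFeatureWrapper is needed depends on the training config, which is not shown here, so it is left out.

```python
import math


def normalize_with_saved_stats(obs, mean, var, clip_obs=10.0, epsilon=1e-8):
    """Normalize one observation with FIXED statistics (no running update).

    Illustrates the usual (obs - mean) / sqrt(var + epsilon) formula followed
    by clipping. The default clip_obs and epsilon values are assumptions.
    """
    return [
        max(-clip_obs, min(clip_obs, (o - m) / math.sqrt(v + epsilon)))
        for o, m, v in zip(obs, mean, var)
    ]


def make_eval_env(env_name="Walker2d-v3", repo_id="sb3/ppo-Walker2d-v3"):
    """Build an evaluation env that reuses the training-time statistics."""
    # Imported lazily so the helper above works without these packages.
    import gym
    from huggingface_sb3 import load_from_hub
    from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

    # ASSUMPTION: the repo stores the statistics under this filename.
    stats_path = load_from_hub(repo_id=repo_id, filename="vec_normalize.pkl")

    venv = DummyVecEnv([lambda: gym.make(env_name)])
    venv = VecNormalize.load(stats_path, venv)
    venv.training = False     # freeze the running mean/variance
    venv.norm_reward = False  # report raw, unnormalized returns
    return venv
```

The resulting env can then be passed, together with the loaded model, to stable_baselines3.common.evaluation.evaluate_policy. The RL Zoo scripts handle these steps for you, which is presumably why they are the recommended route.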

Stephanehk changed discussion status to closed
