|
# Upgrading |
|
|
|
# Migrating |
|
<!--- |
|
TODO: update ml-agents-env package version before release |
|
---> |
|
## Migrating to the ml-agents-envs 0.29.0 package |
|
- Python 3.8 is now the minimum supported Python version, due to [Python 3.6 reaching end of life](https://endoflife.date/python). Please update your Python installation to 3.8.13 or higher.
|
- The `gym-unity` package has been refactored into the `ml-agents-envs` package. Please update your imports accordingly. |
|
- Example: |
|
- Before |
|
```python |
|
from gym_unity.unity_gym_env import UnityToGymWrapper |
|
``` |
|
- After: |
|
```python |
|
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper |
|
``` |
|
|
|
|
|
## Migrating the package to version 2.0 |
|
- The official version of Unity that ML-Agents supports is now 2022.3 LTS. If you run
  into issues, please consider deleting your project's Library folder and reopening your
  project.
- If you used any of the APIs that were deprecated before version 2.0, you need to use their replacements. These
  deprecated APIs have been removed. See the migration steps below for specific API replacements.
|
|
|
### Deprecated methods removed |
|
| **Deprecated API** | **Suggested Replacement** | |
|
|:-------:|:------:| |
|
| `IActuator ActuatorComponent.CreateActuator()` | `IActuator[] ActuatorComponent.CreateActuators()` | |
|
| `IActionReceiver.PackActions(in float[] destination)` | none | |
|
| `Agent.CollectDiscreteActionMasks(DiscreteActionMasker actionMasker)` | `Agent.WriteDiscreteActionMask(IDiscreteActionMask actionMask)` | |
|
| `Agent.Heuristic(float[] actionsOut)` | `Agent.Heuristic(in ActionBuffers actionsOut)` | |
|
| `Agent.OnActionReceived(float[] vectorAction)` | `Agent.OnActionReceived(ActionBuffers actions)` | |
|
| `Agent.GetAction()` | `Agent.GetStoredActionBuffers()` | |
|
| `BrainParameters.SpaceType`, `VectorActionSize`, `VectorActionSpaceType`, and `NumActions` | `BrainParameters.ActionSpec` | |
|
| `ObservationWriter.AddRange(IEnumerable<float> data, int writeOffset = 0)` | `ObservationWriter.AddList(IList<float> data, int writeOffset = 0)` |
|
| `SensorComponent.IsVisual()` and `IsVector()` | none | |
|
| `VectorSensor.AddObservation(IEnumerable<float> observation)` | `VectorSensor.AddObservation(IList<float> observation)` | |
|
| `SideChannelsManager` | `SideChannelManager` | |
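
For example, a hypothetical agent that still overrides the removed `float[]`-based callbacks could be updated to the `ActionBuffers` versions roughly like this (a minimal sketch; the agent class and input axis are placeholders):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

public class MyAgent : Agent
{
    // Replaces OnActionReceived(float[] vectorAction).
    public override void OnActionReceived(ActionBuffers actions)
    {
        var thrust = actions.ContinuousActions[0];
        // ... apply thrust to the agent here ...
    }

    // Replaces Heuristic(float[] actionsOut).
    public override void Heuristic(in ActionBuffers actionsOut)
    {
        var continuousActions = actionsOut.ContinuousActions;
        continuousActions[0] = Input.GetAxis("Vertical");
    }
}
```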
|
|
|
### IDiscreteActionMask changes |
|
- The interface for disabling specific discrete actions has changed. `IDiscreteActionMask.WriteMask()` was removed
  and replaced with `SetActionEnabled()`. Instead of passing an `IEnumerable` of indices to disable, you now call
  `SetActionEnabled()` for each index you want to disable (or enable). As an example, if you overrode
  `Agent.WriteDiscreteActionMask()` with something that looked like:
|
|
|
```csharp |
|
public override void WriteDiscreteActionMask(IDiscreteActionMask actionMask) |
|
{ |
|
var branch = 2; |
|
var actionsToDisable = new[] {1, 3}; |
|
actionMask.WriteMask(branch, actionsToDisable); |
|
} |
|
``` |
|
|
|
the equivalent code would now be |
|
|
|
```csharp |
|
public override void WriteDiscreteActionMask(IDiscreteActionMask actionMask) |
|
{ |
|
var branch = 2; |
|
actionMask.SetActionEnabled(branch, 1, false); |
|
actionMask.SetActionEnabled(branch, 3, false); |
|
} |
|
``` |
|
### IActuator changes |
|
- The `IActuator` interface now inherits from `IHeuristicProvider`. Please add the corresponding `Heuristic(in ActionBuffers)`
  method to your custom Actuator classes, as in the sketch below.
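
A minimal sketch of what that can look like on a custom actuator is shown below; the actuator name and single continuous action are hypothetical, and the remaining `IActuator` members are stubbed out.

```csharp
using Unity.MLAgents.Actuators;

public class SimpleThrustActuator : IActuator
{
    public ActionSpec ActionSpec { get; } = ActionSpec.MakeContinuous(1);
    public string Name => "SimpleThrustActuator";

    public void OnActionReceived(ActionBuffers actionBuffers)
    {
        // Apply actionBuffers.ContinuousActions[0] to the controlled object here.
    }

    // New: actuators now provide heuristic actions through IHeuristicProvider.
    public void Heuristic(in ActionBuffers actionBuffersOut)
    {
        var continuousActions = actionBuffersOut.ContinuousActions;
        continuousActions[0] = 1.0f;
    }

    public void WriteDiscreteActionMask(IDiscreteActionMask actionMask) { }

    public void ResetData() { }
}
```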
|
|
|
### ISensor and SensorComponent changes |
|
- The `ISensor.GetObservationShape()` method and `ITypedSensor` |
|
and `IDimensionPropertiesSensor` interfaces were removed, and `GetObservationSpec()` was added. You can use |
|
`ObservationSpec.Vector()` or `ObservationSpec.Visual()` to generate `ObservationSpec`s that are equivalent to |
|
the previous shape. For example, if your old ISensor looked like: |
|
|
|
```csharp |
|
public override int[] GetObservationShape() |
|
{ |
|
return new[] { m_Height, m_Width, m_NumChannels }; |
|
} |
|
``` |
|
|
|
the equivalent code would now be |
|
|
|
```csharp |
|
public override ObservationSpec GetObservationSpec() |
|
{ |
|
return ObservationSpec.Visual(m_Height, m_Width, m_NumChannels); |
|
} |
|
``` |
|
|
|
- The `ISensor.GetCompressionType()` method and `ISparseChannelSensor` interface were removed,
|
and `GetCompressionSpec()` was added. You can use `CompressionSpec.Default()` or |
|
`CompressionSpec.Compressed()` to generate `CompressionSpec`s that are equivalent to |
|
the previous values. For example, if your old ISensor looked like: |
|
```csharp |
|
public virtual SensorCompressionType GetCompressionType() |
|
{ |
|
return SensorCompressionType.None; |
|
} |
|
``` |
|
|
|
the equivalent code would now be |
|
|
|
```csharp |
|
public CompressionSpec GetCompressionSpec() |
|
{ |
|
return CompressionSpec.Default(); |
|
} |
|
``` |
|
|
|
- The abstract method `SensorComponent.GetObservationShape()` was removed. |
|
- The abstract method `SensorComponent.CreateSensor()` was replaced with `CreateSensors()`, which returns an `ISensor[]`. |
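
For example, a custom `SensorComponent` would now look roughly like this (a minimal sketch; a built-in `VectorSensor` stands in for your own `ISensor` implementation to keep the example self-contained):

```csharp
using Unity.MLAgents.Sensors;

public class MyCustomSensorComponent : SensorComponent
{
    // CreateSensor() was replaced by CreateSensors(), so a single component
    // can now provide more than one sensor.
    public override ISensor[] CreateSensors()
    {
        return new ISensor[] { new VectorSensor(4, "MyCustomSensor") };
    }
}
```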
|
|
|
### Match3 integration changes |
|
The Match-3 integration utilities were moved from `com.unity.ml-agents.extensions` to `com.unity.ml-agents`. |
|
|
|
The `AbstractBoard` interface was changed: |
|
* `AbstractBoard` no longer contains `Rows`, `Columns`, `NumCellTypes`, and `NumSpecialTypes` fields. |
|
* `public abstract BoardSize GetMaxBoardSize()` was added as an abstract method. `BoardSize` is a new struct that |
|
contains `Rows`, `Columns`, `NumCellTypes`, and `NumSpecialTypes` fields, with the same meanings as the old |
|
`AbstractBoard` fields. |
|
* `public virtual BoardSize GetCurrentBoardSize()` is an optional method; by default it returns `GetMaxBoardSize()`. If |
|
you wish to use a single behavior to work with multiple board sizes, override `GetCurrentBoardSize()` to return the |
|
current `BoardSize`. The values returned by `GetCurrentBoardSize()` must be less than or equal to the corresponding |
|
values from `GetMaxBoardSize()`. |
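
A rough sketch of a board using the new API is shown below. It assumes the `Unity.MLAgents.Integrations.Match3` namespace and the remaining abstract members (`GetCellType`, `GetSpecialType`, `IsMoveValid`, `MakeMove`); check the package's Match-3 sample for the exact contract.

```csharp
using Unity.MLAgents.Integrations.Match3;

public class FixedSizeBoard : AbstractBoard
{
    // Replaces the old Rows/Columns/NumCellTypes/NumSpecialTypes fields.
    public override BoardSize GetMaxBoardSize()
    {
        return new BoardSize { Rows = 8, Columns = 8, NumCellTypes = 6, NumSpecialTypes = 0 };
    }

    // Optional: override only if the playable area can be smaller than the maximum.
    public override BoardSize GetCurrentBoardSize()
    {
        return GetMaxBoardSize();
    }

    // Placeholder implementations of the remaining abstract members.
    public override int GetCellType(int row, int col) { return 0; }
    public override int GetSpecialType(int row, int col) { return 0; }
    public override bool IsMoveValid(Move move) { return false; }
    public override bool MakeMove(Move move) { return false; }
}
```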
|
|
|
### GridSensor changes |
|
The sensor configuration has changed: |
|
* The sensor implementation has been refactored; existing GridSensors created with the extensions package
  will not work in the newer version, and errors might show up when loading the old sensor in a scene.
  You'll need to remove the old sensor and create a new GridSensor.
* These parameter names have changed but still refer to the same concepts in the sensor: `GridNumSide` -> `GridSize`,
  `RotateToAgent` -> `RotateWithAgent`, `ObserveMask` -> `ColliderMask`, `DetectableObjects` -> `DetectableTags`.
* The `DepthType` (`ChannelBase`/`ChannelHot`) option and `ChannelDepth` were removed. The default is now
  one-hot encoding of the detected tag. If you were using the original GridSensor without overriding any methods,
  switching to the new GridSensor will produce a similar effect for training, although the actual observations
  will be slightly different.
|
|
|
To create your own GridSensor implementation with custom data:
* Derive from `GridSensorBase` instead of `GridSensor`. Besides overriding `GetObjectData()`, you may also need
  to override `GetCellObservationSize()`, `IsDataNormalized()` and `GetProcessCollidersMethod()` according to the
  data you collect. You'll also need to override `GridSensorComponent.GetGridSensors()` and return your custom
  GridSensor (see the sketch after this list).
* The input argument `tagIndex` in `GetObjectData()` has changed from 1-indexed to 0-indexed, and its data type
  changed from `float` to `int`. The index of the first detectable tag is now 0 instead of 1.
  The `normalizedDistance` input was removed.
* The observation data should be written to the input `dataBuffer` instead of creating and returning a new array.
* Data is no longer required to be normalized; instead, specify whether it is in `IsDataNormalized()`.
  Sensors with non-normalized data cannot use the PNG compression type.
* The sensor no longer further encodes the data received from `GetObjectData()`; the values written in
  `GetObjectData()` are the observations sent to the trainer.
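
A rough sketch of a custom sensor under the new API is shown below. The `GridSensorBase` constructor parameters and the `GridSensorComponent` properties used here (`SensorName`, `CellScale`, `GridSize`, `DetectableTags`, `CompressionType`) are assumptions modeled on the built-in `OneHotGridSensor`; verify them against the package source.

```csharp
using System.Linq;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Writes one value per detected object instead of a one-hot encoding.
public class CustomDataGridSensor : GridSensorBase
{
    public CustomDataGridSensor(string name, Vector3 cellScale, Vector3Int gridSize,
        string[] detectableTags, SensorCompressionType compression)
        : base(name, cellScale, gridSize, detectableTags, compression) { }

    protected override int GetCellObservationSize() { return 1; }

    protected override bool IsDataNormalized() { return false; }

    protected override void GetObjectData(GameObject detectedObject, int tagIndex, float[] dataBuffer)
    {
        // tagIndex is 0-indexed now; write the observation into dataBuffer directly.
        dataBuffer[0] = tagIndex;
    }
}

public class CustomDataGridSensorComponent : GridSensorComponent
{
    protected override GridSensorBase[] GetGridSensors()
    {
        return new GridSensorBase[]
        {
            new CustomDataGridSensor(SensorName, CellScale, GridSize, DetectableTags.ToArray(), CompressionType)
        };
    }
}
```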
|
|
|
### LSTM models from previous releases no longer supported |
|
The way the Unity Inference Engine processes LSTM (recurrent neural networks) has changed. As a result, models |
|
trained with previous versions of ML-Agents will not be usable at inference if they were trained with a `memory` |
|
setting in the `.yaml` config file. |
|
If you want to use a model that has a recurrent neural network in this release of ML-Agents, you need to train |
|
the model using the python trainer from this release. |
|
|
|
|
|
## Migrating to Release 13 |
|
### Implementing IHeuristic in your IActuator implementations |
|
- If you have any custom actuators, you can now implement the `IHeuristicProvider` interface to have your actuator |
|
handle the generation of actions when an Agent is running in heuristic mode. |
|
- `VectorSensor.AddObservation(IEnumerable<float>)` is deprecated. Use `VectorSensor.AddObservation(IList<float>)` |
|
instead. |
|
- `ObservationWriter.AddRange()` is deprecated. Use `ObservationWriter.AddList()` instead. |
|
- `ActuatorComponent.CreateActuator()` is deprecated. Please override `ActuatorComponent.CreateActuators()`
  instead. Since `ActuatorComponent.CreateActuator()` is abstract, you will still need to override it in your
  class until it is removed. It is only ever called if you don't override `ActuatorComponent.CreateActuators()`.
|
You can suppress the warnings by surrounding the method with the following pragma: |
|
```c# |
|
#pragma warning disable 672 |
|
public IActuator CreateActuator() { ... } |
|
#pragma warning restore 672 |
|
``` |
|
|
|
|
|
|
## Migrating to Release 11 |
|
### Agent virtual method deprecation |
|
- `Agent.CollectDiscreteActionMasks()` was deprecated and should be replaced with `Agent.WriteDiscreteActionMask()`.
|
- `Agent.Heuristic(float[])` was deprecated and should be replaced with `Agent.Heuristic(ActionBuffers)`. |
|
- `Agent.OnActionReceived(float[])` was deprecated and should be replaced with `Agent.OnActionReceived(ActionBuffers)`. |
|
- `Agent.GetAction()` was deprecated and should be replaced with `Agent.GetStoredActionBuffers()`. |
|
|
|
The default implementation of these will continue to call the deprecated versions where appropriate. However, the |
|
deprecated versions may not be compatible with continuous and discrete actions on the same Agent. |
|
|
|
### BrainParameters field and method deprecation |
|
- `BrainParameters.VectorActionSize` was deprecated; you can now set `BrainParameters.ActionSpec.NumContinuousActions` |
|
or `BrainParameters.ActionSpec.BranchSizes` instead. |
|
- `BrainParameters.VectorActionSpaceType` was deprecated, since both continuous and discrete actions can now be used. |
|
- `BrainParameters.NumActions()` was deprecated. Use `BrainParameters.ActionSpec.NumContinuousActions` and |
|
`BrainParameters.ActionSpec.NumDiscreteActions` instead. |
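
For example, if you were configuring action sizes from code, a minimal sketch using the new `ActionSpec` (assuming a `BehaviorParameters` component on the same GameObject and a settable `BrainParameters.ActionSpec` property) could look like:

```csharp
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Policies;
using UnityEngine;

public class ConfigureActionSpec : MonoBehaviour
{
    void Awake()
    {
        var brainParameters = GetComponent<BehaviorParameters>().BrainParameters;
        // Three continuous actions; for discrete actions use e.g.
        // ActionSpec.MakeDiscrete(3, 2) to define the branch sizes.
        brainParameters.ActionSpec = ActionSpec.MakeContinuous(3);
    }
}
```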
|
|
|
## Migrating from Release 7 to latest |
|
|
|
### Important changes |
|
- Some trainer files were moved. If you were using the `TrainerFactory` class, it was moved to |
|
the `trainers/trainer` folder. |
|
- The `components` folder containing `bc` and `reward_signals` code was moved to the `trainers/tf` |
|
folder |
|
|
|
### Steps to Migrate |
|
- Replace `from mlagents.trainers.trainer_util import TrainerFactory` with `from mlagents.trainers.trainer import TrainerFactory`.
- Replace `from mlagents.trainers.trainer_util import handle_existing_directories` with `from mlagents.trainers.directory_utils import validate_existing_directories`.
|
- Replace `mlagents.trainers.components` with `mlagents.trainers.tf.components` in your import statements. |
|
|
|
|
|
## Migrating from Release 3 to Release 7 |
|
|
|
### Important changes |
|
- The Parameter Randomization feature has been merged with the Curriculum feature. It is now possible to specify a sampler |
|
in the lesson of a Curriculum. Curriculum has been refactored and is now specified at the level of the parameter, not the |
|
  behavior. More information is available
  [here](https://github.com/Unity-Technologies/ml-agents/blob/release_20_docs/docs/Training-ML-Agents.md). (#4160)
|
|
|
### Steps to Migrate |
|
- The configuration format for curriculum and parameter randomization has changed. To upgrade your configuration files,
  an upgrade script has been provided. Run `python -m mlagents.trainers.upgrade_config -h` to see the script usage. Note that you must upgrade to or install the current version of ML-Agents before running the script. To update manually:
|
- If your config file used a `parameter_randomization` section, rename that section to `environment_parameters` |
|
- If your config file used a `curriculum` section, you will need to rewrite your curriculum with this [format](Training-ML-Agents.md#curriculum). |
|
|
|
## Migrating from Release 1 to Release 3 |
|
|
|
### Important changes |
|
- Training artifacts (trained models, summaries) are now found under `results/` |
|
instead of `summaries/` and `models/`. |
|
- Trainer configuration, curriculum configuration, and parameter randomization |
|
configuration have all been moved to a single YAML file. (#3791) |
|
- Trainer configuration format has changed, and using a "default" behavior name has |
|
been deprecated. (#3936) |
|
- `max_step` in the `TerminalStep` and `TerminalSteps` objects was renamed `interrupted`. |
|
- On the UnityEnvironment API, `get_behavior_names()` and `get_behavior_specs()` methods were combined into the property `behavior_specs` that contains a mapping from behavior names to behavior spec. |
|
- `use_visual` and `allow_multiple_visual_obs` in the `UnityToGymWrapper` constructor |
|
were replaced by `allow_multiple_obs` which allows one or more visual observations and |
|
vector observations to be used simultaneously. |
|
- `--save-freq` has been removed from the CLI and is now configurable in the trainer configuration |
|
file. |
|
- `--lesson` has been removed from the CLI. Lessons will resume when using `--resume`. |
|
To start at a different lesson, modify your Curriculum configuration. |
|
|
|
### Steps to Migrate |
|
- To upgrade your configuration files, an upgrade script has been provided. Run
  `python -m mlagents.trainers.upgrade_config -h` to see the script usage. Note that you must
  upgrade to or install the current version of ML-Agents before running the script.
|
|
|
To do it manually, copy your `<BehaviorName>` sections from `trainer_config.yaml` into a separate trainer configuration file, under a `behaviors` section. |
|
The `default` section is no longer needed. This new file should be specific to your environment, and not contain |
|
configurations for multiple environments (unless they have the same Behavior Names). |
|
- You will need to reformat your trainer settings as per the [example](Training-ML-Agents.md). |
|
- If your training uses [curriculum](Training-ML-Agents.md#curriculum-learning), move those configurations under a `curriculum` section. |
|
- If your training uses [parameter randomization](Training-ML-Agents.md#environment-parameter-randomization), move |
|
the contents of the sampler config to `parameter_randomization` in the main trainer configuration. |
|
- If you are using `UnityEnvironment` directly, replace `max_step` with `interrupted` |
|
in the `TerminalStep` and `TerminalSteps` objects. |
|
- Replace usage of `get_behavior_names()` and `get_behavior_specs()` in UnityEnvironment with `behavior_specs`. |
|
- If you use the `UnityToGymWrapper`, remove `use_visual` and `allow_multiple_visual_obs` |
|
from the constructor and add `allow_multiple_obs = True` if the environment contains either |
|
both visual and vector observations or multiple visual observations. |
|
- If you were setting `--save-freq` in the CLI, add a `checkpoint_interval` value in your |
|
trainer configuration, and set it equal to `save-freq * n_agents_in_scene`. |
|
|
|
## Migrating from 0.15 to Release 1 |
|
|
|
### Important changes |
|
|
|
- The `MLAgents` C# namespace was renamed to `Unity.MLAgents`, and other nested |
|
namespaces were similarly renamed (#3843). |
|
- The `--load` and `--train` command-line flags have been deprecated and |
|
replaced with `--resume` and `--inference`. |
|
- Running with the same `--run-id` twice will now throw an error. |
|
- The `play_against_current_self_ratio` self-play trainer hyperparameter has |
|
  been renamed to `play_against_latest_model_ratio`.
|
- Removed the multi-agent gym option from the gym wrapper. For multi-agent |
|
scenarios, use the [Low Level Python API](Python-LLAPI.md). |
|
- The low level Python API has changed. You can look at the document |
|
[Low Level Python API documentation](Python-LLAPI.md) for more information. If |
|
you use `mlagents-learn` for training, this should be a transparent change. |
|
- The obsolete `Agent` methods `GiveModel`, `Done`, `InitializeAgent`, |
|
`AgentAction` and `AgentReset` have been removed. |
|
- The signature of `Agent.Heuristic()` was changed to take a `float[]` as a |
|
parameter, instead of returning the array. This was done to prevent a common |
|
source of error where users would return arrays of the wrong size. |
|
- The SideChannel API has changed (#3833, #3660) : |
|
- Introduced the `SideChannelManager` to register, unregister and access side |
|
channels. |
|
- `EnvironmentParameters` replaces the default `FloatProperties`. You can |
|
access the `EnvironmentParameters` with |
|
`Academy.Instance.EnvironmentParameters` on C#. If you were previously |
|
creating a `UnityEnvironment` in python and passing it a |
|
`FloatPropertiesChannel`, create an `EnvironmentParametersChannel` instead. |
|
- `SideChannel.OnMessageReceived` is now a protected method (was public) |
|
- SideChannel IncomingMessages methods now take an optional default argument, |
|
which is used when trying to read more data than the message contains. |
|
- Added a feature to allow sending stats from C# environments to TensorBoard |
|
(and other python StatsWriters). To do this from your code, use |
|
`Academy.Instance.StatsRecorder.Add(key, value)`(#3660) |
|
- `num_updates` and `train_interval` for SAC have been replaced with |
|
`steps_per_update`. |
|
- The `UnityEnv` class from the `gym-unity` package was renamed |
|
`UnityToGymWrapper` and no longer creates the `UnityEnvironment`. Instead, the |
|
`UnityEnvironment` must be passed as input to the constructor of |
|
`UnityToGymWrapper` |
|
- Public fields and properties on several classes were renamed to follow Unity's |
|
C# style conventions. All public fields and properties now use "PascalCase" |
|
instead of "camelCase"; for example, `Agent.maxStep` was renamed to |
|
`Agent.MaxStep`. For a full list of changes, see the pull request. (#3828) |
|
- `WriteAdapter` was renamed to `ObservationWriter`. (#3834) |
|
|
|
### Steps to Migrate |
|
|
|
- In C# code, replace `using MLAgents` with `using Unity.MLAgents`. Replace |
|
other nested namespaces such as `using MLAgents.Sensors` with |
|
`using Unity.MLAgents.Sensors` |
|
- Replace the `--load` flag with `--resume` when calling `mlagents-learn`, and |
|
don't use the `--train` flag as training will happen by default. To run |
|
without training, use `--inference`. |
|
- To force-overwrite files from a pre-existing run, add the `--force` |
|
command-line flag. |
|
- The Jupyter notebooks have been removed from the repository. |
|
- If your Agent class overrides `Heuristic()`, change the signature to
  `public override void Heuristic(float[] actionsOut)` and assign values to
  `actionsOut` instead of returning an array (see the sketch after this list).
|
- If you used `SideChannels` you must: |
|
- Replace `Academy.FloatProperties` with |
|
`Academy.Instance.EnvironmentParameters`. |
|
- `Academy.RegisterSideChannel` and `Academy.UnregisterSideChannel` were |
|
removed. Use `SideChannelManager.RegisterSideChannel` and |
|
`SideChannelManager.UnregisterSideChannel` instead. |
|
- Set `steps_per_update` to be around equal to the number of agents in your |
|
environment, times `num_updates` and divided by `train_interval`. |
|
- Replace `UnityEnv` with `UnityToGymWrapper` in your code. The constructor no |
|
longer takes a file name as input but a fully constructed `UnityEnvironment` |
|
instead. |
|
- Update uses of "camelCase" fields and properties to "PascalCase". |
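
For example, a heuristic that previously returned a new `float[]` would now be written roughly like this (a minimal sketch; the input axes are placeholders):

```csharp
using Unity.MLAgents;
using UnityEngine;

public class MyAgent : Agent
{
    // Assign values into actionsOut instead of returning a new array.
    public override void Heuristic(float[] actionsOut)
    {
        actionsOut[0] = Input.GetAxis("Horizontal");
        actionsOut[1] = Input.GetAxis("Vertical");
    }
}
```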
|
|
|
## Migrating from 0.14 to 0.15 |
|
|
|
### Important changes |
|
|
|
- The `Agent.CollectObservations()` virtual method now takes a `VectorSensor`
  as an argument. The `Agent.AddVectorObs()` methods were removed.
- The `SetActionMask` method on the Agent was replaced by `SetMask`, which must now be
  called on the `DiscreteActionMasker` argument of the `CollectDiscreteActionMasks()`
  virtual method.
- We consolidated our API for `DiscreteActionMasker`. `SetMask` takes two
  arguments: the branch index and the list of masked actions for that branch.
|
- The `Monitor` class has been moved to the Examples Project. (It was prone to |
|
errors during testing) |
|
- The `MLAgents.Sensors` namespace has been introduced. All sensors classes are |
|
part of the `MLAgents.Sensors` namespace. |
|
- The `MLAgents.SideChannels` namespace has been introduced. All side channel |
|
classes are part of the `MLAgents.SideChannels` namespace. |
|
- The interface for `RayPerceptionSensor.PerceiveStatic()` was changed to take |
|
an input class and write to an output class, and the method was renamed to |
|
`Perceive()`. |
|
|
- The method `GetStepCount()` on the Agent class has been replaced with the |
|
property getter `StepCount` |
|
- The `--multi-gpu` option has been removed temporarily. |
|
- `AgentInfo.actionMasks` has been renamed to `AgentInfo.discreteActionMasks`. |
|
- `BrainParameters` and `SpaceType` have been removed from the public API.
- `BehaviorParameters` has been removed from the public API.
|
- `DecisionRequester` has been made internal (you can still use the |
|
DecisionRequesterComponent from the inspector). `RepeatAction` was renamed |
|
`TakeActionsBetweenDecisions` for clarity. |
|
- The following methods in the `Agent` class have been renamed. The original |
|
method names will be removed in a later release: |
|
- `InitializeAgent()` was renamed to `Initialize()` |
|
- `AgentAction()` was renamed to `OnActionReceived()` |
|
  - `AgentReset()` was renamed to `OnEpisodeBegin()`
|
- `Done()` was renamed to `EndEpisode()` |
|
- `GiveModel()` was renamed to `SetModel()` |
|
- The `IFloatProperties` interface has been removed. |
|
- The interface for SideChannels was changed: |
|
- In C#, `OnMessageReceived` now takes a `IncomingMessage` argument, and |
|
`QueueMessageToSend` takes an `OutgoingMessage` argument. |
|
- In python, `on_message_received` now takes a `IncomingMessage` argument, and |
|
`queue_message_to_send` takes an `OutgoingMessage` argument. |
|
- Automatic stepping for Academy is now controlled from the |
|
AutomaticSteppingEnabled property. |
|
|
|
### Steps to Migrate |
|
|
|
- Add the `using MLAgents.Sensors;` in addition to `using MLAgents;` on top of |
|
your Agent's script. |
|
- Replace your Agent's implementation of `CollectObservations()` with |
|
`CollectObservations(VectorSensor sensor)`. In addition, replace all calls to |
|
`AddVectorObs()` with `sensor.AddObservation()` or |
|
`sensor.AddOneHotObservation()` on the `VectorSensor` passed as argument. |
|
- Replace your calls to `SetActionMask` on your Agent with
  `DiscreteActionMasker.SetMask()` in `CollectDiscreteActionMasks()` (see the sketch after this list).
|
- If you call `RayPerceptionSensor.PerceiveStatic()` manually, add your inputs |
|
to a `RayPerceptionInput`. To get the previous float array output, iterate |
|
through `RayPerceptionOutput.rayOutputs` and call |
|
`RayPerceptionOutput.RayOutput.ToFloatArray()`. |
|
- Replace all calls to `Agent.GetStepCount()` with `Agent.StepCount` |
|
- We strongly recommend replacing the following methods with their new |
|
equivalent as they will be removed in a later release: |
|
- `InitializeAgent()` to `Initialize()` |
|
- `AgentAction()` to `OnActionReceived()` |
|
- `AgentReset()` to `OnEpisodeBegin()` |
|
- `Done()` to `EndEpisode()` |
|
- `GiveModel()` to `SetModel()` |
|
- Replace `IFloatProperties` variables with `FloatPropertiesChannel` variables. |
|
- If you implemented custom `SideChannels`, update the signatures of your |
|
methods, and add your data to the `OutgoingMessage` or read it from the |
|
`IncomingMessage`. |
|
- Replace calls to Academy.EnableAutomaticStepping()/DisableAutomaticStepping() |
|
with Academy.AutomaticSteppingEnabled = true/false. |
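
A minimal sketch combining the observation and action-mask changes above (a hypothetical agent; your observations and masked indices will differ):

```csharp
using MLAgents;
using MLAgents.Sensors;
using UnityEngine;

public class MyAgent : Agent
{
    public override void CollectObservations(VectorSensor sensor)
    {
        // AddVectorObs(...) becomes sensor.AddObservation(...).
        sensor.AddObservation(transform.position);
    }

    public override void CollectDiscreteActionMasks(DiscreteActionMasker actionMasker)
    {
        // SetActionMask(...) on the Agent becomes SetMask(branch, indices) on the masker.
        actionMasker.SetMask(0, new[] { 1, 2 });
    }
}
```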
|
|
|
## Migrating from 0.13 to 0.14 |
|
|
|
### Important changes |
|
|
|
- The `UnitySDK` folder has been split into a Unity Package |
|
(`com.unity.ml-agents`) and an examples project (`Project`). Please follow the |
|
[Installation Guide](Installation.md) to get up and running with this new repo |
|
structure. |
|
- Several changes were made to how agents are reset and marked as done: |
|
- Calling `Done()` on the Agent will now reset it immediately and call the |
|
`AgentReset` virtual method. (This is to simplify the previous logic in |
|
which the Agent had to wait for the next `EnvironmentStep` to reset) |
|
- The "Reset on Done" setting in AgentParameters was removed; this is now |
|
effectively always true. `AgentOnDone` virtual method on the Agent has been |
|
removed. |
|
- The `Decision Period` field and `On Demand Decision` checkbox have been removed from
  the Agent. On-demand decisions are now the default (call `RequestDecision` on
  the Agent manually).
|
- The Academy class was changed to a singleton, and its virtual methods were |
|
removed. |
|
- Trainer steps are now counted per-Agent, not per-environment as in previous |
|
versions. For instance, if you have 10 Agents in the scene, 20 environment |
|
steps now corresponds to 200 steps as printed in the terminal and in |
|
Tensorboard. |
|
- Curriculum config files are now YAML formatted and all curricula for a |
|
training run are combined into a single file. |
|
- The `--num-runs` command-line option has been removed from `mlagents-learn`. |
|
- Several fields on the Agent were removed or made private in order to simplify |
|
the interface. |
|
- The `agentParameters` field of the Agent has been removed. (Contained only |
|
`maxStep` information) |
|
- `maxStep` is now a public field on the Agent. (Was moved from |
|
`agentParameters`) |
|
- The `Info` field of the Agent has been made private. (Was only used |
|
internally and not meant to be modified outside of the Agent) |
|
- The `GetReward()` method on the Agent has been removed. (It was being |
|
confused with `GetCumulativeReward()`) |
|
- The `AgentAction` struct no longer contains a `value` field. (Value |
|
estimates were not set during inference) |
|
- The `GetValueEstimate()` method on the Agent has been removed. |
|
- The `UpdateValueAction()` method on the Agent has been removed. |
|
- The deprecated `RayPerception3D` and `RayPerception2D` classes were removed, |
|
and the `legacyHitFractionBehavior` argument was removed from |
|
`RayPerceptionSensor.PerceiveStatic()`. |
|
- RayPerceptionSensor was inconsistent in how it handled scale on the Agent's
|
transform. It now scales the ray length and sphere size for casting as the |
|
transform's scale changes. |
|
|
|
### Steps to Migrate |
|
|
|
- Follow the instructions on how to install the `com.unity.ml-agents` package |
|
into your project in the [Installation Guide](Installation.md). |
|
- If your Agent implemented `AgentOnDone` and did not have the checkbox |
|
`Reset On Done` checked in the inspector, you must call the code that was in |
|
`AgentOnDone` manually. |
|
- If you give your Agent a reward or penalty at the end of an episode (e.g. for |
|
reaching a goal or falling off of a platform), make sure you call |
|
`AddReward()` or `SetReward()` _before_ calling `Done()`. Previously, the |
|
order didn't matter. |
|
- If you were not using `On Demand Decision` for your Agent, you **must** add a |
|
`DecisionRequester` component to your Agent GameObject and set its |
|
`Decision Period` field to the old `Decision Period` of the Agent. |
|
- If you have a class that inherits from Academy: |
|
- If the class didn't override any of the virtual methods and didn't store any |
|
additional data, you can just remove the old script from the scene. |
|
- If the class had additional data, create a new MonoBehaviour and store the |
|
data in the new MonoBehaviour instead. |
|
  - If the class overrode the virtual methods, create a new MonoBehaviour and
    move the logic to it (see the sketch after this list):
|
- Move the InitializeAcademy code to MonoBehaviour.Awake |
|
- Move the AcademyStep code to MonoBehaviour.FixedUpdate |
|
- Move the OnDestroy code to MonoBehaviour.OnDestroy. |
|
- Move the AcademyReset code to a new method and add it to the |
|
Academy.OnEnvironmentReset action. |
|
- Multiply `max_steps` and `summary_freq` in your `trainer_config.yaml` by the |
|
number of Agents in the scene. |
|
- Combine curriculum configs into a single file. See |
|
[the WallJump curricula](https://github.com/Unity-Technologies/ml-agents/blob/0.14.1/config/curricula/wall_jump.yaml) for an example of |
|
the new curriculum config format. A tool like https://www.json2yaml.com may be |
|
useful to help with the conversion. |
|
- If you have a model trained which uses RayPerceptionSensor and has non-1.0 |
|
scale in the Agent's transform, it must be retrained. |
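
A minimal sketch of such a MonoBehaviour (the class name and method bodies are placeholders):

```csharp
using MLAgents;
using UnityEngine;

public class MyAcademyLogic : MonoBehaviour
{
    void Awake()
    {
        // Former InitializeAcademy() code goes here.
        Academy.Instance.OnEnvironmentReset += EnvironmentReset;
    }

    void FixedUpdate()
    {
        // Former AcademyStep() code goes here.
    }

    void OnDestroy()
    {
        // Former OnDestroy() code goes here.
        Academy.Instance.OnEnvironmentReset -= EnvironmentReset;
    }

    void EnvironmentReset()
    {
        // Former AcademyReset() code goes here.
    }
}
```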
|
|
|
## Migrating from ML-Agents Toolkit v0.12.0 to v0.13.0 |
|
|
|
### Important changes |
|
|
|
- The low level Python API has changed. You can look at the document |
|
[Low Level Python API documentation](Python-LLAPI.md) for more information. This |
|
should only affect you if you're writing a custom trainer; if you use |
|
`mlagents-learn` for training, this should be a transparent change. |
|
- `reset()` on the Low-Level Python API no longer takes a `train_mode` |
|
argument. To modify the performance/speed of the engine, you must use an |
|
`EngineConfigurationChannel` |
|
- `reset()` on the Low-Level Python API no longer takes a `config` argument. |
|
`UnityEnvironment` no longer has a `reset_parameters` field. To modify float |
|
properties in the environment, you must use a `FloatPropertiesChannel`. For |
|
more information, refer to the |
|
[Low Level Python API documentation](Python-LLAPI.md) |
|
- `CustomResetParameters` are now removed. |
|
- The Academy no longer has a `Training Configuration` nor |
|
`Inference Configuration` field in the inspector. To modify the configuration |
|
from the Low-Level Python API, use an `EngineConfigurationChannel`. To modify |
|
it during training, use the new command line arguments `--width`, `--height`, |
|
`--quality-level`, `--time-scale` and `--target-frame-rate` in |
|
`mlagents-learn`. |
|
- The Academy no longer has a `Default Reset Parameters` field in the inspector. |
|
The Academy class no longer has a `ResetParameters`. To access shared float |
|
properties with Python, use the new `FloatProperties` field on the Academy. |
|
- Offline Behavioral Cloning has been removed. To learn from demonstrations, use |
|
the GAIL and Behavioral Cloning features with either PPO or SAC. |
|
- `mlagents.envs` was renamed to `mlagents_envs`. The previous repo layout |
|
depended on [PEP420](https://www.python.org/dev/peps/pep-0420/), which caused |
|
problems with some of our tooling such as mypy and pylint. |
|
- The official version of Unity that ML-Agents supports is now 2018.4 LTS. If you run
  into issues, please consider deleting your Library folder and reopening your
  project. You will need to install the Barracuda package into your project in
  order for ML-Agents to compile correctly.
|
|
|
### Steps to Migrate |
|
|
|
- If you had a custom `Training Configuration` in the Academy inspector, you |
|
will need to pass your custom configuration at every training run using the |
|
new command line arguments `--width`, `--height`, `--quality-level`, |
|
`--time-scale` and `--target-frame-rate`. |
|
- If you were using `--slow` in `mlagents-learn`, you will need to pass your old |
|
`Inference Configuration` of the Academy inspector with the new command line |
|
arguments `--width`, `--height`, `--quality-level`, `--time-scale` and |
|
`--target-frame-rate` instead. |
|
- Any imports from `mlagents.envs` should be replaced with `mlagents_envs`. |
|
|
|
## Migrating from ML-Agents Toolkit v0.11.0 to v0.12.0 |
|
|
|
### Important Changes |
|
|
|
- Text actions and observations, and custom action and observation protos have |
|
been removed. |
|
- RayPerception3D and RayPerception2D are marked deprecated, and will be removed |
|
in a future release. They can be replaced by RayPerceptionSensorComponent3D |
|
and RayPerceptionSensorComponent2D. |
|
- The `Use Heuristic` checkbox in Behavior Parameters has been replaced with a |
|
`Behavior Type` dropdown menu. This has the following options: |
|
- `Default` corresponds to the previous unchecked behavior, meaning that |
|
Agents will train if they connect to a python trainer, otherwise they will |
|
perform inference. |
|
- `Heuristic Only` means the Agent will always use the `Heuristic()` method. |
|
This corresponds to having "Use Heuristic" selected in 0.11.0. |
|
- `Inference Only` means the Agent will always perform inference. |
|
- Barracuda was upgraded to 0.3.2, and it is now installed via the Unity Package |
|
Manager. |
|
|
|
### Steps to Migrate |
|
|
|
- We [fixed a bug](https://github.com/Unity-Technologies/ml-agents/pull/2823) in |
|
`RayPerception3d.Perceive()` that was causing the `endOffset` to be used |
|
incorrectly. However this may produce different behavior from previous |
|
versions if you use a non-zero `startOffset`. To reproduce the old behavior, |
|
you should increase the value of `endOffset` by `startOffset`. You can |
|
verify your raycasts are performing as expected in scene view using the debug |
|
rays. |
|
- If you use RayPerception3D, replace it with RayPerceptionSensorComponent3D |
|
(and similarly for 2D). The settings, such as ray angles and detectable tags, |
|
are configured on the component now. RayPerception3D would contribute |
|
`(# of rays) * (# of tags + 2)` to the State Size in Behavior Parameters, but |
|
this is no longer necessary, so you should reduce the State Size by this |
|
amount. Making this change will require retraining your model, since the |
|
observations that RayPerceptionSensorComponent3D produces are different from |
|
the old behavior. |
|
- If you see messages such as |
|
`The type or namespace 'Barracuda' could not be found` or |
|
`The type or namespace 'Google' could not be found`, you will need to |
|
[install the Barracuda preview package](Installation.md#package-installation). |
|
|
|
## Migrating from ML-Agents Toolkit v0.10 to v0.11.0 |
|
|
|
### Important Changes |
|
|
|
- The definition of the gRPC service has changed. |
|
- The online BC training feature has been removed. |
|
- The BroadcastHub has been deprecated. If there is a training Python process, |
|
all LearningBrains in the scene will automatically be trained. If there is no |
|
Python process, inference will be used. |
|
- The Brain ScriptableObjects have been deprecated. The Brain Parameters are now |
|
on the Agent and are referred to as Behavior Parameters. Make sure the |
|
Behavior Parameters is attached to the Agent GameObject. |
|
- To use a heuristic behavior, implement the `Heuristic()` method in the Agent |
|
class and check the `use heuristic` checkbox in the Behavior Parameters. |
|
- Several changes were made to the setup for visual observations (i.e. using |
|
Cameras or RenderTextures): |
|
- Camera resolutions are no longer stored in the Brain Parameters. |
|
- AgentParameters no longer stores lists of Cameras and RenderTextures |
|
- To add visual observations to an Agent, you must now attach a |
|
CameraSensorComponent or RenderTextureComponent to the agent. The |
|
corresponding Camera or RenderTexture can be added to these in the editor, |
|
and the resolution and color/grayscale is configured on the component |
|
itself. |
|
|
|
#### Steps to Migrate |
|
|
|
- In order to be able to train, make sure both your ML-Agents Python package and |
|
UnitySDK code come from the v0.11 release. Training will not work, for |
|
example, if you update the ML-Agents Python package, and only update the API |
|
Version in UnitySDK. |
|
- If your Agents used visual observations, you must add a CameraSensorComponent |
|
corresponding to each old Camera in the Agent's camera list (and similarly for |
|
RenderTextures). |
|
- Since Brain ScriptableObjects have been removed, you will need to delete all |
|
the Brain ScriptableObjects from your `Assets` folder. Then, add a |
|
`Behavior Parameters` component to each `Agent` GameObject. You will then need |
|
to complete the fields on the new `Behavior Parameters` component with the |
|
BrainParameters of the old Brain. |
|
|
|
## Migrating from ML-Agents Toolkit v0.9 to v0.10 |
|
|
|
### Important Changes |
|
|
|
- We have updated the C# code in our repository to be in line with Unity Coding |
|
Conventions. This has changed the name of some public facing classes and |
|
enums. |
|
- The example environments have been updated. If you were using these |
|
environments to benchmark your training, please note that the resulting |
|
rewards may be slightly different in v0.10. |
|
|
|
#### Steps to Migrate |
|
|
|
- `UnitySDK/Assets/ML-Agents/Scripts/Communicator.cs` and its class |
|
`Communicator` have been renamed to |
|
`UnitySDK/Assets/ML-Agents/Scripts/ICommunicator.cs` and `ICommunicator` |
|
respectively. |
|
- The `SpaceType` Enums `discrete`, and `continuous` have been renamed to |
|
`Discrete` and `Continuous`. |
|
- We have removed the `Done` call as well as the capacity to set `Max Steps` on |
|
the Academy. Therefore an AcademyReset will never be triggered from C# (only |
|
from Python). If you want to reset the simulation after a fixed number of |
|
steps, or when an event in the simulation occurs, we recommend looking at our |
|
  multi-agent example environments (such as FoodCollector). In our examples,
  groups of Agents are reset through an "Area" object.
|
- The import for `mlagents.envs.UnityEnvironment` was removed. If you are using |
|
the Python API, change `from mlagents_envs import UnityEnvironment` to |
|
`from mlagents_envs.environment import UnityEnvironment`. |
|
|
|
## Migrating from ML-Agents Toolkit v0.8 to v0.9 |
|
|
|
### Important Changes |
|
|
|
- We have changed the way reward signals (including Curiosity) are defined in |
|
the `trainer_config.yaml`. |
|
- When using multiple environments, every "step" is recorded in TensorBoard. |
|
- The steps in the command line console corresponds to a single step of a single |
|
environment. Previously, each step corresponded to one step for all |
|
environments (i.e., `num_envs` steps). |
|
|
|
#### Steps to Migrate |
|
|
|
- If you were overriding any of these following parameters in your config file, |
|
remove them from the top-level config and follow the steps below: |
|
  - `gamma`: Define a new `extrinsic` reward signal and set its `gamma` to your
|
new gamma. |
|
- `use_curiosity`, `curiosity_strength`, `curiosity_enc_size`: Define a |
|
`curiosity` reward signal and set its `strength` to `curiosity_strength`, |
|
and `encoding_size` to `curiosity_enc_size`. Give it the same `gamma` as |
|
your `extrinsic` signal to mimic previous behavior. |
|
- TensorBoards generated when running multiple environments in v0.8 are not |
|
comparable to those generated in v0.9 in terms of step count. Multiply your |
|
v0.8 step count by `num_envs` for an approximate comparison. You may need to |
|
change `max_steps` in your config as appropriate as well. |
|
|
|
## Migrating from ML-Agents Toolkit v0.7 to v0.8 |
|
|
|
### Important Changes |
|
|
|
- We have split the Python packages into two separate packages `ml-agents` and |
|
`ml-agents-envs`. |
|
- `--worker-id` option of `learn.py` has been removed, use `--base-port` instead |
|
if you'd like to run multiple instances of `learn.py`. |
|
|
|
#### Steps to Migrate |
|
|
|
- If you are installing via PyPI, there is no change. |
|
- If you intend to make modifications to `ml-agents` or `ml-agents-envs` please |
|
check the Installing for Development in the |
|
[Installation documentation](Installation.md). |
|
|
|
## Migrating from ML-Agents Toolkit v0.6 to v0.7 |
|
|
|
### Important Changes |
|
|
|
- We no longer support TFS and are now using the |
|
[Unity Inference Engine](Unity-Inference-Engine.md) |
|
|
|
#### Steps to Migrate |
|
|
|
- Make sure to remove the `ENABLE_TENSORFLOW` flag in your Unity Project |
|
settings |
|
|
|
## Migrating from ML-Agents Toolkit v0.5 to v0.6 |
|
|
|
### Important Changes |
|
|
|
- Brains are now Scriptable Objects instead of MonoBehaviors. |
|
- You can no longer modify the type of a Brain. If you want to switch between |
|
`PlayerBrain` and `LearningBrain` for multiple agents, you will need to assign |
|
a new Brain to each agent separately. **Note:** You can pass the same Brain to |
|
multiple agents in a scene by leveraging Unity's prefab system or look for all |
|
the agents in a scene using the search bar of the `Hierarchy` window with the |
|
word `Agent`. |
|
|
|
- We replaced the **Internal** and **External** Brain with **Learning Brain**. |
|
When you need to train a model, you need to drag it into the `Broadcast Hub` |
|
inside the `Academy` and check the `Control` checkbox. |
|
- We removed the `Broadcast` checkbox of the Brain; to use the broadcast
  functionality, you need to drag the Brain into the `Broadcast Hub`.
|
- When training multiple Brains at the same time, each model is now stored into |
|
a separate model file rather than in the same file under different graph |
|
scopes. |
|
- The **Learning Brain** graph scope, placeholder names, output names and custom |
|
placeholders can no longer be modified. |
|
|
|
#### Steps to Migrate |
|
|
|
- To update a scene from v0.5 to v0.6, you must: |
|
- Remove the `Brain` GameObjects in the scene. (Delete all of the Brain |
|
GameObjects under Academy in the scene.) |
|
- Create new `Brain` Scriptable Objects using `Assets -> Create -> ML-Agents` |
|
for each type of the Brain you plan to use, and put the created files under |
|
a folder called Brains within your project. |
|
- Edit their `Brain Parameters` to be the same as the parameters used in the |
|
`Brain` GameObjects. |
|
  - Agents have a `Brain` field in the Inspector; you need to drag the
    appropriate Brain ScriptableObject into it.
|
  - The Academy has a `Broadcast Hub` field in the inspector, which is a list of
    brains used in the scene. To train or control your Brain from the
    `mlagents-learn` Python script, you need to drag the relevant
    `LearningBrain` ScriptableObjects used in your scene into entries in this
    list.
|
|
|
## Migrating from ML-Agents Toolkit v0.4 to v0.5 |
|
|
|
### Important |
|
|
|
- The Unity project `unity-environment` has been renamed `UnitySDK`. |
|
- The `python` folder has been renamed to `ml-agents`. It now contains two |
|
packages, `mlagents.env` and `mlagents.trainers`. `mlagents.env` can be used |
|
to interact directly with a Unity environment, while `mlagents.trainers` |
|
contains the classes for training agents. |
|
- The supported Unity version has changed from `2017.1 or later` to |
|
`2017.4 or later`. 2017.4 is an LTS (Long Term Support) version that helps us |
|
maintain good quality and support. Earlier versions of Unity might still work, |
|
but you may encounter an |
|
[error](FAQ.md#instance-of-corebraininternal-couldnt-be-created) listed here. |
|
|
|
### Unity API |
|
|
|
- Discrete Actions now use [branches](https://arxiv.org/abs/1711.08946). You can |
|
now specify concurrent discrete actions. You will need to update the Brain |
|
Parameters in the Brain Inspector in all your environments that use discrete |
|
actions. Refer to the |
|
[discrete action documentation](Learning-Environment-Design-Agents.md#discrete-action-space) |
|
for more information. |
|
|
|
### Python API |
|
|
|
- In order to run a training session, you can now use the command |
|
`mlagents-learn` instead of `python3 learn.py` after installing the `mlagents` |
|
packages. This change is documented |
|
[here](Training-ML-Agents.md#training-with-mlagents-learn). For example, if we |
|
previously ran |
|
|
|
```sh |
|
python3 learn.py 3DBall --train |
|
``` |
|
|
|
from the `python` subdirectory (which is changed to `ml-agents` subdirectory |
|
in v0.5), we now run |
|
|
|
```sh |
|
mlagents-learn config/trainer_config.yaml --env=3DBall --train |
|
``` |
|
|
|
from the root directory where we installed the ML-Agents Toolkit. |
|
|
|
- It is now required to specify the path to the yaml trainer configuration file |
|
when running `mlagents-learn`. For an example trainer configuration file, see |
|
[trainer_config.yaml](https://github.com/Unity-Technologies/ml-agents/blob/0.5.0a/config/trainer_config.yaml). An example of passing a |
|
trainer configuration to `mlagents-learn` is shown above. |
|
- The environment name is now passed through the `--env` option. |
|
- Curriculum learning has been changed. In summary: |
|
- Curriculum files for the same environment must now be placed into a folder. |
|
Each curriculum file should be named after the Brain whose curriculum it |
|
specifies. |
|
- `min_lesson_length` now specifies the minimum number of episodes in a lesson |
|
and affects reward thresholding. |
|
- It is no longer necessary to specify the `Max Steps` of the Academy to use |
|
curriculum learning. |
|
|
|
## Migrating from ML-Agents Toolkit v0.3 to v0.4 |
|
|
|
### Unity API |
|
|
|
- `using MLAgents;` needs to be added in all of the C# scripts that use |
|
ML-Agents. |
|
|
|
### Python API |
|
|
|
- We've changed some of the Python package dependencies in the requirements.txt
  file. Make sure to run `pip3 install -e .` within your `ml-agents/python`
|
folder to update your Python packages. |
|
|
|
## Migrating from ML-Agents Toolkit v0.2 to v0.3 |
|
|
|
There are a large number of new features and improvements in the ML-Agents |
|
toolkit v0.3 which change both the training process and Unity API in ways which |
|
will cause incompatibilities with environments made using older versions. This |
|
page is designed to highlight those changes for users familiar with v0.1 or v0.2 |
|
in order to ensure a smooth transition. |
|
|
|
### Important |
|
|
|
- The ML-Agents Toolkit is no longer compatible with Python 2. |
|
|
|
### Python Training |
|
|
|
- The training script `ppo.py` and `PPO.ipynb` Python notebook have been |
|
replaced with a single `learn.py` script as the launching point for training |
|
with ML-Agents. For more information on using `learn.py`, see |
|
[here](Training-ML-Agents.md#training-with-mlagents-learn). |
|
- Hyperparameters for training Brains are now stored in the |
|
`trainer_config.yaml` file. For more information on using this file, see |
|
[here](Training-ML-Agents.md#training-configurations). |
|
|
|
### Unity API |
|
|
|
- Modifications to an Agent's rewards must now be done using either |
|
`AddReward()` or `SetReward()`. |
|
- Setting an Agent to done now requires the use of the `Done()` method. |
|
- `CollectStates()` has been replaced by `CollectObservations()`, which now no |
|
longer returns a list of floats. |
|
- To collect observations, call `AddVectorObs()` within `CollectObservations()`. |
|
Note that you can call `AddVectorObs()` with floats, integers, lists and |
|
arrays of floats, Vector3 and Quaternions. |
|
- `AgentStep()` has been replaced by `AgentAction()`. |
|
- `WaitTime()` has been removed. |
|
- The `Frame Skip` field of the Academy is replaced by the Agent's |
|
`Decision Frequency` field, enabling the Agent to make decisions at different |
|
frequencies. |
|
- The names of the inputs in the Internal Brain have been changed. You must |
|
replace `state` with `vector_observation` and `observation` with |
|
`visual_observation`. In addition, you must remove the `epsilon` placeholder. |
|
|
|
### Semantics |
|
|
|
In order to more closely align with the terminology used in the Reinforcement |
|
Learning field, and to be more descriptive, we have changed the names of some of |
|
the concepts used in ML-Agents. The changes are highlighted in the table below. |
|
|
|
| Old - v0.2 and earlier | New - v0.3 and later | |
|
| ---------------------- | -------------------- | |
|
| State | Vector Observation | |
|
| Observation | Visual Observation | |
|
| Action | Vector Action | |
|
| N/A | Text Observation | |
|
| N/A | Text Action | |
|
|