---
license: mit
---


# Tau LLM Unity ML Agents Project

Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize the performance of the agents through various configurations and training runs.

## Project Overview

This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.

### Key Features

- **Reinforcement Learning**: Utilizes the PPO algorithm for training agents.
- **Unity ML-Agents**: Integrates with Unity ML-Agents for a seamless training experience.
- **Custom Reward Functions**: Implements gradient-based reward functions for nuanced feedback (a minimal sketch follows this list).
- **Memory Networks**: Incorporates memory networks to handle temporal dependencies.
- **TensorBoard Integration**: Monitors training progress and performance using TensorBoard.
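
The gradient-based reward logic itself lives in the Unity (C#) project, so the snippet below is only a conceptual Python sketch of the idea: the agent receives a smooth, distance-shaped signal rather than a sparse success/failure reward. The function name `gradient_reward` and the linear falloff are illustrative assumptions, not code from this repository.

```python
def gradient_reward(distance_to_target: float, max_distance: float) -> float:
    """Toy gradient-based reward: a smooth value that grows as the agent
    approaches the target, instead of a sparse 0/1 signal at the goal.

    Both arguments use the same (arbitrary) distance units; the result is
    clipped to the range [0, 1].
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    # Linear falloff from 1.0 (at the target) to 0.0 (at max_distance or beyond).
    return max(0.0, 1.0 - distance_to_target / max_distance)


if __name__ == "__main__":
    for d in (0.0, 2.5, 5.0, 10.0):
        print(f"distance={d:>4}: reward={gradient_reward(d, max_distance=10.0):.2f}")
```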

## Configuration

Below is the configuration used for training the agents:

```yaml
behaviors:
  TauAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 256
      buffer_size: 4096
      learning_rate: 0.00003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 10
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 4
      vis_encode_type: simple
      memory:
        memory_size: 256
        sequence_length: 256
        num_layers: 4
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.995
        strength: 0.1
        network_settings:
          normalize: true
          hidden_units: 256
          num_layers: 4
          learning_rate: 0.00003
    keep_checkpoints: 10
    checkpoint_interval: 100000
    threaded: true
    max_steps: 3000000
    time_horizon: 256
    summary_freq: 10000
```
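
If you edit this file, a quick way to sanity-check it before launching a run is to load it with PyYAML (installed alongside the ML-Agents Python tooling) and print a few hyperparameters. This is a minimal sketch; the path matches the config file used in the training command further below, and the keys mirror the configuration shown above.

```python
import yaml

# Trainer configuration used in this project (adjust the path as needed).
CONFIG_PATH = "config/tau_agent_ppo_c.yaml"

with open(CONFIG_PATH, "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

hyper = config["behaviors"]["TauAgent"]["hyperparameters"]
print("batch_size:   ", hyper["batch_size"])
print("buffer_size:  ", hyper["buffer_size"])
print("learning_rate:", hyper["learning_rate"])
```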

## Model Naming Convention

The models in this repository follow the naming convention `Tau_<series>_<max_steps>`. This helps in easily identifying the series and the number of training steps for each model.
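
For instance, an illustrative name such as `Tau_A_3000000` would denote series `A` trained for 3,000,000 steps. The small helper below is a hypothetical sketch (not part of the repository) for splitting such names:

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    series: str
    max_steps: int


def parse_model_name(name: str) -> ModelName:
    """Split a model name following the Tau_<series>_<max_steps> convention."""
    prefix, series, steps = name.split("_", 2)
    if prefix != "Tau":
        raise ValueError(f"unexpected prefix in model name: {name!r}")
    return ModelName(series=series, max_steps=int(steps))


print(parse_model_name("Tau_A_3000000"))  # ModelName(series='A', max_steps=3000000)
```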

## Getting Started

### Prerequisites

- Unity 6
- Unity ML-Agents Toolkit
- Python 3.10.11
- PyTorch
- Transformers

### Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/p3nGu1nZz/Tau.git
   cd Tau/MLAgentsProject
   ```

2. Install the required Python packages:
   ```bash
   pip install -r requirements.txt
   ```

3. Open the Unity project:
   - Launch Unity Hub and open the project folder.

### Training the Agent

To start training the agent, run the following command:
```bash
mlagents-learn .\config\tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env .\Build --torch-device cuda --timeout-wait 300 --force
```
Note: the `--env .\Build` argument expects a standalone build of the Unity environment in the `Build` directory, so create a new build there before starting training.

### Monitoring Training

You can monitor the training progress using TensorBoard:
```bash
tensorboard --logdir results
```
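
If you prefer to read the same scalars programmatically (for example, to export them to CSV), TensorBoard's event files can be loaded with the `EventAccumulator` class. The run directory and tag name below are assumptions based on the run ID above and the usual ML-Agents tag names; adjust them to match your `results` folder.

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Assumed layout: results/<run-id>/<behavior-name>; adjust to your actual run.
RUN_DIR = "results/tau_agent_ppo_A0/TauAgent"

acc = EventAccumulator(RUN_DIR)
acc.Reload()

# List every scalar tag that was logged for this run.
print(acc.Tags()["scalars"])

# "Environment/Cumulative Reward" is the usual ML-Agents tag; change it if yours differs.
for event in acc.Scalars("Environment/Cumulative Reward"):
    print(event.step, event.value)
```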

## Results

The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:

![Average Reward](path/to/average_reward.png)
![Cumulative Reward](path/to/cumulative_reward.png)

## Citation

If you use this project in your research, please cite it as follows:

```bibtex
@misc{Tau,
  author = {K. Rawson},
  title = {Tau LLM Unity ML Agents Project},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Unity ML-Agents Toolkit
- TensorFlow and PyTorch communities
- Hugging Face for hosting the model repository