---
license: mit
---
# Tau LLM Unity ML Agents Project
Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize the performance of the agents through various configurations and training runs.
## Project Overview
This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.
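For reference, PPO maximizes the clipped surrogate objective below. The `epsilon: 0.2` in the configuration further down is this clipping parameter $\epsilon$, and `lambd: 0.95` is the GAE $\lambda$ used when estimating the advantage $\hat{A}_t$:

$$
L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
$$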
### Key Features
- **Reinforcement Learning**: Utilizes the PPO algorithm for training agents.
- **Unity ML-Agents**: Integrates with Unity ML-Agents for a seamless training experience.
- **Custom Reward Functions**: Implements gradient-based reward functions for nuanced feedback (see the sketch after this list).
- **Memory Networks**: Incorporates memory networks to handle temporal dependencies.
- **TensorBoard Integration**: Monitors training progress and performance using TensorBoard.
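The reward shaping itself lives on the Unity (C#) side of the agent; the snippet below is only a minimal Python sketch of the gradient-reward idea, using a hypothetical `gradient_reward` helper that scales the signal continuously with distance to the goal instead of paying out only on success:

```python
import numpy as np

def gradient_reward(agent_pos: np.ndarray, goal_pos: np.ndarray,
                    max_distance: float = 10.0) -> float:
    """Illustrative distance-gradient reward: 1.0 at the goal,
    falling off linearly to 0.0 at max_distance, so the agent gets
    a continuous learning signal rather than a sparse success bonus."""
    distance = float(np.linalg.norm(goal_pos - agent_pos))
    return max(0.0, 1.0 - distance / max_distance)

# An agent 2.5 units from the goal receives a reward of 0.75.
print(gradient_reward(np.zeros(3), np.array([1.5, 0.0, 2.0])))
```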
## Configuration
Below is the configuration used for training the agents:
```yaml
behaviors:
  TauAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 256
      buffer_size: 4096
      learning_rate: 0.00003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 10
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 4
      vis_encode_type: simple
      memory:
        memory_size: 256
        sequence_length: 256
        num_layers: 4
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.995
        strength: 0.1
        network_settings:
          normalize: true
          hidden_units: 256
          num_layers: 4
        learning_rate: 0.00003
    keep_checkpoints: 10
    checkpoint_interval: 100000
    threaded: true
    max_steps: 3000000
    time_horizon: 256
    summary_freq: 10000
```
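If you want to sanity-check the file before launching a run, here is a small sketch (assuming `pyyaml` is installed and the file lives at `config/tau_agent_ppo_c.yaml`, as in the training command below) that loads it and verifies that `buffer_size` is a whole multiple of `batch_size`, which ML-Agents expects for PPO:

```python
import yaml  # pip install pyyaml

# Load the trainer config exactly as it sits on disk.
with open("config/tau_agent_ppo_c.yaml") as f:
    config = yaml.safe_load(f)

hp = config["behaviors"]["TauAgent"]["hyperparameters"]
# ML-Agents expects buffer_size to be a multiple of batch_size
# (4096 / 256 = 16 whole mini-batches per update here).
assert hp["buffer_size"] % hp["batch_size"] == 0
print(f"lr={hp['learning_rate']}, epochs={hp['num_epoch']}")
```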
## Model Naming Convention
The models in this repository follow the naming convention `Tau_<series>_<max_steps>`, which makes it easy to identify each model's series and training-step count. For example, a model named `Tau_A_3000000` would denote series `A` trained for 3,000,000 steps.
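As a small illustrative helper (not part of the repository), the convention can be parsed with a regular expression:

```python
import re

# Hypothetical helper: split a model name like "Tau_A_3000000"
# into its series identifier and training-step count.
def parse_model_name(name: str) -> tuple[str, int]:
    match = re.fullmatch(r"Tau_(\w+)_(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected model name: {name}")
    series, max_steps = match.groups()
    return series, int(max_steps)

print(parse_model_name("Tau_A_3000000"))  # ('A', 3000000)
```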
## Getting Started
### Prerequisites
- Unity 6
- Unity ML-Agents Toolkit
- Python 3.10.11
- PyTorch
- Transformers
### Installation
1. Clone the repository:

   ```bash
   git clone https://github.com/p3nGu1nZz/Tau.git
   cd Tau/MLAgentsProject
   ```

2. Install the required Python packages:

   ```bash
   pip install -r requirements.txt
   ```

3. Open the Unity project:

   - Launch Unity Hub and open the project folder.
### Training the Agent
To start training the agent, run the following command:
```bash
mlagents-learn ./config/tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env ./Build --torch-device cuda --timeout-wait 300 --force
```
Note: Build the Unity environment into the `Build` directory before launching; the `--env ./Build` flag in the command above points at that build.
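When training finishes, ML-Agents exports the policy as an `.onnx` file, typically at `results/<run-id>/<BehaviorName>.onnx`. The sketch below (assuming `onnxruntime` is installed; the exact file path and tensor names depend on your ML-Agents version) inspects the exported model:

```python
import onnxruntime as ort  # pip install onnxruntime

# Path follows mlagents-learn's results layout; adjust to your run id.
session = ort.InferenceSession("results/tau_agent_ppo_A0/TauAgent.onnx")

# Input/output tensor names vary across ML-Agents versions,
# so enumerate them instead of hard-coding names like "obs_0".
for inp in session.get_inputs():
    print("input:", inp.name, inp.shape)
for out in session.get_outputs():
    print("output:", out.name, out.shape)
```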
### Monitoring Training
You can monitor the training progress using TensorBoard:
```bash
tensorboard --logdir results
```
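The same event files can also be read programmatically. The following sketch uses TensorBoard's `EventAccumulator` (directory layout and tag names may differ across ML-Agents versions, so list `acc.Tags()` if the tag below is missing):

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point at one behavior's run directory under results/ (adjust the run id).
acc = EventAccumulator("results/tau_agent_ppo_A0/TauAgent")
acc.Reload()

# ML-Agents logs episode reward under this tag; call acc.Tags()
# to list what is available if your version names it differently.
for event in acc.Scalars("Environment/Cumulative Reward")[-5:]:
    print(f"step={event.step} reward={event.value:.2f}")
```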
## Results
The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:
![Average Reward](path/to/average_reward.png)
![Cumulative Reward](path/to/cumulative_reward.png)
## Citation
If you use this project in your research, please cite it as follows:
```bibtex
@misc{Tau,
  author       = {K. Rawson},
  title        = {Tau LLM Unity ML Agents Project},
  year         = {2024},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
}
```
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Unity ML-Agents Toolkit
- TensorFlow and PyTorch communities
- Hugging Face for hosting the model repository