---
license: mit
---
# Tau LLM Unity ML Agents Project
Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize the performance of the agents through various configurations and training runs.
## Project Overview
This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.
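For reference, PPO maximizes the clipped surrogate objective below. The `epsilon: 0.2` in the configuration further down is this clipping parameter $\epsilon$, and `lambd: 0.95` is the GAE $\lambda$ used when estimating the advantage $\hat{A}_t$:

$$
L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
$$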
### Key Features
- **Reinforcement Learning**: Utilizes the PPO algorithm for training agents.
- **Unity ML-Agents**: Integrates with Unity ML-Agents for a seamless training experience.
- **Custom Reward Functions**: Implements gradient-based reward functions for nuanced feedback (see the sketch after this list).
- **Memory Networks**: Incorporates memory networks to handle temporal dependencies.
- **TensorBoard Integration**: Monitors training progress and performance using TensorBoard.
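The reward shaping itself lives on the Unity (C#) side of the agent; the snippet below is only a minimal Python sketch of the gradient-reward idea, using a hypothetical `gradient_reward` helper that scales the signal continuously with distance to the goal instead of paying out only on success:

```python
import numpy as np

def gradient_reward(agent_pos: np.ndarray, goal_pos: np.ndarray,
                    max_distance: float = 10.0) -> float:
    """Illustrative distance-gradient reward: 1.0 at the goal,
    falling off linearly to 0.0 at max_distance, so the agent gets
    a continuous learning signal rather than a sparse success bonus."""
    distance = float(np.linalg.norm(goal_pos - agent_pos))
    return max(0.0, 1.0 - distance / max_distance)

# An agent 2.5 units from the goal receives a reward of 0.75.
print(gradient_reward(np.zeros(3), np.array([1.5, 0.0, 2.0])))
```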
## Configuration
Below is the configuration used for training the agents:
```yaml
behaviors:
  TauAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 256
      buffer_size: 4096
      learning_rate: 0.00003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 10
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 4
      vis_encode_type: simple
      memory:
        memory_size: 256
        sequence_length: 256
        num_layers: 4
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.995
        strength: 0.1
        network_settings:
          normalize: true
          hidden_units: 256
          num_layers: 4
        learning_rate: 0.00003
    keep_checkpoints: 10
    checkpoint_interval: 100000
    threaded: true
    max_steps: 3000000
    time_horizon: 256
    summary_freq: 10000
```
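If you want to sanity-check the file before launching a run, here is a small sketch (assuming `pyyaml` is installed and the file lives at `config/tau_agent_ppo_c.yaml`, as in the training command below) that loads it and verifies that `buffer_size` is a whole multiple of `batch_size`, which ML-Agents expects for PPO:

```python
import yaml  # pip install pyyaml

# Load the trainer config exactly as it sits on disk.
with open("config/tau_agent_ppo_c.yaml") as f:
    config = yaml.safe_load(f)

hp = config["behaviors"]["TauAgent"]["hyperparameters"]
# ML-Agents expects buffer_size to be a multiple of batch_size
# (4096 / 256 = 16 whole mini-batches per update here).
assert hp["buffer_size"] % hp["batch_size"] == 0
print(f"lr={hp['learning_rate']}, epochs={hp['num_epoch']}")
```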
## Model Naming Convention
The models in this repository follow the naming convention `Tau_<series>_<max_steps>`, which makes it easy to identify each model's series and training-step count. For example, a model named `Tau_A_3000000` would denote series `A` trained for 3,000,000 steps.
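As a small illustrative helper (not part of the repository), the convention can be parsed with a regular expression:

```python
import re

# Hypothetical helper: split a model name like "Tau_A_3000000"
# into its series identifier and training-step count.
def parse_model_name(name: str) -> tuple[str, int]:
    match = re.fullmatch(r"Tau_(\w+)_(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected model name: {name}")
    series, max_steps = match.groups()
    return series, int(max_steps)

print(parse_model_name("Tau_A_3000000"))  # ('A', 3000000)
```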
## Getting Started
### Prerequisites
- Unity 6
- Unity ML-Agents Toolkit
- Python 3.10.11
- PyTorch
- Transformers
### Installation
1. Clone the repository:

   ```bash
   git clone https://github.com/p3nGu1nZz/Tau.git
   cd Tau/MLAgentsProject
   ```

2. Install the required Python packages:

   ```bash
   pip install -r requirements.txt
   ```

3. Open the Unity project:

   - Launch Unity Hub and open the project folder.
### Training the Agent
To start training the agent, run the following command:
```bash
mlagents-learn ./config/tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env ./Build --torch-device cuda --timeout-wait 300 --force
```
Note: Build the Unity environment into the `Build` directory before launching; the `--env ./Build` flag in the command above points at that build.
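When training finishes, ML-Agents exports the policy as an `.onnx` file, typically at `results/<run-id>/<BehaviorName>.onnx`. The sketch below (assuming `onnxruntime` is installed; the exact file path and tensor names depend on your ML-Agents version) inspects the exported model:

```python
import onnxruntime as ort  # pip install onnxruntime

# Path follows mlagents-learn's results layout; adjust to your run id.
session = ort.InferenceSession("results/tau_agent_ppo_A0/TauAgent.onnx")

# Input/output tensor names vary across ML-Agents versions,
# so enumerate them instead of hard-coding names like "obs_0".
for inp in session.get_inputs():
    print("input:", inp.name, inp.shape)
for out in session.get_outputs():
    print("output:", out.name, out.shape)
```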
### Monitoring Training
You can monitor the training progress using TensorBoard:
```bash
tensorboard --logdir results
```
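The same event files can also be read programmatically. The following sketch uses TensorBoard's `EventAccumulator` (directory layout and tag names may differ across ML-Agents versions, so list `acc.Tags()` if the tag below is missing):

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point at one behavior's run directory under results/ (adjust the run id).
acc = EventAccumulator("results/tau_agent_ppo_A0/TauAgent")
acc.Reload()

# ML-Agents logs episode reward under this tag; call acc.Tags()
# to list what is available if your version names it differently.
for event in acc.Scalars("Environment/Cumulative Reward")[-5:]:
    print(f"step={event.step} reward={event.value:.2f}")
```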
## Results
The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:
![Average Reward](path/to/average_reward.png)
![Cumulative Reward](path/to/cumulative_reward.png)
## Citation
If you use this project in your research, please cite it as follows:
```bibtex
@misc{Tau,
  author       = {K. Rawson},
  title        = {Tau LLM Unity ML Agents Project},
  year         = {2024},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
}
```
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Unity ML-Agents Toolkit
- TensorFlow and PyTorch communities
- Hugging Face for hosting the model repository