p3nGu1nZz
/

Tau

ONNX


p3nGu1nZz committed on May 14

Commit

e5844fa

verified ·

1 Parent(s): 83bab82

Update README.md


Files changed (1)

README.md +142 -141


@@ -1,142 +1,143 @@
----
-license: mit
----
-# Tau LLM Unity ML Agents Project
-Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize the performance of the agents through various configurations and training runs.
-## Project Overview
-This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.
-### Key Features
-- **Reinforcement Learning**: Utilizes the PPO algorithm for training agents.
-- **Unity ML-Agents**: Integrates with Unity ML-Agents for a seamless training experience.
-- **Custom Reward Functions**: Implements gradient-based reward functions for nuanced feedback.
-- **Memory Networks**: Incorporates memory networks to handle temporal dependencies.
-- **TensorBoard Integration**: Monitors training progress and performance using TensorBoard.
-## Configuration
-Below is the configuration used for training the agents:
-```yaml
-behaviors:
-  TauAgent:
-    trainer_type: ppo
-    hyperparameters:
-      batch_size: 256
-      buffer_size: 4096
-      learning_rate: 0.00003
-      beta: 0.005
-      epsilon: 0.2
-      lambd: 0.95
-      num_epoch: 10
-      learning_rate_schedule: linear
-    network_settings:
-      normalize: true
-      hidden_units: 256
-      num_layers: 4
-      vis_encode_type: simple
-      memory:
-        memory_size: 256
-        sequence_length: 256
-        num_layers: 4
-    reward_signals:
-      extrinsic:
-        gamma: 0.99
-        strength: 1.0
-      curiosity:
-        gamma: 0.995
-        strength: 0.1
-        network_settings:
-          normalize: true
-          hidden_units: 256
-          num_layers: 4
-          learning_rate: 0.00003
-    keep_checkpoints: 10
-    checkpoint_interval: 100000
-    threaded: true
-    max_steps: 3000000
-    time_horizon: 256
-    summary_freq: 10000
-```
-## Model Naming Convention
-The models in this repository follow the naming convention `Tau_<series>_<max_steps>`. This helps in easily identifying the series and the number of training steps for each model.
-## Getting Started
-### Prerequisites
-- Unity 6
-- Unity ML-Agents Toolkit
-- Python 3.10.11
-- PyTorch
-- Transformers
-### Installation
-1. Clone the repository:
-   ```bash
-   git clone https://github.com/p3nGu1nZz/Tau.git
-   cd tau\MLAgentsProject
-   ```
-2. Install the required Python packages:
-   ```bash
-   pip install -r requirements.txt
-   ```
-3. Open the Unity project:
-   - Launch Unity Hub and open the project folder.
-### Training the Agent
-To start training the agent, run the following command:
-```bash
-mlagents-learn .\config\tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env .\Build --torch-device cuda --timeout-wait 300 --force
-```
-Note: The preferred way to run a build is by creating a new build into the `Build` directory which is referenced by the above command.
-### Monitoring Training
-You can monitor the training progress using TensorBoard:
-```bash
-tensorboard --logdir results
-```
-## Results
-The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:
-![Average Reward](path/to/average_reward.png)
-![Cumulative Reward](path/to/cumulative_reward.png)
-## Citation
-If you use this project in your research, please cite it as follows:
-```bibtex
-@misc{Tau,
-  author = {K. Rawson},
-  title = {Tau LLM Unity ML Agents Project},
-  year = {2024},
-  publisher = {GitHub},
-  journal = {GitHub repository},
-  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
-}
-```
-## License
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-## Acknowledgments
-- Unity ML-Agents Toolkit
-- TensorFlow and PyTorch communities
 - Hugging Face for hosting the model repository

+---
+license: mit
+---
+# Tau LLM Unity ML Agents Project
+Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize the performance of the agents through various configurations and training runs.
+## Project Overview
+This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.
+### Key Features
+- **Reinforcement Learning**: Utilizes the PPO algorithm for training agents.
+- **Unity ML-Agents**: Integrates with Unity ML-Agents for a seamless training experience.
+- **Custom Reward Functions**: Implements gradient-based reward functions for nuanced feedback.
+- **Memory Networks**: Incorporates memory networks to handle temporal dependencies.
+- **TensorBoard Integration**: Monitors training progress and performance using TensorBoard.
+## Configuration
+Below is the configuration used for training the agents:
+```yaml
+behaviors:
+  TauAgent:
+    trainer_type: ppo
+    hyperparameters:
+      batch_size: 256
+      buffer_size: 4096
+      learning_rate: 0.00003
+      beta: 0.005
+      epsilon: 0.2
+      lambd: 0.95
+      num_epoch: 10
+      learning_rate_schedule: linear
+    network_settings:
+      normalize: true
+      hidden_units: 256
+      num_layers: 4
+      vis_encode_type: simple
+      memory:
+        memory_size: 256
+        sequence_length: 256
+        num_layers: 4
+    reward_signals:
+      extrinsic:
+        gamma: 0.99
+        strength: 1.0
+      curiosity:
+        gamma: 0.995
+        strength: 0.1
+        network_settings:
+          normalize: true
+          hidden_units: 256
+          num_layers: 4
+          learning_rate: 0.00003
+    keep_checkpoints: 10
+    checkpoint_interval: 100000
+    threaded: true
+    max_steps: 3000000
+    time_horizon: 256
+    summary_freq: 10000
+```
+## Model Naming Convention
+The models in this repository follow the naming convention `Tau_<series>_<max_steps>`. This helps in easily identifying the series and the number of training steps for each model.
+## Getting Started
+### Prerequisites
+- Unity 6
+- Unity ML-Agents Toolkit
+- Python 3.10.11
+- PyTorch
+- Transformers
+### Installation
+1. Clone the repository:
+   ```bash
+   git clone https://github.com/p3nGu1nZz/Tau.git
+   cd tau\MLAgentsProject
+   ```
+2. Install the required Python packages:
+   ```bash
+   pip install -r requirements.txt
+   ```
+3. Open the Unity project:
+   - Launch Unity Hub and open the project folder.
+### Training the Agent
+To start training the agent, run the following command:
+```bash
+mlagents-learn .\config\tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env .\Build --torch-device cuda --timeout-wait 300 --force
+```
+Note: The preferred way to run a build is by creating a new build into the `Build` directory which is referenced by the above command.
+### Monitoring Training
+You can monitor the training progress using TensorBoard:
+```bash
+tensorboard --logdir results
+```
+## Results
+The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:
+![Average Reward](chart_tau_B1_10M_a.png)
+![Average Reward](chart_tau_B1_10M_b.png)
+![Average Reward](chart_tau_B1_10M_c.png)
+## Citation
+If you use this project in your research, please cite it as follows:
+```bibtex
+@misc{Tau,
+  author = {K. Rawson},
+  title = {Tau LLM Unity ML Agents Project},
+  year = {2024},
+  publisher = {GitHub},
+  journal = {GitHub repository},
+  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
+}
+```
+## License
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+## Acknowledgments
+- Unity ML-Agents Toolkit
+- TensorFlow and PyTorch communities
 - Hugging Face for hosting the model repository
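
Reviewer note on the `TauAgent` configuration in this README: the interaction between `batch_size`, `buffer_size`, `num_epoch`, `checkpoint_interval`, and `keep_checkpoints` is easier to see with the arithmetic written out. The sketch below is not part of the commit. It copies the values from the YAML above and makes two simplifying assumptions: that `buffer_size` is treated as an exact multiple of `batch_size`, and that `learning_rate_schedule: linear` decays the rate in a straight line toward zero at `max_steps`. The helper names `minibatches_per_epoch` and `linear_learning_rate` are hypothetical and do not come from ML-Agents.

```python
# Values copied from the README's `TauAgent` configuration.
BATCH_SIZE = 256
BUFFER_SIZE = 4096
NUM_EPOCH = 10
LEARNING_RATE = 0.00003
MAX_STEPS = 3_000_000
CHECKPOINT_INTERVAL = 100_000
KEEP_CHECKPOINTS = 10
SUMMARY_FREQ = 10_000


def minibatches_per_epoch(buffer_size: int, batch_size: int) -> int:
    """Number of minibatches in one pass over a full buffer.

    Assumption: the buffer divides evenly into minibatches.
    """
    if buffer_size % batch_size != 0:
        raise ValueError("expected buffer_size to be a multiple of batch_size")
    return buffer_size // batch_size


def linear_learning_rate(step: int, initial: float, max_steps: int) -> float:
    """Assumed linear schedule: `initial` at step 0, zero at `max_steps`."""
    clamped = min(max(step, 0), max_steps)
    return initial * (1.0 - clamped / max_steps)


minibatches = minibatches_per_epoch(BUFFER_SIZE, BATCH_SIZE)
updates_per_buffer = minibatches * NUM_EPOCH              # gradient steps per filled buffer
checkpoints_written = MAX_STEPS // CHECKPOINT_INTERVAL    # over a full run
checkpoints_retained = min(KEEP_CHECKPOINTS, checkpoints_written)
summaries_written = MAX_STEPS // SUMMARY_FREQ
halfway_rate = linear_learning_rate(MAX_STEPS // 2, LEARNING_RATE, MAX_STEPS)

print(f"minibatches per epoch:   {minibatches}")
print(f"updates per full buffer: {updates_per_buffer}")
print(f"checkpoints written:     {checkpoints_written}")
print(f"checkpoints retained:    {checkpoints_retained}")
print(f"summaries written:       {summaries_written}")
print(f"learning rate at 50%:    {halfway_rate:.2e}")
```

Under these assumptions, a full 3,000,000-step run writes 30 checkpoints but keeps only the most recent 10, so checkpoints from the first two million steps would no longer be on disk when the run ends.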