Spaces:
Sleeping
title: SocialAI School Demo
emoji: π§π»ββοΈ
colorFrom: gray
colorTo: indigo
sdk: docker
app_port: 7860
SocialAI
This repository is the official implementation of SocialAI: Benchmarking Socio-Cognitive Abilities inDeep Reinforcement Learning Agents.
The website of the project is here
The code is based on: minigrid
Additional repositories used: BabyAI RIDE astar
Installation
Create and activate your conda env
conda create --name social_ai python=3.7
conda activate social_ai
conda install -c anaconda graphviz
Install the required packages
pip install -r requirements.txt
pip install -e torch-ac
pip install -e gym-minigrid
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
Interactive policy
To run an enviroment in the interactive mode run:
python -m scripts.manual_control.py
You can test different enviroments with the --env
parameter.
RL experiments
Training
Minimal example
To train a policy, run:
python -m scripts.train --model test_model_name --seed 1 --compact-save --algo ppo --env SocialAI-AsocialBoxInformationSeekingParamEnv-v1 --dialogue --save-interval 1 --log-interval 1 --frames 5000000 --multi-modal-babyai11-agent --arch original_endpool_res --custom-ppo-2
The policy should be above 0.95 success rate after the first 2M environment interactions.
Recreating all the experiments
See run_SAI_final_case_studies.txt
for the experiments in the paper.
Regular machine
To run the experiments on a regular machine run_SAI_final_case_studies.txt
contains all the bash commands running the RL experiments.
Slurm based cluster (todo:)
To recreate all the experiments from the paper on a slurm based server configure the campaign_launcher.py
script and run:
python campaign_launcher.py run_NeurIPS.txt
Evaluation
To evaluate a policy, run:
python -m scripts.evaluate_new --episodes 500 --test-set-seed 1 --model-label test_model --eval-env SocialAI-TestLanguageFeedbackSwitchesInformationSeekingParamEnv-v1 --model-to-evaluate storage/test/ --n-seeds 8
To visualize a policy, run:
python -m scripts.visualize --model storage/test_model_name/1/ --pause 0.1 --seed $RANDOM --episodes 20 --gif viz/test
LLM experiments
For LLMs set your OPENAI_API_KEY
(and HF_TOKEN
) variable in ~/.bashrc
or wherever you want.
Creating in-context examples
To create in_context examples you can use the create_LLM_examples.py
script.
This script will open an interactive window, where you can manually control the agent. By default, nothing is saved. The general procedure is to press 'enter' to skip over environments which you don't like. When you see a wanted enviroment, move the agent in the wanted position and start recording (press 'r'). The current and the following steps in the episode will be recorded. Then control the agent and finish the episode. The new episode will start and recording will be turned off again.
If you already like some of the previously collected examples and want to append to them you can use the --load
argument.
Evaluating LLM-based agents
The script eval_LLMs.sh
contains the bash commands to run all the experiments in the paper.
Here is an example of running evaluation on the text-ada-001
model on the AsocialBox environment:
python -m scripts.LLM_test --episodes 10 --max-steps 15 --model text-ada-001 --env-args size 7 --env-name SocialAI-AsocialBoxInformationSeekingParamEnv-v1 --in-context-path llm_data/in_context_examples/in_context_asocialbox_SocialAI-AsocialBoxInformationSeekingParamEnv-v1_2023_07_19_19_28_48/episodes.pkl
If you want to control the agent yourself you can set the model to interactive
.
dummy
agent just executes the move forward action, and random
executes a random action. These agent are usefull for testing.